-
Notifications
You must be signed in to change notification settings - Fork 15
Conversation
It will break |
Yup, that's true. That seems fine to me, but there are other possible resolutions - for example, make all of |
@bakkot it will cause possible conflicts with non-standard |
spec/set-prototype-is-subset-of.html
Outdated
1. Let _iter_ be ? GetIterator(_set_). | ||
1. If _iter_ is undefined, throw a *TypeError* exception. | ||
1. If Type(_O_) is not Object, throw a *TypeError* exception. | ||
1. Let _hasCheck_ be ? Get(_O_, "has"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much harder would it be to make it both iterate the receiver and the argument, and not use “has” at all?
@zloirock That's true. I expect it to be quite rare; the far more common case will be to pass an array without @ljharb, very easy technically. The problem is that it would make this operation |
In the case where the receiver is a Set with an unmodified Symbol.iterator, would it not be O(m) at that point? |
I don't think so. It needs to perform an operation for each element of the receiver (i.e., check if it is contained in the argument). When the argument is a Set with an unmodified Symbol.iterator then it can be done in |
524c929
to
2f38d5f
Compare
I really like how all these Set methods work out of the box for non Set arguments too. It'd be great if we could have this method be consistent with that. |
Non-Set arguments make sense for the other Set methods, because they only care about iteration of their respective arguments (at least given how they're currently defined). But |
Alternatively, |
@bakkot allowing non-set argument is an important case for me. As I wrote, adding new "hasable" protocol could cause too many problems. I can't understand, why you against of internal conversion non-set argument to |
The proposal already adds such a protocol;
See #35 for that discussion. Anyway, I'm not super against converting non-set arguments to sets (though I am a little against it because of time-complexity concerns, and it is observable, since invoking the Set constructor on an iterable performs an observable lookup of (and calls to the current value of) |
It's not the same. Those methods are generics, but they are placed on new Set(iterable1).method(iterable2); and don't think have those iterables new Set([1, 2]).isSubsetOf(new Map([[1, 2], [2, 3], [3, 4]])); // -> ?
new Set([1, 2, 3]).isSupersetOf(new Map([[1, 2], [2, 3]])); // -> ?
You are right, I forgot about it, however, a new set can be initialized not in the constructor to prevent it.
Yes. However, the rest options look worse. As an option, we could convert an argument to internal |
I understand the appeal, but... not all methods are the same. Some methods need to iterate their argument; others need to query membership in it. (And there are other kinds of method too, like
Yeah, that's proposed above. As I say, I think making what should be an
I mean, yes, but the current proposal's behavior for those objects is strictly less useful than this PR's, so I'm not sure what point you're making. As yet another alternative, how would you feel about a symbol-based protocol for membership querying? I'm not proposing we depend on that proposal, just that we add a |
A symbol-based protocol seems much more robust, at least - especially if it could be genericized to work on Map, Set, and Array. |
One more point I keep forgetting to mention: in the current proposal, |
At least, it should be consistent with the paired method -
It should not work anyway because weak collections are not iterable and we can't make work
I think that it at least more useful than removing
I'm not a fan of adding well-known symbols and methods to many built-ins for this too narrow case. It's over-complication of this method. |
It's already inconsistent. As it must be, because
I don't share this intuition at all.
I'm sorry, I don't know what this means. |
I'd rather have this be part of the spec. It's pretty much impossible to optimize away the new Set in the baseline tier if we ask the user to create a Set.
Can you give me an example of this? We're pulling out the @zloirock I'm hiding your comment as it's really not helping the discussion. |
How do you feel about this alternative, or the one at the end of this comment? I think they'd be about as optimizable as the current spec, except with one more observable lookup first. Also, how bad is it to not optimize that construction away? It means an extra object allocation, but I would naively expect that the main bottleneck would be iterating the argument and creating a set structure holding its contents, which you presumably want to do in either case. I would expect that the only difference would be whether you create a JS object which exposes the the underlying set. But this is just speculation on my part.
Sorry, yes, it'll work for subclasses as currently specified. It won't work for, say, WeakSets, or anything else which behaves like a set but lacks the relevant internal slot. Edit: Oh, and the other half of it is that |
Just to confirm -- #41 (comment) would mean that instead of the current From an implementation POV, the The Symbol approach has pretty much the same performance characteristics as the
Maybe it's not too bad but if the only difference is that we support WeakSet by making this change, then I'm not sure if it's worth it to pay for both performance and memory.
I agree that it wouldn't work for WeakSets, but personally I think that's fine since WeakSets are not iterable, although I agree that they are a Set kind-of object. I don't see why your frozen set wouldn't work though -- we'd just create a new Set when it fails the internal slot check, right? |
Right.
Whether something is iterable is completely unrelated to whether it makes sense to ask if something is a subset of it, though. I don't get the intuition that this is relevant.
Sorry, I guess I'm overloading "work" slightly. Yes, it would end up returning the right result. But it would take time which is linear in both the size of the receiver and the size of the frozen set, which can be dramatically worse than it should be (which is just linear in the size of the receiver). |
Fair enough, I see your point. Do you think your use case could be better solved by having a
Ah, I see. Do you think this is something we should be optimizing for though? I would say that we should optimize for the 99% case which is probably something like a Set or even an Array sometimes. |
Sorry, missed this initially: why is that not already the case for the case for Sorry for my lack of familiarity with this part of engines.
You can't have
I agree that our first concern should be the case where both are Sets (and not subclasses, and no one has messed with any primordials, and so on). Though it's still worth the cost of a couple observable lookups to avoid directly depending on the internal slot, I think, if that's all it takes. (That's what we're doing for the other methods already, after all!) But for cases outside that, it's less clear. In particular, I am not at all sure it's worth optimizing (in the sense of avoiding a baseline call to the Set constructor) for the case where the argument is an array if that means that it would not work (with appropriate time complexity) on cases where it ought to (like WeakSet or FrozenSet). The tradeoff seems to be "users have to write very slightly more code in cases where they pass something which does not support membership testing and those cases are a small constant amount slower" vs "passing things like FrozenSet will silently have significantly worse time complexity and passing things like WeakSet won't work at all". I prefer the former cost to the latter, myself, if that's an accurate summary. |
That's correct. I'm just comparing the internal slot check with the has method check. Given a choice between the two I'd prefer the internal slot check.
I haven't thought too much of the GC issues here but isn't this already true with the WeakRefs proposal?
Personally, I weigh the Array use case way more compared to the WeakSet so this trade off is fine with me. I understand your argument about not having this method work with an iterable, but this is still important to me. To better understand why I keep going to back to my should work with iterable argument, I looked into other languages to see how they implement these methods. Swift, C#, Python allow an iterable to be used as an argument; whereas Rust and Ruby require the argument to be a Set -- but they also mandate all the other Set methods take a Set too. I don't think this is conclusive but there's precedence either ways. I'm coming around to the idea of just checking for the |
It still seems to me that, in general, for both receiver and argument, it seems the most versatile to have a check on all the methods like "if it has the slot, proceed, else SpeciesConstruct with the receiver/arg; then call the |
This wouldn't work for @bakkot's WeakSet use case right? |
@gsathya maybe for this use case it would be better to add |
The latter option seems best to me as well. |
I'd be fine with any of those. I don't think we should shy away from adding new Symbol-based protocols even in the absence of the protocols proposal, but am OK with delaying that to a separate proposal. I do think it's important that we do it fairly soon after this proposal, because silently falling back to worse-than-expected time complexity is bad. With a few caveats about the last item:
Still interested in your thoughts on the "Also:" and "Also also:" questions in my comment above. Would you as an implementor be OK with a later proposal which did the following:
? Because that's what I'd want from a follow-on. |
After thinking about this some more, the internal slot check wouldn't work as you'd expect if you use a Map as an argument (even though it's an iterable). With the current spec text, we'd create a Set of arrays from the Map (by iterating the Map) and use that to do the
But in Python, this would evaluate to
I know this isn't super intuitive but I have used this before. With the has method check instead of the internal slot check, it would work nicely for Maps. This makes me like the 'has' method check more than the internal slot check and be in favor of the second option I've listed above.
The reason I didn't do it for these methods is that I think it's reasonable to assume that the receiver is a Set, as these are not static methods. It seems needlessly complicated to allow any kind of receiver too.
This follows from my earlier point about it being reasonable to assume that the receiver is also a Set.
Looks pretty complicated but I think this is a reasonable request. I'd have to think more about the spec text to see if it's optimizable.
|
Passing a Map anywhere that uses the iterable protocol should only use the values of the default iterator; you’d use |
Sounds like a good use case for a symbol-based protocol! :D I think it would still be OK to introduce such a protocol on top of option 3 later, even including adding it to Map.prototype, because as you point it's totally useless to iterate the map right now and so unlikely anyone would rely on that behavior continuing. But it's a little riskier than just adding it to Set and WeakSet.
I agree it's a reasonable assumption, but it seems like it would be strictly more useful to make them more generic, and shouldn't slow down the common case. We do the same sort of thing for Array methods and I think it's beneficial there. (Plus it means this proposal only has two different ways it can treat an object, rather than three.) I know I at least would find use for But I don't feel especially strongly about this.
Well, sure, but other methods which can make a similar assumption don't. ( |
@bakkot, one more quick idea -- what if we use the SpeciesConstructor to construct the Set in this method? We'd just create a new instance of FrozenSet and then lookup and use the has method on this leading to O(1) lookup?
There's tons more we could do to make all these APIs more flexible, but I want to ship this sooner rather than later and we need to draw a line on what's really useful here.
Then you should patch Array.prototype.has = Array.prototype.contains and it'll work. I think it's fine to distinguish between the receiver and argument having different types and thus have different behaviors. Looking at the Google codebase, I don't see a single instance of a Set.prototype method being called with a different receiver, but I see calls to Set.prototype methods with Arrays as arguments. I think it's a reasonable expectation to have these Set methods not work out of the box for non Set receivers. |
Wouldn't that mean that you'd construct an Array when passed an Array? That seems like not what you want. Or am I not understanding your suggestion?
I mean... these APIs are going to be around for decades and used by millions of people. I think it's worth making them broadly more consistent and useful, where it's possible to do so without costs other than taking slightly longer to spec and implement.
Well, yes, because none of them make sense for non-Set receivers at the moment. But I'm pretty sure there are lots of instances of Array.prototype methods being called with different receivers. |
Ah, yes that's true :'(
I'm fully aware of this, but it just seems like we're nitpicking on every little detail here, most of which are just really subjective.
I meant closure's Set.prototype methods |
I think it's important to get details right. If you think there's some user-facing reason that Also, regardless of whether |
This is our responsibility as champions and reviewers. We should think deeply and discuss at length every detail of a proposal. |
Okay, I don't think there's one objectively correct answer here. Given that, let's try to come to consensus here.
I don't think it's useful, it's just unnecessary fallback behavior. But, it's easy to optimize away and spec it so I'll add it.
Ok. Please submit a PR for this I'll merge it. Can you also update your current PR to construct a new Set instead of throwing if the |
I'll land this PR once it's updated to do this. |
@gsathya Since #45 landed, this PR basically amounts to a switch from the third to the second of the options you laid out above. It sounded above like you preferred your third option above (use [[SetData]] check for now, maybe introduce a Symbol later), which I think corresponds to the current spec. Do you still want to merge this? Like I say I'm fine either way (though @michaelficarra expressed a preference not to take option 2); happy to update this if you do want to make that switch. |
f5c2b0a
to
0e05bed
Compare
Updated anyway, though. |
Yeah, I think I changed my mind in #41 (comment) and agree with you about membership testing being the core here. |
My preferred resolution to #35, in PR form because my comment got a little long.
This makes
isSubsetOf
identical toisSupersetOf
except that it iterates its receiver and getshas
from its argument, instead of the other way around. Iterating its receiver matches all the other methods in this proposal exceptisDisjointWith
andisSupersetOf
, but gettinghas
from its argument makes it unique among all methods in this proposal or in the spec (thoughisDisjointWith
andisSupersetOf
are unique in that they gethas
from their receiver).