Skip to content
This repository has been archived by the owner on Jul 8, 2024. It is now read-only.

Querying membership in iterables #43

Closed
bakkot opened this issue Jan 7, 2019 · 15 comments
Closed

Querying membership in iterables #43

bakkot opened this issue Jan 7, 2019 · 15 comments

Comments

@bakkot
Copy link
Collaborator

bakkot commented Jan 7, 2019

In cases where are given an iterable which we need to query membership in, assuming we decide (#35 / #41) not to immediately throw, we don't need to consume the whole iterable. (Current spec text starts by consuming the whole iterable.) For example:

Set.prototype.isSubsetOf = function (iterable) {
  if ([[SetData]] in iterable) {
    for (let item of this) {
      if (!iterable.has(other)) {
        return false;
      }
    }
    return true;
  } else {
    let seen = new Set; // pretend this is something spec-internal, not an actual Set
    let iterator = iterable[Symbol.iterator]();
    outer: for (let item of this) {
      if (seen.has(item)) {
        continue;
      }
      let record = iterator.next();
      while (!record.done) {
        seen.add(record.value);
        if (SameValueZero(record.value, item)) {
          continue outer;
        }
        record = iterator.next();
      }
      return false;
    }
    return true;
  }
};

I think we should do this. I generally expect that methods which can give their answer early will do so, like Array.prototype.every.

@bakkot bakkot changed the title Converting iterables to sets Querying membership in iterables Jan 7, 2019
@ljharb
Copy link
Member

ljharb commented Jan 7, 2019

What about adding a method to Iterator.prototype to query membership in the iterable? That way all iterators could provide that functionality.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 7, 2019

There's no way of doing that in general with reasonable time complexity. The implementation above makes it work by keeping its own cache of things already emitted by the iterator, but that's not something a method on Iterator.prototype could reasonably do - there wouldn't be any way to guarantee that the underlying structure hadn't changed between two invocations of the membership-querying method, which would invalidate any such cache.

@ljharb
Copy link
Member

ljharb commented Jan 7, 2019

I wouldn't expect the default implementation to be performant; but any iterator produced by, say, a Set could be much more so.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 7, 2019

I don't think it'd be a good idea to add a membership querying method which was linear in the size of the iterator in the general case, especially given that there exist infinite iterators. For cases like this, you wouldn't want to use such a method: the implementation above is linear in the size of the iterable, but if it used a membership querying method, it would be quadratic.

@ljharb
Copy link
Member

ljharb commented Jan 7, 2019

Sure, I guess I'd assume that an infinite iterator would provide its own querying implementation, but maybe that's not a reasonable default.

It could also be optional - ie, if the querying method isn't present, then the consumer of the iterator could choose an implementation, or throw, eg.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 7, 2019

If it were optional it couldn't go on Iterator.prototype, surely? The only way to put it on Iterator.prototype is to provide a default implementation in terms of the existing iterator protocol, which would have to either be linear, risk becoming invalidated, or throw.

I do think it's a good idea to introduce a membership querying protocol in general, but I would only want to provide it for things which could answer queries in constant (or I guess log) time. And I'd put it on the containers, rather than the iterators themselves.

@gsathya
Copy link
Member

gsathya commented Jan 11, 2019

I grepped through the Google code base to see if such patterns appear but I don't see anything like this. I find the code snippet you have to be unnecessarily complicated for such a trivial method and I'm not convinced it's worth it to support this use case given how rare this pattern is.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 11, 2019

I'm not convinced it's worth it to support this use case given how rare this pattern is.

I thought the idea was that iterable-as-argument wouldn't be rare?

The advantage of checking membership as you iterate rather than waiting until you finish is that

(new Set([0, 1])).isSubsetOf([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ..., 999])

terminates after two steps of the iterator protocol on the argument rather than 1000. That seems worth caring about to me.

(You'll still have to go through the entire argument in cases which return false, of course.)

@ljharb
Copy link
Member

ljharb commented Jan 11, 2019

I think iterable-as-argument will be common; non-set as receiver will be uncommon.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 11, 2019

Yeah, this is (mainly) about the iterable-as-argument case.

Another advantage of this approach: it means emptyset.isSubsetOf(iterable) returns true immediately without walking the iterable and without any special handling, which seems desirable.

@gsathya
Copy link
Member

gsathya commented Jan 14, 2019

All of these optimizations are something the engine can do already. The early return optimization should work for any unpatched builtins as arguments (Sets, Arrays, Strings, etc).

I'm not sold on adding all the extra complexity for the very narrow case of slowing down non builtin iterable (ie, user defined iterables) which already have a big performance cliff. It's better to focus on making the slow paths of passing Arrays, Strings (ie, non Sets) as arguments faster as they are more common. Although I don't think the changes you've requested cause the fast path to become slower (at least I can't think of any right now), I find it valuable to keep the spec as simple as possible when trying to optimize away observable behaviors.

@gsathya gsathya closed this as completed Jan 14, 2019
@bakkot bakkot mentioned this issue Jan 15, 2019
4 tasks
@gsathya
Copy link
Member

gsathya commented Jan 15, 2019

@bakkot, I don't know if you already knew this, but I just realized that user defined iterables having a has method can get the eager behavior now with your PR #4.

Essentially the following have the eager return behavior:

  • Sets
  • Other iterable builtins (like Arrays, Maps, Strings)
  • User defined iterables that have a has method (like a FrozenSet)

Only user defined iterables that don't have a has method are iterated to exhaustion.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 15, 2019

Yeah, but that seems like a fairly common case. Most iterable things don't have has.

(Also, I don't know how much one can trust "iterable builtins" being optimized. Would this include NodeLists, for example? On all browsers?)

@gsathya
Copy link
Member

gsathya commented Jan 15, 2019

Yeah, but that seems like a fairly common case. Most iterable things don't have has.

This seems like a case of moving goal posts. We went from supporting WeakMap and FrozenSet to having the eager return for them to this now.

(Also, I don't know how much one can trust "iterable builtins" being optimized. Would this include NodeLists, for example? On all browsers?)

NodeLists, probably not. I only mean builtins defined by ES262.

@bakkot
Copy link
Collaborator Author

bakkot commented Jan 15, 2019

This seems like a case of moving goal posts.

I was OK with not supporting iterables at all. But if we're going to support iterables, we should do so as best we can. "Things which support both iteration and .has" and "things which only support iteration" both seem like fairly common cases.

NodeLists, probably not.

Hm, ok. I don't know that myPossiblyEmptySetOfNodes.isSubsetOf(document.querySelectorAll('.some-common-class')) is necessarily that edge-y of a case, such that we should be OK with it taking much longer than it should in some cases. That's not even user-defined.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants