Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[[Enumerate]] and for-in on proxies can produce non-string keys #160

Closed
rossberg opened this issue Nov 4, 2015 · 50 comments
Closed

[[Enumerate]] and for-in on proxies can produce non-string keys #160

rossberg opened this issue Nov 4, 2015 · 50 comments
Labels
needs consensus This needs committee consensus before it can be eligible to be merged.

Comments

@rossberg
Copy link
Member

rossberg commented Nov 4, 2015

There currently is nothing in either the spec of for-in, or the spec of proxies' [[Enumerate]] method, that would ensure that all keys are converted to a string. Consequently, a for-in over a proxy could produce arbitrary key values. That may break code that assumes it can only get strings, and is particularly strange because [[Enumerate]] on ordinary objects explicitly excludes symbols to avoid this problem.

(As a secondary consequence, the sample implementation in 9.1.11 also is broken, because if proto is a proxy, it might return equivalent keys as different values (e.g. "10" vs 10), such that their lookup in the visited set fails due to the lack of canonicalisation.)

To fix this, [[Enumerate]] on proxies probably needs to create a wrapper iterator that invokes ToString on each value returned from the trap.

@rossberg
Copy link
Member Author

rossberg commented Nov 4, 2015

See also #161.

@rossberg
Copy link
Member Author

rossberg commented Nov 4, 2015

Additional point: there also is no guarantee that the same key isn't returned multiple times by a proxy's enumerate trap, and no place where it would be filtered (unless the proxy is only a prototype, see #161). The wrapper iterator constructed by proxy [[Enumerate]] should also do this filtering. (If it didn't, that might be a huge problem for implementations that factorise the filtering differently.)

@anba
Copy link
Contributor

anba commented Nov 4, 2015

The restriction to return only string-valued, non-duplicate property keys is only normative for 9.1.11. In https://bugs.ecmascript.org/show_bug.cgi?id=4531 the missing check for 9.1.11 is documented.

@rossberg
Copy link
Member Author

rossberg commented Nov 4, 2015

That may be so, but currently the spec actually requires to return duplicates and non-strings, unlike in other cases, and that is a serious problem for implementations like V8 (cf. #161), because it breaks our entire implementation of for-in. I suspect the same is true for other VMs. I think the spec needs to adjust.

@anba
Copy link
Contributor

anba commented Nov 4, 2015

Edge and SpiderMonkey can both return duplicate and non-string keys. Tested with:

var proxy = new Proxy({}, {
  enumerate() {
    var keys = ["a", "a", 1];
    return {
      next() {
        if (keys.length) {
          return {value: keys.shift(), done: false};
        }
        return {value: void 0, done: true};
      }
    };
  }
});
// Prints:
// key="a" [typeof=string]
// key="a" [typeof=string]
// key="1" [typeof=number]
for (var key in proxy) { console.log(`key="${key}" [typeof=${typeof key}]`); }

@rossberg
Copy link
Member Author

rossberg commented Nov 4, 2015

I see, thanks for the test. Nevertheless, this is a problem for us. The consistency and compatibility arguments I made above also still stand. What would be an argument in favour of this semantics?

@anba
Copy link
Contributor

anba commented Nov 4, 2015

You probably give me the evil eye when I use the standard response to compatibility concerns for legacy code in conjunction with new features: "As soon as you start to use new features, like Proxy, it's no longer pre-ES2015 legacy code. So there cannot be any backward compatibility issues." 😑

I guess it's necessary to check old meeting notes and es-discuss mails to find out why [[Enumerate]] was specified the way it is.

@bterlson bterlson added the needs consensus This needs committee consensus before it can be eligible to be merged. label Nov 5, 2015
@bterlson
Copy link
Member

bterlson commented Nov 5, 2015

There is a high bar to changing approved ES2015 semantics especially when there exist many implementations already conforming to the current spec. @rossberg-chromium you going to be at TC39 in November and want to take this item up (and probably #161)? Or maybe someone else from the Chrome team?

I don't really buy the compatibility argument. It would only be a problem if you implemented the enumerate trap in a buggy fashion. You could mess up implementing many traps and introduce similar problems. In general, introducing a proxy to your program will require a careful implementation in order to run without issue in existing code.

Confused about the consistency argument. Is it: since ordinary [[Enumerate]] filters out symbol-valued properties, proxy [[Enumerate]] should only return string properties? If so, I'm not sure this is true - wouldn't the same argument call for, for example, [[Delete]] only being allowed to return true if the property was actually deleted or not present on the target? In general it seems OK that proxy traps are allowed to return strange things.

That said, I don't think this is valuable behavior so I'm not necessarily opposed to changing it.

@rossberg
Copy link
Member Author

rossberg commented Nov 5, 2015

The compatibility argument that has been made in several occasions during ES6 discussions -- regarding for-in but also some of the pre-existing reflection methods on Object -- was that existing code might rightfully expect to only get string keys, and might break if it was fed a new form of object. That concern led to filtering symbols in for-in (for regular objects), as well as in Object methods. AFAICS, the same argument applies here, in even stronger form. The current filtering would be entirely moot if it could be undermined by proxies.

So the way I see it, either we decide that the filtering elsewhere was a mistake, or we should consider it a bug that it is not applied in this case.

We can discuss at the meeting. Adam & Dan will be there.

@allenwb
Copy link
Member

allenwb commented Nov 6, 2015

Let's talk about MOP invariants. In general, we limited the explicit invariants listed in 6.1.7.3 to things that were believed essential for security reasons. The MOP implementation provided by Proxy only attempts to enforce those invariants. Any deviation of expect behavior of the MOP operations by a Proxy is just a program level bug. The motivation for this was to minimize the required overhead of Proxies.

I buy the argument that the legacy semantics of for-in (and other legacy uses of [[Enumerate]]) is that the values it returns are strings. But rather than enforcing that in [[Enumerate]] I think it should be enforced by the legacy consumers of [[Enumerate]]. Basically, a ToString needs to be applied to the result of the next calls to the Iterator returned from [[Enumerate]]

The current factoring of the semantics of for-in/for-of means that the easiest way to specify that
ToString call is in step 7.c of ForIn/OfHeadEvaluation. Basically instead of returning the value of obj.[[Enumerate]]() return an spec. level Iterator that wraps next calls to obj.[[Enumerate]]() with a ToString/

@rossberg
Copy link
Member Author

rossberg commented Nov 6, 2015

@allenwb, same question as in #161: wouldn't it be more consistent to do this in Proxy [[Enumerate]], because that's where it is done for ordinary objects, too? Seems weird to me to implement such legacy compat logic in different places for different kinds of objects.

@bterlson
Copy link
Member

bterlson commented Nov 6, 2015

@rossberg-chromium I guess, for me, having for-in include symbols is quite likely to cause issues for people as adding symbols to objects will break downstream consumers, meaning you just can't use them easily without having downstream consumers change their code. Proxies seem less bad because you can implement your proxy in a non-buggy fashion. But, I agree there is a compatibility concern there, it just seems less likely to be hit.

I like @allenwb's proposal - I wonder if that meets with your satisfaction as well?

@allenwb what about the issue of allowing [[Enumerate]] to return duplicate keys? Should ForIn/OfHeadEvaluation take care of this case as well by maintaining a set of seen keys?

@allenwb
Copy link
Member

allenwb commented Nov 6, 2015

@bterlson I say no filtering of duplicates in for-in. In general, I don't think enforcement of complex invariants should be pushed down to this level. Basically we should avoid edge case overhead in low level operations as much as possible.

Ordinary Object [[Enumerate]] already imposes a no duplicate srequirements, so in theory,the only way a for-in could see duplicates is if the object expression evaluates to a Proxy or other exotic object. I think that is rare enough to just let the possibility of duplicate keys surface as a program bug.

@bterlson
Copy link
Member

bterlson commented Nov 6, 2015

@allenwb It is impossible for non-proxy and non-exotic-object objects to have non-string-valued keys in its enumeration as well, though, so wouldn't that argue for not bothering with the ToString coercion either just to fix buggy exotic objects and proxy implementations? Or am I missing something?

@allenwb
Copy link
Member

allenwb commented Nov 6, 2015

@bterlson Judgement call, and the cost difference between simple (eg, type) invariants and complex structural (eg, check for duplicates) invariants.

So, I agree that the ToString is not really necessary (and hence I didn't originally include it) but I also think that adding the ToString is reasonable if there is consensus that allowing non-strings values into for-ins is a significant problem.

Generally, there is a lot of grey in these sorts of issues.

@rossberg
Copy link
Member Author

rossberg commented Nov 6, 2015

Re duplicate filtering: in V8, the duplicate filtering is done in the for-in implementation, as part of caching the keys at the start of the loop. That is agnostic to where the keys are coming from. It would be quite difficult for us to change that to have a special case for the direct iteration of proxies (and proxies on the prototype can get their duplicates filtered per the spec -- it is even recommended by the sample implementation of Object [[Enumerate]]!).

In other words, I'm fine if the spec does not prescribe filtering of duplicates for proxies, but it should not forbid it. In the spirit of the existing underspecification of for-in (which is there to accommodate existing implementations), perhaps we can leave this un(der)specified as well?

@bterlson
Copy link
Member

Enumerate should only return string keys. If a non-string key is returned, a TypeError is thrown. So say we all! (PR's accepted :))

@bterlson bterlson removed the needs consensus This needs committee consensus before it can be eligible to be merged. label Nov 18, 2015
@rossberg
Copy link
Member Author

So not ToString, but just a type check? Not what I expected, but I guess it's okay.

@anba
Copy link
Contributor

anba commented Nov 19, 2015

Is the TypeError thrown in 9.1.11 [[Enumerate]] or in 13.7.5.13 ForIn/OfBodyEvaluation or in both?

@bterlson
Copy link
Member

@anba I think the current impl by @GeorgNeis should be sufficient - the check is actually in 9.5.11 [[Enumerate]] which seems appropriate as only proxy traps can produce non-string keys. If this seems bad, please reopen!

@bterlson
Copy link
Member

Cool, I'll write a PR :)

@anba
Copy link
Contributor

anba commented Jan 14, 2016

7.4.9 CreateListFromIterator(iterator [ , elementTypes ])
1. Assert: Type(_iterator_) is Object.
1. If _elementTypes_ was not passed, let _elementTypes_ be (Undefined, Null, Boolean, String, Symbol, Number, Object).
1. Let _list_ be an empty List.
1. Let _next_ be *true*.
1. Repeat, while _next_ is not *false*
  1. Let _next_ be ? IteratorStep(_iterator_).
  1. If _next_ is not *false*, then
    1. Let _nextValue_ be ? IteratorValue(_next_).
    1. If Type(_nextValue_) is not an element of _elementTypes_, throw a *TypeError* exception.
    1. Append _nextValue_ to the end of the List _list_.
1. Return _list_.


9.5.11 Proxy [[Enumerate]]
1. Let _propertyKeys_ be ? CreateListFromIterator(_trapResult_, « String »).
1. Return CreateListIterator(_propertyKeys_).

IterableToArrayLike( _items_ )
1. Let _usingIterator_ be ? GetMethod(_items_, @@iterator).
1. If _usingIterator_ is not *undefined*, then
  1. Let _iterator_ be ? GetIterator(_items_, _usingIterator_).
  1. Let _values_ be ? CreateListFromIterator(_iterator_).
  1. Return CreateArrayFromList(_values_).
1. Assert: _items_ is not an Iterable so assume it is already an array-like object.
1. Return ? ToObject(_items_).

@bterlson
Copy link
Member

... or you will :-P

@bterlson
Copy link
Member

This is so good. Please send as a PR!

IterableToArrayLike doesn't use IteratorClose even for abrupt completions. I wonder if this is deliberate now...

@anba
Copy link
Contributor

anba commented Jan 14, 2016

IterableToArrayLike doesn't use IteratorClose even for abrupt completions. I wonder if this is deliberate now...

IteratorClose is never called when an iterator itself throws an abrupt completion, so we're safe here.

@bterlson
Copy link
Member

Ahh yes, makes sense.

@bterlson
Copy link
Member

The location of IterableToArrayLike is somewhat questionable now. Should we make a new clause 7.5 for operations on iterables?

@anba
Copy link
Contributor

anba commented Jan 14, 2016

I think it's okay to leave IterableToArrayLike at its current position, because the abstract operation is only used for typed arrays.

@allenwb
Copy link
Member

allenwb commented Jan 15, 2016

Why are you guys going down this path?? It seems terrible to me. As I discussed in #160 (comment) it wasn't intended that Proxy should enforce this sort of complex invariant. I thought the consensus articulated in #160 (comment) was that if a "non-string key was returned" (presumably from a next call to the value produced by [[Enumerate]]) that a TypedError would be thrown. That would have to happen within for-in if you don't want to force flushing to the enumerator before starting the loop. Which is the whole point of using an iterator.

Who is helped by any of this complexity. People an do crazy things like change %ArrayIteratorPrototype%.next. That doesn't mean we have to do things at this level to try to remediate such possibilities. All we have to do is made sure that doing such things produce predictable behavior.

If somebody creates Proxy with an enumerate trap that violates the requirements of [[enumerate]] that should be viewed as just a program bug that will break for-in loops that assume they will get string values.

@bterlson
Copy link
Member

What seems terrible? All we're suggesting here is using a different mechanism to exhaust the iterator returned from proxy enumerate than originally proposed in #280 such that it isn't affected by the global environment. This seems much better to me.

Also, the original motivating issue in #160 was that proxies allowed for-in to see non-string keys which seems bad. With the PR provided by #280, this fixes the issue for proxy traps that are misbehaved, but then adds the case @anba pointed out as another way for for-in to see non-string keys. It seems to me that fixing #160 requires what @anba proposes above. Maybe I'm wrong though and @rossberg-chromium would be ok with the original fix in #280?

@bterlson
Copy link
Member

btw @anba I agree, I assumed it was used elsewhere but that is not the case.

@anba
Copy link
Contributor

anba commented Jan 15, 2016

It seems terrible to me

Is this actually now the first proxy method where a non-existent trap can produce a different result when compared to using the default delegation through Reflect?

function printKeysWithDelete(o) {
  for (var k in o) { delete o.b; print(k); }
}
// Prints: "a c"
printKeysWithDelete({a:0, b:0, c:0});
// Prints: "a c"
printKeysWithDelete(new Proxy({a:0, b:0, c:0}, {}));
// Prints: "a b c"
printKeysWithDelete(new Proxy({a:0, b:0, c:0}, {
  enumerate(target) {
    return Reflect.enumerate(target);
  }
}));

@bterlson
Copy link
Member

I think so :(. I'm not sure how to fix this and fix #161 though.

@allenwb
Copy link
Member

allenwb commented Jan 15, 2016

@bterison No, I'm suggesting that exhausting the handler provided iterator within Proxy [[Ennumerate]] as proposed in #280 is terrible. What's the point of having an iterator if it is going to be exhausted before it is intended early. Why is a Proxy MOP operation enforcing something that isn't a Proxy invariant and which potentially requires a large memory buffer?

The requirement that for-in only provides string values is not a security invariant of the sort that Proxy enforces. If you want to enforce that for-in then it should be done by for-in. Personally, I don't think it is really necessary to enforce it there either as anybody downstream that really depends upon a string value will do any necessary type checks. In other words, this issue (#160) only "seems" bad but really isn't (but as I said in #160 (comment) adding a string type check within for-in would be ok although probably not really necessary). There are an infinite number of ways to muck up the global environment in ways that will eventually break things. Adding complexity to deal with a tiny fraction of those possibilities is probably unwise.

The place to fix #160 if we really think it needs fixing is within ForIn/OfBodyEvaluation. Unfortunately, it probably requires adding an additional parameter to that abstract operation.

@littledan
Copy link
Member

@GeorgNeis What do you think? This seems like a pretty fundamental problem. We could spec checking whether the property is still there, but this would also be observable to all Proxies always.

@allenwb
Copy link
Member

allenwb commented Jan 15, 2016

@anba Thanks, that's exactly the sort of unintended consequence I was worrying about. IMO, that difference is likely more troublesome then the possibility of somebody unintentionally (or even intentionally) returning a non-string from a Proxy enumerate trap.

Buggy proxy handler can break client code that uses those Proxy object. It's a fact of life.

@bterlson
Copy link
Member

I don't want to litter for-in with proxy-specific has checks in an attempt to preserve some semblance of the current iteration semantics. I'm agreeing with @allenwb that iterator exhaustion is bad. I don't see a way out :)

The requirement that for-in only provides string values is not a security invariant of the sort that Proxy enforces.

This was not the consensus at TC39. Unfortunately you weren't there :( But the general feeling was that this kind of check is morally equivalent to checking that the object returned from GetOwnPropertyDescriptor trap is actually a descriptor.

Complexity goes down if we don't fix #161 at least, and also fixes @anba's issue.

I'll re-open this issue at TC39. Since we need to relitigate the fix for #161 in light of @anba's issue we can also bring up Allen's proposed fix for #160 as an alternative to the previous consensus.

@bterlson bterlson added the needs consensus This needs committee consensus before it can be eligible to be merged. label Jan 15, 2016
@bterlson
Copy link
Member

Also I'm going to revert #280 for now.

bterlson added a commit that referenced this issue Jan 15, 2016
@allenwb
Copy link
Member

allenwb commented Jan 15, 2016

This was not the consensus at TC39. Unfortunately you weren't there :( But the general feeling was that this kind of check is morally equivalent to checking that the object returned from GetOwnPropertyDescriptor trap is actually a descriptor.

This is actually a quite different situations. [[GetOwnProperty]] is defined to return an internal property descriptor record, while the Proxy getOwnPropertyDescriptor trap returns an object so Proxy [[GetOwnProxy]] must convert that object into a well-formed property descriptor record. This requirement was listed as an invariant in 6.1.7.3

One the other hand [[Enumerate]] and the Proxy enumerate are both defined to simply return an object and the only invariants for [[Enumerate]] listed in 6.1.7.3 is that the returned value is an object.

It wasn't an oversight that other invariant were listed. The MOP invariants are about ensuring internal consistency of the ES execution model and the essential security characteristics of object structures. The MOP invariants are about the current state of the runtime environment, not about future computations and their possible results and side-effects.

This was all carefully thought out when ES6 proxies were designed and specified. I suspect that when this was present at TC39 these nuisances weren't adequately understood by most TC39 attendees..

In the end, this sort of issue is probably something that should be resolved by the the feature champions rather than a TC39 "vote".

@bterlson
Copy link
Member

More later, but FWIW I don't intend to re-open everything. I'm hoping we can get consensus here as to what the best path forward is and simply get approval at the meeting :)

@rossberg
Copy link
Member Author

@allenwb, re symbols: ordinary Object [[Enumerate]] intentionally does not return symbols. Neither do O.keys or O.getOwnPropertyNames. We specifically introduced O.getOwnSymbolNames. The consensus at the time was that feeding new objects (that have symbol properties) to old code (that expects only strings) might break that code, which is why we explicitly made all these design choices. It would make absolutely no sense if we allowed proxies to undermine that and cause for-in to yield a symbol after all. Hence, I would argue that it is a bug that 6.1.7.3 does not include this as an invariant.

Re [[Enumerate]] vs for-in: the ES6 spec moved all the for-in underspecification hacks to the ordinary object [[Enumerate]] method. It would be very inconsistent to do the special hacks for proxies in a different place. Furthermore, I fail to see how it would be useful to enforce guarantees about proxy [[Enumerate]] that don't hold for the common case [[Enumerate]] anyway. The [[Enumerate]] contract is very weak, and its only real purpose is to support for-in.

@rossberg
Copy link
Member Author

That said, #161 is the more serious issue. The previous requirement was fundamentally incompatible with the web reality of for-in. From my perspective, not fixing it is not an option.

@bterlson
Copy link
Member

Update: the enumerate trap will be removed assuming Tom Van Cutsem does not object.

@ljharb
Copy link
Member

ljharb commented Feb 7, 2016

(this can be closed when #367 merges)

@bterlson
Copy link
Member

At long last, this is done!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs consensus This needs committee consensus before it can be eligible to be merged.
Projects
None yet
Development

No branches or pull requests

6 participants