New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify Map/Set intro notes #1152

Open
mathiasbynens opened this Issue Mar 22, 2018 · 13 comments

Comments

Projects
None yet
10 participants
@mathiasbynens
Member

mathiasbynens commented Mar 22, 2018

https://tc39.github.io/ecma262/#sec-map-objects says:

Map object must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection.

We could (should?) clarify that Maps (and Sets) have certain ordering guarantees in JS, making classic hash table implementations not entirely feasible for this purpose.

Twitter thread: https://twitter.com/bterlson/status/976770326372737024

cc @surma @bterlson @domenic @timoxley

@ljharb

This comment has been minimized.

Show comment
Hide comment
@ljharb

ljharb Mar 22, 2018

Member

While doing so, let's please also clarify that observable mutation of keys stored in a Map, or values stored in a Set, is not permitted? :-D

Member

ljharb commented Mar 22, 2018

While doing so, let's please also clarify that observable mutation of keys stored in a Map, or values stored in a Set, is not permitted? :-D

@ajklein

This comment has been minimized.

Show comment
Hide comment
@ajklein

ajklein Mar 22, 2018

Contributor

Clarifying in a NOTE makes sense. I have the impression, though, that the motivation for ordering here wasn't so much for usability as to ensure interop across implementations, to avoid accidental dependency on the ordering of a single implementation. That could be something to include in the note as well.

@ljharb I don't understand what you're asking, can you expand?

Contributor

ajklein commented Mar 22, 2018

Clarifying in a NOTE makes sense. I have the impression, though, that the motivation for ordering here wasn't so much for usability as to ensure interop across implementations, to avoid accidental dependency on the ordering of a single implementation. That could be something to include in the note as well.

@ljharb I don't understand what you're asking, can you expand?

@ljharb

This comment has been minimized.

Show comment
Hide comment
@ljharb

ljharb Mar 22, 2018

Member

@ajklein in general, we all understand and agree that observable mutations of user values that aren't in the spec, are disallowed - however, this isn't explicitly called out anywhere in the spec. One way to implement nearly O(1) lookup in a Map or Set implementation for object keys and values respectively is to add an ID to them - which is observable. It would be very helpful to point to something concrete in the spec that indicates that this lookup performance requirement on collections must not be achieved by such mutations.

Member

ljharb commented Mar 22, 2018

@ajklein in general, we all understand and agree that observable mutations of user values that aren't in the spec, are disallowed - however, this isn't explicitly called out anywhere in the spec. One way to implement nearly O(1) lookup in a Map or Set implementation for object keys and values respectively is to add an ID to them - which is observable. It would be very helpful to point to something concrete in the spec that indicates that this lookup performance requirement on collections must not be achieved by such mutations.

@bterlson

This comment has been minimized.

Show comment
Hide comment
@bterlson

bterlson Mar 22, 2018

Member

@ajklein yes good point, the lesson from the wild days of ES3 enumeration order as implementation-defined behavior in practice means ecosystem alignment on a mysterious ordering that falls out of complex compatibility motivations.

Member

bterlson commented Mar 22, 2018

@ajklein yes good point, the lesson from the wild days of ES3 enumeration order as implementation-defined behavior in practice means ecosystem alignment on a mysterious ordering that falls out of complex compatibility motivations.

@ajklein

This comment has been minimized.

Show comment
Hide comment
@ajklein

ajklein Mar 22, 2018

Contributor

@ljharb That seems like a separate set of concerns, I think a separate issue/PR would be appropriate.

Contributor

ajklein commented Mar 22, 2018

@ljharb That seems like a separate set of concerns, I think a separate issue/PR would be appropriate.

@zloirock

This comment has been minimized.

Show comment
Hide comment
@zloirock

zloirock Mar 22, 2018

@ljharb JS engines use unobservable object IDs. Polyfills can do the same and, on their layer, can make it completely unobservable. So the note about object IDs makes no sense.

zloirock commented Mar 22, 2018

@ljharb JS engines use unobservable object IDs. Polyfills can do the same and, on their layer, can make it completely unobservable. So the note about object IDs makes no sense.

@ljharb

This comment has been minimized.

Show comment
Hide comment
@ljharb

ljharb Mar 22, 2018

Member

@ajklein fair, I'll do that separately.

@zloirock I agree that if it's unobservable, nothing need be said - however if it's observable in any way, then it's a violation, and if that's not already implied by the specification and common sense, then an explicit note is warranted.

Member

ljharb commented Mar 22, 2018

@ajklein fair, I'll do that separately.

@zloirock I agree that if it's unobservable, nothing need be said - however if it's observable in any way, then it's a violation, and if that's not already implied by the specification and common sense, then an explicit note is warranted.

@timoxley

This comment has been minimized.

Show comment
Hide comment
@timoxley

timoxley Mar 27, 2018

Contributor

@ajklein:

the motivation for ordering here wasn't so much for usability as to ensure interop across implementations, to avoid accidental dependency on the ordering of a single implementation

Does this fact make any practical difference to a developer relying on the details of the ordering? I can see such a statement being interpreted as a reason for developers to not lean on the ordering functionality as it's "not what the authors intended".

Contributor

timoxley commented Mar 27, 2018

@ajklein:

the motivation for ordering here wasn't so much for usability as to ensure interop across implementations, to avoid accidental dependency on the ordering of a single implementation

Does this fact make any practical difference to a developer relying on the details of the ordering? I can see such a statement being interpreted as a reason for developers to not lean on the ordering functionality as it's "not what the authors intended".

@erights

This comment has been minimized.

Show comment
Hide comment
@erights

erights Mar 27, 2018

The primary and original motivation, going back to E and Waterken, is to avoid introducing a non-overt (covert or side) channel. If the actual order is subject to unspecified implementation details, what does it reveal?

Hopefully we are now past the era when people were systematically underestimating these issues ;)

That said, the other motivations mentioned in this thread are certainly significant as well. Determinism also makes testing easier.

erights commented Mar 27, 2018

The primary and original motivation, going back to E and Waterken, is to avoid introducing a non-overt (covert or side) channel. If the actual order is subject to unspecified implementation details, what does it reveal?

Hopefully we are now past the era when people were systematically underestimating these issues ;)

That said, the other motivations mentioned in this thread are certainly significant as well. Determinism also makes testing easier.

@allenwb

This comment has been minimized.

Show comment
Hide comment
@allenwb

allenwb Mar 27, 2018

Member

A few points,

@mathiasbynens
This statement is not a NOTE. It is instead stating a normative requirement of the specification. Don't mix it up with non-normative material. The actual normative requirement is "on average, provide access times that are sublinear on the number of elements in the collection". The mention of hash tables could be eliminated, but are there really any other known techniques for achieving sublinear access that don't involve some sort of hashing?

@ljharb

in general, we all understand and agree that observable mutations of user values that aren't in the spec, are disallowed

This statement isn't precise enough for me to agree or disagree with it. What do you mean? Certainly there is nothing in the spec that says an implementation couldn't do something like automatically adding an symbol keyed property to an object that identifies the last function it was passed to as argument. If we actually want to forbid such things then the Map spec. isn't the right place to do it.

@timoxley
I agree with you. Informative notes need to be very careful that they are actually providing some useful information to implementors or other readers of the spec. And we need to be aware that they are often read as being normative and since they are visually distinctive sometimes are read to the exclusion of the actual normative text. When in doubt, leave them out.

@erights

The primary and original motivation, going back to E and Waterken, is to avoid introducing a non-overt (covert or side) channel.

That may have been the E motivation, but it wasn't the primary motivation for the enumeration ordering requirements of ES6 maps. @ajklein has it right in that regard. Interoperability was the primary issue. Side channel avoidance was the reason we didn't provide for observable per object hash codes, unlike most other contemporary object based languages.

Member

allenwb commented Mar 27, 2018

A few points,

@mathiasbynens
This statement is not a NOTE. It is instead stating a normative requirement of the specification. Don't mix it up with non-normative material. The actual normative requirement is "on average, provide access times that are sublinear on the number of elements in the collection". The mention of hash tables could be eliminated, but are there really any other known techniques for achieving sublinear access that don't involve some sort of hashing?

@ljharb

in general, we all understand and agree that observable mutations of user values that aren't in the spec, are disallowed

This statement isn't precise enough for me to agree or disagree with it. What do you mean? Certainly there is nothing in the spec that says an implementation couldn't do something like automatically adding an symbol keyed property to an object that identifies the last function it was passed to as argument. If we actually want to forbid such things then the Map spec. isn't the right place to do it.

@timoxley
I agree with you. Informative notes need to be very careful that they are actually providing some useful information to implementors or other readers of the spec. And we need to be aware that they are often read as being normative and since they are visually distinctive sometimes are read to the exclusion of the actual normative text. When in doubt, leave them out.

@erights

The primary and original motivation, going back to E and Waterken, is to avoid introducing a non-overt (covert or side) channel.

That may have been the E motivation, but it wasn't the primary motivation for the enumeration ordering requirements of ES6 maps. @ajklein has it right in that regard. Interoperability was the primary issue. Side channel avoidance was the reason we didn't provide for observable per object hash codes, unlike most other contemporary object based languages.

@erights

This comment has been minimized.

Show comment
Hide comment
@erights

erights Mar 27, 2018

That may have been the E motivation, but it wasn't the primary motivation for the enumeration ordering requirements of ES6 maps.

Fortunately we don't need consensus on which motivation is primary for a proposal to move forward. Historically, I first raised the issue against quite a lot of resistance. A few of us were passionate to keep pushing because of the side channel --- it would not have happened otherwise. The main counterargument was, quite sensibly, performance fears. I pointed at Tyler Close's implementation in Waterken/Joe-E as a demonstration that it need not be any slower. I think it was @bzbarsky who did a serious implementation of Tyler's technique alongside a traditional hashtable implementation that was at least as serious. IIRC, to everyone's surprise, Tyler's technique actually outperformed the traditional one. (Cache effects? I don't know)

So, given the diversity of things people care about on the committee, I'm sure each of the three advantages were primary to some of us:

  • don't introduce new side channels
  • faster
  • interoperability
  • testing
  • all the other reasons why deterministic specs are better...

Side channel avoidance was the reason we didn't provide for observable per object hash codes, unlike most other contemporary object based languages.

true.

erights commented Mar 27, 2018

That may have been the E motivation, but it wasn't the primary motivation for the enumeration ordering requirements of ES6 maps.

Fortunately we don't need consensus on which motivation is primary for a proposal to move forward. Historically, I first raised the issue against quite a lot of resistance. A few of us were passionate to keep pushing because of the side channel --- it would not have happened otherwise. The main counterargument was, quite sensibly, performance fears. I pointed at Tyler Close's implementation in Waterken/Joe-E as a demonstration that it need not be any slower. I think it was @bzbarsky who did a serious implementation of Tyler's technique alongside a traditional hashtable implementation that was at least as serious. IIRC, to everyone's surprise, Tyler's technique actually outperformed the traditional one. (Cache effects? I don't know)

So, given the diversity of things people care about on the committee, I'm sure each of the three advantages were primary to some of us:

  • don't introduce new side channels
  • faster
  • interoperability
  • testing
  • all the other reasons why deterministic specs are better...

Side channel avoidance was the reason we didn't provide for observable per object hash codes, unlike most other contemporary object based languages.

true.

@bzbarsky

This comment has been minimized.

Show comment
Hide comment
@bzbarsky

bzbarsky Mar 27, 2018

It wasn't me.

bzbarsky commented Mar 27, 2018

It wasn't me.

@gsathya

This comment has been minimized.

Show comment
Hide comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment