Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Precisely specify iteration details. Fixes #396 #451

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tabatkins
Copy link
Contributor

@tabatkins tabatkins commented May 5, 2022

Added a "sequence iteration" concept, and invoked it explicitly for list and map iteration.

Sequence iteration is defined in the same terms that ES does (see tc39/ecma262#2766 where ES is making itself more consistent in this regard): you maintain an internal index, incrementing it each step, and checking it against the length at each iteration. If the sequence changes size or order while executing its steps, that's fine, the next iteration will just see whatever the i+1th value of the sequence now is.

Happy to invoke this in a different way; what I have just seemed like a short and straightforward way to do so.


Preview | Diff

infra.bs Outdated Show resolved Hide resolved
infra.bs Outdated Show resolved Hide resolved
infra.bs Outdated Show resolved Hide resolved
@bakkot
Copy link

bakkot commented May 6, 2022

Do note that this is not the same semantics as ES Maps use, because deletion is specified differently. Specifically, for an ES Map, deleting leaves a ~empty~ tombstone rather than actually removing something from the list, so that the indices of all the other items in the list of entires are preserved. Deleting an item from a map as specified here will actually remove the underlying item from the list. That means that if, while iterating, you delete something which has already been visited from the map, you will end up skipping some random other item from the map (because all indices past the deleted item will get reduced by 1). That seems like something you might not want.

@domenic
Copy link
Member

domenic commented May 6, 2022

Thanks @bakkot, great catch.

Can you think of any way to specify those semantics without tombstones, which feel a bit awkward? Like... snapshotting the keys, but then doing a separate exists check before actually executing the steps?

@bakkot
Copy link

bakkot commented May 6, 2022

Offhand nothing else occurs to me, though you might be able to find something in the literature. If you're trying to get exactly ES Map semantics, using the same specification mechanism as ES is probably going to be easiest.

Snapshotting is going to be tricky. There's a bunch of weird edge cases: like, suppose you had [a, b, c], and the iterator reached and output b, and then while it's paused there a is removed and re-added - in an ES map the iterator will proceed to output c and then a again (fun!). That follows immediately from the use of tombstones but I think trying to write that down with snapshots is going to end up being pretty painful.

@tabatkins
Copy link
Contributor Author

Yeah, indeed, I think capturing the precise semantics without using tombstones would be a huge burden.

My inclination is to specify that lists and maps can have holes, but they're automatically skipped over by all spec-author-facing operations; only the list/map iteration internal algo sees them.

@domenic
Copy link
Member

domenic commented May 6, 2022

Yeah, that seems fair.

So there are several competing issues here:

  • We want spec authors to be able to write Web IDL maplikes/setlikes without doing a bunch of ECMASpeak (make hooking into maplike and setlike methods easier webidl#824). Two ways of doing this come to mind:
    • Having backing Infra maps/sets, like @tabatkins has done in Switch maplike/setlike to use infra maps and sets webidl#1138 . Then make sure Infra maps/sets align semantically with JS ones, at least when used in Web IDL contexts. +0/-0 I'm not too worried about, but iteration (discussed here) is the hard part.
    • Create a bunch of wrapper operations around the ECMASpeak, e.g. "maplike set", "setlike add", etc. in Web IDL, which spec authors have to use on their maplikes/setlikes.
    • The former of these options is obviously a bit more user-friendly from a spec author point of view.
  • Infra currently does not specify any well-defined iteration behavior ("For each" is not clear what happens if the list is modified while iterating #396). It would be good to specify this regardless of any Web IDL/ES overlap.
    • One potential path here we gestured at in "For each" is not clear what happens if the list is modified while iterating #396 is to just disallow modifications during iteration, i.e., Infra-using specs must not do that or allow author code to do that.
    • That is only viable if we don't have Infra maps/sets back maplikes/setlikes, because maplike/setlike iteration is user-hookable.
    • And it seems pretty unlikely to be viable regardless. It's too easy to call author code while iterating.
    • So, we probably need to pick some real semantics. We have several choices.
  • Infra sets are defined as a specialization of Infra lists.
    • However, JS Set and JS Array iteration behavior is not the same; array iteration does not have the tombstone behavior.
    • Do we want to have Infra list and Infra set iteration behavior differ?
    • If we do, then unless there's something really clever we can do in the iteration algorithm, per @bakkot's point above we probably need to rewrite how sets are done to support this tombstone-type semantic.
    • (We'd also need to rewrite maps to support that, but this is less of an issue because maps are hand-wavey "sequences" and not specializations of lists.)
  • Infra maps and sets are generally meant to map to C++ map/set classes used in implementations.
    • I don't know what iteration behavior those have in implementations. I kind of doubt it's the same in different browsers.
    • So, there might be existing interop problems with various web platform specs that use Infra maps/sets, if those maps/sets can be modified mid-iteration.

I think the best path forward here is to push toward aligning Infra maps/sets with JS ones, and try to be sure to have people write web platform tests for modifications-during-iteration for any Infra map/set usage (not just maplike/setlike, but other cases too).

But the wrapper operations path is also viable. It would basically defer the question of how to precisely specify Infra map/set iteration until later.

@domenic
Copy link
Member

domenic commented May 6, 2022

(For the record I typed up most of the above, went to a meeting, missed @tabatkins's response, then pressed submit.)

My inclination is to specify that lists and maps can have holes, but they're automatically skipped over by all spec-author-facing operations; only the list/map iteration internal algo sees them.

Do you mean set/map, or list/map? Changing lists to have holes might have other knock-on effects, e.g. by making list iteration mismatch Array iteration.

@tabatkins
Copy link
Contributor Author

I meant list, but I'm not wedded to that; i'd be okay with lists and sets having different iteration behavior if needed.

Regarding Array iteration, note that Arrays already differ from themselves between forEach() and iteration, so who knows what we should be consistent with.

@domenic
Copy link
Member

domenic commented May 6, 2022

Fair point. It's probably fine to add holes to lists; I'm happy for you to give it a shot.

@annevk
Copy link
Member

annevk commented May 9, 2022

I wonder to what extent we should give advice here to implementations as not all lists will need this complexity. I wonder if we can do similar things to "ASCII string" whereby we essentially assert the thing will have a subset of the total semantics.

@tabatkins
Copy link
Contributor Author

tabatkins commented May 10, 2022

Okay, I won't get back to finishing this PR until at least next week due to vacation, but just gathering some thoughts here:

  1. Holes are possible. I don't want normal uses to require thinking about holes, so I specify that they're only visible to Infra-internal algorithms; all the spec author-facing operations don't reveal holes.

  2. Holes aren't possible for normal lists/etc, but I go with Anne's suggestion and define a specialization of lists/etc that can have holes, and use those for maplike/setlike. These would generally be necessary for any spec that wants to have mutation during an iteration; mutation during iteration will be explicitly forbidden for normal lists/etc. Holes will still only be visible for Infra-internal purposes; spec authors will never see them.

  3. Holes aren't possible; instead, I describe iteration in terms of linked lists. I am pretty sure that this gives the same observable behavior as using holes, without requiring me to acknowledge the existence of holes. However, a literal implementation probably isn't very efficient, and carefully exposing the right behavior when using a different impl strategy would be non-trivial.

I think I don't like (3). I'm leaning to (2), but could be convinced on (1) if we think it's better to put the burden of thinking about which type to use on impls rather than spec authors. We'd have guidance either way in Infra, just targeted at different audiences.

@domenic
Copy link
Member

domenic commented Jul 26, 2022

@tabatkins ping on picking this back up!

I think I prefer (1) if possible, but I'm OK with (2) if you prefer.

@rakuco
Copy link
Member

rakuco commented Oct 12, 2022

Hi from Blink-land. I've been trying to review @yuki3's patches making Blink's pair iterators and map/setlike code more spec-conformant and ended up stumbling upon #396 and this PR. It would be great to have more clarity on this; I was wondering if there's been any progress with the patch and the approaches discussed above.

@tabatkins
Copy link
Contributor Author

It's remained on my list of things to wrap up "real soon now". :/

@tabatkins
Copy link
Contributor Author

tabatkins commented Dec 13, 2022

Okay, picking this back up now. I've gone with Option 1 from my comment - lists and maps can have holes, which are invisible to all algorithms unless explicitly specified. This is done in a somewhat handwavey style that is hopefully okay, but if we want to be more precise I can add an explicit "with holes" argument to every algo, defaulting to false, and pass that around appropriately.

Big thing left is to just decide which of the list iteration behaviors to steal from JS: Array.forEach records the length of the list at the start and stops at that length, even if items were added mid-iteration; Array iteration recalculates the length on each iteration to check if it should stop, so items added mid-iteration will be visited as well. JS Maps do the latter behavior in both instances, so I'm currently leaning towards (and the spec text reflects) doing that for lists as well.

(And I know that my wrapping isn't correct right now; I hate length-based wrapping so I'm leaving it to be done immediately prior to merging.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants