Editorial? reparsing the source text instead of the lexical token sequence #962

Closed
jmdyck opened this Issue Jul 29, 2017 · 2 comments

Comments

Collaborator

jmdyck commented Jul 29, 2017

This has to do with the distinction between "lexical token sequence/stream" (which I'll abbreviate as LTS) and "source text".

When the spec "refines the interpretation" of |X| using supplemental nonterminal |Y|, there are typically:

  • Early Error rule(s) that raise a Syntax Error if the LTS matched by |X| cannot be parsed using |Y| as the goal symbol, and
  • algorithm steps that get the result of parsing the LTS matched by |X| using |Y| as the goal symbol.

That's true for all these cases:

  • CoverParenthesizedExpressionAndArrowParameterList reparsed as ParenthesizedExpression
  • CoverParenthesizedExpressionAndArrowParameterList reparsed as ArrowFormalParameters
  • CoverCallExpressionAndAsyncArrowHead reparsed as CallMemberExpression
  • CoverCallExpressionAndAsyncArrowHead reparsed as AsyncArrowHead
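
For example (a minimal sketch with invented identifiers, not taken from the spec), the first two cases refine the same parenthesized token sequence differently depending on whether => follows, and the last two do the same for an async call head:

```js
let a = 1, b = 2;

// CoverParenthesizedExpressionAndArrowParameterList refined as
// ParenthesizedExpression: `(a, b)` is a parenthesized comma expression.
let x = (a, b);

// The same cover production refined as ArrowFormalParameters:
// `(a, b)` is now an arrow function's parameter list.
let f = (a, b) => a + b;

// CoverCallExpressionAndAsyncArrowHead refined as CallMemberExpression
// (an ordinary call to a function that happens to be named `async`)
// versus refined as AsyncArrowHead (an async arrow function).
function async(v) { return v; }
let y = async (a);
let g = async (a) => a;
```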

However, things are different for the case of

  • LeftHandSideExpression (or DestructuringAssignmentTarget) reparsed as AssignmentPattern

The relevant EE rules use the LTS formulation, but the relevant algorithm steps don't. Instead, they're interested in the result of parsing the source text corresponding to |X| using |Y| as the goal symbol.
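
Concretely (again an invented example): in a destructuring assignment, the left-hand side is first matched as an ordinary ArrayLiteral or ObjectLiteral via LeftHandSideExpression, and the assignment semantics then reparse it as an AssignmentPattern. It is this reparse that the spec phrases in terms of the source text corresponding to |X| rather than the LTS matched by |X|.

```js
let a, b, rest;

// The left-hand side `[a, b, ...rest]` initially matches ArrayLiteral
// (via LeftHandSideExpression); the destructuring-assignment semantics
// then reparse it as an ArrayAssignmentPattern.
[a, b, ...rest] = [1, 2, 3, 4];

// Likewise, `{ a, b }` first matches ObjectLiteral and is then
// reparsed as an ObjectAssignmentPattern.
({ a, b } = { a: 10, b: 20 });
```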

  1. Does anyone know why this difference occurs? @allenwb? (My guess is that it's unintentional. Note that the AssignmentPattern case was the first to be introduced, and is the only one in ES5. There, both the EE rules and algo steps used the "source text" formulation. Then in the 10th draft of ES6, the EE rules were changed to LTS, but not the algo steps. I can't find a reason for the change, let alone a reason for not applying it to the algo steps.)

  2. More importantly, is it a normative difference? That is, if we replaced the "source text" formulation with the LTS formulation, would it have a normative effect? In the algo steps, we know that the LTS matched by |X| can be parsed as |Y| (otherwise one of the EE rules would have raised a Syntax Error, preventing us from getting this far). Given that, is it possible that parsing the source text corresponding to |X| can fail, or yield a different parse than would parsing the LTS matched by |X|? I don't think so, but I'm not sure. The failure case would be a spec bug (because the algo steps don't allow for any failure), but the 'different parse' case might or might not be. Can anyone suggest a test case? (One candidate is sketched after this list.)

  3. If it does make a normative difference, what do implementations do?
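
Regarding the test case asked for in question 2 (a candidate of my own, and as far as I can tell not an actual counterexample): the obvious thing to probe is the slash character, since it is the token whose lexing is context-dependent.

```js
let x, y = 10, z = 2, arr = [];

// `[x = y / z]` is first parsed as an ArrayLiteral, with the `/` lexed
// as the division punctuator. Reparsing as an ArrayAssignmentPattern,
// whether from the token sequence or from the corresponding source
// text, leaves the `/` in the same syntactic position, so it lexes the
// same way and the two formulations appear to agree here.
[x = y / z] = arr;   // arr[0] is undefined, so x gets the default y / z, i.e. 5
```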

Member

allenwb commented Jul 29, 2017

Does anyone know why this difference occurs? @allenwb? (My guess is that it's unintentional.

I'm pretty sure it was unintentional.

More importantly, is it a normative difference? That is, if we replaced the "source text" formulation with the LTS formulation, would it have a normative effect?

It could, if the LTS contained any tokens whose selection was contextually dependent upon the syntactic context (e.g., / as the div operator or as the first character of a regexp literal). But in practice, I don't think that is a problem for these cover grammars, because the reparses won't be different enough to change the tokenization.
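
As a concrete illustration of that context sensitivity (my own sketch): the lexical grammar has distinct goal symbols, InputElementDiv and InputElementRegExp, so the same slash character can be either the division punctuator or the start of a RegularExpressionLiteral:

```js
let b = 6, c = 3, d = 2;

// In a context where a division is expected, the lexical goal symbol
// is InputElementDiv, so `/` is the division punctuator:
let quotient = b / c / d;   // 1

// Where a RegularExpressionLiteral is permitted, the goal symbol is
// InputElementRegExp, so the same character begins a regexp literal:
let pattern = /c \/ d/;
```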

Finally, I believe I changed the spec text to use token stream because its exact meaning was more self-descriptive than "the source text corresponding to ...". It saved me having to find a way (and place) to precisely define the meaning of the "source text" phrasing.

jmdyck added a commit to jmdyck/ecma262 that referenced this issue Aug 10, 2017

Editorial: eliminate "the source text corresponding to |X|"
Change it to "the lexical token sequence matched by |X|".

See tc39#962

bterlson pushed a commit that referenced this issue Aug 21, 2017

Editorial: eliminate "the source text corresponding to |X|"
Change it to "the lexical token sequence matched by |X|".

See #962

Collaborator

jmdyck commented Aug 22, 2017

Closing this because bterlson has merged my commit that assumes the suggested change is editorial.

jmdyck closed this Aug 22, 2017
