Parsing of /a{5}/u with the grammar and parsing rule in section B.1.4 #527

Closed
nhahtdh opened this Issue Apr 7, 2016 · 2 comments

nhahtdh commented Apr 7, 2016

According to B.1.4:

When parsing using the following grammar, each alternative is considered only if previous production alternatives do not match.

However, with the regex /a{5}/u, the expansion:

Term[U] ::
    [~U] ExtendedTerm
    [+U] Assertion[U]
    [+U] Atom[U]
    [+U] Atom[U] Quantifier

will try the alternative [+U] Atom[U] before [+U] Atom[U] Quantifier. As a result, in Unicode mode the regex above would be parsed as the literal character sequence "a{5}" instead of the character "a" repeated 5 times.

Is this a correct understanding of the spec? Or does the clause "This alternative pattern grammar and semantics only changes the syntax and semantics of BMP patterns." mean that Unicode mode parsing should not follow the parsing rule above?
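For reference, the two readings in question can be told apart empirically. The snippet below is only a quick check, assuming an engine that supports the u flag (ES2015 or later). It shows what an implementation does, not what the grammar text in B.1.4 literally requires. The variable names are illustrative.

```javascript
// Assumes a JavaScript engine with support for the u flag (ES2015+).
const re = /a{5}/u;

// Quantifier reading: "a" repeated exactly five times.
const matchesRepeated = re.test("aaaaa");

// Literal reading: the four characters a, {, 5, }.
const matchesLiteral = re.test("a{5}");

console.log({ matchesRepeated, matchesLiteral });
```

If the engine uses the quantifier reading, only the first test succeeds, since the string "a{5}" contains a single "a".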

Member

littledan commented Apr 17, 2017

In Unicode mode, even with Annex B applied, { is not an Atom, because it is not in the set "SourceCharacter but not SyntaxCharacter". Therefore, the fourth production is the only one that applies. Note that the part of the RegExp grammar that applies to Unicode RegExps is supposed to be unambiguous (just like the main grammar), so the ordering does not matter.
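The point that { is a SyntaxCharacter in Unicode mode can be observed with a small sketch like the one below. The helper name compiles is hypothetical, introduced only for this illustration. The non-Unicode result assumes an engine that implements the Annex B grammar, as browsers and Node.js do.

```javascript
// Hypothetical helper: reports whether a pattern compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    // Only swallow syntax errors; anything else is unexpected.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// With the u flag, a lone "{" cannot be matched as a pattern character,
// so this is expected to be rejected as a syntax error.
const loneBraceUnicode = compiles("a{", "u");

// Without the u flag, Annex B lets "{" fall back to a literal character
// (assumption: the engine implements Annex B).
const loneBraceLegacy = compiles("a{", "");

// Escaping the braces is the way to match them literally in Unicode mode.
const escapedBraces = /a\{5\}/u.test("a{5}");

console.log({ loneBraceUnicode, loneBraceLegacy, escapedBraces });
```

Since the brace cannot be consumed as an Atom under the u flag, a{5} has only one possible parse there: Atom followed by Quantifier.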

littledan closed this Apr 17, 2017

nhahtdh commented Aug 15, 2017

See comment on issue #526. Anyway, it only affects ECMAScript 6.0 and has been fixed in ECMAScript 7.0.
