The modified grammar in B.1.4 actually changes the behavior in Unicode mode #526

Closed
nhahtdh opened this Issue Apr 7, 2016 · 4 comments

nhahtdh commented Apr 7, 2016

According to B.1.4:

This alternative pattern grammar and semantics only changes the syntax and semantics of BMP patterns.

Suppose we parse a Unicode-mode pattern using the grammar in B.1.4, under its parsing rule:

When parsing using the following grammar, each alternative is considered only if previous production alternatives do not match.

Starting from Term[U], we skip [~U] ExtendedTerm and then try, in order, [+U] Assertion[U], [+U] Atom[U], and [+U] Atom[U] Quantifier. Going into [+U] Atom[U], we reach Atom[U] :: PatternCharacter, where PatternCharacter allows an unescaped { or } to be treated as a literal character.

However, according to the grammar in 21.2.1, a lone, unescaped { or } can never be parsed as an Atom, in either Unicode or BMP mode.

So the statement quoted above seems to contradict the grammar: the grammar in B.1.4 is actually extended to allow { and } to be parsed as an Atom in both BMP and Unicode modes.

Which one is supposed to be correct? The statement or the grammar?
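For reference, here is a minimal sketch of how one might probe what a given engine does with lone braces. It shows observed engine behavior only, not what the ES6 text says. The helper name `compiles` and the `results` object are hypothetical, introduced only for this illustration.

```javascript
// Hypothetical helper: returns true if `source` compiles as a RegExp
// pattern under the given flags, false if the engine reports a SyntaxError.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected, so surface it
  }
}

// Probe lone and escaped braces with and without the "u" flag.
const results = {
  loneOpenBmp: compiles("{", ""),
  loneOpenUnicode: compiles("{", "u"),
  loneCloseBmp: compiles("}", ""),
  loneCloseUnicode: compiles("}", "u"),
  escapedOpenUnicode: compiles("\\{", "u"),
};

console.log(results);
```

In an engine that implements the Annex B extensions and follows the statement quoted from B.1.4, one would expect the two BMP entries and the escaped entry to be `true` and the two lone-brace Unicode entries to be `false`.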

@nhahtdh nhahtdh changed the title from The modified grammar in B.1.4 seems to change the behavior in Unicode mode to The modified grammar in B.1.4 actually changes the behavior in Unicode mode Apr 7, 2016

littledan commented Apr 17, 2017

Where do you see the interpretation that { is extended in Unicode mode? As far as I can see, ExtendedPatternCharacter (which adds {) is referenced only from ExtendedAtom, which is reachable from Term only if the [U] flag is off.

nhahtdh commented Aug 15, 2017

@littledan If you follow my link above, the following productions lead to my conclusion:

With +U set, go to Atom[U] in this rule (Assertion[U] has to be tried first, but we know that it will fail):

Term[U] ::
    [~U] ExtendedTerm
    [+U] Assertion[U]
    [+U] Atom[U]
    [+U] Atom[U] Quantifier

In Atom[U], enter PatternCharacter:

Atom[U] ::
    PatternCharacter
    .
    \ AtomEscape[?U]
    CharacterClass[?U]
    ( Disjunction[?U] )
    ( ? : Disjunction[?U] )

PatternCharacter actually allows braces:

PatternCharacter ::
    SourceCharacter but not one of
        ^ $ \ . * + ? ( ) [ ] |

Anyway, this issue only affects the ECMAScript 6.0 spec. I don't see it in the 7.0 spec.
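The walk above comes down to comparing two exclusion lists, which can be sketched as a small set-membership check. This is only a toy model of the PatternCharacter step, not a regular expression parser. The B.1.4 list is copied from the production quoted in this comment. The 21.2.1 list is an assumption based on the ES6 SyntaxCharacter production. All identifiers below are hypothetical.

```javascript
// Assumption: ES6 21.2.1 defines PatternCharacter as
// "SourceCharacter but not SyntaxCharacter", with this SyntaxCharacter list.
const MAIN_GRAMMAR_EXCLUDED = new Set([..."^$\\.*+?()[]{}|"]);

// Exclusion list of the B.1.4 PatternCharacter production as quoted above.
const ANNEX_B_EXCLUDED = new Set([..."^$\\.*+?()[]|"]);

// A single character is a PatternCharacter when it is not in the
// exclusion list of the grammar being modeled.
function isPatternCharacter(ch, excluded) {
  return !excluded.has(ch);
}

// Compare how each grammar treats an unescaped brace.
const report = ["{", "}"].map((ch) => ({
  ch,
  mainGrammar: isPatternCharacter(ch, MAIN_GRAMMAR_EXCLUDED),
  annexB: isPatternCharacter(ch, ANNEX_B_EXCLUDED),
}));

console.log(report);
```

Under these assumptions both braces are rejected by the 21.2.1 list and accepted by the B.1.4 list, and since Atom[U] :: PatternCharacter carries no [~U] guard, the difference applies with +U as well.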

littledan commented Aug 15, 2017

If you're filing issues, please refer to the current draft spec, which you can find at https://tc39.github.io/ecma262/ . We don't issue errata for older, historical releases, so you're likely to find fixed bugs there.

nhahtdh commented Aug 16, 2017

I guess this issue is no longer relevant then. Closed.

@nhahtdh nhahtdh closed this Aug 16, 2017