The modified grammar in B.1.4 actually changes the behavior in Unicode mode #526
Comments
nhahtdh
changed the title from
The modified grammar in B.1.4 seems to change the behavior in Unicode mode
to
The modified grammar in B.1.4 actually changes the behavior in Unicode mode
Apr 7, 2016
littledan
Apr 17, 2017
Member
Where do you see the interpretation that { is extended in Unicode mode? I see ExtendedPatternCharacter (which adds {) added only from ExtendedAtom, which is included in Term only if the [U] flag is off.
nhahtdh
Aug 15, 2017
@littledan If you follow my link above, the following production leads to my conclusion:
Set +U. Go to Atom[U] in this rule (we need to try Assertion[U] first, but we know that it will fail)
Term[U] ::
[~U] ExtendedTerm
[+U] Assertion[U]
[+U] Atom[U]
[+U] Atom[U] Quantifier
In Atom[U], enter PatternCharacter
Atom[U] ::
PatternCharacter
.
\ AtomEscape[?U]
CharacterClass[?U]
( Disjunction[?U] )
( ? : Disjunction[?U] )
PatternCharacter actually allows braces:
PatternCharacter ::
SourceCharacter but not one of
^ $ \ . * + ? ( ) [ ] |
Anyway, this issue only affects the ECMAScript 6.0 spec. I don't see the issue in the 7.0 spec.
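To make the last step of the derivation concrete, here is a minimal sketch of the PatternCharacter production exactly as quoted in this comment. The names `EXCLUDED` and `isQuotedPatternCharacter` are made up for illustration and do not come from the spec; the sketch only models the "but not one of" list above, not the rest of the grammar.

```typescript
// Characters excluded by the PatternCharacter production quoted above:
// ^ $ \ . * + ? ( ) [ ] |
const EXCLUDED: Set<string> = new Set([
  "^", "$", "\\", ".", "*", "+", "?", "(", ")", "[", "]", "|",
]);

// Hypothetical helper: true if `ch` would be accepted as a PatternCharacter
// under the quoted production (any source character not in the list).
function isQuotedPatternCharacter(ch: string): boolean {
  return !EXCLUDED.has(ch);
}

console.log(isQuotedPatternCharacter("{")); // true: braces are not in the list
console.log(isQuotedPatternCharacter("}")); // true
console.log(isQuotedPatternCharacter("(")); // false: parentheses are excluded
```

Since `{` and `}` are absent from the exclusion list, reaching PatternCharacter through Atom[U] with +U set would accept them as literals, which is the conclusion drawn above.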
nhahtdh
referenced this issue
Aug 15, 2017
Closed
Parsing of /a{5}/u with the grammar and parsing rule in section B.1.4 #527
littledan
Aug 15, 2017
Member
If you're filing issues, please refer to the current draft spec, which you can find at https://tc39.github.io/ecma262/ . We don't issue errata for older, historical releases, so you're likely to find fixed bugs there.
nhahtdh
Aug 16, 2017
I guess this issue is no longer relevant then. Closed.
nhahtdh commented Apr 7, 2016
According to B.1.4:
Let's say we have a pattern in Unicode mode, according to the grammar in B.1.4, with the parsing rule:
starting from Term[U], we ignore [~U] ExtendedTerm, then try in order [+U] Assertion[U], [+U] Atom[U], [+U] Atom[U] Quantifier. Going into [+U] Atom[U], we encounter Atom[U] :: PatternCharacter, in which PatternCharacter allows free { and } to be treated as literal characters.

However, according to the grammar in 21.2.1, lone { and } without escape can never be parsed as an Atom, in both Unicode and BMP modes.

As we can see here, the statement above seems to contradict the grammar. The grammar in B.1.4 is actually extended to allow { and } to be parsed as Atom in both BMP and Unicode mode.

Which one is supposed to be correct? The statement or the grammar?
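One way to check how an engine actually treats these patterns is to try compiling them, as in the sketch below. The helper `compiles` is hypothetical. The result reflects whatever spec revision the engine implements (most likely the current draft, not the ES6 text discussed here), and the non-Unicode case assumes the engine implements the Annex B extensions.

```typescript
// Hypothetical helper: reports whether the engine accepts `source` as a
// regular expression with the given flags, instead of throwing.
function compiles(source: string, flags: string): boolean {
  try {
    new RegExp(source, flags);
    return true;
  } catch (_err) {
    // An invalid pattern makes the RegExp constructor throw a SyntaxError.
    return false;
  }
}

console.log("a{   (no flags):", compiles("a{", ""));    // lone brace, BMP mode
console.log("a{   (u flag):  ", compiles("a{", "u"));   // lone brace, Unicode mode
console.log("a\\{  (u flag):  ", compiles("a\\{", "u")); // escaped brace
console.log("a{5} (u flag):  ", compiles("a{5}", "u")); // ordinary quantifier
```

In an engine that follows the current draft, the lone brace should be rejected only with the u flag, which matches the statement rather than the ES6 B.1.4 grammar as read above.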