IdentifierPart is ambiguous re '_' ? #1059
mathiasbynens
Jan 5, 2018
Member
This is correct:
/\p{ID_Continue}/u.test('_');
// → true
We can just drop the _ from IdentifierPart.
 spec.html | 1 -
 1 file changed, 1 deletion(-)
diff --git a/spec.html b/spec.html
index b552ed8..6b5339e 100644
--- a/spec.html
+++ b/spec.html
@@ -9813,7 +9813,6 @@
   IdentifierPart ::
     UnicodeIDContinue
     `$`
-    `_`
     `\` UnicodeEscapeSequence
     <ZWNJ>
     <ZWJ>
nicolo-ribaudo
Jan 5, 2018
In that case please add a note: it would be confusing for a non-expert reader to see _ in IdentifierStart but not in IdentifierPart.
mathiasbynens
Jan 5, 2018
Member
We could also make IdentifierPart consume IdentifierStart to avoid the duplication. IdentifierStart is guaranteed to be a subset of IdentifierPart because ID_Start is guaranteed to be a subset of ID_Continue.
 spec.html | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/spec.html b/spec.html
index b552ed8..a64ca7c 100644
--- a/spec.html
+++ b/spec.html
@@ -9811,10 +9811,8 @@
     `\` UnicodeEscapeSequence
   IdentifierPart ::
+    IdentifierStart
     UnicodeIDContinue
-    `$`
-    `_`
-    `\` UnicodeEscapeSequence
     <ZWNJ>
     <ZWJ>
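The subset claim for the two Unicode properties can be checked empirically. The following is a minimal sketch, not part of the proposed patch: it scans every code point and collects any that have ID_Start but lack ID_Continue. The function name `findIDStartNotIDContinue` is made up for illustration, and the result only reflects the Unicode version shipped with the engine that runs it.

```javascript
// Property escapes need the `u` flag (ES2018+).
const idStart = /^\p{ID_Start}$/u;
const idContinue = /^\p{ID_Continue}$/u;

// Hypothetical helper: returns every code point that is ID_Start
// but not ID_Continue. An empty array supports the subset claim.
function findIDStartNotIDContinue() {
  const counterexamples = [];
  for (let cp = 0; cp <= 0x10FFFF; cp++) {
    const ch = String.fromCodePoint(cp);
    if (idStart.test(ch) && !idContinue.test(ch)) {
      counterexamples.push(cp);
    }
  }
  return counterexamples;
}

console.log(findIDStartNotIDContinue().length);
```

Note that this only covers the UnicodeIDStart part of IdentifierStart; the `$`, `_`, and escape-sequence alternatives are covered by the grammar change itself, since the patched IdentifierPart refers to IdentifierStart directly.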
A commit added to jmdyck/ecma262 referenced this issue (Jan 5, 2018).
jmdyck
Jan 5, 2018
Collaborator
IdentifierPart ::
IdentifierStart
UnicodeIDContinue
<ZWNJ>
<ZWJ>
With that, IdentifierPart would be ambiguous on all the characters in the intersection of IdentifierStart and UnicodeIDContinue (i.e., _ and \p{ID_Start}).
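To make the objection concrete, here is a rough sketch that lists which alternatives of the proposed IdentifierPart match a given character. The helper names (`isIdentifierStart`, `isUnicodeIDContinue`, `derivationsUnderProposal`) are invented for this illustration, and the `\` UnicodeEscapeSequence alternative of IdentifierStart is left out because it matches escaped source text, not a raw character.

```javascript
// Hypothetical model of the proposed grammar:
//   IdentifierPart :: IdentifierStart | UnicodeIDContinue | <ZWNJ> | <ZWJ>
const isIdentifierStart = (ch) => /^[\p{ID_Start}$_]$/u.test(ch);
const isUnicodeIDContinue = (ch) => /^\p{ID_Continue}$/u.test(ch);

// Returns the names of all alternatives that match `ch`.
// More than one name means more than one derivation.
function derivationsUnderProposal(ch) {
  const ways = [];
  if (isIdentifierStart(ch)) ways.push('IdentifierStart');
  if (isUnicodeIDContinue(ch)) ways.push('UnicodeIDContinue');
  if (ch === '\u200C') ways.push('<ZWNJ>');
  if (ch === '\u200D') ways.push('<ZWJ>');
  return ways;
}

for (const ch of ['a', '_', '$', '1']) {
  console.log(JSON.stringify(ch), derivationsUnderProposal(ch));
}
// "a" and "_" each match two alternatives; "$" and "1" match one.
```

So the proposal would widen the ambiguity from the single character `_` to `_` plus every ID_Start character.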
allenwb
Jan 5, 2018
Member
It seems to me that the simplest way to eliminate this ambiguity is:
UnicodeIDContinue ::
any Unicode code point other than U+005F (LOW LINE) with the Unicode property “ID_Continue”
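As a rough illustration of that definition, a regular expression can exclude U+005F with a negative lookahead before testing the property. The constant and function names below are made up for the example; this is one possible way to express the rule in JavaScript, not text from the spec.

```javascript
// Sketch of the suggested UnicodeIDContinue: ID_Continue minus U+005F.
const unicodeIDContinueExceptLowLine = /^(?!_)\p{ID_Continue}$/u;

// Hypothetical model of the unpatched IdentifierPart alternatives, but
// with the narrowed UnicodeIDContinue. The `\` UnicodeEscapeSequence
// alternative is omitted since it does not match a raw character.
function derivationsWithNarrowedIDContinue(ch) {
  const ways = [];
  if (unicodeIDContinueExceptLowLine.test(ch)) ways.push('UnicodeIDContinue');
  if (ch === '$') ways.push('`$`');
  if (ch === '_') ways.push('`_`');
  if (ch === '\u200C') ways.push('<ZWNJ>');
  if (ch === '\u200D') ways.push('<ZWJ>');
  return ways;
}

console.log(derivationsWithNarrowedIDContinue('_')); // only the `_` literal
console.log(derivationsWithNarrowedIDContinue('a')); // only UnicodeIDContinue
// Other Connector_Punctuation characters, such as U+203F UNDERTIE,
// still match UnicodeIDContinue.
console.log(derivationsWithNarrowedIDContinue('\u203F'));
```

With this definition `_` has exactly one derivation, and the grammar keeps `_` visible in both IdentifierStart and IdentifierPart.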
A commit added to jmdyck/ecma262 referenced this issue (Jan 24, 2018).
A commit added to jmdyck/ecma262 referenced this issue (Jan 26, 2018).
A commit added to jmdyck/ecma262 referenced this issue (Jan 26, 2018).
A pushed commit referenced this issue (Feb 13, 2018).
A pushed commit referenced this issue (Feb 13, 2018).
Yup, thanks.
jmdyck commented Jan 4, 2018 (edited)
As I understand it, '_' (U+005F LOW LINE) belongs to Unicode general category 'Pc' (Connector_Punctuation), and so has the 'ID_Continue' property, and so matches the ECMAScript nonterminal UnicodeIDContinue. This means that IdentifierPart derives '_' in two distinct ways (via UnicodeIDContinue and via the '_' literal), and so is technically ambiguous. I don't think this causes any semantic ambiguity (because the spec doesn't much care about how IdentifierPart matches source text), but it's odd.
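The chain of reasoning above can be checked in any engine that supports Unicode property escapes. The snippet below is a minimal sketch: the `identifierPartAlternatives` table is a hypothetical model of the IdentifierPart alternatives as they stood when the issue was filed, with the `\` UnicodeEscapeSequence alternative omitted because it matches escaped source text, not a raw character.

```javascript
const lowLine = '_'; // U+005F LOW LINE

// Property checks; the `u` flag is required for \p{...}.
const isConnectorPunctuation = /^\p{Pc}$/u.test(lowLine);
const isIDContinue = /^\p{ID_Continue}$/u.test(lowLine);
const isIDStart = /^\p{ID_Start}$/u.test(lowLine);

// Hypothetical model of the IdentifierPart alternatives.
const identifierPartAlternatives = [
  ['UnicodeIDContinue', /^\p{ID_Continue}$/u],
  ['`$`', /^\$$/u],
  ['`_`', /^_$/u],
  ['<ZWNJ>', /^\u200C$/u],
  ['<ZWJ>', /^\u200D$/u],
];

// Names of every alternative that matches '_'.
const matchingAlternatives = identifierPartAlternatives
  .filter(([, pattern]) => pattern.test(lowLine))
  .map(([name]) => name);

console.log({ isConnectorPunctuation, isIDContinue, isIDStart });
console.log(matchingAlternatives); // two entries, hence two derivations
```

`_` is not ID_Start, which is why IdentifierStart has to list it explicitly, while in IdentifierPart the explicit literal overlaps with UnicodeIDContinue.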