JS: Add ECMAScript 2024 v Flag Operators for Regex Parsing #18899
PR Overview
This pull request introduces support for ECMAScript 2024 regex constructs under the new "v" flag. Key changes include:
- New AST node classes for character class operations (Subtraction, QuotedString, Intersection, Union)
- Enhancements to RegExpParser to conditionally enable nested character classes, new operators, and quoted string parsing with a fallback mechanism when errors are encountered
- New test inputs covering quoted strings, unions, intersections, subtractions, and nested character classes
Reviewed Changes
File | Description |
---|---|
javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassSubtraction.java | New AST node for subtraction operator in character classes |
javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassQuotedString.java | New AST node for handling quoted string escapes |
javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassIntersection.java | New AST node for intersection operator in character classes |
javascript/extractor/src/com/semmle/js/ast/regexp/CharacterClassUnion.java | New AST node for union operator in character classes |
javascript/extractor/src/com/semmle/js/parser/RegExpParser.java | Extended parser functionality to support the new "v" flag and corresponding regex operations |
javascript/extractor/src/com/semmle/js/extractor/ASTExtractor.java and RegExpExtractor.java | Updated extraction logic to accommodate new AST node types and conditional flag handling |
Copilot reviewed 31 out of 31 changed files in this pull request and generated 2 comments.
Excellent work! I have a couple of comments to keep you busy during the week 😄
Co-authored-by: Asgerf <asgerf@github.com>
Nice work 👍
I didn't look through it thoroughly, I assume Asger did that.
Did you run database creation on the latest main of https://github.com/babel/babel and https://github.com/tc39/test262?
Those projects contain all kinds of valid and invalid syntax, so it's a nice test of whether something is horribly wrong.
For The |
No, that is not expected. However, that seems to be unrelated to this PR.
Co-authored-by: Erik Krogh Kristensen <erik-krogh@github.com>
🎉
Co-authored-by: Asger F <asgerf@github.com>
This pull request adds support for parsing ECMAScript 2024 v flag operators, including:
- Nested character classes. Example: /[[abc][cz]]/v
- Intersection (&&): Matches characters common to both sets. Example: /[[abc]&&[cz]]/v
- Subtraction (--): Removes characters from a set. Example: /[[abc]--[cz]]/v
  Mixing operations at the same level is not allowed:
  /[[abc]&&[cz]--[zz]]/v (not allowed: && and -- appear at the same level)
  /[[abc]&&[[cz]--[zz]]]/v (allowed: the subtraction is nested in its own class)
- Union. Example: /[[abc][cz]]/v
- Quoted strings (\q{}): Allows matching exact sequences. Example: /[\q{ab|cb|db}]/v
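To make the set semantics of the examples above concrete, here is a minimal Java sketch that models each character class as a set of characters and applies union, intersection, and subtraction to [abc] and [cz]. It is illustrative only: the class name VFlagSetSemantics and its helpers are hypothetical and are not part of the extractor, which parses these constructs into AST nodes rather than evaluating them.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of v-flag set semantics; not code from this PR.
public class VFlagSetSemantics {
    // Build a sorted set from the literal characters of a class such as [abc].
    static Set<Character> chars(String literals) {
        Set<Character> out = new TreeSet<>();
        for (char c : literals.toCharArray()) {
            out.add(c);
        }
        return out;
    }

    // [[abc][cz]]: characters in either class.
    static Set<Character> union(Set<Character> a, Set<Character> b) {
        Set<Character> out = new TreeSet<>(a);
        out.addAll(b);
        return out;
    }

    // [[abc]&&[cz]]: characters common to both classes.
    static Set<Character> intersection(Set<Character> a, Set<Character> b) {
        Set<Character> out = new TreeSet<>(a);
        out.retainAll(b);
        return out;
    }

    // [[abc]--[cz]]: characters of the first class that are not in the second.
    static Set<Character> subtraction(Set<Character> a, Set<Character> b) {
        Set<Character> out = new TreeSet<>(a);
        out.removeAll(b);
        return out;
    }

    public static void main(String[] args) {
        Set<Character> abc = chars("abc");
        Set<Character> cz = chars("cz");
        System.out.println(union(abc, cz));        // [a, b, c, z]
        System.out.println(intersection(abc, cz)); // [c]
        System.out.println(subtraction(abc, cz));  // [a, b]
    }
}
```

Note that [cz] here is a class of the two literal characters c and z, not a range.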
Commit by commit review encouraged.
Useful links:
With correct parsing, this no longer produces a false positive. Closes #18854.
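The restriction that && and -- cannot be mixed at the same nesting level can be checked with a simple depth-tracking scan. The sketch below is a simplified, hypothetical check (the class MixedOperatorCheck and the method mixesOperators are invented for illustration and do not reflect how RegExpParser implements this). It assumes the input is the body of the outermost character class, with the enclosing brackets already stripped.

```java
// Hypothetical illustration; not the parser logic from this PR.
public class MixedOperatorCheck {
    // Returns true if both && and -- occur at the top level of the class body.
    static boolean mixesOperators(String body) {
        int depth = 0;
        boolean sawIntersection = false;
        boolean sawSubtraction = false;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character (simplification)
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            } else if (depth == 0 && i + 1 < body.length() && body.charAt(i + 1) == c) {
                // A doubled character outside any nested class may be an operator.
                if (c == '&') {
                    sawIntersection = true;
                    i++;
                } else if (c == '-') {
                    sawSubtraction = true;
                    i++;
                }
            }
        }
        return sawIntersection && sawSubtraction;
    }

    public static void main(String[] args) {
        // Body of /[[abc]&&[cz]--[zz]]/v: both operators at the same level.
        System.out.println(mixesOperators("[abc]&&[cz]--[zz]"));   // true
        // Body of /[[abc]&&[[cz]--[zz]]]/v: the subtraction is nested one level deeper.
        System.out.println(mixesOperators("[abc]&&[[cz]--[zz]]")); // false
    }
}
```

A real parser would report the error at the position of the second operator instead of returning a boolean; the scan above only shows why nesting the second operation in its own class resolves the conflict.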