Quoted codepoint is not matched while unquoted is matched #123
Comments
sjamesr
added a commit
to sjamesr/re2j
that referenced
this issue
Jun 2, 2021
Previously, the parser would match each individual character within a \Q...\E section. Runes requiring a surrogate pair would be incorrectly treated as two individual characters. E.g.

    String source = new StringBuilder().appendCodePoint(110781).toString();

Before this change:

    Parser.parse(source, ...) matches \x{1b0bd}
    Parser.parse("\\Q" + source + "\\E", ...) matches \x{d82c}\x{dcbd}

After this change:

    Parser.parse(source, ...) matches \x{1b0bd}
    Parser.parse("\\Q" + source + "\\E", ...) matches \x{1b0bd}

Fixes google#123.
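The difference described in the commit message can be reproduced with plain Java strings, without re2j. The following is only a sketch under stated assumptions: the class `QuotedLiteralSketch` and both helper methods are hypothetical names made up for illustration and are not part of re2j's `Parser`. It contrasts walking a literal one UTF-16 `char` at a time (which splits a surrogate pair) with walking it one code point at a time.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedLiteralSketch {
    // Hypothetical helper (not re2j code): walks the literal one UTF-16 char
    // at a time, so a supplementary code point yields two separate entries.
    public static List<String> splitByChar(String literal) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < literal.length(); i++) {
            out.add(String.format("\\x{%x}", (int) literal.charAt(i)));
        }
        return out;
    }

    // Hypothetical helper (not re2j code): walks the literal one code point
    // at a time, so a surrogate pair is kept together as a single rune.
    public static List<String> splitByCodePoint(String literal) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < literal.length()) {
            int rune = literal.codePointAt(i);
            out.add(String.format("\\x{%x}", rune));
            i += Character.charCount(rune); // advances by 2 for a surrogate pair
        }
        return out;
    }

    public static void main(String[] args) {
        // Same code point as in the commit message: 110781 == 0x1b0bd.
        String source = new StringBuilder().appendCodePoint(110781).toString();
        System.out.println(splitByChar(source));      // [\x{d82c}, \x{dcbd}]
        System.out.println(splitByCodePoint(source)); // [\x{1b0bd}]
    }
}
```

The per-`char` walk produces the same two values the commit message reports for the quoted case before the change, and the per-code-point walk produces the single rune reported after it.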
Thank you for the report. I captured the issue in a unit test and added a potential fix. @adonovan, if you could cast a quick glance over the fix to see if it's right, that would be great.
sjamesr
added a commit
that referenced
this issue
Jun 2, 2021
Previously, the parser would match each individual character within a \Q...\E section. Runes requiring a surrogate pair would be incorrectly treated as two individual characters. E.g.

    String source = new StringBuilder().appendCodePoint(110781).toString();

Before this change:

    Parser.parse(source, ...) matches \x{1b0bd}
    Parser.parse("\\Q" + source + "\\E", ...) matches \x{d82c}\x{dcbd}

After this change:

    Parser.parse(source, ...) matches \x{1b0bd}
    Parser.parse("\\Q" + source + "\\E", ...) matches \x{1b0bd}

Fixes #123.
I am using re2j (thanks for the library!) and use randomly generated strings to test that the patterns and logic I wrote work correctly. I recently found a weird case. I am not sure whether it is a bug, but it feels like one because the behavior of Go's regexp package (also RE2, right?) is different.
Example
(I hope I did that rune-to-string conversion right.)
(link: https://play.golang.org/p/EPbFTmzsZm4)