8354908: javac mishandles supplementary character in character literal #24964
👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into |
|
@lahodaj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 252 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
vicente-romero-oracle
left a comment
lgtm
illegal reference to static field from initializer

compiler.err.illegal.char.literal.multiple.surrogates=\
    character literal contains more than one UTF-16 code point
The error message is kind of vague. How about using "surrogate code point"?
https://www.unicode.org/glossary/#surrogate_code_point
Or "UTF-16 code unit"?
|
/integrate
|
Going to push as commit 03dca03.
Your commit was automatically rebased without conflicts.
Some Unicode characters (supplementary characters) are encoded in UTF-16 as two surrogates, i.e. two chars. Such characters cannot appear in a char literal, because a single char cannot represent them. However, javac currently accepts source code containing such literals and stores only the first char, the high surrogate, in the literal, silently ignoring the second one. JDK 24 exhibits this behavior.
In JDK 11, by contrast, such literals were rejected.
This PR proposes to check for this case explicitly when scanning a character literal and to report an explicit error when a multi-surrogate character is used.
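The sketch below is not the javac change or its output. It only illustrates, with the standard java.lang.Character API, why a supplementary character cannot fit in a char literal and what a check of this kind might look like. The class and the helpers isMultiSurrogateLiteralBody and firstCharOnly are hypothetical names made up for this illustration, and U+1F600 is just an arbitrary supplementary code point.

```java
class SupplementaryCharLiteralSketch {

    // Hypothetical helper (not javac's code): true when the text between the
    // quotes of a character literal is one supplementary character, i.e. a
    // high surrogate immediately followed by a low surrogate.
    static boolean isMultiSurrogateLiteralBody(String body) {
        return body.length() == 2
                && Character.isSurrogatePair(body.charAt(0), body.charAt(1));
    }

    // Mimics the lossy behavior described in the PR text: keep only the first char.
    static char firstCharOnly(String body) {
        return body.charAt(0);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane (assumed example value).
        String body = new String(Character.toChars(0x1F600));

        System.out.println(body.length());                                   // 2 chars
        System.out.println(body.codePointCount(0, body.length()));           // 1 code point
        System.out.println(Character.isHighSurrogate(firstCharOnly(body)));  // true
        System.out.println(isMultiSurrogateLiteralBody(body));               // true
        System.out.println(isMultiSurrogateLiteralBody("a"));                // false
    }
}
```

Keeping only the first char leaves an unpaired high surrogate, which is not a valid character on its own. Rejecting the literal outright makes that loss visible instead of silent.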
Progress
Warning
8354908: javac mishandles supplementary character in character literal
Issues
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24964/head:pull/24964
$ git checkout pull/24964
Update a local copy of the PR:
$ git checkout pull/24964
$ git pull https://git.openjdk.org/jdk.git pull/24964/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24964
View PR using the GUI difftool:
$ git pr show -t 24964
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24964.diff
Using Webrev
Link to Webrev Comment