8311943: Cleanup usages of toLowerCase() and toUpperCase() in java.base #14763
👋 Welcome back Glavo! A progress list of the required criteria for merging this PR into |
Hello Glavo, I've created https://bugs.openjdk.org/browse/JDK-8311943 to track this change. Please update the title to this PR to |
@@ -628,7 +629,7 @@ else if ('0' <= c && c <= '9') {
                 peekc = c;
                 sval = String.copyValueOf(buf, 0, i);
                 if (forceLower)
-                    sval = sval.toLowerCase();
+                    sval = sval.toLowerCase(Locale.ROOT);
I suspect this change to StreamTokenizer needs more eyes. I think the long-standing behavior of lowerCaseMode(true) has been to use the rules of the default locale, so we need to be careful.
I investigated usage of this method on GitHub:
https://github.com/search?q=%22lowerCaseMode%28true%29%22+language%3AJava&type=code
In some of the use cases I investigated, it seems that no one wants to rely on the default locale.
However, while I think this corrects the behavior, it also changes the behavior of the API, so a CSR may be required. I don't want to debate this in this PR, so I'll revert this change and open a new PR in the future.
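For readers unfamiliar with why the default locale matters here, the following is a minimal sketch of the well-known Turkish case mapping. The class name `TurkishLowerCaseDemo` and the helper `lower` are hypothetical names used only for illustration; they are not part of the PR.

```java
import java.util.Locale;

class TurkishLowerCaseDemo {
    // Lowercases with an explicit locale so the result does not depend on the JVM default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Under Turkish rules, 'I' (U+0049) lowercases to dotless 'ı' (U+0131).
        System.out.println(lower("TITLE", turkish));     // tıtle
        // Locale.ROOT applies the locale-neutral mapping.
        System.out.println(lower("TITLE", Locale.ROOT)); // title
    }
}
```

A no-argument `toLowerCase()` call behaves like the first line whenever the JVM's default locale is Turkish, which is why code that compares keywords or identifiers can break on such systems.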
Maybe a small suggestion to make it clear what's wanted here. In other projects I am involved in (Apache Lucene/Solr, Apache Tika, PostgreSQL JDBC, Checkstyle itself, Elasticsearch/OpenSearch), which use the forbiddenapis Maven/Gradle/Ant plugin, we forbid all calls to several Java APIs (including toLowerCase()/toUpperCase()). Any bytecode that uses them fails the build (FYI, we also disallow other things, such as relying on the default time zone or character set).
To make it clear what is really intended, those projects agreed on writing toLowerCase(Locale.getDefault()), so it is explicit what's wanted.
Without that it could be that somebody else starts the discussion again.
This is just a suggestion to be explicit as it makes maintaining the code easier.
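The convention described above can be sketched as follows. This is an illustration only: the class and method names (`ExplicitLocaleIntent`, `lowerForMachines`, `lowerForDisplay`) are hypothetical. The Javadoc for `String.toLowerCase()` documents it as equivalent to `toLowerCase(Locale.getDefault())`, so the second method changes no behavior; it only makes the choice visible to reviewers and to static checks.

```java
import java.util.Locale;

class ExplicitLocaleIntent {
    // For protocol keywords, identifiers, and other machine-readable text:
    // the result must not vary with the user's environment.
    static String lowerForMachines(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    // For text where the default locale's rules are genuinely wanted.
    // Same result as s.toLowerCase(), but the intent is spelled out.
    static String lowerForDisplay(String s) {
        return s.toLowerCase(Locale.getDefault());
    }

    public static void main(String[] args) {
        System.out.println(lowerForMachines("CONTENT-TYPE")); // content-type
        System.out.println(lowerForDisplay("Hello"));
    }
}
```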
I agree with this.
I'm working on deprecating toLowerCase() and toUpperCase(); this PR is part of that effort. I wish to convert all use cases of them to toLowerCase(Locale) and toUpperCase(Locale).
More backstory is detailed in #13434 (comment).
StreamTokenizer is a very old API, and changing long-standing behavior may break something or be observable with existing code/usages. I see you've reverted this part (thanks), and looking at it separately is fine. It might be that the conclusion is that it's just too risky to change, in which case Uwe's suggestion is good and would avoid it showing up on someone else's radar in the future.
Until we're sure we want to normalize a usage of toLowerCase() to one of toLowerCase(Locale.ROOT) or toLowerCase(Locale.getDefault()), I think it should be left here as-is, thus keeping it in an ambiguous state to remind us to continue discussing it in the future.
If we can't normalize this use case to be locale-independent, then I even think lowerCaseMode should be deprecated, because it's almost impossible for users to get expected behavior with this method.
In order to make it meaningful, I think it is still necessary to consider making it locale insensitive. We can allow users to fall back to the old behavior through new system properties, or introduce new API methods in StreamTokenizer to allow users to set the Locale to be used.
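Independently of how the API question is resolved, a caller who needs locale-independent tokens today can leave `lowerCaseMode` off and lowercase each word explicitly. The sketch below shows one way to do that; `RootLocaleTokenizer` and `lowerCaseWords` are hypothetical names, not an API proposed in this PR.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class RootLocaleTokenizer {
    // Collects word tokens, lowercased with Locale.ROOT rather than via lowerCaseMode(true).
    static List<String> lowerCaseWords(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        st.lowerCaseMode(false); // the default, stated here for clarity
        List<String> words = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_WORD) {
                    words.add(st.sval.toLowerCase(Locale.ROOT));
                }
            }
        } catch (IOException e) {
            // A StringReader is not expected to fail; rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseWords("SELECT Id FROM Items")); // [select, id, from, items]
    }
}
```

This keeps the tokenizer's existing behavior untouched and moves the locale decision into the caller, at the cost of handling only word tokens explicitly.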
Mailing list message from Remi Forax on nio-dev:
One solution is to deprecate String.toLowerCase()/toUpperCase(), forcing users to explicitly use the variants that take a Locale. Rémi
Now that the incompatible change to StreamTokenizer has been dropped from this change, I assume the rest can be reviewed.
I updated this PR to resolve the merge conflict. Now it is waiting to be reviewed again.
Can someone review this PR?
LGTM. Thanks for the changes.
@Glavo This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 22 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@naotoj) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
/integrate
/sponsor
Going to push as commit b32d641.
Your commit was automatically rebased without conflicts.
Clean up misuses of toLowerCase()/toUpperCase() in java.base.
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14763/head:pull/14763
$ git checkout pull/14763
Update a local copy of the PR:
$ git checkout pull/14763
$ git pull https://git.openjdk.org/jdk.git pull/14763/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 14763
View PR using the GUI difftool:
$ git pr show -t 14763
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14763.diff