8307547: Support variant collations #13917

Closed
wants to merge 9 commits into from

Conversation

naotoj
Member

@naotoj naotoj commented May 10, 2023

The fix for https://bugs.openjdk.org/browse/JDK-8306927 switched the default collation for Swedish to the modern one. To give users who still need the old collation a way to get it, this PR makes Collator recognize the "co" Unicode locale extension, so that multiple collation implementations can be provided for a single locale. I would also like reviews for the corresponding CSR.
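
To illustrate the idea of choosing between several collation implementations based on the "co" extension, here is a minimal sketch. It is not the JDK's actual implementation: the class name CoDispatchSketch, the method collatorFor, and both rule strings are hypothetical toy values, not the real Swedish tailorings.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical sketch: pick a collation variant from the "co" Unicode locale extension.
class CoDispatchSketch {
    // Toy rules for illustration only, not the real Swedish tailorings.
    // "<" marks a primary difference, "," a tertiary one.
    static final String MODERN_RULES = "< a < v < w";
    static final String TRAD_RULES = "< a < v , w";

    static Collator collatorFor(Locale locale) {
        // getUnicodeLocaleType returns null when the locale carries no "co" keyword.
        String co = locale.getUnicodeLocaleType("co");
        String rules = "trad".equals(co) ? TRAD_RULES : MODERN_RULES;
        try {
            Collator collator = new RuleBasedCollator(rules);
            // Compare at primary strength so that tertiary differences are ignored.
            collator.setStrength(Collator.PRIMARY);
            return collator;
        } catch (ParseException e) {
            throw new IllegalStateException("invalid toy rules", e);
        }
    }

    public static void main(String[] args) {
        Collator modern = collatorFor(Locale.forLanguageTag("sv"));
        Collator trad = collatorFor(Locale.forLanguageTag("sv-u-co-trad"));
        System.out.println("modern v/w: " + modern.compare("v", "w"));
        System.out.println("trad   v/w: " + trad.compare("v", "w"));
    }
}
```

With these toy rules, the variant selected by "co-trad" treats 'v' and 'w' as equal at primary strength, while the variant used for a locale without the extension keeps them distinct.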


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Change requires CSR request JDK-8307565 to be approved
  • Commit message must refer to an issue

Issues

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13917/head:pull/13917
$ git checkout pull/13917

Update a local copy of the PR:
$ git checkout pull/13917
$ git pull https://git.openjdk.org/jdk.git pull/13917/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13917

View PR using the GUI difftool:
$ git pr show -t 13917

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13917.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented May 10, 2023

👋 Welcome back naoto! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added csr Pull request needs approved CSR before integration rfr Pull request is ready for review labels May 10, 2023
@openjdk

openjdk bot commented May 10, 2023

@naotoj The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels May 10, 2023
@mlbridge

mlbridge bot commented May 10, 2023

Webrevs

@justin-curtis-lu
Member

LGTM, does it make sense to add some other locale's collation variants?

@naotoj
Member Author

naotoj commented May 10, 2023

Thanks, Justin

does it make sense to add some other locale's collation variants?

Maybe, but I am not seeing any requests for adding more collation variants. This enhancement is mainly for those who need the old Swedish collation as a workaround.

* }
* </blockquote>
* may return a {@code Collator} instance with the traditional sorting, which
* gives 'v' and 'w' the same sorting order.
Contributor

The overall intent and wording look okay, but I wonder if it should be in the getInstance methods instead. I am also wondering whether implNote is right, because it is both informational for developers using the API and useful to implementors.

Member Author

Thanks. Will move it out of the class description into each getInstance() method, changing @implNote to @apiNote.

@naotoj naotoj changed the title 8307547: Support for multiple collations for a locale 8307547: Support variant collations May 11, 2023
@AlanBateman
Contributor

The update, and the API note in Collator.getInstance(Locale), look okay to me. I'm just wondering where a developer might go to get a definitive list, i.e. aside from this API note, how would they know that "-trad" or "-traditional" can be used to configure the ordering? Locale.forLanguageTag supports more than BCP 47 language tag strings, so is this considered a private use language tag?

@naotoj
Member Author

naotoj commented May 12, 2023

I'm just wondering where a developer might go to get a definitive list, i.e. aside from this API note, how would they know that "-trad" or "-traditional" can be used to configure the ordering. Locale.forLanguageTag supports more than BCP 47 language tag strings so is this considered a private use language tag.

I think those should go into Oracle JDK's Supported Locales document. Created a task to include them (https://bugs.openjdk.org/browse/JDK-8308018)

@srl295
Member

srl295 commented May 12, 2023

I'm just wondering where a developer might go to get a definitive list, i.e. aside from this API note, how would they know that "-trad" or "-traditional" can be used to configure the ordering. Locale.forLanguageTag supports more than BCP 47 language tag strings so is this considered a private use language tag.

I think those should go into Oracle JDK's Supported Locales document. Created a task to include them (https://bugs.openjdk.org/browse/JDK-8308018)

-u-co is defined here (UTS#35) and the keys are in the datafile here (link to main branch!) - note the since= attribute, these are very stable.

@AlanBateman

Locale.forLanguageTag supports more than BCP 47 language tag strings

It should still be all valid BCP47 including extensions and private use (such as x-lvalue).

so is this considered a private use language tag

Not private use at all. The -u- subtag is registered, and the links above are from the registrar, see
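
To make the point about the registered -u- extension concrete, here is a small sketch of how java.util.Locale exposes the "co" keyword of a BCP 47 tag. The class name CoExtensionSketch and its two helper methods are hypothetical names used only for illustration; the Locale calls are standard API.

```java
import java.util.Locale;

// Hypothetical sketch: reading and building the "co" Unicode locale keyword.
class CoExtensionSketch {
    // Returns the "co" type of a BCP 47 tag, or null if the tag has none.
    static String collationType(String languageTag) {
        return Locale.forLanguageTag(languageTag).getUnicodeLocaleType("co");
    }

    // Builds an equivalent locale programmatically instead of parsing a tag.
    static Locale withCollation(String language, String type) {
        return new Locale.Builder()
                .setLanguage(language)
                .setUnicodeLocaleKeyword("co", type)
                .build();
    }

    public static void main(String[] args) {
        System.out.println(collationType("sv-u-co-trad"));
        System.out.println(collationType("sv"));

        Locale built = withCollation("sv", "trad");
        System.out.println(built.toLanguageTag());
        // stripExtensions() drops the -u- part and leaves the plain language.
        System.out.println(built.stripExtensions().toLanguageTag());
    }
}
```

The sketch only uses the short "trad" form, since each subtag in a BCP 47 extension is limited to eight characters.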

@naotoj
Member Author

naotoj commented May 12, 2023

Thanks @srl295
I am just curious how CLDR handles the default switch of Swedish collation. What is now the traditional collation used to be the standard one, and what is now standard used to be reformed. How do apps specify their desired collation in Swedish, regardless of CLDR versions?

@srl295
Member

srl295 commented May 12, 2023

Thanks @srl295
I am just curious how CLDR handles the default switch of Swedish collation. What is now the traditional collation used to be the standard one, and what is now standard used to be reformed. How do apps specify their desired collation in Swedish, regardless of CLDR versions?

I don't know off the top of my head but digging around I found https://unicode-org.atlassian.net/browse/CLDR-7088 and cross linked some tickets. Can you ask around on those?

@AlanBateman
Contributor

so is this considered a private use language tag

Not private use at all. The -u- subtag is registered, and the links above are from the registrar, see

Good, I don't think I have any other questions on this now.

@naotoj
Member Author

naotoj commented May 15, 2023

Thanks, @AlanBateman for the review.
@srl295 thanks. I found CLDR's collation documents are a bit outdated. Filed an issue here: https://unicode-org.atlassian.net/jira/software/c/projects/CLDR/issues/CLDR-16638

@@ -232,6 +227,18 @@ public static synchronized Collator getInstance() {

/**
* Gets the Collator for the desired locale.
* @apiNote Implementations of {@code Collator} class may produce
Member

For less l10n savvy readers, I suggest providing a bit more context in the apiNote, such as by explicitly mentioning Swedish and showing what getting the default collator would look like.
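
As a rough illustration of that suggestion, the sketch below requests both the default Swedish collator and the "co-trad" variant and compares 'v' with 'w' using each. The class and method names are hypothetical. The printed values depend on the JDK in use: a release that includes this change may return the traditional ordering for the second call, while a release without it is assumed to ignore the extension and return the same ordering for both.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch: default Swedish collator versus the "co-trad" variant.
class SwedishCollatorSketch {
    static Collator defaultSwedish() {
        return Collator.getInstance(Locale.forLanguageTag("sv"));
    }

    static Collator traditionalSwedish() {
        // An implementation may or may not honor the "co" extension.
        return Collator.getInstance(Locale.forLanguageTag("sv-u-co-trad"));
    }

    public static void main(String[] args) {
        Collator def = defaultSwedish();
        Collator trad = traditionalSwedish();
        // Primary strength ignores accent and case differences.
        def.setStrength(Collator.PRIMARY);
        trad.setStrength(Collator.PRIMARY);
        System.out.println("default     v/w: " + def.compare("v", "w"));
        System.out.println("traditional v/w: " + trad.compare("v", "w"));
    }
}
```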

@openjdk

openjdk bot commented May 15, 2023

@naotoj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8307547: Support variant collations

Reviewed-by: srl, jlu, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 24 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed csr Pull request needs approved CSR before integration labels May 15, 2023
@openjdk openjdk bot added csr Pull request needs approved CSR before integration ready Pull request is ready to be integrated and removed ready Pull request is ready to be integrated csr Pull request needs approved CSR before integration labels May 16, 2023
@naotoj
Member Author

naotoj commented May 16, 2023

/integrate

@openjdk

openjdk bot commented May 16, 2023

Going to push as commit f9a785e.
Since your change was applied there have been 26 commits pushed to the master branch:

  • 4e92991: 8300794: Use @snippet in java.util:i18n
  • 64d5157: 8306467: Fix nsk/jdb/kill/kill001 to work with new JVMTI StopThread support for virtual threads.
  • 488330d: 8307533: Use atomic bitset functions for metadata flags
  • 60ab135: 8308097: Generational ZGC: Update constructor syntax
  • 599fa77: 8308092: Replace NULL with nullptr in gc/x
  • 682359c: 8299075: TestStringDeduplicationInterned.java fails because extra deduplication
  • 9d5bab1: 8300081: Replace NULL with nullptr in share/asm/
  • 41ee125: 8278411: Implement UseHeavyMonitors consistently, s390 port
  • 316bc79: 8308188: ProblemList java/util/concurrent/locks/Lock/OOMEInAQS.java with ZGC on all platforms
  • 8686a36: 8307315: Missing ResourceMark in CDS and JVMTI code
  • ... and 16 more: https://git.openjdk.org/jdk/compare/57e7a3fbeae56f39f9434b4a97dd915fa14af93d...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 16, 2023
@openjdk openjdk bot closed this May 16, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 16, 2023
@openjdk

openjdk bot commented May 16, 2023

@naotoj Pushed as commit f9a785e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated
5 participants