8174269: Remove COMPAT locale data provider from JDK #17991

Closed
naotoj wants to merge 45 commits into openjdk:master from naotoj:JDK-8174269-COMPAT-Removal

@naotoj (Member) commented Feb 23, 2024

This PR intends to remove the legacy COMPAT locale data from the JDK. The COMPAT locale data was introduced to help applications migrate to CLDR. It has become technical debt, and now is the time to remove it (we have been emitting a warning at JVM startup since JDK 21 if the app is using COMPAT). A corresponding CSR has also been drafted.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8325568 to be approved

Issues

  • JDK-8174269: Remove COMPAT locale data provider from JDK (Enhancement - P4)
  • JDK-8325568: Remove COMPAT locale data provider from JDK (CSR)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17991/head:pull/17991
$ git checkout pull/17991

Update a local copy of the PR:
$ git checkout pull/17991
$ git pull https://git.openjdk.org/jdk.git pull/17991/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17991

View PR using the GUI difftool:
$ git pr show -t 17991

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17991.diff

@openjdk openjdk bot added build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Feb 23, 2024
@AlanBateman (Contributor)

From a stewardship perspective I think we've done the right steps. To summarize:

  • JDK 8 added the option to use CLDR locale data (JEP 127).
  • JDK 9 switched to using CLDR locale data by default (JEP 252) with the option to run with -Djava.locale.providers=COMPAT and use the legacy/unmaintained locale data.
  • JDK 21 added a warning when you run with -Djava.locale.providers=COMPAT announcing that this provider will be removed in a future release.
  • With the proposal here, running with -Djava.locale.providers=COMPAT will print a warning to say that the configuration is ignored.

The reduction of 10 MB will be welcomed. There are likely projects that run their tests with the COMPAT provider. There may be some application deployments too. I've seen a few projects make changes in response to the run-time warning introduced in JDK 21, but there are likely projects/applications that will be "surprised" when they upgrade to JDK 23+ and tests fail. So I think this will need a bit of socialization and a loud release note.
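
For projects that want to find out before upgrading whether they still request the legacy provider, the history above can be turned into a small check. What follows is a minimal sketch, not code from this PR: the class LocaleProviderCheck and its methods are hypothetical, and the only things taken from the discussion are the java.locale.providers system property and the COMPAT name. Trimming and upper-casing the names are conveniences of the sketch, not a statement about how the JDK parses the property.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the JDK or of this PR.
class LocaleProviderCheck {

    // Splits a java.locale.providers value such as "SPI,COMPAT" into names.
    // Trimming and upper-casing are conveniences of this sketch.
    static List<String> parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return List.of();
        }
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .map(name -> name.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // True if the configured list names the legacy COMPAT provider.
    static boolean requestsCompat(String value) {
        return parse(value).contains("COMPAT");
    }

    public static void main(String[] args) {
        // The property is normally set on the command line with -D.
        String configured = System.getProperty("java.locale.providers");
        System.out.println("java.locale.providers = " + configured);
        System.out.println("requests COMPAT: " + requestsCompat(configured));
    }
}
```

Running it with -Djava.locale.providers=SPI,COMPAT should report that COMPAT is requested; running it without the property reports that it is not.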

Comment on lines 51 to 58
ifneq (, $(MISSING_RESOURCES)$(NEW_RESOURCES))
# There is a difference in the number of supported resources. Trigger a regeneration.
ifeq ($(MODULE), java.base)
- $(shell $(RM) $(SUPPORT_OUTPUTDIR)/gensrc/java.base/sun/util/locale/provider/BaseLocaleDataMetaInfo.java \
- $(SUPPORT_OUTPUTDIR)/gensrc/java.base/sun/util/cldr/CLDRBaseLocaleDataMetaInfo.java)
+ $(shell $(RM) $(SUPPORT_OUTPUTDIR)/gensrc/java.base/sun/util/cldr/CLDRBaseLocaleDataMetaInfo.java)
endif
ifeq ($(MODULE), jdk.localedata)
- $(shell $(RM) $(SUPPORT_OUTPUTDIR)/gensrc/jdk.localedata/sun/util/resources/provider/NonBaseLocaleDataMetaInfo.java \
- $(SUPPORT_OUTPUTDIR)/gensrc/jdk.localedata/sun/util/resources/cldr/provider/CLDRLocaleDataMetaInfo_jdk_localedata.java)
+ $(shell $(RM) $(SUPPORT_OUTPUTDIR)/gensrc/jdk.localedata/sun/util/resources/cldr/provider/CLDRLocaleDataMetaInfo_jdk_localedata.java)
endif
Member

The remainder of this file doesn't seem to be doing anything useful. Do we really need to keep it around? If I understand this correctly, the files referenced and deleted here are generated by the cldrconverter. Is that build tool relying on make removing the file first? Reading the Java source, my impression is that it will generate the file unconditionally, regardless of whether the file currently exists, as long as the tool is run.

Member

Actually, as the file currently stands in the PR, it does not seem to do anything at all, since we no longer create _the.locale_resources. Or... hm, without that file, PREV_LOCALE_RESOURCES will always be empty. I wonder what $(filter-out $(PREV_LOCALE_RESOURCES), $(LOCALE_RESOURCES)) returns in that case. If it is empty, then this file does nothing. If it is $(LOCALE_RESOURCES), then we will unconditionally rm the two files.

I believe the original intention was to spot added or removed locale resources, and use that to trigger a re-generation. This does not seem to work at all anymore. Do we still need that kind of behavior?
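
On the filter-out question: as far as I know, GNU make's filter-out removes only the words that match one of the patterns, so an empty pattern list removes nothing. The Java sketch below models that behavior for literal words only (no % wildcards). FilterOutModel and the resource names are made up for illustration and are not part of the build.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of $(filter-out PATTERNS, TEXT) for literal words only.
class FilterOutModel {

    // Splits a make-style word list on whitespace; an empty string has no words.
    static List<String> words(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Keeps every word of text that is not listed in patterns.
    static List<String> filterOut(String patterns, String text) {
        Set<String> drop = new HashSet<>(words(patterns));
        List<String> kept = new ArrayList<>();
        for (String word : words(text)) {
            if (!drop.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String current = "FormatData_en FormatData_ja"; // illustrative names
        // Empty "previous" list: nothing is filtered out.
        System.out.println(filterOut("", current));              // [FormatData_en, FormatData_ja]
        // One resource already known: only the other one is reported.
        System.out.println(filterOut("FormatData_en", current)); // [FormatData_ja]
    }
}
```

If that reading is right, an empty PREV_LOCALE_RESOURCES makes the expression return all of $(LOCALE_RESOURCES), which is the second case above: the two files would be removed unconditionally.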

Member

Oh, this is a rabbit hole. :-(

I tried to figure out where CLDRBaseLocaleDataMetaInfo.java comes from; I could not see where we generated that file. Turns out it is created in make/jdk/src/classes/build/tools/cldrconverter/ResourceBundleGenerator.java, in the function generateMetaInfo(). I assume this is called as part of calling the CLDRConverter buildtool. But this is done in make/modules/[java.base|jdk.localedata]/Gensrc.gmk. So if we need to invalidate that build tool result, it should be done in these files.

Member Author

Thanks. Good points.
As the CLDR tool generates those files, GensrcLocaleData.gmk no longer seems necessary. I will remove it after I do some testing.

@justin-curtis-lu (Member) left a comment

Still working on getting a better understanding of all the parts here, but left some initial comments.

* @summary Basic tests for currency formatting.
* Tests both COMPAT and CLDR data.
* @modules jdk.localedata
* @run junit/othervm -Djava.locale.providers=COMPAT CurrencyFormat
Member

The methods currencySymbolsTest, currencySymbolsDataProvider, and getFutureSymbol can be removed since they are for COMPAT only.

The string array expectedCOMPATData can be removed from the data provider method currencyFormatDataProvider as well as isCompat variable and usage.

CurrencySymbols.properties can also be deleted since that is what currencySymbolsDataProvider uses to build the data and no other tests rely on the file.

Member Author

Good catch! Removed COMPAT related tests/data.

import jdk.internal.util.StaticProperty;
import jdk.internal.vm.annotation.Stable;

import sun.security.action.GetPropertyAction;
Member

Although this is a trivial change, I'm not sure whether the file needs a copyright year bump; I'm not exactly sure of the policy here.


@Override
// In order to correctly report supported locales
public BreakIteratorProvider getBreakIteratorProvider() {
Member

More for my understanding but I am curious why FallbackLocaleProviderAdapter has to override getBreakIteratorProvider, but can rely on the getCollatorProvider from JRELocaleProviderAdapter? Also wondering why "BreakIteratorRules" is fetched when JRELocaleProviderAdapter fetches "FormatData" if the data is the same COMPAT data.

Member Author

COMPAT used to offer supported locales for the ones that exist as resource bundles. For Collator, JRELocaleProviderAdapter had a list for CollationData resource bundles, but BreakIterator shared its list with FormatData, which now only has root/en/ja (for Gan-nen support). So it had to override the method and return th (this is the main function for BreakIterator as of now).
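
From the API side, the supported locales discussed here can be inspected with BreakIterator.getAvailableLocales(). A minimal sketch follows. The class name BreakIteratorLocales is hypothetical, and what it prints for th (Thai) depends on the JDK build and on which modules are present, so no output is claimed.

```java
import java.text.BreakIterator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for illustration; not part of this PR.
class BreakIteratorLocales {

    // True if the runtime reports the locale as supported for BreakIterator.
    static boolean supports(Locale locale) {
        return Arrays.asList(BreakIterator.getAvailableLocales()).contains(locale);
    }

    public static void main(String[] args) {
        // The javadoc requires Locale.US to be present in the returned array.
        System.out.println("en-US supported: " + supports(Locale.US));
        // Whether th is reported depends on the runtime in use.
        System.out.println("th supported: " + supports(Locale.forLanguageTag("th")));
    }
}
```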

@@ -78,10 +77,10 @@ public void simpleDateFormatWithTZNProvider() throws Throwable {
*/
@Test
@EnabledOnOs(WINDOWS)
@EnabledIfSystemProperty(named = "user.language", matches = "ja")
Member

Thanks for fixing. Since the provider is HOST, the locale value should be based on the machine at startup. I'm wondering how the test passed previously, though.

Member Author

Right, it only works if the underlying machine is configured with ja locale as the default.
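
The effect of the guard can be approximated in plain Java. This is a sketch under the assumption that @EnabledIfSystemProperty matches the whole property value against the regular expression, as the JUnit 5 documentation describes. HostLocaleGuard is a made-up name and is not part of the test.

```java
// Hypothetical stand-in for the JUnit condition; not part of the test itself.
class HostLocaleGuard {

    // Mirrors named = "user.language", matches = "ja": the whole value must match.
    static boolean enabledFor(String userLanguage) {
        return userLanguage != null && userLanguage.matches("ja");
    }

    public static void main(String[] args) {
        // user.language reflects the default locale of the machine at startup.
        String language = System.getProperty("user.language");
        System.out.println("user.language = " + language);
        System.out.println("test would run: " + enabledFor(language));
    }
}
```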

@@ -23,7 +23,7 @@

/*
* @test
- * @bug 8304982
+ * @bug 8304982 8174269
* @summary Check if a warning is logged with COMPAT locale provider
* @run main/othervm -Djava.locale.providers=COMPAT CompatWarning
Member

Is it worth adding runs with COMPAT specified together with other providers (before and/or after it) for coverage's sake? For example, @run main/othervm -Djava.locale.providers=SPI,COMPAT CompatWarning.

@@ -23,7 +23,7 @@

/*
* @test
- * @bug 8008577 8138613
+ * @bug 8008577 8138613 8174269
* @summary Check whether CLDR locale provider adapter is enabled by default
Member

Unless I'm mistaken, there aren't any other dedicated tests ensuring that the FALLBACK adapter is included. It might be worth updating the summary to make it apparent that while the default preference list has CLDR, FALLBACK is always appended, since that is new behavior.

Member Author

I believe some tests will break if FALLBACK is not included in the preferred list. I am kind of hesitant to specify it in the command line for the sake of testing, as it is not a public constant.

@openjdk openjdk bot removed the csr Pull request needs approved CSR before integration label Feb 27, 2024
@magicus (Member) left a comment

This looks good from a build perspective. Actually, it looks great! :) I'm happy to get rid of this old strange construct.

openjdk bot commented Feb 27, 2024

@naotoj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8174269: Remove COMPAT locale data provider from JDK

Reviewed-by: ihse, joehw

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 15 new commits pushed to the master branch:

  • c6641c7: 8326831: Clarify test harness control variables in make help
  • 3d106cb: 8325139: JFR SwapSpace event - add free swap space information on Linux when running in a container environment
  • c00c939: 8327364: Parallel: Remove unused ParallelCompactData::add_obj
  • 98f0b86: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug"
  • 2372aba: 8326172: Dubious claim on long[]/double[] alignment in MemorySegment javadoc
  • c653e67: 8327225: Revert DataInputStream.readUTF to static final
  • a089ed2: 8326936: RISC-V: Shenandoah GC crashes due to incorrect atomic memory operations
  • 560cf59: 8327287: Remove unused FLSVerifyDictionary debug option
  • fec51d4: 8327130: Serial: Remove Generation::record_spaces_top
  • e9adceb: 8327208: Remove unused method java.util.jar.Manifest.make72Safe
  • ... and 5 more: https://git.openjdk.org/jdk/compare/045eea11307f5df89071c7c7e366effdc23a22db...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 27, 2024
@naotoj (Member Author) commented Feb 27, 2024

Thanks, @magicus

to get rid of this old strange construct.

Removing legacy clutter is exactly the purpose of this exercise!

@JoeWang-Java (Member) left a comment

LGTM. This is a lot of work. Looking through the files alone takes hours. Kudos to the great work!

* "US<NBSP>dst" -> "D"
*
* These two mappings resolve the short names for time zones in each type,
* `tzdbLinks` retains `Link`s of time zones. if the value
Member

nit, "if the value" seems to be an unfinished sentence.

Member Author

Thanks. Not only the sentence, the description itself was unfinished. Corrected.

* each resource in sun.util.resources & sun.text.resources.
* It is used to avoid loading non-existent localized resources so that
* jar files won't be opened unnecessary to look up them.
*/
Member

nit: move the class description to right above the class? "unnecessary" -> "unnecessarily"

@JoeWang-Java (Member)

I kind of like some of the date formats in COMPAT, to be honest :-)

@naotoj (Member Author) commented Mar 5, 2024

/integrate

openjdk bot commented Mar 5, 2024

Going to push as commit 809995b.
Since your change was applied there have been 15 commits pushed to the master branch:

  • c6641c7: 8326831: Clarify test harness control variables in make help
  • 3d106cb: 8325139: JFR SwapSpace event - add free swap space information on Linux when running in a container environment
  • c00c939: 8327364: Parallel: Remove unused ParallelCompactData::add_obj
  • 98f0b86: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug"
  • 2372aba: 8326172: Dubious claim on long[]/double[] alignment in MemorySegment javadoc
  • c653e67: 8327225: Revert DataInputStream.readUTF to static final
  • a089ed2: 8326936: RISC-V: Shenandoah GC crashes due to incorrect atomic memory operations
  • 560cf59: 8327287: Remove unused FLSVerifyDictionary debug option
  • fec51d4: 8327130: Serial: Remove Generation::record_spaces_top
  • e9adceb: 8327208: Remove unused method java.util.jar.Manifest.make72Safe
  • ... and 5 more: https://git.openjdk.org/jdk/compare/045eea11307f5df89071c7c7e366effdc23a22db...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 5, 2024
@openjdk openjdk bot closed this Mar 5, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 5, 2024
openjdk bot commented Mar 5, 2024

@naotoj Pushed as commit 809995b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels

build build-dev@openjdk.org core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated

7 participants