8261254: Initialize charset mapping data lazily #2449

Closed
wants to merge 4 commits into from

Member

@cl4es cl4es commented Feb 7, 2021

This patch refactors JDK-internal charsets to initialize charset mapping data lazily, when needed, via holder classes. This gives a startup improvement in some cases, and possible throughput improvements for all DoubleByte-based charsets.

Testing: tier1-3


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2449/head:pull/2449
$ git checkout pull/2449

@bridgekeeper

bridgekeeper bot commented Feb 7, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Feb 7, 2021

@cl4es The following labels will be automatically applied to this pull request:

  • build
  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the build (build-dev@openjdk.org), core-libs (core-libs-dev@openjdk.org) and i18n (i18n-dev@openjdk.org) labels Feb 7, 2021
@cl4es
Member Author

cl4es commented Feb 7, 2021

Applications calling Charset.availableCharsets() might see a decent startup improvement (first-call overhead drops from 125-190 ms to 57-95 ms).
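For anyone who wants to gauge the first-call overhead on their own setup, a rough sketch along these lines should do. It is not the harness used for the numbers above: the class and method names are made up for illustration, and a single System.nanoTime() sample is noisy, so treat the output as indicative only.

```java
import java.nio.charset.Charset;
import java.util.SortedMap;

// Hypothetical helper, not part of the JDK or of this PR.
class AvailableCharsetsTiming {

    /** Times one call to Charset.availableCharsets() and returns the elapsed nanoseconds. */
    static long timeCallNanos() {
        long start = System.nanoTime();
        SortedMap<String, Charset> charsets = Charset.availableCharsets();
        long elapsed = System.nanoTime() - start;
        if (charsets.isEmpty()) {
            throw new IllegalStateException("expected at least one charset");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // The first call pays for loading and initializing the charset classes;
        // the second call mostly measures the enumeration itself.
        long first = timeCallNanos();
        long second = timeCallNanos();
        System.out.printf("first call: %.1f ms, second call: %.1f ms%n",
                first / 1e6, second / 1e6);
    }
}
```

Run it in a fresh JVM each time, since only the first call in a process pays the initialization cost.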

All DoubleByte-derived charsets can see a throughput improvement on benchmarks thanks to replacing double-checked locking (DCL) initialization checks with a Holder pattern implementation, which allows static data to be declared final and makes it more amenable to optimizations:

Before:
Benchmark                                   (charsetName)  Mode  Cnt    Score    Error  Units
StringDecode.WithCharset.decodeCharset              MS932  avgt   15  280.072 ± 17.217  ns/op
StringDecode.WithCharset.decodeCharsetName          MS932  avgt   15  289.368 ± 18.174  ns/op

After:
Benchmark                                   (charsetName)  Mode  Cnt    Score    Error  Units
StringDecode.WithCharset.decodeCharset              MS932  avgt   15  254.906 ± 15.095  ns/op
StringDecode.WithCharset.decodeCharsetName          MS932  avgt   15  268.241 ± 15.725  ns/op
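
To make the DCL-versus-Holder distinction concrete, here is a minimal generic sketch of the two idioms. It is not the code from this patch: MappingData, DclCharsetData, HolderCharsetData and the b2c name are hypothetical stand-ins, and the table contents are dummy data.

```java
/** Hypothetical stand-in for an expensive-to-build mapping table. */
final class MappingData {
    static char[] load() {
        char[] table = new char[256];
        for (int i = 0; i < table.length; i++) {
            table[i] = (char) i; // dummy identity mapping
        }
        return table;
    }
}

/** Double-checked locking: the field cannot be final and every access does a null check. */
final class DclCharsetData {
    private static volatile char[] b2c;

    static char[] b2c() {
        char[] t = b2c;
        if (t == null) {
            synchronized (DclCharsetData.class) {
                t = b2c;
                if (t == null) {
                    b2c = t = MappingData.load();
                }
            }
        }
        return t;
    }
}

/** Holder idiom: the JVM initializes Holder on first use, so the field can be static final. */
final class HolderCharsetData {
    private static final class Holder {
        static final char[] B2C = MappingData.load();
    }

    static char[] b2c() {
        return Holder.B2C; // triggers Holder's class initialization on first call only
    }
}
```

Class initialization is guaranteed by the JVM to happen once and to be thread-safe, so the Holder variant needs no explicit synchronization or null check in the accessor.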

@cl4es
Member Author

cl4es commented Feb 7, 2021

/label remove build

@openjdk openjdk bot removed the build (build-dev@openjdk.org) label Feb 7, 2021
@openjdk

openjdk bot commented Feb 7, 2021

@cl4es
The build label was successfully removed.

@cl4es cl4es marked this pull request as ready for review February 7, 2021 19:32
@openjdk openjdk bot added the rfr (Pull request is ready for review) label Feb 7, 2021
@mlbridge

mlbridge bot commented Feb 7, 2021

Webrevs

Contributor

@AlanBateman AlanBateman left a comment

I wouldn't expect enumerating all charsets with Charset::availableCharsets to be too common, but moving the data to a holder class looks okay. The missing "final" in a few places was an oversight. The replacement of the foreach and method ref in getServicesCatalog with imperative code is a disappointment, but okay here.

@openjdk

openjdk bot commented Feb 8, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8261254: Initialize charset mapping data lazily

Reviewed-by: alanb, jkuhn, naoto

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the master branch:

  • d6d5d9b: 8261231: Windows IME was disabled after DnD operation
  • 29a428f: 8261229: MethodData is not correctly initialized with TieredStopAtLevel=3
  • 48c932e: 8231286: HTML font size too large with high-DPI scaling and W3C_LENGTH_UNITS
  • dbc35f6: 8261094: Open javax/swing/text/html/CSS/4765271/bug4765271.java
  • db0ca2b: 8261161: Clean up warnings in hotspot/jtreg/vmTestbase tests
  • 2c28e36: 8237352: Update DatagramSocket to add support for joining multicast groups
  • d0a8f2f: 8260593: javac can skip a temporary local variable when pattern matching over a local variable
  • deb0544: 8261251: Shenandoah: Use object size for full GC humongous compaction
  • d45343e: 8260899: ARM32: SyncOnValueBasedClassTest fails with assert(is_valid()) failed: invalid register
  • 9d59dec: 8248876: LoadObject with bad base address created for exec file on linux
  • ... and 1 more: https://git.openjdk.java.net/jdk/compare/c5ff454481c8ac2c30abd1fb11bd1cec2c95feeb...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Feb 8, 2021
@cl4es
Member Author

cl4es commented Feb 8, 2021

I spotted usage of this in a real application. While they could work around it and remove the usage to gain an even larger startup win, I figured I should do those who can't a favor, too.

Member

@DasBrain DasBrain left a comment

Looks good to me.

Member

@naotoj naotoj left a comment

Looks good to me. I initially had the same impression as Alan, i.e., that enumerating all charsets is not that common (in fact, the same applies to Locale; I once explored usages and they all ended up in test cases), but if real apps are using it, this is the right way to do it.

@cl4es
Member Author

cl4es commented Feb 8, 2021

Thanks for reviewing!

/integrate

@openjdk openjdk bot closed this Feb 8, 2021
@openjdk openjdk bot added the integrated (Pull request has been integrated) label and removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Feb 8, 2021
@openjdk

openjdk bot commented Feb 8, 2021

@cl4es Since your change was applied there have been 12 commits pushed to the master branch:

  • 351d788: 8259074: regex benchmarks and tests
  • d6d5d9b: 8261231: Windows IME was disabled after DnD operation
  • 29a428f: 8261229: MethodData is not correctly initialized with TieredStopAtLevel=3
  • 48c932e: 8231286: HTML font size too large with high-DPI scaling and W3C_LENGTH_UNITS
  • dbc35f6: 8261094: Open javax/swing/text/html/CSS/4765271/bug4765271.java
  • db0ca2b: 8261161: Clean up warnings in hotspot/jtreg/vmTestbase tests
  • 2c28e36: 8237352: Update DatagramSocket to add support for joining multicast groups
  • d0a8f2f: 8260593: javac can skip a temporary local variable when pattern matching over a local variable
  • deb0544: 8261251: Shenandoah: Use object size for full GC humongous compaction
  • d45343e: 8260899: ARM32: SyncOnValueBasedClassTest fails with assert(is_valid()) failed: invalid register
  • ... and 2 more: https://git.openjdk.java.net/jdk/compare/c5ff454481c8ac2c30abd1fb11bd1cec2c95feeb...master

Your commit was automatically rebased without conflicts.

Pushed as commit 92c6e6d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs (core-libs-dev@openjdk.org), i18n (i18n-dev@openjdk.org), integrated (Pull request has been integrated)