8263038: Optimize String.format for simple specifiers #2830

Closed
wants to merge 12 commits into from

Conversation

cl4es
Member

@cl4es cl4es commented Mar 4, 2021

This patch optimizes String.format expressions that use trivial specifiers. In the JDK, the most common use of String.format is some variation of format("foo: %s", s), which gets a significant speed-up from this.

Various other cleanups and minor improvements reduce overhead further and ensure a small gain for more complex format strings as well.
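
For readers skimming the thread, here is a small illustration of the kinds of format strings being discussed. It is only a sketch: the class and method names are made up for this example, and exactly which specifiers count as "trivial" is an assumption based on the description (bare conversions such as %s or %d, with no flags, width or precision).

```java
final class SimpleSpecifierExamples {
    // The shape the description calls most common: literal text plus a bare %s.
    static String simple(String s) {
        return String.format("foo: %s", s);
    }

    // Two bare specifiers, still without flags, width or precision.
    static String simpleStringInt(String s, int i) {
        return String.format("%s: %d", s, i);
    }

    // Adding a width makes the specifier non-trivial:
    // the argument is right-aligned in a field of 8 characters.
    static String withWidth(String s) {
        return String.format("%8s", s);
    }

    public static void main(String[] args) {
        System.out.println(simple("bar"));
        System.out.println(simpleStringInt("count", 42));
        System.out.println("[" + withWidth("bar") + "]");
    }
}
```

The output is the same before and after a change like this one; only the cost of producing it differs.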


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263038: Optimize String.format for simple specifiers

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2830/head:pull/2830
$ git checkout pull/2830

@bridgekeeper

bridgekeeper bot commented Mar 4, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Mar 4, 2021

@cl4es The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Mar 4, 2021
@cl4es
Member Author

cl4es commented Mar 4, 2021

Microbench results - baseline:

Benchmark                                                           Mode  Cnt     Score      Error   Units
StringFormat.complexFormat                                          avgt    5  8842.917 ±  658.269   ns/op
StringFormat.complexFormat:·gc.alloc.rate.norm                      avgt    5  2176.378 ±    0.483    B/op
StringFormat.stringFormat                                           avgt    5   859.863 ±   97.514   ns/op
StringFormat.stringFormat:·gc.alloc.rate.norm                       avgt    5   560.091 ±    0.011    B/op
StringFormat.stringIntFormat                                        avgt    5  1619.772 ±  147.646   ns/op
StringFormat.stringIntFormat:·gc.alloc.rate.norm                    avgt    5   728.132 ±    0.140    B/op
StringFormat.widthStringFormat                                      avgt    5  1060.200 ±  154.025   ns/op
StringFormat.widthStringFormat:·gc.alloc.rate.norm                  avgt    5   592.108 ±    0.093    B/op
StringFormat.widthStringIntFormat                                   avgt    5  2045.215 ±  246.189   ns/op
StringFormat.widthStringIntFormat:·gc.alloc.rate.norm               avgt    5   784.144 ±    0.121    B/op

Patched:

Benchmark                                                           Mode  Cnt     Score      Error   Units
StringFormat.complexFormat                                          avgt    5  8023.314 ± 1387.475   ns/op
StringFormat.complexFormat:·gc.alloc.rate.norm                      avgt    5  2120.399 ±    0.417    B/op
StringFormat.stringFormat                                           avgt    5   286.776 ±   30.645   ns/op
StringFormat.stringFormat:·gc.alloc.rate.norm                       avgt    5   256.044 ±    0.017    B/op
StringFormat.stringIntFormat                                        avgt    5   626.083 ±   68.652   ns/op
StringFormat.stringIntFormat:·gc.alloc.rate.norm                    avgt    5   432.073 ±    0.039    B/op
StringFormat.widthStringFormat                                      avgt    5  1061.631 ±  156.444   ns/op
StringFormat.widthStringFormat:·gc.alloc.rate.norm                  avgt    5   560.103 ±    0.106    B/op
StringFormat.widthStringIntFormat                                   avgt    5  1380.208 ±  267.445   ns/op
StringFormat.widthStringIntFormat:·gc.alloc.rate.norm               avgt    5   736.134 ±    0.144    B/op

-Xint similarly sees no change on complexFormat, but a 3-3.5x speed-up on stringFormat.
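
As a quick sanity check on the two tables above, the speed-up per benchmark can be computed as baseline score divided by patched score. The sketch below does only that arithmetic on the ns/op point estimates copied from the tables; the class and method names are hypothetical, and the error bars are ignored, so small differences should not be over-interpreted (the ± intervals for complexFormat overlap, for instance).

```java
final class SpeedupFromScores {
    // Ratio of baseline to patched average time; values above 1.0 mean the patched build is faster.
    static double speedup(double baselineNsPerOp, double patchedNsPerOp) {
        return baselineNsPerOp / patchedNsPerOp;
    }

    public static void main(String[] args) {
        String[] names = {
            "complexFormat", "stringFormat", "stringIntFormat",
            "widthStringFormat", "widthStringIntFormat"
        };
        // ns/op point estimates copied from the baseline and patched tables.
        double[] baseline = {8842.917, 859.863, 1619.772, 1060.200, 2045.215};
        double[] patched  = {8023.314, 286.776, 626.083, 1061.631, 1380.208};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-22s %.2fx%n", names[i], speedup(baseline[i], patched[i]));
        }
    }
}
```

On these numbers stringFormat comes out at roughly 3x and stringIntFormat at roughly 2.6x, while widthStringFormat is essentially unchanged.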

@cl4es
Member Author

cl4es commented Mar 4, 2021

/label remove i18n

@openjdk openjdk bot removed the i18n i18n-dev@openjdk.org label Mar 4, 2021
@openjdk

openjdk bot commented Mar 4, 2021

@cl4es
The i18n label was successfully removed.

@cl4es cl4es marked this pull request as ready for review March 4, 2021 17:43
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 4, 2021
@mlbridge

mlbridge bot commented Mar 4, 2021

Webrevs

Contributor

@RogerRiggs RogerRiggs left a comment

Looks good.

@@ -3012,19 +3014,19 @@ private void printCharacter(Object arg, Locale l) throws IOException {
             if (arg instanceof Character) {
                 s = ((Character)arg).toString();
             } else if (arg instanceof Byte) {
-                byte i = ((Byte)arg).byteValue();
+                byte i = (Byte) arg;
Contributor

Can the pattern matching for instanceof be used here to remove explicit casts? (Supported in JDK 16.)

            if (arg instanceof Character c) {
                s = c.toString();
            } else if (arg instanceof Byte i) {
                if (Character.isValidCodePoint(i))
                    s = new String(Character.toChars(i));
                else
                    throw new IllegalFormatCodePointException(i);
            } else if (arg instanceof Short i) {
                if (Character.isValidCodePoint(i))
                    s = new String(Character.toChars(i));
                else
                    throw new IllegalFormatCodePointException(i);
            }
etc.

Member Author

I did think about it, but it seemed to stray a bit too far from the intent of this enhancement.

Contributor

I only looked at it because of the updates to use switch expressions...
ok, either way.

Member Author

@cl4es cl4es Mar 8, 2021

I had reason to muck around with the switch expressions, since Conversion.isValid was inefficient (for startup) and subtly wrong (accepted 't').

Getting rid of explicit unboxing - .byteValue() - is just a syntactic improvement, so I indulged in a few places.

Using instanceof pattern matching OTOH changes bytecode shape a fair bit by introducing locals and changing the execution order. The suggestions you made above also mean we'd unbox twice. Probably inconsequential, but I'm wary of such changes when I don't have benchmarks to assert they are performance neutral.
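
To make the "unbox twice" point concrete, here is a minimal sketch of the three shapes under discussion. It is not the code in this PR: the class and method names are hypothetical, and only the Byte branch is shown. With a pattern variable of type Byte, every use in a primitive context is a separate unboxing call in the bytecode, whereas assigning to a primitive local unboxes once. Whether that difference survives JIT compilation is exactly the kind of thing that would need a benchmark.

```java
import java.util.IllegalFormatCodePointException;

final class CodePointFromByte {
    // Explicit cast into a primitive local: the Byte is unboxed once.
    static String withCast(Object arg) {
        if (arg instanceof Byte) {
            byte i = (Byte) arg;
            if (Character.isValidCodePoint(i)) {
                return new String(Character.toChars(i));
            }
            throw new IllegalFormatCodePointException(i);
        }
        return null; // other argument types are out of scope for this sketch
    }

    // Pattern matching (JDK 16+): b is a Byte, so each use below unboxes it again.
    static String withPattern(Object arg) {
        if (arg instanceof Byte b) {
            if (Character.isValidCodePoint(b)) {          // unboxing #1
                return new String(Character.toChars(b));  // unboxing #2
            }
            throw new IllegalFormatCodePointException(b);
        }
        return null;
    }

    // Pattern matching plus a primitive local: one unboxing, at the cost of an extra local.
    static String withPatternAndLocal(Object arg) {
        if (arg instanceof Byte b) {
            byte i = b;
            if (Character.isValidCodePoint(i)) {
                return new String(Character.toChars(i));
            }
            throw new IllegalFormatCodePointException(i);
        }
        return null;
    }
}
```

All three variants return the same result for the same input; they differ only in bytecode shape.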

Contributor

Makes sense, good to know... slick language features may come at a cost.

@openjdk

openjdk bot commented Mar 5, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263038: Optimize String.format for simple specifiers

Reviewed-by: rriggs, naoto

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 48 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 5, 2021
@cl4es
Member Author

cl4es commented Mar 8, 2021

Piling on another optimization: The getZero(..) called eagerly in the constructor is rather expensive in non-US locales, e.g. running with "-Duser.language=fr":

Benchmark                  Mode  Cnt    Score     Error  Units
StringFormat.stringFormat  avgt    5  924.536 ± 253.151  ns/op

Since the zero value is only used when printing floating-point numbers, I refactored so that the localized zero is evaluated lazily, which brings the microbenchmark numbers in line with those for a US locale:

Benchmark                  Mode  Cnt    Score    Error  Units
StringFormat.stringFormat  avgt    5  291.385 ± 64.626  ns/op
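
The general idea of deferring the locale lookup can be sketched as follows. This is an illustration only, not the code in java.util.Formatter: the class, its fields, and the use of DecimalFormatSymbols to obtain the zero digit are assumptions made for this example, and the lookups counter exists purely to show when the work happens.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

final class LazyZeroSketch {
    private final Locale locale;
    private char zero;             // cached localized zero digit
    private boolean zeroResolved;  // false until the first call to zero()
    int lookups;                   // demo-only counter of locale lookups

    LazyZeroSketch(Locale locale) {
        // Nothing locale-dependent is computed here, so construction stays cheap.
        this.locale = locale;
    }

    // Resolve the localized zero digit on first use and cache it afterwards.
    char zero() {
        if (!zeroResolved) {
            zero = DecimalFormatSymbols.getInstance(locale).getZeroDigit();
            zeroResolved = true;
            lookups++;
        }
        return zero;
    }
}
```

A caller that never formats a number never pays for the lookup, which matches the observation that a plain %s format should not depend on the locale's zero digit. The sketch is not thread-safe; that is acceptable here only because it is meant as an illustration.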

Member

@naotoj naotoj left a comment

Looks good!

@cl4es
Member Author

cl4es commented Mar 8, 2021

Thank you for the reviews, Roger and Naoto!

/integrate

@openjdk openjdk bot closed this Mar 8, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 8, 2021
@openjdk

openjdk bot commented Mar 8, 2021

@cl4es Since your change was applied there have been 49 commits pushed to the master branch:

  • 14cfbda: 8261366: Add discussion of IEEE 754 to BigDecimal
  • 414ee95: 8261462: GCM ByteBuffer decryption problems
  • eb4a8af: 8260664: Phaser.arrive() memory consistency effects
  • 9221540: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest
  • 17853ee: 8263200: Add -XX:StressCCP to CTW
  • a2b8858: 8263041: Shenandoah: Cleanup C1 keep alive barrier check
  • 1f9ed90: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true
  • bf9b74d: 8262446: DragAndDrop hangs on Windows
  • b1cc864: 8251210: Link JDK api docs to other versions
  • 0da889e: 8210100: ParallelGC should use parallel WeakProcessor
  • ... and 39 more: https://git.openjdk.java.net/jdk/compare/d2c4ed08a2f78c22e4d59b6c29d29abf3202199d...master

Your commit was automatically rebased without conflicts.

Pushed as commit f71b21b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
3 participants