Improve DateTimeFormatter performance by unrolling formatting/parsing loops #28465
👋 Welcome back swen! A progress list of the required criteria for merging this PR into
❗ This change is not yet ready to be integrated.
Below are the performance results on a MacBook M1 Pro.
1. Shell
2. Raw Benchmark Data
-# 9db0a6d473654a37159c43bfd56210d9b5330b15 (master)
-Benchmark (pattern) Mode Cnt Score Error Units
-DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 15.486 ± 0.385 ops/ms
-DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 11.392 ± 0.271 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 8.985 ± 0.425 ops/ms
-DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 7.122 ± 0.646 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 23.252 ± 0.073 ops/ms
-DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 15.816 ± 0.110 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 13.137 ± 0.332 ops/ms
-DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 9.880 ± 0.050 ops/ms
-DateTimeFormatterParse.parseInstant N/A thrpt 15 2104.714 ± 115.593 ops/ms
-DateTimeFormatterParse.parseLocalDate N/A thrpt 15 5032.899 ± 299.974 ops/ms
-DateTimeFormatterParse.parseLocalDateTime N/A thrpt 15 3600.533 ± 231.355 ops/ms
-DateTimeFormatterParse.parseLocalDateTimeWithNano N/A thrpt 15 3468.901 ± 287.957 ops/ms
-DateTimeFormatterParse.parseLocalTime N/A thrpt 15 4435.315 ± 365.496 ops/ms
-DateTimeFormatterParse.parseLocalTimeWithNano N/A thrpt 15 4580.805 ± 440.078 ops/ms
-DateTimeFormatterParse.parseOffsetDateTime N/A thrpt 15 2226.291 ± 206.005 ops/ms
-DateTimeFormatterParse.parseZonedDateTime N/A thrpt 15 1851.087 ± 118.679 ops/ms
-DateTimeFormatterWithPaddingBench.formatWithPadding N/A thrpt 15 7467.859 ± 1289.107 ops/ms
-DateTimeFormatterWithPaddingBench.formatWithPaddingLengthOne N/A thrpt 15 11551.849 ± 1963.616 ops/ms
-DateTimeFormatterWithPaddingBench.formatWithPaddingLengthZero N/A thrpt 15 14187.603 ± 1433.589 ops/ms
+# d82d9405772e72d091a362db013159396e6a9cd4 (this pr)
+Benchmark (pattern) Mode Cnt Score Error Units
+DateTimeFormatterBench.formatInstants HH:mm:ss thrpt 15 15.581 ± 0.253 ops/ms
+DateTimeFormatterBench.formatInstants HH:mm:ss.SSS thrpt 15 12.467 ± 0.876 ops/ms
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss thrpt 15 9.518 ± 0.233 ops/ms
+DateTimeFormatterBench.formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 8.466 ± 0.490 ops/ms
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss thrpt 15 24.074 ± 0.165 ops/ms
+DateTimeFormatterBench.formatZonedDateTime HH:mm:ss.SSS thrpt 15 18.582 ± 0.097 ops/ms
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss thrpt 15 14.048 ± 0.083 ops/ms
+DateTimeFormatterBench.formatZonedDateTime yyyy-MM-dd'T'HH:mm:ss.SSS thrpt 15 11.959 ± 0.077 ops/ms
+DateTimeFormatterParse.parseInstant N/A thrpt 15 2117.107 ± 128.321 ops/ms
+DateTimeFormatterParse.parseLocalDate N/A thrpt 15 4974.686 ± 407.681 ops/ms
+DateTimeFormatterParse.parseLocalDateTime N/A thrpt 15 3702.507 ± 314.033 ops/ms
+DateTimeFormatterParse.parseLocalDateTimeWithNano N/A thrpt 15 3841.321 ± 286.259 ops/ms
+DateTimeFormatterParse.parseLocalTime N/A thrpt 15 4471.911 ± 546.928 ops/ms
+DateTimeFormatterParse.parseLocalTimeWithNano N/A thrpt 15 4604.469 ± 359.004 ops/ms
+DateTimeFormatterParse.parseOffsetDateTime N/A thrpt 15 2617.276 ± 366.495 ops/ms
+DateTimeFormatterParse.parseZonedDateTime N/A thrpt 15 1853.496 ± 123.845 ops/ms
+DateTimeFormatterWithPaddingBench.formatWithPadding N/A thrpt 15 8543.391 ± 1235.325 ops/ms
+DateTimeFormatterWithPaddingBench.formatWithPaddingLengthOne N/A thrpt 15 12248.395 ± 2299.613 ops/ms
+DateTimeFormatterWithPaddingBench.formatWithPaddingLengthZero N/A thrpt 15 25737.255 ± 1313.735 ops/ms
3. Performance Comparison
Comparison between commit 9db0a6d and d82d940
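The relative change between the two commits can be recomputed from the raw scores listed above. The following is a minimal sketch, not part of the PR: the class name `ThroughputDelta` and the method `percentChange` are hypothetical, and only a handful of rows are copied in for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ThroughputDelta {
    /** Relative change, in percent, from a baseline score to an updated score. */
    public static double percentChange(double baseline, double updated) {
        return (updated - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // {master score, PR score} in ops/ms, copied from the raw benchmark data.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("formatInstants yyyy-MM-dd'T'HH:mm:ss.SSS", new double[] {7.122, 8.466});
        scores.put("formatZonedDateTime HH:mm:ss.SSS", new double[] {15.816, 18.582});
        scores.put("parseLocalDate", new double[] {5032.899, 4974.686});
        scores.put("formatWithPaddingLengthZero", new double[] {14187.603, 25737.255});

        // Print one signed percentage per benchmark.
        scores.forEach((name, s) ->
            System.out.printf("%-45s %+7.2f%%%n", name, percentChange(s[0], s[1])));
    }
}
```

When reading the results, compare each difference with the reported error. For example, the parseLocalDate scores differ by about 58 ops/ms, well inside the ±299.974 error of the master run, whereas formatWithPaddingLengthZero improves by roughly 81%, far outside its error range.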
@wenshao Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information. |
This PR optimizes the performance of java.time.format.DateTimeFormatter by unrolling loops in the formatting and parsing operations.
When we run the format and parse operations of java.time.format.DateTimeFormatter with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining, we can see the following output:
As seen in this log, both the DateTimeFormatterBuilder$CompositePrinterParser::format and DateTimeFormatterBuilder$CompositePrinterParser::parse methods are reported as "failed to inline: virtual call". We can eliminate this inlining failure by manually unrolling the loop.
Once the loop is manually unrolled, inlining can work, which enables optimizations such as TypeProfile to take effect and thus improves performance.
Below is the log output after manually unrolling the loop:
We see that the format and parse methods of both NumberPrinterParser and CharLiteralPrinterParser trigger TypeProfile optimization.
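The idea can be illustrated with a small, self-contained sketch. It is not the JDK code: the `Printer` interface and the `TwoDigits`, `Literal`, `LoopedComposite`, and `Unrolled5` classes are hypothetical stand-ins for the internal printer-parser types, and the sketch assumes field values in the range 0 to 99. In the looped version, a single call site dispatches to every printer type, which is the situation the "virtual call" message describes. In the unrolled version, each element has its own call site, so each site can collect its own type profile.

```java
public class UnrollSketch {
    /** Hypothetical stand-in for the JDK's internal printer-parser interface. */
    interface Printer {
        void format(int[] fields, StringBuilder out);
    }

    /** Prints fields[index] as a zero-padded two-digit number (assumes 0..99). */
    static final class TwoDigits implements Printer {
        private final int index;
        TwoDigits(int index) { this.index = index; }
        public void format(int[] fields, StringBuilder out) {
            int v = fields[index];
            if (v < 10) out.append('0');
            out.append(v);
        }
    }

    /** Prints a fixed literal character. */
    static final class Literal implements Printer {
        private final char c;
        Literal(char c) { this.c = c; }
        public void format(int[] fields, StringBuilder out) { out.append(c); }
    }

    /** Looped composite: one call site sees all receiver types. */
    static final class LoopedComposite implements Printer {
        private final Printer[] printers;
        LoopedComposite(Printer... printers) { this.printers = printers; }
        public void format(int[] fields, StringBuilder out) {
            for (Printer p : printers) {
                p.format(fields, out);
            }
        }
    }

    /** Unrolled composite for exactly five elements: one call site per element. */
    static final class Unrolled5 implements Printer {
        private final Printer p0, p1, p2, p3, p4;
        Unrolled5(Printer p0, Printer p1, Printer p2, Printer p3, Printer p4) {
            this.p0 = p0; this.p1 = p1; this.p2 = p2; this.p3 = p3; this.p4 = p4;
        }
        public void format(int[] fields, StringBuilder out) {
            p0.format(fields, out);
            p1.format(fields, out);
            p2.format(fields, out);
            p3.format(fields, out);
            p4.format(fields, out);
        }
    }

    // Both composites model the pattern HH:mm:ss.
    private static final Printer LOOPED = new LoopedComposite(
        new TwoDigits(0), new Literal(':'), new TwoDigits(1), new Literal(':'), new TwoDigits(2));
    private static final Printer UNROLLED = new Unrolled5(
        new TwoDigits(0), new Literal(':'), new TwoDigits(1), new Literal(':'), new TwoDigits(2));

    public static String formatLooped(int h, int m, int s) {
        StringBuilder sb = new StringBuilder(8);
        LOOPED.format(new int[] {h, m, s}, sb);
        return sb.toString();
    }

    public static String formatUnrolled(int h, int m, int s) {
        StringBuilder sb = new StringBuilder(8);
        UNROLLED.format(new int[] {h, m, s}, sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatLooped(9, 5, 30));
        System.out.println(formatUnrolled(9, 5, 30));
    }
}
```

Both variants produce the same text. The difference is only in how the JIT sees the call sites, which is why the effect has to be confirmed with -XX:+PrintInlining and benchmarks rather than by inspecting the output.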
The unrolled code can be generated with MethodHandle, the ClassFile API, or Gensrc.gmk. Using MethodHandle or the ClassFile API would make the code obscure and difficult to understand, so I recommend Gensrc.gmk. Another advantage of Gensrc.gmk is that its initial performance is better than that of the other implementations.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28465/head:pull/28465
$ git checkout pull/28465
Update a local copy of the PR:
$ git checkout pull/28465
$ git pull https://git.openjdk.org/jdk.git pull/28465/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28465
View PR using the GUI difftool:
$ git pr show -t 28465
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28465.diff