8316704: Regex-free parsing of Formatter and FormatProcessor specifiers #15776
👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into |
/label remove i18n |
@wenshao |
It might be reasonable to add a few more common patterns to the FormatSpecifier fast-path, but where to draw the line?
FWIW the intent of micros like complex and widthString wasn't necessarily to invite further optimizations, but to explore the cost of failure, i.e., make sure that the fast-path doesn't add a substantial cost when it doesn't help or only helps somewhat. Since you now specialize for most of the patterns in the micros I think you need to explore some variants that you don't optimize for, such as "%10.3f".
Orthogonal optimizations like the FormatSpecifier fast-path extension and the print fast-path should generally be separate PRs.
&& c == Conversion.DECIMAL_INTEGER
&& fmt.a instanceof StringBuilder sb
) {
sb.append(value);
There's a lot of ifs here, and this doesn't take into account locales with non-ASCII digits:
Locale ar = new Locale.Builder().setLanguageTag("ar-SA-u-nu-arab").build();
Locale.setDefault(ar);
System.out.println("%d".formatted(10000)); // should print "١٠٠٠٠" but prints "10000"
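One way to keep a shortcut like this locale-safe is to guard it on the locale's zero digit. The following is only a sketch under stated assumptions, not the code from this PR: the class `LocaleAwareIntFormat` and the method `formatInt` are hypothetical names, and the guard looks only at the zero digit, so signs, grouping and other flags are left to the regular formatter.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper, not part of java.util.Formatter.
final class LocaleAwareIntFormat {

    /** Formats value like "%d" would in the given locale. */
    static String formatInt(Locale locale, int value) {
        // Assumption: a plain append is only equivalent to "%d" when the
        // locale's zero digit is ASCII '0'.
        char zero = DecimalFormatSymbols.getInstance(locale).getZeroDigit();
        if (zero == '0') {
            // Fast path: no digit localization needed.
            return new StringBuilder().append(value).toString();
        }
        // Slow path: let the regular formatter localize the digits.
        return String.format(locale, "%d", value);
    }

    public static void main(String[] args) {
        Locale ar = new Locale.Builder().setLanguageTag("ar-SA-u-nu-arab").build();
        System.out.println(formatInt(Locale.US, 10000));
        System.out.println(formatInt(ar, 10000));
    }
}
```

Looking up `DecimalFormatSymbols` on every call has its own cost, so a real fast-path would likely need to cache the zero digit per locale for the guard to pay off.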
The print fast-path change has been removed, and the parse fast-path now supports the pattern "%8.3f".
Where to draw the line for the parse fast-path? I have seen these patterns cause performance problems, and they are easy to implement, so I added them.
The parse fast-path now supports "%8.3f", but not "%10.3f", because the fast-path method's code size should stay below 325 so that the JIT can inline it.
What I meant is that theoretically we could drop the regex code entirely and write a custom parser that specializes every formatter, but that we probably shouldn't as this means duplicating a lot of code and we'd likely end up having to maintain both. Exactly which patterns to specialize for is an open question. "%8.3f" is common, sure, but so are specifiers like "%-6d". I think it'd be good if we could collect some stats on which specifier patterns are the most common rather than just picking a few at random.
I see you added a microbenchmark for yet another happy case, which sort of misses my point about setting micros up to explore the boundaries: the set of microbenchmarks should ideally explore and verify both fast-paths and slow-paths, to show that the benefit of the former is significant while the overhead added to the slow-path is negligible. Adding a floatFormat2 that does return "%1.12f".formatted..., as an example:
Name                       Cnt     Base   Error     Test   Error   Unit  Diff%
StringFormat.floatFormat    15  316,133 ± 7,890  170,958 ± 8,231  ns/op  45,9% (p = 0,000*)
StringFormat.floatFormat2   15  342,767 ± 4,721  343,748 ± 3,753  ns/op  -0,3% (p = 0,506 )
* = significant
This verifies that the added overhead is in the noise when the fast-path fails on this test.
We don't need to cover every possibility and have an ever-growing set of micros that all just test the fast-path, so I think you can remove the additions and instead adjust one or two of the existing microbenchmarks so that they verify the slow-path with your PR applied.
I enhanced the parse fast-path to support more specifiers, including:
% flag_1 width_1
% flag_2
% width_2
% width_1 . precision_1
The benchmark results on a MacBook M1 Pro are now:
-Benchmark                            Mode  Cnt     Score     Error  Units (optimized)
-StringFormat.complexFormat           avgt   15  2049.387 ± 121.539  ns/op
-StringFormat.flags2Format            avgt   15   430.964 ±   2.414  ns/op
-StringFormat.flagsFormat             avgt   15   257.851 ±  23.833  ns/op
-StringFormat.stringFormat            avgt   15    63.564 ±  10.490  ns/op
-StringFormat.stringIntFormat         avgt   15    88.111 ±   0.678  ns/op
-StringFormat.width2Format            avgt   15   349.304 ±  31.349  ns/op
-StringFormat.width2PrecisionFormat   avgt   15   464.621 ±  53.918  ns/op
-StringFormat.widthFormat             avgt   15   301.997 ±  34.974  ns/op
-StringFormat.widthPrecisionFormat    avgt   15   484.526 ±  38.098  ns/op
-StringFormat.widthStringFormat       avgt   15   235.421 ±  32.955  ns/op
-StringFormat.widthStringIntFormat    avgt   15   315.178 ±  15.154  ns/op
+Benchmark                            Mode  Cnt     Score     Error  Units
+StringFormat.complexFormat           avgt   15   702.407 ±  85.481  ns/op (+191.77)
+StringFormat.flags2Format            avgt   15   329.551 ±   1.610  ns/op (+30.78)
+StringFormat.flagsFormat             avgt   15   125.798 ±   1.109  ns/op (+104.98)
+StringFormat.stringFormat            avgt   15    60.029 ±   6.275  ns/op (+5.89)
+StringFormat.stringIntFormat         avgt   15    89.020 ±   0.575  ns/op (-1.03)
+StringFormat.width2Format            avgt   15   135.743 ±   0.643  ns/op (+157.33)
+StringFormat.width2PrecisionFormat   avgt   15   351.408 ±  21.031  ns/op (+32.22)
+StringFormat.widthFormat             avgt   15   208.843 ±  47.504  ns/op (+44.61)
+StringFormat.widthPrecisionFormat    avgt   15   354.375 ±  67.314  ns/op (+36.73)
+StringFormat.widthStringFormat       avgt   15    74.846 ±  19.604  ns/op (+214.55)
+StringFormat.widthStringIntFormat    avgt   15   101.638 ±   0.961  ns/op (+210.10)
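To make the discussion of specifier shapes concrete, here is a minimal sketch of a regex-free parse for simple specifiers. It is an illustration only, not the parser merged in this PR. The class `SimpleSpecifierParser`, the record `Spec` and the exact limits (at most two flags, a one- or two-digit width, a single-digit precision) are assumptions made for the example. Anything outside those shapes returns null so that a caller could fall back to the general parser.

```java
// Hypothetical sketch; names and limits are assumptions, not JDK code.
final class SimpleSpecifierParser {

    /** width and precision are -1 when absent; end is the index after the conversion. */
    record Spec(String flags, int width, int precision, char conversion, int end) {}

    private static boolean isFlag(char c) { return "-#+ 0,(".indexOf(c) >= 0; }
    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    /** Parses the specifier at s[start], or returns null to signal "use the slow path". */
    static Spec parse(String s, int start) {
        int n = s.length();
        if (start >= n || s.charAt(start) != '%') return null;
        int i = start + 1;

        // Up to two flag characters; a repeated flag is left to the slow path.
        int flagStart = i;
        while (i < n && i - flagStart < 2 && isFlag(s.charAt(i))) i++;
        String flags = s.substring(flagStart, i);
        if (flags.length() == 2 && flags.charAt(0) == flags.charAt(1)) return null;

        // Width: one or two digits, the first one non-zero.
        int width = -1;
        if (i < n && s.charAt(i) >= '1' && s.charAt(i) <= '9') {
            width = s.charAt(i++) - '0';
            if (i < n && isDigit(s.charAt(i))) width = width * 10 + (s.charAt(i++) - '0');
        }

        // Precision: '.' followed by exactly one digit.
        int precision = -1;
        if (i < n && s.charAt(i) == '.') {
            if (i + 1 >= n || !isDigit(s.charAt(i + 1))) return null;
            precision = s.charAt(i + 1) - '0';
            i += 2;
        }

        // Single-character conversions only; date/time ("%tY") and "%n", "%%" fall back.
        if (i >= n || "bBhHsScCdoxXeEfgGaA".indexOf(s.charAt(i)) < 0) return null;
        return new Spec(flags, width, precision, s.charAt(i), i + 1);
    }

    public static void main(String[] args) {
        for (String f : new String[] {"%-6d", "%02d", "%8.3f", "%1.12f", "%3$s"}) {
            System.out.println(f + " -> " + parse(f, 0));
        }
    }
}
```

In this sketch "%1.12f" and "%3$s" return null and take the fallback, which is the kind of case a slow-path microbenchmark should exercise. The sketch does not check that flags are legal for the conversion, so a real implementation would still need that validation.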
I was worried this would sprawl out more, but perhaps ~230 lines of code is a reasonable extra weight to make the long tail of
I was going to comment that the flag parsing was broken in f303f29 but it seems that it was fixed in the latest. I think we need to make a review pass over all existing tests to make sure all imaginable variants are covered. The parser code also ought to be shared between |
The Formatter and FormatProcessor code is now regex-free. The changes are extensive and need a more detailed review. |
I think it makes sense to file an RFE and do a full review of this. "Regex-free parsing of Formatter and FormatProcessor specifiers"? |
@rgiulietti can you help me continue to review this PR? |
Thanks to @rgiulietti for the review. I have changed the FormatProcessor according to your suggestion and added test cases. |
@rgiulietti Is there anything else that needs to be modified? |
Can anyone help me review this PR? |
@RogerRiggs Can you help me review this PR? |
@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
@AlanBateman @cl4es Can you help me continue to complete the review of this PR? |
/integrate |
/sponsor |
Going to push as commit 50b17d9.
Your commit was automatically rebased without conflicts. |
@rgiulietti @wenshao Pushed as commit 50b17d9. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
@cl4es made performance optimizations for the simple specifiers of String.format in PR #2830. Building on the same idea, I made further improvements so that patterns like %2d and %02d are also optimized.
The following are the test results based on MacBookPro M1 Pro:
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15776/head:pull/15776
$ git checkout pull/15776
Update a local copy of the PR:
$ git checkout pull/15776
$ git pull https://git.openjdk.org/jdk.git pull/15776/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 15776
View PR using the GUI difftool:
$ git pr show -t 15776
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15776.diff