Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8316704: Regex-free parsing of Formatter and FormatProcessor specifiers #15776

Closed
wants to merge 41 commits into from

Conversation

wenshao
Copy link
Contributor

@wenshao wenshao commented Sep 17, 2023

@cl4es made performance optimizations for the simple specifiers of String.format in PR #2830. Based on the same idea, I continued to make improvements. I made patterns like %2d %02d also be optimized.

The following are the test results based on MacBookPro M1 Pro:

-Benchmark                          Mode  Cnt     Score     Error  Units
-StringFormat.complexFormat         avgt   15  1862.233 ? 217.479  ns/op
-StringFormat.int02Format           avgt   15   312.491 ?  26.021  ns/op
-StringFormat.intFormat             avgt   15    84.432 ?   4.145  ns/op
-StringFormat.longFormat            avgt   15    87.330 ?   6.111  ns/op
-StringFormat.stringFormat          avgt   15    63.985 ?  11.366  ns/op
-StringFormat.stringIntFormat       avgt   15    87.422 ?   0.147  ns/op
-StringFormat.widthStringFormat     avgt   15   250.740 ?  32.639  ns/op
-StringFormat.widthStringIntFormat  avgt   15   312.474 ?  16.309  ns/op

+Benchmark                          Mode  Cnt    Score    Error  Units
+StringFormat.complexFormat         avgt   15  740.626 ? 66.671  ns/op (+151.45)
+StringFormat.int02Format           avgt   15  131.049 ?  0.432  ns/op (+138.46)
+StringFormat.intFormat             avgt   15   67.229 ?  4.155  ns/op (+25.59)
+StringFormat.longFormat            avgt   15   66.444 ?  0.614  ns/op (+31.44)
+StringFormat.stringFormat          avgt   15   62.619 ?  4.652  ns/op (+2.19)
+StringFormat.stringIntFormat       avgt   15   89.606 ? 13.966  ns/op (-2.44)
+StringFormat.widthStringFormat     avgt   15   52.462 ? 15.649  ns/op (+377.95)
+StringFormat.widthStringIntFormat  avgt   15  101.814 ?  3.147  ns/op (+206.91)

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 1 Reviewer, 1 Author)

Issue

  • JDK-8316704: Regex-free parsing of Formatter and FormatProcessor specifiers (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15776/head:pull/15776
$ git checkout pull/15776

Update a local copy of the PR:
$ git checkout pull/15776
$ git pull https://git.openjdk.org/jdk.git pull/15776/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15776

View PR using the GUI difftool:
$ git pr show -t 15776

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15776.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 17, 2023

👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Sep 17, 2023

@wenshao The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Sep 17, 2023
@wenshao
Copy link
Contributor Author

wenshao commented Sep 17, 2023

/label remove i18n

@openjdk openjdk bot removed the i18n i18n-dev@openjdk.org label Sep 17, 2023
@openjdk
Copy link

openjdk bot commented Sep 17, 2023

@wenshao
The i18n label was successfully removed.

Copy link
Member

@cl4es cl4es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be reasonable to add a few more common patterns to the FormatSpecifier fast-path, but where to draw the line?

FWIW the intent of micros like complex and widthString wasn't necessarily to invite further optimizations, but to explore the cost of failure, i.e., make sure that the fast-path doesn't add a substantial cost when it doesn't help or only helps somewhat. Since you now specialize for most of the patterns in the micros I think you need to explore some variants that you don't optimize for, such as "%10.3f".

Orthogonal optimizations like the FormatSpecifier fast-path extension and the print fast-path should generally be separate PRs.

&& c == Conversion.DECIMAL_INTEGER
&& fmt.a instanceof StringBuilder sb
) {
sb.append(value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of ifs here, and this doesn't take into account locales with non-ASCII digits:

Locale ar = new Locale.Builder().setLanguageTag("ar-SA-u-nu-arab").build();
Locale.setDefault(ar);
System.out.println("%d".formatted(10000)); // should print "١٠٠٠٠" but prints "10000"

Copy link
Contributor Author

@wenshao wenshao Sep 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change code of print fast-path has been deleted, and parse fast-path has added support for the pattern "%8.3f".

Where to draw the line of parse fast-path? I have seen patterns that cause performance problems, and they can be easily implemented, so I added them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now parse fast-path supports "8.3f", but not "10.3". Because the fast-path method code size should be less than 325, for JIT inline

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that theoretically we could drop the regex code entirely and write a custom parser that specializes every formatter, but that we probably shouldn't as this means duplicating a lot of code and we'd likely end up having to maintain both. Exactly which patterns to specialize for is an open question. "%8.3f" is common, sure, but so are specifiers like "%-6d". I think it'd be good if we could collect some stats on which specifier patterns are the most common rather than just picking a few at random.

I see you added a microbenchmark for yet another happy case, which sort of misses my point about setting micros up to explore the boundaries: the set of microbenchmarks should ideally explore and verify both fast-paths and slow-paths, to show that the benefit of the former is significant while the overhead added to the slow-path is negligible. Adding a floatFormat2 that does return "%1.12f".formatted..., as an example:

Name                      Cnt    Base   Error     Test   Error  Unit   Diff%
StringFormat.floatFormat   15 316,133 ± 7,890  170,958 ± 8,231 ns/op   45,9% (p = 0,000*)
StringFormat.floatFormat2  15 342,767 ± 4,721  343,748 ± 3,753 ns/op   -0,3% (p = 0,506 )
  * = significant

This verifies that the added overhead is in the noise when the fast-path fail on this test.

We don't need to cover every possibility and have an ever-growing set of micros that all just test the fast-path, so I think you can remove the additions and instead adjust one or two of the existing microbenchmarks so that it verifies the slow-path with your PR applied.

@wenshao
Copy link
Contributor Author

wenshao commented Sep 18, 2023

I enhanced parse fast-path to support more specifiers, including:

% flag_1 width_1
% flag_2
% width_2
% width_1 . precesion_1

now benchmark on macbook m1 pro result is:

-Benchmark                           Mode  Cnt     Score     Error  Units (optimized)
-StringFormat.complexFormat          avgt   15  2049.387 ? 121.539  ns/op
-StringFormat.flags2Format           avgt   15   430.964 ?   2.414  ns/op
-StringFormat.flagsFormat            avgt   15   257.851 ?  23.833  ns/op
-StringFormat.stringFormat           avgt   15    63.564 ?  10.490  ns/op
-StringFormat.stringIntFormat        avgt   15    88.111 ?   0.678  ns/op
-StringFormat.width2Format           avgt   15   349.304 ?  31.349  ns/op
-StringFormat.width2PrecisionFormat  avgt   15   464.621 ?  53.918  ns/op
-StringFormat.widthFormat            avgt   15   301.997 ?  34.974  ns/op
-StringFormat.widthPrecisionFormat   avgt   15   484.526 ?  38.098  ns/op
-StringFormat.widthStringFormat      avgt   15   235.421 ?  32.955  ns/op
-StringFormat.widthStringIntFormat   avgt   15   315.178 ?  15.154  ns/op

+Benchmark                           Mode  Cnt    Score    Error  Units
+StringFormat.complexFormat          avgt   15  702.407 ? 85.481  ns/op (+191.77)
+StringFormat.flags2Format           avgt   15  329.551 ?  1.610  ns/op (+30.78)
+StringFormat.flagsFormat            avgt   15  125.798 ?  1.109  ns/op (+104.98)
+StringFormat.stringFormat           avgt   15   60.029 ?  6.275  ns/op (+5.89)
+StringFormat.stringIntFormat        avgt   15   89.020 ?  0.575  ns/op (-1.03)
+StringFormat.width2Format           avgt   15  135.743 ?  0.643  ns/op (+157.33)
+StringFormat.width2PrecisionFormat  avgt   15  351.408 ? 21.031  ns/op (+32.22)
+StringFormat.widthFormat            avgt   15  208.843 ? 47.504  ns/op (+44.61)
+StringFormat.widthPrecisionFormat   avgt   15  354.375 ? 67.314  ns/op (+36.73)
+StringFormat.widthStringFormat      avgt   15   74.846 ? 19.604  ns/op (+214.55)
+StringFormat.widthStringIntFormat   avgt   15  101.638 ?  0.961  ns/op (+210.10)

@cl4es
Copy link
Member

cl4es commented Sep 20, 2023

I was worried this would sprawl out more, but perhaps ~230 lines of code is a reasonable extra weight to make the long tail of String.format's regex-free.

I was going to comment that the flag parsing was broken in f303f29 but it seems that it was fixed in the latest. I think we need to make a review pass over all existing tests to make sure all imaginable variants are covered.

The parser code also ought to be shared between Formatter and FormatProcessor so that there's a single source of truth going forward.

@wenshao
Copy link
Contributor Author

wenshao commented Sep 21, 2023

I was worried this would sprawl out more, but perhaps ~230 lines of code is a reasonable extra weight to make the long tail of String.format's regex-free.

I was going to comment that the flag parsing was broken in f303f29 but it seems that it was fixed in the latest. I think we need to make a review pass over all existing tests to make sure all imaginable variants are covered.

The parser code also ought to be shared between Formatter and FormatProcessor so that there's a single source of truth going forward.

The codes of Formatter and FormatProcessor have been regex-free. There are many changes and require more detailed review.

@cl4es
Copy link
Member

cl4es commented Sep 21, 2023

I think it makes sense to file an RFE and do a full review of this. "Regex-free parsing of Formatter and FormatProcessor specifiers"?

@wenshao wenshao changed the title optimization for String::format Regex-free parsing of Formatter and FormatProcessor specifiers Sep 21, 2023
@wenshao wenshao changed the title Regex-free parsing of Formatter and FormatProcessor specifiers 8316704: Regex-free parsing of Formatter and FormatProcessor specifiers Sep 22, 2023
@wenshao
Copy link
Contributor Author

wenshao commented Oct 16, 2023

@rgiulietti can you help me continue to review this PR?

@wenshao
Copy link
Contributor Author

wenshao commented Oct 19, 2023

Thanks to @rgiulietti for the review, I have changed the FormatProcessor according to your suggestion and added test cases

@wenshao
Copy link
Contributor Author

wenshao commented Oct 30, 2023

@rgiulietti Is there anything else that needs to be modified?

@wenshao
Copy link
Contributor Author

wenshao commented Nov 7, 2023

Can anyone help me review this PR?

@wenshao
Copy link
Contributor Author

wenshao commented Nov 21, 2023

@RogerRiggs Can you help me review this PR?

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 19, 2023

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@wenshao
Copy link
Contributor Author

wenshao commented Dec 22, 2023

@AlanBateman @cl4es Can you help me continue to complete the review of this PR?

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 19, 2024

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 6, 2024
@wenshao
Copy link
Contributor Author

wenshao commented Feb 6, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Feb 6, 2024

@wenshao
Your change (at version a5f1a4f) is now ready to be sponsored by a Committer.

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Feb 6, 2024
@rgiulietti
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Feb 6, 2024

Going to push as commit 50b17d9.
Since your change was applied there have been 180 commits pushed to the master branch:

  • 51d7169: 8320237: C2: late inlining of method handle invoke causes duplicate lines in PrintInlining output
  • fd89b33: 8316992: Potential null pointer from get_current_thread JVMCI helper function.
  • d1c8215: 8325194: GHA: Add macOS M1 testing
  • f356970: 8322535: Change default AArch64 SpinPause instruction
  • b75c134: 8325313: Header format error in TestIntrinsicBailOut after JDK-8317299
  • 4cd3187: 8324874: AArch64: crypto pmull based CRC32/CRC32C intrinsics clobber V8-V15 registers
  • b02599d: 8298046: Fix hidden but significant trailing whitespace in properties files for serviceability code
  • 6d911f6: 8317299: safepoint scalarization doesn't keep track of the depth of the JVM state
  • 542b0b6: 8324126: Error message for mistyping -XX:+Unlock...Options is not helpful
  • 9ee9f28: 8325213: Flags introduced by configure script are not passed to ADLC build
  • ... and 170 more: https://git.openjdk.org/jdk/compare/a474b37212da5edbd5868c9157aff90aae00ca50...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 6, 2024
@openjdk openjdk bot closed this Feb 6, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Feb 6, 2024
@openjdk
Copy link

openjdk bot commented Feb 6, 2024

@rgiulietti @wenshao Pushed as commit 50b17d9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

6 participants