Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8300818: Reduce complexity of padding with DateTimeFormatter #12131

Closed
wants to merge 10 commits into from

Conversation

stsypanov
Copy link
Contributor

@stsypanov stsypanov commented Jan 21, 2023

Currently it's O(n) - we do n shifts of bytes within StringBuilder. This can be reduced to O(1) improving the code like:

DateTimeFormatter dtf = new DateTimeFormatterBuilder()
  .appendLiteral("Date:")
  .padNext(20, ' ')
  .append(DateTimeFormatter.ISO_DATE)
  .toFormatter();
String text = dtf.format(LocalDateTime.now());

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8300818: Reduce complexity of padding with DateTimeFormatter

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/12131/head:pull/12131
$ git checkout pull/12131

Update a local copy of the PR:
$ git checkout pull/12131
$ git pull https://git.openjdk.org/jdk.git pull/12131/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12131

View PR using the GUI difftool:
$ git pr show -t 12131

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12131.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 21, 2023

👋 Welcome back stsypanov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 21, 2023
@openjdk
Copy link

openjdk bot commented Jan 21, 2023

@stsypanov The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Jan 21, 2023
@mlbridge
Copy link

mlbridge bot commented Jan 21, 2023

for (int i = 0; i < padWidth - len; i++) {
buf.insert(preLen, padChar);
}
buf.insert(preLen, String.valueOf(padChar).repeat(padWidth - len));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you checked with a microbenchmark that this added allocation can be elided by JITs and that there's a significant speed-up with your changes? I don't have the necessary domain expertise to assert anything here but I suspect padding widths are typically short. Such as 2 or 4 (for day/month and year fields) so a micro should examine there's no regression for little to no padding. Unlike the original code you call insert even if padWidth - len == 0. This might be optimized away by JITs, but it'd be good to verify which is best.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modified code is called only when a user explicitly calls padNext(int, char), i.e. if I modified the example snippet as

DateTimeFormatter dtf = new DateTimeFormatterBuilder()
  .appendLiteral("Date:")
//.padNext(20, ' ')
  .append(DateTimeFormatter.ISO_DATE)
  .toFormatter();
String text = dtf.format(LocalDateTime.now());

there's no regression.

Right now I cannot build ad-hoc JDK with my changes and check it with JMH, so I'd convert this to draft until I can verify it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meant that you should verify that something like this, which just add a little padding, doesn't regress with your changes:

DateTimeFormatter dtf = new DateTimeFormatterBuilder()
    .appendLiteral("Year:")
    .padNext(5)
    .appendValue(ChronoField.YEAR)
    .toFormatter();
... 
dtf.format(LocalDateTime.now());

And similar for effectively no padding (.padNext(4) in the above example). As this API might often be used to ensure short 2-4 char fields are correctly padded I think it's important that we're not adding overhead to such use cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added benchmark for this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special casing for len == 0 and keeping the existing buf.insert for len == 1 would avoid object creation except when it would improve performance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RogerRiggs sorry I don't get it. Maybe you mean speacial casing for padWidth - len?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant on the length of the inserted padding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@stsypanov stsypanov marked this pull request as draft January 22, 2023 13:32
@stsypanov stsypanov marked this pull request as draft January 22, 2023 13:32
@openjdk openjdk bot removed the rfr Pull request is ready for review label Jan 22, 2023
@RogerRiggs
Copy link
Contributor

fyi, you might want to wait for #12728

@stsypanov
Copy link
Contributor Author

@RogerRiggs thanks for pointing this out. Could we also have an overloaded method for inserting apart from appending?

@stsypanov
Copy link
Contributor Author

Benchmarking:

make test TEST="micro:time.format.DateTimeFormatterBench" MICRO="OPTIONS=-prof gc"

master

Benchmark                                                          Mode  Cnt     Score      Error   Units
DateTimeFormatterBench.formatWithPadding                          thrpt   15  1621.744 ±  216.203  ops/ms
DateTimeFormatterBench.formatWithPadding:·gc.alloc.rate           thrpt   15   295.938 ±   39.238  MB/sec
DateTimeFormatterBench.formatWithPadding:·gc.alloc.rate.norm      thrpt   15   192.000 ±    0.001    B/op
DateTimeFormatterBench.formatWithPadding:·gc.count                thrpt   15    23.000             counts
DateTimeFormatterBench.formatWithPadding:·gc.time                 thrpt   15   353.000                 ms
DateTimeFormatterBench.formatWithZeroPadding                      thrpt   15  2787.775 ± 1383.428  ops/ms
DateTimeFormatterBench.formatWithZeroPadding:·gc.alloc.rate       thrpt   15   401.775 ±  201.097  MB/sec
DateTimeFormatterBench.formatWithZeroPadding:·gc.alloc.rate.norm  thrpt   15   152.000 ±    0.001    B/op
DateTimeFormatterBench.formatWithZeroPadding:·gc.count            thrpt   15    22.000             counts
DateTimeFormatterBench.formatWithZeroPadding:·gc.time             thrpt   15   396.000                 ms


dtfb

Benchmark                                                          Mode  Cnt      Score     Error   Units
DateTimeFormatterBench.formatWithPadding                          thrpt   15   5489.480 ± 200.120  ops/ms
DateTimeFormatterBench.formatWithPadding:·gc.alloc.rate           thrpt   15   1171.637 ±  42.648  MB/sec
DateTimeFormatterBench.formatWithPadding:·gc.alloc.rate.norm      thrpt   15    224.000 ±   0.001    B/op
DateTimeFormatterBench.formatWithPadding:·gc.count                thrpt   15     62.000            counts
DateTimeFormatterBench.formatWithPadding:·gc.time                 thrpt   15    342.000                ms
DateTimeFormatterBench.formatWithZeroPadding                      thrpt   15  14355.239 ± 640.377  ops/ms
DateTimeFormatterBench.formatWithZeroPadding:·gc.alloc.rate       thrpt   15   2078.802 ±  93.278  MB/sec
DateTimeFormatterBench.formatWithZeroPadding:·gc.alloc.rate.norm  thrpt   15    152.000 ±   0.001    B/op
DateTimeFormatterBench.formatWithZeroPadding:·gc.count            thrpt   15     77.000            counts
DateTimeFormatterBench.formatWithZeroPadding:·gc.time             thrpt   15    555.000                ms

@stsypanov stsypanov marked this pull request as ready for review March 24, 2023 19:31
@stsypanov stsypanov requested a review from cl4es March 24, 2023 19:31
@stsypanov stsypanov marked this pull request as draft March 24, 2023 19:33
@openjdk openjdk bot added rfr Pull request is ready for review and removed rfr Pull request is ready for review labels Mar 24, 2023
@stsypanov stsypanov marked this pull request as ready for review April 10, 2023 13:55
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 10, 2023
@stsypanov
Copy link
Contributor Author

@RogerRiggs mentioned PR was merged without suggested repeat-into-position method, so I this code remains as is.

for (int i = 0; i < padWidth - len; i++) {
buf.insert(preLen, padChar);
}
buf.insert(preLen, String.valueOf(padChar).repeat(padWidth - len));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special casing for len == 0 and keeping the existing buf.insert for len == 1 would avoid object creation except when it would improve performance.

@stsypanov
Copy link
Contributor Author

With latest changes

baseline

Benchmark                                                                     Mode  Cnt      Score     Error   Units
DateTimeFormatterWithPaddingBench.formatWithPadding                          thrpt   15   5848.753 ± 223.592  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.alloc.rate           thrpt   15   1070.358 ±  40.881  MB/sec
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.alloc.rate.norm      thrpt   15    192.000 ±   0.001    B/op
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.count                thrpt   15     53.000            counts
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.time                 thrpt   15    479.000                ms
DateTimeFormatterWithPaddingBench.formatWithZeroPadding                      thrpt   15  22032.107 ± 481.443  ops/ms
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.alloc.rate       thrpt   15   3192.003 ±  69.901  MB/sec
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.alloc.rate.norm  thrpt   15    152.000 ±   0.001    B/op
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.count            thrpt   15     83.000            counts
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.time             thrpt   15   1046.000                ms

patch

Benchmark                                                                     Mode  Cnt      Score     Error   Units
DateTimeFormatterWithPaddingBench.formatWithPadding                          thrpt   15   7982.747 ± 374.638  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.alloc.rate           thrpt   15   1704.393 ±  80.063  MB/sec
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.alloc.rate.norm      thrpt   15    224.000 ±   0.001    B/op
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.count                thrpt   15     64.000            counts
DateTimeFormatterWithPaddingBench.formatWithPadding:·gc.time                 thrpt   15    641.000                ms
DateTimeFormatterWithPaddingBench.formatWithZeroPadding                      thrpt   15  22024.559 ± 794.713  ops/ms
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.alloc.rate       thrpt   15   3190.932 ± 115.145  MB/sec
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.alloc.rate.norm  thrpt   15    152.000 ±   0.001    B/op
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.count            thrpt   15     85.000            counts
DateTimeFormatterWithPaddingBench.formatWithZeroPadding:·gc.time             thrpt   15   1019.000                ms

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; thanks

@openjdk
Copy link

openjdk bot commented Apr 20, 2023

@stsypanov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8300818: Reduce complexity of padding with DateTimeFormatter

Reviewed-by: redestad, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 177 new commits pushed to the master branch:

  • 8d696ae: 8306575: Clean up and open source four Dialog related tests
  • 9ed456f: 8306634: Open source AWT Event related tests
  • b2240bf: 8304696: Duplicate class names in dynamicArchive tests can lead to test failure
  • cb158ff: 8296153: Bump minimum boot jdk to JDK 20
  • 117c5b1: 8279216: Investigate implementation of premultiplied alpha in the Little-CMS 2.13
  • 723037a: 8298048: Combine CDS archive heap into a single block
  • d518dbf: 8306440: Rename PSS:_num_optional_regions to _max_num_optional_regions
  • 9cd5741: 8306436: Rename PSS*:_n_workers to PSS*:_num_workers
  • 6e77e14: 8306456: Don't leak _worklist's memory in PhaseLive::compute
  • be6031b: 8303703: Add support of execution tests using virtual thread factory jtreg plugin
  • ... and 167 more: https://git.openjdk.org/jdk/compare/ddd50d0db31e50c0fcedafa290d6eac277ddae3e...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@cl4es, @RogerRiggs) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 20, 2023
@stsypanov
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Apr 21, 2023
@openjdk
Copy link

openjdk bot commented Apr 21, 2023

@stsypanov
Your change (at version baa7f92) is now ready to be sponsored by a Committer.

@openjdk openjdk bot removed the sponsor Pull request is ready to be sponsored label Apr 21, 2023
@RogerRiggs
Copy link
Contributor

I added a few more small padding cases and and re-ran on my Mac:

Baseline:

Benchmark                                                    Mode  Cnt      Score     Error   Units
DateTimeFormatterWithPaddingBench.formatWithPadding         thrpt   15   5167.611 ±  85.096  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength0  thrpt   15  22004.618 ± 536.207  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength1  thrpt   15  18041.569 ± 142.627  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength2  thrpt   15  15966.853 ± 182.284  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength3  thrpt   15  13472.887 ± 181.794  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength4  thrpt   15  12991.458 ± 402.190  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength5  thrpt   15  11464.481 ± 603.970  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength6  thrpt   15  10386.347 ± 673.596  ops/ms


Patch:

Benchmark                                                    Mode  Cnt      Score     Error   Units
DateTimeFormatterWithPaddingBench.formatWithPadding         thrpt   15   6548.853 ± 201.695  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength0  thrpt   15  22269.428 ± 737.613  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength1  thrpt   15  19086.146 ± 117.055  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength2  thrpt   15  15627.898 ± 314.890  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength3  thrpt   15  15447.997 ± 121.946  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength4  thrpt   15  15185.318 ± 198.266  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength5  thrpt   15  15243.618 ± 147.131  ops/ms
DateTimeFormatterWithPaddingBench.formatWithPaddingLength6  thrpt   15  15093.142 ± 294.421  ops/ms

The expected steps in performance match the code paths of 0, 1, 2-6.

@stsypanov
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Apr 22, 2023
@openjdk
Copy link

openjdk bot commented Apr 22, 2023

@stsypanov
Your change (at version 48a0cc7) is now ready to be sponsored by a Committer.

@RogerRiggs
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented May 1, 2023

Going to push as commit 561ec9c.
Since your change was applied there have been 317 commits pushed to the master branch:

  • ae5f678: 8282384: [LOOM] Need test for ThreadReference.interrupt() on a vthread
  • c7e1df8: 8304760: Add 2 Microsoft TLS roots
  • 6acf032: 8306678: Replace use of os.version with an internal Version record
  • f00a748: 8304915: Create jdk.internal.util.Architecture enum and apply
  • 7d07d19: 8305201: Improve error message for GroupLayouts that are too large on SysV
  • 67dd841: 8305093: Linker cache should not take layout names into account
  • d437c61: 8305672: Surprising definite assignment error after JDK-8043179
  • b39a9bf: 8301703: java.base jdk.internal.foreign.abi.BindingSpecializer uses ASM to generate classes
  • 1de1a38: 8303002: Reject packed structs from linker
  • 316d303: 8306851: Move Method access flags
  • ... and 307 more: https://git.openjdk.org/jdk/compare/ddd50d0db31e50c0fcedafa290d6eac277ddae3e...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 1, 2023
@openjdk openjdk bot closed this May 1, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels May 1, 2023
@openjdk
Copy link

openjdk bot commented May 1, 2023

@RogerRiggs @stsypanov Pushed as commit 561ec9c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@stsypanov stsypanov deleted the dtfb branch May 1, 2023 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants