Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8276220: Reduce excessive allocations in DateTimeFormatter #6188

Closed
wants to merge 10 commits into from

Conversation

cl4es
Copy link
Member

@cl4es cl4es commented Nov 1, 2021

Prompted by a request from Volkan Yazıcı I took a look at why the java.time formatters are less efficient for some common patterns than custom formatters in apache-commons and log4j. This patch reduces the gap, without having looked at the third party implementations.

When printing times:

  • Avoid turning integral values into Strings before appending them to the buffer
  • Specialize appendFraction for NANO_OF_SECOND to avoid use of BigDecimal

This means a speed-up and reduction in allocations when formatting almost any date or time pattern, and especially so when including sub-second parts (S-SSSSSSSSS).

Much of the remaining overhead can be traced to the need to create a DateTimePrintContext and adjusting Instants into a ZonedDateTime internally. We could likely also win performance by specializing some common patterns.

Testing: tier1-3


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8276220: Reduce excessive allocations in DateTimeFormatter

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6188/head:pull/6188
$ git checkout pull/6188

Update a local copy of the PR:
$ git checkout pull/6188
$ git pull https://git.openjdk.java.net/jdk pull/6188/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6188

View PR using the GUI difftool:
$ git pr show -t 6188

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6188.diff

@cl4es
Copy link
Member Author

cl4es commented Nov 1, 2021

Microbenchmark numbers for the supplied DateTimeFormatterBench.

Baseline:

Benchmark                               (pattern)   Mode  Cnt       Score       Error   Units
formatInstants                           HH:mm:ss  thrpt   15       6.558 ±     0.125  ops/ms
  ·gc.alloc.rate.norm                    HH:mm:ss  thrpt   15  192015.139 ±     0.352    B/op
formatInstants                       HH:mm:ss.SSS  thrpt   15       2.772 ±     0.036  ops/ms
  ·gc.alloc.rate.norm                HH:mm:ss.SSS  thrpt   15  696805.289 ± 11280.987    B/op
formatInstants              yyyy-MM-dd'T'HH:mm:ss  thrpt   15       4.025 ±     0.056  ops/ms
  ·gc.alloc.rate.norm       yyyy-MM-dd'T'HH:mm:ss  thrpt   15  264020.926 ±     0.555    B/op
formatInstants          yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15       2.121 ±     0.026  ops/ms
  ·gc.alloc.rate.norm   yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15  768811.221 ± 11281.118    B/op
formatZonedDateTime                      HH:mm:ss  thrpt   15       8.797 ±     0.254  ops/ms
  ·gc.alloc.rate.norm                    HH:mm:ss  thrpt   15   96007.787 ±     0.331    B/op
formatZonedDateTime                  HH:mm:ss.SSS  thrpt   15       3.109 ±     0.024  ops/ms
  ·gc.alloc.rate.norm                HH:mm:ss.SSS  thrpt   15  600798.055 ± 11280.992    B/op
formatZonedDateTime         yyyy-MM-dd'T'HH:mm:ss  thrpt   15       4.814 ±     0.037  ops/ms
  ·gc.alloc.rate.norm       yyyy-MM-dd'T'HH:mm:ss  thrpt   15  168013.636 ±     0.389    B/op
formatZonedDateTime     yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15       2.299 ±     0.059  ops/ms
  ·gc.alloc.rate.norm   yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15  680012.566 ± 11281.140    B/op

Patched:

Benchmark                               (pattern)   Mode  Cnt       Score      Error   Units
formatInstants                           HH:mm:ss  thrpt   15       7.721 ±    0.114  ops/ms
  ·gc.alloc.rate.norm                    HH:mm:ss  thrpt   15  120010.043 ±    0.934    B/op
formatInstants                       HH:mm:ss.SSS  thrpt   15       5.684 ±    0.083  ops/ms
  ·gc.alloc.rate.norm                HH:mm:ss.SSS  thrpt   15  120009.973 ±    0.608    B/op
formatInstants              yyyy-MM-dd'T'HH:mm:ss  thrpt   15       4.962 ±    0.058  ops/ms
  ·gc.alloc.rate.norm       yyyy-MM-dd'T'HH:mm:ss  thrpt   15  120010.027 ±    0.703    B/op
formatInstants          yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15       3.889 ±    0.068  ops/ms
  ·gc.alloc.rate.norm   yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15  120010.284 ±    1.003    B/op
formatZonedDateTime                      HH:mm:ss  thrpt   15      11.087 ±    0.404  ops/ms
  ·gc.alloc.rate.norm                    HH:mm:ss  thrpt   15   24002.072 ±    0.219    B/op
formatZonedDateTime                  HH:mm:ss.SSS  thrpt   15       7.325 ±    0.139  ops/ms
  ·gc.alloc.rate.norm                HH:mm:ss.SSS  thrpt   15   24002.080 ±    0.267    B/op
formatZonedDateTime         yyyy-MM-dd'T'HH:mm:ss  thrpt   15       6.127 ±    0.040  ops/ms
  ·gc.alloc.rate.norm       yyyy-MM-dd'T'HH:mm:ss  thrpt   15   24002.084 ±    0.459    B/op
formatZonedDateTime     yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15       4.576 ±    0.049  ops/ms
  ·gc.alloc.rate.norm   yyyy-MM-dd'T'HH:mm:ss.SSS  thrpt   15   24002.102 ±    0.511    B/op

The most dramatic improvement is seen for formatZonedDateTime using the format yyyy-MM-dd'T'HH:mm:ss.SSS, which sees a 2x speed-up and almost 97% reduction in allocations. The same variant using Instants see a 1.8x speed-up and almost 85% reduction in allocations.

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 1, 2021

👋 Welcome back redestad! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 1, 2021

@cl4es The following labels will be automatically applied to this pull request:

  • core-libs
  • i18n

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org labels Nov 1, 2021
@cl4es cl4es marked this pull request as ready for review November 1, 2021 15:21
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 1, 2021
@mlbridge
Copy link

mlbridge bot commented Nov 1, 2021

private final boolean decimalPoint;

// only instantiated and used if there's ever a value outside the allowed range
private FractionPrinterParser fallback;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class has to be safe in a multi-threaded environment. I'm not convinced it is safe right now, as the usage doesn't follow the standard racy single check idiom. At a minimum, there needs to be a comment explaining the thread-safety issues.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'll make sure to read into a local variable.

range.checkValidValue(value, field);
BigDecimal minBD = BigDecimal.valueOf(range.getMinimum());
BigDecimal rangeBD = BigDecimal.valueOf(range.getMaximum()).subtract(minBD).add(BigDecimal.ONE);
field.range().checkValidValue(value, field);
BigDecimal valueBD = BigDecimal.valueOf(value).subtract(minBD);
BigDecimal fraction = valueBD.divide(rangeBD, 9, RoundingMode.FLOOR);
// stripTrailingZeros bug
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this bug was fixed a while back, so this code could be simplified.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a reference to which bug this was and how it manifests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're referring to JDK-6480539: "BigDecimal.stripTrailingZeros() has no effect on zero itself ("0.0")", it was fixed in JDK 8.

ValueRange range = field.range();
range.checkValidValue(value, field);
BigDecimal minBD = BigDecimal.valueOf(range.getMinimum());
BigDecimal rangeBD = BigDecimal.valueOf(range.getMaximum()).subtract(minBD).add(BigDecimal.ONE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't be surprised if there is a way to replace the use of BigDecimal with calculations using long. Fundamentally, calculations like 15/60 -> 0.25 are not hard, but it depends on whether the exact results can be matched across a wide range of possible inputs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main engineering challenge is writing tests that ensure that we don't lose precision on an arbitrary fractional field.

// [0-999999999], we can't assume that holds for any custom Temporal
if (!field.range().isValidIntValue(value)) {
if (fallback == null) {
fallback = new FractionPrinterParser(field, minWidth, maxWidth, decimalPoint, subsequentWidth);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to see a test case cover this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When adding a test for this I discovered that FractionPrinterParser::format will end up calling field.range().checkValidValue(value, field) here. This means that the pre-existing implementation does check the value range and throws exceptions when trying to print a value outside of the field range.

To mimic the existing behavior we have to do the same check in NanosPrinterParser::format and drop the fallback (which would have somewhat nonsensical output for values outside the range, anyhow).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test case showing that values that are outside the range throw DateTimeException. This passes with and without the patch and mainly documents the pre-existing behavior.

Copy link
Contributor

@jodastephen jodastephen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the new tests, and finding that fraction formatting is more constrained than I thought it was.

I think there is a case for a spec update in DateTimeFormatterBuilder to clarify the constraint on the input value (at this point, that seems better than changing the behaviour of fraction formatting). As things stand, the exception that is thrown is not described by the spec. (The spec change could be part of this PR or a separate one).

@cl4es
Copy link
Member Author

cl4es commented Nov 3, 2021

Thanks for reviewing, @jodastephen!

I think a spec update sounds good, but I think that should be done in as a follow-up. If you would be willing to provide the wording for such a spec update I can take care of creating a CSR.

@naotoj
Copy link
Member

naotoj commented Nov 3, 2021

Looks good. I'd create a new test case class out of TestFractionPrinterParser, as you introduced the new NanosPrinterParser.

@cl4es
Copy link
Member Author

cl4es commented Nov 3, 2021

Looks good. I'd create a new test case class out of TestFractionPrinterParser, as you introduced the new NanosPrinterParser.

It was only indirectly a test of FractionPrinterParser; it's really a test of PrinterParsers built using appendFraction, which can be either FractionPrinterParser or the new NanosPrinterParser. So the name is still somewhat appropriate. We could rename it, but splitting it apart seems excessive.

I realized though that with my changes the test coverage of FractionPrinterParser is substantially reduced, since most of the testing is done using NANO_OF_SECOND. I'm adding a set of tests using similar input for MICRO_OF_SECOND that will exercise FractionPrinterParser.

@@ -79,7 +79,7 @@
/**
* Test FractionPrinterParser.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, then I'd add some comments that the test covers NanoPrinterParser as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done.

@naotoj
Copy link
Member

naotoj commented Nov 3, 2021

Oh, just noticed the copyright year->2021 in modified files. Should have noticed before.

Copy link
Member

@naotoj naotoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thank you for the fix!

@openjdk
Copy link

openjdk bot commented Nov 3, 2021

@cl4es This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8276220: Reduce excessive allocations in DateTimeFormatter

Reviewed-by: scolebourne, naoto

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 42 new commits pushed to the master branch:

  • f3320d2: 8276588: Change "ccc" to "CSR" in HotSpot sources
  • 32895ac: 8275650: Problemlist java/io/File/createTempFile/SpecialTempFile.java for Windows 11
  • c7f070f: 8276208: vmTestbase/nsk/jdb/repeat/repeat001/repeat001.java fails with "AssertionError: Unexpected output"
  • 684edbb: 8273922: (fs) UserDefinedFileAttributeView doesn't handle file names that are just under the MAX_PATH limit (win)
  • 7115892: 8276401: Use blessed modifier order in java.net.http
  • 0ef8cbe: 8276385: Re-run blessed-modifier-order script on java.desktop and jdk.accessibility
  • 724bf3b: 8275604: Zero: Reformat opclabels_data
  • 61cb4bc: 8276036: The value of full_count in the message of insufficient codecache is wrong
  • 8731846: 8276556: ProblemList java/nio/channels/FileChannel/LargeGatheringWrite.java on windows-x64
  • be1ca2b: 8276298: G1: Remove unused G1SegmentedArrayBufferList::add
  • ... and 32 more: https://git.openjdk.java.net/jdk/compare/5bb1992b8408a0d196b1afa308bc00d007458dbd...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 3, 2021
@cl4es
Copy link
Member Author

cl4es commented Nov 3, 2021

Thanks, Naoto!

/integrate

@openjdk
Copy link

openjdk bot commented Nov 3, 2021

Going to push as commit ce8c767.
Since your change was applied there have been 43 commits pushed to the master branch:

  • 0ab910d: 8276066: Reset LoopPercentProfileLimit for x86 due to suboptimal performance
  • f3320d2: 8276588: Change "ccc" to "CSR" in HotSpot sources
  • 32895ac: 8275650: Problemlist java/io/File/createTempFile/SpecialTempFile.java for Windows 11
  • c7f070f: 8276208: vmTestbase/nsk/jdb/repeat/repeat001/repeat001.java fails with "AssertionError: Unexpected output"
  • 684edbb: 8273922: (fs) UserDefinedFileAttributeView doesn't handle file names that are just under the MAX_PATH limit (win)
  • 7115892: 8276401: Use blessed modifier order in java.net.http
  • 0ef8cbe: 8276385: Re-run blessed-modifier-order script on java.desktop and jdk.accessibility
  • 724bf3b: 8275604: Zero: Reformat opclabels_data
  • 61cb4bc: 8276036: The value of full_count in the message of insufficient codecache is wrong
  • 8731846: 8276556: ProblemList java/nio/channels/FileChannel/LargeGatheringWrite.java on windows-x64
  • ... and 33 more: https://git.openjdk.java.net/jdk/compare/5bb1992b8408a0d196b1afa308bc00d007458dbd...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Nov 3, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 3, 2021
@openjdk
Copy link

openjdk bot commented Nov 3, 2021

@cl4es Pushed as commit ce8c767.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@cl4es cl4es deleted the improve_dtf branch November 3, 2021 22:58
@jodastephen
Copy link
Contributor

@cl4es For DateTimeFormatterBuilder spec, I propose changing the existing wording slightly

If the field value in the date-time to be printed is invalid it
cannot be printed and an exception will be thrown.

to

If the field value in the date-time to be printed is outside the
range of valid values then an exception will be thrown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org i18n i18n-dev@openjdk.org integrated Pull request has been integrated
4 participants