Conversation

@djelinski
Member

@djelinski djelinski commented Oct 24, 2023

  • use LARGE_INTEGER.QuadPart instead of assembling the jlong from high/low parts
  • precalculate counts_per_nano to avoid costly floating-point division in counter to nanosecond conversion
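
A minimal sketch of the second bullet, not the actual HotSpot code: the per-call division is replaced by a multiplication with a factor computed once from the counter frequency. The names `to_nanos_with_division`, `CounterScale` and `nanos_per_count` are hypothetical and exist only for this illustration; the frequency is passed in as a plain integer rather than read from QueryPerformanceFrequency().

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t NANOSECS_PER_SEC = 1000000000LL;

// Straightforward conversion: one floating-point division on every call.
inline int64_t to_nanos_with_division(int64_t counts, int64_t frequency) {
    const double seconds = static_cast<double>(counts) / static_cast<double>(frequency);
    return static_cast<int64_t>(seconds * static_cast<double>(NANOSECS_PER_SEC));
}

// Hypothetical holder for the factor. The division happens once, in the
// constructor; each later conversion is a single multiplication.
struct CounterScale {
    double nanos_per_count;  // NANOSECS_PER_SEC / frequency

    explicit CounterScale(int64_t frequency)
        : nanos_per_count(static_cast<double>(NANOSECS_PER_SEC) /
                          static_cast<double>(frequency)) {
        assert(frequency > 0);
    }

    int64_t to_nanos(int64_t counts) const {
        return static_cast<int64_t>(static_cast<double>(counts) * nanos_per_count);
    }
};
```

With an assumed 10 MHz counter the factor is 100 ns per count, so 12345 counts convert to 1234500 ns. The two variants can differ in the last digit for inputs where the intermediate double is not exact, because both truncate toward zero.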

Benchmark before:
SystemTime.nanoTime avgt 15 19,366 ± 0,383 ns/op

After:
SystemTime.nanoTime avgt 15 15,812 ± 0,385 ns/op

Tier1-2 clean.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8318709: Improve System.nanoTime performance on Windows (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16336/head:pull/16336
$ git checkout pull/16336

Update a local copy of the PR:
$ git checkout pull/16336
$ git pull https://git.openjdk.org/jdk.git pull/16336/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16336

View PR using the GUI difftool:
$ git pr show -t 16336

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16336.diff

@bridgekeeper

bridgekeeper bot commented Oct 24, 2023

👋 Welcome back djelinski! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 24, 2023

@djelinski The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Oct 24, 2023
@djelinski djelinski changed the title from "Improve nanoTime performance" to "8318709" Oct 24, 2023
@openjdk openjdk bot changed the title from "8318709" to "8318709: Improve System.nanoTime performance on Windows" Oct 24, 2023
@djelinski djelinski marked this pull request as ready for review October 24, 2023 10:07
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 24, 2023
@mlbridge

mlbridge bot commented Oct 24, 2023

Webrevs

@dholmes-ora
Member

Hi @djelinski , I hope to review this tomorrow but need to do some research first. :)

Contributor

@c-cleary c-cleary left a comment

LGTM. I can see that QueryPerformanceFrequency() sets count to the frequency of the performance counter, which is fixed at boot and seems to be hardware-specific. Nice gain.

Member

@dholmes-ora dholmes-ora left a comment

Looking at:

https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-large_integer-r1

It seems that QuadPart needs native 64-bit compiler support, and I'm unclear whether that is available when building for 32-bit Windows (which IIRC is a deprecated port but still present in this release). Also, the web page indicates it is only available from Windows 10, which doesn't make a lot of sense.

To be honest I am struggling to understand how LARGE_INTEGER gets used by QPC/QPF as it is a union, so we have to know which member QPC/QPF sets, so we can read back the correct member. Though I wonder if given endian-ness the layout of the structs simply coincides with that of a native 64-bit variable?

@djelinski
Member Author

Visual Studio supports 64bit integer types even in 32 bit mode. In fact, the JDK requires a compiler with 64bit integer support - it's used in jlong typedef, for example.

I checked the typedef for LARGE_INTEGER in Windows SDK, and it looks exactly like the one on the MSDN page - no ifdefs to check if we are compiling in 64bit mode.

> Though I wonder if given endian-ness the layout of the structs simply coincides with that of a native 64-bit variable?

That's it exactly.
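
A small sketch of why the two views agree, under the assumption of a little-endian target. `Parts` is a hypothetical stand-in for the LowPart/HighPart view of LARGE_INTEGER (the real type comes from the Windows SDK and is not used here), and `std::memcpy` is used instead of reading an inactive union member so the example stays within standard C++.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the two-field view: low half first, then high half.
struct Parts {
    uint32_t LowPart;
    int32_t  HighPart;
};
static_assert(sizeof(Parts) == sizeof(int64_t), "both views occupy 8 bytes");

// True when the least significant byte is stored first.
inline bool is_little_endian() {
    const uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reinterpret the bytes of a 64-bit value as the two-field view.
inline Parts view_as_parts(int64_t quad) {
    Parts p;
    std::memcpy(&p, &quad, sizeof p);
    return p;
}

// Assemble the 64-bit value from its halves by shifting and OR-ing.
inline int64_t assemble(const Parts& p) {
    const uint64_t high = static_cast<uint64_t>(static_cast<uint32_t>(p.HighPart)) << 32;
    return static_cast<int64_t>(high | p.LowPart);
}

// Round trip: 64-bit value -> halves -> 64-bit value.
inline int64_t quad_via_parts(int64_t quad) {
    assert(is_little_endian() && "sketch assumes a little-endian target");
    return assemble(view_as_parts(quad));
}
```

On a little-endian machine the low 32 bits of the 64-bit value occupy the first four bytes, which is exactly where LowPart sits, so assembling the halves by hand and reading the whole value yield the same number.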

By the way, here's a nice article from Raymond Chen about this:
https://devblogs.microsoft.com/oldnewthing/20040825-00/?p=38053

@dholmes-ora
Member

Hmm that blog ends with:

Exercise: Why are the LARGE_INTEGER and ULARGE_INTEGER structures not affected?

And I'd like to know why we do not have the same alignment issue. (This all seems rather hackish, but presumably their library guys get the okay from their compiler guys ... though I wonder then about folk trying to build the Windows code with gcc.)

@djelinski
Member Author

LARGE_INTEGER does not have the same alignment issue because unions are aligned to the alignment required by the largest type, and LONGLONG (aka int64) QuadPart requires 8-byte alignment on 64bit machines.
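
The alignment rule can be checked at compile time. This is a sketch with hypothetical stand-in types (`TwoHalves` for a struct made only of 32-bit fields, `QuadUnion` for a LARGE_INTEGER-like union), not the Windows SDK definitions.

```cpp
#include <cstdint>

// Struct of two 32-bit fields: its alignment is that of a 32-bit integer,
// so an instance is not guaranteed to be suitably aligned for a 64-bit access.
struct TwoHalves {
    uint32_t LowPart;
    uint32_t HighPart;
};

// Union that also has a 64-bit member: it takes the strictest alignment
// among its members, which is that of the 64-bit integer.
union QuadUnion {
    struct {
        uint32_t LowPart;
        int32_t  HighPart;
    } u;
    int64_t QuadPart;
};

static_assert(alignof(TwoHalves) == alignof(uint32_t),
              "two-field struct is only as aligned as its 32-bit members");
static_assert(alignof(QuadUnion) == alignof(int64_t),
              "union is as aligned as its 64-bit member");
```

The actual numeric value of `alignof(int64_t)` depends on the compiler and target; the point of the sketch is only that the union inherits it while the plain two-field struct does not.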

@dholmes-ora
Member

Ah I see. Okay. This still seems like a code smell on the MS side but that's not relevant to this PR.

return result;
}

static double counts_per_nano; // NANOSECS_PER_SEC / performance_frequency
Member

Doesn't that calculate nanos_per_count?

Member

Which is what we want of course.

Member Author

D'oh! Of course it does. I'll rename in a sec.

Member

@dholmes-ora dholmes-ora left a comment

Looks good. Thanks for this optimization!

@openjdk

openjdk bot commented Oct 26, 2023

@djelinski This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8318709: Improve System.nanoTime performance on Windows

Reviewed-by: ccleary, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 58 new commits pushed to the master branch:

  • 723db2d: 8305321: Remove unused exports in java.desktop
  • 811b436: 8318720: G1: Memory leak in G1CodeRootSet after JDK-8315503
  • a542f73: 8318843: ProblemList java/lang/management/MemoryMXBean/CollectionUsageThreshold.java in Xcomp
  • d96f38b: 8317510: Change Windows debug symbol files naming to avoid losing info when an executable and a library share the same name
  • 10427c0: 8318613: ChoiceFormat patterns are not well tested
  • ca3bdfc: 8318186: ChoiceFormat inconsistency between applyPattern() and setChoices()
  • a520887: 8318487: Specification of the ListFormat.equals() method can be improved
  • cf4ede0: 8317360: Missing null checks in JfrCheckpointManager and JfrStringPool initialization routines
  • 9e98ee6: 8318735: RISC-V: Enable related hotspot tests run on riscv
  • 29d462a: 8318727: Enable parallelism in vmTestbase/vm/gc/concurrent tests
  • ... and 48 more: https://git.openjdk.org/jdk/compare/bea2d48696ee2c213e475ca3aa3aa9c412b91089...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 26, 2023
@djelinski
Member Author

Thanks for the reviews!

/integrate

@openjdk

openjdk bot commented Oct 30, 2023

Going to push as commit 3934127.
Since your change was applied there have been 91 commits pushed to the master branch:

  • 83eb206: 8318889: C2: add bailout after assert Bad graph detected in build_loop_late
  • 1183b22: 8310978: JFR events SocketReadEvent/SocketWriteEvent for Socket adaptor ops
  • 988e1df: 8318953: RISC-V: Small refactoring for MacroAssembler::test_bit
  • ce0ca47: 8319067: ProblemList serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java on linux-aarch64 in Xcomp mode
  • db34025: 8318827: RISC-V: Improve readability of fclass result testing
  • 1ec0d02: 8318225: RISC-V: C2 UModI
  • 96bec35: 8316996: Catalog API Enhancement: add a factory method
  • d226014: 8318850: Duplicate code in the LCMSImageLayout
  • c593f8b: 8318091: Remove empty initIDs functions
  • 4f9f195: 8318753: hsdis binutils may place libs in lib64
  • ... and 81 more: https://git.openjdk.org/jdk/compare/bea2d48696ee2c213e475ca3aa3aa9c412b91089...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 30, 2023
@openjdk openjdk bot closed this Oct 30, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 30, 2023
@openjdk

openjdk bot commented Oct 30, 2023

@djelinski Pushed as commit 3934127.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@djelinski djelinski deleted the nanotime-perf branch October 30, 2023 07:57

Labels

hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated

3 participants