8372584: [Linux]: Replace reading proc to get thread user CPU time with clock_gettime #28556
Fixy Remove imports added by IDE Remove imports added by IDE Don't touch bit 3 Fix name
/issue add JDK-8210452
@JonasNorlinder this PR isn't fixing two issues. I think JDK-8372584 should just be closed as a duplicate of JDK-8210452 (which I had forgotten about and which @larry-cable did not get further with).
dholmes-ora
left a comment
Overall looks good. I'd forgotten that I found out about this in 2018.
A few minor nits.
Can't really comment on the benchmark.
// set, which return system+user time, which is what the POSIX standard mandates, see
// POSIX.1-2024/IEEE Std 1003.1-2024 §3.90.
static clockid_t get_thread_clockid(Thread* thread, bool total, bool* success) {
  constexpr clockid_t CLOCK_TYPE_MASK = 3;
Shouldn't the mask be covering 3-bits?
No, we should not touch bit 3, which encodes whether the clock is for a thread or a process. See here: https://elixir.bootlin.com/linux/v6.17.9/source/include/linux/posix-timers_types.h#L9-L19.
Okay, so "encoding the clock types in the last three bits" needs a bit more explanation.
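The layout behind that exchange can be sketched as follows. The constants mirror the kernel header linked above, but they are redefined here as assumptions because they are not exported to userspace, and the helper names are hypothetical. The two low bits select the clock type, the third bit marks a per-thread clock, and the remaining bits hold the inverted PID/TID, which is why a mask of 3 changes the clock type without touching the per-thread flag.

```cpp
#include <sys/types.h>
#include <time.h>

// Assumed values, copied from the kernel's posix-timers_types.h layout.
constexpr clockid_t CPUCLOCK_PROF = 0;            // user + system (tick based)
constexpr clockid_t CPUCLOCK_VIRT = 1;            // user time only
constexpr clockid_t CPUCLOCK_SCHED = 2;           // scheduler runtime (user + system)
constexpr clockid_t CPUCLOCK_CLOCK_MASK = 3;      // two low bits: clock type
constexpr clockid_t CPUCLOCK_PERTHREAD_MASK = 4;  // third bit: thread vs. process

// Hypothetical helper: build a per-thread CPU clock id for a TID.
// Unsigned arithmetic avoids shifting a negative signed value.
inline clockid_t make_thread_cpuclock(pid_t tid, clockid_t type) {
  return static_cast<clockid_t>((~static_cast<unsigned>(tid) << 3) |
                                static_cast<unsigned>(type) |
                                static_cast<unsigned>(CPUCLOCK_PERTHREAD_MASK));
}

// Hypothetical decoders. The right shift relies on arithmetic shifting of
// negative values, which GCC and Clang provide.
inline pid_t clock_pid(clockid_t c) { return static_cast<pid_t>(~(c >> 3)); }
inline bool clock_is_per_thread(clockid_t c) { return (c & CPUCLOCK_PERTHREAD_MASK) != 0; }
inline clockid_t clock_type(clockid_t c) { return c & CPUCLOCK_CLOCK_MASK; }

// Replace only the clock type; the per-thread bit and the TID are preserved.
inline clockid_t with_clock_type(clockid_t c, clockid_t type) {
  return (c & ~CPUCLOCK_CLOCK_MASK) | type;
}
```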
src/hotspot/os/linux/os_linux.cpp
Outdated
// It's possible to encounter a terminated native thread that failed
// to detach itself from the VM - which should result in ESRCH.
assert_status(rc == ESRCH, rc, "pthread_getcpuclockid failed");
*success = false;
The normal way I've seen this pattern used is to set it to true rather than assuming it was true to begin with.
Using a positive outcome like success to encode the result makes it possible to return like so: return success ? os::Linux::thread_cpu_time(clockid) : -1;. I prefer having the -1 at the end, as I find it easier to read. If we encoded a failure instead, we would need to write return !failure ? os::Linux::thread_cpu_time(clockid) : -1;. Hence, I would prefer keeping this as is, as double negatives may be harder to parse.
Okay but you are setting up a usage requirement without documenting anywhere that that requirement exists.
I would prefer if we kept using the standard pattern. When in Rome...
What we often do is this:
bool get_thread_clockid(Thread* thread, bool total, clockid_t* id);
or, alternatively, what David wrote.
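A minimal sketch of the suggested shape, assuming a pthread_t in place of HotSpot's Thread* so that it compiles on its own. This is an illustration of the pattern, not the code that was integrated. The function reports success through its return value and writes the clock id through the out-parameter, so the caller does not have to pre-initialize a success flag.

```cpp
#include <pthread.h>
#include <time.h>

// Hypothetical stand-in for the helper under review: returns true on success
// and stores the clock id in *clockid; on failure *clockid is not meaningful.
static bool get_thread_clockid(pthread_t thread, bool total, clockid_t* clockid) {
  const int rc = pthread_getcpuclockid(thread, clockid);
  if (rc != 0) {
    // For example ESRCH for a thread that has already terminated.
    return false;
  }
  if (!total) {
    // Assumed kernel encoding: the two low bits select the clock type and
    // 1 (CPUCLOCK_VIRT) means user time only. The per-thread bit is kept.
    constexpr clockid_t CLOCK_TYPE_MASK = 3;
    constexpr clockid_t CPUCLOCK_VIRT = 1;
    *clockid = (*clockid & ~CLOCK_TYPE_MASK) | CPUCLOCK_VIRT;
  }
  return true;
}
```

A caller can then keep the "-1 at the end" style from the discussion above by testing the return value directly before reading the clock.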
cl4es
left a comment
Microbenchmark looks good
/issue remove JDK-8210452
Thanks for the comments @dholmes-ora and @tstuefe. I changed the signature to align with the standard pattern. That way we don't have to document any requirements.
dholmes-ora
left a comment
Looks good. Thanks for the updates.
/integrate
kevinjwalls
left a comment
Looks good - I remember that fix for parsing the program binary name containing brackets, good to have it gone.
/sponsor
Going to push as commit 858d2e4. Your commit was automatically rebased without conflicts.
@kevinjwalls @JonasNorlinder Pushed as commit 858d2e4.
Mailing list message from Jaromir Hamala on hotspot-runtime-dev:
On Wed, Dec 3, 2025 at 10:35 AM Kevin Walls <kevinw at openjdk.org> wrote: Looks good - I remember that fix for parsing the program binary name containing brackets, good to have it gone. Marked as reviewed by kevinw (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/28556#pullrequestreview-3534064399
Apologies for reviving an old thread. I was experimenting with this change, and I believe there is a further optimisation opportunity: when the clockid has its TID set to 0, the kernel treats it as "the current task" (which is what getCurrentThreadUserTime() requires) and avoids the radix lookup required for an arbitrary TID.
The change: https://github.com/jerrinot/jdk/compare/master...jerrinot:jdk:jh_faster_getCurrentThreadUserTime
There is around 13% latency improvement on average. Would you be interested in merging a similar patch?
Cheers,
Mailing list message from Jonas Norlinder on hotspot-runtime-dev:
Hi Jaromir, That sounds interesting :), as long as we are confident that your observation is part of the user ABI. Feel free to submit a PR and I will happily review it. Also add a link or reasoning to confirm that it is part of the user ABI.
Thank you,
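The TID 0 idea from the mailing list message can be sketched like this. It is an illustration under the assumption that the kernel's clockid encoding (inverted TID in the upper bits, per-thread flag 4, clock type CPUCLOCK_VIRT = 1) is stable user ABI, which is exactly the point the reply asks to be confirmed. The function names are hypothetical and this is not the linked patch.

```cpp
#include <time.h>
#include <cstdint>

// Build a per-thread, user-time-only clock id with TID 0, which the kernel
// is reported to interpret as "the calling thread". Constants are assumed
// from the kernel's encoding and are not part of any public header.
static clockid_t current_thread_user_clockid() {
  constexpr unsigned CPUCLOCK_VIRT = 1;
  constexpr unsigned CPUCLOCK_PERTHREAD_MASK = 4;
  const unsigned tid = 0;
  return static_cast<clockid_t>((~tid << 3) | CPUCLOCK_PERTHREAD_MASK | CPUCLOCK_VIRT);
}

// Returns the calling thread's user CPU time in nanoseconds, or -1 on error.
static int64_t current_thread_user_time_ns() {
  timespec ts;
  if (clock_gettime(current_thread_user_clockid(), &ts) != 0) {
    return -1;
  }
  return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}
```

Because no TID is looked up, this form also skips the pthread_getcpuclockid call, but it only works for the calling thread.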
Since kernel v2.6.12, the Linux ABI has supported encoding the clock types in the last three bits of a clockid. Setting the bits to 001 (CPUCLOCK_VIRT) results in the kernel returning only user time. POSIX-compliant implementations of pthread_getcpuclockid for the Linux kernel default to constructing a clockid with 010 (CPUCLOCK_SCHED) set, which returns system+user time, as the POSIX standard mandates; see POSIX.1-2024/IEEE Std 1003.1-2024 §3.90. This patch joins the family of glibc, musl, etc. that utilize this bit pattern.
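As a rough standalone illustration of the description above, the sketch below reads a thread's CPU time through clock_gettime, either as the default user+system total or as user time only. The function names are hypothetical and the numeric constants are assumptions taken from the kernel's encoding; HotSpot's actual helper operates on its own Thread type.

```cpp
#include <pthread.h>
#include <time.h>
#include <cstdint>

// Read a clock and convert the result to nanoseconds; -1 signals an error.
static int64_t read_clock_ns(clockid_t id) {
  timespec ts;
  if (clock_gettime(id, &ts) != 0) {
    return -1;
  }
  return static_cast<int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// total == true:  user + system time, the clock pthread_getcpuclockid returns.
// total == false: user time only, by switching the clock type bits to 001.
static int64_t thread_cpu_time_ns(pthread_t thread, bool total) {
  clockid_t id;
  if (pthread_getcpuclockid(thread, &id) != 0) {
    return -1;
  }
  if (!total) {
    constexpr clockid_t CLOCK_TYPE_MASK = 3;  // assumed: two low bits hold the type
    constexpr clockid_t CPUCLOCK_VIRT = 1;    // assumed: user time only
    id = (id & ~CLOCK_TYPE_MASK) | CPUCLOCK_VIRT;
  }
  return read_clock_ns(id);
}
```

Compared with parsing /proc, this needs a single system call per sample and no text parsing.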
This PR also results in improved performance and thus a reduced observer effect, especially for the 100th percentile (max).
Before patch:
After patch:
Testing: java/lang/management/ThreadMXBean/ThreadUserTime.java and the added microbenchmark.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28556/head:pull/28556
$ git checkout pull/28556
Update a local copy of the PR:
$ git checkout pull/28556
$ git pull https://git.openjdk.org/jdk.git pull/28556/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 28556
View PR using the GUI difftool:
$ git pr show -t 28556
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28556.diff