-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8259808: Add JFR event to detect GC locker stall #2088
Conversation
👋 Welcome back ddong! A progress list of the required criteria for merging this PR into |
/label add hotspot-jfr |
@D-D-H |
Webrevs
|
/label add hotspot-gc |
@D-D-H |
GC locker will stall the operation of GC, resulting in some Java threads can not continue to run until GC locker is released, thus affecting the response time of the application. Add a JFR event to report this information is helpful to detect this issue. For the test purpose, I add two Whitebox methods to lock/unlock critical. |
Greetings, Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a review, but a few comments about what probably needs to be cleaned up before a proper review starts.
Refactored. Testing: jdk/jfr all passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
Thanks for the review :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good now but please await a second reviewer.
I took if for a spin in out internal testing and tier1-2 looks ok as well as the JFR tests.
@D-D-H This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 194 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@stefank, @kstefanj, @tschatzl, @egahlin) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
|
||
import sun.hotspot.WhiteBox; | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block should be the first thing in the test after the copyright notice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. But I notice that there are many other tests that didn't comply with this rule.
for (var i = 0; i < STALL_THREAD_COUNT; i++) { | ||
ts[i] = new Thread(() -> { | ||
STALL_COUNT_SIGNAL.countDown(); | ||
for (int j = 0; j < LOOP; j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the test already uses WhiteBox, please use whitebox to trigger a gc instead of this dodgy method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Triggering a GC is not enough, I hope these threads could be stalled by the GC locker(call GCLocker::stall_until_clear) so that a correct assertion of the number of stall count could be added.
I think it could not be done by WhiteBox::youngGC/fullGC, please correct me if I'm wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, although this is more a theoretical concern since it's not checked. Let's keep this for now as is though.
@@ -97,6 +98,7 @@ bool GCLocker::check_active_before_gc() { | |||
if (is_active() && !_needs_gc) { | |||
verify_critical_count(); | |||
_needs_gc = true; | |||
GCLockerTracer::start_gc_locker(_jni_lock_count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really convinced that passing _jni_lock_count
here gives a lot of information: this is the number of threads in a critical section at the point of the first thread needing a gc. It's probably better than nothing. At least this information should be added to the description of the event (if that is possible).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this field can be used to judge whether there are many threads that are often in a critical section, but I am not sure if it really helps to analyze the problem., and just as you said, it's better than nothing. An appropriate description of this field has been added.
@@ -0,0 +1,126 @@ | |||
/* | |||
* Copyright (c) 2021 Alibaba Group Holding Limited. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to keep with the general format of copyright messages in other code, i.e. "Copyright (c) , . ..."? I.e. if possible please add a comma after the year.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -1090,6 +1090,11 @@ | |||
<Field type="boolean" name="onOutOfMemoryError" label="On Out of Memory Error" /> | |||
</Event> | |||
|
|||
<Event name="GCLocker" category="Java Virtual Machine, GC, Detailed" description="GC Locker Information" label="GC Locker" startTime="true" thread="true" stackTrace="true"> | |||
<Field type="uint" name="lockCount" label="Lock Count" /> | |||
<Field type="uint" name="stallCount" label="Stall Count" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add descriptions to the fields as mentioned above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
@@ -1091,8 +1091,8 @@ | |||
</Event> | |||
|
|||
<Event name="GCLocker" category="Java Virtual Machine, GC, Detailed" description="GC Locker Information" label="GC Locker" startTime="true" thread="true" stackTrace="true"> | |||
<Field type="uint" name="lockCount" label="Lock Count" /> | |||
<Field type="uint" name="stallCount" label="Stall Count" /> | |||
<Field type="uint" name="lockCount" label="Lock Count" description="the number of java threads in a critical section when the GC locker is started" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest changing this to:
"The number of Java threads in a critical section when the GC locker is started"
"The number of Java threads stalled by the GC locker"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed.
@@ -1091,8 +1091,8 @@ | |||
</Event> | |||
|
|||
<Event name="GCLocker" category="Java Virtual Machine, GC, Detailed" description="GC Locker Information" label="GC Locker" startTime="true" thread="true" stackTrace="true"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"GC Locker Information" is not very useful. Remove the description completely or provide more information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
/integrate |
Since @tschatzl requested changes yesterday I will wait for him to sponsor this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
/sponsor |
@tschatzl @D-D-H Since your change was applied there have been 194 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 311a0a9. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
GC locker will stall the operation of GC, resulting in some Java threads can not continue to run until GC locker is released, thus affecting the response time of the application. Add a JFR event to report this information is helpful to detect this issue.
For the test purpose, I add two Whitebox methods to lock/unlock critical.
Progress
Issue
Reviewers
Download
$ git fetch https://git.openjdk.java.net/jdk pull/2088/head:pull/2088
$ git checkout pull/2088