JDK-8289524: Add JFR JIT restart event #9334
Looks good and simple
Hi, thanks for the review! May I have a second review?
Hi @MBaesken, perhaps we should take a larger view of this functionality and incorporate it into the SweepCodeCache event. I don't see any information provided that would reflect overall CodeCache memory before vs. after the sweep; there are only counts. What if we extend the SweepCodeCache event with fields to reflect "memory before sweep", "memory after sweep" and a boolean "compiler restart"? In addition, there are metadata aspects that need to be addressed; for example, this is not a durational event, and since it is issued only by the Sweeper thread, it will not have a stack trace, etc.
A memory before or after field should have contentType="bytes", and we might as well use ulong as the data type. There is no additional cost since data is stored using compressed integers.
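To illustrate why a wider declared type need not cost more on disk, here is a minimal sketch of a generic LEB128-style variable-length integer encoder. It is an illustration of the general technique only, not JFR's actual writer; the class name `VarintSketch` and the sample value are assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;

/** Generic LEB128-style varint encoder; an illustration, not JFR's actual writer. */
final class VarintSketch {

    /** Encodes the value with 7 payload bits per byte; small values need few bytes. */
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // more bytes follow
            value >>>= 7;
        }
        out.write((int) value); // last byte, continuation bit clear
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long freedBytes = 3L * 1024 * 1024; // hypothetical "freed memory" value
        System.out.println(encode(freedBytes).length + " byte(s) used");
    }
}
```

With such an encoding the stored size depends on the magnitude of the value, not on whether the field is declared as int or ulong.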
Thanks for clarifying Erik, I forgot to mention the contentType.
Hi Markus, I think it would be possible and probably make sense to instead enhance the existing EventSweepCodeCache.
Thanks for considering. It is fine to move the EventSweepCodeCache event around; you can see that the constructor takes the UNTIMED value. This means that timestamping is handled explicitly, by calling set_starttime() and set_endtime(). For compatibility, we can think of it as creating a subclass with extended fields: just append the additional fields to the event. As for the memory field declarations, you can take a peek at other events in metadata.xml; for example, this field declaration is from CodeCacheFull:
|
Hi Markus, another question that came up while looking into this: the current SweepCodeCache event has a threshold of 100 ms set in both default.jfc and profile.jfc. This was probably fine for existing usages. But would we sometimes lose JIT restart events if we incorporate the JIT restart and freed memory into the current SweepCodeCache event?
That is a good reflection. Yes, if the duration of the restartable sweep is below the threshold, then no event will be sent.
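The concern can be modelled in a few lines. The sketch below is a simplified model of a duration threshold, not JFR's implementation; the class and method names are hypothetical, and the 100 ms value is taken from the comment above.

```java
import java.time.Duration;

/** Simplified model of a duration threshold; names are hypothetical. */
final class ThresholdSketch {

    /** An event is kept only if it lasted at least as long as the threshold. */
    static boolean isRecorded(Duration eventDuration, Duration threshold) {
        return eventDuration.compareTo(threshold) >= 0;
    }

    public static void main(String[] args) {
        Duration threshold = Duration.ofMillis(100); // value mentioned for default.jfc and profile.jfc
        // A short sweep that restarts the compiler would be dropped in this model.
        System.out.println(isRecorded(Duration.ofMillis(40), threshold));
        System.out.println(isRecorded(Duration.ofMillis(150), threshold));
    }
}
```

Under this model, a restart piggybacked on a 40 ms sweep would never reach the recording, which is the loss being asked about.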
Can you explain a bit more about what "JIT restart" actually means?
The comment at line 433 of sweeper.cpp and following explains it.
Ok, so there is a corresponding "compiler stopped", implicitly noted by firing EventCodeCacheFull?
Yes, I think the EventCodeCacheFull (see CodeCache::report_codemem_full) covers the JIT compiler stop pretty well.
Looks good.
@MBaesken This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 138 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Hi again Mathias, based on your observation that this does not really align well with the threshold parameter for SweepCodeCache and is a complementary event to EventCodeCacheFull, I believe your original suggestion is better (an instant event, only for "JIT restart"). I was slightly tripped up by the concept of a "JIT restart", because at this point the JIT is already stopped, as implied by EventCodeCacheFull. So "JIT restart" here contextually means "JIT start", as in "start JITting code again". Do you know if we have an event that describes the CodeCache settings in terms of memory sizes so that the "freed memory" can be interpreted relative to it?
So the timestamp of this "JIT restart" event minus the timestamp of the previous "CodeCacheFull" event is the duration during which JIT compilation is disabled, the reason being that no memory is available to accommodate new code. That's a good data point.
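That subtraction can be sketched as follows. This is a minimal illustration that assumes the events have already been read and sorted by time; the `Sample` type and the plain-string event names ("CodeCacheFull", "JitRestart") are hypothetical stand-ins, not the JFR consumer API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class JitDowntimeSketch {

    /** Hypothetical minimal view of a recorded event: its name and timestamp. */
    static final class Sample {
        final String name;
        final Instant time;

        Sample(String name, Instant time) {
            this.name = name;
            this.time = time;
        }
    }

    /** Pairs each restart with the most recent preceding CodeCacheFull sample. */
    static List<Duration> downtimes(List<Sample> sortedByTime) {
        List<Duration> result = new ArrayList<>();
        Instant lastFull = null;
        for (Sample s : sortedByTime) {
            if (s.name.equals("CodeCacheFull")) {
                lastFull = s.time;
            } else if (s.name.equals("JitRestart") && lastFull != null) {
                result.add(Duration.between(lastFull, s.time));
                lastFull = null; // a restart closes the disabled interval
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2022-07-01T10:00:00Z");
        List<Sample> events = List.of(
                new Sample("CodeCacheFull", t0),
                new Sample("JitRestart", t0.plusSeconds(5)));
        System.out.println(downtimes(events));
    }
}
```

A restart without a preceding CodeCacheFull sample is skipped, since no interval can be computed for it.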
It would be good to add a sanity check. See: The test is on ProblemList.txt due to timeouts, but it probably works most of the time if you run it. Also remove "JitRestart" from TestLookForUntestedEvents.java.
Hi Markus, at least we store a couple of addresses in the EventCodeCacheFull event. Those could potentially be used for interpretation.
Yes, thank you, I took a look at those fields too. Unfortunately, there is no value that bears directly on "current in-use". The reserved and committed values will not be updated. If there was a value that could reflect the current usage, even if it means adding it to both EventCodeCacheFull and the JIT restart, that would be perfect: EventCodeCacheFull because in-use is x, JIT restart because in-use is now y. There are some statistics-related properties in the CodeHeap, for example heap->unallocated_capacity() et al. Could some of those expose this running value perhaps?
Probably we would need CodeCache::unallocated_capacity() (which covers a number of code heaps, as far as I know) to compare freed_memory to, because from what I see we iterate the whole CodeCache in NMethodSweeper::sweep_code_cache().
To get an idea of how much memory was freed in relation to what's actually available, you could evaluate CodeCache::max_capacity(). This function returns the overall size of the code heap (across all segments). The performance impact is minimal and no locks are acquired.
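On the consumer side, relating the two values is a single division. The sketch below assumes both values are available in bytes; the class, method, and parameter names are hypothetical.

```java
/** Relates freed code cache memory to the maximum capacity; names are hypothetical. */
final class FreedRatioSketch {

    /** Returns the freed memory as a percentage of the maximum code cache capacity. */
    static double freedPercent(long freedMemoryBytes, long maxCapacityBytes) {
        if (maxCapacityBytes <= 0) {
            throw new IllegalArgumentException("maxCapacityBytes must be positive");
        }
        return 100.0 * freedMemoryBytes / maxCapacityBytes;
    }

    public static void main(String[] args) {
        long freed = 12L * 1024 * 1024;        // hypothetical: 12 MiB freed by the sweep
        long maxCapacity = 240L * 1024 * 1024; // hypothetical: 240 MiB code cache
        System.out.printf("%.1f%% of the code cache was freed%n", freedPercent(freed, maxCapacity));
    }
}
```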
Changes look good to me.
So is the common opinion to go back to a separate JIT start event (I think that naming is preferred over JIT restart, am I correct)?
I would be in favour of that.
I think it is fine to use the term "JIT restart" because it is in use both in the code and in the output of log statements. Another reason would be that the first "JIT start" event would always be missing. It was only the meaning that got me a bit confused. Yes, a separate event is preferable, with no duration (startTime=false) and no stack trace (stackTrace=false). The .jfc configs only need one element:
|
Changes look good.
Now you can relate the freed memory to what's available in total.
@@ -558,6 +558,11 @@
<Field type="ulong" contentType="bytes" name="used" label="Used" />
</Event>

<Event name="JitRestart" category="Java Virtual Machine, Compiler" label="JIT restart" stackTrace="false" startTime="false" thread="true">
<Field type="int" name="freedMemory" label="Freed Memory" />
<Field type="ulong" name="codeCacheMaxCapacity" label="CodeCache maximum capacity" />
Is codeCacheMaxCapacity the current in-use size of the CodeCache, in bytes? It should have the contentType="bytes" in that case. Also "freedMemory" should have the same contentType.
Yes, codeCacheMaxCapacity is given in bytes. It specifies the maximum size, not the current in-use size, of the CodeCache. It is the sum over all CodeHeap segments.
@@ -612,6 +617,7 @@
<Field type="int" name="adaptorCount" label="Adaptors" />
<Field type="ulong" contentType="bytes" name="unallocatedCapacity" label="Unallocated" />
<Field type="int" name="fullCount" label="Full Count" />
<Field type="ulong" name="codeCacheMaxCapacity" label="CodeCache maximum capacity" />
contentType="bytes"
@@ -54,7 +54,7 @@ public class TestLookForUntestedEvents {

private static final Set<String> hardToTestEvents = new HashSet<>(
Arrays.asList(
"DataLoss", "IntFlag", "ReservedStackActivation",
"DataLoss", "IntFlag", "ReservedStackActivation", "JitRestart",
Can we create a test? There is an existing test for CodeCacheFull, maybe derive from it to create a test also for the JitRestart event?
@@ -558,6 +558,11 @@
<Field type="ulong" contentType="bytes" name="used" label="Used" />
</Event>

<Event name="JitRestart" category="Java Virtual Machine, Compiler" label="JIT restart" stackTrace="false" startTime="false" thread="true">
<Field type="int" name="freedMemory" label="Freed Memory" />
<Field type="ulong" name="codeCacheMaxCapacity" label="CodeCache maximum capacity" />
Should be "Code Cache Maximum Capacity"
@@ -612,6 +617,7 @@
<Field type="int" name="adaptorCount" label="Adaptors" />
<Field type="ulong" contentType="bytes" name="unallocatedCapacity" label="Unallocated" />
<Field type="int" name="fullCount" label="Full Count" />
<Field type="ulong" name="codeCacheMaxCapacity" label="CodeCache maximum capacity" />
Should be "Code Cache Maximum Capacity"
@@ -558,6 +558,11 @@
<Field type="ulong" contentType="bytes" name="used" label="Used" />
</Event>

<Event name="JitRestart" category="Java Virtual Machine, Compiler" label="JIT restart" stackTrace="false" startTime="false" thread="true">
"JITRestart". The convention for other events has been to use capital letters for acronyms.
"JIT restart" -> "JIT Restart"
@@ -1366,6 +1366,7 @@ void CodeCache::report_codemem_full(int code_blob_type, bool print) {
event.set_adaptorCount(heap->adapter_count());
event.set_unallocatedCapacity(heap->unallocated_capacity());
event.set_fullCount(heap->full_count());
event.set_codeCacheMaxCapacity(CodeCache::max_capacity());
Add test of field in TestCodeCacheFull
I added an assertion in test/jdk/jdk/jfr/event/compiler/TestCodeCacheFull.java
@@ -442,6 +437,15 @@ void NMethodSweeper::sweep_code_cache() {
CompileBroker::set_should_compile_new_jobs(CompileBroker::run_compilation);
log.debug("restart compiler");
log_sweep("restart_compiler");
EventJitRestart event;
It would be good to have a sanity test of the event.
If the event can't be provoked reliably, retry or accept as OK.
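One way to structure "retry or accept as OK" is a bounded retry loop. The sketch below is a generic pattern and not the test that was eventually added; `provoke` and the attempt callback are hypothetical names, and the callback stands in for whatever fills the code cache and checks the recording.

```java
import java.util.function.BooleanSupplier;

/** Bounded retry helper for events that cannot be provoked reliably; names are hypothetical. */
final class RetrySketch {

    /** Runs the attempt up to maxAttempts times and reports whether any attempt succeeded. */
    static boolean provoke(BooleanSupplier attempt, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for "try to trigger the event and look for it in the recording".
        boolean seen = provoke(() -> ++calls[0] == 3, 5);
        // A tolerant test would only log when seen is false rather than fail.
        System.out.println("event seen: " + seen + " after " + calls[0] + " attempt(s)");
    }
}
```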
Hi, I adjusted the contentType as suggested, and adjusted the label and TestCodeCacheFull.
I added a test after some playing around with the WhiteBox functionality.
import jdk.test.lib.Asserts;
import jdk.test.lib.jfr.EventNames;
import jdk.test.lib.jfr.Events;
import sun.hotspot.WhiteBox;
This package has been removed. Please use jdk.test.whitebox.WhiteBox.
Hi Coleen, thanks for the advice. After switching to the new package jdk.test.whitebox.WhiteBox I get
java.lang.UnsatisfiedLinkError: 'void sun.hotspot.WhiteBox.registerNatives()'
Do I need to do more than just rename the package?
Had to change BlobType import too, this one is also in a new package.
With all the fine-tuning, changes look even better now. And there is a test now so we get alerted if anything breaks. Thanks!
/integrate
Going to push as commit dfbc691.
Your commit was automatically rebased without conflicts.
The JIT compiler restarts (see restart_compiler in NMethodSweeper::sweep_code_cache) would be a helpful addition to the JFR events. Currently we log the JIT stop operations in JFR (EventCodeCacheFull) but no restart.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9334/head:pull/9334
$ git checkout pull/9334
Update a local copy of the PR:
$ git checkout pull/9334
$ git pull https://git.openjdk.org/jdk pull/9334/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 9334
View PR using the GUI difftool:
$ git pr show -t 9334
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9334.diff