8278020: ~13% variation in Renaissance-Scrabble #6838
I would have kept the two fields together after the switch so that you can add a comment. It seems totally bizarre that two such widely separated fields would have such an effect. Isn't this needed in 18 though? |
Yes, this small change is good for JDK 18. |
Another experiment would be to add padding (space) before the Klass data to make sure it is in a different cache line. |
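The padding experiment suggested above can be sketched as a small address calculation. This is only an illustration, not code from the patch: the 64-byte cache line size and the helper names `align_up` and `padding_before` are assumptions made for the example.

```cpp
#include <cstdint>

// Assumption: 64-byte cache lines (common on x86-64); the PR does not state a size.
constexpr std::uint64_t kCacheLineSize = 64;

// Round `addr` up to the next multiple of `alignment`.
// `alignment` must be a power of two for the mask trick to work.
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: number of padding bytes to place before an object that
// would otherwise start at `addr`, so that it begins on a fresh cache line.
constexpr std::uint64_t padding_before(std::uint64_t addr) {
  return align_up(addr, kCacheLineSize) - addr;
}
```

For example, under these assumptions an object that would start at 0x1008 needs 56 bytes of padding, while one already at 0x1000 needs none. The wasted bytes are the footprint cost of this approach.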
David, if my theory is correct, the contention does not happen between the two fields. It happens between the I have added comments near _vtable_len to explain why it's placed there inside the Klass. I swapped it with |
It's possible to make CDS segmented so that the Klasses are allocated together. That can be done in 19. |
That's a good idea. I will try that to see if it has the same effect as the current patch. |
@iklam This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 14 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
|
I approved but still think this should be targeted at 18 - assuming this was a performance regression in 18. Padding may have the same performance effect but will also potentially impact footprint, which in turn may impact the caching behaviour. |
Hi Ioi,
The fix looks fine.
This is interesting to me, because in the context of Lilliput (openjdk/lilliput#13) I was kind of counting on CDS to intermix Klass and non-class metadata, since that way CDS uses the larger Klass alignment gaps. In fact, I have this wild idea to shape metaspace in that form, merging Klass and non-class metadata into one larger class space. It would be really good to have a better idea of these interactions.
What tool did you use to measure the dcache misses?
Cheers, Thomas
I suggested padding only as experiment to prove Ioi's theory. Current changes are good as the fix. |
@ericcaspole checked our benchmark database and the regression seems to have started around JDK 15. So yes, I will backport the fix to 17 and 18. I want to integrate into the mainline first so it can be baked a little before the backport. |
BTW you could test that theory, if you wanted, by repeating the test with CDS off and disabling compressed class pointers. |
Hi Thomas, @ericcaspole did the measurements so he will have more information, but I believe he used https://github.com/jvm-profiling-tools/async-profiler to generate traces like this (which I pasted into the bug report):
@vnkozlov I found that most Klasses in CDS are preceded by a Method. Does the jitted code write into a Method often? |
Okay, thanks.
Method counters? It may be worth spreading them out better, or padding them, to prevent false sharing. |
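A minimal sketch of what padding counters against false sharing could look like, and what it costs in footprint. The struct names and fields are hypothetical and do not reflect the real HotSpot Method or MethodCounters layout; the 64-byte cache line size is also an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines; not taken from the PR.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical counter block packed tightly: neighbouring data can share
// its cache line, so a write here may invalidate a neighbour's line.
struct PackedCounters {
  std::uint32_t invocation_count;
  std::uint32_t backedge_count;
};

// The same fields, aligned and padded out to a full cache line so that
// nothing else can share the line with the frequently written counters.
struct alignas(kCacheLineSize) PaddedCounters {
  std::uint32_t invocation_count;
  std::uint32_t backedge_count;
};

// The footprint cost is visible at compile time: 8 bytes become 64.
static_assert(sizeof(PackedCounters) == 8, "two 32-bit counters");
static_assert(sizeof(PaddedCounters) == kCacheLineSize, "padded to one line");
```

In this sketch the padded variant is eight times larger per instance, which is the kind of footprint trade-off raised earlier in the thread.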
Thanks @dholmes-ora @vnkozlov @tstuefe for the review. |
Going to push as commit 4ba980b.
Your commit was automatically rebased without conflicts. |
I don't think compiled code updates anything in Method. We need to look at the field layout. |
We found that when CDS is enabled, there is a ~13% variation in the Renaissance-Scrabble benchmark between different builds of the JDK. In one example, only two core-lib classes, unrelated to the benchmark, changed between two builds, but one build is consistently faster than the other.
When CDS is disabled, we do not see such variations.
In the slow case, there seem to be frequent dcache misses when loading the Klass::_vtable_len field, which is at offset 24 from the beginning of the Klass (see bug report for details).
We suspect that the problem is with the layout of the CDS archive. Specifically, in CDS, Klass objects are intermixed with other metadata objects (such as Methods). In contrast, when CDS is disabled (on 64-bit platforms with compressed klass pointers), Klass objects are allocated in their own space, separated from other metadata objects.
My theory is: when CDS is enabled, perhaps the modification of an object that sits immediately above the Klass invalidates the cacheline that holds Klass::_vtable_len. In a different JDK build, the exact addresses of the metadata objects in the CDS archive may be slightly nudged so we don't see the cacheline effect anymore.
As an experiment, I swapped Klass::_vtable_len with Klass::_modifier_flags (which was at offset 164 before this patch), and the variation stopped. Both fields are 32 bits in size.
I have no concrete proof that my theory is correct, but this change seems to be harmless. @ericcaspole has run all the benchmarks in Oracle's CI and found consistent improvement with Renaissance-Scrabble, and no degradation in other benchmarks.
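The theory can be illustrated with a little cache-line arithmetic. This is a sketch under stated assumptions, not a measurement or a proof: it assumes 64-byte cache lines, it reads "immediately above the Klass" as the object occupying the addresses just before the Klass starts, and the function names are made up for the example.

```cpp
#include <cstdint>

// Assumption: 64-byte cache lines; the PR does not state the line size.
constexpr std::uint64_t kCacheLineSize = 64;

// Index of the cache line that holds the byte at `addr`.
constexpr std::uint64_t cache_line_of(std::uint64_t addr) {
  return addr / kCacheLineSize;
}

// Hypothetical check: does the field at `klass_base + field_offset` live in
// the same cache line as the last byte of the object that ends right before
// the Klass? If so, writes to the tail of that object invalidate the line
// holding the field. `klass_base` must be non-zero.
constexpr bool shares_line_with_predecessor(std::uint64_t klass_base,
                                            std::uint64_t field_offset) {
  return cache_line_of(klass_base + field_offset) ==
         cache_line_of(klass_base - 1);
}
```

Under these assumptions, a field at offset 24 shares a line with the preceding object whenever the Klass starts between 1 and 39 bytes into a cache line, which would depend on exactly how the archive happens to be laid out in a given build. A field at offset 164 is more than one full line away from the Klass start and can never share a line with the preceding object.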
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6838/head:pull/6838
$ git checkout pull/6838
Update a local copy of the PR:
$ git checkout pull/6838
$ git pull https://git.openjdk.java.net/jdk pull/6838/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6838
View PR using the GUI difftool:
$ git pr show -t 6838
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6838.diff