-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8300148: Consider using a StoreStore barrier instead of Release barrier on ctor exit #18505
Conversation
👋 Welcome back caojoshua! A progress list of the required criteria for merging this PR into |
@caojoshua This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 79 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@shipilev, @vnkozlov, @dean-long) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@caojoshua The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to merge from master to get the clean GHA runs.
I am running some more tests here too.
test/hotspot/jtreg/compiler/c2/irTests/ConstructorBarriers.java
Outdated
Show resolved
Hide resolved
test/hotspot/jtreg/compiler/c2/irTests/ConstructorBarriers.java
Outdated
Show resolved
Hide resolved
I propose we also add this benchmark that verifies barrier costs and coalescing: ConstructorBarriers.txt. Maybe these also should be the IR tests. The benchmarks show that most combinations with On my Graviton 3 instance:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about MemBarRelease
in PhaseStringOpts::replace_string_concat()
?
Can we also add statistic about how many different barriers C2 generates and eliminates? It will help to know if we missing some optimization with these changes. |
I heard rumors that storeStore is only safe for the scenarios where the constructor doesn't read its already assigned final fields; so if we have something like class Sample {
final int a, b;
Sample(int v) {
this.a = v;
this.b = this.a + 1; // performs read instance field before publication
}
} then we still need a regular release barrier. Am I correct here? |
@liach, in your example, I don't see an issue, unless "this" somehow escaped and allowed another thread to write to this.a, which would be difficult considering it is "final". This earlier discussion might help: |
Missed this. Added a commit for this |
Benchmark results on my graviton instances see similar improvements to @shipilev 's Before:
After:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks nearly complete. I think for sanity reasons we still want the diagnostic flag that could be used to restore old behavior for in-field diagnostics. This is especially important since we are touching the optimizer code. Turning the flag off should restore all behavior, including the optimizer paths.
Something like UseStoreStoreForInit
?
@IR(failOn = IRNode.MEMBAR_RELEASE) | ||
@IR(failOn = IRNode.MEMBAR_STORESTORE) | ||
@IR(failOn = IRNode.MEMBAR_RELEASE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and later: Looks weird to test for MEMBAR_RELEASE
twice. Also, should it be just failOn = IRNode.MEMBAR
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meant to test for MEMBAR_VOLATILE
. We can't fail on all MEMBAR
because there are some cases that have MEMBAR_ACQUIRE
. Added those checks in latest commits.
Added these statistics. Example output: I think there are missing cases for barriers eliminated. There could be cases of aside from |
storestore_worklist.append(n->as_MemBarStoreStore()); | ||
if (!UseStoreStoreForCtor || n->req() > MemBarNode::Precedent) { | ||
storestore_worklist.append(n->as_MemBarStoreStore()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case and the next case could use a more detailed explanation. We have 4 different possible inputs:
{StoreStore, Release} x {w/ Precedent, w/o Precedent} and 2 possible outcomes: worklist or record_for_optimizer. It's not obvious to me that we are doing the right thing for all cases, both in the old code and the new code.
Previously, I believe this optimization did not apply to the end-of-ctor-with-final barrier, but now it does. If it should always apply, then shouldn't it also apply to the Release barrier when !UseStoreStoreForCtor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case and the next case could use a more detailed explanation. We have 4 different possible inputs:
{StoreStore, Release} x {w/ Precedent, w/o Precedent} and 2 possible outcomes: worklist or record_for_optimizer.
We can eliminate barriers when it's precedent is an escaping object. If the barrier does not have a precedent, we cannot elide it, which is why we don't include it in the worklist / record_for_optimizer
.
I think its confusing because StoreStore barriers are optimized in escape.cpp
, while Release
barriers are optimized in memnode.cpp. I would have preferred if all escape-based on optimizations of barriers were just done in one place.
Previously, I believe this optimization did not apply to the end-of-ctor-with-final barrier, but now it does.
This is correct. End of ctor did not have StoreStore
barriers. They had Release
barriers, which escape analysis already handles. We have to check n->req() > MemBarNode::Precedent
, or else we run into assertion errors here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have preferred if all escape-based on optimizations of barriers were just done in one place.
It sounds like there is still some cleanup that could be done in this area. Is it worth a separate RFE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think so. Created https://bugs.openjdk.org/browse/JDK-8330062
On a side note, would it be safe to replace explicit constructor emulation release barriers (Unsafe.storeFence) elsewhere in the JDK with storeStore, like in ClassValue, MutableCallSite, ClassSpecializer.Factory, ObjectInputStream, and Properties? |
some formatting suggestions from @shipilev Co-authored-by: Aleksey Shipilëv <shipilev@amazon.de>
I don't think its safe in general. Maybe for some of the use cases it would be desirable. |
@TheRealMDoerr, @GoeLin -- I think you'd want to ack that covering "IRIW" parts with just a |
Correct, a StoreStore barrier is sufficient. Thanks for the notification! |
* @bug 8300148 | ||
* @summary Test barriers emitted in constructors | ||
* @library /test/lib / | ||
* @requires os.arch=="aarch64" | os.arch=="x86_64" | os.arch=="amd64" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, Could you please enable this test for riscv64 as well? It passes on linux-riscv64 (fastdebug build).
@requires os.arch=="aarch64" | os.arch=="riscv64" | os.arch=="x86_64" | os.arch=="amd64"
All right, checking, are we still good here? We need someone else to ack as well. Maybe @theRealAph would be interested to sanity-check this? @vnkozlov, would you like to run this through Oracle testing? |
@TobiHartmann submitted tier1-3,stress,xcopm testing for version v11 yesterday. Testing passed - no new failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.
Our internal testing completes fine as well, including aggressive compiler testing with jcstress, CTW runs and Fuzzers. Looks like we are ready to go. |
/integrate |
@caojoshua |
/sponsor |
Going to push as commit 1d06170.
Your commit was automatically rebased without conflicts. |
@shipilev @caojoshua Pushed as commit 1d06170. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The JSR 133 cookbook has long recommended using a
StoreStore
barrier at the end of constructors that write to final fields.StoreStore
barriers are much cheaper on arm machines as shown in benchmarks in this issue as well as https://bugs.openjdk.org/browse/JDK-8324186.This change does not improve the case for constructors for objects with volatile fields because MemBarRelease is emitted for volatile stores. This is demonstrated in test case
classWithVolatile
, where this patch does not impact the IR.I had to modify some code around escape analysis to make sure there are no regressions in eliminating allocations and
StoreStore
's. The current handling of StoreStore's in escape analysis makes the assumption that the barriers input is aProj
to anAllocate
(example). This is contrary to the barriers in the end of the constructor where there the barrier directly takes in anAllocate
without an in betweenProj
. I opted to instead eliminateStoreStore
s in GVN, exactly howMemBarRelease
is handled.I had to add checks for StoreStore in macro.cpp, or else we fail some cases for reducing allocation merges.
Passes hotspot tier1 locally on a Linux machine.
Benchmarks
Running Renaissance ParNnemonics on an Amazon Graviton (arm) instance.
Baseline:
After patch:
We see a 16% improvement in throughput
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18505/head:pull/18505
$ git checkout pull/18505
Update a local copy of the PR:
$ git checkout pull/18505
$ git pull https://git.openjdk.org/jdk.git pull/18505/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18505
View PR using the GUI difftool:
$ git pr show -t 18505
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18505.diff
Webrev
Link to Webrev Comment