-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8343840: Rewrite the ObjectMonitor lists #23421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8343840: Rewrite the ObjectMonitor lists #23421
Conversation
|
👋 Welcome back fbredberg! A progress list of the required criteria for merging this PR into |
|
@fbredber This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 169 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
dholmes-ora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disclaimer for other reviewers, I have been looking at this code for some time now.
Overall code looks good. I have quite a few comments/suggestions about comments.
I suggest renaming _vthread_cxq_head to just _vthread_head as the cxq part is no longer meaningful.
I agree that even though this seems performance neutral, the code simplification (for people reading it for the first time) will be worth it.
Thanks.
coleenp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good - I have some small change and improvement requests.
coleenp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks great. Thank you!
|
I've used QEMU to smoke test this PR on ppc64le, riscv64 and s390x, But it would be nice if @TheRealMDoerr, @RealFYang and @offamitkumar could check if it runs okay on real hardware as well. |
|
@pchilano |
FYI: hs:tier1 - hs:tier3 test good on linux-riscv64 platform. |
dholmes-ora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay that's good enough for me. :)
Thanks
Tier1 test passed on s390x. |
The PPC64 code looks correct and some quick tests have passed. I'll run larger test suites over the weekend. |
Test results look good (including tier 1-4 on many platforms). I didn't see any new issue related to this. |
pchilano
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me. Just a few comments.
Thanks,
Patricio
|
@mur47x111 |
mur47x111
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JVMCI changes look go to me! We are good to go!
pchilano
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good.
dholmes-ora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks
|
Thanks everyone for the reviews, testing and Graal adaptation. /integrate |
|
Going to push as commit 7a5acb9.
Your commit was automatically rebased without conflicts. |
I've combined two
ObjectMonitor's lists,EntryListandcxq, into one list. Theentry_list.This way c2 no longer has to check both
EntryListandcxqin order to opt out if the "conceptual entry list" is empty, which also means that the constant question about if it's safe to first check theEntryListand thencxqwill be a thing of the past.In the current multi-queue design new threads where always added to the
cxq, thenObjectMonitor::exitwould choose a successor from the head ofEntryList. When theEntryListwas empty andcxqwas not,ObjectMonitor::exitwhould detached the singly linkedcxqlist, and add the elements to the doubly linkedEntryList. The element that was first added tocxqwhould be at the tail of theEntryList. This way you ended up working through the contending threads in LIFO-chunks.The new list-design is as much a multi-queue as the current. Conceptually it can be looked upon as if the old singly linked
cxqlist doesn't end with a null pointer, but instead has a link that points to the head of the doubly linkedentry_list.You always add to the
entry_listby Compare And Exchange to the head. The most common case is that you remove from the tail (the successor is chosen in strict FIFO order). The head is volatile, but the interior is stable.The first contending thread that "pushes" itself onto
entry_list, will be the last thread in the list. Each newly pushed thread inentry_listwill be linked trough its next pointer, and have its prev pointer set to null, thus pushing new threads ontoentry_listwill form a singly linked list. The list is always in the right order (via the next-pointers) and is never moved to another list.Since we choose the successor in FIFO order, the exiting thread needs to find the tail of the
entry_list. This is done by walking from theentry_listhead. While walking the list we assign the prev pointers of each thread, essentially forming a doubly linked list. The tail pointer is cached inentry_list_tailso that we don't need to walk from theentry_listhead each time we need to find the tail (successor).Performance wise the new design seems to be equal to the old design, even though c2 generates two less instructions per monitor unlock operation.
However the complexity of the source has been reduced by removing the
TS_CXQstate and adding functions instead of inliningcmpxchghere and there, and the fact that c2 no longer has to check bothEntryListandcxqmakes this PR worthwhile, I think.Tests tier1-7 passes okay as well as micro-benchmarks like
vm.lang.LockUnlock.Unsupported platforms { ppc, riscv, s390 } has been tested with QEmu.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23421/head:pull/23421$ git checkout pull/23421Update a local copy of the PR:
$ git checkout pull/23421$ git pull https://git.openjdk.org/jdk.git pull/23421/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 23421View PR using the GUI difftool:
$ git pr show -t 23421Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23421.diff
Using Webrev
Link to Webrev Comment