Skip to content

Conversation

@fbredber
Copy link
Contributor

@fbredber fbredber commented May 29, 2024

Removed the concept of an ObjectMonitor Responsible thread.

The reason to have an ObjectMonitor Responsible thread was to avoid threads getting stranded due to a hole in the successor protocol. This hole was there because adding the necessary memory barrier was considered too expensive some 20 years ago.

The ObjectMonitor Responsible thread code adds complexity, and doing timed parks just to avoid getting stranded is not the way forward. More info about the problems with the ObjectMonitor responsible thread can be found in JDK-8320318.

After removing the ObjectMonitor Responsible thread we see increased performance on all supported platforms except Windows. JDK-8339730 has been created to handle this.

Passes tier1-tier7 on supported platforms.
x64, AArch64, Riscv64, ppc64le and s390x passes ok on the test/micro/org/openjdk/bench/vm/lang/LockUnlock.java test.
Arm32 and Zero doesn't need any changes as far as I can tell.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8320318: ObjectMonitor Responsible thread (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19454/head:pull/19454
$ git checkout pull/19454

Update a local copy of the PR:
$ git checkout pull/19454
$ git pull https://git.openjdk.org/jdk.git pull/19454/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19454

View PR using the GUI difftool:
$ git pr show -t 19454

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19454.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 29, 2024

👋 Welcome back fbredberg! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 29, 2024

@fbredber This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8320318: ObjectMonitor Responsible thread

Reviewed-by: aboldtch, coleenp, pchilanomate, eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 84 new commits pushed to the master branch:

  • 2a2ecc9: 8339475: Clean up return code handling for pthread calls in library coding
  • 85dba47: 8325090: javadoc fails when -subpackages option is used with non-modular -source
  • 1bc13a1: 8340552: Harden TzdbZoneRulesCompiler against missing zone names
  • e6373b5: 8340679: Misc tests fail assert(!set || SafepointSynchronize::is_at_safepoint()) failed: set once or at safepoint
  • 2349bb7: 8340974: Ambiguous name of jtreg property vm.libgraal.enabled
  • 5d062e2: 8340576: Some JVMCI flags are inconsistent
  • 1447967: 8339261: Logs truncated in test javax/net/ssl/DTLS/DTLSRehandshakeTest.java
  • a02d895: 8333403: Write a test to check various components events are triggered properly
  • bb040ef: 8340983: Use index and definition tags in Object and Double
  • 8225a5f: 8340981: Update citations to "Hacker's Delight"
  • ... and 74 more: https://git.openjdk.org/jdk/compare/0f253d11033a26d15ea20df19db6765bb274a848...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk
Copy link

openjdk bot commented May 29, 2024

@fbredber The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label May 29, 2024
@openjdk
Copy link

openjdk bot commented Jul 17, 2024

@fbredber this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout 8320318_objectmon_responsible_thread
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jul 17, 2024
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Jul 17, 2024
@fbredber fbredber marked this pull request as ready for review September 9, 2024 08:36
@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 9, 2024
@mlbridge
Copy link

mlbridge bot commented Sep 9, 2024

Webrevs

@fbredber
Copy link
Contributor Author

fbredber commented Sep 10, 2024

I've done basic testing on ppc64le, riscv64 and s390x using QEMU, but would appreciate if @TheRealMDoerr, @RealFYang and @offamitkumar could take it for a real test drive.

@TheRealMDoerr
Copy link
Contributor

Works with micro:LockUnlock on real PPC64 hardware, too. However, we need to run more tests and also check performance.
Please note that this PR has conflicts with other changes (#20922 and recent developments in the loom repo).

The JBS issue refers to "memory barriers (not a fence)", but you're using StoreLoad barriers which are nothing else than a "fence". I don't agree with the general statement that they have become significantly cheaper. That may be true for single chip designs, but not for large server systems (multi-socket). Did you run benchmarks which stress monitors on any large multi-socket system?

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is such a nice simplifying change. I have some more suggestions.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've started looking at this and to be honest I'm surprised by the extent and complexity of the changes. The problem description sounded quite simple: get rid of the notion of the Responsible thread by putting in the fence that when missing could lead to stranding. I find it very hard to map many of the actual code changes to that problem statement. And I'm very unclear about the impact on the deflation protocol that this is causing.

I think trying to look at diffs is the wrong way to analyze this change, I need to just look at the new code and try to understand the protocol - but that makes it hard to put comments into the PR. :(

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performed hs-tier1 - hs-tier3 tests on linux-riscv64 platform. Two minor comments for the riscv part.

@TheRealMDoerr
Copy link
Contributor

@fbredber, @dholmes-ora: I got a substantial performance drop on our 96 Thread Xeon server: LockUnlock.testContendedLock seems to be less than half as fast as without this patch. Also, some of the LockUnlock.testInflated* seem to be affected. (Large PPC64 servers as well.) Can you reproduce this on your side?

@dholmes-ora
Copy link
Member

@fbredber, @dholmes-ora: I got a substantial performance drop on our 96 Thread Xeon server:

What OS for the Xeon? We have only seen issues with Windows.

@TheRealMDoerr
Copy link
Contributor

@fbredber, @dholmes-ora: I got a substantial performance drop on our 96 Thread Xeon server:

What OS for the Xeon? We have only seen issues with Windows.

Sorry, I forgot to mention that it's linux (SUSE Linux Enterprise Server 15 SP4).

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Very nice work!

@fbredber fbredber requested a review from RealFYang September 19, 2024 21:53
Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken another pass through and have a few queries.

Thanks

@TheRealMDoerr
Copy link
Contributor

@fbredber: If you need help to resolve the PPC64 conflicts with 7579d37, just let me know.

Copy link
Contributor

@fisk fisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally good to me. Found some weirdness, but I think we can fix it after this goes in. Some nit too but I don't need to see the updated patch for that. Thanks for fixing this!

@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Sep 25, 2024
@fbredber
Copy link
Contributor Author

@TheRealMDoerr @offamitkumar
I resolved merge conflicts in src/hotspot/cpu/ppc/macroAssembler_ppc.cpp and src/hotspot/cpu/s390/macroAssembler_s390.cpp. I've smoke tested it with QEMU, but it would be nice if you could check if it's ok as well.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really good. I have an issue with a new comment which might conflict with another reviewer.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 25, 2024
@TheRealMDoerr
Copy link
Contributor

@TheRealMDoerr @offamitkumar I resolved merge conflicts in src/hotspot/cpu/ppc/macroAssembler_ppc.cpp and src/hotspot/cpu/s390/macroAssembler_s390.cpp. I've smoke tested it with QEMU, but it would be nice if you could check if it's ok as well.

Thanks for rebasing! The PPC64 implementation still looks good and some quick tests have passed on real hardware. I'll run more tests.

@offamitkumar
Copy link
Member

@TheRealMDoerr @offamitkumar
I resolved merge conflicts in src/hotspot/cpu/ppc/macroAssembler_ppc.cpp and src/hotspot/cpu/s390/macroAssembler_s390.cpp. I've smoke tested it with QEMU, but it would be nice if you could check if it's ok as well.

s390x Changes looks good and I ran test and didn't see any regression.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Sep 27, 2024
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 27, 2024
@TheRealMDoerr
Copy link
Contributor

Test results on PPC64 look good. Is it already documented somewhere that LockUnlock.testContendedLock is getting slower with this change? I couldn't see it.

@fbredber
Copy link
Contributor Author

@TheRealMDoerr

Test results on PPC64 look good. Is it already documented somewhere that LockUnlock.testContendedLock is getting slower with this change? I couldn't see it.

Thank you for testing on PPC64. I've added the missing documentation here.

@fbredber
Copy link
Contributor Author

Thanks all for good review comments and testing.

@fbredber
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Sep 30, 2024

Going to push as commit 180affc.
Since your change was applied there have been 108 commits pushed to the master branch:

  • e19c7d8: 8340874: Open source some of the AWT Geometry/Button tests
  • 58b6fc5: 8341197: [BACKOUT] 8322770: Implement C2 VectorizedHashCode on AArch64
  • 1cf26a5: 8341013: Optimize x86/aarch64 MD5 intrinsics by reducing data dependency
  • 475b894: 8322770: Implement C2 VectorizedHashCode on AArch64
  • 52ba728: 8327114: Attach in Linux may have wrong behaviour when pid == ns_pid (Kubernetes debug container)
  • 988a531: 8340181: Shenandoah: Cleanup ShenandoahRuntime stubs
  • 822a773: 8340605: Open source several AWT PopupMenu tests
  • 6514aef: 8340419: ZGC: Create an UseLargePages adaptation of TestAllocateHeapAt.java
  • ae4d2f1: 8340621: Open source several AWT List tests
  • dd56990: 8340639: Open source few more AWT List tests
  • ... and 98 more: https://git.openjdk.org/jdk/compare/0f253d11033a26d15ea20df19db6765bb274a848...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 30, 2024
@openjdk openjdk bot closed this Sep 30, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 30, 2024
@openjdk
Copy link

openjdk bot commented Sep 30, 2024

@fbredber Pushed as commit 180affc.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

10 participants