8307106: Allow concurrent GCs to walk CLDG without ClassLoaderDataGraph_lock #13718

Closed
wants to merge 1 commit into from

Conversation

fisk
Contributor

@fisk fisk commented Apr 28, 2023

A concurrent GC with concurrent class unloading currently cannot walk the CLDG (ClassLoaderDataGraph) without holding the ClassLoaderDataGraph_lock. We should add the synchronization needed for it to do that safely. This patch adds the missing bits.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8307106: Allow concurrent GCs to walk CLDG without ClassLoaderDataGraph_lock

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13718/head:pull/13718
$ git checkout pull/13718

Update a local copy of the PR:
$ git checkout pull/13718
$ git pull https://git.openjdk.org/jdk.git pull/13718/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13718

View PR using the GUI difftool:
$ git pr show -t 13718

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13718.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Apr 28, 2023

👋 Welcome back eosterlund! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 28, 2023
@openjdk

openjdk bot commented Apr 28, 2023

@fisk The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Apr 28, 2023
@openjdk

openjdk bot commented Apr 28, 2023

@fisk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8307106: Allow concurrent GCs to walk CLDG without ClassLoaderDataGraph_lock

Reviewed-by: stefank, aboldtch, coleenp, dholmes

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 57 new commits pushed to the master branch:

  • c8f3756: 8306729: Add nominal descriptors of modules and packages to Constants API
  • 0b5b642: 8307150: RISC-V: Remove remaining StoreLoad barrier with UseCondCardMark for Serial/Parallel GC
  • 418a825: 8306466: Open source more AWT Drag & Drop related tests
  • 74667e3: 8303919: Instant.ofEpochMilli says it can throw an exception that it can't
  • 76991c8: 8282232: [Win] GetMousePositionWithPopup test fails due to wrong mouse position
  • 05b9b58: 8302496: Runtime.exit incorrectly says it never throws an exception
  • 8a70664: 8293117: Add atomic bitset functions
  • 8c106b0: 8303784: no-@target annotations should be applicable to type parameter declarations
  • b76f320: 8307123: Fix deprecation warnings in DPrinter
  • a8bf2ac: 8304888: Add dedicated VMProps for linker and fallback linker
  • ... and 47 more: https://git.openjdk.org/jdk/compare/83a98c66f1747fec3da77578b646498c4cb5637d...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 28, 2023
@mlbridge

mlbridge bot commented Apr 28, 2023

Webrevs

Contributor

@coleenp coleenp left a comment

Looks good to me too. Thanks for the preview, Erik.

@fisk
Contributor Author

fisk commented Apr 28, 2023

Thanks for the reviews @coleenp @stefank and @xmas92!

Member

@dholmes-ora dholmes-ora left a comment

I don't see any change here where code stops acquiring the CLDG lock. Is this just a preparatory change?

Using acquire/release etc is certainly necessary to be able to walk the list safely, but it is not obvious it is sufficient. I can't tell when nodes in the CLDG actually get deleted such that the GC thread doing the walking can't access a no longer existing node.

Thanks.

@@ -520,7 +527,8 @@ bool ClassLoaderDataGraph::do_unloading() {
         prev->set_next(data);
       } else {
         assert(dead == _head, "sanity check");
-        _head = data;
+        // The GC might be walking this concurrently
+        Atomic::store(&_head, data);
Member

If you are using load_acquire then surely this must be a release-store as they need to be paired.

Contributor Author

The load_acquire synchronizes with the release_store on insertions. This relaxed unlinking store doesn't publish anything new that the load_acquire is interested in. It rolls the head back to a CLD that was installed at some earlier point with release_store, meaning it has already been safely published.

Member

Thanks for clarifying
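
The pairing discussed in this thread can be illustrated with a minimal sketch. It is not the HotSpot code: `Node`, `g_head`, `prepend`, `unlink_head` and `sum_payloads` are hypothetical names, and `std::atomic` orderings stand in for `Atomic::release_store`, `Atomic::load_acquire` and `Atomic::store`. The sketch assumes writers are serialized with each other (for example by a lock), so only the walker runs lock-free. It mirrors the argument made in the comment and does not try to prove that the relaxed unlink is sufficient under the ISO C++ memory model.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a ClassLoaderData node (not the HotSpot class).
struct Node {
  int payload;                        // stands in for the initial CLD contents
  std::atomic<Node*> next{nullptr};
  explicit Node(int p) : payload(p) {}
};

std::atomic<Node*> g_head{nullptr};

// Insertion always prepends. The release store publishes the node's
// contents and its next pointer before the node becomes reachable.
// Assumption: callers are serialized with other writers.
void prepend(Node* n) {
  n->next.store(g_head.load(std::memory_order_relaxed),
                std::memory_order_relaxed);
  g_head.store(n, std::memory_order_release);
}

// Unlinking the head rolls g_head back to a node that was already
// published by an earlier release store, so a relaxed store is used.
void unlink_head() {
  Node* dead = g_head.load(std::memory_order_relaxed);
  if (dead != nullptr) {
    g_head.store(dead->next.load(std::memory_order_relaxed),
                 std::memory_order_relaxed);
  }
}

// Lock-free walker: the acquire loads pair with the release store
// performed in prepend().
int sum_payloads() {
  int sum = 0;
  for (Node* n = g_head.load(std::memory_order_acquire); n != nullptr;
       n = n->next.load(std::memory_order_acquire)) {
    sum += n->payload;
  }
  return sum;
}
```

Note that `unlink_head()` only makes the node unreachable for new walks; it does not free it. When the node may be freed is a separate question, covered by the handshake discussion further down.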

@fisk
Contributor Author

fisk commented May 2, 2023

Thanks for having a look, @dholmes-ora.

> I don't see any change here where code stops acquiring the CLDG lock. Is this just a preparatory change?

Yes, indeed. It's for generational ZGC.

> Using acquire/release etc is certainly necessary to be able to walk the list safely, but it is not obvious it is sufficient. I can't tell when nodes in the CLDG actually get deleted such that the GC thread doing the walking can't access a no longer existing node.

  1. We unlink all unloading classes during do_unloading(). Then the GC performs a thread-local handshake with all threads before it finally calls purge() to delete them. In that sense, we use safe memory reclamation to ensure that nobody is still poking around in the CLDs concurrently when we delete them.
  2. Note that we already perform release_store on the head when CLDs are inserted (always prepended). That is because we need to safely publish the next pointer (and generally the initial CLD contents). However, when we unlink CLDs we only use relaxed stores, because we are not mutating the CLD in any interesting way that needs ordering w.r.t. the unlinking stores.
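
The unlink, handshake, purge sequence from the two points above can be sketched as a single-threaded model. Everything here is hypothetical and for illustration only: `Cld`, `Graph`, `Walker`, `next_unloading` and `handshake` are made-up names, and the "handshake" is modelled simply by driving every in-flight walker to the end of its walk. The sketch also assumes that an unlinked node keeps its `next` pointer intact, so a walker already standing on it can continue.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical stand-in for ClassLoaderData; field names are illustrative.
struct Cld {
  int id;
  bool alive = true;
  std::atomic<Cld*> next{nullptr};  // link in the walkable graph
  Cld* next_unloading = nullptr;    // link in the unlinked-but-not-freed list
  explicit Cld(int i) : id(i) {}
};

struct Graph {
  std::atomic<Cld*> head{nullptr};
  Cld* unloading = nullptr;

  void prepend(Cld* c) {
    c->next.store(head.load(std::memory_order_relaxed),
                  std::memory_order_relaxed);
    head.store(c, std::memory_order_release);  // publish contents and next
  }

  // Step 1: unlink dead nodes with relaxed stores. Nothing is freed here,
  // and a dead node's next pointer is left untouched.
  int do_unloading() {
    int unlinked = 0;
    Cld* prev = nullptr;
    Cld* data = head.load(std::memory_order_relaxed);
    while (data != nullptr) {
      Cld* next = data->next.load(std::memory_order_relaxed);
      if (data->alive) {
        prev = data;
      } else {
        if (prev != nullptr) {
          prev->next.store(next, std::memory_order_relaxed);
        } else {
          head.store(next, std::memory_order_relaxed);
        }
        data->next_unloading = unloading;
        unloading = data;
        ++unlinked;
      }
      data = next;
    }
    return unlinked;
  }

  // Step 3: free the unlinked nodes. Only safe after the handshake.
  int purge() {
    int freed = 0;
    while (unloading != nullptr) {
      Cld* dead = unloading;
      unloading = dead->next_unloading;
      delete dead;
      ++freed;
    }
    return freed;
  }
};

// A walker that can be paused between nodes, modelling a concurrent GC thread.
struct Walker {
  Cld* cursor = nullptr;  // next node to visit; may already be unlinked
  int visited = 0;
  void start(Graph& g) { cursor = g.head.load(std::memory_order_acquire); }
  bool step() {
    if (cursor == nullptr) return false;
    ++visited;
    cursor = cursor->next.load(std::memory_order_acquire);
    return true;
  }
};

// Step 2 (model): returns only once every walker has left the graph.
void handshake(const std::vector<Walker*>& walkers) {
  for (Walker* w : walkers) {
    while (w->step()) {}
  }
}
```

The point of the ordering is that `purge()` runs only after `handshake()` returns, so no walker can still hold a pointer to a node being deleted.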

Member

@dholmes-ora dholmes-ora left a comment

Thanks for the additional explanations @fisk.

@fisk
Contributor Author

fisk commented May 3, 2023

> Thanks for the additional explanations @fisk.

Thanks @dholmes-ora!

@fisk
Contributor Author

fisk commented May 3, 2023

/integrate

@openjdk

openjdk bot commented May 3, 2023

Going to push as commit 462b1df.
Since your change was applied there have been 57 commits pushed to the master branch:

  • c8f3756: 8306729: Add nominal descriptors of modules and packages to Constants API
  • 0b5b642: 8307150: RISC-V: Remove remaining StoreLoad barrier with UseCondCardMark for Serial/Parallel GC
  • 418a825: 8306466: Open source more AWT Drag & Drop related tests
  • 74667e3: 8303919: Instant.ofEpochMilli says it can throw an exception that it can't
  • 76991c8: 8282232: [Win] GetMousePositionWithPopup test fails due to wrong mouse position
  • 05b9b58: 8302496: Runtime.exit incorrectly says it never throws an exception
  • 8a70664: 8293117: Add atomic bitset functions
  • 8c106b0: 8303784: no-@target annotations should be applicable to type parameter declarations
  • b76f320: 8307123: Fix deprecation warnings in DPrinter
  • a8bf2ac: 8304888: Add dedicated VMProps for linker and fallback linker
  • ... and 47 more: https://git.openjdk.org/jdk/compare/83a98c66f1747fec3da77578b646498c4cb5637d...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 3, 2023
@openjdk openjdk bot closed this May 3, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 3, 2023
@openjdk

openjdk bot commented May 3, 2023

@fisk Pushed as commit 462b1df.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
5 participants