Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8317007: Add bulk removal of dead nmethods during class unloading #16767

Closed

Conversation

tschatzl
Copy link
Contributor

@tschatzl tschatzl commented Nov 21, 2023

Hi all,

please review this change that bulk-removes dead nmethods for the STW collectors instead of unregistering nmethod by nmethod. This significantly speeds up the class unloading phase.

For G1, this is almost 100% the code that has been removed from the review for JDK-8315503.

For Serial and Parallel GC, the code is new.

This change does not try to improve the situation for concurrent collectors - this would at first glance require extending the scope of the CodeCache_lock which I did not want to do. See the CR for more details.

Also, no parallelism for Parallel GC: the existing data structure for code root remembered set is a linked list. There is in almost all cases no point in trying to parallelize what is basically a traversal of the linked list (with each element not having a lot of work to do). I filed JDK-8320067. There should already be a significant speedup with this change, as each unregister_nmethod call previously traversed the entire linked list to find that element (i.e. complexity reduction of O(n^2) -> O(n)).

One more comment:

  • So it would be possible to merge the ScavengableNMethod::prune_nmethods_not_into_young which removes elements from the code root sets that do no longer need to be remembered (due to object movement) with removing dead/unlinked nmethods. However, this would mean to put that class unloading code to the end of gc as during phase 1 the new locations which are relevant for pruning uninteresting code root remembered set entries did not move yet. I wanted to keep determining the unlinking nmethods and class/code unloading them together in the code, but I could be convinced to move it.

Testing: tier1-4

Depends on #16759, please review first.

Thanks,
Thomas


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8317007: Add bulk removal of dead nmethods during class unloading (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16767/head:pull/16767
$ git checkout pull/16767

Update a local copy of the PR:
$ git checkout pull/16767
$ git pull https://git.openjdk.org/jdk.git pull/16767/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16767

View PR using the GUI difftool:
$ git pr show -t 16767

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16767.diff

Webrev

Link to Webrev Comment

Thomas Schatzl added 4 commits November 20, 2023 12:20
…ptimization when adding, making this procedure O(n) instead of O(n^2)

Introduce a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading.
GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge).

The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform
this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every
insertion to allow for concurrent users for the lock to progress.

Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing
CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared
towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2).

Upcoming changes will
* separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly
  reduce code purging time for the STW collectors.
* better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`)
* untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better
  parallelism
* G1: move some signifcant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging)
* Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging.
please review this change that bulk-removes dead nmethods for the STW collectors instead of unregistering nmethod by nmethod. This significantly speeds up the class unloading phase.

For G1, this is almost 100% the code that has been removed from the review for JDK-8315503.

For Serial and Parallel GC, the code is new.

This change does not try to improve the situation for concurrent collectors - this would at first glance require extending the scope of the CodeCache_lock which I did not want to do. See the CR for more details.

Also, no parallelism for Parallel GC: the existing data structure for code root remembered set is a linked list. There is in almost all cases no point in trying to parallelize what is basically a traversal of the linked list (with each element not having a lot of work to do). I file JDK-8320067. There should still be a significant speedup, as each unregister_nmethod call previously traversed the entire linked list to find that element (i.e. complexity reduction of O(n^2) -> O(n)).

One more comment:
So it would be possible to merge the ScavengableNMethod::prune_nmethods which removes elements from the code root sets that do no longer need to be remembered (due to object movement) and removing dead nmethods.
However, this would mean to put that code at the end of gc (where prune_nmethods is called now) as during phase 1 the new locations which are relevant for pruning uninteresting code root remembered set entries did not move yet. I wanted to keep determining the dead nmethods and removing them together in the code, but I could be convinced to move it.

Testing: tier1-4
@tschatzl
Copy link
Contributor Author

/label add hotspot-gc

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 21, 2023

👋 Welcome back tschatzl! A progress list of the required criteria for merging this PR into pr/16759 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8317007 8317007: Add bulk removal of dead nmethods during class unloading Nov 21, 2023
@openjdk openjdk bot added rfr Pull request is ready for review hotspot-gc hotspot-gc-dev@openjdk.org labels Nov 21, 2023
@openjdk
Copy link

openjdk bot commented Nov 21, 2023

@tschatzl
The hotspot-gc label was successfully added.

@mlbridge
Copy link

mlbridge bot commented Nov 21, 2023

@openjdk-notifier openjdk-notifier bot changed the base branch from pr/16759 to master December 5, 2023 10:41
@openjdk-notifier
Copy link

The parent pull request that this pull request depends on has now been integrated and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork:

git checkout submit/8317007-nmethod-bulk-removal
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk
Copy link

openjdk bot commented Dec 5, 2023

@tschatzl this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout submit/8317007-nmethod-bulk-removal
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Dec 5, 2023
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Dec 5, 2023
Copy link
Member

@albertnetymk albertnetymk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much perf impact does "bulk removal" introduce? (If I understand this PR correctly, this IS an optimization after all.)

src/hotspot/share/code/nmethod.cpp Outdated Show resolved Hide resolved
src/hotspot/share/code/compiledMethod.hpp Outdated Show resolved Hide resolved
src/hotspot/share/gc/g1/g1CollectedHeap.cpp Outdated Show resolved Hide resolved
@tschatzl
Copy link
Contributor Author

tschatzl commented Dec 7, 2023

How much perf impact does "bulk removal" introduce? (If I understand this PR correctly, this IS an optimization after all.)

On G1, bulk removal reduces time spent in unregistering nmethods from 600ms to 30ms (and only that bad because in the test the code roots are very unbalanced and distribution is per-region) for a stress test, i.e. a 20x improvement. Sorry for not mentioning (again).

I think this is sufficient enough to warrant improvement.

@tschatzl
Copy link
Contributor Author

tschatzl commented Dec 7, 2023

Here are two before/after graphs for this change on ccstress (also added to the CR):

The next one shows Remark Pause time before and after this change on the ccstress benchmark (from JDK-8315503).

CCStress Remark Pause time before after 20231207

This one shows a breakdown of Nmethod Purge and NMethod unregister time (and the sum) on that benchmark, compared to current master.
CCStress Purge NMethod Unregistering breakdown 20231207

Still a good improvement, and mostly only that bad because of imbalance of the bulk nmethod unregister phase (iirc). That could be improved at a later point.

@tschatzl
Copy link
Contributor Author

tschatzl commented Dec 7, 2023

Here are two before/after graphs for this change on ccstress (also added to the CR):

The next one shows Remark Pause time before and after this change on the ccstress benchmark (from JDK-8315503).

CCStress Remark Pause time before after 20231207

The large peaks at second 180 and 225 in the graph are caused by class loader/metaspace purging unrelated to this change.

Copy link
Member

@walulyai walulyai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@openjdk
Copy link

openjdk bot commented Dec 15, 2023

@tschatzl This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8317007: Add bulk removal of dead nmethods during class unloading

Reviewed-by: ayang, iwalulya

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 143 new commits pushed to the master branch:

  • bdebf19: 8322175: test/langtools/tools/javac/classreader/BadMethodParameter.java doesn't compile
  • 20de541: 8322040: Missing array bounds check in ClassReader.parameter
  • b31454e: 8322098: os::Linux::print_system_memory_info enhance the THP output with /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
  • 0be0775: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr
  • 6dfb812: 8321823: Remove redundant PhaseGVN transform_no_reclaim
  • a7dde57: 8322057: Memory leaks in creating jfr symbol array
  • 692be57: 8322065: Initial nroff manpage generation for JDK 23
  • d02bc87: 8309981: Remove expired flags in JDK 23
  • 8b24851: 8321480: ISO 4217 Amendment 176 Update
  • c328f95: 8296787: Unify debug printing format of X.509 cert serial numbers
  • ... and 133 more: https://git.openjdk.org/jdk/compare/672f37324f9f15ae3e03b9b3b86c7106e6a09eed...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 15, 2023
src/hotspot/share/code/nmethod.cpp Outdated Show resolved Hide resolved
src/hotspot/share/code/nmethod.cpp Outdated Show resolved Hide resolved
@tschatzl
Copy link
Contributor Author

Thanks @albertnetymk @walulyai for your reviews
/integrate

@openjdk
Copy link

openjdk bot commented Dec 18, 2023

Going to push as commit f553819.
Since your change was applied there have been 149 commits pushed to the master branch:

  • 34351b7: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp
  • b061b66: 8322041: JDK 22 RDP1 L10n resource files update
  • dcdcd48: 8321479: java -D-D crashes
  • 87ef733: 8321958: @param/@return descriptions of ZoneRules#isDaylightSavings() are incorrect
  • 05f7f0a: 8321288: [JVMCI] HotSpotJVMCIRuntime doesn't clean up WeakReferences in resolvedJavaTypes
  • 6311dab: 8322018: Test java/lang/String/CompactString/MaxSizeUTF16String.java fails with -Xcomp
  • bdebf19: 8322175: test/langtools/tools/javac/classreader/BadMethodParameter.java doesn't compile
  • 20de541: 8322040: Missing array bounds check in ClassReader.parameter
  • b31454e: 8322098: os::Linux::print_system_memory_info enhance the THP output with /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
  • 0be0775: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr
  • ... and 139 more: https://git.openjdk.org/jdk/compare/672f37324f9f15ae3e03b9b3b86c7106e6a09eed...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 18, 2023
@openjdk openjdk bot closed this Dec 18, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 18, 2023
@openjdk
Copy link

openjdk bot commented Dec 18, 2023

@tschatzl Pushed as commit f553819.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tschatzl tschatzl deleted the submit/8317007-nmethod-bulk-removal branch December 18, 2023 09:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
3 participants