8264634: CollectCLDClosure collects duplicated CLDs when dumping dynamic archive #3320
👋 Welcome back yyang! A progress list of the required criteria for merging this PR into |
@kelthuzadx The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Hi Yi,
_loaded_cld is a global list, and in this case it looks like it contains duplicated CLDs.
The duplication could come from the thread that runs the shutdown hook.
Could you try:
    if (!cld->is_unloading()) {
      cld->inc_keep_alive();
    + if (!_loaded_cld->contains(cld)) {
        _loaded_cld->append(cld);
    + }
    }
Please let us know whether this avoids the crash.
The fix looks reasonable. If MetaspaceShared::link_and_cleanup_shared_classes may be called twice, it's better to isolate the loaded_cld for each invocation. Allocating it locally will also avoid any potential threading issues.
I have some requests for cleaning up the code.
      cld->dec_keep_alive();
    }
    loaded_cld.trunc_to(0);
There's no need for the truncate: loaded_cld is locally allocated and will be freed after this function returns.
Also, to improve modularity, I think we should move the dec_keep_alive loop into the destructor of CollectCLDClosure.
Also, loaded_cld can be moved into CollectCLDClosure as a field.
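As a rough illustration of this suggestion, the standalone C++ sketch below models a closure that owns its CLD list and releases the keep-alive counts in its destructor. `FakeCLD`, `CollectClosure`, and `run_once` are hypothetical stand-ins, and `std::vector` replaces HotSpot's `GrowableArray`; this is not the code that was integrated.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ClassLoaderData: only the keep-alive counter is modelled.
struct FakeCLD {
  int keep_alive = 0;
  bool unloading = false;
  void inc_keep_alive() { ++keep_alive; }
  void dec_keep_alive() { assert(keep_alive > 0); --keep_alive; }
};

// The closure owns its list, so every invocation starts with an empty list,
// and the destructor undoes exactly the increments made by do_cld().
class CollectClosure {
  std::vector<FakeCLD*> _loaded_cld;
 public:
  CollectClosure() = default;
  CollectClosure(const CollectClosure&) = delete;
  CollectClosure& operator=(const CollectClosure&) = delete;
  ~CollectClosure() {
    for (FakeCLD* cld : _loaded_cld) {
      cld->dec_keep_alive();
    }
  }
  void do_cld(FakeCLD* cld) {
    if (!cld->unloading) {
      cld->inc_keep_alive();
      _loaded_cld.push_back(cld);
    }
  }
  int nof_cld() const { return static_cast<int>(_loaded_cld.size()); }
};

// Simulates one pass over the graph; returns how many CLDs were collected.
int run_once(std::vector<FakeCLD>& graph) {
  CollectClosure collect;
  for (FakeCLD& cld : graph) {
    collect.do_cld(&cld);
  }
  return collect.nof_cld();
}  // ~CollectClosure runs here and releases every collected CLD
```

Because the list lives inside the closure, calling `run_once` twice on the same graph leaves every counter back at zero, which is the property the review is asking for.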
    CollectCLDClosure collect_cld;
    ResourceMark rm;
    GrowableArray<ClassLoaderData*> loaded_cld;
    CollectCLDClosure collect_cld(&loaded_cld);
I think we should add a comment to say why it's necessary to collect the ClassLoaderDatas first:
// ClassLoaderDataGraph::loaded_cld_do requires ClassLoaderDataGraph_lock.
// We cannot link the classes while holding this lock (or else we may run into deadlock).
// Therefore, we need to first collect all the CLDs, and then link their classes after
// releasing the lock.
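The pattern described in that comment can be sketched in plain C++ as follows. This is a minimal model under stated assumptions: `graph_lock` is a hypothetical stand-in for ClassLoaderDataGraph_lock, `FakeCLD`, `link_one`, and `collect_then_link` are invented for illustration, and the "linking" step simply takes the lock again to show that it is free by then.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical stand-in for ClassLoaderData.
struct FakeCLD {
  int keep_alive = 0;
  bool linked = false;
};

// Hypothetical stand-in for ClassLoaderDataGraph_lock (non-recursive).
std::mutex graph_lock;

// Models linking work that may itself need the graph lock.
// Calling this while graph_lock is already held would deadlock.
void link_one(FakeCLD* cld) {
  std::lock_guard<std::mutex> guard(graph_lock);
  cld->linked = true;
}

// Phase 1: collect and pin the CLDs under the lock.
// Phase 2: link them after the lock has been released.
int collect_then_link(std::vector<FakeCLD>& graph) {
  std::vector<FakeCLD*> loaded_cld;  // local to this invocation
  {
    std::lock_guard<std::mutex> guard(graph_lock);
    for (FakeCLD& cld : graph) {
      ++cld.keep_alive;              // pin so it stays alive without the lock
      loaded_cld.push_back(&cld);
    }
  }                                  // lock released here

  int linked = 0;
  for (FakeCLD* cld : loaded_cld) {
    link_one(cld);
    ++linked;
  }
  for (FakeCLD* cld : loaded_cld) {
    --cld->keep_alive;               // unpin once linking is done
  }
  return linked;
}
```

The keep-alive increment is what makes it safe to touch the collected entries after the lock is dropped; the matching decrement must run exactly once per increment.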
Hi Yumin, this fix still crashes. The CLDs collected at the first invocation of MetaspaceShared::link_and_cleanup_shared_classes are never cleared from the list, so their _keep_alive counters are decremented again, as before, at the second invocation of MetaspaceShared::link_and_cleanup_shared_classes.
Hi Ioi,
The suggestions make sense; I have made the changes. All tests under runtime/cds/ pass in slowdebug mode.
@kelthuzadx This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 47 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@yminqi, @iklam) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Making the CLD list local is a reasonable solution. LGTM.
/integrate |
@kelthuzadx Unknown command |
@kelthuzadx |
/sponsor |
@yminqi @kelthuzadx Since your change was applied there have been 48 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 54b4070. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
We observed a VM crash when dumping a dynamic archive in a simple Spring Boot application (see the JBS attachment for details). After some investigation, I found that in rare cases both of the following paths may be taken when dumping the dynamic archive:
Both paths call MetaspaceShared::link_and_cleanup_shared_classes, so CollectCLDClosure collects duplicated CLDs into _loaded_cld. _keep_alive is then decremented twice for the same CLD, which makes _keep_alive negative.
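The failure mode can be reproduced in miniature with the single-threaded C++ sketch below. It is a simplified model, not HotSpot code: `FakeCLD`, `g_loaded_cld`, `link_and_cleanup_buggy`, and `keep_alive_after` are hypothetical names, and the list is assumed to survive between invocations without being cleared.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ClassLoaderData.
struct FakeCLD {
  int keep_alive = 0;
};

// Models the pre-fix situation: the list outlives each invocation
// (assumption: nothing clears it between calls).
static std::vector<FakeCLD*> g_loaded_cld;

void link_and_cleanup_buggy(std::vector<FakeCLD>& graph) {
  for (FakeCLD& cld : graph) {
    ++cld.keep_alive;               // one increment per invocation
    g_loaded_cld.push_back(&cld);   // duplicate entry on the second call
  }
  // ... class linking would happen here ...
  for (FakeCLD* cld : g_loaded_cld) {
    --cld->keep_alive;              // one decrement per list entry, including stale ones
  }
}

// Returns the counter of a single CLD after the given number of invocations.
int keep_alive_after(int invocations) {
  g_loaded_cld.clear();
  std::vector<FakeCLD> graph(1);
  for (int i = 0; i < invocations; ++i) {
    link_and_cleanup_buggy(graph);
  }
  return graph[0].keep_alive;
}
```

In this model one invocation leaves the counter balanced at 0, while a second invocation increments once but decrements twice, leaving it at -1.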
Testing(linux_x64):
[+] test/hotspot/jtreg/runtime/cds
[+] test/hotspot/jtreg/gc
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3320/head:pull/3320
$ git checkout pull/3320
Update a local copy of the PR:
$ git checkout pull/3320
$ git pull https://git.openjdk.java.net/jdk pull/3320/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 3320
View PR using the GUI difftool:
$ git pr show -t 3320
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3320.diff