8278602: CDS dynamic dump may access unloaded classes #6859

Closed
wants to merge 11 commits into from

Conversation

iklam
Member

@iklam iklam commented Dec 16, 2021

Cause of crash:

When dumping a CDS archive, we iterate over the entries of SystemDictionaryShared::_dumptime_table without checking whether the classes have already been unloaded. In the crash, we were trying to call InstanceKlass::signers() on a class that had already been unloaded.

Fix:

Override the template function DumpTimeSharedClassTable::iterate to ensure iteration safety. Do not iterate over a class if its class_loader_data is no longer alive.
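The idea of skipping dead entries during iteration can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `FakeKlass`, `FakeDumpTimeTable` and `visited_names` are hypothetical stand-ins invented for illustration, and the `loader_alive` field merely models the result of `is_loader_alive()`. It is not the HotSpot implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a class entry; not the real InstanceKlass.
struct FakeKlass {
  std::string name;
  bool loader_alive;  // models the result of k->is_loader_alive()
  bool is_loader_alive() const { return loader_alive; }
};

// Hypothetical stand-in for the dump-time table.
class FakeDumpTimeTable {
 public:
  void add(const FakeKlass* k) {
    assert(k != nullptr);
    _entries.push_back(k);
  }

  // Visit only entries whose loader is still alive. The visitor returns
  // false to stop the iteration early.
  template <typename Visitor>
  void iterate(Visitor visit) const {
    for (const FakeKlass* k : _entries) {
      if (!k->is_loader_alive()) {
        continue;  // unloaded: never hand this entry to the visitor
      }
      if (!visit(*k)) {
        return;
      }
    }
  }

 private:
  std::vector<const FakeKlass*> _entries;
};

// Collect the names that a liveness-filtered iteration visits.
std::vector<std::string> visited_names(const FakeDumpTimeTable& table) {
  std::vector<std::string> names;
  table.iterate([&names](const FakeKlass& k) {
    names.push_back(k.name);
    return true;  // keep iterating
  });
  return names;
}
```

Because the filter lives inside `iterate`, individual visitors cannot forget the liveness check. In this toy model the dead entries are still valid objects; in a real VM they would not be safe to inspect.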

The assert in DumpTimeSharedClassTable::IterationHelper found another existing bug -- we were calling SystemDictionaryShared::is_dumptime_table_empty() without holding the DumpTimeTable_lock. I delayed the call until we have grabbed the lock.
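The ordering issue described above (query the table only after taking the lock) can be sketched as follows. This is a simplified model, not HotSpot code: `OwnedLock`, `FakeDumpTimeTable` and `should_dump` are hypothetical names, and `owned_by_self()` only loosely imitates the kind of ownership check that `assert_lock_strong` performs.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical lock that remembers its owner so that code can assert
// ownership before touching shared state.
class OwnedLock {
 public:
  void lock() {
    _mutex.lock();
    _owner.store(std::this_thread::get_id());
  }
  void unlock() {
    _owner.store(std::thread::id());  // reset to "no thread"
    _mutex.unlock();
  }
  bool owned_by_self() const {
    return _owner.load() == std::this_thread::get_id();
  }

 private:
  std::mutex _mutex;
  std::atomic<std::thread::id> _owner{std::thread::id()};
};

// Hypothetical table whose accessors all require the lock to be held.
class FakeDumpTimeTable {
 public:
  OwnedLock table_lock;

  void add_entry() {
    assert(table_lock.owned_by_self());
    ++_count;
  }
  bool is_empty() const {
    assert(table_lock.owned_by_self());
    return _count == 0;
  }

 private:
  std::size_t _count = 0;
};

// Take the lock first, then ask whether there is anything to dump.
bool should_dump(FakeDumpTimeTable& table) {
  std::lock_guard<OwnedLock> guard(table.table_lock);
  return !table.is_empty();
}
```

Calling `is_empty()` before constructing the guard would trip the assertion in a debug build, which is the kind of misuse the new assert exposed.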

Testing:

I have attached a test case to the bug report. Without the fix, it reproduces the same crash in less than a minute. With the fix, the crash is no longer reproducible.

Unfortunately, the test case requires a ZGC patch (thanks to @stefank) that adds delays to increase the likelihood of seeing unloaded classes inside the _dumptime_table. Therefore, I cannot integrate the test as a jtreg test. I'll mark the bug as noreg-hard.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8278602: CDS dynamic dump may access unloaded classes

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6859/head:pull/6859
$ git checkout pull/6859

Update a local copy of the PR:
$ git checkout pull/6859
$ git pull https://git.openjdk.java.net/jdk pull/6859/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6859

View PR using the GUI difftool:
$ git pr show -t 6859

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6859.diff

@bridgekeeper
bridgekeeper bot commented Dec 16, 2021

👋 Welcome back iklam! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
openjdk bot commented Dec 16, 2021

@iklam The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Dec 16, 2021
@iklam
Member Author

iklam commented Dec 16, 2021

/label add hotspot
/label remove hotspot-runtime

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Dec 16, 2021
@openjdk
openjdk bot commented Dec 16, 2021

@iklam
The hotspot label was successfully added.

@openjdk openjdk bot removed the hotspot-runtime hotspot-runtime-dev@openjdk.org label Dec 16, 2021
@openjdk
openjdk bot commented Dec 16, 2021

@iklam
The hotspot-runtime label was successfully removed.

@iklam iklam marked this pull request as ready for review December 16, 2021 03:58
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 16, 2021
@mlbridge
mlbridge bot commented Dec 16, 2021

Webrevs

Member

@stefank stefank left a comment

I've reviewed the interaction of the klasses in the _dumptime_table with the new is_loader_alive() check. I don't know the rest of the CDS code well enough to tell whether the other changes are correct. I spotted something that looks weird:

Comment on lines 190 to 194
void SystemDictionaryShared::stop_dumping() {
  assert_lock_strong(DumpTimeTable_lock);
  _dump_in_progress = true;
}

Member

Did you really intend to set _dump_in_progress to true in stop_dumping()? start_dumping() also sets it to true.

Member Author

Oh, it should be _dump_in_progress = false;. I'll fix that.
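A minimal sketch of the corrected start/stop pairing, assuming a plain boolean flag. `DumpState`, `DumpingScope` and `flag_is_reset_after_scope` are hypothetical names used only for illustration; the lock assertion from the reviewed snippet is left out to keep the model standalone.

```cpp
#include <cassert>

// Hypothetical model of the dump-in-progress flag; not the HotSpot source.
class DumpState {
 public:
  void start_dumping() { _dump_in_progress = true; }
  void stop_dumping() { _dump_in_progress = false; }  // must clear the flag
  bool is_dumping() const { return _dump_in_progress; }

 private:
  bool _dump_in_progress = false;
};

// Hypothetical RAII guard: pairs start and stop so the flag cannot stay set
// when the scope is left.
class DumpingScope {
 public:
  explicit DumpingScope(DumpState& state) : _state(state) {
    _state.start_dumping();
  }
  ~DumpingScope() { _state.stop_dumping(); }
  DumpingScope(const DumpingScope&) = delete;
  DumpingScope& operator=(const DumpingScope&) = delete;

 private:
  DumpState& _state;
};

// Returns true if the flag is set inside the scope and cleared after it.
bool flag_is_reset_after_scope() {
  DumpState state;
  {
    DumpingScope scope(state);
    assert(state.is_dumping());
  }
  return !state.is_dumping();
}
```

With the version quoted in the review, where `stop_dumping()` sets the flag to `true`, `flag_is_reset_after_scope()` would return `false`.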

Contributor

@coleenp coleenp left a comment

Could you also add your unloads a lot test even though it doesn't reproduce this particular error without the ZGC change? It might find a similar bug under stress conditions.

assert(SafepointSynchronize::is_at_safepoint(), "invariant");
assert_lock_strong(DumpTimeTable_lock);
if (k->is_loader_alive()) {
  assert(k->is_loader_alive(), "must be");
Contributor

This does seem a bit paranoid and redundant here.

Member Author

Oops, that was leftover code. I'll remove it.

  assert(k->is_loader_alive(), "must not change");
  return result;
} else {
  if (!SystemDictionaryShared::is_excluded_class(k)) {
Contributor

I thought this was the original bug? is_excluded_class() looks at mirror->signers(), which will crash if the class isn't alive. This has to be inside the k->is_loader_alive() check too.

Member Author

is_excluded_class() only checks the DumpTimeClassInfo::_is_excluded field. It doesn't examine the mirror->signers(). The crash happened with SystemDictionaryShared::check_excluded_classes(), which does examine the signers.

bool SystemDictionaryShared::is_excluded_class(InstanceKlass* k) {
  assert(_no_class_loading_should_happen, "sanity");
  assert_lock_strong(DumpTimeTable_lock);
  Arguments::assert_is_dumping_archive();
  DumpTimeClassInfo* p = find_or_allocate_info_for_locked(k);
  return (p == NULL) ? true : p->is_excluded();
}
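The distinction drawn above, between a pass that inspects the class itself and a lookup that only reads a cached flag, can be modeled in a few lines. This is a sketch under stated assumptions: `FakeKlass`, `FakeInfo` and `FakeDictionary` are hypothetical, `has_signers` stands in for whatever the real check examines, and the locking and assertions of the quoted function are omitted.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical stand-ins; not the HotSpot types.
struct FakeKlass {
  bool loader_alive;  // models k->is_loader_alive()
  bool has_signers;   // models a property read from the class itself
};

struct FakeInfo {
  bool excluded = false;  // cached result, analogous to an _is_excluded field
};

class FakeDictionary {
 public:
  void add(const FakeKlass* k) {
    assert(k != nullptr);
    _table[k];  // default-constructs the cached info for k
  }

  // Computing pass: reads properties of the class, so it skips entries whose
  // loader is dead. In a real VM, inspecting such a class would be unsafe.
  void check_excluded_classes() {
    for (auto& entry : _table) {
      const FakeKlass* k = entry.first;
      if (!k->loader_alive) {
        continue;
      }
      entry.second.excluded = k->has_signers;
    }
  }

  // Lookup: reads only the cached flag and never inspects the class.
  // Classes with no info are treated as excluded, as in the quoted code.
  bool is_excluded_class(const FakeKlass* k) const {
    auto it = _table.find(k);
    return (it == _table.end()) ? true : it->second.excluded;
  }

 private:
  std::unordered_map<const FakeKlass*, FakeInfo> _table;
};
```

One difference from the quoted function: this model only looks the entry up, whereas `find_or_allocate_info_for_locked` may allocate one.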

Contributor

Ok, sorry I got the names confused.

@iklam
Member Author

iklam commented Dec 16, 2021

> Could you also add your unloads a lot test even though it doesn't reproduce this particular error without the ZGC change? It might find a similar bug under stress conditions.

OK, I'll add the test case.

Comment on lines 80 to 81
static double d = 123;
static float f = 456;
Member

Are the above declarations needed? They are not being used.

Member Author

These are leftovers. I've removed them.

Comment on lines 86 to 88
public void doit(Runnable r) {
    r.run();
}
Member

I don't see the above method being called.

Member Author

I've removed it.

Member

@calvinccheung calvinccheung left a comment

Looks good. Just a couple of questions on the test.

Contributor

@coleenp coleenp left a comment

Looks good. Thank you for adding the test.

@openjdk
openjdk bot commented Dec 17, 2021

@iklam This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8278602: CDS dynamic dump may access unloaded classes

Reviewed-by: coleenp, ccheung

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 4 new commits pushed to the master branch:

  • 8dc4437: 8278434: timeouts in test java/time/test/java/time/format/TestZoneTextPrinterParser.java
  • 6b906bb: 8279223: Define version in .jcheck/conf
  • c295e71: 8276700: Improve java.lang.ref.Cleaner javadocs
  • 3a1fca3: 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 17, 2021
@iklam
Member Author

iklam commented Jan 4, 2022

Thanks @stefank @calvinccheung @coleenp for the review. Latest version passed Mach5 CI tiers 1-4.
/integrate

@openjdk
openjdk bot commented Jan 4, 2022

Going to push as commit 09cf5f1.
Since your change was applied there have been 4 commits pushed to the master branch:

  • 8dc4437: 8278434: timeouts in test java/time/test/java/time/format/TestZoneTextPrinterParser.java
  • 6b906bb: 8279223: Define version in .jcheck/conf
  • c295e71: 8276700: Improve java.lang.ref.Cleaner javadocs
  • 3a1fca3: 8278146: G1: Rework VM_G1Concurrent VMOp to clearly identify it as pause

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 4, 2022
@openjdk openjdk bot closed this Jan 4, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 4, 2022
@openjdk
openjdk bot commented Jan 4, 2022

@iklam Pushed as commit 09cf5f1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
4 participants