This repository was archived by the owner on Sep 2, 2022. It is now read-only.
/ jdk17 Public archive

8268524: nmethod::post_compiled_method_load_event racingly called on zombie #107

Closed
wants to merge 2 commits into from

Conversation

fisk
Contributor

@fisk fisk commented Jun 21, 2021

In the code exercised by JvmtiCodeBlobEvents::generate_compiled_method_load_events, we grab a code cache iterator with the NMethodIterator::only_alive_and_not_unloading mode, under the CodeCache_lock. The idea is to then call post_compiled_method_load_event() on each of these is_alive() nmethods. Surely none of them will be a zombie. Inside of post_compiled_method_load_event() we filter out nmethods that racingly can die, like this:

    if (is_not_entrant() && can_convert_to_zombie()) {
      return;
    }

So if the nmethod was dead or is_unloading(), we wouldn't get it into the iterator, and here we explicitly filter out nmethods that can become zombies. Now we should have all bases covered, no way we can end up calling the subsequent code on a zombie!

Except... the code called by the sweeper that flips an nmethod to zombie doesn't hold the CodeCache_lock. Instead it holds the CompiledMethod_lock, which this JVMTI code does not hold. So between the nmethod being seen as alive in the iterator and the call to is_not_entrant(), it could already have racingly become a zombie. When we then check is_not_entrant(), it returns false, because the nmethod is a zombie. We are therefore tricked into believing it is safe to post these events for the nmethod, while in fact it is already dead.

After we have mistakenly grabbed a zombie nmethod, when we use ZGC, we call the nmethod entry barriers on it. It gets indigestion due to being called on a zombie. I'm sure there are more sources of indigestion as well.
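
The race described above can be illustrated with a small toy model. This is only a sketch, not HotSpot code: `ToyNMethod`, the three-state `State` enum, the `convertible` flag, and both `should_post_*` helpers are hypothetical names introduced here. The locked variant assumes that the filter re-checks liveness while holding the same lock as the sweeper-side transition.

```cpp
#include <cassert>
#include <mutex>

// Toy model only; this is not HotSpot code. The three-state enum and the
// `convertible` flag are simplifying assumptions for illustration.
enum class State { in_use, not_entrant, zombie };

struct ToyNMethod {
  State state = State::in_use;
  bool convertible = false;         // stand-in for can_convert_to_zombie()
  std::mutex compiled_method_lock;  // stand-in for CompiledMethod_lock

  bool is_alive() const { return state != State::zombie; }
  bool is_not_entrant() const { return state == State::not_entrant; }
  bool can_convert_to_zombie() const { return convertible; }

  // The filter quoted above, run without taking the lock.
  bool should_post_unlocked() const {
    if (is_not_entrant() && can_convert_to_zombie()) {
      return false;
    }
    return true;
  }

  // Assumed shape of a safer filter: re-check liveness under the lock.
  bool should_post_locked() {
    std::lock_guard<std::mutex> guard(compiled_method_lock);
    if (!is_alive()) {
      return false;
    }
    if (is_not_entrant() && can_convert_to_zombie()) {
      return false;
    }
    return true;
  }
};

// Simulates the sweeper having already flipped the nmethod to zombie
// between the iterator's liveness check and the filter.
inline void check_zombie_case() {
  ToyNMethod nm;
  nm.state = State::zombie;
  assert(nm.should_post_unlocked());  // the zombie slips through
  assert(!nm.should_post_locked());   // the zombie is filtered out
}
```

In this model a zombie passes the unlocked filter because is_not_entrant() is false for it, which is exactly the trap described above.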


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8268524: nmethod::post_compiled_method_load_event racingly called on zombie

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk17 pull/107/head:pull/107
$ git checkout pull/107

Update a local copy of the PR:
$ git checkout pull/107
$ git pull https://git.openjdk.java.net/jdk17 pull/107/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 107

View PR using the GUI difftool:
$ git pr show -t 107

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk17/pull/107.diff

@bridgekeeper

bridgekeeper bot commented Jun 21, 2021

👋 Welcome back eosterlund! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@fisk fisk marked this pull request as ready for review June 21, 2021 08:50
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 21, 2021
@openjdk

openjdk bot commented Jun 21, 2021

@fisk The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.java.net label Jun 21, 2021
@mlbridge

mlbridge bot commented Jun 21, 2021

Webrevs

Contributor

@vnkozlov vnkozlov left a comment

seems reasonable

@openjdk

openjdk bot commented Jun 21, 2021

@fisk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8268524: nmethod::post_compiled_method_load_event racingly called on zombie

Reviewed-by: kvn, neliasso, coleenp

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 9 new commits pushed to the master branch:

  • 01f12fb: 8266631: StandardJavaFileManager: getJavaFileObjects() impl violates the spec
  • 6b14c8a: 8267421: j.l.constant.DirectMethodHandleDesc.Kind.valueOf(int) implementation doesn't conform to the spec regarding REF_invokeInterface handling
  • ef4ba22: 8268349: Provide clear run-time warnings about Security Manager deprecation
  • 4099810: 8268293: VectorAPI cast operation on mask and shuffle is broken
  • e2d7ec3: 8267100: [BACKOUT] JDK-8196415 Disable SHA-1 Signed JARs
  • d3ad8cd: 8268672: C2: assert(!loop->is_member(u_loop)) failed: can be in outer loop or out of both loops only
  • f25e719: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails
  • 22ebd19: 8268362: [REDO] C2 crash when compile negative Arrays.copyOf length after loop
  • f8df953: 8268702: JFR diagnostic commands lack argument descriptors when viewed using Platform MBean Server

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 21, 2021

@neliasso neliasso left a comment

Looks good.

      // release this lock, so we check that this is not going to be the case.
      if (is_not_entrant() && can_convert_to_zombie()) {
        return;
      }
    }
Contributor

Can't we change state from alive -> not_entrant here? Should this have a mark_as_seen_on_stack somewhere?

Contributor Author

not_entrant nmethods are also implicitly is_alive(). That's why we need to check is_not_entrant() in addition to is_alive(): to see whether we are in a scenario where, after we release the locks, the nmethod may racingly flip to zombie. That only happens for not_entrant nmethods, and we filter out the ones it can happen to by checking can_convert_to_zombie() on them.
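
As a rough illustration of which states end up filtered, here is a compile-time sketch. It uses a simplified, hypothetical state model: `NmState`, `alive`, and `posts_event` are names made up for this example, real nmethods have more states, and the dead case is assumed to be filtered by a separate is_alive() check.

```cpp
// Simplified, hypothetical state model; real nmethods have more states.
enum class NmState { in_use, not_entrant, zombie };

// In this model, is_alive() covers both in_use and not_entrant.
constexpr bool alive(NmState s) { return s != NmState::zombie; }

// True when the event would be posted for an nmethod in state `s`;
// `convertible` stands in for can_convert_to_zombie().
constexpr bool posts_event(NmState s, bool convertible) {
  return alive(s) && !(s == NmState::not_entrant && convertible);
}

static_assert(posts_event(NmState::in_use, false), "in_use is posted");
static_assert(posts_event(NmState::not_entrant, false), "stable not_entrant is posted");
static_assert(!posts_event(NmState::not_entrant, true), "soon-to-be zombie is skipped");
static_assert(!posts_event(NmState::zombie, false), "zombie is skipped");
```

Only a not_entrant nmethod that can convert to zombie is skipped while still alive; everything else that is alive is posted.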

Contributor

Oh duh, yes, we don't have to mark them as seen if they can be converted to zombie here.

Contributor

@coleenp coleenp left a comment

Thank you for fixing this!

@coleenp
Contributor

coleenp commented Jun 22, 2021

-      // Walk the CodeCache notifying for live nmethods, don't release the CodeCache_lock
-      // because the sweeper may be running concurrently.
+      // Walk the CodeCache notifying for live nmethods.

Could you remove this comment in jvmtiCodeBlobEvents.cpp also since it's wrong?

@fisk
Contributor Author

fisk commented Jun 22, 2021

Thanks Vladimir and Coleen, for the reviews.

// Walk the CodeCache notifying for live nmethods. We hold the CodeCache_lock
// to ensure the iteration is safe and nmethods are not concurrently freed.
// However, they may still change states and become !is_alive(). Filtering
// those out is done inside of nmethod::post_compiled_method_load_event().
Contributor

yes I like this.

@fisk
Contributor Author

fisk commented Jun 22, 2021

Thanks Nils for the review.

@fisk
Contributor Author

fisk commented Jun 22, 2021

/integrate

@openjdk

openjdk bot commented Jun 22, 2021

Going to push as commit 9ec7180.
Since your change was applied there have been 9 commits pushed to the master branch:

  • 01f12fb: 8266631: StandardJavaFileManager: getJavaFileObjects() impl violates the spec
  • 6b14c8a: 8267421: j.l.constant.DirectMethodHandleDesc.Kind.valueOf(int) implementation doesn't conform to the spec regarding REF_invokeInterface handling
  • ef4ba22: 8268349: Provide clear run-time warnings about Security Manager deprecation
  • 4099810: 8268293: VectorAPI cast operation on mask and shuffle is broken
  • e2d7ec3: 8267100: [BACKOUT] JDK-8196415 Disable SHA-1 Signed JARs
  • d3ad8cd: 8268672: C2: assert(!loop->is_member(u_loop)) failed: can be in outer loop or out of both loops only
  • f25e719: 8268717: Upstream: 8268673: Stack walk across optimized entry frame on fresh native thread fails
  • 22ebd19: 8268362: [REDO] C2 crash when compile negative Arrays.copyOf length after loop
  • f8df953: 8268702: JFR diagnostic commands lack argument descriptors when viewed using Platform MBean Server

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Jun 22, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 22, 2021
@openjdk

openjdk bot commented Jun 22, 2021

@fisk Pushed as commit 9ec7180.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.java.net integrated Pull request has been integrated
4 participants