This repository has been archived by the owner on Sep 19, 2023. It is now read-only.

8289709: fatal error: stuck in JvmtiVTMSTransitionDisabler::disable_VTMS_transitions #129

Closed
wants to merge 4 commits into from

@sspitsyn sspitsyn commented Jul 9, 2022

This is a test bug. The test should filter out non-tested threads to avoid this kind of deadlock.

In short, the deadlock dependencies are:

  • The Common-Cleaner thread is executing the JVM TI agent's MethodEntry event callback, which has grabbed the agent_lock raw monitor and is calling JVM TI GetFrameCount. GetFrameCount is blocked while disabling VTMS transitions because ForkJoinPool-1-worker-2 is at a mount (VTMS) transition.
  • ForkJoinPool-1-worker-2 is at a mount (VTMS) transition and is blocked in java.lang.ref.NativeReferenceQueue.poll() while acquiring the NativeReferenceQueue lock, which is held by the Reference Handler thread.
  • The Reference Handler thread has grabbed the NativeReferenceQueue lock and is entering the signal() method, which triggered a JVM TI MethodEntry event. The JVM TI agent's MethodEntry event callback is blocked on grabbing the agent_lock raw monitor, which is held by the Common-Cleaner thread.
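
The three dependencies above form a cycle, which can be checked with a small wait-for graph. This is only an illustrative sketch, not code from the test: the `WaitForGraph` type and the `has_wait_cycle` and `reported_deadlock` helpers are hypothetical names, and each blocked thread is assumed to wait for exactly one other thread.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model: each blocked thread waits for exactly one other thread,
// so the wait-for graph is a map from the waiting thread to the thread it waits on.
using WaitForGraph = std::map<std::string, std::string>;

// Returns true if following the "waits for" edges from `start`
// eventually revisits a thread, i.e. the chain runs into a cycle.
bool has_wait_cycle(const WaitForGraph& graph, const std::string& start) {
  std::set<std::string> seen;
  std::string cur = start;
  while (seen.insert(cur).second) {
    auto it = graph.find(cur);
    if (it == graph.end()) {
      return false;  // cur is not blocked, so the chain ends here
    }
    cur = it->second;
  }
  return true;
}

// The dependencies described in the list above.
WaitForGraph reported_deadlock() {
  return {
    // GetFrameCount waits for the worker to leave its VTMS transition
    {"Common-Cleaner", "ForkJoinPool-1-worker-2"},
    // poll() waits for the NativeReferenceQueue lock
    {"ForkJoinPool-1-worker-2", "Reference Handler"},
    // the MethodEntry callback waits for agent_lock
    {"Reference Handler", "Common-Cleaner"},
  };
}
```

Calling `has_wait_cycle(reported_deadlock(), "Common-Cleaner")` reports the cycle. Erasing the "Reference Handler" entry, which models that thread never waiting on agent_lock, breaks it.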

Also, timeout=360 is explicitly set to avoid frequent timeouts in local runs.

Testing: submitted mach5 job with 100 runs on 3 debug platforms.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8289709: fatal error: stuck in JvmtiVTMSTransitionDisabler::disable_VTMS_transitions

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk19 pull/129/head:pull/129
$ git checkout pull/129

Update a local copy of the PR:
$ git checkout pull/129
$ git pull https://git.openjdk.org/jdk19 pull/129/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 129

View PR using the GUI difftool:
$ git pr show -t 129

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk19/pull/129.diff

bridgekeeper bot commented Jul 9, 2022

👋 Welcome back sspitsyn! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 9, 2022

openjdk bot commented Jul 9, 2022

@sspitsyn The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Jul 9, 2022

mlbridge bot commented Jul 9, 2022

Webrevs

@@ -69,7 +69,7 @@ void print_current_time() {
 }
 
 static
-int isTestThread(JNIEnv *jni, jvmtiEnv *jvmti, jthread thr) {
+bool isTestThread(JNIEnv *jni, jvmtiEnv *jvmti, jthread thr) {
   jvmtiThreadInfo inf;
   const char* TEST_THREAD_NAME_BASE = "Test Thread";
   check_jvmti_status(jni, jvmti->GetThreadInfo(thr, &inf), "Error in GetThreadInfo.");

In passing, should isTestThread deallocate inf.name to avoid leaking memory?

Author

Alan, thank you for looking at this fix.
Good suggestion.
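
The leak concern can be illustrated without JVM TI. This is a minimal sketch, not the test's code: `alloc_name`, `release_name`, `is_test_thread_name` and the `live_allocations` counter are hypothetical stand-ins. In the real agent the name buffer comes from GetThreadInfo and is released with Deallocate.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical counter used only to make a leak observable in this sketch.
static int live_allocations = 0;

// Stand-in for the buffer that GetThreadInfo allocates for inf.name.
char* alloc_name(const char* src) {
  char* copy = static_cast<char*>(std::malloc(std::strlen(src) + 1));
  assert(copy != nullptr);
  std::strcpy(copy, src);
  ++live_allocations;
  return copy;
}

// Stand-in for jvmti->Deallocate on the name buffer.
void release_name(char* name) {
  std::free(name);
  --live_allocations;
}

// Prefix match against the test thread name base. The result is computed
// first and the buffer is released before returning, so no path leaks it.
bool is_test_thread_name(char* name) {
  const char* TEST_THREAD_NAME_BASE = "Test Thread";
  bool result = std::strncmp(name, TEST_THREAD_NAME_BASE,
                             std::strlen(TEST_THREAD_NAME_BASE)) == 0;
  release_name(name);
  return result;
}
```

Storing the comparison in a local before releasing the buffer is what lets the function free the name on every path and still return the match result.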

@@ -213,23 +215,25 @@ void JNICALL FramePop(jvmtiEnv *jvmti, JNIEnv *jni,
jthread thr, jmethodID method, jboolean wasPopedByException) {
jint frameCount;

RawMonitorLocker rml(jvmti, jni, agent_lock);
if (!isTestThread(jni, jvmti, thr)) {
return; // not a tested thread

If I read the test correctly, NotifyFramePop will only be called with the test thread so therefore the FramePop callback should only be called by the test thread, is that correct? I'm asking because I'm wondering why FramePop also checks the thread is the test thread.

Author

As I understand it, this is a kind of double-check in this test.
I agree, we can simplify this code a little bit.
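
For context, the thread filter in an event callback can be sketched without JVM TI. The sketch assumes the thread check runs before the lock is taken, which the flattened diff above does not confirm. `agent_lock_standin`, `has_test_prefix`, `on_event` and `locked_callbacks` are hypothetical names, and `std::mutex` stands in for the agent_lock raw monitor.

```cpp
#include <cstring>
#include <mutex>

// Hypothetical stand-in for the agent_lock raw monitor.
static std::mutex agent_lock_standin;
// Counts how many callbacks ran their body under the lock.
static int locked_callbacks = 0;

// Same prefix rule as the test's thread name check.
bool has_test_prefix(const char* name) {
  const char* base = "Test Thread";
  return std::strncmp(name, base, std::strlen(base)) == 0;
}

// Sketch of an event callback: non-tested threads return before the lock
// is taken (an assumption), so they neither hold it nor block on it.
bool on_event(const char* thread_name) {
  if (!has_test_prefix(thread_name)) {
    return false;  // not a tested thread
  }
  std::lock_guard<std::mutex> guard(agent_lock_standin);
  ++locked_callbacks;
  return true;
}
```

With this ordering, a call such as `on_event("Reference Handler")` returns without touching the lock, while `on_event("Test Thread 0")` runs its body under it.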

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jul 10, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 10, 2022

openjdk bot commented Jul 10, 2022

@sspitsyn This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8289709: fatal error: stuck in JvmtiVTMSTransitionDisabler::disable_VTMS_transitions

Reviewed-by: alanb, amenkov, lmesnik

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 7 new commits pushed to the master branch:

  • 39715f3: 8287902: UnreadableRB case in MissingResourceCauseTest is not working reliably on Windows
  • 62fbc3f: 8287379: Using @inheritdoc in an inapplicable context shouldn't crash javadoc
  • fed3af8: 8287809: Revisit implementation of memory session
  • cb6e9cb: 8290004: [PPC64] JfrGetCallTrace: assert(_pc != nullptr) failed: must have PC
  • 0494291: 8289692: JFR: Thread checkpoint no longer enforce mutual exclusion post Loom integration
  • 25f4b04: 8289894: A NullPointerException thrown from guard expression
  • b542bcb: 8289729: G1: Incorrect verification logic in G1ConcurrentMark::clear_next_bitmap

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 10, 2022

@alexmenkov alexmenkov left a comment

Looks good

Member

@lmesnik lmesnik left a comment

Looks good, assumint

return strncmp(inf.name, TEST_THREAD_NAME_BASE, strlen(TEST_THREAD_NAME_BASE)) == 0;

bool result = strncmp(inf.name, TEST_THREAD_NAME_BASE, strlen(TEST_THREAD_NAME_BASE)) == 0;
jvmti->Deallocate((unsigned char *)inf.name);
Member

Optional: we have a 'deallocate' method in './jdk/test/lib/jvmti/jvmti_common.h' which checks the error status.

Author

Leonid, thank you for the review. Yes, I know about it, but there are other places that call Deallocate directly. I decided to keep it consistent and not touch the other spots, to avoid unnecessary risk.

@sspitsyn
Author

Alan and Alex, thank you for the reviews!

@sspitsyn
Author

/integrate


openjdk bot commented Jul 11, 2022

Going to push as commit c3806b9.
Since your change was applied there have been 7 commits pushed to the master branch:

  • 39715f3: 8287902: UnreadableRB case in MissingResourceCauseTest is not working reliably on Windows
  • 62fbc3f: 8287379: Using @inheritdoc in an inapplicable context shouldn't crash javadoc
  • fed3af8: 8287809: Revisit implementation of memory session
  • cb6e9cb: 8290004: [PPC64] JfrGetCallTrace: assert(_pc != nullptr) failed: must have PC
  • 0494291: 8289692: JFR: Thread checkpoint no longer enforce mutual exclusion post Loom integration
  • 25f4b04: 8289894: A NullPointerException thrown from guard expression
  • b542bcb: 8289729: G1: Incorrect verification logic in G1ConcurrentMark::clear_next_bitmap

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 11, 2022
@openjdk openjdk bot closed this Jul 11, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 11, 2022

openjdk bot commented Jul 11, 2022

@sspitsyn Pushed as commit c3806b9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
4 participants