8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads #1891

Closed
wants to merge 2 commits into from

Conversation

Member

@dcubed-ojdk dcubed-ojdk commented Dec 24, 2020

A small robustness fix in ThreadsSMRSupport::print_info_on() to reduce the
likelihood of crashes during error reporting. Uses Threads_lock->try_lock()
for safety and restricts some reporting to when the Threads_lock has been
acquired (depends on JDK-8256383). Uses a ThreadsListHandle to make
the current ThreadsList safe for reporting (depends on JDK-8258284). Also
detects when the system ThreadsList (_java_thread_list) has changed and
will warn that some of the reported info may now be stale.

Two existing tests have been updated to reflect the use of a ThreadsListHandle
in ThreadsSMRSupport::print_info_on(). Mach5 Tier[1-6] testing has no regressions.
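The approach described above (try the lock without blocking, print details only when that is safe, then warn if the list changed) can be sketched in standalone C++. This is a minimal illustration, not the HotSpot code: `ThreadList`, `threads_lock`, `current_list` and this `print_info_on()` are hypothetical stand-ins, and the sketch assumes writers replace or free a list only while holding the lock.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are HotSpot types.
struct ThreadList {
  std::vector<std::string> names;
};

static std::mutex threads_lock;                         // plays the role of Threads_lock
static std::atomic<ThreadList*> current_list{nullptr};  // plays the role of _java_thread_list

// Writes a best-effort report to 'st' without ever blocking on the lock.
// Returns true if the list contents were printed.
// Assumption: writers replace or free a list only while holding threads_lock.
bool print_info_on(std::ostream& st) {
  // try_to_lock never blocks, which matters if the lock owner has crashed.
  std::unique_lock<std::mutex> guard(threads_lock, std::try_to_lock);

  ThreadList* saved = current_list.load();
  st << "list=" << static_cast<const void*>(saved) << '\n';

  const bool detailed = guard.owns_lock();
  if (detailed) {
    // Safe to follow the pointer: no writer can run while we hold the lock.
    if (saved != nullptr) {
      for (const std::string& name : saved->names) {
        st << "  thread " << name << '\n';
      }
    }
  } else {
    // Print only the pointer value; do not dereference it.
    st << "Skipping list contents for safety.\n";
  }

  // Compare pointer values only, so this check is safe without the lock.
  ThreadList* now = current_list.load();
  if (now != saved) {
    st << "The list has changed from " << static_cast<const void*>(saved)
       << " to " << static_cast<const void*>(now)
       << " so some of the above information may be stale.\n";
  }
  return detailed;
}
```

Calling `print_info_on()` with a `std::ostringstream` yields either the per-thread lines or the "Skipping" line, depending on whether the non-blocking lock attempt succeeded.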


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1891/head:pull/1891
$ git checkout pull/1891

…ava fails because error occurred during printing all threads
Member Author

@dcubed-ojdk dcubed-ojdk commented Dec 24, 2020

/label add hotspot-runtime serviceability


@bridgekeeper bridgekeeper bot commented Dec 24, 2020

👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.


@openjdk openjdk bot commented Dec 24, 2020

@dcubed-ojdk
The hotspot-runtime label was successfully added.

The serviceability label was successfully added.

@dcubed-ojdk dcubed-ojdk marked this pull request as ready for review Dec 24, 2020
@openjdk openjdk bot added the rfr label Dec 24, 2020
Member Author

@dcubed-ojdk dcubed-ojdk commented Dec 24, 2020

@dholmes-ora, @fisk and @robehn - You folks might be interested in this
Thread-SMR related robustness fix. It should help during error reporting.


@mlbridge mlbridge bot commented Dec 24, 2020

Webrevs

fisk approved these changes Dec 24, 2020
Contributor

@fisk fisk left a comment

Looks good. We have something similar in the precious GC log code during error reporting.


@openjdk openjdk bot commented Dec 24, 2020

@dcubed-ojdk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads

Reviewed-by: eosterlund, coleenp, pchilanomate, sspitsyn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 50 new commits pushed to the master branch:

  • 8b45497: 8259037: livenmethods cannot find hsdis library
  • 7d76966: 8255757: Javac emits duplicate pool entries on array::clone
  • cf9908b: 8258937: Remove JVM IgnoreRewrites flag
  • 4d3d599: 8259223: Simplify boolean expression in the SunJSSE provider
  • 1b60acd: 8259252: Shenandoah: Shenandoah build failed on AArch64 after JDK-8258459
  • 7ddc2b5: 8258852: Arrays.asList() for single item could be replaced with List.of()
  • 85bac8c: 8259021: SharedSecrets should avoid double racy reads from non-volatile fields
  • d5aa49d: 8259236: C2 compilation fails with assert(is_power_of_2(value)) failed: value must be a power of 2: 8000000000000000
  • 82bdbfd: 8258857: Zero: non-PCH release build fails after JDK-8258074
  • f4122d6: 8258896: Remove the JVM ForceFloatExceptions option
  • ... and 40 more: https://git.openjdk.java.net/jdk/compare/91244cc738e92163d99b8951c5d95b546447f341...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Dec 24, 2020
Member Author

@dcubed-ojdk dcubed-ojdk commented Dec 24, 2020

@fisk - Thanks for the review! And Merry Christmas Eve!!

Contributor

@fisk fisk commented Dec 24, 2020

@fisk - Thanks for the review! And Merry Christmas Eve!!

Merry Christmas to you too Dan!

Member Author

@dcubed-ojdk dcubed-ojdk commented Jan 4, 2021

Ping! I could use a second review here...

coleenp approved these changes Jan 4, 2021
Contributor

@coleenp coleenp left a comment

Looks good. One comment. Also, ha ha, Harold caught me today: you need to update the copyrights!

                         " so some of the above information may be stale.",
                         p2i(saved_threads_list), p2i(_java_thread_list));
          }
        }
        return;
      }
      st->print_cr("_java_thread_list_alloc_cnt=" UINT64_FORMAT ", "
Contributor

@coleenp coleenp Jan 4, 2021

Are these statistics stable if you don't have the Threads_lock? Seems like a good place to return unconditionally to me, but it's up to you whether this is wrong and it matters. It doesn't follow any pointers so doesn't look like it'll crash.

Member Author

@dcubed-ojdk dcubed-ojdk Jan 5, 2021

No, the statistics starting on L1161 are not stable if we don't
have the Threads_lock. That's why we detect a change in the
_java_thread_list on L1194 and print a message:

    if (_java_thread_list != saved_threads_list) {
      st->print_cr("The _java_thread_list has changed from " INTPTR_FORMAT
                   " to " INTPTR_FORMAT
                   " so some of the above information may be stale.",
                   p2i(saved_threads_list), p2i(_java_thread_list));
    }

Member

@iklam iklam commented Jan 4, 2021

Maybe the PR/JBS title should be shortened to "ThreadsListHandleInErrorHandlingTest.java fails in printing all threads" for conciseness and to make GitHub happy?

Contributor

@pchilano pchilano left a comment

Hi Dan,

Looks good to me!

      }

      if (_to_delete_list != NULL) {
        if (has_Threads_lock) {
Contributor

@pchilano pchilano Jan 4, 2021

Shouldn't we check for Threads_lock->owned_by_self() instead? (case where thread already owns Threads_lock before calling print_info_on()).

Member Author

@dcubed-ojdk dcubed-ojdk Jan 5, 2021

Good idea! That will cover one more "safe" case, but I'll want to rename
the new has_Threads_lock to something else.
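The combined check discussed here (treat the data as safe to walk if the caller already owns the lock, otherwise fall back to a non-blocking attempt) might look like the following standalone sketch. `OwnedMutex` and `ScopedReportLock` are hypothetical names for illustration only; HotSpot's `Mutex` tracks its owner itself, whereas `std::mutex` does not, so the sketch records the owner by hand.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that remembers which thread owns it.
class OwnedMutex {
 public:
  // Non-blocking acquire; must not be called by the current owner.
  bool try_lock() {
    if (!_mutex.try_lock()) return false;
    _owner.store(std::this_thread::get_id());
    return true;
  }
  void unlock() {
    _owner.store(std::thread::id());  // a default id means "no owner"
    _mutex.unlock();
  }
  bool owned_by_self() const {
    return _owner.load() == std::this_thread::get_id();
  }

 private:
  std::mutex _mutex;
  std::atomic<std::thread::id> _owner{std::thread::id()};
};

// Decides whether lock-protected data is safe to walk during reporting.
// Releases the lock on scope exit only if this object acquired it.
class ScopedReportLock {
 public:
  explicit ScopedReportLock(OwnedMutex& m) : _m(m) {
    if (_m.owned_by_self()) {
      _safe = true;              // caller already holds the lock
    } else if (_m.try_lock()) {
      _safe = true;              // acquired here without blocking
      _acquired = true;
    }
  }
  ~ScopedReportLock() {
    if (_acquired) _m.unlock();
  }
  ScopedReportLock(const ScopedReportLock&) = delete;
  ScopedReportLock& operator=(const ScopedReportLock&) = delete;

  bool safe() const { return _safe; }

 private:
  OwnedMutex& _m;
  bool _safe = false;
  bool _acquired = false;
};
```

Checking `owned_by_self()` first also avoids calling `try_lock()` on a `std::mutex` that the calling thread already holds, which the C++ standard leaves undefined.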

          }
        } else {
          st->print_cr("_to_delete_list=" INTPTR_FORMAT, p2i(_to_delete_list));
          st->print_cr("Skipping _to_delete_list fields and contents for safety.");
        }
      }
      if (!EnableThreadSMRStatistics) {
Contributor

@pchilano pchilano Jan 4, 2021

You could switch to "if (EnableThreadSMRStatistics)" instead and put this code at the end to avoid repetition. Also, I think the comparison with _java_thread_list could be done unconditionally at the end, since it's already racy anyway (even if the info was printed with the Threads_lock held, the list could have changed right after the lock is released and before returning).

Contributor

@sspitsyn sspitsyn Jan 5, 2021

I like the refactoring suggestion from Patricio above to switch to "if(EnableThreadSMRStatistics)". The code will be a little more elegant.

Member Author

@dcubed-ojdk dcubed-ojdk Jan 5, 2021

Done.
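The shape of the refactoring suggested in this thread can be sketched roughly as follows: the statistics become an optional block guarded by a positive test, and a single stale-list check runs unconditionally at the end instead of being repeated before an early return. This is an illustration only; `print_tail()` and its parameters are hypothetical and do not mirror the actual HotSpot function.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical tail of a reporting function.
// 'enable_statistics' plays the role of EnableThreadSMRStatistics.
void print_tail(std::ostream& st, bool enable_statistics,
                const void* saved_list, const void* current_list,
                unsigned long alloc_cnt) {
  // Optional block guarded by a positive test, so there is no early return.
  if (enable_statistics) {
    st << "alloc_cnt=" << alloc_cnt << '\n';
  }

  // Single, unconditional check on the only exit path. The comparison is
  // racy with or without the lock, so it is treated as a hint only.
  if (current_list != saved_list) {
    st << "The list has changed from " << saved_list << " to " << current_list
       << " so some of the above information may be stale.\n";
  }
}
```

With statistics disabled and an unchanged list pointer, the function prints nothing; with statistics enabled and a changed pointer, it prints both the counter and the warning.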

Contributor

@sspitsyn sspitsyn left a comment

Hi Dan,

It looks good modulo a couple of suggestions from Patricio.

Thanks,
Serguei

@dcubed-ojdk dcubed-ojdk changed the title 8231627: runtime/ErrorHandling/ThreadsListHandleInErrorHandlingTest.java fails because error occurred during printing all threads 8231627: ThreadsListHandleInErrorHandlingTest.java fails in printing all threads Jan 5, 2021
Member Author

@dcubed-ojdk dcubed-ojdk commented Jan 5, 2021

Made copyright changes based on comments from coleenp,
code changes based on comments from pchilano, and changed
the title based on comments from iklam.
Thanks to coleenp, pchilano and sspitsyn for the code reviews.
Re-testing now...

Member Author

@dcubed-ojdk dcubed-ojdk commented Jan 5, 2021

Passed local builds and testing on my MBP13. A Mach5 Tier1 has finished all
of the build tasks and is mostly done with the test tasks.

Please re-review when you get the chance.

coleenp approved these changes Jan 5, 2021
Contributor

@coleenp coleenp left a comment

LGTM!

Contributor

@pchilano pchilano commented Jan 6, 2021

Looks good, thanks Dan!

Member Author

@dcubed-ojdk dcubed-ojdk commented Jan 6, 2021

sspitsyn, coleenp, and pchilano - Thanks for the re-reviews!

My local Linux-X64 build and test went fine. My Mach5 Tier1 passed
without failures. Mach5 Tier[23] are still running.

Member Author

@dcubed-ojdk dcubed-ojdk commented Jan 6, 2021

/integrate

@openjdk openjdk bot closed this Jan 6, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jan 6, 2021

@openjdk openjdk bot commented Jan 6, 2021

@dcubed-ojdk Since your change was applied there have been 55 commits pushed to the master branch:

  • 7e01bc9: 8255264: Support for identifying the full range of IPv4 localhost addresses on Windows
  • 8a05d60: 8259042: Inconsistent use of general primitives loops
  • e3b9da1: 8259287: AbstractCompiler marks const in wrong position for is_c1/is_c2/is_jvmci
  • 32538b5: 8193942: Regression automated test '/open/test/jdk/javax/swing/JFrame/8175301/ScaledFrameBackgroundTest.java' fails
  • 52d3fee: 8258813: [TESTBUG] Fix incorrect Vector API test output message
  • 8b45497: 8259037: livenmethods cannot find hsdis library
  • 7d76966: 8255757: Javac emits duplicate pool entries on array::clone
  • cf9908b: 8258937: Remove JVM IgnoreRewrites flag
  • 4d3d599: 8259223: Simplify boolean expression in the SunJSSE provider
  • 1b60acd: 8259252: Shenandoah: Shenandoah build failed on AArch64 after JDK-8258459
  • ... and 45 more: https://git.openjdk.java.net/jdk/compare/91244cc738e92163d99b8951c5d95b546447f341...master

Your commit was automatically rebased without conflicts.

Pushed as commit c0540ff.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dcubed-ojdk dcubed-ojdk deleted the JDK-8231627 branch Jan 6, 2021