Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8322237: Heap dump contains duplicate thread records for mounted virtual threads #17134

Closed
wants to merge 5 commits into from

Conversation

alexmenkov
Copy link

@alexmenkov alexmenkov commented Dec 16, 2023

HeapDumper dumps virtual threads in 2 places:

  • dumping platform threads (mounted virtual threads are dumped as separate thread object);
  • dumping heap objects when the object is java.lang.VirtualThread.

In the 2nd case mounted virtual threads should be skipped (as they are already dumped with correct stack traces/stack references)
Check that a virtual thread is mounted is non-trivial, method from JvmtiEnvBase was used for this.

Testing: tier1..3, heapdump-related tests: open/test/hotspot/jtreg/serviceability,open/test/hotspot/jtreg/runtime/ErrorHandling,open/test/hotspot/jtreg/gc/epsilon,open/test/jdk/sun/tools/jhsdb


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8322237: Heap dump contains duplicate thread records for mounted virtual threads (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17134/head:pull/17134
$ git checkout pull/17134

Update a local copy of the PR:
$ git checkout pull/17134
$ git pull https://git.openjdk.org/jdk.git pull/17134/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17134

View PR using the GUI difftool:
$ git pr show -t 17134

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17134.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 16, 2023

👋 Welcome back amenkov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 16, 2023

@alexmenkov The following labels will be automatically applied to this pull request:

  • hotspot-runtime
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org labels Dec 16, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 18, 2023
@mlbridge
Copy link

mlbridge bot commented Dec 18, 2023

Webrevs

Comment on lines 1645 to 1647
static bool is_vthread_mounted(oop vt) {
return JvmtiEnvBase::get_JavaThread_or_null(vt) != nullptr;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem appropriate to couple this to the JVMTI code (can this code be present if JVMTI is not part of the build?). Doesn't the VT state give you a good enough approximation of whether it is mounted i.e. RUNNABLE?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I'll remove dependency on JVMTI.
I don't think approximation would be good here (comparing state to RUNNABLE/PINNED/TIMED_PINNED or comparing carrierThread with null).
It's racy and we have a chance to not dump unmounted vthread or dump mounted vthread twice.
Maybe is_vthread_mounted should check if the virtual thread continuation has non-empty chunk.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is racy then any solution is going to be racy. I assumed this was all happening at a global safepoint, otherwise threads could be mounting/unmounting at any time.

I said "approximation" only because I'm unsure exactly when the thread state gets updated in the mounting/unmounting process.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean race between virtual thread state change and the thread stack switch (to/from carrier).
Correct condition here is "dump the virtual thread if it was not dumped as mounted virtual thread".
Most likely for RUNNABLE/PINNED/TIMED_PINNED we can be sure the thread is mounted, but we can get vthread in transition (PARKING, YIELDING, etc). Virtual threads may be in transition at safepoints (but they can't change mounted/unmounted state).
So I think is_vthread_mounted can be implemented in 2 ways:

  1. copy logic of JvmtiEnvBase::get_JavaThread_or_null:
    get JavaThread for java_lang_VirtualThread::carrier_thread(vt);
    if not null, check Continuation::is_continuation_mounted(java_thread, java_lang_VirtualThread::continuation(vt)) - this is to handle transition, when vthread is already unmounted, but carrierThread is not yet set to null;
  2. check that java_lang_VirtualThread::continuation(vt) doesn't have non-empty chunk.
    AFAIU this is true for mounted vthreads. If we get it for unmounted vt, its stack trace of the thread is empty anyway, so it's ok to skip it (most likely it can happen only if thread state is NEW or TERMINATED, we already skip such vthreads).

@AlanBateman could you please comment if my understanding is correct

Copy link
Contributor

@sspitsyn sspitsyn Dec 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean race between virtual thread state change and the thread stack switch (to/from carrier).

I'm not sure if I understand you correctly or if we can call it a race. Alan will correct me if I'm wrong.
You are talking about thread state change. At least, mount state transition happens on the same JavaThread (it seems, you call it thread state switch). Mount state transition can go over a safepoint. But it should not progress while in a safepoint. David pointed out, "this was all happening at a global safepoint". My understanding is this assumption is correct. Then your approach to identify that a virtual thread is mounted or not should work in general. The condition java_lang_VirtualThread::carrier_thread(vt) != nullptr should indicate that the vt is mounted or is being in mount or unmount transition.
If the vt is in mount or unmount transition then (it is a gray zone) the way we identify mounted state should match the way we did it when dumped mounted virtual threads.
It is done this way: oop mounted_vt = thread->is_vthread_mounted() ? thread->vthread() : nullptr;
So, it seems any of yous suggestion should work here. Though, it would be nice to simplify it a little if possible. Again, to be consistent, a vt in mount state transition just have to be identified as mounted or unmounted in both fragments in a similar way .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like "race" is wrong word here. There is no race between different threads, we just cannot rely on vt state or carrierThread value when the thread in mount/unmount transition.
Sorry for the confusion.
Serguei, thank you for the analysis. I agree, the code for mounted and unmounted vthreads should be consistent.
For unmounted threads we have to get JavaThread of the carrier thread and if it's not null, check java_thread->is_vthread_mounted().
We don't need to check is_continuation_mounted as we are at safepoint.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good to me know - thanks for the updates. One minor suggestion below.

src/hotspot/share/services/heapDumper.cpp Show resolved Hide resolved
@openjdk
Copy link

openjdk bot commented Dec 22, 2023

@alexmenkov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8322237: Heap dump contains duplicate thread records for mounted virtual threads

Reviewed-by: dholmes, sspitsyn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 71 new commits pushed to the master branch:

  • 4fc6b0f: 8068958: Timestamp.from(Instant) should throw when conversion is not possible
  • 28c82bf: 8322661: Build broken due to missing jvmtiExport.hpp after JDK-8320139
  • 7263e25: 8322490: cleanup CastNode construction
  • f695ca5: 8321151: JDK-8294427 breaks Windows L&F on all older Windows versions
  • 93fedc1: 8321802: (zipfs) Add validation of incorrect LOC signature in ZipFileSystem
  • 1230853: 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886
  • dce7a57: 8321683: Tests fail with AssertionError in RangeWithPageSize
  • c53f845: 8322539: Parallel: Remove duplicated methods in PSAdaptiveSizePolicy
  • 84c2379: 8320139: [JVMCI] VmObjectAlloc is not generated by intrinsics methods which allocate objects
  • 3b908c4: 8319795: Static huge pages are not used for CodeCache
  • ... and 61 more: https://git.openjdk.org/jdk/compare/c328f9589ddc3a981a2c63801bd991f8e593e69f...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 22, 2023
Copy link
Contributor

@sspitsyn sspitsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix looks good. Thank you for the update.
Also, added a minor comment.

@@ -1918,7 +1931,9 @@ void HeapObjectDumper::do_object(oop o) {
if (o->is_instance()) {
// create a HPROF_GC_INSTANCE record for each object
DumperSupport::dump_instance(writer(), o, &_class_cache);
if (java_lang_VirtualThread::is_instance(o) && ThreadDumper::should_dump_vthread(o)) {
// If we encounter an unmounted virtual thread it needs to be dumped explicitly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It is nice to have a comment here.
I'm thinking if it'd make sens to shortly explain why mounted threads are not dumped here.

Copy link
Contributor

@sspitsyn sspitsyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update!

@alexmenkov
Copy link
Author

/integrate

@openjdk
Copy link

openjdk bot commented Jan 9, 2024

Going to push as commit dd8ae61.
Since your change was applied there have been 186 commits pushed to the master branch:

  • ee98d26: 8323066: gc/g1/TestSkipRebuildRemsetPhase.java fails with 'Skipping Remembered Set Rebuild.' missing
  • 886386c: 8322890: Directly return in OldPLABSizeConstraintFunc
  • 438ab7c: 8323284: Remove unused FilteringClosure declaration
  • 52c7ff1: 8322330: JavadocHelperTest.java OOMEs with Parallel GC and ZGC
  • ff499ef: 8233443: G1 DetailedUsage class names overly generic for global namespace
  • 37a6172: 8322936: Update blessed-modifier-order.sh for default, sealed, and non-sealed
  • 7d42aa1: 8310277: jdk/javadoc/doclet/testMethodCommentAlgorithm/TestMethodCommentsAlgorithm.java fails with IllegalStateException
  • 6e9671a: 8323264: Serial: Remove unused GenerationBlockSizeClosure
  • 52a6c37: 8322858: compiler/c2/aarch64/TestFarJump.java fails on AArch64 due to unexpected PrintAssembly output
  • 075fed9: 8323241: jcmd manpage should use lists for argument lists
  • ... and 176 more: https://git.openjdk.org/jdk/compare/c328f9589ddc3a981a2c63801bd991f8e593e69f...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 9, 2024
@openjdk openjdk bot closed this Jan 9, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 9, 2024
@openjdk
Copy link

openjdk bot commented Jan 9, 2024

@alexmenkov Pushed as commit dd8ae61.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@alexmenkov
Copy link
Author

/backport jdk22

@openjdk
Copy link

openjdk bot commented Jan 11, 2024

@alexmenkov the backport was successfully created on the branch backport-alexmenkov-dd8ae616 in my personal fork of openjdk/jdk22. To create a pull request with this backport targeting openjdk/jdk22:master, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit dd8ae616 from the openjdk/jdk repository.

The commit being backported was authored by Alex Menkov on 9 Jan 2024 and was reviewed by David Holmes and Serguei Spitsyn.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk22:

$ git fetch https://github.com/openjdk-bots/jdk22.git backport-alexmenkov-dd8ae616:backport-alexmenkov-dd8ae616
$ git checkout backport-alexmenkov-dd8ae616
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk22.git backport-alexmenkov-dd8ae616

@alexmenkov alexmenkov deleted the heapdump_dup_vt branch January 11, 2024 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

3 participants