-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8299426: Heap dump does not contain virtual Thread stack references #16665
Conversation
👋 Welcome back amenkov! A progress list of the required criteria for merging this PR into |
@alexmenkov The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are dumping all virtual threads found in the heap, even if they are unreachable. Although this would be true for platform threads also, there is a high likelihood of a lot of unreachable virtual threads in the heap, but not so much for platform threads.
Also, have you tested scalability? Not just a large number of live virtual threads (like a million), but also combined with a large number of unreachable virtual threads.
@plummercj said:
There is a check for virtual thread liveness which has to be good enough in general:
|
writer()->flush(); | ||
|
||
// At this point, all fragments of the heapdump have been written to separate files. | ||
// We need to merge them into a complete heapdump and write HPROF_HEAP_DUMP_END at that time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how the code in VM_HeapDumper::work
function was re-arranged by removing some duplication and making more common vs conditional lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Nice piece of work!
static bool is_vm_dumper(int dumper_id) { return dumper_id == VMDumperId; } | ||
// the 1st dumper calling get_next_dumper_id becomes VM dumper | ||
int get_next_dumper_id() { | ||
return Atomic::fetch_then_add(&_dump_seq, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This can be used instead:
2196 return Atomic::inc(&_dump_seq);
@alexmenkov This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 373 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
That should in general take care of most unreachable virtual threads, but technically I don't think a virtual thread has to reach the TERMINATED state in order to become unreachable. However, it will never get scheduled. |
Agreed. |
I'm not sure I understand the scenario. The state is set to TERMINATED after the thread completes its execution. |
I wasn't thinking in terms of the scheduler somehow no longer references the virtual thread, but instead the program no longer referencing the scheduler (and also not referencing the virtual thread). |
AFAIU unfinished unmounted virtual threads are referenced from other objects (they are parked on), so they can't be unreachable even is the application is not referencing them and the scheduler. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern I have is when there is a large number of virtual threads, the dump may take too long and the hprof file gets bloated. My concern mostly comes from the large number of virtual thread stack traces that will be present. Dumping all these hprof records related to unmounted virtual threads could do more harm than good in some instances, and we may want a way for the user to disable it.
It would be nice if there could be some data sharing between threads with identical stack traces, but I don't see a way to do that with the current hprof spec.
My understanding that information about references is one of the most important things for dump analysis (and that's what the issue about). So we cannot avoid stack unwinding for unmounted virtual threads.
Hprof spec says nothing about 1:1 relation between threads and stack traces, so theoretically several HPROF_GC_ROOT_THREAD_OBJ subrecords may refer to the same stack trace, but search for identical stack traces may be expensive. |
Mailing list message from David Holmes on serviceability-dev: On 1/12/2023 2:08 pm, Alex Menkov wrote:
There is (or was - there may be a property that affects this: David |
Mailing list message from Alan Bateman on serviceability-dev: On 04/12/2023 12:41, David Holmes wrote:
That's right, the door is not closed to introducing ephemeral threads in -Alan |
My concern was with the memory usage of the stack traces. Yes I agree that including all referenced objects in the dump is important. An option just to leave out the stack traces seems like a good idea.
I was thinking initially that you couldn't do this because each stack has it's own unique set of locals that are referenced, and the locals were part of the stack trace, but they are not. There is instead a HPROF_GC_ROOT_JAVA_FRAME record for each local reference. It does include the ThreadID, and we could probably get away with multiple Thread records referring to the same stack trace. |
VM_HeapDumper caches stack traces for platform/carrier and mounted virtual threads only.
I think this possible improvement is out of scope for this PR |
I was actually referring to the footprint of the hprof file, not the in process memory usage while producing it. My concern with not doing the option to exclude stack traces now is that it could result in some unusable or unmanageably large heap dumps, or tools simply being overwhelmed by the number of threads. For example, I just looked at the VisualVM threads view, and it just produces a scrollable list of all threads. What happens if there are suddenly 10's of thousands if not millions of threads? If we are lucky is doesn't choke on them and the platform threads are first in the list, but this is the type of thing I'd like to see testing of before pushing this change. |
Mailing list message from Chris Plummer on serviceability-dev: On 12/4/23 5:20 AM, Alan Bateman wrote:
So does this mean if the application is no longer referencing the Chris |
Mailing list message from Alan Bateman on serviceability-dev: On 04/12/2023 18:59, Chris Plummer wrote:
Yes, this is possible. It would be an unusual scenario of course. -Alan |
Heap dumps are usually big. I think stack traces would not add much (comparing to the size of heapdump itself). So in my opinion the option to exclude stack traces doesn't make much difference. Tools should be ready to handle big number of threads. |
/integrate |
Going to push as commit 354ea4c.
Your commit was automatically rebased without conflicts. |
@alexmenkov Pushed as commit 354ea4c. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The change impelements dumping of unmounted virtual threads data (stack traces and stack references).
Unmounted vthreads can be detected only by iterating over the heap, but hprof stack trace records (HPROF_FRAME/HPROF_TRACE) should be written before HPROF_HEAP_DUMP/HPROF_HEAP_DUMP_SEGMENT.
HeapDumper supports segment dump (parallel dump to separate files with subsequent file merge outside of safepoint), the fix switches HeapDumper to always use segment dump: 1st segment contains only non-heap data, other segments are used for dumping heap objects. For serial dumping single-threaded dumping is performed, but 2 segments are created anyway.
When HeapObjectDumper detects unmounted virtual thread, it writes HPROF_FRAME/HPROF_TRACE records to the 1st segment ("global writer"), and writes thread object (HPROF_GC_ROOT_JAVA_FRAME) and stack references (HPROF_GC_ROOT_JAVA_FRAME/HPROF_GC_ROOT_JNI_LOCAL) to the HeapObjectDumper segment.
As parallel dumpers may write HPROF_FRAME/HPROF_TRACE concurrently and VMDumper needs to write non-heap data before heap object dumpers can write virtual threads data, writing to global writer is protected with DumperController::_global_writer_lock.
Testing: run tests which perform heap dump (in different scenarios):
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16665/head:pull/16665
$ git checkout pull/16665
Update a local copy of the PR:
$ git checkout pull/16665
$ git pull https://git.openjdk.org/jdk.git pull/16665/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16665
View PR using the GUI difftool:
$ git pr show -t 16665
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16665.diff
Webrev
Link to Webrev Comment