Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8327645: Serial heap dump should not consume double amount of disk space #18160

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

caoman
Copy link
Contributor

@caoman caoman commented Mar 8, 2024

Hi all,

Could anyone review this fix to make serial heap dump only write to a single file?
We highly appreciate the work in https://bugs.openjdk.org/browse/JDK-8306441. However, many of our customers still need to use serial heap dump, as they have limited disk space and need to dump to a network socket.

Tested:
Stress tested with a fastdebug build with tests in hotspot/jtreg/serviceability/dcmd/gc/HeapDump*.

-Man


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8327645: Serial heap dump should not consume double amount of disk space (Bug - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18160/head:pull/18160
$ git checkout pull/18160

Update a local copy of the PR:
$ git checkout pull/18160
$ git pull https://git.openjdk.org/jdk.git pull/18160/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18160

View PR using the GUI difftool:
$ git pr show -t 18160

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18160.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 8, 2024

👋 Welcome back manc! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 8, 2024
@openjdk
Copy link

openjdk bot commented Mar 8, 2024

@caoman The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Mar 8, 2024
@mlbridge
Copy link

mlbridge bot commented Mar 8, 2024

Webrevs

@openjdk
Copy link

openjdk bot commented Mar 8, 2024

@caoman Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@plummercj
Copy link
Contributor

/label add serviceability

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Mar 8, 2024
@openjdk
Copy link

openjdk bot commented Mar 8, 2024

@plummercj
The serviceability label was successfully added.

Copy link
Contributor

@jianglizhou jianglizhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seems reasonable and avoids some of the extra file overhead in the serial case.

@alexmenkov
Copy link

Heap dumper was switched to always use segments by JDK-8299426 / JDK-8321565.
I suppose your fix will cause broken heapdump if there are some unmounted virtual threads.
You can try to run test/hotspot/jtreg/serviceability/jvmti/vthread/HeapDump/VThreadInHeapDump.java

@caoman
Copy link
Contributor Author

caoman commented Mar 8, 2024

@alexmenkov Thank you for the quick feedback. I can reproduce the failure with VThreadInHeapDump.java and -XX:ActiveProcessorCount=1. The problem seems that the writer cannot write HPROF_FRAME and HPROF_TRACE records in the middle of dumping an HPROF_HEAP_DUMP/HPROF_HEAP_DUMP_SEGMENT, so it resorts to a separate global writer.

Do we really need to keep all the HPROF_FRAME and HPROF_TRACE records located together? If not, two possible solutions are:

  1. Keep track of all unmounted vthread oops, and dump their stack traces after finishing the HPROF_HEAP_DUMP_SEGMENT.
  2. Write an HPROF_HEAP_DUMP_END, dump the vthread's stack trace, then start another HPROF_HEAP_DUMP_SEGMENT.

First approach seems resulting in a more organized heap dump. In any case, we probably only need to do this for serial heap dump. Parallel heap dump could keep using the current approach.

@alexmenkov
Copy link

Right, global writer is used to write HPROF_FRAME/HPROF_TRACE records when unmounted VT is found in the heap.
The spec does not require HPROF_FRAME/HPROF_TRACE to be in the beginning. Main concern here is compatibility as we don't control external hprof tools.
Solution #2 looks better to me.
It does not require extra memory to track unmounted VT (we may have millions) and it can prevent compatibility problem if we first write HPROF_FRAME/HPROF_TRACE records and then HPROF_GC_ROOT_THREAD_OBJ and stack reference subrecords which refer to the HPROF_TRACE

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@bridgekeeper
Copy link

bridgekeeper bot commented May 7, 2024

@caoman This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@caoman caoman marked this pull request as draft May 7, 2024 23:50
@caoman
Copy link
Contributor Author

caoman commented May 7, 2024

Converted to draft until I have time to address the issue.

@openjdk openjdk bot removed the rfr Pull request is ready for review label May 7, 2024
@bridgekeeper
Copy link

bridgekeeper bot commented Jul 3, 2024

@caoman This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org serviceability serviceability-dev@openjdk.org
4 participants