8319876: Reduce memory consumption of VM_ThreadDump::doit #16598

Closed
wants to merge 3 commits into from

Conversation

yanglong1010
Contributor

@yanglong1010 yanglong1010 commented Nov 10, 2023

I would like to fix this.

Test setup: 4096 threads, each with a stack depth of 256.
After running jmx.dumpAllThreads(true, true), RSS reaches 5.3 GiB.
With this change, RSS is 250 MiB.

I would appreciate it if anyone could review this.


update

If the number of threads and the stack depth are both large, the VM has to allocate a lot of additional space in the ResourceArea while jmx.dumpAllThreads(true, true) executes.

The reason is that VM_ThreadDump::doit creates a vframe for each frame of each thread.

for (vframe* f = start_vf; f; f = f->sender() ) {

sizeof vframe is 4808 bytes and sizeof compiledVFrame is 4824 bytes, mainly because the xmm registers in RegisterMap are relatively large. Assuming 4096 threads with 256 frames each, the memory required is 4096 * 256 * 4824 bytes = 4.7 GiB.

The memory for all threads is released only once, by the initial ResourceMark of VM_ThreadDump::doit.
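The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the per-frame size is the compiledVFrame figure quoted in this description, not something measured by the code.

```cpp
#include <cstdint>

// Bytes needed if every frame of every thread gets its own vframe-sized
// object. frame_bytes is the per-frame size quoted above (4824 bytes).
constexpr std::uint64_t vframe_bytes(std::uint64_t threads,
                                     std::uint64_t frames_per_thread,
                                     std::uint64_t frame_bytes) {
  return threads * frames_per_thread * frame_bytes;
}

// Convert a byte count to GiB (2^30 bytes).
constexpr double to_gib(std::uint64_t bytes) {
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}

// 4096 threads * 256 frames * 4824 bytes = 5,058,330,624 bytes, about 4.71 GiB.
static_assert(vframe_bytes(4096, 256, 4824) == 5058330624ULL,
              "matches the estimate in the PR description");
```

The result is in line with the observed 5.3 GiB RSS, which also includes everything else the process has mapped.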


My solution is to add a ResourceMark for each thread.
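Why a per-thread mark helps can be illustrated with a toy model. The sketch below is not HotSpot code: ToyArena and ToyMark are hypothetical stand-ins that only count bytes, and peak_usage is an invented helper. It compares the peak arena usage with one outer mark against one extra mark per thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Toy stand-in for a resource arena: tracks bytes in use and the peak.
class ToyArena {
 public:
  void allocate(std::uint64_t bytes) {
    used_ += bytes;
    peak_ = std::max(peak_, used_);
  }
  std::uint64_t used() const { return used_; }
  std::uint64_t peak() const { return peak_; }
  void rollback_to(std::uint64_t mark) {
    assert(mark <= used_);
    used_ = mark;
  }

 private:
  std::uint64_t used_ = 0;
  std::uint64_t peak_ = 0;
};

// RAII mark: remembers the arena position and rolls back on scope exit.
class ToyMark {
 public:
  explicit ToyMark(ToyArena& arena) : arena_(arena), saved_(arena.used()) {}
  ~ToyMark() { arena_.rollback_to(saved_); }
  ToyMark(const ToyMark&) = delete;
  ToyMark& operator=(const ToyMark&) = delete;

 private:
  ToyArena& arena_;
  std::uint64_t saved_;
};

// Allocate one frame-sized object per frame, as the stack walk does.
inline void walk_stack(ToyArena& arena, std::uint64_t frames,
                       std::uint64_t frame_bytes) {
  for (std::uint64_t f = 0; f < frames; ++f) arena.allocate(frame_bytes);
}

// Peak arena usage for a whole dump, with or without a mark per thread.
inline std::uint64_t peak_usage(std::uint64_t threads, std::uint64_t frames,
                                std::uint64_t frame_bytes,
                                bool mark_per_thread) {
  ToyArena arena;
  ToyMark outer(arena);  // the single mark covering the whole dump
  for (std::uint64_t t = 0; t < threads; ++t) {
    if (mark_per_thread) {
      ToyMark inner(arena);  // released at the end of this iteration
      walk_stack(arena, frames, frame_bytes);
    } else {
      walk_stack(arena, frames, frame_bytes);
    }
  }
  return arena.peak();
}
```

In this model, peak_usage(4096, 256, 4824, false) is the full 5,058,330,624 bytes, while peak_usage(4096, 256, 4824, true) is one thread's worth, 256 * 4824 = 1,234,944 bytes. Real RSS will not drop that far, since the process holds much more than these objects.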


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8319876: Reduce memory consumption of VM_ThreadDump::doit (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16598/head:pull/16598
$ git checkout pull/16598

Update a local copy of the PR:
$ git checkout pull/16598
$ git pull https://git.openjdk.org/jdk.git pull/16598/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16598

View PR using the GUI difftool:
$ git pr show -t 16598

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16598.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Nov 10, 2023

👋 Welcome back yanglong1010! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 10, 2023
@openjdk

openjdk bot commented Nov 10, 2023

@yanglong1010 The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Nov 10, 2023
@mlbridge

mlbridge bot commented Nov 10, 2023

Webrevs

@shipilev
Member

Please describe the fix a bit better.

My reading: vframe-s are allocated in the resource arena during the stack walk, and we do not need them after the walk is completed. Safety: a) we do not persist the vframe-s after the walk, we only sometimes convert them to StackFrameInfo in ThreadStackTrace::add_stack_frame; b) all allocations on the protected path are done in C heap, or under internal ResourceMark-s (e.g. in the StackFrameInfo constructor).

Please run tier1 tier2 tier3 tier4 with fastdebug to catch any accidents.

Member

@dholmes-ora dholmes-ora left a comment

Yes please explain the problem and the fix better. It is far from clear that every ResourceArea allocation that is made from this code can be safely discarded when this method returns. Which objects are the ones consuming so much space, and which can be erased? Maybe we should be avoiding these allocations rather than cleaning them up more promptly. ??

src/hotspot/share/services/threadService.cpp Outdated
@yanglong1010
Contributor Author

hi Aleksey,
After VM_ThreadDump::doit completes, glibc (2.17) on my machine does not return the memory to the OS immediately. It takes a while for RSS to return to normal values.
More precisely, and most importantly: with 1024 threads and a stack depth of 1024 per thread, RSS reached 5.2 GiB just before the destructor of the ResourceMark in VM_ThreadDump::doit was invoked (details below).
I'm running tier1 to tier4 in my environment; it may take some time.

@yanglong1010
Contributor Author

hi David,

The situation we encountered is that in a Java process with a large number of threads, executing VM_ThreadDump::doit will cause Java to be killed by the Linux OOM Killer.

The reason is that VM_ThreadDump::doit creates a vframe for each frame of each thread. sizeof vframe is 4808 bytes and sizeof compiledVFrame is 4824 bytes.
RegisterMap._location of a vframe occupies 577 words, mainly because the xmm registers are relatively large.
Assuming 1024 threads with 1024 frames each, the memory required is 1024 * 1024 * 4824 bytes = 4.7 GiB.
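As a rough check of these numbers, the sketch below works out how much of a vframe the 577-word location array accounts for. The 8-byte word size is an assumption (a 64-bit VM), and the constant and function names are invented for illustration.

```cpp
#include <cstdint>

constexpr std::uint64_t kWordBytes = 8;              // assumption: 64-bit VM
constexpr std::uint64_t kLocationWords = 577;        // figure quoted above
constexpr std::uint64_t kVFrameBytes = 4808;         // figure quoted above
constexpr std::uint64_t kCompiledVFrameBytes = 4824; // figure quoted above

// Size of the location array in bytes under the word-size assumption.
constexpr std::uint64_t location_bytes() {
  return kLocationWords * kWordBytes;
}

// Fraction of a vframe taken up by the location array.
constexpr double location_share() {
  return static_cast<double>(location_bytes()) /
         static_cast<double>(kVFrameBytes);
}

// Total bytes for a dump that keeps one compiledVFrame per frame alive.
constexpr std::uint64_t total_bytes(std::uint64_t threads,
                                    std::uint64_t frames_per_thread) {
  return threads * frames_per_thread * kCompiledVFrameBytes;
}

static_assert(location_bytes() == 4616, "577 words * 8 bytes");
static_assert(total_bytes(1024, 1024) == 5058330624ULL, "about 4.7 GiB");
```

Under that assumption the location array is 4616 of the 4808 bytes, roughly 96% of each vframe.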

I saw that in StackFrameInfo::StackFrameInfo(javaVFrame* jvf, bool with_lock_info), Method*, bci, and MonitorInfo._owner of javaVFrame have been copied to CHeap, so I feel that ResourceMark can be released.
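The safety argument, copy what must survive to the C heap and then release the arena, can be sketched with a toy bump allocator. Everything here is hypothetical (ToyArena, ToyFrame, ToyFrameInfo, dump_one_thread, the "Worker.run" name); it only mirrors the shape of the reasoning and is not how StackFrameInfo is implemented. The rollback overwrites the released bytes, similar in spirit to what debug builds do, so anything still pointing into the arena would be visibly corrupted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>
#include <string>
#include <vector>

// Toy bump allocator standing in for a resource area.
class ToyArena {
 public:
  explicit ToyArena(std::size_t capacity) : buf_(capacity) {}

  void* allocate(std::size_t bytes) {
    const std::size_t align = alignof(std::max_align_t);
    const std::size_t start = (top_ + align - 1) / align * align;
    assert(start + bytes <= buf_.size() && "toy arena exhausted");
    top_ = start + bytes;
    return buf_.data() + start;
  }

  std::size_t top() const { return top_; }

  // Release everything allocated after `mark` and overwrite it, so stale
  // pointers into the released region are easy to notice.
  void rollback_to(std::size_t mark) {
    assert(mark <= top_);
    std::memset(buf_.data() + mark, 0xAB, top_ - mark);
    top_ = mark;
  }

 private:
  std::vector<unsigned char> buf_;
  std::size_t top_ = 0;
};

// Large, short-lived walker object living in the arena.
struct ToyFrame {
  const char* method;
  int bci;
  unsigned char register_map[4616];
};

// Small, long-lived record that owns its data on the C++ heap.
struct ToyFrameInfo {
  std::string method;
  int bci;
};

// Walk `depth` frames, copy the fields we need, then release the arena.
std::vector<ToyFrameInfo> dump_one_thread(ToyArena& arena, int depth) {
  std::vector<ToyFrameInfo> infos;
  const std::size_t mark = arena.top();
  for (int i = 0; i < depth; ++i) {
    ToyFrame* f = new (arena.allocate(sizeof(ToyFrame)))
        ToyFrame{"Worker.run", i, {}};
    infos.push_back(ToyFrameInfo{f->method, f->bci});  // heap copy
  }
  arena.rollback_to(mark);  // frames are gone; infos remain valid
  return infos;
}
```

After dump_one_thread returns, the arena is back at its mark and the returned records are still intact, because nothing in them refers to arena memory. Whether every allocation on the real code path has this property is exactly what the testing requested in this thread is meant to catch.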

@tstuefe
Member

tstuefe commented Nov 13, 2023

@yanglong1010

ThreadStackTrace::_jni_locked_monitors and ThreadStackTrace::_frames are both GrowableArrays, and IIUC their backing memory lives in RA as well. At least _frames will get added to inside your ResourceMark via add_stack_frame.

Are we sure adding an RM at that place is okay?

BTW, should be obvious but if you run tests, pls make sure to test debug builds to get Arena zapping on chopped arena chunks.

@stefank
Member

stefank commented Nov 13, 2023

From what I can see these two are CHeap-allocated:

ThreadStackTrace::ThreadStackTrace(JavaThread* t, bool with_locked_monitors) {
  _thread = t;
  _frames = new (mtServiceability) GrowableArray<StackFrameInfo*>(INITIAL_ARRAY_SIZE, mtServiceability);
  _depth = 0;
  _with_locked_monitors = with_locked_monitors;
  if (_with_locked_monitors) {
    _jni_locked_monitors = new (mtServiceability) GrowableArray<OopHandle>(INITIAL_ARRAY_SIZE, mtServiceability);
  } else {
    _jni_locked_monitors = nullptr;
  }
}

The MEMFLAGS indicate that they are CHeap-allocated.

@tstuefe
Member

tstuefe commented Nov 13, 2023

Wait, how does this work on expansion?

ThreadStackTrace::add_stack_frame -> GrowableArray::append -> ... ->

template <typename E, typename Derived>
void GrowableArrayWithAllocator<E, Derived>::expand_to(int new_capacity) {
...
  E* newData = static_cast<Derived*>(this)->allocate();

Calls GrowableArray::allocate() without MEMFLAGS, which calls GrowableArrayResourceAllocator::allocate

What am I missing?


Update, never mind, I see the big switch in GrowableArray::allocate now.
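The point that resolved the question is that expansion does not need to pass a memory tag, because the array consults what it recorded when it was constructed. A toy model of that idea follows. It is not the real GrowableArray: Backing, AllocCounts and ToyGrowableArray are invented names, and both paths use the C++ heap here, with only the counters showing which branch was taken.

```cpp
#include <algorithm>
#include <cassert>

enum class Backing { ResourceArea, CHeap };

// Counts which allocation path was taken (for illustration only).
struct AllocCounts {
  int arena = 0;
  int cheap = 0;
};

class ToyGrowableArray {
 public:
  ToyGrowableArray(Backing backing, int capacity, AllocCounts& counts)
      : backing_(backing), counts_(counts), capacity_(capacity) {
    assert(capacity > 0);
    data_ = allocate(capacity_);
  }
  ~ToyGrowableArray() { delete[] data_; }
  ToyGrowableArray(const ToyGrowableArray&) = delete;
  ToyGrowableArray& operator=(const ToyGrowableArray&) = delete;

  void append(int value) {
    if (length_ == capacity_) expand_to(capacity_ * 2);
    data_[length_++] = value;
  }
  int length() const { return length_; }
  int at(int i) const {
    assert(0 <= i && i < length_);
    return data_[i];
  }

 private:
  // The dispatch point: no tag is passed in; the choice made at
  // construction decides where the new storage comes from.
  int* allocate(int n) {
    switch (backing_) {
      case Backing::CHeap:        ++counts_.cheap; break;
      case Backing::ResourceArea: ++counts_.arena; break;
    }
    return new int[n];  // toy: same storage either way, only counts differ
  }

  void expand_to(int new_capacity) {
    int* bigger = allocate(new_capacity);
    std::copy(data_, data_ + length_, bigger);
    delete[] data_;
    data_ = bigger;
    capacity_ = new_capacity;
  }

  Backing backing_;
  AllocCounts& counts_;
  int capacity_;
  int length_ = 0;
  int* data_ = nullptr;
};
```

An array constructed with Backing::CHeap keeps taking the CHeap branch on every expansion, which is the property that makes appending to _frames inside the new per-thread ResourceMark safe in this model.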

@dcubed-ojdk
Member

/label add serviceability

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Nov 13, 2023
@openjdk

openjdk bot commented Nov 13, 2023

@dcubed-ojdk
The serviceability label was successfully added.

@yanglong1010
Contributor Author

Hi
I ran tier1, tier2, tier3, and tier4 on my host machine.
tier1, tier2, and tier3 all passed.
Because my host does not have a display device, I added export JTREG_KEYWORDS="!headful" before running tier4.
In tier4, some tests that depend on a printing device failed; the rest passed.

@dholmes-ora
Member

Okay I'm running this through our CI testing as well. I can't see any easy way to check for escaping resource objects other than testing.

@dholmes-ora
Member

Our testing passed tiers 1-5. Thanks

@openjdk

openjdk bot commented Nov 17, 2023

⚠️ @yanglong1010 the full name on your profile does not match the author name in this pull requests' HEAD commit. If this pull request gets integrated then the author name from this pull requests' HEAD commit will be used for the resulting commit. If you wish to push a new commit with a different author name, then please run the following commands in a local repository of your personal fork:

$ git checkout 8319876
$ git commit --author='Preferred Full Name <you@example.com>' --allow-empty -m 'Update full name'
$ git push

@openjdk

openjdk bot commented Nov 17, 2023

@yanglong1010 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8319876: Reduce memory consumption of VM_ThreadDump::doit

Reviewed-by: dholmes, stefank

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 82 new commits pushed to the master branch:

  • bbf52e0: 8319897: Move StackWatermark handling out of LockStack::contains
  • 129c470: 8311932: Suboptimal compiled code of nested loop over memory segment
  • 369bbec: 8319896: Remove monitor deflation from final audit
  • 1588dd9: 8319567: Update java/lang/invoke tests to support vm flags
  • 9727f4b: 8320199: Fix HTML 5 errors in java.math.BigInteger
  • d6aa7c8: 8314621: ClassNotFoundException due to lambda reference to elided anonymous inner class
  • 52e2878: 8319987: compilation of sealed classes leads to infinite recursion
  • b05e69f: 8320209: VectorMaskGen clobbers rflags on x86_64
  • f3ed275: 8319103: Popups that request focus are not shown on Linux with Wayland
  • 9e7a3ae: 8319630: Monitor final audit log lacks separator
  • ... and 72 more: https://git.openjdk.org/jdk/compare/9cce9fe06780aa095b3aabdfa421f376ca7bfd08...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @stefank) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 17, 2023
@yanglong1010
Contributor Author

@dholmes-ora Thanks, David.

@yanglong1010
Contributor Author

yanglong1010 commented Nov 17, 2023

Could I get another review?

@yanglong1010
Contributor Author

/integrate

@D-D-H
Contributor

D-D-H commented Nov 17, 2023

/sponsor

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Nov 17, 2023
@openjdk

openjdk bot commented Nov 17, 2023

@yanglong1010
Your change (at version b485290) is now ready to be sponsored by a Committer.

@openjdk

openjdk bot commented Nov 17, 2023

Going to push as commit 8ec6b8d.
Since your change was applied there have been 82 commits pushed to the master branch:

  • bbf52e0: 8319897: Move StackWatermark handling out of LockStack::contains
  • 129c470: 8311932: Suboptimal compiled code of nested loop over memory segment
  • 369bbec: 8319896: Remove monitor deflation from final audit
  • 1588dd9: 8319567: Update java/lang/invoke tests to support vm flags
  • 9727f4b: 8320199: Fix HTML 5 errors in java.math.BigInteger
  • d6aa7c8: 8314621: ClassNotFoundException due to lambda reference to elided anonymous inner class
  • 52e2878: 8319987: compilation of sealed classes leads to infinite recursion
  • b05e69f: 8320209: VectorMaskGen clobbers rflags on x86_64
  • f3ed275: 8319103: Popups that request focus are not shown on Linux with Wayland
  • 9e7a3ae: 8319630: Monitor final audit log lacks separator
  • ... and 72 more: https://git.openjdk.org/jdk/compare/9cce9fe06780aa095b3aabdfa421f376ca7bfd08...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 17, 2023
@openjdk openjdk bot closed this Nov 17, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Nov 17, 2023
@openjdk

openjdk bot commented Nov 17, 2023

@D-D-H @yanglong1010 Pushed as commit 8ec6b8d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@yanglong1010
Contributor Author

Thank you all ~

Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org

7 participants