JDK-8320859: gtest high malloc footprint caused by BufferNodeAllocator stress test #16845

@tstuefe tstuefe commented Nov 28, 2023

BufferNodeAllocatorTest.stress_free_list_allocator_vm is too expensive. On my box, this test accumulates ~1.5 GB of malloc footprint and raises the libc memory retention from 50 MB to 800 MB.

It is quite alone in its hunger. The rest of the tests together accumulate just ~50 MB of libc retention.

The test stress-tests the BufferNodeAllocator. Four "mutator threads" race four "gc threads". Mutator threads allocate buffers, GC threads release them. No processing is done on the buffers; this seems to be purely a test of the allocator and its freelist mechanism. The memory footprint of this test depends on the number of retained free buffers in the allocator. The allocator bulk-releases free buffers once a free-count threshold is reached. On my box, the number of free items in the unmodified stock VM is 100k..200k.
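
To make the thread roles concrete, here is a minimal sketch of the pattern described above: allocator ("mutator") threads obtain nodes, and processor ("gc") threads hand them back to a free list. Everything in it is hypothetical (`ToyFreeListAllocator`, `run_toy_stress`, the mutex-guarded free list, the fixed iteration count) and far simpler than HotSpot's BufferNodeAllocator, which is not reproduced here. It only illustrates how retained free nodes, multiplied by the node size, make up the footprint.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical toy allocator (NOT HotSpot's BufferNodeAllocator): released
// nodes are kept on a mutex-guarded free list and reused by allocate().
class ToyFreeListAllocator {
 public:
  explicit ToyFreeListAllocator(std::size_t node_bytes) : _node_bytes(node_bytes) {}
  ~ToyFreeListAllocator() {
    for (void* p : _free) std::free(p);
  }
  void* allocate() {
    {
      std::lock_guard<std::mutex> guard(_lock);
      if (!_free.empty()) {
        void* p = _free.back();
        _free.pop_back();
        return p;
      }
    }
    return std::malloc(_node_bytes);  // free list empty: fall back to libc
  }
  void release(void* p) {
    std::lock_guard<std::mutex> guard(_lock);
    _free.push_back(p);
    if (_free.size() > _peak_free) _peak_free = _free.size();
  }
  // Only read after all worker threads have been joined.
  std::size_t peak_free() const { return _peak_free; }

 private:
  std::size_t _node_bytes;
  std::mutex _lock;
  std::vector<void*> _free;
  std::size_t _peak_free = 0;
};

struct ToyStressResult {
  std::size_t allocated;
  std::size_t released;
  std::size_t peak_free;  // largest free-list length observed
};

// Allocator threads allocate nodes and queue them; processor threads drain
// the queue and release the nodes back to the allocator.
ToyStressResult run_toy_stress(unsigned num_allocator_threads,
                               unsigned num_processor_threads,
                               std::size_t allocs_per_thread,
                               std::size_t node_bytes) {
  assert(num_processor_threads > 0 && node_bytes > 0);
  ToyFreeListAllocator allocator(node_bytes);
  std::mutex queue_lock;
  std::vector<void*> in_use;
  std::atomic<unsigned> allocators_running{num_allocator_threads};
  std::atomic<std::size_t> allocated{0};
  std::atomic<std::size_t> released{0};

  std::vector<std::thread> threads;
  for (unsigned i = 0; i < num_allocator_threads; i++) {
    threads.emplace_back([&] {
      for (std::size_t n = 0; n < allocs_per_thread; n++) {
        void* p = allocator.allocate();
        if (p == nullptr) break;  // out of memory: stop this thread early
        allocated++;
        std::lock_guard<std::mutex> guard(queue_lock);
        in_use.push_back(p);
      }
      allocators_running--;
    });
  }
  for (unsigned i = 0; i < num_processor_threads; i++) {
    threads.emplace_back([&] {
      for (;;) {
        void* p = nullptr;
        bool done = false;
        {
          std::lock_guard<std::mutex> guard(queue_lock);
          // Check the flag before the queue so no late push is missed.
          done = (allocators_running.load() == 0);
          if (!in_use.empty()) {
            p = in_use.back();
            in_use.pop_back();
          }
        }
        if (p != nullptr) {
          allocator.release(p);
          released++;
        } else if (done) {
          break;
        } else {
          std::this_thread::yield();
        }
      }
    });
  }
  for (std::thread& t : threads) t.join();
  return ToyStressResult{allocated.load(), released.load(), allocator.peak_free()};
}
```

Calling `run_toy_stress(4, 6, 1000, 64)` runs four allocator threads against six processor threads with 64-byte nodes. The returned `peak_free` times the node size gives a rough upper bound on what this toy free list retains.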

It is not clear whether the fact that so many free list items exist indicates a problem with the allocator itself. It looks like the free list should be drained if there are more than 10 items in this list.

In any case, each buffer's capacity is 1024 * sizeof(void*) plus header, so 8 KB + x. With 100k..200k of those buffers retained, that explains the NMT-reported malloc footprint of ~1..2 GB. An easy fix is to reduce the size of this buffer; since the buffers are not processed, their size should not matter.
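
As a sanity check on that estimate, the arithmetic can be written out. This sketch assumes a 64-bit platform (8-byte pointers) and ignores the per-node header, whose size is not given here. The function names are illustrative and do not come from the test.

```cpp
#include <cstdint>

// Payload bytes of one buffer: slot count times slot size (header ignored).
constexpr std::uint64_t buffer_payload_bytes(std::uint64_t slots,
                                             std::uint64_t slot_bytes) {
  return slots * slot_bytes;
}

// Payload bytes retained by a given number of free buffers.
constexpr std::uint64_t retained_payload_bytes(std::uint64_t free_buffers,
                                               std::uint64_t slots,
                                               std::uint64_t slot_bytes) {
  return free_buffers * buffer_payload_bytes(slots, slot_bytes);
}

// 1024 slots of 8 bytes each is 8 KB per buffer.
static_assert(buffer_payload_bytes(1024, 8) == 8192, "8 KB per buffer");
// 100k free buffers: 819,200,000 bytes (about 0.8 GB).
static_assert(retained_payload_bytes(100000, 1024, 8) == 819200000ULL, "low end");
// 200k free buffers: 1,638,400,000 bytes (about 1.6 GB).
static_assert(retained_payload_bytes(200000, 1024, 8) == 1638400000ULL, "high end");
```

The low end comes out slightly under the quoted ~1 GB because headers and any libc bookkeeping overhead are not counted.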


The patch decreases the size per buffer node to cache line size. I chose cache line size out of the vague feeling that I don't want to cause false sharing and thereby degrade the test.

Interestingly enough, reducing the buffer size greatly increases the number of free items since it makes allocation cheaper. This also happens in release builds, so this is not us zapping stuff; my unproven assumption is that this is the libc just being slower when allocating 8K vs 64-byte blocks. But with smaller buffers, we spend more time in freelist management and less time in the libc, which is good for a stress test.

This patch also increases the number of "processor" threads vs "mutator" threads. Since allocation seems to have a speed edge over deallocation, this reduces the number of free items somewhat.

This reduces the malloc spike seen from this test from 1.5-2 GB to ~160 MB (release) and 3xx MB (debug). We could reduce it a lot more if we reduced the buffer size to its minimum (1 slot), but that would risk false sharing and may degrade test performance.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8320859: gtest high malloc footprint caused by BufferNodeAllocator stress test (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16845/head:pull/16845
$ git checkout pull/16845

Update a local copy of the PR:
$ git checkout pull/16845
$ git pull https://git.openjdk.org/jdk.git pull/16845/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16845

View PR using the GUI difftool:
$ git pr show -t 16845

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16845.diff

bridgekeeper bot commented Nov 28, 2023

👋 Welcome back stuefe! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Nov 28, 2023

@tstuefe The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Nov 28, 2023
@tstuefe tstuefe marked this pull request as ready for review November 28, 2023 11:17
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 28, 2023

mlbridge bot commented Nov 28, 2023

Webrevs

Comment on lines 192 to 193
const uint nthreads_mut = 4;
const uint nthreads_proc = 6;
Member

Naming them "num_processor_threads" and "num_allocator_threads" is more consistent with the corresponding types ProcessorThread and AllocatorThread, IMO.

Member Author

Okay

openjdk bot commented Nov 28, 2023

@tstuefe This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8320859: gtest high malloc footprint caused by BufferNodeAllocator stress test

Reviewed-by: ayang, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

  • 2fae07f: 8319311: JShell Process Builder should be configurable
  • 63ad868: 8319668: Fixup of jar filename typo in BadFactoryTest.sh
  • 4bcda60: 8319713: Parallel: Remove PSAdaptiveSizePolicy::should_full_GC
  • 99f870c: 8320781: Fix whitespace in j.l.Double and j.u.z.ZipInputStream @snippets
  • a5ccd3b: 8267532: C2: Profile and prune untaken exception handlers
  • 464dc3d: 8319633: runtime/posixSig/TestPosixSig.java intermittent timeouts on UNIX

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 28, 2023

@kimbarrett kimbarrett left a comment

Looks good.

tstuefe commented Nov 28, 2023

Thanks @kimbarrett and @albertnetymk. I will wait the obligatory 24 hrs before pushing tomorrow.

tstuefe commented Nov 29, 2023

/integrate

openjdk bot commented Nov 29, 2023

Going to push as commit abf2e49.
Since your change was applied there have been 32 commits pushed to the master branch:

  • 033cced: 8320368: Per-CPU optimization of Klass range reservation
  • 48960df: 8316734: URLEncoder should specify that replacement bytes will be used in case of coding error
  • 1594653: 8310644: Make panama memory segment close use async handshakes
  • 65dfcae: 8308399: Recommend --release when -source and -target are misused
  • 335f5db: 8320911: RISC-V: Enable hotspot/jtreg/compiler/intrinsics/chacha/TestChaCha20.java
  • 77d604a: 8319373: Serial: Refactor dirty cards scanning during Young GC
  • 38cfb22: 8318706: Implement JEP 423: Region Pinning for G1
  • e44d4b2: 8320858: Move jpackage tests to tier3
  • 5dcf3a5: 8320715: Improve the tests of test/hotspot/jtreg/compiler/intrinsics/float16
  • 78b6c2b: 8320898: exclude compiler/vectorapi/reshape/TestVectorReinterpret.java on ppc64(le) platforms
  • ... and 22 more: https://git.openjdk.org/jdk/compare/efc392259c64986bbbe880259e95b09058b9076a...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 29, 2023
@openjdk openjdk bot closed this Nov 29, 2023
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Nov 29, 2023

openjdk bot commented Nov 29, 2023

@tstuefe Pushed as commit abf2e49.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
