
8267703: runtime/cds/appcds/cacheObject/HeapFragmentationTest.java crashed with OutOfMemory #4225

Closed
wants to merge 3 commits

Conversation

kstefanj
Contributor

@kstefanj kstefanj commented May 27, 2021

Please review this change to lower fragmentation when heap usage is low.

Summary
The above test case fails because the G1 Full GC fails to compact the single used region below the needed threshold. In this case the region needs to be compacted below region index 400 to be able to fit the large array. The reason this fails is that the full GC uses many parallel workers, and if the system is under load it is possible that the worker finding the region to compact hasn't been able to claim any regions low enough in the heap to compact the live objects into.

To fix this we can add a third factor to the calculation of how many workers to use for a given compaction. So far we only look at keeping the waste down and at the default adaptive calculation. If we also consider how much of the heap is used, we get a lower worker count in cases like this, which makes the compaction much more likely to succeed. In this case it is guaranteed to succeed: there is a single used region, so there will be only one worker, and it will compact the region to the bottom of the heap.

Testing
Manual verification that this change causes these collections to use only a single worker. I'm also currently running performance regression testing to make sure this doesn't cause any big regressions.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8267703: runtime/cds/appcds/cacheObject/HeapFragmentationTest.java crashed with OutOfMemory

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4225/head:pull/4225
$ git checkout pull/4225

Update a local copy of the PR:
$ git checkout pull/4225
$ git pull https://git.openjdk.java.net/jdk pull/4225/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4225

View PR using the GUI difftool:
$ git pr show -t 4225

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4225.diff

@bridgekeeper

@bridgekeeper bridgekeeper bot commented May 27, 2021

👋 Welcome back sjohanss! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label May 27, 2021
@openjdk

@openjdk openjdk bot commented May 27, 2021

@kstefanj The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label May 27, 2021
@mlbridge

@mlbridge mlbridge bot commented May 27, 2021

Webrevs

@kstefanj
Contributor Author

@kstefanj kstefanj commented May 27, 2021

As an additional note, this will increase Full GC pause times in cases where few regions are used, so we might want to change the logic to require fewer regions per worker. The reason I used HeapSizePerGCThread was just that it is an existing flag already used for similar calculations. We could of course say that G1's policy is to not care about this and instead just make sure to limit the number of workers to the number of used regions. That would lower the risk of big regressions.

@kstefanj
Contributor Author

@kstefanj kstefanj commented Jun 1, 2021

I did some more analysis of the regression in GC time and found that the bulk of the regression was caused by not using all workers for clearing of bitmaps before and after the actual collection. To avoid this I now temporarily increase the number of workers when clearing the bitmap.

I also changed the limiting to only look at heap->num_used_regions(), as mentioned above.

Contributor

@tschatzl tschatzl left a comment

Not really requesting changes, just answering my comments.

worker_count, heap_waste_worker_limit, active_worker_limit, used_worker_limit);
worker_count = heap->workers()->update_active_workers(worker_count);
log_info(gc, task)("Using %u workers of %u for full compaction", worker_count, max_worker_count);
Contributor

@tschatzl tschatzl Jun 1, 2021


That's pre-existing, but this will change the number of active workers for the rest of the garbage collection. That made some sense previously, as G1FullCollector::calc_active_workers() typically was very aggressive, but now it may limit other phases a bit, particularly marking, which distributes work on a per-reference basis.
Overall it might not make much difference though, as we are talking about the case of a very lightly occupied heap.
I.e. some rough per-full-gc-phase worker count might be better, and might be derived easily too.

Contributor Author

@kstefanj kstefanj Jun 1, 2021


This was one of the reasons I went with using "just used regions" and skipping the requirement that each worker handle a set of regions. In most cases looking at used regions will not limit the workers much, and if it does, we don't have much work to do. I've done some benchmarking and not seen any significant regressions with this patch. The biggest problem was not using enough workers for the bitmap work.

Calculating workers per phase might be a good improvement to consider, but that would require some more refactoring.

Contributor

@tschatzl tschatzl Jun 2, 2021


Okay. If you think it would take too long, please file an issue for per-phase thread sizing in the full gc then (if there isn't one already).

Contributor Author

@kstefanj kstefanj Jun 2, 2021


I'll file an RFE.

@@ -94,10 +94,14 @@ uint G1FullCollector::calc_active_workers() {
uint current_active_workers = heap->workers()->active_workers();
uint active_worker_limit = WorkerPolicy::calc_active_workers(max_worker_count, current_active_workers, 0);

// Finally consider the amount of used regions.
uint used_worker_limit = MAX2(heap->num_used_regions(), 1u);
Contributor

@tschatzl tschatzl Jun 1, 2021


Should we actually end up here with zero used regions? It does not hurt, but seems superfluous.

Contributor Author

@kstefanj kstefanj Jun 1, 2021


No we should not, but I went with the super-safe option and added the MAX2(...). If you like, I can remove it :)

Contributor

@tschatzl tschatzl Jun 2, 2021


I would prefer to either remove it or assert it. It confused at least me more than it seemed to be worth.

Contributor Author

@kstefanj kstefanj Jun 2, 2021


I'll add an assert.

Turn safety MAX2(used, 1) into assert.
Contributor

@tschatzl tschatzl left a comment

Lgtm, thanks.

@openjdk

@openjdk openjdk bot commented Jun 2, 2021

@kstefanj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8267703: runtime/cds/appcds/cacheObject/HeapFragmentationTest.java crashed with OutOfMemory

Reviewed-by: tschatzl, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 136 new commits pushed to the master branch:

  • 7e41ca3: 8266957: SA has not followed JDK-8220587 and JDK-8224965
  • 6ff978a: 8267204: Expose access to underlying streams in Reporter
  • 76b54a1: 8263512: [macos_aarch64] issues with calling va_args functions from invoke_native
  • 4e6748c: 8267687: ModXNode::Ideal optimization is better than Parse::do_irem
  • 48dc72b: 8268272: Remove JDK-8264874 changes because Graal was removed.
  • 20b6312: 8268151: Vector API toShuffle optimization
  • 64ec8b3: 8212155: Race condition when posting dynamic_code_generated event leads to JVM crash
  • cd0678f: 8199318: add idempotent copy operation for Map.Entry
  • b27599b: 8268222: javax/xml/jaxp/unittest/transform/Bug6216226Test.java failed, cannot delete file
  • 59a539f: 8268129: LibraryLookup::ofDefault leaks symbols from loaded libraries
  • ... and 126 more: https://git.openjdk.java.net/jdk/compare/37bc4e2e3c2968d7419dae4f421755b6f7d06090...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jun 2, 2021
@kstefanj
Contributor Author

@kstefanj kstefanj commented Jun 3, 2021

@kimbarrett, are you good with the change as it is now?


@kimbarrett kimbarrett left a comment

Looks good.

@kstefanj
Contributor Author

@kstefanj kstefanj commented Jun 7, 2021

Thanks for the reviews @tschatzl and @kimbarrett.

/integrate

@openjdk openjdk bot closed this Jun 7, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jun 7, 2021
@openjdk

@openjdk openjdk bot commented Jun 7, 2021

@kstefanj Since your change was applied there have been 149 commits pushed to the master branch:

  • 2aeeeb4: 8268279: gc/shenandoah/compiler/TestLinkToNativeRBP.java fails after LibraryLookup is gone
  • b05fa02: 8267904: C2 crash when compile negative Arrays.copyOf length after loop
  • 95ddf7d: 8267839: trivial mem leak in numa
  • 52d88ee: 8268292: compiler/intrinsics/VectorizedMismatchTest.java fails with release VMs
  • 042f0bd: 8256465: [macos] Java frame and dialog presented full screen freeze application
  • 8abf36c: 8268289: build failure due to missing signed flag in x86 evcmpb instruction
  • b05c40c: 8266951: Partial in-lining for vectorized mismatch operation using AVX512 masked instructions
  • f768fbf: 8268286: ProblemList serviceability/sa/TestJmapCore.java on linux-aarch64 with ZGC
  • b2e9eb9: 8268087: Update documentation of the JPasswordField
  • 91f9adc: 8268139: CDS ArchiveBuilder may reference unloaded classes
  • ... and 139 more: https://git.openjdk.java.net/jdk/compare/37bc4e2e3c2968d7419dae4f421755b6f7d06090...master

Your commit was automatically rebased without conflicts.

Pushed as commit 204b492.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
