8267703: runtime/cds/appcds/cacheObject/HeapFragmentationTest.java crashed with OutOfMemory #4225
Conversation
👋 Welcome back sjohanss! A progress list of the required criteria for merging this PR into …
Webrevs
As an additional note, this will cause a Full GC pause time increase for cases where few regions are used. So we might want to change the logic to require fewer regions per worker. The reason I used …
I did some more analysis of the regression in GC time and found that the bulk of the regression was caused by not using all workers for clearing the bitmaps before and after the actual collection. To avoid this I now temporarily increase the number of workers when clearing the bitmap. I also changed the limiting to only look at …
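A rough sketch of that idea (not the actual patch; clear_mark_bitmaps() is a hypothetical stand-in for the clearing work, and the exact placement inside G1FullCollector differs):

// Sketch only: temporarily raise the active worker count around the
// bitmap-clearing work, then restore the limited count used for compaction.
void clear_mark_bitmaps(); // hypothetical helper for the bitmap-clearing phase

void clear_bitmaps_with_all_workers(WorkGang* workers) {
  uint compaction_workers = workers->active_workers();       // limited by calc_active_workers()
  workers->update_active_workers(workers->total_workers());  // temporarily use every worker
  clear_mark_bitmaps();
  workers->update_active_workers(compaction_workers);        // restore the limited count
}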
Not really requesting changes, just answering my comments.
worker_count, heap_waste_worker_limit, active_worker_limit, used_worker_limit);
worker_count = heap->workers()->update_active_workers(worker_count);
log_info(gc, task)("Using %u workers of %u for full compaction", worker_count, max_worker_count);
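The first of these lines is the tail of a log_debug call from the patch; reconstructed from its arguments, the full statement presumably reads roughly as follows (the wording of the format string is an assumption, not copied from the source):

log_debug(gc, task)("Requesting %u active workers for full compaction (waste limited workers: %u, "
                    "adaptive workers: %u, used limited workers: %u)",
                    worker_count, heap_waste_worker_limit, active_worker_limit, used_worker_limit);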
That's pre-existing, but this will change the number of active workers for the rest of the garbage collection. That made some sense previously as G1FullCollector::calc_active_workers() typically was very aggressive, but now it may limit other phases a bit, particularly marking, which distributes work on a per-reference basis.
Overall it might not make much difference though, as we are talking about the case of a heap with very little occupancy.
I.e. some rough per-full-gc-phase worker sizing might be better, and it might be derived easily too.
This was one of the reasons I went with using "just used regions" and skipping the requirement that each worker handles a set of regions. In most cases looking at used regions will not limit the workers much, and if it does we don't have much work to do. I've done some benchmarking and not seen any significant regressions with this patch. The biggest problem was not using enough workers for the bitmap work.
Calculating workers per phase might be a good improvement to consider, but that would require some more refactoring.
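For illustration, such per-phase sizing could look roughly like the sketch below. The phase names match G1FullCollector's actual phase methods, but calc_marking_workers() and calc_compaction_workers() are hypothetical helpers, not existing HotSpot API:

// Hypothetical per-phase worker sizing for the G1 full GC (the RFE idea).
// Marking distributes work per reference and scales well, so it could use
// more workers than compaction, which this patch limits by used regions.
uint marking_workers    = calc_marking_workers();     // hypothetical helper
uint compaction_workers = calc_compaction_workers();  // hypothetical helper

workers->update_active_workers(marking_workers);
phase1_mark_live_objects();

workers->update_active_workers(compaction_workers);
phase2_prepare_compaction();
phase3_adjust_pointers();
phase4_do_compaction();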
Okay. If you think it would take too long, please file an issue for per-phase thread sizing in parallel gc then (if there isn't one already).
I'll file an RFE.
@@ -94,10 +94,14 @@ uint G1FullCollector::calc_active_workers() {
   uint current_active_workers = heap->workers()->active_workers();
   uint active_worker_limit = WorkerPolicy::calc_active_workers(max_worker_count, current_active_workers, 0);

+  // Finally consider the amount of used regions.
+  uint used_worker_limit = MAX2(heap->num_used_regions(), 1u);
Should we actually end up here with zero used regions? It does not hurt, but seems superfluous.
No we should not, but I went with the super-safe option and added the MAX2(...). If you like, I can remove it :)
I would prefer to either remove it or assert it. It confused at least me more than it seemed to be worth.
I'll add an assert.
Turn safety MAX2(used, 1) into assert.
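With that change the hunk above turns the clamp into an assert; roughly (sketch of the final form, assert message paraphrased rather than quoted):

// Finally consider the amount of used regions.
uint used_worker_limit = heap->num_used_regions();
assert(used_worker_limit > 0, "there should never be zero used regions here");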
Lgtm, thanks.
@kstefanj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 136 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the …
@kimbarrett, are you good with the change as it is now?
Looks good.
Thanks for the reviews @tschatzl and @kimbarrett
/integrate
@kstefanj Since your change was applied there have been 149 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 204b492. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
Please review this change to lower fragmentation when heap usage is low.
Summary
The above test case fails because the G1 Full GC fails to compact the single used region below the needed threshold. In this case the region needs to be compacted below region index 400 to be able to fit the large array. The reason this fails is that the full GC uses a lot of parallel workers, and if the system is under load it is possible that the worker that finds the region to compact hasn't been able to claim any regions low enough in the heap to compact the live objects into.
To fix this we can add a third input to the code calculating the number of workers to use for a given compaction. So far we only look at keeping the waste down and at the default adaptive calculations. If we also consider how much of the heap is used, we get a lower worker count in cases like this, which makes it much more likely that the compaction succeeds. In this case it is guaranteed to: there is a single used region, so there will be only one worker, and it will compact that region to the bottom of the heap.
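Put together, the patched worker calculation takes the minimum of the three limits. A simplified sketch of G1FullCollector::calc_active_workers() under these assumptions (not the verbatim source; the waste formula follows the identifiers visible in the hunks above):

uint G1FullCollector::calc_active_workers() {
  G1CollectedHeap* heap = G1CollectedHeap::heap();
  uint max_worker_count = heap->workers()->total_workers();

  // 1. Cap workers so expected compaction waste stays below
  //    G1HeapWastePercent (each worker wastes about half a region).
  uint max_wasted_regions_allowed = (heap->num_regions() * G1HeapWastePercent) / 100;
  uint heap_waste_worker_limit = MIN2(MAX2(max_wasted_regions_allowed * 2, 1u), max_worker_count);

  // 2. The default adaptive sizing from WorkerPolicy.
  uint current_active_workers = heap->workers()->active_workers();
  uint active_worker_limit = WorkerPolicy::calc_active_workers(max_worker_count, current_active_workers, 0);

  // 3. New in this patch: never use more workers than there are used
  //    regions, so a nearly empty heap is compacted by very few workers.
  uint used_worker_limit = heap->num_used_regions();

  // Use the lowest of the three limits.
  uint worker_count = MIN3(heap_waste_worker_limit, active_worker_limit, used_worker_limit);
  worker_count = heap->workers()->update_active_workers(worker_count);
  log_info(gc, task)("Using %u workers of %u for full compaction", worker_count, max_worker_count);
  return worker_count;
}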
Testing
Manual verification that this will cause these collections to only use a single worker. Also currently running some performance regression testing to make sure this doesn't cause any big regressions.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4225/head:pull/4225
$ git checkout pull/4225
Update a local copy of the PR:
$ git checkout pull/4225
$ git pull https://git.openjdk.java.net/jdk pull/4225/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4225
View PR using the GUI difftool:
$ git pr show -t 4225
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4225.diff