Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8297186: G1 triggers unnecessary full GCs when heap utilization is low #11209

Conversation

tschatzl
Copy link
Contributor

@tschatzl tschatzl commented Nov 17, 2022

Hi all,

can I have reviews for this change that better enforces that after gc there is at least one eden region to allocate into?

That avoids the problem described in the PR.

Thanks,
Thomas


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8297186: G1 triggers unnecessary full GCs when heap utilization is low

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11209/head:pull/11209
$ git checkout pull/11209

Update a local copy of the PR:
$ git checkout pull/11209
$ git pull https://git.openjdk.org/jdk pull/11209/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11209

View PR using the GUI difftool:
$ git pr show -t 11209

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11209.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 17, 2022

👋 Welcome back tschatzl! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8297186 8297186: G1 triggers unnecessary full GCs when heap utilization is low Nov 17, 2022
@openjdk
Copy link

openjdk bot commented Nov 17, 2022

@tschatzl The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Nov 17, 2022
@tschatzl tschatzl marked this pull request as ready for review November 17, 2022 17:17
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 17, 2022
@mlbridge
Copy link

mlbridge bot commented Nov 17, 2022

Webrevs

Copy link
Contributor

@caoman caoman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick fix!

I haven't fully understood JDK-8253413 yet, so better to have someone more familiar with JDK-8253413 to review this change.

src/hotspot/share/gc/g1/g1Policy.cpp Outdated Show resolved Hide resolved
test/hotspot/jtreg/gc/g1/TestOneEdenRegionAfterGC.java Outdated Show resolved Hide resolved
// Also request at least one eden region, see above for reasons.
uint desired_eden_length = MAX3(desired_eden_length_by_pause,
desired_eden_length_by_mmu,
MinDesiredEdenLength);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean G1Policy::calculate_young_desired_length() could return 0 instead of a minimum of 1 now? Comment on line 228 may need updating.

I noticed young_list_desired_length() is also used in G1Policy::update_ihop_prediction(). Could this change mess up with _ihop_control->update_allocation_info()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Line 285 clamps the value to min/max of the young gen sizer; minimum size returned by it is 1 region, see G1YoungSizer code. So the original code to ensure a minimum region size of one has already been superfluous in the original implementation.
Since it serves no purpose, I removed it. I added an assertion at the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick fix!

I haven't fully understood JDK-8253413 yet, so better to have someone more familiar with JDK-8253413 to review this change.

Feel free to ask if there is something you do not understand.

if (after_gc && (desired_young_length <= allocated_young_length)) {
log_trace(gc, ergo, heap)("Young target length: Desired young length smaller than allocated after GC. Allowing one extra eden region.");
desired_young_length = allocated_young_length + 1;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. The log message doesn't match the test being performed (log says "smaller" but
the test is for <=). And the action taken seems kind of mysterious. I think it's to force us into the else
clause of the following if statement. It might be more obvious if instead of this the if condition was
changed to (!after_gc && allocated_young_length >= desired_young_length), though then there is some
additional stuff to do in the else branch too. A comment instead would help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change forces the remaining code in calculate_young_target_length to add an extra region to the number of desired regions if the number of currently allocated regions is equal or exceeds them.
This override should only be used if the caller requested it, i.e. after a young gc to avoid an immediate full gc right away because the allocation that caused the gc could not be satisfied with zero eden regions.

E.g.
desired regions = 1 (because of very small heap size)
allocation regions = 1 (because of survivor regions from previous gc)

Now this leaves zero regions for eden for the next allocation, causing the described full gcs.
This code pretends to the actual target calculation (if after a gc) that we actually want one eden region.

As for the location of this if, I moved it to calculate_young_desired_length with a comment, since this code actually modifies the desired length. It's imho still a bit awkward there because it is kind of a tacked on condition, but I do not know a better place for this code (I've been fairly undecided about that in the original code alreayd).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the restructuring. I think I understand what this is doing now.

I think the new structure is better, though I think the comment on the place
where the desired_young_length adjustment occurs could be improved. I think
what's happening there is that when computing the target after a GC, since we
want at least one eden region, and since allocated_young_length at that point
is the number of survivor regions, the desired length must exceed the
allocated (survivor) length.

But with that in mind, I think the prior name for the new argument
("after_gc") is better than the new name ("need_one_eden_region"). It's not
so much that the caller wants at least one eden region, it's that the policy
is that there should be at least one eden region after a GC.

Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@openjdk
Copy link

openjdk bot commented Dec 5, 2022

@tschatzl This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8297186: G1 triggers unnecessary full GCs when heap utilization is low

Reviewed-by: kbarrett, sjohanss

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 303 new commits pushed to the master branch:

  • 923c746: 8298057: (fs) Remove PollingWatchService.POLLING_INIT_DELAY
  • 0bd04a6: 8297951: C2: Create skeleton predicates for all If nodes in loop predication
  • f5ad515: 8297247: Add GarbageCollectorMXBean for Remark and Cleanup pause time in G1
  • e975418: 8298102: Remove DirtyCardToOopClosure::_last_explicit_min_done
  • 04012c4: 8298111: Cleanups after UseMallocOnly removal
  • ee9ba74: 8295184: Printing messages with a RecordComponentElement does not include position
  • ba2d28e: 8298027: Remove SCCS id's from awt jtreg tests
  • 8d8a28f: 8296489: tools/jpackage/windows/WinL10nTest.java fails with timeout
  • 884b9ad: 8293453: tools/jpackage/share/AddLShortcutTest.java "Failed: Check the number of mismatched pixels [1024] of [1024] is < [0.100000] threshold"
  • 8af6e8a: 8298123: Problem List MaximizedToIconified.java on macOS
  • ... and 293 more: https://git.openjdk.org/jdk/compare/bd57e2138fc980822a149af905e572ab71ccbf11...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 5, 2022
Copy link
Contributor

@kstefanj kstefanj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

The change to LotsOfClasses.java is not obvious to me, did this change somehow alter the stability of that test?

@tschatzl
Copy link
Contributor Author

tschatzl commented Dec 6, 2022

Looks good.

The change to LotsOfClasses.java is not obvious to me, did this change somehow alter the stability of that test?

The test loads lots of classes (30000+) which causes many Metaspace-sizing induced gcs. This can cause heap fragmentation at the end of the heap so that CDS fails to get a contiguous amount of memory for taking the CDS snapshot.

E.g. in the failure case the heap looks like this (after the full gc the CDS generation executes); using random region numbers.

Region 0-24: old
Region 25-470: uncommitted
Region 471-480: free
Region 481-504: old
Region 505-512: free

Now the CDS archive is larger than these 8 regions at the end, and CDS generation fails in like 50% of cases.

The reason why the block at 481-504 exists, is because gc compacts young gen formerly from ~481-512 into these regions as old, and due to other gcs (mixed phase/remark) happening before, due to it observing MaxFreePercentage option, it uncommits that huge block so that that CDS compacting full gc won't simply move that data to the very front of the heap (i.e. to region 25 onwards).

That MaxMetaspaceSize option greatly cuts down on these non-essential metaspace-induced gcs that mess up the heap to make the test fail.

@tschatzl
Copy link
Contributor Author

tschatzl commented Dec 6, 2022

Thanks @kstefanj @kimbarrett for your reviews.

/integrate

@openjdk
Copy link

openjdk bot commented Dec 6, 2022

Going to push as commit a9e6c62.
Since your change was applied there have been 307 commits pushed to the master branch:

  • 4458de9: 8297172: Fix some issues of auto-vectorization of Long.bitCount/numberOfTrailingZeros/numberOfLeadingZeros()
  • a613998: 8297689: Fix incorrect result of Short.reverseBytes() call in loops
  • f8f4630: 8297963: Partially fix string expansion issues in UTIL_DEFUN_NAMED and related macros
  • 2a243a3: 8267617: Certificate's IP x509 NameConstraints raises ArrayIndexOutOfBoundsException
  • 923c746: 8298057: (fs) Remove PollingWatchService.POLLING_INIT_DELAY
  • 0bd04a6: 8297951: C2: Create skeleton predicates for all If nodes in loop predication
  • f5ad515: 8297247: Add GarbageCollectorMXBean for Remark and Cleanup pause time in G1
  • e975418: 8298102: Remove DirtyCardToOopClosure::_last_explicit_min_done
  • 04012c4: 8298111: Cleanups after UseMallocOnly removal
  • ee9ba74: 8295184: Printing messages with a RecordComponentElement does not include position
  • ... and 297 more: https://git.openjdk.org/jdk/compare/bd57e2138fc980822a149af905e572ab71ccbf11...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 6, 2022
@openjdk openjdk bot closed this Dec 6, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 6, 2022
@openjdk
Copy link

openjdk bot commented Dec 6, 2022

@tschatzl Pushed as commit a9e6c62.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tschatzl tschatzl deleted the submit/8297186-unnecessary-fullgc-on-small-heap branch December 6, 2022 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
5 participants