Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8268163: Change the order of fallback full GCs in G1 #4342

Closed
wants to merge 1 commit into from

Conversation

kstefanj
Copy link
Contributor

@kstefanj kstefanj commented Jun 3, 2021

Please review this change to make the order of G1 Full GCs a bit more straight forward.

Summary
In JDK-8233822 the way Full GCs were scheduled was changed a bit to support a use-case introduced by JDK-8202286 (which allowed the heap to have the old generation on an alternative memory device). This feature has been removed (JDK-8256181), but the Full GC scheduling has not been reverted. This can lead to situations where we do three Full GCs in a row. Doing more than two seems a bit over the top, so this change more or less reverts back to the old behavior

  • Young collection requesting to allocate when done will cause at most two Full GCs, the first will not clear any soft references and allow dead wood to be left. The second one, if still not able to satisfy the allocation will clear soft references and compact everything not allowing any dead wood.
  • For concurrent start collections and young collections not requesting any allocation, one Full GC will be scheduled if no regions were freed up by the initial collection.

A change compared to current behavior is that a concurrent collection started because of the metadata threshold not being met, will no longer be upgraded to a Full GC. This is still equal to how things were handled prior to JDK-8233822.

Testing
Tier 1-3, plus manual testing looking at output in near OOM situations.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8268163: Change the order of fallback full GCs in G1

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4342/head:pull/4342
$ git checkout pull/4342

Update a local copy of the PR:
$ git checkout pull/4342
$ git pull https://git.openjdk.java.net/jdk pull/4342/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4342

View PR using the GUI difftool:
$ git pr show -t 4342

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4342.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jun 3, 2021

👋 Welcome back sjohanss! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Jun 3, 2021

@kstefanj The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label Jun 3, 2021
@kstefanj kstefanj marked this pull request as ready for review Jun 4, 2021
@openjdk openjdk bot added the rfr label Jun 4, 2021
@mlbridge
Copy link

@mlbridge mlbridge bot commented Jun 4, 2021

Webrevs

log_info(gc, ergo)("Attempting full compaction clearing soft references");
bool success = do_full_collection(false /* explicit gc */,
true /* clear_all_soft_refs */,
false /* do_maximum_compaction */);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this have do_maximum_compaction == true? That's what the change to VM_G1CollectForAllocation expects. In which case, do we really need separate arguments for do_full_collection for clear_all_soft_refs and do_maximum_compaction or could they be collapsed to one maximum_compaction argument?

Copy link
Contributor Author

@kstefanj kstefanj Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently the only case where we need separate arguments for the two. So previously the "upgraded" Full collections used clear_all_soft_refs == true and do_maximum_compaction == false, so I kept it that way. In some sense I think this is fine, when upgrading this is the first time we try a Full GC so leaving dead wood (maximum==false) could still find some space. It is only when we attempt a second Full collection in a row we set do_maximum_compaction == true. Makes sense, or do you think we should force a "maximum compaction" when upgrading?

Copy link
Contributor

@tschatzl tschatzl Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw, I am good with either, i.e. do a maximally compacting gc or not; both will likely free some space but the non-maximally compacting should be faster. Maybe this change should just remove the third full gc, and in a separate change think about the upgrade policy. I already filed https://bugs.openjdk.java.net/browse/JDK-8268393 for such work.

Copy link
Contributor Author

@kstefanj kstefanj Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least with this change we will never do 3 consecutive Full GCs in the same VM operation.

Copy link
Contributor

@tschatzl tschatzl left a comment

Apart from these pre-existing issues, lgtm.

I also asked @kstefanj to consider renaming "maximum compaction/collection" to "maximal compaction" or at least streamline the naming of that term in a separate RFR too.

false /* do_maximum_compaction */);
// do_full_collection only fails if blocked by GC locker and that can't
// be the case here since we only call this when already completed one gc.
assert(success, "invariant");
Copy link
Contributor

@tschatzl tschatzl Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... this when we already completed a gc.

// Failure to perform the collection at all occurs because GCLocker is
// active, and we have the bad luck to be the collection request that
// makes a later _gc_locker collection needed. (Else we would have hit
// the GCLocker check in the prologue.)
_transient_failure = true;
} else if (g1h->should_upgrade_to_full_gc(_gc_cause)) {
Copy link
Contributor

@tschatzl tschatzl Jun 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a pre-existing bug here: should_upgrade_to_full_gc always returns false here because the called should_do_concurrent_full_gc() used there (and returning true) is always true for VM_G1TryInitiateConcMark.

Afaict this has been introduced in (https://bugs.openjdk.java.net/browse/JDK-8232588) where VM_G1CollectForAllocation has been split into VM_G1TryInitiateConcMark and VM_G1CollectForAllocation, and should_upgrade_to_full_gc() not updated.

Tote that JDK-8232588 suggests to not upgrade for a System.gc call with -XX:+ExplicitGCInvokesConcurrent, but strangely adds the code to do so. Later, JDK-8232588, due to some refactoring error, disables the upgrade again.

I'll file a bug to investigate this (mess).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time of JDK-8232588 the should_do_concurrent_full_gc() check was preceded by a check of policy()->force_upgrade_to_full() that could make the full GC happen. That seems to have disappeared.

Copy link
Contributor Author

@kstefanj kstefanj Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was removed when we removed the support for old-gen och alternative memory device. I think having a bug/enhancement to investigate how to improve this is good.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The by-policy upgrade was removed by JDK-8256181: Remove Allocation of old generation on alternate memory devices functionality.

Copy link
Contributor Author

@kstefanj kstefanj Jun 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly, and we could probably remove this check, the reason for having it was to change as little functionality as possible at this point.

@openjdk
Copy link

@openjdk openjdk bot commented Jun 8, 2021

@kstefanj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8268163: Change the order of fallback full GCs in G1

Reviewed-by: kbarrett, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 212 new commits pushed to the master branch:

  • 7b1e402: 8266598: Exception values for AnnotationTypeMismatchException are not always informative
  • 13d6180: 8264859: Implement Context-Specific Deserialization Filters
  • dd34a4c: 8268372: ZGC: dynamically select the number of concurrent GC threads used
  • 4388959: 8268056: Update java.net and java.nio to use switch expressions
  • 9cfd560: 8267663: [vector] Add unsigned comparison operators on AArch64
  • 4413142: 8268017: C2: assert(phi_type->isa_int() || phi_type->isa_ptr() || phi_type->isa_long()) failed: bad phi type
  • 2bfd708: 8266557: assert(SafepointMechanism::local_poll_armed(_handshakee)) failed: Must be
  • 4d1cf51: 8240349: jlink should not leave partial image output directory on failure
  • 07108c9: 8268241: deprecate JVM TI Heap functions 1.0
  • c9dbc4f: 8266891: Provide a switch to force the class space to a specific location
  • ... and 202 more: https://git.openjdk.java.net/jdk/compare/37bc4e2e3c2968d7419dae4f421755b6f7d06090...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jun 8, 2021
Copy link

@kimbarrett kimbarrett left a comment

@kstefanj and I talked about this offline, and agreed to some followup work, after this goes in as-is. He'll explain. So looks good.

@kstefanj
Copy link
Contributor Author

@kstefanj kstefanj commented Jun 9, 2021

Thanks for the reviews @tschatzl and @kimbarrett!

As Kim mention there will be another follow up on this that will remove the check on the gc-cause in should_upgrade_to_full_gc(GCCause) (see JDK-8268390). We can't do that in this change since it would re-introduce the assertion-failure in JDK-8195158. The assertion is caused because we use the "system GC cause" but the explicit flag is not set. I have another change out for review that I will integrate short after this one, PR #4357, it will add a new GC cause for upgraded GCs. So in the future an upgraded system GC would not keep that cause and the assertion will no longer trigger.

/integrate

@openjdk openjdk bot closed this Jun 9, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jun 9, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jun 9, 2021

@kstefanj Since your change was applied there have been 212 commits pushed to the master branch:

  • 7b1e402: 8266598: Exception values for AnnotationTypeMismatchException are not always informative
  • 13d6180: 8264859: Implement Context-Specific Deserialization Filters
  • dd34a4c: 8268372: ZGC: dynamically select the number of concurrent GC threads used
  • 4388959: 8268056: Update java.net and java.nio to use switch expressions
  • 9cfd560: 8267663: [vector] Add unsigned comparison operators on AArch64
  • 4413142: 8268017: C2: assert(phi_type->isa_int() || phi_type->isa_ptr() || phi_type->isa_long()) failed: bad phi type
  • 2bfd708: 8266557: assert(SafepointMechanism::local_poll_armed(_handshakee)) failed: Must be
  • 4d1cf51: 8240349: jlink should not leave partial image output directory on failure
  • 07108c9: 8268241: deprecate JVM TI Heap functions 1.0
  • c9dbc4f: 8266891: Provide a switch to force the class space to a specific location
  • ... and 202 more: https://git.openjdk.java.net/jdk/compare/37bc4e2e3c2968d7419dae4f421755b6f7d06090...master

Your commit was automatically rebased without conflicts.

Pushed as commit 5fbb62c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc integrated
3 participants