8348595: GenShen: Fix generational free-memory no-progress check #23306
Conversation
Previously, we were using the size of the heap to assess the progress of a generational degenerated cycle. That is not appropriate, because the collection set is chosen based on the size of the young generation.
As previously implemented, we used the heap size to measure goodness of progress. However, heap size is only appropriate for non-generational Shenandoah. The free-set abstraction works for both.
👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into …

@kdnilsen This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be: … You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 21 new commits pushed to the … As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@phohensee) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type …
The attached spreadsheet provides a summary of the performance benefits of this patch. In the spreadsheet, compared to Control, the "Better Both" results are better across all measures.
Webrevs
```cpp
ShenandoahFreeSet* free_set = _heap->free_set();
size_t free_actual = free_set->available();
// The sum of free_set->capacity() and ->reserved represents capacity of young in generational, heap in non-generational.
size_t free_expected = ((free_set->capacity() + free_set->reserved()) / 100) * ShenandoahCriticalFreeThreshold;
```
We could pass the ShenandoahGeneration as a parameter to is_good_progress to simplify the calculation of free_expected. It would look like:

generation->max_capacity() / 100 * ShenandoahCriticalFreeThreshold

The good part is that free_expected might be more accurate in Full GC/Degen for a global cycle; e.g. Full GC collects memory for global, so free_expected should be calculated using metrics from the global generation. Either way, free_expected is not clearly defined in generational mode right now, and the current code also works.
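For illustration, the suggested shape of the check might look like the following standalone sketch. The names here (ShenandoahGenerationModel, free_expected_for, is_good_progress's exact signature) are hypothetical stand-ins, not the actual HotSpot API:

```cpp
#include <cstddef>
#include <cassert>

// Hypothetical model of a generation: only the field this sketch needs.
struct ShenandoahGenerationModel {
  size_t max_capacity;   // bytes the generation may occupy
};

// Percentage of the generation that must be free for "good progress".
// 1 is the default value of the real ShenandoahCriticalFreeThreshold flag.
static const size_t ShenandoahCriticalFreeThreshold = 1;

// N% of the generation size: divide first, then multiply, matching the
// integer arithmetic in the reviewed code.
size_t free_expected_for(const ShenandoahGenerationModel& generation) {
  return generation.max_capacity / 100 * ShenandoahCriticalFreeThreshold;
}

bool is_good_progress(size_t free_actual, const ShenandoahGenerationModel& generation) {
  return free_actual >= free_expected_for(generation);
}
```

With a 2 GiB young generation and the default 1% threshold, free_expected comes out to roughly 21.5 MB, so a degenerated cycle leaving less free memory than that would be judged as making no progress.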
Thanks for this suggestion. I've made the change. It turns out there was actually a bug in the original implementation, so I am retesting the performance results.
Thanks. Honestly, I didn't understand why free_set->capacity() + free_set->reserved() represents the capacity of young in generational mode; is that the bug you found? free_set->capacity() is the capacity of all mutator regions, which also excludes regions that don't have capacity for new object allocation (it is calculated when the free set is rebuilt).
Having thought about it a bit more, it makes more sense to calculate free_expected in snap_before: the max_capacity of generations may change after collection, so free_expected should be calculated before the cycle.
Interesting thoughts. So young-generation size will change under these circumstances:
- There's a lot of young-gen memory to be promoted, or we choose to promote some young-gen regions in place (by relabeling the regions as OLD without evacuating their data). In both of these cases, we may shrink young in order to expand old.
- The GC cycle is mixed, so it has the side effect of reclaiming some old regions. These reclaimed old regions will typically be granted back to young, until such time as we need to expand old in order to hold results of promotion.
While it makes sense for the expected value to be computed from the "original size" of the young generation, the question of how much free memory remaining in young represents "good progress" should probably be based on the current size of young. Ultimately, we are trying to figure out whether there's enough memory in young to make it worthwhile to attempt another concurrent GC cycle.
I realize this thinking is a bit "fuzzy". The heuristic was originally designed for non-generational use.
I'm inclined to keep as is currently implemented, but should probably add a comment to explain why. What do you think?
Thanks for the explanation; I agree it is a bit "fuzzy".
I'm not sure whether we should consider the following case: a degenerated cycle doesn't reclaim any memory, but promotes some young regions, causing young capacity to shrink. In this case we might treat it as "good progress" when actually it is not.
"Good progress" could instead be free_actual_after > free_actual_before && free_actual_after > free_expected. What do you think? I am not sure of all the cases that trigger a degenerated cycle; this might be a false case that never happens.
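The stricter condition floated above can be sketched as a small standalone function (illustrative only, not HotSpot code; all names are hypothetical):

```cpp
#include <cstddef>
#include <cassert>

// Stricter "good progress" test: the degenerated cycle must have actually
// grown free memory, in addition to clearing the expected threshold.
bool is_good_progress_strict(size_t free_actual_before,
                             size_t free_actual_after,
                             size_t free_expected) {
  // A cycle that only clears the threshold because young shrank (e.g. via
  // promotion in place) fails the first clause: free memory did not grow.
  return free_actual_after > free_actual_before &&
         free_actual_after > free_expected;
}
```

The first clause is what distinguishes this from the single-threshold check: a cycle whose free memory stays flat or shrinks is rejected even if it still sits above free_expected.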
If we manage to pass the test "free_actual_after > free_expected" following the degen, even if young has shrunk, I think it is reasonable to pursue concurrent GC. Passing this exact test at the end of the next GC (assuming no further adjustments to generation sizes) would qualify us to continue with concurrent GC on the next cycle.
In general, it is very rare that full GC is the right thing to do; we're in the process of deprecating it entirely.
I will add a comment to clarify the thinking here.
```cpp
ShenandoahFreeSet* free_set = _heap->free_set();
size_t free_actual = free_set->available();
// The sum of free_set->capacity() and ->reserved represents capacity of young in generational, heap in non-generational.
size_t free_expected = ((free_set->capacity() + free_set->reserved()) / 100) * ShenandoahCriticalFreeThreshold;
```
As an outsider, I find the units involved and what exactly is being calculated pretty opaque. Why would we divide by 100 to compute free_expected and not do the same for free_actual? Do we care about integer division truncation? The default value of ShenandoahCriticalFreeThreshold is 1, so multiplying by it is a no-op by default, which seems strange.
ShenandoahCriticalFreeThreshold represents a percentage of the "total size". To calculate N% of the young generation size, we divide the generation size by 100 and then multiply by ShenandoahCriticalFreeThreshold. This code is a bit different in the most recent revision. Do you think it needs a comment?
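The integer-arithmetic question raised above (divide first vs. multiply first) can be illustrated in isolation; these helper names are made up for the example:

```cpp
#include <cstddef>
#include <cassert>

// Dividing by 100 first avoids overflow when total * pct would exceed
// size_t range on very large capacities; the price is truncation of up to
// 99 bytes, which is negligible at heap scale.
size_t percent_divide_first(size_t total, size_t pct) {
  return total / 100 * pct;       // the order used in the reviewed code
}

size_t percent_multiply_first(size_t total, size_t pct) {
  return total * pct / 100;       // exact, but the product can overflow
}
```

For heap-sized inputs the two differ by at most pct bytes, e.g. percent_divide_first(199, 10) yields 10 while percent_multiply_first(199, 10) yields 19; at gigabyte scale the discrepancy is irrelevant.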
Yes :)
I've added a comment here. Thanks.
In testing the suggested refinements, I discovered a bug in the original implementation. ShenandoahFreeSet::capacity() does not represent the size of the young generation. It represents the total size of the young regions that had available memory at the time we most recently rebuilt the ShenandoahFreeSet. I am rerunning the performance tests following this suggested change.
These are updated performance results after making the change that uses generation size to determine the expected value. This change computes a larger expected size, increasing the likelihood that a particular degenerated cycle will be considered "bad progress". This represents an overall improvement compared to the previously reported numbers. It would appear that the difference in performance might be the result of random noise.
These are the results of combining both proposed PRs into a single execution test. This result is not as good as what was reported above. In my judgment, it still represents an improvement over tip. The difference between the two runs may also be signal noise, as there is no clear correlation between the number of Full GCs and percentile latencies. The two Full GCs reported in the "better both (redo)" run both result from allocation failure during evacuation.
pengxiaolong
left a comment
Thanks for the comprehensive tests and explanations; my approval doesn't count though :)
phohensee
left a comment
Looks good.
/integrate

/sponsor
Going to push as commit ba6c965.
Your commit was automatically rebased without conflicts.

@phohensee @kdnilsen Pushed as commit ba6c965. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
At the end of a degenerated GC, we check whether sufficient progress has been made in replenishing the memory available to the mutator. The test for good progress is implemented as a ratio of free memory against the total heap size.
For generational Shenandoah, the ratio should be computed against the size of the young generation. Note that the size of the generational collection set is based on young generation size rather than total heap size.
This issue was first identified in GenShen GC logs, where a large number of degenerated cycles were upgrading to Full GC because free-set progress fell short of the desired value by 10-25%.
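The fix described above can be summarized in a hedged standalone sketch (illustrative names, not the actual HotSpot implementation): the "good progress" denominator is young-generation capacity in generational mode and whole-heap capacity otherwise, because the collection set is sized from young.

```cpp
#include <cstddef>
#include <cassert>

// Percent of the denominator that must be free; 1 is the default value of
// the real ShenandoahCriticalFreeThreshold flag.
static const size_t ShenandoahCriticalFreeThreshold = 1;

// Pick the capacity the progress ratio is measured against.
size_t progress_denominator(bool generational,
                            size_t young_capacity,
                            size_t heap_capacity) {
  return generational ? young_capacity : heap_capacity;
}

// Did the degenerated cycle replenish enough mutator-available memory?
bool degen_made_good_progress(bool generational,
                              size_t young_capacity,
                              size_t heap_capacity,
                              size_t free_actual) {
  size_t denom = progress_denominator(generational, young_capacity, heap_capacity);
  size_t free_expected = denom / 100 * ShenandoahCriticalFreeThreshold;
  return free_actual >= free_expected;
}
```

Measuring against the whole heap in generational mode inflates free_expected, which is exactly why the logs showed degenerated cycles "failing" by 10-25% and needlessly upgrading to Full GC.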
Reviewing
Using git

Checkout this PR locally:

$ git fetch https://git.openjdk.org/jdk.git pull/23306/head:pull/23306
$ git checkout pull/23306

Update a local copy of the PR:

$ git checkout pull/23306
$ git pull https://git.openjdk.org/jdk.git pull/23306/head

Using Skara CLI tools

Checkout this PR locally:

$ git pr checkout 23306

View PR using the GUI difftool:

$ git pr show -t 23306

Using diff file

Download this PR as a diff file:

https://git.openjdk.org/jdk/pull/23306.diff
Using Webrev
Link to Webrev Comment