8311883: [Genshen] Adaptive tenuring threshold #289

Closed
wants to merge 83 commits into from

Conversation

ysramakrishna

@ysramakrishna ysramakrishna commented Jun 15, 2023

JDK-8311883 [GenShen] Adaptive tenuring

I am opening this previously draft PR for formal preliminary review. It has already benefited from review feedback from a code walkthrough of an earlier version of the code. Most of that feedback and the corrections thereof are to be found in the comments in this PR. I have addressed a large majority of those comments, and am working on the last one that I plan to address as part of this PR. For the ones that I don't plan to address in this PR, I will create follow up tickets. Those will be added in the responses for the remaining feedback comments recorded in this PR's conversation.

Preliminary testing w/SPECjbb didn't yield reliable performance data from which to infer any performance improvements stemming from enabling adaptive tenuring. I believe that was because of the way SPECjbb is run, which causes excessive degenerate and full gc's. I plan to collect SPECjbb numbers with a fixed lower max HBIR so as to be able to discern performance differences from this change, as well as Extremem workloads. Those will be added here once ready over the next few days.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Issue

  • JDK-8311883: [Genshen] Adaptive tenuring threshold (Enhancement - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/shenandoah.git pull/289/head:pull/289
$ git checkout pull/289

Update a local copy of the PR:
$ git checkout pull/289
$ git pull https://git.openjdk.org/shenandoah.git pull/289/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 289

View PR using the GUI difftool:
$ git pr show -t 289

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/shenandoah/pull/289.diff

but not sensible yet. Expect to simplify further and add in the
remaining hooks.
table. Adds to local tables, but no consolidation yet into global table,
nor epoch increments etc. Still very preliminary.

The race noted earlier in object age extraction from displaced header
seems to have temporarily disappeared, but I am sure it needs addressing.

Old evacuation-time census still present and needs to be removed.
New marking time census may need to be extended to include objects
allocated above TAMS (i.e. after commencement of marking), as age 0
objects?
…sus resolved races involving object ages with mutators because of swapping the header word, a race that we are vulnerable to in the new marking-time census which we still need to find a fix for.)
Support for clearing history.

Need to look at census to see if it makes sense. Doesn't count infants
(those born after start of marking).
adaptive computation is still stubbed out, so we just return
InitialTenuringThreshold, mimicking current implementation.
validation wrt behaviour, performance, and corner cases.
Needs testing in non-gen modes because of generational paths leaking
into non-gen code.
Other small improvements & documentation comments
@openjdk openjdk bot removed the rfr Pull request is ready for review label Jul 25, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 25, 2023
Allow consideration of mortality rate of youngest cohort for tenuring
decisions. (It was previously excluded because census data didn't
include the youngest cohort allocated in the current epoch.)

Still to do: avoid tenuring revisionism a la Kelvin's suggestion.
most recent previous cycle when computing new tenuring threshold.
default, along with ignoring mortality rates of cohorts older than the
selected tenuring age at previous epoch.
@ysramakrishna

I'm thinking we would want to make -XX:+ShenandoahGenerationalAdaptiveTenuring -XX:-ShenandoahGenerationalCensusAtEvac the default behavior in this PR. Since we do not yet have a large community of GenShen production users, are there reasons not to make these the defaults?

Done.

@ysramakrishna

I've run this through some Extremem workloads. Good news is I see no regressions. On the other hand, I am not yet seeing huge benefit. (I may be seeing a decrease in degenerated cycles, but need to run a few more tests to be sure.)

One concern is that we are still improperly identifying promotions as mortality. I'm attaching a log with some of my comments preceded by ;; at the start of lines. See line 3421 of the log, for example. In GC(16), we chose tenure age 2. Then we promoted 420 regions in place. This caused us to believe there was high mortality at ages 3-6, but really there was no mortality, only promotions. In GC(17), we should have stayed with tenure age 2.

I think the way to fix this is to only scan from 1 to the current tenure age when you select a new tenure age. If there is no mortality at the current tenure age, then we can set the new tenure age to 1 + current tenure age.

auto-tenure.out.txt

Done.

@ysramakrishna

JDK-8311883 [GenShen] Adaptive tenuring

...

Preliminary testing w/SPECjbb didn't yield reliable performance data from which to infer any performance improvements stemming from enabling adaptive tenuring. I believe that was because of the way SPECjbb is run, which causes excessive degenerate and full gc's. I plan to collect SPECjbb numbers with a fixed lower max HBIR so as to be able to discern performance differences from this change, as well as Extremem workloads. Those will be added here once ready over the next few days.

Update on performance efforts:

I've run SPECjbb and Extremem under various configurations of the benchmarks and of Generational Shenandoah in order to separate the performance of reference and specimen. Unfortunately, the triggering mechanisms often cause degenerate collections, which have been a challenge to eliminate completely. This appears to inject enough variability into the results that any difference in performance between the two is difficult to discern.

SPECjbb configurations included fixed preset IR among others, and Extremem configurations (thanks to Kelvin for advice) included various load characteristics. Generational Shenandoah configurations included guaranteed collection interval triggers (variously and in combination for young & old), and disabling garbage density as a criterion for inclusion in the collection set.

The only configuration in which I was able to fully eliminate degenerate GCs for these workloads was Extremem "hefty load" with a 45 g heap. This produced exactly one degenerate GC in each case during the early part of the run. Setting a guaranteed collection interval eliminated this degenerate collection.

A comparison was then made for the time in each of the concurrent phases, both total and average over all cycles. The differences were a wash.

The performance numbers for adaptive tenuring were based on the default settings in src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp.

While no positive or negative performance impacts have yet been measured, the framework is now in place to allow further experimentation to investigate and extract any potential benefits from adaptive tenuring.

I'm happy to integrate this now, or do further future experiments, including coupling this with the size budgeting for generations to demonstrate impact before this goes in. I am open to any and all suggestions.

I'll attach to the ticket the design document upon which the current adaptive tenuring is based, and where we discuss some alternatives and some pitfalls we encountered along the way.


@kdnilsen kdnilsen left a comment

Thanks. This was a lot of work. Good to see it integrated.

@openjdk

openjdk bot commented Aug 15, 2023

@ysramakrishna This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8311883: [Genshen] Adaptive tenuring threshold

Generational Shenandoah currently has the notion of a tenuring threshold but it isn't dynamically adapted, but rather kept fixed at 7.

We now adapt the tenuring threshold based on object demographics as determined by a recent GC that visits objects in the young generation.

We keep track of age-cohort populations at each minor GC epoch and use the historical data to determine if it would be a good idea to tenure or not based on measured mortality rates. A few tunable (experimental) knobs are exposed to play with these to determine some good settings in the future.

The object census is conducted by default at marking, and is subject to noise on account of objects whose age could not be determined because of displaced header. In this case, the computed tenuring threshold is used in the following evacuation of the same cycle. Optionally, the census can be conducted at evacuation time, but sees only objects that are in the collection set. In this case, the tenuring threshold that is computed is used for tenuring decisions in the next evacuation cycle.

Other variants are possible, and may be implemented / tested in the future as opportunity and data permit.

The computed tenuring threshold has not yet been coupled with size budgeting, but will be in a followup PR.

At this time, performance measurements have not shown any benefit, but we believe that with the framework now in place, we may be able to find an adaptive tenuring algorithm that works better than the current one and provide performance benefits. Adaptive tenuring is enabled by default, but can be optionally disabled to mimic previous behavior.

Reviewed-by: kdnilsen
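
As an illustration of the bookkeeping the commit message describes (age-cohort populations recorded at each epoch and kept as history), here is a minimal C++ sketch. It assumes per-worker local tables that are folded into a global table when an epoch closes, as the earlier commit notes about local and global tables suggest. `CensusHistory`, `kMaxAge`, `kHistoryEpochs`, and all method names are hypothetical and do not correspond to the identifiers used in this PR.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical constants; the real limits are not stated in this thread.
constexpr std::size_t kMaxAge = 16;        // ages 0..15 are tracked
constexpr std::size_t kHistoryEpochs = 4;  // number of census epochs retained

using AgeTable = std::array<std::size_t, kMaxAge>;

class CensusHistory {
 public:
  explicit CensusHistory(std::size_t workers) : local_(workers, AgeTable{}) {}

  // Called by a worker for each live young object it visits.
  // Out-of-range workers or ages are ignored.
  void record(std::size_t worker, std::size_t age, std::size_t amount) {
    if (worker < local_.size() && age < kMaxAge) {
      local_[worker][age] += amount;
    }
  }

  // Called once per census: advance the epoch, fold every local table
  // into the global table for that epoch, and clear the local tables.
  void close_epoch() {
    epoch_ = (epoch_ + 1) % kHistoryEpochs;
    AgeTable& global = history_[epoch_];
    global.fill(0);
    for (AgeTable& table : local_) {
      for (std::size_t age = 0; age < kMaxAge; ++age) {
        global[age] += table[age];
      }
      table.fill(0);
    }
  }

  // Population vectors of the most recent and the preceding epoch.
  const AgeTable& current() const { return history_[epoch_]; }
  const AgeTable& previous() const {
    return history_[(epoch_ + kHistoryEpochs - 1) % kHistoryEpochs];
  }

 private:
  std::vector<AgeTable> local_;                    // one table per worker
  std::array<AgeTable, kHistoryEpochs> history_{}; // ring of global tables
  std::size_t epoch_ = 0;
};
```

Keeping the tables in a small ring means the previous epoch's populations stay available for comparison with the current ones, which is what a mortality-rate estimate needs; older epochs are overwritten as the ring wraps.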

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Aug 15, 2023
@openjdk

openjdk bot commented Aug 18, 2023

@ysramakrishna Invalid summary:

8311883: [Genshen] Adaptive tenuring threshold

A summary line cannot start with any of the following: <issue-id>:, Co-authored-by:, Reviewed-by:, Backport-of:. See JEP 357 for details.

@ysramakrishna

/summary

Generational Shenandoah currently has the notion of a tenuring threshold but it isn't dynamically adapted, but rather kept fixed at 7.

We now adapt the tenuring threshold based on object demographics as determined by a recent GC that visits objects in the young generation.

We keep track of age-cohort populations at each minor GC epoch and use the historical data to determine if it would be a good idea to tenure or not based on measured mortality rates. A few tunable (experimental) knobs are exposed to play with these to determine some good settings in the future.

The object census is conducted by default at marking, and is subject to noise on account of objects whose age could not be determined because of displaced header. In this case, the computed tenuring threshold is used in the following evacuation of the same cycle. Optionally, the census can be conducted at evacuation time, but sees only objects that are in the collection set. In this case, the tenuring threshold that is computed is used for tenuring decisions in the next evacuation cycle.

Other variants are possible, and may be implemented / tested in the future as opportunity and data permit.

The computed tenuring threshold has not yet been coupled with size budgeting, but will be in a followup PR.

At this time, performance measurements have not shown any benefit, but we believe that with the framework now in place, we may be able to find an adaptive tenuring algorithm that works better than the current one and provide performance benefits. Adaptive tenuring is enabled by default, but can be optionally disabled to mimic previous behavior.

@openjdk

openjdk bot commented Aug 18, 2023

@ysramakrishna Setting summary to:

Generational Shenandoah currently has the notion of a tenuring threshold but it isn't dynamically adapted, but rather kept fixed at 7.

We now adapt the tenuring threshold based on object demographics as determined by a recent GC that visits objects in the young generation.

We keep track of age-cohort populations at each minor GC epoch and use the historical data to determine if it would be a good idea to tenure or not based on measured mortality rates. A few tunable (experimental) knobs are exposed to play with these to determine some good settings in the future.

The object census is conducted by default at marking, and is subject to noise on account of objects whose age could not be determined because of displaced header. In this case, the computed tenuring threshold is used in the following evacuation of the same cycle. Optionally, the census can be conducted at evacuation time, but sees only objects that are in the collection set. In this case, the tenuring threshold that is computed is used for tenuring decisions in the next evacuation cycle.

Other variants are possible, and may be implemented / tested in the future as opportunity and data permit.

The computed tenuring threshold has not yet been coupled with size budgeting, but will be in a followup PR.

At this time, performance measurements have not shown any benefit, but we believe that with the framework now in place, we may be able to find an adaptive tenuring algorithm that works better than the current one and provide performance benefits. Adaptive tenuring is enabled by default, but can be optionally disabled to mimic previous behavior.

@ysramakrishna

I am integrating these changes with the GHA failures at https://github.com/ysramakrishna/shenandoah/actions/runs/5898305273, which are tracked in https://bugs.openjdk.org/browse/JDK-8311843

/integrate

@openjdk

openjdk bot commented Aug 18, 2023

Going to push as commit ef4b453.

@openjdk openjdk bot added the integrated Pull request has been integrated label Aug 18, 2023
@openjdk openjdk bot closed this Aug 18, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 18, 2023
@openjdk

openjdk bot commented Aug 18, 2023

@ysramakrishna Pushed as commit ef4b453.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@ysramakrishna ysramakrishna deleted the adaptive_tenuring branch August 18, 2023 22:41