Skip to content

Conversation

@pengxiaolong
Copy link

@pengxiaolong pengxiaolong commented Feb 25, 2025

The change is to improve the observability of Shenandoah GC, basically there are three changes for Shenandoah GC timings in this PR:

  1. Net GC pause timings include the time to propagate GC state to Java threads
  2. Add new timing "Propagate GC state" in Shenandoah GC timing logs
  3. Removal of the call of propagate_gc_state_to_all_threads from "init_update_refs", which handles gc state in handshake already.

With the change, the new GC timing log will be like:

[11.056s][info][gc,stats    ] Concurrent Reset                     89 us
[11.056s][info][gc,stats    ] Pause Init Mark (G)                 257 us
[11.056s][info][gc,stats    ] Pause Init Mark (N)                  17 us
[11.056s][info][gc,stats    ]   Update Region States                3 us
[11.056s][info][gc,stats    ]   Propagate GC state                  1 us
[11.056s][info][gc,stats    ] Concurrent Mark Roots               232 us, parallelism: 1.96x
[11.056s][info][gc,stats    ]   CMR: <total>                      456 us
[11.056s][info][gc,stats    ]   CMR: Thread Roots                 429 us, workers (us): 139, 148, 142, ---, ---, ---, 
[11.057s][info][gc,stats    ]   CMR: VM Strong Roots               11 us, workers (us):   8,   3,   0, ---, ---, ---, 
[11.057s][info][gc,stats    ]   CMR: CLDG Roots                    16 us, workers (us):  16, ---, ---, ---, ---, ---, 
[11.057s][info][gc,stats    ] Concurrent Marking                 1304 us, parallelism: 2.33x
[11.057s][info][gc,stats    ]   CM: <total>                      3043 us
[11.057s][info][gc,stats    ]   CM: Parallel Mark                3043 us, workers (us): 1023, 1017, 1003, ---, ---, ---, 
[11.057s][info][gc,stats    ]   Flush SATB                        204 us
[11.057s][info][gc,stats    ] Pause Final Mark (G)                865 us
[11.057s][info][gc,stats    ] Pause Final Mark (N)                234 us
[11.057s][info][gc,stats    ]   Finish Mark                       129 us, parallelism: 0.01x
[11.057s][info][gc,stats    ]   Propagate GC state                  2 us
[11.057s][info][gc,stats    ]   Update Region States               12 us
[11.057s][info][gc,stats    ]   Choose Collection Set              25 us
[11.057s][info][gc,stats    ]   Rebuild Free Set                   29 us
[11.057s][info][gc,stats    ] Concurrent Weak References           67 us, parallelism: 0.25x
[11.057s][info][gc,stats    ]   CWRF: <total>                      17 us
[11.057s][info][gc,stats    ]   CWRF: Weak References              17 us, workers (us):  15,   1,   0, ---, ---, ---, 
[11.057s][info][gc,stats    ] Concurrent Weak Roots               413 us
[11.057s][info][gc,stats    ]   Roots                             203 us, parallelism: 1.95x
[11.057s][info][gc,stats    ]     CWR: <total>                    396 us
[11.057s][info][gc,stats    ]     CWR: Code Cache Roots           295 us, workers (us):  90,  96, 109, ---, ---, ---, 
[11.057s][info][gc,stats    ]     CWR: VM Weak Roots              100 us, workers (us):  48,  37,  14, ---, ---, ---, 
[11.057s][info][gc,stats    ]     CWR: CLDG Roots                   2 us, workers (us): ---, ---,   2, ---, ---, ---, 
[11.058s][info][gc,stats    ]   Rendezvous                        197 us
[11.058s][info][gc,stats    ] Concurrent Cleanup                   35 us
[11.058s][info][gc,stats    ] Concurrent Class Unloading          486 us
[11.058s][info][gc,stats    ]   Unlink Stale                      398 us
[11.058s][info][gc,stats    ]     System Dictionary                 5 us
[11.058s][info][gc,stats    ]     Weak Class Links                  0 us
[11.058s][info][gc,stats    ]     Code Roots                      391 us
[11.058s][info][gc,stats    ]   Rendezvous                         69 us
[11.058s][info][gc,stats    ]   Purge Unlinked                      4 us
[11.058s][info][gc,stats    ]     Code Roots                        0 us
[11.058s][info][gc,stats    ]     CLDG                              3 us
[11.058s][info][gc,stats    ] Pause Final Roots (G)               272 us
[11.058s][info][gc,stats    ] Pause Final Roots (N)                18 us
[11.058s][info][gc,stats    ]   Propagate GC state                  3 us

Test

  • make test TEST=hotspot_gc_shenandoah

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8350314: Shenandoah: Capture thread state sync times in GC timings (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23759/head:pull/23759
$ git checkout pull/23759

Update a local copy of the PR:
$ git checkout pull/23759
$ git pull https://git.openjdk.org/jdk.git pull/23759/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23759

View PR using the GUI difftool:
$ git pr show -t 23759

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23759.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 25, 2025

👋 Welcome back xpeng! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Feb 25, 2025

@pengxiaolong This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8350314: Shenandoah: Capture thread state sync times in GC timings

Reviewed-by: ysr, shade, wkemper

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 46 new commits pushed to the master branch:

  • 78c18cf: 8349399: GHA: Add static-jdk build on linux-x64
  • e43960a: 8350616: Skip ValidateHazardPtrsClosure in non-debug builds
  • 9ec4696: 8350313: Include timings for leaving safepoint in safepoint logging
  • ec6624b: 8350649: Class unloading accesses/resurrects dead Java mirror after JDK-8346567
  • 9477c70: 8024695: new File("").exists() returns false whereas it is the current working directory
  • 3e46480: 8350770: [BACKOUT] Protection zone for easier detection of accidental zero-nKlass use
  • bd112c4: 8350443: GHA: Split static-libs-bundles into a separate job
  • 2731712: 8287749: Re-enable javadoc -serialwarn option
  • 0f82268: 8345598: Upgrade NSS binaries for interop tests
  • ea2c923: 8323807: Async UL: Add a stalling mode to async UL
  • ... and 36 more: https://git.openjdk.org/jdk/compare/65f79c145b7b1b32ed064a37ad4d2b6aca935a4c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@earthling-amzn, @ysramakrishna, @shipilev) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Feb 25, 2025

@pengxiaolong The following labels will be automatically applied to this pull request:

  • hotspot-gc
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Feb 25, 2025
@pengxiaolong pengxiaolong marked this pull request as ready for review February 25, 2025 22:42
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 25, 2025
@mlbridge
Copy link

mlbridge bot commented Feb 25, 2025

Copy link
Contributor

@earthling-amzn earthling-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we really see propagate_gc_state_to_all_threads taking a long time? Or was it exiting the safepoint (i.e., after the state had been propagated) that took a long time?

@pengxiaolong
Copy link
Author

pengxiaolong commented Feb 25, 2025

Did we really see propagate_gc_state_to_all_threads taking a long time? Or was it exiting the safepoint (i.e., after the state had been propagated) that took a long time?

No, we discussed it last week in Slack channel, propagate_gc_state_to_all_threads usually takes less than 10 ns for ~1k threads in our test, it is not a problem.

it was the futex call when exit the safepoint took long time, we have a fix in ShenandoahLock for mitigation since the change we made last year affects scheduler, meanwhile Aleksey is working on https://bugs.openjdk.org/browse/JDK-8350324 which should improve the time to leave safepoint which has much broader impact.

Copy link
Contributor

@earthling-amzn earthling-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Propagating gc state for init update refs is vestigial.

Copy link
Contributor

@earthling-amzn earthling-amzn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 26, 2025
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Feb 26, 2025
@ysramakrishna
Copy link
Member

Changes are fine.

This jumped out in yr sample output:

...
[11.057s][info][gc,stats    ]   Finish Mark                       129 us, parallelism: 0.01x
...
[11.057s][info][gc,stats    ] Concurrent Weak References           67 us, parallelism: 0.25x

which seemed kinda interesting. I assume this is just a consequence of the very little work (and extremely brief time in this phase) here, and can be ignored in this sample output from likely a toy GC, or one where you may have artificially boosted the number of worker threads. Still I thought I'd ask in case you've seen this with bigger timings or more work in any of these phases with low fractional speed-ups.

Copy link
Member

@ysramakrishna ysramakrishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 26, 2025
Handshake::execute(&prepare_for_update_refs);

_update_refs_iterator.reset();
_gc_state_changed = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, what is this and why do we need it. When was it set?

In set_update_refs_in_progress() ?

May be we need a better abstraction to toggle the state_changed boolean. This seems error prone.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, we don't need to set _gc_state_changed to false here, because concurrent_prepare_for_update_refs calls ShenandoahHeap::set_gc_state_concurrent to set gc state, ShenandoahHeap::set_gc_state_concurrent doesn't change _gc_state_changed to true, hence there is no need to reset it false here.

@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 26, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 26, 2025
@pengxiaolong
Copy link
Author

Changes are fine.

This jumped out in yr sample output:

...
[11.057s][info][gc,stats    ]   Finish Mark                       129 us, parallelism: 0.01x
...
[11.057s][info][gc,stats    ] Concurrent Weak References           67 us, parallelism: 0.25x

which seemed kinda interesting. I assume this is just a consequence of the very little work (and extremely brief time in this phase) here, and can be ignored in this sample output from likely a toy GC, or one where you may have artificially boosted the number of worker threads. Still I thought I'd ask in case you've seen this with bigger timings or more work in any of these phases with low fractional speed-ups.

I'm not sure how parallelism is calculated, but I think it is caused by the test I was running, the test is very simple and there are only small number of live objects after mark.

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am good with this, thanks. Only one minor nit remains.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 26, 2025
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Feb 26, 2025
@openjdk openjdk bot added the ready Pull request is ready to be integrated label Feb 26, 2025
@pengxiaolong
Copy link
Author

Thanks all for the reviews!

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Feb 27, 2025
@openjdk
Copy link

openjdk bot commented Feb 27, 2025

@pengxiaolong
Your change (at version caf9d4c) is now ready to be sponsored by a Committer.

@ysramakrishna
Copy link
Member

I'm not sure how parallelism is calculated, but I think it is caused by the test I was running, the test is very simple and there are only small number of live objects after mark.

Yes that explains it, thanks.

(PS: Parallelism (or really parallel speed-up) is calculated as the ratio of total thread virtual time to the wall-clock (elapsed) time, IIRC. As you noted work in this specific workload is too small for the parallel task overhead, as indicated by the micro-seconds worth of time.)

@shipilev
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented Feb 27, 2025

Going to push as commit 01bd7e4.
Since your change was applied there have been 51 commits pushed to the master branch:

  • 4522f12: 8349921: Crash in codeBuffer.cpp:1004: guarantee(sect->end() <= tend) failed: sanity
  • e4d3c97: 8350111: [PPC] AsyncGetCallTrace crashes when called while handling SIGTRAP
  • 885338b: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory
  • bb48b73: 8350723: RISC-V: debug.cpp help() is missing riscv line for pns
  • b29f8b0: 8350665: SIZE_FORMAT_HEX macro undefined in gtest
  • 78c18cf: 8349399: GHA: Add static-jdk build on linux-x64
  • e43960a: 8350616: Skip ValidateHazardPtrsClosure in non-debug builds
  • 9ec4696: 8350313: Include timings for leaving safepoint in safepoint logging
  • ec6624b: 8350649: Class unloading accesses/resurrects dead Java mirror after JDK-8346567
  • 9477c70: 8024695: new File("").exists() returns false whereas it is the current working directory
  • ... and 41 more: https://git.openjdk.org/jdk/compare/65f79c145b7b1b32ed064a37ad4d2b6aca935a4c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 27, 2025
@openjdk openjdk bot closed this Feb 27, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Feb 27, 2025
@openjdk
Copy link

openjdk bot commented Feb 27, 2025

@shipilev @pengxiaolong Pushed as commit 01bd7e4.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated shenandoah shenandoah-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

4 participants