Skip to content

Conversation

stefank
Copy link
Member

@stefank stefank commented May 21, 2025

Before starting the relocation phase of a major collection we remap all pointers into the young generation so that we can disambiguate when an oop has bad bits for both the young generation and the old generation. See comment in remap_young_roots.

One part of this is requires us to visit all old pages. To parallelize that part we have a class that distribute indices to the page table to the GC worker threads (See ZIndexDistributor).

While looking into a potential, minor performance regression on Windows I noticed that the usage of constexpr in ZIndexDistributorClaimTree wasn't giving us the inlining we hoped for, which caused a noticeable worse performance on Windows compared to the other platforms. I created a patch for this that gave us the expected inlining. See master...stefank:jdk:8357443_zgc_optimize_remap_remembered

While thinking about this a bit more I realized that we could use the "found old" optimization that we already use for the remset scanning. This finds the old pages without scanning the entire page table. This gives a significant enough boost that I propose that we do that instead.

This mainly lowers the Major Collection times when you run a GC without any significant amount of objects in the old generation. So, most likely mostly important for micro benchmarks and small workloads.

The below is the average time (ms) of the Concurrent Remap Roots phase from only running System.gc() 50 times before and after this PR.

4 GB MaxHeapSize
                    Original       Patch
Default threads

mac:                0.27812        0.0507
win:                0.9485         0.10452
linux-x64:          0.53858        0.092
linux-x64 NUMA:     0.89974        0.15452
linux-aarch64:      0.32574        0.15832

4 threads

mac:                0.19112        0.04916
win:                0.83346        0.08796
linux-x64:          0.57692        0.09526
linux-x64 NUMA:     1.23684        0.17008
linux-aarch64:      0.334          0.21918

1 thread:

mac:                0.19678        0.0589
win:                1.96496        0.09928
linux-x64:          1.00788        0.1381
linux-x64 NUMA:     2.77312        0.21134
linux-aarch64:      0.63696        0.31286

The second set of data is from using the extreme end of the supported heap size. This mimics how we previously used to have a large page table even for smaller heap size (we don't do that anymore for JDK 25). It shows a quite significant difference, but it also will likely be in the noise when running larger workloads.

16 TB MaxHeapSize
                    Original       Patch
Default threads

mac:                11.4903        0.11098
win:                54.3666        0.37164
linux-x64:          18.0898        0.21094
linux-x64 NUMA:     26.9786        0.46134
linux-aarch64:      20.7151        0.32846

4 threads

mac:                6.4035         0.10096
win:                89.5496        0.32178
linux-x64:          27.883         0.2053
linux-x64 NUMA:     35.5636        0.30928
linux-aarch64:      15.4857        0.32004

1 thread:

mac:                21.2717        0.1275
win:                307.155        0.3361
linux-x64:          62.5843        0.2309
linux-x64 NUMA:     92.0048        0.3798
linux-aarch64:      61.0375        0.42458

This change removes the last usage of ZIndexDistributor. I don't know if we want to remove it, or leave it in case we need it for any of our upcoming features.

I've run this through tier1-7.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8357443: ZGC: Optimize old page iteration in remap remembered phase (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25345/head:pull/25345
$ git checkout pull/25345

Update a local copy of the PR:
$ git checkout pull/25345
$ git pull https://git.openjdk.org/jdk.git pull/25345/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25345

View PR using the GUI difftool:
$ git pr show -t 25345

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25345.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 21, 2025

👋 Welcome back stefank! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 21, 2025

@stefank This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8357443: ZGC: Optimize old page iteration in remap remembered phase

Reviewed-by: aboldtch, eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 150 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 21, 2025
@openjdk
Copy link

openjdk bot commented May 21, 2025

@stefank The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label May 21, 2025
@mlbridge
Copy link

mlbridge bot commented May 21, 2025

Webrevs

Copy link
Member

@xmas92 xmas92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Very nice to use information we are already tracking rather than walking everything.

This change removes the last usage of ZIndexDistributor. I don't know if we want to remove it, or leave it in case we need it for any of our upcoming features.

It is probably nice too at least keep our page table iterators in the code base, so you do not have to go dig them up / do something ad hoc if you ever want to check something. Whether they need ZIndexDistributor or not is another question.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 21, 2025
Co-authored-by: Axel Boldt-Christmas <xmas1915@gmail.com>
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label May 21, 2025
@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 21, 2025
Copy link
Member Author

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing!

Copy link
Contributor

@fisk fisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@stefank
Copy link
Member Author

stefank commented Jun 4, 2025

We decided to leave the ZIndexDistributor infrastructure in-place including the small additional extension to the page table. We might make some change in JDK 26, or later, to convert the distributor to work over the extended range internally but only hand out indices in the requested range. That way we could get rid of the current over head and still keep the current implementation.

I've merged locally and run that true a large set of our testing so I'm going integrate this now.

Thanks for all the reviews!

/integrate

@openjdk
Copy link

openjdk bot commented Jun 4, 2025

Going to push as commit c909216.
Since your change was applied there have been 304 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 4, 2025
@openjdk openjdk bot closed this Jun 4, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 4, 2025
@openjdk
Copy link

openjdk bot commented Jun 4, 2025

@stefank Pushed as commit c909216.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

3 participants