Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers #17977

Closed
wants to merge 4 commits into from

Conversation

JoshuaZhuwj
Copy link
Member

@JoshuaZhuwj JoshuaZhuwj commented Feb 23, 2024

Currently ZGC C2 load barrier stub saves the whole live register regardless of what size of register is live on aarch64.
Considering the size of SVE register is an implementation-defined multiple of 128 bits, up to 2048 bits,
even the use of a floating point may cause the maximum 2048 bits stack occupied.
Hence I would like to introduce this change on aarch64: take the length of live registers into consideration in ZGC C2 load barrier stub.

In a floating point case on 2048 bits SVE machine, the following ZLoadBarrierStubC2

  ......
  0x0000ffff684cfad8:   stp     x15, x18, [sp, #80]
  0x0000ffff684cfadc:   sub     sp, sp, #0x100
  0x0000ffff684cfae0:   str     z16, [sp]
  0x0000ffff684cfae4:   add     x1, x13, #0x10
  0x0000ffff684cfae8:   mov     x0, x16
 ;; 0xFFFF803F5414
  0x0000ffff684cfaec:   mov     x8, #0x5414                     // #21524
  0x0000ffff684cfaf0:   movk    x8, #0x803f, lsl #16
  0x0000ffff684cfaf4:   movk    x8, #0xffff, lsl #32
  0x0000ffff684cfaf8:   blr     x8
  0x0000ffff684cfafc:   mov     x16, x0
  0x0000ffff684cfb00:   ldr     z16, [sp]
  0x0000ffff684cfb04:   add     sp, sp, #0x100
  0x0000ffff684cfb08:   ptrue   p7.b
  0x0000ffff684cfb0c:   ldp     x4, x5, [sp, #16]
  ......

could be optimized into:

  ......  
  0x0000ffff684cfa50:   stp     x15, x18, [sp, #80]
  0x0000ffff684cfa54:   str     d16, [sp, #-16]!                   // extra 8 bytes to align 16 bytes in push_fp()
  0x0000ffff684cfa58:   add     x1, x13, #0x10
  0x0000ffff684cfa5c:   mov     x0, x16
 ;; 0xFFFF7FA942A8
  0x0000ffff684cfa60:   mov     x8, #0x42a8                     // #17064
  0x0000ffff684cfa64:   movk    x8, #0x7fa9, lsl #16
  0x0000ffff684cfa68:   movk    x8, #0xffff, lsl #32
  0x0000ffff684cfa6c:   blr     x8
  0x0000ffff684cfa70:   mov     x16, x0
  0x0000ffff684cfa74:   ldr     d16, [sp], #16
  0x0000ffff684cfa78:   ptrue   p7.b
  0x0000ffff684cfa7c:   ldp     x4, x5, [sp, #16]
  ......

Besides the above benefit, when we know what size of register is live,
we could remove the unnecessary caller save in ZGC C2 load barrier stub when we meet C-ABI SOE fp registers.

Passed jtreg with option "-XX:+UseZGC -XX:+ZGenerational" with no failures introduced.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17977/head:pull/17977
$ git checkout pull/17977

Update a local copy of the PR:
$ git checkout pull/17977
$ git pull https://git.openjdk.org/jdk.git pull/17977/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17977

View PR using the GUI difftool:
$ git pr show -t 17977

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17977.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 23, 2024

👋 Welcome back jzhu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 23, 2024
@openjdk
Copy link

openjdk bot commented Feb 23, 2024

@JoshuaZhuwj The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Feb 23, 2024
@mlbridge
Copy link

mlbridge bot commented Feb 23, 2024

Webrevs

@stooart-mon
Copy link
Contributor

That is a welcome change - in light of possibly very large registers, we don't want to save more than is necessary.
It looks OK, but have you run a specific test that demonstrates it working?

@JoshuaZhuwj
Copy link
Member Author

@stooart-mon Thanks for your review!

That is a welcome change - in light of possibly very large registers, we don't want to save more than is necessary. It looks OK, but have you run a specific test that demonstrates it working?

Yes. I wrote several cases to test against this commit to ensure the quality.

TestZGCSpillingAtLoadBarrierStub.java

Assembly and OptoAssembly outputs before the change: https://github.com/JoshuaZhuwj/openjdk_cases/blob/master/8326541/output_before_change.log

Outputs after the change:
https://github.com/JoshuaZhuwj/openjdk_cases/blob/master/8326541/output_after_change.log

@stooart-mon
Copy link
Contributor

Thanks, that helps - I can see you're saving/restoring the correct register lengths. Would it be possible to generate a testcase to test that registers are being saved/restored correctly?

The following is a testcase that is an example of where this testing is done, although in this PR's case it isn't subroutines, but load/store barriers:

4cd3187#diff-949a4a2f889be36be47e9b02b6d6cd1247768953b95a024f649878bac721fa04

@JoshuaZhuwj
Copy link
Member Author

Thanks, that helps - I can see you're saving/restoring the correct register lengths. Would it be possible to generate a testcase to test that registers are being saved/restored correctly?

@stooart-mon I had previously thought about how to write a good test case, but I did not think of a good way at that time. Let me rethink how to handle this gracefully. Thanks :-)

@openjdk
Copy link

openjdk bot commented Mar 13, 2024

@JoshuaZhuwj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers

Reviewed-by: eosterlund, rcastanedalo

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 828 new commits pushed to the master branch:

  • 80b381e: 8329555: Crash in intrinsifying heap-based MemorySegment Vector store/loads
  • 7a89555: 8330844: Add aliases for conditional jumps and additional instruction forms for x86
  • f60798a: 8329222: java.text.NumberFormat (and subclasses) spec updates
  • 2555166: 8329113: Deprecate -XX:+UseNotificationThread
  • 09b8809: 8327289: Remove unused PrintMethodFlushingStatistics option
  • 9cc163a: 8330178: Clean up non-standard use of /** comments in java.base
  • 88a5dce: 8330805: ARM32 build is broken after JDK-8139457
  • 7157eea: 8327290: Remove unused notproduct option TraceInvocationCounterOverflow
  • b4cea70: 8330693: Generational ZGC: Simplify ZAddress::finalizable_good and ZAddress::mark_good
  • 412e306: 8329593: Drop adjustments to target parallelism when virtual threads do I/O on files opened for buffered I/O
  • ... and 818 more: https://git.openjdk.org/jdk/compare/c4409eafc418c1e7a4ca2a2a522b6855c70c0f8c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@fisk, @robcasloz) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@JoshuaZhuwj
Copy link
Member Author

A jtreg test case is created for this commit.
Please let me know if any further comments.
Thanks.

@JoshuaZhuwj
Copy link
Member Author

/label add hotspot-compiler

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Mar 15, 2024
@openjdk
Copy link

openjdk bot commented Mar 15, 2024

@JoshuaZhuwj
The hotspot-compiler label was successfully added.

int expected_number_of_push_pop_at_load_barrier_fregs) throws Exception {
String keyString = keyword + expected_number_of_push_pop_at_load_barrier_fregs + " " + expected_freg_type + " registers";
if (!containOnlyOneOccuranceOfKeyword(stdout, keyString)) {
throw new RuntimeException("Stdout is expected to contain only one occurance of keyString: " + "'" + keyString + "'");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the event of failure, would it be possible to print the erroneous output? The output from the subprocesses, being directly piped in, doesn't lend itself to easy debugging. At first I thought there might be an option that could alter OutputAnalyzers output, but sadly not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the event of failure, would it be possible to print the erroneous output? The output from the subprocesses, being directly piped in, doesn't lend itself to easy debugging. At first I thought there might be an option that could alter OutputAnalyzers output, but sadly not.

Done. Thanks for your comments.

@JoshuaZhuwj
Copy link
Member Author

@stooart-mon Thanks for your review. Please let me know if you have any other comments.
@fisk I would appreciate it if you could share your comments on this change since it follows your previous work done for x86.

Copy link
Contributor

@fisk fisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me and seems to follow a similar design to what I did on x86_64 vectors. Thanks for doing this!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 11, 2024
@JoshuaZhuwj
Copy link
Member Author

Thanks a lot for the review! @fisk

@JoshuaZhuwj JoshuaZhuwj changed the title 8326541: [AArch64] ZGC C2 load barrier stub considers the length of live registers when spilling registers 8326541: [AArch64] ZGC C2 load barrier stub should consider the length of live registers when spilling registers Apr 15, 2024
@JoshuaZhuwj
Copy link
Member Author

/label add hotspot-gc

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Apr 15, 2024
@openjdk
Copy link

openjdk bot commented Apr 15, 2024

@JoshuaZhuwj
The hotspot-gc label was successfully added.

@JoshuaZhuwj
Copy link
Member Author

Waiting for another review.

robcasloz added a commit to robcasloz/jdk that referenced this pull request Apr 23, 2024
Copy link
Contributor

@robcasloz robcasloz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I also tested the changeset on Oracle's internal CI (ZGC tests within tiers 1-7, on Neon machines) with an additional patch (963def0) that forces ZGC read barriers to always take the slow path and clears all vector registers upon the slow path's runtime call. Testing succeeded.

@stooart-mon
Copy link
Contributor

Hello - I have no other comments - looks good.

@JoshuaZhuwj
Copy link
Member Author

Thank you a lot for the reviews! @stooart-mon @fisk @robcasloz

@JoshuaZhuwj
Copy link
Member Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Apr 24, 2024
@openjdk
Copy link

openjdk bot commented Apr 24, 2024

@JoshuaZhuwj
Your change (at version f2960eb) is now ready to be sponsored by a Committer.

@TobiHartmann
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented Apr 24, 2024

Going to push as commit 5c38386.
Since your change was applied there have been 829 commits pushed to the master branch:

  • 438e643: 8329531: compiler/c2/irTests/TestIfMinMax.java fails with IRViolationException: There were one or multiple IR rule failures.
  • 80b381e: 8329555: Crash in intrinsifying heap-based MemorySegment Vector store/loads
  • 7a89555: 8330844: Add aliases for conditional jumps and additional instruction forms for x86
  • f60798a: 8329222: java.text.NumberFormat (and subclasses) spec updates
  • 2555166: 8329113: Deprecate -XX:+UseNotificationThread
  • 09b8809: 8327289: Remove unused PrintMethodFlushingStatistics option
  • 9cc163a: 8330178: Clean up non-standard use of /** comments in java.base
  • 88a5dce: 8330805: ARM32 build is broken after JDK-8139457
  • 7157eea: 8327290: Remove unused notproduct option TraceInvocationCounterOverflow
  • b4cea70: 8330693: Generational ZGC: Simplify ZAddress::finalizable_good and ZAddress::mark_good
  • ... and 819 more: https://git.openjdk.org/jdk/compare/c4409eafc418c1e7a4ca2a2a522b6855c70c0f8c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 24, 2024
@openjdk openjdk bot closed this Apr 24, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Apr 24, 2024
@openjdk
Copy link

openjdk bot commented Apr 24, 2024

@TobiHartmann @JoshuaZhuwj Pushed as commit 5c38386.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@JoshuaZhuwj JoshuaZhuwj deleted the 8326541 branch May 14, 2024 02:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

5 participants