8326385: [aarch64] C2: lightweight locking nodes kill the box register without specifying this effect #18183

Closed

Conversation

robcasloz
Contributor

@robcasloz robcasloz commented Mar 11, 2024

This changeset introduces a third TEMP register for the intermediate computations in the cmpFastLockLightweight and cmpFastUnlockLightweight aarch64 ADL instructions, instead of using the legacy box register. This prevents potential overwrites, and consequent erroneous uses, of box.

Introducing a new TEMP seems conceptually simpler (and not necessarily worse from a performance perspective) than pre-assigning box an arbitrary register and marking it as USE_KILL, an alternative also suggested in the JBS issue description. Compared to mainline, the changeset does not lead to any statistically significant regression in a set of locking-intensive benchmarks from DaCapo, Renaissance, SPECjvm2008, and SPECjbb2015.
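The problem described above can be illustrated with a small toy model. This is only a sketch and not HotSpot code: the class `EffectModel`, its `Reg` enum, and the exact sets of registers written are assumptions made for illustration. It checks whether an instruction writes any register that it has not declared as clobbered, which is the mismatch this changeset removes.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model (hypothetical, not HotSpot code) of declared versus actual register writes. */
final class EffectModel {
    enum Reg { OBJECT, BOX, TMP, TMP2, TMP3 }

    /** An instruction with the registers it declares as clobbered and the ones it really writes. */
    record Instr(String name, Set<Reg> declaredClobbers, Set<Reg> actualWrites) {
        /** Returns the registers that are written without a matching declaration. */
        Set<Reg> undeclaredWrites() {
            Set<Reg> result = EnumSet.noneOf(Reg.class);
            result.addAll(actualWrites);
            result.removeAll(declaredClobbers);
            return result;
        }
    }

    /** Assumed shape before the change: box is used as scratch, only tmp and tmp2 are declared. */
    static Instr before() {
        return new Instr("cmpFastLockLightweight (before)",
                EnumSet.of(Reg.TMP, Reg.TMP2),
                EnumSet.of(Reg.TMP, Reg.TMP2, Reg.BOX));
    }

    /** Assumed shape after the change: a third TEMP replaces box as the scratch register. */
    static Instr after() {
        return new Instr("cmpFastLockLightweight (after)",
                EnumSet.of(Reg.TMP, Reg.TMP2, Reg.TMP3),
                EnumSet.of(Reg.TMP, Reg.TMP2, Reg.TMP3));
    }

    public static void main(String[] args) {
        for (Instr instr : new Instr[] { before(), after() }) {
            System.out.println(instr.name() + " undeclared writes: " + instr.undeclaredWrites());
        }
    }
}
```

In this model a non-empty result means the register allocator could keep a live value in a register that the instruction silently overwrites.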

Testing

  • tier1-7 (linux-aarch64 and macosx-aarch64) with -XX:LockingMode=2.
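For a quick local smoke test of the same locking mode, a minimal sketch such as the following could be used. The class `LockStress` and its parameters are hypothetical and are not part of the tier1-7 test suites; the program only exercises a contended `synchronized` block and gives no guarantee that C2 compiles the hot loop or that the affected instructions are emitted.

```java
/** Hypothetical smoke test: several threads increment a counter under one monitor. */
final class LockStress {
    private static final Object LOCK = new Object();
    private static long counter;

    /** Runs the given number of threads and returns the final counter value. */
    static long run(int threads, int itersPerThread) {
        counter = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < itersPerThread; i++) {
                    synchronized (LOCK) {
                        counter++;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        synchronized (LOCK) {
            return counter;
        }
    }

    public static void main(String[] args) {
        // With correct locking the result equals threads * itersPerThread.
        System.out.println(run(4, 1_000_000));
    }
}
```

It could be launched with `java -XX:LockingMode=2 LockStress.java`. Depending on the JDK release, the flag may first need to be unlocked as an experimental option.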

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8326385: [aarch64] C2: lightweight locking nodes kill the box register without specifying this effect (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18183/head:pull/18183
$ git checkout pull/18183

Update a local copy of the PR:
$ git checkout pull/18183
$ git pull https://git.openjdk.org/jdk.git pull/18183/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 18183

View PR using the GUI difftool:
$ git pr show -t 18183

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18183.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Mar 11, 2024

👋 Welcome back rcastanedalo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Mar 11, 2024

@robcasloz The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Mar 11, 2024
@robcasloz robcasloz marked this pull request as ready for review March 11, 2024 10:43
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 11, 2024
@mlbridge

mlbridge bot commented Mar 11, 2024

Webrevs

Member

@xmas92 xmas92 left a comment

lgtm.

>   • tier1-7 (linux-aarch64 and macosx-x64) with -XX:LockingMode=2.

Guess it meant to say macosx-aarch64

@openjdk

openjdk bot commented Mar 11, 2024

@robcasloz This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8326385: [aarch64] C2: lightweight locking nodes kill the box register without specifying this effect

Reviewed-by: aboldtch, dlong

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 67 new commits pushed to the master branch:

  • 22f10e0: 8327856: Convert applet test SpanishDiacriticsTest.java to a main program
  • 7283c8b: 8327972: Convert java/awt/FileDialog/SaveFileNameOverrideTest/SaveFileNameOverrideTest.html applet test to main
  • 30249c4: 8327838: Convert java/awt/FileDialog/MultipleMode/MultipleMode.html applet test to main
  • 94b4ed5: 8327096: (fc) java/nio/channels/FileChannel/Size.java fails on partition incapable of creating large files
  • b9c3dc3: 8327738: Remove unused internal method sun.n.w.p.h.HttpURLConnection.setDefaultAuthenticator
  • 5b41466: 8327729: Remove deprecated xxxObject methods from jdk.internal.misc.Unsafe
  • 313e814: 8324682: Remove redefinition of NULL for XLC compiler
  • 8a3bdd5: 8327995: Remove unused Unused_Variable
  • 201042f: 8327487: Further augment WorstCaseTests with more cases
  • 379ad1f: 8312444: Delete unused parameters and variables in SocketPermission
  • ... and 57 more: https://git.openjdk.org/jdk/compare/f54e59835492e86b9178b2050901579707f41100...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 11, 2024
@robcasloz
Contributor Author

> lgtm.

Thanks for reviewing, Axel!

>   • tier1-7 (linux-aarch64 and macosx-x64) with -XX:LockingMode=2.
>
> Guess it meant to say macosx-aarch64

Right, good catch, updated in the description.

Contributor

@rkennke rkennke left a comment

I've got a question.
Also, what about the other arches?

@@ -16019,33 +16019,33 @@ instruct cmpFastUnlock(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRe
ins_pipe(pipe_serial);
%}

-instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2)
+instruct cmpFastLockLightweight(rFlagsReg cr, iRegP object, iRegP box, iRegPNoSp tmp, iRegPNoSp tmp2, iRegPNoSp tmp3)
Contributor

Do we need to specify the box register at all, if we never use it? It means that the register allocator assigns an actual register to it, right? This could be a problem in workloads that are both locking-intensive and with high register pressure. You may just not see it with dacapo, etc, because aarch64 has so many registers to begin with.

Contributor Author

There is, unfortunately, no ADL construction to specify that an operand such as box is not used at all. Yes, the register allocator will assign a register to box. Avoiding this would require fairly intrusive changes in C2 (add lightweight locking-specific, single-input versions of the FastLock and FastUnlock nodes and adapt all the C2 logic that deals with them), which I think would be best addressed in a separate RFE (assuming the additional register pressure is a problem in practice). Furthermore, to my understanding, the box operand is likely to be needed again in the context of Lilliput's OMWorld sub-project.

@xmas92
Member

xmas92 commented Mar 11, 2024

> Also, what about the other arches?

RISC-V and x64/x86 both bind the box to a specific register, so it can be (and is) specified as USE_KILL.

PPC64 (and aarch64 after this PR) use an extra register allocation and do not kill the box.

> Do we need to specify the box register at all, if we never use it?

I believe that would require rewriting large parts of the C2 FastLockNode. It is modelled as a CmpNode.
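The trade-off between the two approaches mentioned in this comment can be sketched with a toy comparison. This is not HotSpot code: the class `BoxStrategies`, the `Operand` record, and the operand list for the USE_KILL variant are assumptions for illustration, and other architectures may declare a different number of temporaries.

```java
import java.util.List;

/** Toy comparison (hypothetical, not HotSpot code) of the two ways to handle box. */
final class BoxStrategies {
    enum Effect { USE, TEMP, USE_KILL }

    /** One operand: its name, declared effect, and whether it is bound to a fixed register. */
    record Operand(String name, Effect effect, boolean fixedRegister) {}

    /** Extra-TEMP approach: box stays a plain input and a fresh scratch register is added. */
    static List<Operand> extraTemp() {
        return List.of(
                new Operand("object", Effect.USE, false),
                new Operand("box", Effect.USE, false),
                new Operand("tmp", Effect.TEMP, false),
                new Operand("tmp2", Effect.TEMP, false),
                new Operand("tmp3", Effect.TEMP, false));
    }

    /** USE_KILL approach (assumed shape): box is bound to one register and declared destroyed. */
    static List<Operand> useKillBox() {
        return List.of(
                new Operand("object", Effect.USE, false),
                new Operand("box", Effect.USE_KILL, true),
                new Operand("tmp", Effect.TEMP, false),
                new Operand("tmp2", Effect.TEMP, false));
    }

    /** Number of registers the instruction occupies while it executes. */
    static int registersHeld(List<Operand> operands) {
        return operands.size();
    }

    /** True if the named operand is a plain input whose value is still valid afterwards. */
    static boolean survives(List<Operand> operands, String name) {
        return operands.stream()
                .anyMatch(o -> o.name().equals(name) && o.effect() == Effect.USE);
    }

    public static void main(String[] args) {
        System.out.println("extra TEMP: " + registersHeld(extraTemp())
                + " registers, box survives: " + survives(extraTemp(), "box"));
        System.out.println("USE_KILL:   " + registersHeld(useKillBox())
                + " registers, box survives: " + survives(useKillBox(), "box"));
    }
}
```

In this simplified view the extra-TEMP variant occupies one more register but leaves box intact, while the USE_KILL variant occupies fewer registers at the cost of pinning box to a fixed register and destroying its value.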

%{
predicate(LockingMode == LM_LIGHTWEIGHT);
match(Set cr (FastLock object box));
-effect(TEMP tmp, TEMP tmp2);
+effect(TEMP tmp, TEMP tmp2, TEMP tmp3);
Member

Why not use box as the temp instead of introducing a separate temp?

Suggested change
-effect(TEMP tmp, TEMP tmp2, TEMP tmp3);
+effect(TEMP tmp, TEMP tmp2, TEMP box);

Member

Making an input TEMP will crash inside C2 when doing register allocation.

assert(opcnt < numopnds) failed: Accessing non-existent operand

V  [libjvm.dylib+0xcc650c]  MachNode::in_RegMask(unsigned int) const+0x1f0
V  [libjvm.dylib+0x3d136c]  PhaseChaitin::gather_lrg_masks(bool)+0x1130
V  [libjvm.dylib+0x3cef9c]  PhaseChaitin::Register_Allocate()+0x150
V  [libjvm.dylib+0x4c3860]  Compile::Code_Gen()+0x1f4
V  [libjvm.dylib+0x4c17a4]  Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1388
V  [libjvm.dylib+0x38abf0]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1e0
V  [libjvm.dylib+0x4df028]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0x854
V  [libjvm.dylib+0x4de46c]  CompileBroker::compiler_thread_loop()+0x348
V  [libjvm.dylib+0x8c10e4]  JavaThread::thread_main_inner()+0x1dc
V  [libjvm.dylib+0x117f7f8]  Thread::call_run()+0xf4
V  [libjvm.dylib+0xe53724]  thread_native_entry(Thread*)+0x138
C  [libsystem_pthread.dylib+0x7034]  _pthread_start+0x88

Maybe this can be resolved and support for TEMP input registers can be added.

@robcasloz
Contributor Author

Thanks for reviewing, Axel and Dean! And thanks Axel for trying out Dean's suggestion!

@robcasloz
Contributor Author

/integrate

@openjdk

openjdk bot commented Mar 13, 2024

Going to push as commit 07acc0b.
Since your change was applied there have been 73 commits pushed to the master branch:

  • cc9a8ab: 8327460: Compile tests with the same visibility rules as product code
  • 3b18c5d: 8323605: Java source launcher should not require --source ... to enable preview
  • 5d4bfad: 8327693: C1: LIRGenerator::_instruction_for_operand is only read by assertion code
  • f3d0c45: 8327829: [JVMCI] runtime/ClassUnload/ConstantPoolDependsTest.java fails on libgraal
  • d5b95a0: 8327631: Update IANA Language Subtag Registry to Version 2024-03-07
  • 966a42f: 8324868: debug agent does not properly handle interrupts of a virtual thread
  • 22f10e0: 8327856: Convert applet test SpanishDiacriticsTest.java to a main program
  • 7283c8b: 8327972: Convert java/awt/FileDialog/SaveFileNameOverrideTest/SaveFileNameOverrideTest.html applet test to main
  • 30249c4: 8327838: Convert java/awt/FileDialog/MultipleMode/MultipleMode.html applet test to main
  • 94b4ed5: 8327096: (fc) java/nio/channels/FileChannel/Size.java fails on partition incapable of creating large files
  • ... and 63 more: https://git.openjdk.org/jdk/compare/f54e59835492e86b9178b2050901579707f41100...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 13, 2024
@openjdk openjdk bot closed this Mar 13, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 13, 2024
@openjdk

openjdk bot commented Mar 13, 2024

@robcasloz Pushed as commit 07acc0b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
