8277180: Intrinsify recursive ObjectMonitor locking for C2 x64 and A64 #6406

Closed
fisk wants to merge 3 commits

Conversation

@fisk fisk commented Nov 16, 2021

The C2 fast_lock and fast_unlock intrinsics don't support recursive ObjectMonitor locking, and some workloads benefit significantly from it. Recent ObjectMonitor work changed the heuristics so that ObjectMonitors are deflated less aggressively, so we can expect more inflated monitors in workloads that previously used mostly stack locks. That in itself is fine, except that C2 doesn't intrinsify the recursive locking paths for ObjectMonitors. Enabling those cases in the C2 code removes a regression of about 17% that we have seen with DaCapo h2 -t 1, and makes a few more benchmarks happy as well.
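
For readers unfamiliar with the recursive case, here is a minimal sketch in plain C++ of what a recursive fast path does. It is a simplified model, not the HotSpot code: `MonitorModel`, `fast_enter`, and `fast_exit` are hypothetical names, and the real intrinsics are emitted as assembly and also deal with waiters and the mark word, which this model leaves out.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical, simplified model of an inflated monitor. The field names
// follow the concepts in this discussion, not HotSpot's real layout.
struct MonitorModel {
    std::atomic<const void*> owner{nullptr};  // owning thread, nullptr if free
    std::intptr_t recursions = 0;             // extra re-entries by the owner
};

// Fast-path enter: returns true on success, false if the caller would
// have to fall back to a slow path (monitor owned by another thread).
inline bool fast_enter(MonitorModel& m, const void* self) {
    const void* expected = nullptr;
    if (m.owner.compare_exchange_strong(expected, self)) {
        return true;                 // uncontended acquire
    }
    if (expected == self) {          // already owned by this thread
        ++m.recursions;              // recursive case: only bump the counter
        return true;
    }
    return false;                    // owned by someone else
}

// Fast-path exit: returns true if handled here, false if a slow path
// would be needed (in this model: the caller is not the owner).
inline bool fast_exit(MonitorModel& m, const void* self) {
    if (m.owner.load() != self) {
        return false;
    }
    if (m.recursions != 0) {         // recursive case: only drop the counter
        --m.recursions;
        return true;
    }
    m.owner.store(nullptr);          // final exit releases the monitor
    return true;
}
```

When a thread that already owns the monitor enters again, only the counter changes and no slow-path call is needed. That is the case this change adds to the C2 intrinsics.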


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8277180: Intrinsify recursive ObjectMonitor locking for C2 x64 and A64

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6406/head:pull/6406
$ git checkout pull/6406

Update a local copy of the PR:
$ git checkout pull/6406
$ git pull https://git.openjdk.java.net/jdk pull/6406/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6406

View PR using the GUI difftool:
$ git pr show -t 6406

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6406.diff

@bridgekeeper bridgekeeper bot commented Nov 16, 2021

👋 Welcome back eosterlund! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Nov 16, 2021

@openjdk openjdk bot commented Nov 16, 2021

@fisk The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler label Nov 16, 2021

@mlbridge mlbridge bot commented Nov 16, 2021

Webrevs

__ br(Assembler::NE, cont);

__ cmp(disp_hdr, (u1)0);
__ br(Assembler::EQ, notRecursive);

@nick-arm nick-arm Nov 17, 2021

You can replace these two with a single __ cbz(disp_hdr, notRecursive) and avoid clobbering the flags.

@fisk fisk Nov 17, 2021

That is a good idea. BTW, note that the unlocking path for AArch64 has an ownership check, while the x86_64 code only has a comment saying we definitely need one; it doesn't actually check the owner. @dholmes-ora did some digging, and it seems this was previously controlled by some ancient sync flag that isn't around anymore. The check would only exist to catch unbalanced JNI locking, and the JNI spec kind of says you shouldn't do that: it's a programmer error. So just not doing the ownership check seems totally fine, and seems to yield 10% better performance in some workloads with contended locking. But I don't want to remove that check as part of this change; it's just something to keep in mind for a future RFE.

__ ldr(tmp, Address(disp_hdr, ObjectMonitor::recursions_offset_in_bytes() - markWord::monitor_value));
__ add(tmp, tmp, 1u);
__ str(tmp, Address(disp_hdr, ObjectMonitor::recursions_offset_in_bytes() - markWord::monitor_value));

@theRealAph theRealAph Nov 17, 2021

Suggested change
- __ ldr(tmp, Address(disp_hdr, ObjectMonitor::recursions_offset_in_bytes() - markWord::monitor_value));
- __ add(tmp, tmp, 1u);
- __ str(tmp, Address(disp_hdr, ObjectMonitor::recursions_offset_in_bytes() - markWord::monitor_value));
+ __ increment(Address(disp_hdr, ObjectMonitor::recursions_offset_in_bytes() - markWord::monitor_value));

@fisk fisk Nov 17, 2021

The increment macro doesn't seem to utilize the fact that 1u can be encoded as an immediate to the add instruction. So it seems to generate worse code here. I'm okay with changing to increment anyway if you prefer that.

@theRealAph theRealAph Nov 17, 2021

> The increment macro doesn't seem to utilize the fact that 1u can be encoded as an immediate to the add instruction.

Sure it does. Try it. If it doesn't, we'll change increment()! 😁

@fisk fisk Nov 17, 2021

Oh yeah look at that. I disassembled it and it did the right thing. Thanks for the suggestion.
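
As background on the immediate encoding discussed in this thread: the AArch64 ADD/SUB (immediate) instructions take a 12-bit unsigned immediate, optionally shifted left by 12 bits, so a constant like 1 never needs a scratch register. The helper below is a hypothetical illustration of that rule (`fits_add_sub_immediate` is not a HotSpot function) and is not meant to show what increment() does internally.

```cpp
#include <cstdint>

// Returns true if `imm` can be encoded directly in an AArch64
// ADD/SUB (immediate) instruction: a 12-bit unsigned value,
// optionally shifted left by 12 bits.
constexpr bool fits_add_sub_immediate(std::uint64_t imm) {
    constexpr std::uint64_t low12 = 0xFFF;
    return (imm & ~low12) == 0 ||          // plain imm12
           (imm & ~(low12 << 12)) == 0;    // imm12, LSL #12
}
```

Under this rule `fits_add_sub_immediate(1)` holds, which matches what the disassembly showed: the add of 1 is emitted with an immediate operand.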

@theRealAph theRealAph left a comment

Normally I would hate any code added to our hand-carved assembler sequences, but even I have to admit that this surprisingly simple addition is worthwhile.

@openjdk openjdk bot commented Nov 17, 2021

@fisk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8277180: Intrinsify recursive ObjectMonitor locking for C2 x64 and A64

Reviewed-by: aph, ngasson

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 152 new commits pushed to the master branch:

  • b8453eb: 8275007: Java fails to start with null charset if LC_ALL is set to certain locales
  • 231fb61: 8276970: Default charset for PrintWriter that wraps PrintStream
  • 29e552c: 8272358: Some tests may fail when executed with other locales than the US
  • ce4471f: 8277346: ProblemList 7 serviceability/sa tests on macosx-x64
  • 45a60db: 8277045: G1: Remove unnecessary set_concurrency call in G1ConcurrentMark::weak_refs_work
  • 6bb0462: 8277224: sun.security.pkcs.PKCS9Attributes.toString() throws NPE
  • d8c0280: 8277316: ciReplay: dump_replay_data is not thread-safe
  • 007ad7c: 8277303: Terminology mismatch between JLS17-3.9 and SE17's javax.lang.model.SourceVersion method specs
  • 8881f29: 8277310: ciReplay: @CPI MethodHandle references not resolved
  • 262d070: 8277246: Check for NonRepudiation as well when validating a TSA certificate
  • ... and 142 more: https://git.openjdk.java.net/jdk/compare/a74a839af02446d322d77c6e546e652ec6ad5d73...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Nov 17, 2021

@fisk fisk commented Nov 17, 2021

Thanks for the review @theRealAph and @nick-arm
Think I need someone to review the x86 code as well.

@nick-arm nick-arm left a comment

AArch64 changes LGTM.

@fisk fisk commented Nov 18, 2021

Any takers for the x86_64 code?

@theRealAph theRealAph commented Nov 18, 2021

> Any takers for the x86_64 code?

Sure, and as far as I know no-one took away my x86 programmer's badge yet. LGTM.

@fisk fisk commented Nov 18, 2021

> > Any takers for the x86_64 code?
>
> Sure, and as far as I know no-one took away my x86 programmer's badge yet. LGTM.

Thanks Andrew. I think we can trust your x86 skills as well. :-)

@fisk fisk commented Nov 18, 2021

/integrate

@openjdk openjdk bot commented Nov 18, 2021

Going to push as commit d93b238.
Since your change was applied there have been 163 commits pushed to the master branch:

  • 00c388b: 8259643: ZGC can return metaspace OOM prematurely
  • a44b45f: 4337793: Mark non-serializable fields of java.security.cert.Certificate and CertPath
  • b3a62b4: 8276795: Deprecate seldom used CDS flags
  • 38345bd: 8277137: Set OnSpinWaitInst/OnSpinWaitInstCount defaults to "isb"/1 for Arm Neoverse N1
  • 2c06bca: 8266368: Inaccurate after_unwind hook in C2 exception handler
  • 77cc508: 8277215: Remove redundancy in ReferenceProcessor constructor args
  • 0a65e8b: 8276794: Change nested classes in java.desktop to static nested classes
  • db55f92: 8277343: dynamicArchive/SharedArchiveFileOption.java failed: '-XX:+RecordDynamicDumpInfo is unsupported when a dynamic CDS archive is specified in -XX:SharedArchiveFile:' missing
  • 2f4b540: 8276314: [JVMCI] check alignment of call displacement during code installation
  • 9160743: 8276058: Some swing test fails on specific CI macos system
  • ... and 153 more: https://git.openjdk.java.net/jdk/compare/a74a839af02446d322d77c6e546e652ec6ad5d73...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Nov 18, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Nov 18, 2021

@openjdk openjdk bot commented Nov 18, 2021

@fisk Pushed as commit d93b238.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-compiler integrated
3 participants