Skip to content

8292059: Do not inline InstanceKlass::allocate_instance() #12782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

afshin-zafari
Copy link
Contributor

@afshin-zafari afshin-zafari commented Feb 28, 2023

The inline and not-inline versions of the method is tested to compare the performance difference.

Test

make test TEST=micro:Capture0.lambda_01 MICRO="VM_OPTIONS=-XX:TieredStopAtLevel=1"


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8292059: Do not inline InstanceKlass::allocate_instance()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12782/head:pull/12782
$ git checkout pull/12782

Update a local copy of the PR:
$ git checkout pull/12782
$ git pull https://git.openjdk.org/jdk pull/12782/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 12782

View PR using the GUI difftool:
$ git pr show -t 12782

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12782.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 28, 2023

👋 Welcome back afshin-zafari! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 28, 2023
@openjdk
Copy link

openjdk bot commented Feb 28, 2023

@afshin-zafari The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Feb 28, 2023
@mlbridge
Copy link

mlbridge bot commented Feb 28, 2023

@cl4es
Copy link
Member

cl4es commented Feb 28, 2023

That's an odd graph. If everything is measuring total time (in the different phases) then a bar chart with bars originating from zero would be less confusing. Labeling units of measure would also help.

What workloads has been used to collect the statistics?

@afshin-zafari
Copy link
Contributor Author

New chart is uploaded. Vertical axis unit is milliseconds.

@cl4es
Copy link
Member

cl4es commented Feb 28, 2023

So if I'm reading this correctly then by removing the inlining we see a reduction in performance on some GC metrics on some unspecified workload by 5-10%? I'd say that's a good data point in favor of keeping this as-is. @iklam might want to comment on this, but it seems to me this might be a poor trade-off unless there are compelling wins elsewhere (binary size reduction, dramatic compilation time reductions)

@coleenp
Copy link
Contributor

coleenp commented Mar 1, 2023

I don't know about these numbers but if this is not a neutral change for performance based on looking at the code and callers, I don't know what is.

@cl4es
Copy link
Member

cl4es commented Mar 1, 2023

I'm also skeptical as to the relevance of these numbers. allocate_instance(oop, TRAPS) is nowhere near GC code, so my guess is that there's run-to-run variation on whatever the workload is here.

I'm not against removing this inlining. I did it as a minor part of another startup optimization after checking that the inlining was size-neutral and noticing that this meant a speed-up in interpreter and C1 for some relatively common JNI calls. It might still be marginally beneficial there, but if there's a measurable impact on compilation time and header complexity then who am I to object.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one suggested change but you should point out that the parameter change is an optimization that replaces some of the inlined benefits.

Copy link
Member

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of changes to includes that needs to be restored / cleaned. I've marked them below:

Copy link
Member

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation to change the parameter from oop java_class to InstanceKlass*? The call sites are now much noisier and harder to read.

@afshin-zafari
Copy link
Contributor Author

What is the motivation to change the parameter from oop java_class to InstanceKlass*? The call sites are now much noisier and harder to read.

This is changed to gain performance when this function is called many times.

@stefank
Copy link
Member

stefank commented Mar 9, 2023

What is the motivation to change the parameter from oop java_class to InstanceKlass*? The call sites are now much noisier and harder to read.

This is changed to gain performance when this function is called many times.

Could you explain how that helps the performance?

@coleenp
Copy link
Contributor

coleenp commented Mar 9, 2023

There was a small performance difference (loss) with making the static InstanceKlass::allocate_instance function not inlined. The experiment to effectively inline the conversion from oop java_class to InstanceKlass regained this performance in Afshin's testing. I agree the parameter change is messy and with Claes that we should have a utility function - in instanceKlass.inline.hpp. Maybe call it from_class like:

InstanceKlass* InstanceKlass::from_class(oop java_class) { ... }

@stefank
Copy link
Member

stefank commented Mar 9, 2023

That's really surprising. I also don't see how any of the proposed changes could affect the GC so much. This makes me suspicious of the performance claims.

Could you redo the benchmarking and give us more information about:

  1. What benchmarks were run
  2. What was the benchmarks scores and GC metrics
  3. What was the run-to-run variance in the scores and metrics

@cl4es
Copy link
Member

cl4es commented Mar 10, 2023

That's really surprising. I also don't see how any of the proposed changes could affect the GC so much. This makes me suspicious of the performance claims.

Could you redo the benchmarking and give us more information about:

  1. What benchmarks were run
  2. What was the benchmarks scores and GC metrics
  3. What was the run-to-run variance in the scores and metrics

FWIW we had a discussion about this earlier and I'm as skeptical as you about the results listed in this PRs description. I also pointed to the microbenchmark - ./test/micro/org/openjdk/bench/vm/lambda/capture/Capture0.java - which I used originally to motivate the move of allocate_instance(oop, TRAPS) to inline.hpp, ran the numbers and saw that there's no longer any discernible effect from moving it back to .cpp:

Benchmark           Mode  Cnt   Score   Error  Units
Capture0.lambda_01  avgt   12  65,492 ± 1,675  ns/op # master
Capture0.lambda_01  avgt   12  65,022 ± 1,241  ns/op # pr/12782

This was before the oop -> InstanceKlass* change. @afshin-zafari then ran that experiment and saw a small win on that microbenchmark from doing parameter change. While the win is probably too small to matter I didn't find it objectionable since it seems reasonable to move oop conversions earlier.

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to revert the changes in jni.cpp and instanceKlass.hpp.

@afshin-zafari
Copy link
Contributor Author

You need to revert the changes in jni.cpp and instanceKlass.hpp.

They are reverted.

Copy link
Member

@stefank stefank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't oppose this change.

@openjdk
Copy link

openjdk bot commented Mar 15, 2023

@afshin-zafari This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8292059: Do not inline InstanceKlass::allocate_instance()

Reviewed-by: coleenp, stefank

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 338 new commits pushed to the master branch:

  • 9d518c5: 8299375: [PPC64] GetStackTraceSuspendedStressTest tries to deoptimize frame with invalid fp
  • ebac7ee: 8304063: tools/jpackage/share/AppLauncherEnvTest.java fails when checking LD_LIBRARY_PATH
  • 36995c5: 8304242: CPUInfoTest fails because "serialize" CPU feature is not known
  • 6b42275: 7154070: in SwingSet2, switching between LaFs it's easy to lose JTable dividers
  • 8eed7de: 8304146: Refactor VisibleMemberTable (LocalMemberTable)
  • a487a27: 8303150: DCmd framework unnecessarily creates a DCmd instance on registration
  • 2e987d7: 8304360: Test to ensure ConstantDescs fields work
  • 2f23c80: 8304225: Remove javax/script/Test7.java from ProblemList
  • 96889bf: 8191565: Last-ditch Full GC should also move humongous objects
  • f629152: 8304055: G1: Remove OldGCAllocRegion::release
  • ... and 328 more: https://git.openjdk.org/jdk/compare/60e637892576792f663a25b8a949e39c29accd47...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@coleenp, @stefank) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 15, 2023
Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. This looks good. We'll wait for GHA and integrate.

@afshin-zafari
Copy link
Contributor Author

Thank you all reviewers for your comments on this PR.
/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 17, 2023
@openjdk
Copy link

openjdk bot commented Mar 17, 2023

@afshin-zafari
Your change (at version da1a133) is now ready to be sponsored by a Committer.

@coleenp
Copy link
Contributor

coleenp commented Mar 17, 2023

/sponsor

@openjdk
Copy link

openjdk bot commented Mar 17, 2023

Going to push as commit cb4ae19.
Since your change was applied there have been 340 commits pushed to the master branch:

  • b2639e1: 8304164: jdk/classfile/CorpusTest.java still fails after JDK-8303910
  • 620564a: 8304130: Add runtime/StackGuardPages/TestStackGuardPagesNative.java to ProblemList.txt
  • 9d518c5: 8299375: [PPC64] GetStackTraceSuspendedStressTest tries to deoptimize frame with invalid fp
  • ebac7ee: 8304063: tools/jpackage/share/AppLauncherEnvTest.java fails when checking LD_LIBRARY_PATH
  • 36995c5: 8304242: CPUInfoTest fails because "serialize" CPU feature is not known
  • 6b42275: 7154070: in SwingSet2, switching between LaFs it's easy to lose JTable dividers
  • 8eed7de: 8304146: Refactor VisibleMemberTable (LocalMemberTable)
  • a487a27: 8303150: DCmd framework unnecessarily creates a DCmd instance on registration
  • 2e987d7: 8304360: Test to ensure ConstantDescs fields work
  • 2f23c80: 8304225: Remove javax/script/Test7.java from ProblemList
  • ... and 330 more: https://git.openjdk.org/jdk/compare/60e637892576792f663a25b8a949e39c29accd47...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 17, 2023
@openjdk openjdk bot closed this Mar 17, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Mar 17, 2023
@openjdk
Copy link

openjdk bot commented Mar 17, 2023

@coleenp @afshin-zafari Pushed as commit cb4ae19.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants