8292059: Do not inline InstanceKlass::allocate_instance() #12782
👋 Welcome back afshin-zafari! A progress list of the required criteria for merging this PR into |
@afshin-zafari The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
That's an odd graph. If everything is measuring total time (in the different phases), then a bar chart with bars originating from zero would be less confusing. Labeling the units of measure would also help. What workloads have been used to collect the statistics? |
A new chart has been uploaded. The vertical axis unit is milliseconds. |
So if I'm reading this correctly then by removing the inlining we see a reduction in performance on some GC metrics on some unspecified workload by 5-10%? I'd say that's a good data point in favor of keeping this as-is. @iklam might want to comment on this, but it seems to me this might be a poor trade-off unless there are compelling wins elsewhere (binary size reduction, dramatic compilation time reductions) |
I don't know about these numbers but if this is not a neutral change for performance based on looking at the code and callers, I don't know what is. |
I'm also skeptical as to the relevance of these numbers. I'm not against removing this inlining. I did it as a minor part of another startup optimization after checking that the inlining was size-neutral and noticing that this meant a speed-up in interpreter and C1 for some relatively common JNI calls. It might still be marginally beneficial there, but if there's a measurable impact on compilation time and header complexity then who am I to object. |
I have one suggested change but you should point out that the parameter change is an optimization that replaces some of the inlined benefits.
There are a lot of changes to includes that need to be restored or cleaned up. I've marked them below:
What is the motivation to change the parameter from oop java_class to InstanceKlass*? The call sites are now much noisier and harder to read.
This is changed to gain performance when this function is called many times. |
Could you explain how that helps the performance? |
There was a small performance loss from making the static InstanceKlass::allocate_instance function not inlined. The experiment of effectively inlining the conversion from oop java_class to InstanceKlass regained this performance in Afshin's testing. I agree that the parameter change is messy, and I agree with Claes that we should have a utility function in instanceKlass.inline.hpp. Maybe call it from_class, like: InstanceKlass* InstanceKlass::from_class(oop java_class) { ... } |
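To make the suggestion concrete, here is a minimal, self-contained sketch of the pattern being discussed: a cheap inline conversion helper combined with an allocation routine that is not inlined. It is not HotSpot code. Klass, ClassMirror, and InstanceKlass below are simplified stand-ins, ClassMirror plays the role of oop java_class, from_class only borrows the name proposed above, and the error handling is an assumption made for illustration.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-in types for illustration only; these are NOT the HotSpot definitions.
struct Klass {
  bool is_instance_klass;
  explicit Klass(bool instance) : is_instance_klass(instance) {}
};

// Plays the role of `oop java_class`: a handle that refers to a Klass.
struct ClassMirror {
  Klass* klass;
};

struct InstanceKlass : Klass {
  int instances_allocated = 0;
  InstanceKlass() : Klass(true) {}

  // Hypothetical helper, named after the suggestion in the comment above.
  static InstanceKlass* from_class(const ClassMirror& java_class);

  // Heavier path; in a multi-file build this would be defined in the .cpp file.
  int allocate_instance();
};

// The conversion is small, so it is the part kept inline (it would sit in an
// .inline.hpp-style header). The validation here is an assumed simplification.
inline InstanceKlass* InstanceKlass::from_class(const ClassMirror& java_class) {
  Klass* k = java_class.klass;
  if (k == nullptr || !k->is_instance_klass) {
    throw std::invalid_argument("mirror does not denote an instantiable class");
  }
  return static_cast<InstanceKlass*>(k);
}

// Defined out of line: callers only need the declaration to use it.
// The counter stands in for real allocation work.
int InstanceKlass::allocate_instance() {
  return ++instances_allocated;
}

// A call site stays short: convert inline, then make one out-of-line call.
int allocate_from_mirror(const ClassMirror& java_class) {
  return InstanceKlass::from_class(java_class)->allocate_instance();
}
```

With a helper of this shape, call sites can keep passing the class mirror, and only the small conversion has to be visible in a header.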
That's really surprising. I also don't see how any of the proposed changes could affect the GC so much. This makes me suspicious of the performance claims. Could you redo the benchmarking and give us more information about:
|
FWIW we had a discussion about this earlier and I'm as skeptical as you about the results listed in this PR's description. I also pointed to the microbenchmark -
This was before the |
You need to revert the changes in jni.cpp and instanceKlass.hpp.
They are reverted. |
I don't oppose this change.
@afshin-zafari This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 338 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@coleenp, @stefank) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
Great. This looks good. We'll wait for GHA and integrate.
Thank you all reviewers for your comments on this PR. |
@afshin-zafari |
/sponsor |
Going to push as commit cb4ae19.
Your commit was automatically rebased without conflicts. |
@coleenp @afshin-zafari Pushed as commit cb4ae19. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The inlined and non-inlined versions of the method were tested to compare their performance.
Test
make test TEST=micro:Capture0.lambda_01 MICRO="VM_OPTIONS=-XX:TieredStopAtLevel=1"
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12782/head:pull/12782
$ git checkout pull/12782
Update a local copy of the PR:
$ git checkout pull/12782
$ git pull https://git.openjdk.org/jdk pull/12782/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12782
View PR using the GUI difftool:
$ git pr show -t 12782
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12782.diff