8300002: Performance regression caused by non-inlined hot methods due to post call noop instructions#11958

Closed
kuksenko wants to merge 8 commits into openjdk:master from kuksenko:JDK-8300002

Conversation

@kuksenko
Contributor

@kuksenko kuksenko commented Jan 12, 2023

Post call nop instructions increase the size of methods, which leads to different inline decisions and performance regression.
Restore inline behavior by excluding post call nop instructions sizes from inline heuristics.
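
The effect described above can be illustrated with a minimal sketch. It is not the code from this PR: `CompiledSizeInfo`, `too_big_to_inline` and the `small_code_limit` parameter are hypothetical names used only for illustration, and only `inline_instructions_size` echoes a name discussed in this conversation. The idea is that the size used for inlining decisions is the emitted code size minus the bytes taken by post call nops.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the size bookkeeping of a compiled method.
struct CompiledSizeInfo {
    std::size_t total_code_bytes;     // all emitted instruction bytes
    std::size_t post_call_nop_bytes;  // bytes occupied by post call nops
};

// Size as seen by the inlining heuristic: post call nops are not counted.
inline std::size_t inline_instructions_size(const CompiledSizeInfo& info) {
    assert(info.post_call_nop_bytes <= info.total_code_bytes);
    return info.total_code_bytes - info.post_call_nop_bytes;
}

// Hypothetical check: a callee whose compiled code exceeds the limit
// is not inlined. The limit is a parameter here, not a real default.
inline bool too_big_to_inline(const CompiledSizeInfo& info,
                              std::size_t small_code_limit) {
    return inline_instructions_size(info) > small_code_limit;
}
```

With a limit of 1000 bytes, a method with 1040 bytes of code of which 80 are post call nops stays inlinable in this sketch, while the same method measured with the nops included would not.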


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8300002: Performance regression caused by non-inlined hot methods due to post call noop instructions

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11958/head:pull/11958
$ git checkout pull/11958

Update a local copy of the PR:
$ git checkout pull/11958
$ git pull https://git.openjdk.org/jdk pull/11958/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11958

View PR using the GUI difftool:
$ git pr show -t 11958

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11958.diff

@kuksenko kuksenko changed the title from "adjust inlining heuristic - do not count post call noop" to "8300002: Performance regression caused by non-inlined hot methods due to post call noop instructions" on Jan 12, 2023
@bridgekeeper

bridgekeeper bot commented Jan 12, 2023

👋 Welcome back skuksenko! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jan 12, 2023

@kuksenko The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jan 12, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 12, 2023
@mlbridge

mlbridge bot commented Jan 12, 2023

Webrevs

Member

@TobiHartmann TobiHartmann left a comment

Please add a summary to the PR description.

If you update the last usage in ciMethod::has_compiled_code(), it looks like ciMethod::instructions_size is not needed anymore and can be removed. Alternatively, you could just change its semantics to exclude nops (please also update the method's comments).

@zhengxiaolinX
Contributor

Hi Sergey,

Would you mind including the RISC-V part as well? Thank you! (no need to add a contributor for this)

By the way, GHA seems to report that assert(assm->inst_mark() == NULL, "overlapping instructions"); may need some changes: x86 and aarch64 have an InstructionMark in post_call_nop(), but ppc and riscv do not always have one.

rv.txt

Thanks,
Xiaolin

Contributor

@iwanowww iwanowww left a comment

As an alternative way to count nops, have you considered iterating over relocations, counting post_call_nop_Relocations, and then multiplying the count by NativePostCallNop::instruction_size? (So far, there's no NativePostCallNop::instruction_size declared on aarch64 and ppc, but that's the common pattern in nativeInst_*.hpp.)

BTW I'm curious why there's no relocation registered on PPC while both x86 and aarch64 have it.
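
The relocation-based alternative suggested above could look roughly like the following sketch. `RelocKind`, `RelocEntry` and `post_call_nop_bytes` are hypothetical stand-ins, not HotSpot's relocation API, and the per-nop size is passed in as a parameter because it is platform specific.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical relocation kinds; real relocation types are not modeled.
enum class RelocKind { post_call_nop, static_call, oop, other };

// Hypothetical relocation record: a kind plus the code offset it refers to.
struct RelocEntry {
    RelocKind kind;
    std::size_t offset;
};

// Count the post call nop relocations and multiply by the fixed size
// of one post call nop instruction on the target platform.
inline std::size_t post_call_nop_bytes(const std::vector<RelocEntry>& relocs,
                                       std::size_t nop_instruction_size) {
    assert(nop_instruction_size > 0);
    std::size_t count = 0;
    for (const RelocEntry& r : relocs) {
        if (r.kind == RelocKind::post_call_nop) {
            ++count;
        }
    }
    return count * nop_instruction_size;
}
```

This approach only works if every post call nop has a relocation registered and has a single fixed size, which is the dependency raised in the replies.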

@kuksenko
Contributor Author

kuksenko commented Jan 13, 2023

Xiaolin,
It was my fault. That assert shouldn't be there.
Removed.

@kuksenko
Contributor Author

If you update the last usage in ciMethod::has_compiled_code(), it looks like ciMethod::instructions_size is not needed anymore and can be removed. Alternatively, you could just change its semantics to exclude nops (please also update the method's comments).

I thought about it.
The issue is naming. "instructions_size" says nothing about the fact that nop instructions were excluded.
On the other side using "inline_instructions_size" in ciMethod::has_compiled_code() looks weird.
I am open to any suggestions.

@kuksenko
Contributor Author

As an alternative way to count nops, have you considered iterating over relocations, counting post_call_nop_Relocations, and then multiplying the count by NativePostCallNop::instruction_size? (So far, there's no NativePostCallNop::instruction_size declared on aarch64 and ppc, but that's the common pattern in nativeInst_*.hpp.)

It was my first draft which I didn't show here.
I just thought that the current way is more generic and doesn't have dependency on relocation size. In the future (if needed), such inline heuristic may be expanded by excluding some other instructions.

@kuksenko
Contributor Author

BTW I'm curious why there's no relocation registered on PPC while both x86 and aarch64 have it.

I don't know the answer.

@fisk
Contributor

fisk commented Jan 13, 2023

As an alternative way to count nops, have you considered iterating over relocations, counting post_call_nop_Relocations, and then multiplying the count by NativePostCallNop::instruction_size? (So far, there's no NativePostCallNop::instruction_size declared on aarch64 and ppc, but that's the common pattern in nativeInst_*.hpp.)

It was my first draft which I didn't show here.

I just thought that the current way is more generic and doesn't have dependency on relocation size. In the future (if needed), such inline heuristic may be expanded by excluding some other instructions.

I agree with you. In fact I think we might want to do this route for generational ZGC as well. We don't want our different barriers to cause different inlining decisions. But if we go down that route, perhaps the name of the exclusion tool should be more generic?
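
A generic exclusion tool of the kind discussed here could be sketched as a scope guard that marks everything emitted during its lifetime as skipped for inlining purposes. The names `CodeBufferSketch` and `SkippedInstructionsScope` are hypothetical and chosen for illustration; this is a sketch under those assumptions, not the implementation in this PR.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, minimal code buffer: tracks the emit position and the
// number of bytes that should not count towards the inline size.
class CodeBufferSketch {
public:
    void emit(std::size_t bytes) { offset_ += bytes; }
    std::size_t offset() const { return offset_; }
    void add_skipped(std::size_t bytes) { skipped_ += bytes; }
    std::size_t inline_size() const { return offset_ - skipped_; }
private:
    std::size_t offset_ = 0;
    std::size_t skipped_ = 0;
};

// Scope guard: bytes emitted between construction and destruction are
// registered as skipped. It does not care what the instructions are,
// so it can cover post call nops, GC barriers, or anything else.
class SkippedInstructionsScope {
public:
    explicit SkippedInstructionsScope(CodeBufferSketch& buf)
        : buf_(buf), start_(buf.offset()) {}
    ~SkippedInstructionsScope() {
        assert(buf_.offset() >= start_);
        buf_.add_skipped(buf_.offset() - start_);
    }
    SkippedInstructionsScope(const SkippedInstructionsScope&) = delete;
    SkippedInstructionsScope& operator=(const SkippedInstructionsScope&) = delete;
private:
    CodeBufferSketch& buf_;
    std::size_t start_;
};
```

Because the guard measures emitted bytes rather than counting relocations, it has no dependency on a fixed instruction size, which is the trade-off mentioned in the quoted reply.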

@kuksenko
Contributor Author

I agree with you. In fact I think we might want to do this route for generational ZGC as well. We don't want our different barriers to cause different inlining decisions. But if we go down that route, perhaps the name of the exclusion tool should be more generic?

I was surprised to find out about other usages so fast. I will do renaming.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jan 16, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 16, 2023
Contributor

@vnkozlov vnkozlov left a comment

I don't see _instructions_size being used anymore, so I suggest simply removing it and using _inline_instructions_size, which was the original purpose of this value anyway.
You missed ciReplay change to record _inline_instructions_size and restore it. That is what is used in inlining decision in your changes.
Also you need to record it in vmStructs.cpp. Just rename it.

@TobiHartmann
Member

I thought about it.
The issue is naming. "instructions_size" says nothing about the fact that nop instructions were excluded.
On the other side using "inline_instructions_size" in ciMethod::has_compiled_code() looks weird.
I am open to any suggestions.

I think this is just a matter of how we define "instruction size". We could simply define it as not including nops. I would prefer to rename the existing usages to inline_instructions_size though. ciMethod::has_compiled_code() is only used in the context of "should (not) inline", so we could rename it as well.

I see that Vladimir suggested the same. @vnkozlov, regarding

You missed ciReplay change to record _inline_instructions_size and restore it. That is what is used in inlining decision in your changes.

We currently don't record/restore _instructions_size in replay compilation either, probably because inlining decisions are enforced explicitly:

// m->_instructions_size = rec->_instructions_size;
m->_instructions_size = -1;

But I agree that it's better to be consistent here.

Contributor

@vnkozlov vnkozlov left a comment

Update looks good.

@openjdk

openjdk bot commented Jan 17, 2023

@kuksenko This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8300002: Performance regression caused by non-inlined hot methods due to post call noop instructions

Reviewed-by: kvn, iveresov, eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 81 new commits pushed to the master branch:

  • 1d8b87d: 8300321: Use link tags in javax.sql.rowset package-info
  • f9883fc: 8300279: Use generalized see and link tags in core libs in client libs
  • 00b6c55: 8300254: ASan build does not correctly propagate ASAN_OPTIONS
  • e37078f: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops
  • ade08e1: 8300093: Refactor code examples to use @snippet in java.text.MessageFormat
  • d7c05d1: 8300011: Refactor code examples to use @snippet in java.util.TimeZone
  • 8c12ae8: 8283203: Fix typo in SystemTray.getTrayIconSize javadoc
  • e7e3712: 8300010: UnsatisfiedLinkError on calling System.console().readPassword() on Windows
  • 0b9ff06: 8300184: Optimize ResourceHashtableBase::iterate_all using _number_of_entries
  • 75b122f: 8300120: Configure should support different defaults for CI/dev build environments
  • ... and 71 more: https://git.openjdk.org/jdk/compare/af8d3fb21ab59104d49bd664f634399fb72ecbd2...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 17, 2023
@kuksenko
Contributor Author

/integrate

@openjdk

openjdk bot commented Jan 18, 2023

Going to push as commit 89a032d.
Since your change was applied there have been 82 commits pushed to the master branch:

  • 7071397: 8299224: TestReporterStreams.java has bad indentation for legal header
  • 1d8b87d: 8300321: Use link tags in javax.sql.rowset package-info
  • f9883fc: 8300279: Use generalized see and link tags in core libs in client libs
  • 00b6c55: 8300254: ASan build does not correctly propagate ASAN_OPTIONS
  • e37078f: 8282664: Unroll by hand StringUTF16 and StringLatin1 polynomial hash loops
  • ade08e1: 8300093: Refactor code examples to use @snippet in java.text.MessageFormat
  • d7c05d1: 8300011: Refactor code examples to use @snippet in java.util.TimeZone
  • 8c12ae8: 8283203: Fix typo in SystemTray.getTrayIconSize javadoc
  • e7e3712: 8300010: UnsatisfiedLinkError on calling System.console().readPassword() on Windows
  • 0b9ff06: 8300184: Optimize ResourceHashtableBase::iterate_all using _number_of_entries
  • ... and 72 more: https://git.openjdk.org/jdk/compare/af8d3fb21ab59104d49bd664f634399fb72ecbd2...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 18, 2023
@openjdk openjdk bot closed this Jan 18, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 18, 2023
@openjdk

openjdk bot commented Jan 18, 2023

@kuksenko Pushed as commit 89a032d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels

hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated

7 participants