8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap #7757

Closed

JoshuaZhuwj
Member

@JoshuaZhuwj JoshuaZhuwj commented Mar 9, 2022

I came across a performance issue when using the Vector API scatter store for both Integer and Long in the same application. The poor performance was caused by a vector intrinsic inlining failure: in this scenario the IntSpecies could not be determined for a constant VectorShape of the indexMap.
As discussed in #7721, I changed the code in the Vector API.
Please help review.
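
For readers unfamiliar with the operation, here is a minimal scalar sketch of what a scatter store with an index map does. It is only a model: the class and method names are hypothetical, and the loops stand in for what the Vector API does with vector instructions. The detail that matters for this issue is that the int and long variants both take an `int[]` index map.

```java
import java.util.Arrays;

// Scalar model of a scatter store. This is an illustration only and is
// NOT the Vector API implementation; all names here are hypothetical.
final class ScalarScatterModel {
    // Stores lane i of `lanes` at dst[offset + indexMap[mapOffset + i]].
    static void scatter(int[] lanes, int[] dst, int offset, int[] indexMap, int mapOffset) {
        for (int i = 0; i < lanes.length; i++) {
            dst[offset + indexMap[mapOffset + i]] = lanes[i];
        }
    }

    // Same operation for long lanes; the index map is still an int[].
    static void scatter(long[] lanes, long[] dst, int offset, int[] indexMap, int mapOffset) {
        for (int i = 0; i < lanes.length; i++) {
            dst[offset + indexMap[mapOffset + i]] = lanes[i];
        }
    }

    public static void main(String[] args) {
        int[] indexMap = {3, 0, 2, 1};
        int[] intDst = new int[4];
        long[] longDst = new long[4];
        scatter(new int[] {10, 20, 30, 40}, intDst, 0, indexMap, 0);
        scatter(new long[] {10L, 20L, 30L, 40L}, longDst, 0, indexMap, 0);
        System.out.println(Arrays.toString(intDst));  // [20, 40, 30, 10]
        System.out.println(Arrays.toString(longDst)); // [20, 40, 30, 10]
    }
}
```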


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7757/head:pull/7757
$ git checkout pull/7757

Update a local copy of the PR:
$ git checkout pull/7757
$ git pull https://git.openjdk.java.net/jdk pull/7757/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7757

View PR using the GUI difftool:
$ git pr show -t 7757

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7757.diff

@bridgekeeper

bridgekeeper bot commented Mar 9, 2022

👋 Welcome back jzhu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 9, 2022
@openjdk

openjdk bot commented Mar 9, 2022

@JoshuaZhuwj To determine the appropriate audience for reviewing this pull request, one or more labels corresponding to different subsystems will normally be applied automatically. However, no automatic labelling rule matches the changes in this pull request. In order to have an "RFR" email sent to the correct mailing list, you will need to add one or more applicable labels manually using the /label pull request command.

Applicable Labels
  • build
  • client
  • compiler
  • core-libs
  • hotspot
  • hotspot-compiler
  • hotspot-gc
  • hotspot-jfr
  • hotspot-runtime
  • i18n
  • ide-support
  • javadoc
  • jdk
  • jmx
  • kulla
  • net
  • nio
  • security
  • serviceability
  • shenandoah

@JoshuaZhuwj
Member Author

/label add hotspot

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Mar 9, 2022
@openjdk

openjdk bot commented Mar 9, 2022

@JoshuaZhuwj
The hotspot label was successfully added.

@mlbridge

mlbridge bot commented Mar 9, 2022

Webrevs

@openjdk

openjdk bot commented Mar 9, 2022

@JoshuaZhuwj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8282874: Bad performance on gather/scatter API caused by different IntSpecies of indexMap

Reviewed-by: psandoz

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 5 new commits pushed to the master branch:

  • ff76620: 8282641: Make jdb "threadgroup" command with no args reset the current threadgroup back to the default
  • 70318e1: 8282884: Provide OID aliases for MD2, MD5, and OAEP
  • 6d8d156: 8280494: (D)TLS signature schemes
  • 5df2a05: 8282628: Potential memory leak in sun.font.FontConfigManager.getFontConfig()
  • d07f7c7: 8282665: [REDO] ByteBufferTest.java: replace endless recursion with RuntimeException in void ck(double x, double y)

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@PaulSandoz) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Mar 9, 2022
@JoshuaZhuwj
Member Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Mar 10, 2022
@openjdk

openjdk bot commented Mar 10, 2022

@JoshuaZhuwj
Your change (at version c388833) is now ready to be sponsored by a Committer.

@DamonFool
Member

I would suggest adding a jtreg test for this fix if it is possible.
Thanks.

@JoshuaZhuwj
Member Author

I would suggest adding a jtreg test for this fix if it is possible. Thanks.

@DamonFool thanks for your suggestion.
IMO a benchmark would be more suitable for this case.
In fact, besides this fix, there is another issue that also affects performance because of delayed vector inlining.
From my initial triage, it may be related to ConstraintCastNode's dependency.
I will add a benchmark after figuring out a solution to it.

@DamonFool
Member

IMO a benchmark would be more suitable for this case.

To avoid breaking this fix again in the future, I would prefer a jtreg test, since jtreg tests are run much more frequently and widely.

TBH, I still don't know how to reproduce the problem and verify the fix.

@PaulSandoz
Member

I think it's OK to follow up after this with some tests for "polluted" profiles of vectors (which may expose more issues).
Given the scope of the fix, I would recommend adding a comment in each place explaining why we don't switch over the enum constant itself (note that we are very careful to avoid this in other performance-critical areas of the JDK, e.g. in VarHandle code).
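
To make the distinction concrete, here is a small sketch contrasting a switch over an enum constant with a switch over an `int` key stored in the constant. The `Shape` enum, its `bitSize` field, and both methods are hypothetical and are not the code changed in this PR. As an assumption about the motivation, not something confirmed in this thread: javac typically compiles a classic enum switch into a lookup through a synthetic `int[]` indexed by `ordinal()`, which can be harder for the JIT to treat as a constant than a plain final `int` field.

```java
// Hypothetical enum for illustration; it is not the JDK's VectorShape.
enum Shape {
    S_64_BIT(64), S_128_BIT(128), S_256_BIT(256);

    final int bitSize; // plain int key used by the second switch below

    Shape(int bitSize) {
        this.bitSize = bitSize;
    }
}

final class ShapeSwitchDemo {
    // Switches over the enum constant itself.
    static int intLanesByEnum(Shape s) {
        switch (s) {
            case S_64_BIT:  return 2;
            case S_128_BIT: return 4;
            case S_256_BIT: return 8;
            default: throw new IllegalArgumentException("Bad shape: " + s);
        }
    }

    // Switches over an int key read from the constant instead.
    static int intLanesByKey(Shape s) {
        switch (s.bitSize) {
            case 64:  return 2;
            case 128: return 4;
            case 256: return 8;
            default: throw new IllegalArgumentException("Bad shape: " + s);
        }
    }

    public static void main(String[] args) {
        for (Shape s : Shape.values()) {
            System.out.println(s + ": " + intLanesByEnum(s) + " / " + intLanesByKey(s));
        }
    }
}
```

Both methods return the same values; the difference is only in how the switch is compiled.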

@DamonFool
Member

I think its OK, to follow up after this with some tests for "polluted" profiles of vectors (which may expose more issues). Given the scope of the fix i would recommend adding a comment in each place as to why we don't switch over the enum constant itself (note we are very careful in other performance critical areas of the JDK to avoid this e.g. in VarHandle code).

Hi @PaulSandoz ,

JDK-8282874 is labeled as a Bug.
But there is no description of how to reproduce it or verify the fix.

I think providing a jtreg test would also help uncover more bugs of this kind.
That's why I strongly recommend adding a jtreg test for this fix.
Thanks.

@JoshuaZhuwj
Member Author

As described at the beginning of both PRs, this issue is easy to reproduce by using the gather/scatter Vector API for both Integer and Long in the same application.
See a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. Its performance can be verified via execution time.

This change fixes the non-determined IntSpecies for a constant VectorShape of IndexMap in this scenario.
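
A rough way to see why the two element types can ask for different IntSpecies for the index map: if the index vector holds one `int` index per data lane (an assumption of this sketch, not a statement about the JDK's exact computation), then a long vector needs an index vector half as wide as an int vector of the same shape. The class and method names below are hypothetical.

```java
// Back-of-the-envelope arithmetic only; names are hypothetical and this
// does not reproduce how the JDK derives the index species.
final class IndexWidthDemo {
    // Number of lanes in a vector of vectorBits bits with elementBits-bit elements.
    static int laneCount(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    // Bits needed to hold one int index per data lane.
    static int indexBits(int vectorBits, int elementBits) {
        return laneCount(vectorBits, elementBits) * Integer.SIZE;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {128, 256, 512}) {
            System.out.printf("%d-bit data vector: int needs %d index bits, long needs %d%n",
                    bits, indexBits(bits, Integer.SIZE), indexBits(bits, Long.SIZE));
        }
    }
}
```

Under that assumption, an application that scatters both int and long vectors exercises the shared index-map code with more than one int vector width.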

But even with this fix, performance still does not reach the optimum.
As mentioned earlier in this PR, besides this fix, there is another issue that also affects performance because of delayed vector inlining.
The second issue can be worked around by manually disabling delayed vector inlining.

I did an initial triage of the second issue.
I had expected that forcing GVN with IncrementalInlineForceCleanup would help solve it, but ConstraintCastNode's StrongDependency keeps its own identity() from taking effect.

That's why I propose adding a benchmark after we figure out a solution to the second issue. I agree with Paul that we could follow up with more tests afterward.

@DamonFool
Member

Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time.

Thanks @JoshuaZhuwj for your clarification.
So we can create a jtreg test based on CheckAssembly.java, right?

If you are too busy to do so, we'd like to give it a try.
What do you think?

@JoshuaZhuwj
Member Author

Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time.

Thanks @JoshuaZhuwj for your clarification. So we can create a jtreg test based on CheckAssembly.java, right?

If you are busy to do so, we'd like to have a try. What do you think?

Of course. Thanks @DamonFool
I think it would be better to sync with Paul on how to design the tests for "polluted" profiles of vectors mentioned earlier, rather than adding a single jtreg test for this case.

@DamonFool
Member

I think it would be better to sync with Paul on how to design tests for "polluted" profiles of vectors mentioned before instead of a single jtreg test for this case.

Okay.

@DamonFool
Member

Check a simple reproducer at http://cr.openjdk.java.net/~jzhu/8282874/CheckAssembly.java. We can verify its performance via execution time.

Nice performance improvement.
Thanks for fixing it.

@DamonFool
Member

/sponsor

@openjdk

openjdk bot commented Mar 12, 2022

Going to push as commit 5c408c1.
Since your change was applied there have been 29 commits pushed to the master branch:

  • 374193b: 8283041: [javadoc] Crashes using {@return} with @param
  • 0fd09d3: 8282978: Wrong parameter passed to GetStringXXXChars in various places
  • 95ca944: 8282354: Remove dependancy of TestHttpServer, HttpTransaction, HttpCallback from open/test/jdk/ tests
  • f99193a: 8282811: Typo in IAE details message of RecordedObject.getValueDescriptor
  • cab9def: 8282700: Properly handle several --without options during configure
  • 1a5a496: 8282763: G1: G1CardSetContainer remove intrusive-list details.
  • 88f0938: 8272493: Suboptimal code generation around Preconditions.checkIndex intrinsic with AVX2
  • a5a1a32: 8282883: Use JVM_LEAF to avoid ThreadStateTransition for some simple JVM entries
  • bb7ee5a: 8282314: nsk/jvmti/SuspendThread/suspendthrd003 may leak memory
  • f5217b4: 8282852: Debug agent asserts in classTrack_addPreparedClass()
  • ... and 19 more: https://git.openjdk.java.net/jdk/compare/31ad80a229e3f67823ff8f1fc914c5503f184b57...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Mar 12, 2022
@openjdk openjdk bot closed this Mar 12, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Mar 12, 2022
@openjdk openjdk bot removed the sponsor Pull request is ready to be sponsored label Mar 12, 2022
@openjdk

openjdk bot commented Mar 12, 2022

@DamonFool @JoshuaZhuwj Pushed as commit 5c408c1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@JoshuaZhuwj
Member Author

Thanks Paul and FuJie.

@JoshuaZhuwj JoshuaZhuwj deleted the GatherScatterPerf branch March 12, 2022 10:34
yuleil pushed a commit to yuleil/dragonwell11 that referenced this pull request Jul 22, 2022
… IntSpecies of indexMap

Summary: Fix a performance issue when using scatter store VectorAPI for
Integer and Long simultaneously in the same application. The poor
performance was caused by vector intrinsic inlining failure because of
non-determined IntSpecies for a constant VectorShape of IndexMap in this
scenario.

JDK CR: openjdk/jdk#7757

Test Plan: performance test in real scenario

Reviewers: kuaiwei.kw, zhuoren.wz

Issue: https://aone.alibaba-inc.com/task/40627608

CR: https://code.aone.alibaba-inc.com/xcode/jdk11/codereview/8291897