8315612: RISC-V: intrinsic for unsignedMultiplyHigh #15558

Closed
wants to merge 2 commits into from


@VladimirKempik VladimirKempik commented Sep 4, 2023

Hello
Please review this simple patch; it adds a C2 implementation of the intrinsic for unsignedMultiplyHigh.
The generated code changes from:

            0x0000003fbcfb12f8:   mulh	t4,t2,t3
   2.99%    0x0000003fbcfb12fc:   srai	t5,t2,0x3f
            0x0000003fbcfb1300:   and	t5,t5,t3
            0x0000003fbcfb1304:   srai	t3,t3,0x3f
            0x0000003fbcfb1308:   and	t2,t3,t2
            0x0000003fbcfb130c:   add	t4,t4,t5
            0x0000003fbcfb130e:   add	t2,t2,t4                    ;*ladd {reexecute=0 rethrow=0 return_oop=0}

to

            0x0000003fdcfb6668:   mulhu	t2,t2,t3                    ;*invokestatic unsignedMultiplyHigh {reexecute=0 rethrow=0 return_oop=0}
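
For readers comparing the two listings: the seven-instruction sequence appears to compute the unsigned high word from the signed one (mulh) plus two sign-dependent corrections (the srai/and pairs), which a single mulhu makes unnecessary. The following is a minimal Java sketch of that identity, checked against a BigInteger reference. The class and method names are hypothetical and are not taken from the patch.

```java
import java.math.BigInteger;

public class UnsignedMulHighSketch {
    /** High 64 bits of the unsigned 128-bit product, derived from the signed high word. */
    static long viaSignedFixup(long a, long b) {
        long hi = Math.multiplyHigh(a, b); // corresponds to mulh
        long fixA = (a >> 63) & b;         // srai + and: b if a is negative, else 0
        long fixB = (b >> 63) & a;         // srai + and: a if b is negative, else 0
        return hi + fixA + fixB;           // the two add instructions
    }

    /** Reference result: treat both inputs as unsigned and take bits 64..127 of the product. */
    static long viaBigInteger(long a, long b) {
        BigInteger ua = new BigInteger(Long.toUnsignedString(a));
        BigInteger ub = new BigInteger(Long.toUnsignedString(b));
        return ua.multiply(ub).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, -1L, Long.MIN_VALUE, Long.MAX_VALUE, 0x9E3779B97F4A7C15L};
        for (long a : samples) {
            for (long b : samples) {
                if (viaSignedFixup(a, b) != viaBigInteger(a, b)) {
                    throw new AssertionError("mismatch for " + a + " * " + b);
                }
            }
        }
        System.out.println("all samples agree");
    }
}
```

Math.multiplyHigh is used here so the sketch runs on JDK 9 and later; on JDK 18 and later the result can also be compared directly with Math.unsignedMultiplyHigh.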

This is a clear code size reduction and potentially a performance boost.
On HiFive I can see the performance boost:

before:
MathBench.unsignedMultiplyHighLongLong       0  thrpt    8  67459.527 ± 10110.941  ops/ms

after:
MathBench.unsignedMultiplyHighLongLong       0  thrpt    8  86207.949 ± 8636.131  ops/ms

However, on thead the JMH benchmark unsignedMultiplyHighLongLong didn't show any difference, as the hottest spot is the fence (getfield isDone):

            0x0000003fdcfb6660:   ld	t3,64(t4)
            0x0000003fdcfb6664:   ld	t2,56(t4)
            0x0000003fdcfb6668:   mulhu	t2,t2,t3                    ;*invokestatic unsignedMultiplyHigh {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - org.openjdk.bench.java.lang.MathBench::unsignedMultiplyHighLongLong@8 (line 545)
                                                                      ; - org.openjdk.bench.java.lang.jmh_generated.MathBench_unsignedMultiplyHighLongLong_jmhTest::unsignedMultiplyHighLongLong_thrpt_jmhStub@17 (line 119)
   3.12%    0x0000003fdcfb666c:   lbu	t3,148(s3)                  ;*invokestatic consumeCompiler {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - org.openjdk.jmh.infra.Blackhole::consume@7 (line 393)
                                                                      ; - org.openjdk.bench.java.lang.jmh_generated.MathBench_unsignedMultiplyHighLongLong_jmhTest::unsignedMultiplyHighLongLong_thrpt_jmhStub@20 (line 119)
            0x0000003fdcfb6670:   fence	ir,iorw                     ;*getfield isDone {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - org.openjdk.bench.java.lang.jmh_generated.MathBench_unsignedMultiplyHighLongLong_jmhTest::unsignedMultiplyHighLongLong_thrpt_jmhStub@30 (line 121)
  62.78%    0x0000003fdcfb6674:   addi	s2,s2,1                     ;*ladd {reexecute=0 rethrow=0 return_oop=0}
                                                                      ; - org.openjdk.bench.java.lang.jmh_generated.MathBench_unsignedMultiplyHighLongLong_jmhTest::unsignedMultiplyHighLongLong_thrpt_jmhStub@26 (line 120)

tier1/tier2 tbd
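
As a rough illustration of why the fence can dominate on thead: the listing suggests the generated JMH stub polls an isDone flag on every iteration, and if that flag is volatile (an assumption here, as is every name in the sketch below), each poll needs an ordering fence on RISC-V regardless of how cheap the multiply is. A minimal sketch of such a loop, not the actual JMH-generated code:

```java
import java.util.function.LongSupplier;

public class FenceLoopSketch {
    /** Hypothetical stand-in for the benchmark control object; the volatile flag is an assumption. */
    static final class Control {
        volatile boolean isDone;
    }

    /** Runs the workload until the flag is set and returns the number of completed operations. */
    static long runUntilDone(Control control, LongSupplier workload) {
        long operations = 0;
        long sink = 0;
        do {
            sink ^= workload.getAsLong(); // the measured call, e.g. a multiply-high
            operations++;                 // corresponds to the ladd in the listing
        } while (!control.isDone);        // volatile read on every iteration
        if (sink == 42) {
            System.out.print("");         // keeps sink live so the loop body is not removed
        }
        return operations;
    }

    public static void main(String[] args) throws InterruptedException {
        Control control = new Control();
        Thread stopper = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            control.isDone = true;        // ends the measurement loop
        });
        stopper.start();
        // Math.multiplyHigh keeps the sketch runnable on JDK 9+; swap in
        // Math.unsignedMultiplyHigh on JDK 18+ to mirror the benchmark.
        long ops = runUntilDone(control, () -> Math.multiplyHigh(747L, 13L));
        stopper.join();
        System.out.println("operations = " + ops);
    }
}
```

With one cheap multiply per iteration, the per-iteration flag check is a large share of the loop, which is consistent with the single-call benchmark showing no difference on that core.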


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8315612: RISC-V: intrinsic for unsignedMultiplyHigh (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15558/head:pull/15558
$ git checkout pull/15558

Update a local copy of the PR:
$ git checkout pull/15558
$ git pull https://git.openjdk.org/jdk.git pull/15558/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15558

View PR using the GUI difftool:
$ git pr show -t 15558

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15558.diff



bridgekeeper bot commented Sep 4, 2023

👋 Welcome back vkempik! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 4, 2023

openjdk bot commented Sep 4, 2023

@VladimirKempik The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Sep 4, 2023

mlbridge bot commented Sep 4, 2023

Webrevs


VladimirKempik commented Sep 4, 2023

When increasing test difficulty from

    public long unsignedMultiplyHighLongLong() {
        return Math.unsignedMultiplyHigh(long747, long13);
    }

to

    public long unsignedMultiplyHighLongLong() {
        return Math.unsignedMultiplyHigh(long747, long13)
             + Math.unsignedMultiplyHigh(long13, long747)
             + Math.unsignedMultiplyHigh(long747, long2);
    }

then I start to see a difference on thead as well:

before the patch:

Benchmark                               (seed)   Mode  Cnt      Score    Error   Units
MathBench.unsignedMultiplyHighLongLong       0  thrpt    8  32848.186 ± 490.924  ops/ms

after the patch:

Benchmark                               (seed)   Mode  Cnt      Score    Error   Units
MathBench.unsignedMultiplyHighLongLong       0  thrpt    8  35156.335 ± 47.592  ops/ms


@RealFYang RealFYang left a comment


Looks good. One minor nit.

src/hotspot/cpu/riscv/riscv.ad (outdated, resolved)

openjdk bot commented Sep 5, 2023

@VladimirKempik This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8315612: RISC-V: intrinsic for unsignedMultiplyHigh

Reviewed-by: fyang

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 8 new commits pushed to the master branch:

  • f292268: 8315454: Add a way to create an immutable snapshot of a BitSet
  • 9def453: 8314580: PhaseIdealLoop::transform_long_range_checks fails with assert "was tested before"
  • 6c821f5: 8315545: C1: x86 cmove can use short branches
  • d7e4087: 8315369: [JVMCI] failure to attach to a libgraal isolate during shutdown should not be fatal
  • d1cabe4: 8315566: [JVMCI] deadlock in JVMCI startup when bad option specified
  • 94a74a0: 8315534: Incorrect warnings about implicit annotation processing
  • 84425a6: 8315452: Erroneous AST missing modifiers for partial input
  • 3094fd1: 8314662: jshell shows duplicated signatures of javap

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 5, 2023
VladimirKempik (Author) commented

Tier1 is good, will wait for tier2 results before integrating this

VladimirKempik (Author) commented

tier2 is good, so
/integrate


openjdk bot commented Sep 6, 2023

Going to push as commit 5d3fdc1.
Since your change was applied there have been 26 commits pushed to the master branch:

  • 5cbff24: 8315406: [REDO] serviceability/jdwp/AllModulesCommandTest.java ignores VM flags
  • 7a08e6b: 8313575: Refactor PKCS11Test tests
  • d3ee704: 8315563: Remove references to JDK-8226420 from problem list
  • aba89f2: 8312213: Remove unnecessary TEST instructions on x86 when flags reg will already be set
  • 1f4cdb3: 8315127: CDSMapTest fails with incorrect number of oop references
  • 939d7c5: 8161536: sun/security/pkcs11/sslecc/ClientJSSEServerJSSE.java fails with ProviderException
  • ebe3127: 8315717: ProblemList serviceability/sa/TestHeapDumpForInvokeDynamic.java with ZGC
  • 969fcdb: 8314191: C2 compilation fails with "bad AD file"
  • cef9fff: 8305507: Add support for grace period before AbortVMOnSafepointTimeout triggers
  • ed2b467: 8315499: build using devkit on Linux ppc64le RHEL puts path to devkit into libsplashscreen
  • ... and 16 more: https://git.openjdk.org/jdk/compare/0d52c82ed1fa6ecf5b431949c803abc8423336cb...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 6, 2023
@openjdk openjdk bot closed this Sep 6, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 6, 2023

openjdk bot commented Sep 6, 2023

@VladimirKempik Pushed as commit 5d3fdc1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

VladimirKempik (Author) commented

/backport jdk21u


openjdk bot commented Sep 6, 2023

@VladimirKempik the backport was successfully created on the branch VladimirKempik-backport-5d3fdc17 in my personal fork of openjdk/jdk21u. To create a pull request with this backport targeting openjdk/jdk21u:master, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit 5d3fdc17 from the openjdk/jdk repository.

The commit being backported was authored by Vladimir Kempik on 6 Sep 2023 and was reviewed by Fei Yang.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u:

$ git fetch https://github.com/openjdk-bots/jdk21u.git VladimirKempik-backport-5d3fdc17:VladimirKempik-backport-5d3fdc17
$ git checkout VladimirKempik-backport-5d3fdc17
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk21u.git VladimirKempik-backport-5d3fdc17


sonicyouth98 commented Nov 3, 2023


Hi Vladimir Kempik, were the hottest spots detected by perfasm?

VladimirKempik (Author) commented

Yes, with the -prof perfasm option to JMH.

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated

3 participants