Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8294198: Implement isFinite intrinsic for RISC-V #10391

Closed
wants to merge 1 commit into from

Conversation

voitylov
Copy link

@voitylov voitylov commented Sep 22, 2022

Unlike on x86 (see 8285868 and the discussion in review), isFinite intrinsic turned out to be profitable on RISC-V using the same fclass instruction as for 8293695 (isInfinite instrinsic). Therefore, I'm proposing to have it added on RISC-V in this PR.

benchmark results:

before:

Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 52.824 ± 1.744 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 16.104 ± 0.358 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 14.366 ± 2.174 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 49.821 ± 0.330 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 14.702 ± 0.335 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 14.749 ± 0.496 ns/op

after:

DoubleClassCheck.testIsFiniteBranch avgt 15 48.921 ± 0.557 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 13.716 ± 0.304 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 9.152 ± 0.158 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 47.740 ± 2.028 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 13.299 ± 0.282 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 9.185 ± 0.396 ns/op

Existing isInfinite jtreg test was altered to be able to use common code for isFinite test and fine-grained requires tag filtering. Existing benchmark was modified to include isFinite case. A typo ("Atleast" -> "At least") was fixed on the way.

Test passed on both release and fastdebug builds. Hotspot tier1 tests were run on x86_64 and RISC-V with no issues.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10391/head:pull/10391
$ git checkout pull/10391

Update a local copy of the PR:
$ git checkout pull/10391
$ git pull https://git.openjdk.org/jdk pull/10391/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10391

View PR using the GUI difftool:
$ git pr show -t 10391

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10391.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 22, 2022

👋 Welcome back avoitylov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 22, 2022
@openjdk
Copy link

openjdk bot commented Sep 22, 2022

@voitylov The following labels will be automatically applied to this pull request:

  • core-libs
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org core-libs core-libs-dev@openjdk.org labels Sep 22, 2022
@mlbridge
Copy link

mlbridge bot commented Sep 22, 2022

Webrevs

@jddarcy
Copy link
Member

jddarcy commented Sep 22, 2022

The java.lang.* changes look fine if the rest of the work proceeds.

@RealFYang
Copy link
Member

RealFYang commented Sep 23, 2022

May I ask on which platform was the JMH test performed? I see some fluctuations for 'testIsFiniteBranch' on SiFive Unmatched board:

Before:
Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 32.114 ± 3.514 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 8.656 ± 0.023 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 7.757 ± 1.691 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 32.446 ± 3.130 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.801 ± 0.011 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 7.279 ± 1.483 ns/op

Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 33.724 ± 5.157 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 8.642 ± 0.012 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 7.765 ± 1.711 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 31.401 ± 4.104 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.799 ± 0.014 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 8.234 ± 0.023 ns/op

Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 32.461 ± 3.688 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 8.643 ± 0.007 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 7.772 ± 1.701 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 33.258 ± 3.131 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.815 ± 0.014 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 6.343 ± 1.475 ns/op

After:
Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 33.915 ± 4.193 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 7.736 ± 0.012 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 5.931 ± 0.007 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 33.845 ± 3.449 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.543 ± 0.013 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 6.035 ± 0.006 ns/op

Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 33.421 ± 4.262 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 7.734 ± 0.012 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 5.940 ± 0.010 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 33.348 ± 2.656 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.542 ± 0.013 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 6.040 ± 0.004 ns/op

Benchmark Mode Cnt Score Error Units
DoubleClassCheck.testIsFiniteBranch avgt 15 31.669 ± 2.517 ns/op
DoubleClassCheck.testIsFiniteCMov avgt 15 7.761 ± 0.092 ns/op
DoubleClassCheck.testIsFiniteStore avgt 15 5.940 ± 0.007 ns/op
FloatClassCheck.testIsFiniteBranch avgt 15 33.508 ± 3.680 ns/op
FloatClassCheck.testIsFiniteCMov avgt 15 7.534 ± 0.009 ns/op
FloatClassCheck.testIsFiniteStore avgt 15 6.031 ± 0.007 ns/op

@voitylov
Copy link
Author

Good you checked on SiFive HW, as I could only check their cookbook for instruction costs. I was using C906.

You're not observing any statistically significant difference in the first benchmark from this change due to higher latency of fclass instruction on SiFive CPU (4 cycles, while C906 has 3 cycles latency for fclass). However, it is still profitable in general as we replace 2 relatively expensive FP instructions with 1 FP + 2 cheap GPR instructions. That's what the rest of the cases demonstrate on both SiFive and C906.

Overall, as the platform matures the instructions costs tend to decrease, so I'd assume fclass will take less cycles in most future implementations.

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the cause of the fluctuations for 'testIsFiniteBranch' lies in randomness of the input.

@Benchmark
@OperationsPerInvocation(BUFFER_SIZE)
public void testIsFiniteBranch() {
    for (int i = 0; i < BUFFER_SIZE; i++) {
        cmovOutputs[i] = Float.isFinite(inputs[i]) ? call() : 7;
    }
}

Here the C2 JIT code for invoking 'call()' has changed with this patch. The register allocation is
different and hence the difference of saving and restoring of live registers around the method call. So the probability of invoking this method call will affect the JMH result, which is not the case for the other two JMH tests.

For the other two JMH tests, I see the performance gain on SiFive platform comes from C2 loop unrolling. Since your change benefit the other two JMH tests on both SiFive and C906 platforms, looks good to me.

@openjdk
Copy link

openjdk bot commented Sep 26, 2022

@voitylov This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8294198: Implement isFinite intrinsic for RISC-V

Reviewed-by: fyang, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 88 new commits pushed to the master branch:

  • 5f6ad92: 8294547: HotSpotAgent.setupVM() should include "cause" exception when throwing DebuggerException
  • 545ded1: 8294548: Problem list SA core file tests on macosx-x64 due to JDK-8294316
  • 29c70f1: 8294595: Add javax/swing/plaf/aqua/CustomComboBoxFocusTest.java to problem list
  • 5d48da4: 8294370: Fix allocation bug in java_lang_Thread::async_get_stack_trace()
  • ce85cac: 8065554: MatchResult should provide values of named-capturing groups
  • 1decdce: 8294492: RISC-V: Use li instead of patchable movptr at non-patchable callsites
  • 8491fd5: 8294551: Put java/io/BufferedInputStream/TransferTo.java on problem list
  • 6f8f28e: 8294160: misc crash dump improvements
  • 8873192: 8293515: heapShared.cpp: rename JavaThread parameter to current
  • 76f1865: 8293563: [macos-aarch64] SA core file tests failing with sun.jvm.hotspot.oops.UnknownOopException
  • ... and 78 more: https://git.openjdk.org/jdk/compare/800e68d6906734242119e4ea033422f037a79857...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@RealFYang, @vnkozlov) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 26, 2022
Copy link

@AryanP45 AryanP45 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@voitylov
Copy link
Author

@vnkozlov could you or someone familiar with C2 look at shared C2 HotSpot changes? I did test it on x86, but any extra testing could help as well.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think old tests TestFloatClassCheck.java and TestDoubleClassCheck.java are modified correctly - they do nothing now. They have to run compiled methods and verify results.

Other then that changes seem fine.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not enough coffee at the morning ;^)
Finally noticed how new and old test interact.
Changes are fine and I submitted testing.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing passed.

@voitylov
Copy link
Author

Thank you @jddarcy @RealFYang @vnkozlov for your reviews.

I also ran the benchmarks with -XX:LoopUnrollLimit=0 -XX:LoopMaxUnroll=0 and see the improvement with these flags as well.

@voitylov
Copy link
Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Sep 29, 2022
@openjdk
Copy link

openjdk bot commented Sep 29, 2022

@voitylov
Your change (at version c5983c2) is now ready to be sponsored by a Committer.

@vnkozlov
Copy link
Contributor

/sponsor

@openjdk
Copy link

openjdk bot commented Sep 29, 2022

Going to push as commit aeef3ec.
Since your change was applied there have been 88 commits pushed to the master branch:

  • 5f6ad92: 8294547: HotSpotAgent.setupVM() should include "cause" exception when throwing DebuggerException
  • 545ded1: 8294548: Problem list SA core file tests on macosx-x64 due to JDK-8294316
  • 29c70f1: 8294595: Add javax/swing/plaf/aqua/CustomComboBoxFocusTest.java to problem list
  • 5d48da4: 8294370: Fix allocation bug in java_lang_Thread::async_get_stack_trace()
  • ce85cac: 8065554: MatchResult should provide values of named-capturing groups
  • 1decdce: 8294492: RISC-V: Use li instead of patchable movptr at non-patchable callsites
  • 8491fd5: 8294551: Put java/io/BufferedInputStream/TransferTo.java on problem list
  • 6f8f28e: 8294160: misc crash dump improvements
  • 8873192: 8293515: heapShared.cpp: rename JavaThread parameter to current
  • 76f1865: 8293563: [macos-aarch64] SA core file tests failing with sun.jvm.hotspot.oops.UnknownOopException
  • ... and 78 more: https://git.openjdk.org/jdk/compare/800e68d6906734242119e4ea033422f037a79857...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 29, 2022
@openjdk openjdk bot closed this Sep 29, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Sep 29, 2022
@openjdk
Copy link

openjdk bot commented Sep 29, 2022

@vnkozlov @voitylov Pushed as commit aeef3ec.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
5 participants