-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-8318228: RISC-V: C2 ConvF2HF #17450
Conversation
👋 Welcome back mli! A progress list of the required criteria for merging this PR into |
@Hamlin-Li The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, Thanks for this change. I have a small question.
__ fmv_x_w(dst, src); | ||
|
||
// preserve the payloads of non-canonical NaNs. | ||
__ srai(dst, dst, 13); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the lowest 13 bits of the payload for src
is simply discarded here. But these bits are also used for calculating the new significand bits for float16 [1]. So this doesn't seem OK to me. Did I miss anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's discarded intentionally, just like in HF2F it's padded with zero in lower 13 bits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, I'm not sure.
I have below patch:
diff --git a/src/java.base/share/classes/java/lang/Float.java b/src/java.base/share/classes/java/lang/Float.java
index 7508c22d7f4..f96e23b568e 100644
--- a/src/java.base/share/classes/java/lang/Float.java
+++ b/src/java.base/share/classes/java/lang/Float.java
@@ -1108,9 +1108,7 @@ public static short floatToFloat16(float f) {
// Preserve high order bit of float NaN in the
// binary16 result NaN (tenth bit); OR in remaining
// bits into lower 9 bits of binary 16 significand.
- | (doppel & 0x007f_e000) >> 13 // 10 bits
- | (doppel & 0x0000_1ff0) >> 4 // 9 bits
- | (doppel & 0x0000_000f)); // 4 bits
+ | (doppel & 0x007f_e000) >> 13); // 10 bits
}
float abs_f = Math.abs(f);
And, test/jdk/java/lang/Float/Binary16ConversionNaN.java/Binary16Conversion.java both passed.
Either the tests(both library and hotspot) + intrinsics (not sure if intrinsics on other platforms need improvement) needs to be improved, or the code in library needs to be simplified. (To be frank, I don't think NaN needs such a complicated spec/design, but it depends on the spec).
I just filed a library bug to discuss it, JDK-8324212
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussion at: https://mail.openjdk.org/pipermail/riscv-port-dev/2022-December/000706.html, when NaN is used as input and output is also NaN, then there is no restriction on the exact NaN number, this is confirmed by the java library team. That means we can return any NaN if the input is an NaN. For floattoFloat16, the java spec is similar, floatToFloat16: If the argument is a NaN, the result is a NaN.
, it does not require how the result NaN layout should be.
So, I think we're fine to just discard the last 13 bits (as implemented in this patch), and in this way it's convenient for us as this will help to pass all the existing tests in hotspot.
But, for long term and performance consideration, I think we need to re-visit all the NaN related intrinsics in riscv to check if we need to treat NaN specially rather than just leveraging the riscv default behaviour. And I filed a bug to track this task: JDK-8324303
// in riscv, NaN needs a special process as fcvt does not work in that case. | ||
|
||
// check whether it's a NaN. | ||
fclass_s(t0, src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As showed roundD intrinsic PR, ( https://github.com/openjdk/jdk/pull/16382/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR4252 ) the feq_s(t0, src, src) + beqz(t0, label) seems to be a faster check for NaN, could you check the jmh numbers with feq_s ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion. Yes, it bring better performance.
After:
Benchmark (size) Mode Cnt Score Error Units
Fp16ConversionBenchmark.floatToFloat16 2048 avgt 5 3753.179 ? 43.557 ns/op
Fp16ConversionBenchmark.floatToFloat16Memory 2048 avgt 5 19.772 ? 0.860 ns/op
Before:
Benchmark (size) Mode Cnt Score Error Units
Fp16ConversionBenchmark.floatToFloat16 2048 avgt 5 4099.820 ? 57.671 ns/op
Fp16ConversionBenchmark.floatToFloat16Memory 2048 avgt 5 20.181 ? 0.108 ns/op
@RealFYang And per performance and test consideration, it's convenient for us to implement in current way. But, for long term and performance consideration, I filed a bug to re-visit all the NaN related intrinsics in riscv: JDK-8324303 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right then! Thanks for finding this out.
@Hamlin-Li This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 89 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@VladimirKempik @RealFYang Thanks for your review. /integrate |
Going to push as commit bcaad51.
Your commit was automatically rebased without conflicts. |
@Hamlin-Li Pushed as commit bcaad51. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Hi,
Can you review the patch to add ConvF2HF intrinsic to JDK for riscv?
Thanks!
Test
Functionality
hotspot tests
test/hotspot/jtreg/compiler/intrinsics/
test/hotspot/jtreg/compiler/c2/irTests
jdk tests
test/jdk/java/lang/Float/Binary16Conversion*.java
Performance
tested on licheepi.
with UseZfh enabled
with UseZfh disabled
(i.e. disable the intrinsic)
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17450/head:pull/17450
$ git checkout pull/17450
Update a local copy of the PR:
$ git checkout pull/17450
$ git pull https://git.openjdk.org/jdk.git pull/17450/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 17450
View PR using the GUI difftool:
$ git pr show -t 17450
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17450.diff
Webrev
Link to Webrev Comment