8324724: Add Stub routines for FP16 conversions on aarch64 #17706
This commit, openjdk@8cfd74f, adds stub routines for FP16 conversions of float/float16 constants on x86 so that the compiler gets an accurate compile-time value for the nodes. This task adds similar stub routines for aarch64.

With this patch, if the inputs to the conversion functions are constants, the stub routines are executed to determine the compile-time values of the ConvHF2F/ConvF2HF nodes (in their respective Value() functions), and the ConvHF2F/ConvF2HF nodes are replaced with ConI/ConF nodes. This might enable further compiler optimizations such as constant folding.

The following testcase was used to inspect the disassembly:

    // Assumed: @Benchmark is the JMH annotation.
    import org.openjdk.jmh.annotations.Benchmark;

    public class FloatConv {
        private static final short sconst;
        private static final float fconst;

        static {
            sconst = Short.MAX_VALUE;
            fconst = Float.MIN_VALUE;
        }

        @Benchmark
        public float hf2f() {
            return Float.float16ToFloat(sconst);
        }

        @Benchmark
        public short f2hf() {
            return Float.floatToFloat16(fconst);
        }
    }

Disassembly without patch:

    FloatConv.f2hf() :-
      ...
      ldr   s17, 0x0000ffff918cec80
      fcvt  h16, s17
      smov  x0, v16.h[0]
      ...
      ret

    FloatConv.hf2f() :-
      ...
      orr   w11, wzr, #0x7fff
      mov   v16.h[0], w11
      fcvt  s0, h16
      ...
      ret

Disassembly with patch:

    FloatConv.hf2f() :-
      ...
      ldr   s0, 0x0000ffffb58ce880
      ...
      ret

    FloatConv.f2hf() :-
      ...
      mov   w0, wzr
      ...
      ret

With this patch, the conversion is computed at compile time and the ConvHF2F/ConvF2HF nodes are replaced with ConI/ConF nodes, so the generated code simply loads the constant values into registers and returns them.

The tests in "hotspot/jtreg/compiler/intrinsics/float16" pass on both aarch64 and x86.
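To make the folded constants concrete, here is a minimal sketch that evaluates the same two conversions with the public `Float.float16ToFloat` and `Float.floatToFloat16` methods (available since Java 20). The class name `Fp16ConstantDemo` and its helper methods are hypothetical and not part of the patch; the sketch only shows which values the calls produce for these inputs, not how C2 or the stub routines compute them.

```java
public class Fp16ConstantDemo {
    // Same constants as the FloatConv testcase in the PR description.
    static final short SCONST = Short.MAX_VALUE; // bit pattern 0x7fff
    static final float FCONST = Float.MIN_VALUE; // smallest positive float, 2^-149

    // Half-precision bits -> float. 0x7fff has an all-ones exponent and a
    // non-zero mantissa, which encodes a NaN in IEEE 754 binary16.
    static float hf2f() {
        return Float.float16ToFloat(SCONST);
    }

    // float -> half-precision bits. 2^-149 is far below the smallest
    // binary16 subnormal (2^-24), so it rounds to positive zero.
    static short f2hf() {
        return Float.floatToFloat16(FCONST);
    }

    public static void main(String[] args) {
        System.out.println("hf2f is NaN: " + Float.isNaN(hf2f()));
        System.out.println("f2hf bits: 0x" + Integer.toHexString(f2hf() & 0xffff));
    }
}
```

Under these assumptions, `f2hf()` returns 0, which is consistent with the `mov w0, wzr` in the patched disassembly, and `hf2f()` returns a NaN, which the patched code loads from memory with a single `ldr`.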
@Bhavana-Kilambi This change now passes all automated pre-integration checks. This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
As you do not have Committer status in this project, an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @nick-arm), but any other Committer may sponsor as well.
/integrate
/sponsor
Going to push as commit 51853f7.
Your commit was automatically rebased without conflicts.
@nick-arm @Bhavana-Kilambi Pushed as commit 51853f7. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
This change is causing crashes on macOS - JDK-8325264
@Bhavana-Kilambi Please allow enough time for reviews, at least 24 hours; see https://openjdk.org/guide/#life-of-a-pr. We usually perform some sanity testing, which would have caught the issues reported in JDK-8325264. Thanks!
Apologies. I will take a look at the failures right away.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17706/head:pull/17706
$ git checkout pull/17706
Update a local copy of the PR:
$ git checkout pull/17706
$ git pull https://git.openjdk.org/jdk.git pull/17706/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 17706
View PR using the GUI difftool:
$ git pr show -t 17706
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17706.diff