Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8321001: RISC-V: C2 SignumVF #16925

Closed
wants to merge 5 commits into from
Closed

Conversation

Hamlin-Li
Copy link

@Hamlin-Li Hamlin-Li commented Dec 1, 2023

Hi,
Can you review the patch to add intrinisc SignumVF/SignumVD on riscv?
Thanks

Test

test/hotspot/jtreg/compiler/intrinsics/
test/hotspot/jtreg/compiler/vectorapi/
and tests found via:
grep -nr test/hotspot/jtreg/ -we Math.signum
and test found via:
grep -nr test/jdk/ -we Math.signum


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issues

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16925/head:pull/16925
$ git checkout pull/16925

Update a local copy of the PR:
$ git checkout pull/16925
$ git pull https://git.openjdk.org/jdk.git pull/16925/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16925

View PR using the GUI difftool:
$ git pr show -t 16925

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16925.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 1, 2023

👋 Welcome back mli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@Hamlin-Li
Copy link
Author

/solves JDK-8321002

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 1, 2023
@openjdk
Copy link

openjdk bot commented Dec 1, 2023

@Hamlin-Li
Adding additional issue to solves list: 8321002: RISC-V: C2 SignumVD.

@openjdk
Copy link

openjdk bot commented Dec 1, 2023

@Hamlin-Li The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Dec 1, 2023
@mlbridge
Copy link

mlbridge bot commented Dec 1, 2023

Webrevs

Copy link
Member

@RealFYang RealFYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, Did you check the C2 JIT code? I am wondering whether the newly-added code is covered well by the tests performed.

instruct vsignum_reg(vReg dst, vReg zero, vReg one, vRegMask_V0 v0) %{
match(Set dst (SignumVF dst (Binary zero one)));
match(Set dst (SignumVD dst (Binary zero one)));
effect(TEMP_DEF dst);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v0 is clobbered in C2_MacroAssembler::signum_fp_v. Shouldn't we add a TEMP v0 to the effect?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching! Fixed.

// Vector Floating-Point Sign-Injection Instructions
INSN(vfsgnj_vf, 0b1010111, 0b101, 0b001000);
INSN(vfsgnjx_vf, 0b1010111, 0b101, 0b001010);
INSN(vfsgnjn_vf, 0b1010111, 0b101, 0b001001);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used anywhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove them if you think it's better to do it.
Reason I added them is that they're quite similar instructions and it's annoying to lookup the instruction formatting in spec and add every single one when needing them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I would suggest remove them for test coverity reasons. We can add them back when needed.

@@ -1676,6 +1676,20 @@ void C2_MacroAssembler::signum_fp(FloatRegister dst, FloatRegister one, bool is_
bind(done);
}

void C2_MacroAssembler::signum_fp_v(VectorRegister dst, BasicType bt, int vlen,
VectorRegister zero, VectorRegister one) {
vsetvli_helper(bt, vlen);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have a situation where vlen times sew(bt) won't fit into h/w register ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, as UseRVV is only enable when vlenb >= 16, and match_rule_supported_vector return false if UseRVV == false.

@Hamlin-Li
Copy link
Author

Hamlin-Li commented Dec 5, 2023

Hi, Did you check the C2 JIT code? I am wondering whether the newly-added code is covered well by the tests performed.

I did not check the JIT code, but the tests did help to catch some bugs when I implemented the intrinsic.
And, it's mostly covered by hotspot/jtreg/compiler/c2/cr6340864/Test[Float|Double]Vect.java

vfclass_v(v0, dst);
mv(t0, fclass_mask::zero | fclass_mask::nan);
vand_vx(v0, v0, t0);
vmseq_vv(v0, v0, zero);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that the input zero (a vector of floating-point 0.0) is appropriate here for vmseq_vv which does vector integer comparison. Why not do vmseq_vi(v0, v0, 0) instead? This will also help remove the zero parameter of this function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean some patch like below?

diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
index ed421c9e287..aa82447b943 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
@@ -1676,15 +1676,14 @@ void C2_MacroAssembler::signum_fp(FloatRegister dst, FloatRegister one, bool is_
   bind(done);
 }
 
-void C2_MacroAssembler::signum_fp_v(VectorRegister dst, BasicType bt, int vlen,
-                                    VectorRegister zero, VectorRegister one) {
+void C2_MacroAssembler::signum_fp_v(VectorRegister dst, BasicType bt, int vlen, VectorRegister one) {
   vsetvli_helper(bt, vlen);
 
   // check if input is -0, +0, signaling NaN or quiet NaN
   vfclass_v(v0, dst);
   mv(t0, fclass_mask::zero | fclass_mask::nan);
   vand_vx(v0, v0, t0);
-  vmseq_vv(v0, v0, zero);
+  vmseq_vi(v0, v0, 0);
 
   // use floating-point 1.0 with a sign of input
   vfsgnj_vv(dst, one, dst, v0_t);
diff --git a/src/hotspot/cpu/riscv/riscv_v.ad b/src/hotspot/cpu/riscv/riscv_v.ad
index 9b3c9125b3b..2940655f44e 100644
--- a/src/hotspot/cpu/riscv/riscv_v.ad
+++ b/src/hotspot/cpu/riscv/riscv_v.ad
@@ -3664,15 +3664,15 @@ instruct vexpand(vReg dst, vReg src, vRegMask_V0 v0, vReg tmp) %{
 
 // Vector Math.signum
 
-instruct vsignum_reg(vReg dst, vReg zero, vReg one, vRegMask_V0 v0) %{
-  match(Set dst (SignumVF dst (Binary zero one)));
-  match(Set dst (SignumVD dst (Binary zero one)));
+instruct vsignum_reg(vReg dst, vReg one, vRegMask_V0 v0) %{
+  match(Set dst (SignumVF dst one));
+  match(Set dst (SignumVD dst one));
   effect(TEMP_DEF dst, TEMP v0);
   format %{ "vsignum $dst, $dst\t" %}
   ins_encode %{
     BasicType bt = Matcher::vector_element_basic_type(this);
     __ signum_fp_v(as_VectorRegister($dst$$reg), bt, Matcher::vector_length(this),
-                  as_VectorRegister($zero$$reg), as_VectorRegister($one$$reg));
+                   as_VectorRegister($one$$reg));
   %}
   ins_pipe(pipe_slow);
 %}

In fact this is also my initial patch, but at runtime it will report a mismatch issue.
And in x86 and aarch64, they all have zero too.

o749  SignumVD  === _ o746 o1276  [[ o750 ]]  #vectora[4]:{double}

--N: o749  SignumVD  === _ o746 o1276  [[ o750 ]]  #vectora[4]:{double}

   --N: o746  LoadVector  === o440 o865 o697  |o250  [[ o749 ]]  @double[int:>=0] (java/lang/Cloneable,java/io/Serializable):NotNull:exact+any *, idx=6; mismatched #vectora[4]:{double}
   VREG  300  loadV
   VREG_V1  300  loadV
   VREG_V2  300  loadV
   VREG_V3  300  loadV
   VREG_V4  300  loadV
   VREG_V5  300  loadV
   VREG_V6  300  loadV
   VREG_V7  300  loadV
   VREG_V8  300  loadV
   VREG_V9  300  loadV
   VREG_V10  300  loadV
   VREG_V11  300  loadV
   VREG_V12  300  loadV
   VREG_V13  300  loadV
   VREG_V14  300  loadV
   VREG_V15  300  loadV

      --N: o697  AddP  === _ o61 o1069 o1177  [[ o746 ]] 
      IREGP  100  addP_reg_imm
      IREGPNOSP  100  addP_reg_imm
      IREGP_R10  100  addP_reg_imm
      IREGP_R11  100  addP_reg_imm
      IREGP_R12  100  addP_reg_imm
      IREGP_R13  100  addP_reg_imm
      IREGP_R14  100  addP_reg_imm
      IREGP_R15  100  addP_reg_imm
      IREGP_R16  100  addP_reg_imm
      IREGP_R28  100  addP_reg_imm
      IREGP_R30  100  addP_reg_imm
      IREGP_R31  100  addP_reg_imm
      JAVATHREAD_REGP  100  addP_reg_imm
      INDIRECT  100  addP_reg_imm
      INDOFFL  0  INDOFFL
      INLINE_CACHE_REGP  100  addP_reg_imm
      MEMORY  0  INDOFFL
      IREGNORP  100  IREGP
      IREGILNP  100  IREGP
      IREGILNPNOSP  100  IREGPNOSP
      VMEMA  100  INDIRECT

         --N: o1069  AddP  === _ o61 o61 o1124  [[ o1056 o1046 o954 o949 o697 o869 o1051 o1041 ]] 
         IREGP  0  IREGP
         IREGPNOSP  0  IREGPNOSP
         IREGP_R10  0  IREGP_R10
         IREGP_R11  0  IREGP_R11
         IREGP_R12  0  IREGP_R12
         IREGP_R13  0  IREGP_R13
         IREGP_R14  0  IREGP_R14
         IREGP_R15  0  IREGP_R15
         IREGP_R16  0  IREGP_R16
         IREGP_R28  0  IREGP_R28
         IREGP_R30  0  IREGP_R30
         IREGP_R31  0  IREGP_R31
         JAVATHREAD_REGP  0  JAVATHREAD_REGP
         INDIRECT  0  INDIRECT
         INLINE_CACHE_REGP  0  INLINE_CACHE_REGP
         MEMORY  0  INDIRECT
         IREGNORP  0  IREGP
         IREGILNP  0  IREGP
         IREGILNPNOSP  0  IREGPNOSP
         VMEMA  0  INDIRECT

         --N: o1177  ConL  === o0  [[ o697 o698 ]]  #long:240
         IMML  0  IMML
         IMMLADD  0  IMMLADD
         IMMLSUB  0  IMMLSUB
         IMMLOFFSET  0  IMMLOFFSET
         IREGL  100  loadConL
         IREGLNOSP  100  loadConL
         IREGL_R28  100  loadConL
         IREGL_R29  100  loadConL
         IREGL_R30  100  loadConL
         IREGL_R10  100  loadConL
         IREGIORL  100  IREGL
         IREGILNP  100  IREGL
         IREGILNPNOSP  100  IREGLNOSP
         IMMIORL  0  IMML

   --N: o1276  Binary  === _ o747 o748  [[ o749 ]] 
   _Binary_vReg_vReg  0  _Binary_vReg_vReg

      --N: o747  Replicate  === _ o192  [[ o1276 o1277 o1275 o1274 o1273 o1272 o1271 o1270 o1269 ]]  #vectora[4]:{double}
      VREG  0  VREG
      VREG_V1  0  VREG_V1
      VREG_V2  0  VREG_V2
      VREG_V3  0  VREG_V3
      VREG_V4  0  VREG_V4
      VREG_V5  0  VREG_V5
      VREG_V6  0  VREG_V6
      VREG_V7  0  VREG_V7
      VREG_V8  0  VREG_V8
      VREG_V9  0  VREG_V9
      VREG_V10  0  VREG_V10
      VREG_V11  0  VREG_V11
      VREG_V12  0  VREG_V12
      VREG_V13  0  VREG_V13
      VREG_V14  0  VREG_V14
      VREG_V15  0  VREG_V15

      --N: o748  Replicate  === _ o193  [[ o1276 o1277 o1275 o1274 o1273 o1272 o1271 o1270 o1269 ]]  #vectora[4]:{double}
      VREG  0  VREG
      VREG_V1  0  VREG_V1
      VREG_V2  0  VREG_V2
      VREG_V3  0  VREG_V3
      VREG_V4  0  VREG_V4
      VREG_V5  0  VREG_V5
      VREG_V6  0  VREG_V6
      VREG_V7  0  VREG_V7
      VREG_V8  0  VREG_V8
      VREG_V9  0  VREG_V9
      VREG_V10  0  VREG_V10
      VREG_V11  0  VREG_V11
      VREG_V12  0  VREG_V12
      VREG_V13  0  VREG_V13
      VREG_V14  0  VREG_V14
      VREG_V15  0  VREG_V15

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, sorry for not being clear enough. In fact, I am suggesting following add-on change:

diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
index 18b5df3f68c..b732e70ed84 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
@@ -1677,15 +1677,15 @@ void C2_MacroAssembler::signum_fp(FloatRegister dst, FloatRegister one, bool is_
   bind(done);
 }

-void C2_MacroAssembler::signum_fp_v(VectorRegister dst, BasicType bt, int vlen,
-                                    VectorRegister zero, VectorRegister one) {
+void C2_MacroAssembler::signum_fp_v(VectorRegister dst, VectorRegister one,
+                                    BasicType bt, int vlen) {
   vsetvli_helper(bt, vlen);

   // check if input is -0, +0, signaling NaN or quiet NaN
   vfclass_v(v0, dst);
   mv(t0, fclass_mask::zero | fclass_mask::nan);
   vand_vx(v0, v0, t0);
-  vmseq_vv(v0, v0, zero);
+  vmseq_vi(v0, v0, 0);

   // use floating-point 1.0 with a sign of input
   vfsgnj_vv(dst, one, dst, v0_t);
diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
index 0ed7c3686dc..b9a7749631a 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp
@@ -163,7 +163,7 @@

   void signum_fp(FloatRegister dst, FloatRegister one, bool is_double);

-  void signum_fp_v(VectorRegister dst, BasicType bt, int vlen, VectorRegister zero, VectorRegister one);
+  void signum_fp_v(VectorRegister dst, VectorRegister one, BasicType bt, int vlen);

   // intrinsic methods implemented by rvv instructions

diff --git a/src/hotspot/cpu/riscv/riscv_v.ad b/src/hotspot/cpu/riscv/riscv_v.ad
index 9c2349a6c92..c163325fc81 100644
--- a/src/hotspot/cpu/riscv/riscv_v.ad
+++ b/src/hotspot/cpu/riscv/riscv_v.ad
@@ -3671,8 +3671,8 @@ instruct vsignum_reg(vReg dst, vReg zero, vReg one, vRegMask_V0 v0) %{
   format %{ "vsignum $dst, $dst\t" %}
   ins_encode %{
     BasicType bt = Matcher::vector_element_basic_type(this);
-    __ signum_fp_v(as_VectorRegister($dst$$reg), bt, Matcher::vector_length(this),
-                   as_VectorRegister($zero$$reg), as_VectorRegister($one$$reg));
+    __ signum_fp_v(as_VectorRegister($dst$$reg), as_VectorRegister($one$$reg),
+                   bt, Matcher::vector_length(this));
   %}
   ins_pipe(pipe_slow);
 %}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch of the bug! Fixed.
I did not realize it before. I think the reason that tests did not catch it is because both int and float zero has the exact same binary content.

// Vector Floating-Point Sign-Injection Instructions
INSN(vfsgnj_vf, 0b1010111, 0b101, 0b001000);
INSN(vfsgnjx_vf, 0b1010111, 0b101, 0b001010);
INSN(vfsgnjn_vf, 0b1010111, 0b101, 0b001001);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I would suggest remove them for test coverity reasons. We can add them back when needed.

ins_encode %{
BasicType bt = Matcher::vector_element_basic_type(this);
__ signum_fp_v(as_VectorRegister($dst$$reg), bt, Matcher::vector_length(this),
as_VectorRegister($zero$$reg), as_VectorRegister($one$$reg));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe leave one extra space here to align with parameters of the preceding line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I would suggest remove them for test coverity reasons. We can add them back when needed.

That's a good point! the code is removed.

@openjdk
Copy link

openjdk bot commented Dec 7, 2023

@Hamlin-Li This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8321001: RISC-V: C2 SignumVF
8321002: RISC-V: C2 SignumVD

Reviewed-by: fyang

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 250 new commits pushed to the master branch:

  • 632a3c5: 8305825: getBounds API returns wrong value resulting in multiple Regression Test Failures on Ubuntu 23.04
  • 75a7c19: 8315827: Kitchensink.java and RenaissanceStressTest.java time out with jvmti module errors
  • 91ffdfb: 8320365: IPPPrintService.getAttributes() causes blanket re-initialisation
  • 50baaf4: 8321013: Parallel: Refactor ObjectStartArray
  • afb8964: 8320443: [macos] Test java/awt/print/PrinterJob/PrinterDevice.java fails on macOS
  • 82796bd: 8320570: NegativeArraySizeException decoding >1G UTF8 bytes with non-ascii characters
  • 781775d: 8321484: Make TestImplicitlyDeclaredClasses release independent
  • b02fc86: 8321122: Shenandoah: Remove ShenandoahLoopOptsAfterExpansion flag
  • 2830dd2: 8321410: Shenandoah: Remove ShenandoahSuspendibleWorkers flag
  • f482260: 8319969: os::large_page_init() turns off THPs for ZGC
  • ... and 240 more: https://git.openjdk.org/jdk/compare/e055fae104a887c436da9f2924e88029518d5d96...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 7, 2023
@Hamlin-Li
Copy link
Author

@RealFYang @VladimirKempik Thanks for your reviewing.

/integrate

@openjdk
Copy link

openjdk bot commented Dec 7, 2023

Going to push as commit 2f9e70e.
Since your change was applied there have been 263 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 7, 2023
@openjdk openjdk bot closed this Dec 7, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 7, 2023
@openjdk
Copy link

openjdk bot commented Dec 7, 2023

@Hamlin-Li Pushed as commit 2f9e70e.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@Hamlin-Li Hamlin-Li deleted the signum-v branch January 17, 2024 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants