
[RISCV] Lower vfmv.s.f intrinsics to VFMV_S_F_VL first #76699

Merged

lukel97 merged 2 commits into llvm:main on Jan 15, 2024

Conversation

lukel97 (Contributor) commented on Jan 2, 2024

Currently vfmv.s.f intrinsics are selected directly to their pseudos via a
tablegen pattern in RISCVInstrInfoVPseudos.td, whereas the other move
instructions (vmv.s.x/vmv.v.x/vmv.v.f etc.) are first lowered to their
corresponding VL SDNode and then selected from a pattern in
RISCVInstrInfoVVLPatterns.td.

This patch brings vfmv.s.f in line with the other move instructions, and shows
how the LMUL-reducing combine for VFMV_S_F_VL and VMV_S_X_VL (which can shrink
the operation to LMUL=1, since only the first element is written) results in
the discrepancy in the test added in #75544.

Split out from #71501, where we did this to preserve the behaviour of selecting
vmv_s_x for VFMV_S_F_VL for small enough immediates.
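
As a concrete illustration (a minimal sketch distilled from the vfmv.s.f.ll
test changes below; the function name is hypothetical, and iXLen stands in for
the target's XLEN integer type, following the test file's convention), an
LMUL=2 intrinsic call like this now gets its vsetvli narrowed to m1 by the
combine:

    declare <vscale x 8 x half> @llvm.riscv.vfmv.s.f.nxv8f16(<vscale x 8 x half>, half, iXLen)

    define <vscale x 8 x half> @example(<vscale x 8 x half> %v, half %f, iXLen %vl) {
      ; Inserts %f into element 0 of %v; the remaining elements come from the
      ; passthru %v (tail-undisturbed policy).
      %r = call <vscale x 8 x half> @llvm.riscv.vfmv.s.f.nxv8f16(<vscale x 8 x half> %v, half %f, iXLen %vl)
      ret <vscale x 8 x half> %r
    }

    ; Codegen before this patch: vsetvli zero, a0, e16, m2, tu, ma
    ; Codegen after this patch:  vsetvli zero, a0, e16, m1, tu, ma
    ;                            vfmv.s.f v8, fa0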

llvmbot (Collaborator) commented on Jan 2, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/76699.diff

4 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+3)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td (-26)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfmv.s.f.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll (+2-1)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 51580d15451ca2..978b465f89c069 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -8492,6 +8492,9 @@ SDValue RISCVTargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
     return DAG.getNode(RISCVISD::VSELECT_VL, DL, VT, SelectCond, SplattedVal,
                        Vec, VL);
   }
+  case Intrinsic::riscv_vfmv_s_f:
+    return DAG.getNode(RISCVISD::VFMV_S_F_VL, DL, Op.getSimpleValueType(),
+                       Op.getOperand(1), Op.getOperand(2), Op.getOperand(3));
   // EGS * EEW >= 128 bits
   case Intrinsic::riscv_vaesdf_vv:
   case Intrinsic::riscv_vaesdf_vs:
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index be4bc3b58766ef..af044f9b0a7228 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -7399,32 +7399,6 @@ foreach vti = AllIntegerVectors in {
   // vmv.s.x is handled with a custom node in RISCVInstrInfoVVLPatterns.td
 }
 
-//===----------------------------------------------------------------------===//
-// 16.2. Floating-Point Scalar Move Instructions
-//===----------------------------------------------------------------------===//
-
-foreach fvti = AllFloatVectors in {
-  let Predicates = GetVTypePredicates<fvti>.Predicates in {
-    def : Pat<(fvti.Vector (int_riscv_vfmv_s_f (fvti.Vector fvti.RegClass:$rs1),
-                           (fvti.Scalar fvti.ScalarRegClass:$rs2), VLOpFrag)),
-              (!cast<Instruction>("PseudoVFMV_S_"#fvti.ScalarSuffix#"_" #
-                                  fvti.LMul.MX)
-               (fvti.Vector $rs1),
-               (fvti.Scalar fvti.ScalarRegClass:$rs2),
-               GPR:$vl, fvti.Log2SEW)>;
-
-    def : Pat<(fvti.Vector (int_riscv_vfmv_s_f (fvti.Vector fvti.RegClass:$rs1),
-                           (fvti.Scalar (fpimm0)), VLOpFrag)),
-              (!cast<Instruction>("PseudoVMV_S_X_" # fvti.LMul.MX)
-               (fvti.Vector $rs1), (XLenVT X0), GPR:$vl, fvti.Log2SEW)>;
-
-    def : Pat<(fvti.Vector (int_riscv_vfmv_s_f (fvti.Vector fvti.RegClass:$rs1),
-                           (fvti.Scalar (SelectFPImm (XLenVT GPR:$imm))), VLOpFrag)),
-              (!cast<Instruction>("PseudoVMV_S_X_" # fvti.LMul.MX)
-               (fvti.Vector $rs1), GPR:$imm, GPR:$vl, fvti.Log2SEW)>;
-  }
-}
-
 //===----------------------------------------------------------------------===//
 // 16.3. Vector Slide Instructions
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/CodeGen/RISCV/rvv/vfmv.s.f.ll b/llvm/test/CodeGen/RISCV/rvv/vfmv.s.f.ll
index 4fdc7f2c774f5b..01d2eac99d0c07 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vfmv.s.f.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vfmv.s.f.ll
@@ -48,7 +48,7 @@ declare <vscale x 8 x half> @llvm.riscv.vfmv.s.f.nxv8f16(<vscale x 8 x half>, ha
 define <vscale x 8 x half> @intrinsic_vfmv.s.f_f_nxv8f16(<vscale x 8 x half> %0, half %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv8f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -61,7 +61,7 @@ declare <vscale x 16 x half> @llvm.riscv.vfmv.s.f.nxv16f16(<vscale x 16 x half>,
 define <vscale x 16 x half> @intrinsic_vfmv.s.f_f_nxv16f16(<vscale x 16 x half> %0, half %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv16f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -74,7 +74,7 @@ declare <vscale x 32 x half> @llvm.riscv.vfmv.s.f.nxv32f16(<vscale x 32 x half>,
 define <vscale x 32 x half> @intrinsic_vfmv.s.f_f_nxv32f16(<vscale x 32 x half> %0, half %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv32f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -113,7 +113,7 @@ declare <vscale x 4 x float> @llvm.riscv.vfmv.s.f.nxv4f32(<vscale x 4 x float>,
 define <vscale x 4 x float> @intrinsic_vfmv.s.f_f_nxv4f32(<vscale x 4 x float> %0, float %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv4f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -126,7 +126,7 @@ declare <vscale x 8 x float> @llvm.riscv.vfmv.s.f.nxv8f32(<vscale x 8 x float>,
 define <vscale x 8 x float> @intrinsic_vfmv.s.f_f_nxv8f32(<vscale x 8 x float> %0, float %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv8f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -139,7 +139,7 @@ declare <vscale x 16 x float> @llvm.riscv.vfmv.s.f.nxv16f32(<vscale x 16 x float
 define <vscale x 16 x float> @intrinsic_vfmv.s.f_f_nxv16f32(<vscale x 16 x float> %0, float %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv16f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -165,7 +165,7 @@ declare <vscale x 2 x double> @llvm.riscv.vfmv.s.f.nxv2f64(<vscale x 2 x double>
 define <vscale x 2 x double> @intrinsic_vfmv.s.f_f_nxv2f64(<vscale x 2 x double> %0, double %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv2f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -178,7 +178,7 @@ declare <vscale x 4 x double> @llvm.riscv.vfmv.s.f.nxv4f64(<vscale x 4 x double>
 define <vscale x 4 x double> @intrinsic_vfmv.s.f_f_nxv4f64(<vscale x 4 x double> %0, double %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv4f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -191,7 +191,7 @@ declare <vscale x 8 x double> @llvm.riscv.vfmv.s.f.nxv8f64(<vscale x 8 x double>
 define <vscale x 8 x double> @intrinsic_vfmv.s.f_f_nxv8f64(<vscale x 8 x double> %0, double %1, iXLen %2) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_nxv8f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vfmv.s.f v8, fa0
 ; CHECK-NEXT:    ret
 entry:
@@ -235,7 +235,7 @@ entry:
 define <vscale x 8 x half> @intrinsic_vfmv.s.f_f_zero_nxv8f16(<vscale x 8 x half> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv8f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -246,7 +246,7 @@ entry:
 define <vscale x 16 x half> @intrinsic_vfmv.s.f_f_zero_nxv16f16(<vscale x 16 x half> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv16f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -257,7 +257,7 @@ entry:
 define <vscale x 32 x half> @intrinsic_vfmv.s.f_f_zero_nxv32f16(<vscale x 32 x half> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv32f16:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e16, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e16, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -290,7 +290,7 @@ entry:
 define <vscale x 4 x float> @intrinsic_vfmv.s.f_f_zero_nxv4f32(<vscale x 4 x float> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv4f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -301,7 +301,7 @@ entry:
 define <vscale x 8 x float> @intrinsic_vfmv.s.f_f_zero_nxv8f32(<vscale x 8 x float> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv8f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -312,7 +312,7 @@ entry:
 define <vscale x 16 x float> @intrinsic_vfmv.s.f_f_zero_nxv16f32(<vscale x 16 x float> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv16f32:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e32, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -334,7 +334,7 @@ entry:
 define <vscale x 2 x double> @intrinsic_vfmv.s.f_f_zero_nxv2f64(<vscale x 2 x double> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv2f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m2, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -345,7 +345,7 @@ entry:
 define <vscale x 4 x double> @intrinsic_vfmv.s.f_f_zero_nxv4f64(<vscale x 4 x double> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv4f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m4, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
@@ -356,7 +356,7 @@ entry:
 define <vscale x 8 x double> @intrinsic_vfmv.s.f_f_zero_nxv8f64(<vscale x 8 x double> %0, iXLen %1) nounwind {
 ; CHECK-LABEL: intrinsic_vfmv.s.f_f_zero_nxv8f64:
 ; CHECK:       # %bb.0: # %entry
-; CHECK-NEXT:    vsetvli zero, a0, e64, m8, tu, ma
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, tu, ma
 ; CHECK-NEXT:    vmv.s.x v8, zero
 ; CHECK-NEXT:    ret
 entry:
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
index e15c5a3323cbe2..5ad741dbac10e4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll
@@ -630,8 +630,9 @@ define void @add_v16i64(ptr %x, ptr %y) vscale_range(2,2) {
 define <vscale x 2 x float> @fp_reduction_vfmv_s_f(float %0, <vscale x 8 x float> %1, i64  %2) {
 ; CHECK-LABEL: fp_reduction_vfmv_s_f:
 ; CHECK:       # %bb.0:
-; CHECK-NEXT:    vsetvli zero, a0, e32, m4, ta, ma
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, ta, ma
 ; CHECK-NEXT:    vfmv.s.f v12, fa0
+; CHECK-NEXT:    vsetvli zero, a0, e32, m4, ta, ma
 ; CHECK-NEXT:    vfredusum.vs v8, v8, v12
 ; CHECK-NEXT:    ret
   %4 = tail call <vscale x 8 x float> @llvm.riscv.vfmv.s.f.nxv8f32.i64(<vscale x 8 x float> poison, float %0, i64 %2)

Review comment on llvm/test/CodeGen/RISCV/rvv/vsetvli-insert.ll:

; CHECK-NEXT: vfmv.s.f v12, fa0
; CHECK-NEXT: vsetvli zero, a0, e32, m4, ta, ma
Collaborator
This is a regression. Do you have a plan to fix this?

Contributor Author
I think @preames's draft patch here should fix it. Happy to take it over if there are no objections.


preames (Collaborator) left a comment

Needs a rebase after #76801; marking as "request changes" to reflect that and to make it easier to scan the review list for actionable items.

lukel97 force-pushed the remove-vfmv.s.f-intrinsic-pats branch from 050496a to d453f0e on January 11, 2024 08:36
topperc (Collaborator) left a comment
LGTM

lukel97 merged commit c07a1fe into llvm:main on Jan 15, 2024
4 checks passed
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024