[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

topperc · 2023-12-14T23:50:21Z

These test cases where intended to get a single vsetvli by using the vmv.s.x intrinsic with the same LMUL as the reduction. This works for FP, but does not work for integer.

I believe #71501 will break this for FP too. Hopefully the vsetvli pass can be taught to fix this.

These test cases where intended to get a single vsetvli by using the vmv.s.x intrinsic with the same LMUL as the reduction. This works for FP, but does not work for integer. I believe llvm#71501 will break this for FP too. Hopefully the vsetvli pass can be taught to fix this.

preames

LGTM

For the record, the difference between integer and float is very weird here.

preames · 2023-12-15T16:17:41Z

For the record, the difference between integer and float is very weird here.

Congrats, you nerd sniped me. :)

I dug into this, and it turns out to be a difference caused by the LMUL1 subreg stuff (which we apparently only do for int?), and a gap in the backwards walk code.

A draft patch follows. (Sorry for the huge inline comment, attaching files appears broken today?) Note that the fixme inline needs addressed before this can become a real patch, and there's definitely some style cleanup needed.

diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index b2d36b362b3a..6fe82991c82f 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -1472,6 +1472,14 @@ static bool isNonZeroAVL(const MachineOperand &MO,
   return 0 != MO.getImm();
 }
 
+static bool hasSameAVL(const MachineOperand &AVL1, const MachineOperand &AVL2) {
+  if (AVL1.isReg() && AVL2.isReg())
+    return AVL1.getReg() == AVL2.getReg();
+  if (AVL1.isImm() && AVL2.isImm())
+    return AVL1.getImm() == AVL2.getImm();
+  return false;
+}
+
 // Return true if we can mutate PrevMI to match MI without changing any the
 // fields which would be observed.
 static bool canMutatePriorConfig(const MachineInstr &PrevMI,
@@ -1490,15 +1498,24 @@ static bool canMutatePriorConfig(const MachineInstr &PrevMI,
     if (Used.VLZeroness) {
       if (isVLPreservingConfig(PrevMI))
         return false;
-      if (!isNonZeroAVL(MI.getOperand(1), MRI) ||
-          !isNonZeroAVL(PrevMI.getOperand(1), MRI))
-        return false;
+      if (!hasSameAVL(MI.getOperand(1), PrevMI.getOperand(1)))
+        if (!isNonZeroAVL(MI.getOperand(1), MRI) ||
+            !isNonZeroAVL(PrevMI.getOperand(1), MRI))
+          return false;
     }
 
     // TODO: Track whether the register is defined between
     // PrevMI and MI.
-    if (MI.getOperand(1).isReg() &&
-        RISCV::X0 != MI.getOperand(1).getReg())
+    auto regMayBeDefinedBetween = [&]() {
+      auto &AVL1 = PrevMI.getOperand(1);
+      auto &AVL2 = MI.getOperand(1);
+      // FIXME: This strongly assumes SSA. Rewrite using MRI interface.
+      if (AVL1.isReg() && AVL2.isReg() && AVL1.getReg() == AVL2.getReg())
+        return false;
+      return RISCV::X0 != AVL2.getReg();
+    };
+
+    if (MI.getOperand(1).isReg() && regMayBeDefinedBetween())
       return false;
   }
 
@@ -1519,7 +1536,6 @@ void RISCVInsertVSETVLI::doLocalPostpass(MachineBasicBlock &MBB) {
   Used.demandVTYPE();
   SmallVector<MachineInstr*> ToDelete;
   for (MachineInstr &MI : make_range(MBB.rbegin(), MBB.rend())) {
-
     if (!isVectorConfigInstr(MI)) {
       doUnion(Used, getDemanded(MI, MRI, ST));
       continue;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
index 6ebbc37f4afd..b652804813e4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
@@ -63,9 +63,8 @@ define <32 x i32> @insertelt_v32i32_31(<32 x i32> %a, i32 %y) {
 ; CHECK-LABEL: insertelt_v32i32_31:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a1, 32
-; CHECK-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
-; CHECK-NEXT:    vmv.s.x v16, a0
 ; CHECK-NEXT:    vsetvli zero, a1, e32, m8, ta, ma
+; CHECK-NEXT:    vmv.s.x v16, a0
 ; CHECK-NEXT:    vslideup.vi v8, v16, 31
 ; CHECK-NEXT:    ret
   %b = insertelement <32 x i32> %a, i32 %y, i32 31
@@ -101,9 +100,8 @@ define <64 x i32> @insertelt_v64i32_63(<64 x i32> %a, i32 %y) {
 ; CHECK-LABEL: insertelt_v64i32_63:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a1, 32
-; CHECK-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
-; CHECK-NEXT:    vmv.s.x v24, a0
 ; CHECK-NEXT:    vsetvli zero, a1, e32, m8, ta, ma
+; CHECK-NEXT:    vmv.s.x v24, a0
 ; CHECK-NEXT:    vslideup.vi v16, v24, 31
 ; CHECK-NEXT:    ret
   %b = insertelement <64 x i32> %a, i32 %y, i32 63

One of the requirements to be able to delete a vsetvli in the backwards pass is that the preceding vsetvli must have the same AVL. This handles the case where the AVLs are registers by using MachineRegisterInfo to check if there are any definitions between the two vsetvlis. The Dominates helper was taken from MachineDominatorTree and scans through the instructions in the block which is less than ideal. But it's only called whenever the two registers are the same, which should be rare. This also replaces the equally-zero check with the existing hasEquallyZeroAVL function, which is needed to handle the case where the AVLs are the same. Based off the draft patch in llvm#75544 (comment).

Currently vfmv.s.f intrinsics are directly selected to their pseudos via a tablegen pattern in RISCVInstrInfoVPseudos.td, whereas the other move instructions (vmv.s.x/vmv.v.x/vmv.v.f etc.) first get lowered to their corresponding VL SDNode, then get selected from a pattern in RISCVInstrInfoVVLPatterns.td This patch brings vfmv.s.f inline with the other move instructions, and shows how the LMUL reducing combine for VFMV_S_F_VL and VMV_S_X_VL results in the discrepancy in the test added in llvm#75544. Split out from llvm#71501, where we did this to preserve the behaviour of selecting vmv_s_x for VFMV_S_F_VL for small enough immediates.

topperc requested review from preames and lukel97 December 14, 2023 23:50

preames approved these changes Dec 15, 2023

View reviewed changes

preames mentioned this pull request Dec 15, 2023

[RISCV] Remove vmv.s.x and vmv.x.s lmul pseudo variants #71501

Merged

topperc merged commit 93b14c3 into llvm:main Dec 15, 2023
4 checks passed

topperc deleted the pr/vsetvli-reduction branch December 15, 2023 16:50

lukel97 mentioned this pull request Jan 2, 2024

[RISCV] Lower vfmv.s.f intrinsics to VFMV_S_F_VL first #76699

Merged

lukel97 mentioned this pull request Jan 3, 2024

[RISCV] Allow vsetvlis with same register AVL in doLocalPostpass #76801

Merged

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jan 15, 2024

Update vsetvli-insert.ll after llvm#75544

18dd94d

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

topperc commented Dec 14, 2023

preames left a comment

preames commented Dec 15, 2023

[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

Conversation

topperc commented Dec 14, 2023

preames left a comment

Choose a reason for hiding this comment

preames commented Dec 15, 2023