Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV Add some vsetvli insertion test cases with vmv.s.x+reduction. NFC #75544

Merged
merged 1 commit into from
Dec 15, 2023

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Dec 14, 2023

These test cases where intended to get a single vsetvli by using the vmv.s.x intrinsic with the same LMUL as the reduction. This works for FP, but does not work for integer.

I believe #71501 will break this for FP too. Hopefully the vsetvli pass can be taught to fix this.

These test cases where intended to get a single vsetvli by using
the vmv.s.x intrinsic with the same LMUL as the reduction. This
works for FP, but does not work for integer.

I believe llvm#71501 will break this for FP too. Hopefully the vsetvli
pass can be taught to fix this.
Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

For the record, the difference between integer and float is very weird here.

@preames
Copy link
Collaborator

preames commented Dec 15, 2023

For the record, the difference between integer and float is very weird here.

Congrats, you nerd sniped me. :)

I dug into this, and it turns out to be a difference caused by the LMUL1 subreg stuff (which we apparently only do for int?), and a gap in the backwards walk code.

A draft patch follows. (Sorry for the huge inline comment, attaching files appears broken today?) Note that the fixme inline needs addressed before this can become a real patch, and there's definitely some style cleanup needed.

diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index b2d36b362b3a..6fe82991c82f 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -1472,6 +1472,14 @@ static bool isNonZeroAVL(const MachineOperand &MO,
   return 0 != MO.getImm();
 }
 
+static bool hasSameAVL(const MachineOperand &AVL1, const MachineOperand &AVL2) {
+  if (AVL1.isReg() && AVL2.isReg())
+    return AVL1.getReg() == AVL2.getReg();
+  if (AVL1.isImm() && AVL2.isImm())
+    return AVL1.getImm() == AVL2.getImm();
+  return false;
+}
+
 // Return true if we can mutate PrevMI to match MI without changing any the
 // fields which would be observed.
 static bool canMutatePriorConfig(const MachineInstr &PrevMI,
@@ -1490,15 +1498,24 @@ static bool canMutatePriorConfig(const MachineInstr &PrevMI,
     if (Used.VLZeroness) {
       if (isVLPreservingConfig(PrevMI))
         return false;
-      if (!isNonZeroAVL(MI.getOperand(1), MRI) ||
-          !isNonZeroAVL(PrevMI.getOperand(1), MRI))
-        return false;
+      if (!hasSameAVL(MI.getOperand(1), PrevMI.getOperand(1)))
+        if (!isNonZeroAVL(MI.getOperand(1), MRI) ||
+            !isNonZeroAVL(PrevMI.getOperand(1), MRI))
+          return false;
     }
 
     // TODO: Track whether the register is defined between
     // PrevMI and MI.
-    if (MI.getOperand(1).isReg() &&
-        RISCV::X0 != MI.getOperand(1).getReg())
+    auto regMayBeDefinedBetween = [&]() {
+      auto &AVL1 = PrevMI.getOperand(1);
+      auto &AVL2 = MI.getOperand(1);
+      // FIXME: This strongly assumes SSA. Rewrite using MRI interface.
+      if (AVL1.isReg() && AVL2.isReg() && AVL1.getReg() == AVL2.getReg())
+        return false;
+      return RISCV::X0 != AVL2.getReg();
+    };
+
+    if (MI.getOperand(1).isReg() && regMayBeDefinedBetween())
       return false;
   }
 
@@ -1519,7 +1536,6 @@ void RISCVInsertVSETVLI::doLocalPostpass(MachineBasicBlock &MBB) {
   Used.demandVTYPE();
   SmallVector<MachineInstr*> ToDelete;
   for (MachineInstr &MI : make_range(MBB.rbegin(), MBB.rend())) {
-
     if (!isVectorConfigInstr(MI)) {
       doUnion(Used, getDemanded(MI, MRI, ST));
       continue;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
index 6ebbc37f4afd..b652804813e4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
@@ -63,9 +63,8 @@ define <32 x i32> @insertelt_v32i32_31(<32 x i32> %a, i32 %y) {
 ; CHECK-LABEL: insertelt_v32i32_31:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a1, 32
-; CHECK-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
-; CHECK-NEXT:    vmv.s.x v16, a0
 ; CHECK-NEXT:    vsetvli zero, a1, e32, m8, ta, ma
+; CHECK-NEXT:    vmv.s.x v16, a0
 ; CHECK-NEXT:    vslideup.vi v8, v16, 31
 ; CHECK-NEXT:    ret
   %b = insertelement <32 x i32> %a, i32 %y, i32 31
@@ -101,9 +100,8 @@ define <64 x i32> @insertelt_v64i32_63(<64 x i32> %a, i32 %y) {
 ; CHECK-LABEL: insertelt_v64i32_63:
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    li a1, 32
-; CHECK-NEXT:    vsetvli zero, a1, e32, m1, ta, ma
-; CHECK-NEXT:    vmv.s.x v24, a0
 ; CHECK-NEXT:    vsetvli zero, a1, e32, m8, ta, ma
+; CHECK-NEXT:    vmv.s.x v24, a0
 ; CHECK-NEXT:    vslideup.vi v16, v24, 31
 ; CHECK-NEXT:    ret
   %b = insertelement <64 x i32> %a, i32 %y, i32 63

@topperc topperc merged commit 93b14c3 into llvm:main Dec 15, 2023
4 checks passed
@topperc topperc deleted the pr/vsetvli-reduction branch December 15, 2023 16:50
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jan 3, 2024
One of the requirements to be able to delete a vsetvli in the backwards pass is
that the preceding vsetvli must have the same AVL. This handles the case where
the AVLs are registers by using MachineRegisterInfo to check if there are any
definitions between the two vsetvlis.

The Dominates helper was taken from MachineDominatorTree and scans through the
instructions in the block which is less than ideal. But it's only called
whenever the two registers are the same, which should be rare.

This also replaces the equally-zero check with the existing hasEquallyZeroAVL
function, which is needed to handle the case where the AVLs are the same.

Based off the draft patch in
llvm#75544 (comment).
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jan 11, 2024
Currently vfmv.s.f intrinsics are directly selected to their pseudos via a
tablegen pattern in RISCVInstrInfoVPseudos.td, whereas the other move
instructions (vmv.s.x/vmv.v.x/vmv.v.f etc.) first get lowered to their
corresponding VL SDNode, then get selected from a pattern in
RISCVInstrInfoVVLPatterns.td

This patch brings vfmv.s.f inline with the other move instructions, and shows
how the LMUL reducing combine for VFMV_S_F_VL and VMV_S_X_VL results in the
discrepancy in the test added in llvm#75544.

Split out from llvm#71501, where we did this to preserve the behaviour of selecting
vmv_s_x for VFMV_S_F_VL for small enough immediates.
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jan 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants