[RISCV] Re-model RVV comparison instructions #88868

Open
wants to merge 2 commits into main

Conversation

wangpc-pp
Contributor

We remove the @earlyclobber constraint.

Instead, we add RegisterClasses which contain only the lowest-numbered
LMUL1 register of each register group for the different LMULs. We use
them as the output operand of comparison instructions to match the
constraint:

> The destination EEW is smaller than the source EEW and the overlap
> is in the lowest-numbered part of the source register group.

The benefits:

  • From the test diffs, we can see improvements (and there are some
    regressions in LMUL8 tests). Of course, this conclusion is from
    unit tests; we should evaluate it on some benchmarks.
  • This change makes [RISCV] Don't use V0 directly in patterns #88496 possible (though I think a fine-grained
    earlyclobber mechanism should be the right way).
  • If we agree, we can model other instructions in this way.

Created using spr 1.3.6-beta.1
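
For quick reference, the following condensed TableGen sketch is extracted and
trimmed from the RISCVRegisterInfo.td and RISCVInstrInfoVPseudos.td hunks in
the diff below; it shows the new mask-output register classes and how they are
wired into LMULInfo and the comparison pseudos (nothing here is beyond what the
patch itself contains):

// New register classes holding only the lowest-numbered LMUL1 register of
// each possible register group; V0 is placed first in the allocation order
// because it is likely to be used as a mask.
def VMM1 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31)), 1>;
def VMM2 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 2)), 1>;
def VMM4 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 4)), 1>;
def VMM8 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 8)), 1>;

// LMULInfo gains a mask-output class, defaulting to VMM1 and set to
// VMM2/VMM4/VMM8 for V_M2/V_M4/V_M8.
class LMULInfo<int lmul, int oct, VReg regclass, VReg wregclass,
               VReg f2regclass, VReg f4regclass, VReg f8regclass, string mx,
               VReg moutregclass = VMM1> {
  // ... existing fields, plus:
  VReg moutclass = moutregclass;
}

// The comparison pseudos then use it instead of VR plus @earlyclobber:
multiclass VPseudoBinaryM_VV<LMULInfo m, int TargetConstraintType = 1> {
  defm _VV : VPseudoBinaryM<m.moutclass, m.vrclass, m.vrclass, m, "",
                            TargetConstraintType>;
}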
@llvmbot
Collaborator

llvmbot commented Apr 16, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: Pengcheng Wang (wangpc-pp)

Changes

We remove the @earlyclobber constraint.

Instead, we add RegisterClasses which contain only the lowest-numbered
LMUL1 register of each register group for the different LMULs. We use
them as the output operand of comparison instructions to match the
constraint:

> The destination EEW is smaller than the source EEW and the overlap
> is in the lowest-numbered part of the source register group.

The benefits:

  • From the test diffs, we can see improvements (and there are some
    regressions in LMUL8 tests). Of course, this conclusion is from
    unit tests; we should evaluate it on some benchmarks.
  • This change makes #88496 possible (though I think a fine-grained
    earlyclobber mechanism should be the right way).
  • If we agree, we can model other instructions in this way.

Patch is 1.84 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88868.diff

90 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp (+2-14)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td (+10-12)
  • (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.td (+6)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/icmp.mir (+68-68)
  • (modified) llvm/test/CodeGen/RISCV/rvv/active_lane_mask.ll (+22-22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/binop-splats.ll (+9-9)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bitreverse-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/bswap-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/ceil-vp.ll (+52-74)
  • (modified) llvm/test/CodeGen/RISCV/rvv/extractelt-i1.ll (+14-14)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-binop-splats.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bitreverse-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-bswap-vp.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-ceil-vp.ll (+46-67)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-i1.ll (+48-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-floor-vp.ll (+46-67)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll (+113-108)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum.ll (+42-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fminimum-vp.ll (+113-108)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fminimum.ll (+42-32)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-setcc.ll (+138-138)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fptosi-vp-mask.ll (+2-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fptoui-vp-mask.ll (+2-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-setcc.ll (+48-48)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-interleave-store.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-mask-splat.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-load-fp.ll (+16-16)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-load-int.ll (+14-14)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-fp.ll (+32-120)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-store-int.ll (+39-103)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-nearbyint-vp.ll (+57-62)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll (+336-160)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-int-vp.ll (+22-22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-rint-vp.ll (+36-53)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-round-vp.ll (+46-67)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundeven-vp.ll (+46-67)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-roundtozero-vp.ll (+46-67)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-fp-vp.ll (+177-235)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-setcc-int-vp.ll (+164-295)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp.ll (+25-25)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfclass-vp.ll (+14-21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfcmp-constrained-sdnode.ll (+1008-1260)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfcmps-constrained-sdnode.ll (+486-486)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpmerge.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect-vp.ll (+17-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/floor-vp.ll (+52-74)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fmaximum-sdnode.ll (+129-69)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fmaximum-vp.ll (+245-336)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fminimum-sdnode.ll (+129-69)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fminimum-vp.ll (+245-336)
  • (modified) llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll (+34-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll (+114-170)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rint-vp.ll (+78-124)
  • (modified) llvm/test/CodeGen/RISCV/rvv/round-vp.ll (+78-124)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll (+78-124)
  • (modified) llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll (+78-124)
  • (modified) llvm/test/CodeGen/RISCV/rvv/select-int.ll (+18-18)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-fp-vp.ll (+347-371)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-fp.ll (+296-296)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-int-vp.ll (+179-291)
  • (modified) llvm/test/CodeGen/RISCV/rvv/setcc-integer.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/sshl_sat_vec.ll (+12-12)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-load.ll (+2-2)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave-store.ll (+13-22)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+4-4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfcmp-constrained-sdnode.ll (+1452-1830)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfcmps-constrained-sdnode.ll (+729-729)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptoi-sdnode.ll (+8-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptosi-vp-mask.ll (+2-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vfptoui-vp-mask.ll (+2-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfeq.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfge.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfgt.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfle.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmflt.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmfne.ll (+39-39)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmseq.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsge.ll (+118-122)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgeu.ll (+118-122)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgt.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsgtu.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsle.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsleu.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmslt.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsltu.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmsne.ll (+82-82)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-reverse-mask.ll (+3-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vp-splice-mask-vectors.ll (+3-6)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vselect-fp.ll (+12-20)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vtrunc-vp-mask.ll (+2-3)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
index 86e44343b50865..ca77a9729e03b9 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
@@ -110,6 +110,8 @@ RISCVRegisterBankInfo::getRegBankFromRegClass(const TargetRegisterClass &RC,
                                               LLT Ty) const {
   switch (RC.getID()) {
   default:
+    if (RISCVRI::isVRegClass(RC.TSFlags))
+      return getRegBank(RISCV::VRBRegBankID);
     llvm_unreachable("Register class not supported");
   case RISCV::GPRRegClassID:
   case RISCV::GPRF16RegClassID:
@@ -131,20 +133,6 @@ RISCVRegisterBankInfo::getRegBankFromRegClass(const TargetRegisterClass &RC,
   case RISCV::FPR64CRegClassID:
   case RISCV::FPR32CRegClassID:
     return getRegBank(RISCV::FPRBRegBankID);
-  case RISCV::VMRegClassID:
-  case RISCV::VRRegClassID:
-  case RISCV::VRNoV0RegClassID:
-  case RISCV::VRM2RegClassID:
-  case RISCV::VRM2NoV0RegClassID:
-  case RISCV::VRM4RegClassID:
-  case RISCV::VRM4NoV0RegClassID:
-  case RISCV::VMV0RegClassID:
-  case RISCV::VRM2_with_sub_vrm1_0_in_VMV0RegClassID:
-  case RISCV::VRM4_with_sub_vrm1_0_in_VMV0RegClassID:
-  case RISCV::VRM8RegClassID:
-  case RISCV::VRM8NoV0RegClassID:
-  case RISCV::VRM8_with_sub_vrm1_0_in_VMV0RegClassID:
-    return getRegBank(RISCV::VRBRegBankID);
   }
 }
 
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
index ad1821d57256bc..686bfd1af0d062 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoVPseudos.td
@@ -143,22 +143,24 @@ class PseudoToVInst<string PseudoInst> {
 
 // This class describes information associated to the LMUL.
 class LMULInfo<int lmul, int oct, VReg regclass, VReg wregclass,
-               VReg f2regclass, VReg f4regclass, VReg f8regclass, string mx> {
+               VReg f2regclass, VReg f4regclass, VReg f8regclass, string mx,
+               VReg moutregclass = VMM1> {
   bits<3> value = lmul; // This is encoded as the vlmul field of vtype.
   VReg vrclass = regclass;
   VReg wvrclass = wregclass;
   VReg f8vrclass = f8regclass;
   VReg f4vrclass = f4regclass;
   VReg f2vrclass = f2regclass;
+  VReg moutclass = moutregclass;
   string MX = mx;
   int octuple = oct;
 }
 
 // Associate LMUL with tablegen records of register classes.
 def V_M1  : LMULInfo<0b000,  8,   VR,        VRM2,   VR,   VR, VR, "M1">;
-def V_M2  : LMULInfo<0b001, 16, VRM2,        VRM4,   VR,   VR, VR, "M2">;
-def V_M4  : LMULInfo<0b010, 32, VRM4,        VRM8, VRM2,   VR, VR, "M4">;
-def V_M8  : LMULInfo<0b011, 64, VRM8,/*NoVReg*/VR, VRM4, VRM2, VR, "M8">;
+def V_M2  : LMULInfo<0b001, 16, VRM2,        VRM4,   VR,   VR, VR, "M2", VMM2>;
+def V_M4  : LMULInfo<0b010, 32, VRM4,        VRM8, VRM2,   VR, VR, "M4", VMM4>;
+def V_M8  : LMULInfo<0b011, 64, VRM8,/*NoVReg*/VR, VRM4, VRM2, VR, "M8", VMM8>;
 
 def V_MF8 : LMULInfo<0b101, 1, VR, VR,/*NoVReg*/VR,/*NoVReg*/VR,/*NoVReg*/VR, "MF8">;
 def V_MF4 : LMULInfo<0b110, 2, VR, VR,          VR,/*NoVReg*/VR,/*NoVReg*/VR, "MF4">;
@@ -2668,25 +2670,21 @@ multiclass PseudoVEXT_VF8 {
 // With LMUL<=1 the source and dest occupy a single register so any overlap
 // is in the lowest-numbered part.
 multiclass VPseudoBinaryM_VV<LMULInfo m, int TargetConstraintType = 1> {
-  defm _VV : VPseudoBinaryM<VR, m.vrclass, m.vrclass, m,
-                            !if(!ge(m.octuple, 16), "@earlyclobber $rd", ""), TargetConstraintType>;
+  defm _VV : VPseudoBinaryM<m.moutclass, m.vrclass, m.vrclass, m, "", TargetConstraintType>;
 }
 
 multiclass VPseudoBinaryM_VX<LMULInfo m, int TargetConstraintType = 1> {
   defm "_VX" :
-    VPseudoBinaryM<VR, m.vrclass, GPR, m,
-                   !if(!ge(m.octuple, 16), "@earlyclobber $rd", ""), TargetConstraintType>;
+    VPseudoBinaryM<m.moutclass, m.vrclass, GPR, m, "", TargetConstraintType>;
 }
 
 multiclass VPseudoBinaryM_VF<LMULInfo m, FPR_Info f, int TargetConstraintType = 1> {
   defm "_V" # f.FX :
-    VPseudoBinaryM<VR, m.vrclass, f.fprclass, m,
-                   !if(!ge(m.octuple, 16), "@earlyclobber $rd", ""), TargetConstraintType>;
+    VPseudoBinaryM<m.moutclass, m.vrclass, f.fprclass, m, "", TargetConstraintType>;
 }
 
 multiclass VPseudoBinaryM_VI<LMULInfo m, int TargetConstraintType = 1> {
-  defm _VI : VPseudoBinaryM<VR, m.vrclass, simm5, m,
-                            !if(!ge(m.octuple, 16), "@earlyclobber $rd", ""), TargetConstraintType>;
+  defm _VI : VPseudoBinaryM<m.moutclass, m.vrclass, simm5, m, "", TargetConstraintType>;
 }
 
 multiclass VPseudoVGTR_VV_VX_VI<Operand ImmType = simm5, string Constraint = ""> {
diff --git a/llvm/lib/Target/RISCV/RISCVRegisterInfo.td b/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
index 316daf2763ca1e..1a0533c7072705 100644
--- a/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVRegisterInfo.td
@@ -533,6 +533,12 @@ def VR : VReg<!listconcat(VM1VTs, VMaskVTs),
               (add (sequence "V%u", 8, 31),
                    (sequence "V%u", 7, 0)), 1>;
 
+// V0 is likely to be used as mask, so we move it in front of allocation order.
+def VMM1 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31)), 1>;
+def VMM2 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 2)), 1>;
+def VMM4 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 4)), 1>;
+def VMM8 : VReg<VMaskVTs, (add (sequence "V%u", 0, 31, 8)), 1>;
+
 def VRNoV0 : VReg<!listconcat(VM1VTs, VMaskVTs), (sub VR, V0), 1>;
 
 def VRM2 : VReg<VM2VTs, (add (sequence "V%uM2", 8, 31, 2),
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/icmp.mir b/llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/icmp.mir
index df0d48aac92551..0677232fa60677 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/icmp.mir
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/instruction-select/rvv/icmp.mir
@@ -13,13 +13,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv1i8
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLTU_VV_MF8_:%[0-9]+]]:vr = PseudoVMSLTU_VV_MF8 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: [[PseudoVMSLTU_VV_MF8_:%[0-9]+]]:vmm1 = PseudoVMSLTU_VV_MF8 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_MF8_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv1i8
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLTU_VV_MF8_:%[0-9]+]]:vr = PseudoVMSLTU_VV_MF8 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: [[PseudoVMSLTU_VV_MF8_:%[0-9]+]]:vmm1 = PseudoVMSLTU_VV_MF8 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_MF8_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 1 x s8>) = G_IMPLICIT_DEF
@@ -37,13 +37,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv2i8
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLT_VV_MF4_:%[0-9]+]]:vr = PseudoVMSLT_VV_MF4 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: [[PseudoVMSLT_VV_MF4_:%[0-9]+]]:vmm1 = PseudoVMSLT_VV_MF4 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_MF4_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv2i8
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLT_VV_MF4_:%[0-9]+]]:vr = PseudoVMSLT_VV_MF4 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: [[PseudoVMSLT_VV_MF4_:%[0-9]+]]:vmm1 = PseudoVMSLT_VV_MF4 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_MF4_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 2 x s8>) = G_IMPLICIT_DEF
@@ -61,13 +61,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv4i8
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLEU_VV_MF2_:%[0-9]+]]:vr = PseudoVMSLEU_VV_MF2 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: [[PseudoVMSLEU_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSLEU_VV_MF2 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_MF2_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv4i8
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLEU_VV_MF2_:%[0-9]+]]:vr = PseudoVMSLEU_VV_MF2 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: [[PseudoVMSLEU_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSLEU_VV_MF2 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_MF2_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 4 x s8>) = G_IMPLICIT_DEF
@@ -85,13 +85,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv8i8
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLE_VV_M1_:%[0-9]+]]:vr = PseudoVMSLE_VV_M1 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: [[PseudoVMSLE_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_M1 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_M1_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv8i8
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLE_VV_M1_:%[0-9]+]]:vr = PseudoVMSLE_VV_M1 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: [[PseudoVMSLE_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_M1 [[DEF]], [[DEF]], -1, 3 /* e8 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_M1_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 8 x s8>) = G_IMPLICIT_DEF
@@ -109,14 +109,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv16i8
     ; RV32I: [[DEF:%[0-9]+]]:vrm2 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLTU_VV_M2_:%[0-9]+]]:vmm2 = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M2_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv16i8
     ; RV64I: [[DEF:%[0-9]+]]:vrm2 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLTU_VV_M2_:%[0-9]+]]:vmm2 = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M2_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 16 x s8>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 16 x s1>) = G_ICMP intpred(ugt), %0(<vscale x 16 x s8>), %0
@@ -133,14 +133,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv32i8
     ; RV32I: [[DEF:%[0-9]+]]:vrm4 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLT_VV_M4_:%[0-9]+]]:vmm4 = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_M4_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv32i8
     ; RV64I: [[DEF:%[0-9]+]]:vrm4 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLT_VV_M4_:%[0-9]+]]:vmm4 = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_M4_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 32 x s8>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 32 x s1>) = G_ICMP intpred(sgt), %0(<vscale x 32 x s8>), %0
@@ -157,14 +157,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv64i8
     ; RV32I: [[DEF:%[0-9]+]]:vrm8 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLEU_VV_M8_:%[0-9]+]]:vmm8 = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_M8_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv64i8
     ; RV64I: [[DEF:%[0-9]+]]:vrm8 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 3 /* e8 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLEU_VV_M8_:%[0-9]+]]:vmm8 = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 3 /* e8 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_M8_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 64 x s8>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 64 x s1>) = G_ICMP intpred(ule), %0(<vscale x 64 x s8>), %0
@@ -181,13 +181,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv1i16
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLE_VV_MF4_:%[0-9]+]]:vr = PseudoVMSLE_VV_MF4 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: [[PseudoVMSLE_VV_MF4_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_MF4 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_MF4_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv1i16
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLE_VV_MF4_:%[0-9]+]]:vr = PseudoVMSLE_VV_MF4 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: [[PseudoVMSLE_VV_MF4_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_MF4 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_MF4_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 1 x s16>) = G_IMPLICIT_DEF
@@ -205,13 +205,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv2i16
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSNE_VV_MF2_:%[0-9]+]]:vr = PseudoVMSNE_VV_MF2 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: [[PseudoVMSNE_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSNE_VV_MF2 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSNE_VV_MF2_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv2i16
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSNE_VV_MF2_:%[0-9]+]]:vr = PseudoVMSNE_VV_MF2 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: [[PseudoVMSNE_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSNE_VV_MF2 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSNE_VV_MF2_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 2 x s16>) = G_IMPLICIT_DEF
@@ -229,13 +229,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv4i16
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSEQ_VV_M1_:%[0-9]+]]:vr = PseudoVMSEQ_VV_M1 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: [[PseudoVMSEQ_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSEQ_VV_M1 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSEQ_VV_M1_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv4i16
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSEQ_VV_M1_:%[0-9]+]]:vr = PseudoVMSEQ_VV_M1 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: [[PseudoVMSEQ_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSEQ_VV_M1 [[DEF]], [[DEF]], -1, 4 /* e16 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSEQ_VV_M1_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 4 x s16>) = G_IMPLICIT_DEF
@@ -253,14 +253,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv8i16
     ; RV32I: [[DEF:%[0-9]+]]:vrm2 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLTU_VV_M2_:%[0-9]+]]:vmm2 = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M2_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv8i16
     ; RV64I: [[DEF:%[0-9]+]]:vrm2 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLTU_VV_M2_:%[0-9]+]]:vmm2 = PseudoVMSLTU_VV_M2 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M2_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 8 x s16>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 8 x s1>) = G_ICMP intpred(ult), %0(<vscale x 8 x s16>), %0
@@ -277,14 +277,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv16i16
     ; RV32I: [[DEF:%[0-9]+]]:vrm4 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLT_VV_M4_:%[0-9]+]]:vmm4 = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_M4_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv16i16
     ; RV64I: [[DEF:%[0-9]+]]:vrm4 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLT_VV_M4_:%[0-9]+]]:vmm4 = PseudoVMSLT_VV_M4 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLT_VV_M4_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 16 x s16>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 16 x s1>) = G_ICMP intpred(slt), %0(<vscale x 16 x s16>), %0
@@ -301,14 +301,14 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv32i16
     ; RV32I: [[DEF:%[0-9]+]]:vrm8 = IMPLICIT_DEF
-    ; RV32I-NEXT: early-clobber %1:vr = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV32I-NEXT: $v8 = COPY %1
+    ; RV32I-NEXT: [[PseudoVMSLEU_VV_M8_:%[0-9]+]]:vmm8 = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_M8_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv32i16
     ; RV64I: [[DEF:%[0-9]+]]:vrm8 = IMPLICIT_DEF
-    ; RV64I-NEXT: early-clobber %1:vr = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 4 /* e16 */
-    ; RV64I-NEXT: $v8 = COPY %1
+    ; RV64I-NEXT: [[PseudoVMSLEU_VV_M8_:%[0-9]+]]:vmm8 = PseudoVMSLEU_VV_M8 [[DEF]], [[DEF]], -1, 4 /* e16 */
+    ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLEU_VV_M8_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 32 x s16>) = G_IMPLICIT_DEF
     %1:vrb(<vscale x 32 x s1>) = G_ICMP intpred(uge), %0(<vscale x 32 x s16>), %0
@@ -325,13 +325,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv1i32
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLE_VV_MF2_:%[0-9]+]]:vr = PseudoVMSLE_VV_MF2 [[DEF]], [[DEF]], -1, 5 /* e32 */
+    ; RV32I-NEXT: [[PseudoVMSLE_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_MF2 [[DEF]], [[DEF]], -1, 5 /* e32 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_MF2_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv1i32
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLE_VV_MF2_:%[0-9]+]]:vr = PseudoVMSLE_VV_MF2 [[DEF]], [[DEF]], -1, 5 /* e32 */
+    ; RV64I-NEXT: [[PseudoVMSLE_VV_MF2_:%[0-9]+]]:vmm1 = PseudoVMSLE_VV_MF2 [[DEF]], [[DEF]], -1, 5 /* e32 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLE_VV_MF2_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 1 x s32>) = G_IMPLICIT_DEF
@@ -349,13 +349,13 @@ body:             |
   bb.0.entry:
     ; RV32I-LABEL: name: icmp_nxv2i32
     ; RV32I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV32I-NEXT: [[PseudoVMSLTU_VV_M1_:%[0-9]+]]:vr = PseudoVMSLTU_VV_M1 [[DEF]], [[DEF]], -1, 5 /* e32 */
+    ; RV32I-NEXT: [[PseudoVMSLTU_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSLTU_VV_M1 [[DEF]], [[DEF]], -1, 5 /* e32 */
     ; RV32I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M1_]]
     ; RV32I-NEXT: PseudoRET implicit $v8
     ;
     ; RV64I-LABEL: name: icmp_nxv2i32
     ; RV64I: [[DEF:%[0-9]+]]:vr = IMPLICIT_DEF
-    ; RV64I-NEXT: [[PseudoVMSLTU_VV_M1_:%[0-9]+]]:vr = PseudoVMSLTU_VV_M1 [[DEF]], [[DEF]], -1, 5 /* e32 */
+    ; RV64I-NEXT: [[PseudoVMSLTU_VV_M1_:%[0-9]+]]:vmm1 = PseudoVMSLTU_VV_M1 [[DEF]], [[DEF]], -1, 5 /* e32 */
     ; RV64I-NEXT: $v8 = COPY [[PseudoVMSLTU_VV_M1_]]
     ; RV64I-NEXT: PseudoRET implicit $v8
     %0:vrb(<vscale x 2 x s32>) = G_IMPLICIT_DEF
@@ -373,14 +373,14 @@ body:             |
   bb.0.entry:
   ...
[truncated]

; CHECK-NEXT: vmerge.vvm v24, v8, v16, v0
; CHECK-NEXT: vmv1r.v v0, v7
; CHECK-NEXT: vl1r.v v0, (a0) # Unknown-size Folded Reload
Contributor Author

An example of a regression here.

; CHECK-NEXT: vsetvli zero, zero, e64, m4, ta, ma
; CHECK-NEXT: fsrmi a0, 2
; CHECK-NEXT: vmv1r.v v0, v12
; CHECK-NEXT: vfcvt.x.f.v v16, v8, v0.t
; CHECK-NEXT: vfcvt.x.f.v v12, v8, v0.t
Contributor Author

An example of an improvement here.

Created using spr 1.3.6-beta.1
@topperc
Collaborator

topperc commented Apr 16, 2024

If we agree, we can model other instructions in this way.

What other instructions need this?

@topperc
Collaborator

topperc commented Apr 16, 2024

I think, a long time ago, I wondered if we should have a version of the compare pseudos that had the dest tied to v0 to avoid the early clobber, and let the 3-address instruction conversion break it if needed. Similar to the _TIED instructions we have for VWADD.WV.

@wangpc-pp
Contributor Author

I think, a long time ago, I wondered if we should have a version of the compare pseudos that had the dest tied to v0 to avoid the early clobber, and let the 3-address instruction conversion break it if needed. Similar to the _TIED instructions we have for VWADD.WV.

Thanks! I didn't know we could avoid the early clobber this way; I will give it a try!

@wangpc-pp
Contributor Author

I think, a long time ago, I wondered if we should have a version of the compare pseudos that had the dest tied to v0 to avoid the early clobber, and let the 3-address instruction conversion break it if needed. Similar to the _TIED instructions we have for VWADD.WV.

Thanks! I didn't know we could avoid the early clobber this way; I will give it a try!

I don't know if I understand correctly, but I think this TIED-pseudo approach doesn't work for compare instructions. :-(

wangpc-pp added a commit that referenced this pull request Apr 17, 2024
Created using spr 1.3.6-beta.1
@@ -88,11 +88,11 @@ define <vscale x 16 x i1> @nxv16i1(i1 %x, i1 %y) {
; CHECK-NEXT: andi a0, a0, 1
; CHECK-NEXT: vsetvli a2, zero, e8, m2, ta, ma
; CHECK-NEXT: vmv.v.x v8, a0
; CHECK-NEXT: vmsne.vi v10, v8, 0
; CHECK-NEXT: vmsne.vi v0, v8, 0
Contributor

For LMUL 8 the dest reg can only be v0, v8, v16, or v24 now, right? I think we would want to check that the register coalescer isn't propagating the VMM8 reg class to other uses. Or at least make sure that it gets inflated back to VR?

Contributor Author

Or at least make sure that it gets inflated back to VR?

Any thoughts about this?

Contributor

I was thinking that MachineRegisterInfo::recomputeRegClass might be able to change a VMM8 register to VR. But after reading the function, I don't think it will if the compare instruction is still around since that will still have the constraint.

So I presume we'll end up with spilling if we have more than four LMUL8 compare results live at the same time? It would be good to know how often that happens in practice.

@preames
Collaborator

preames commented Apr 22, 2024

It's not clear to me that this change is a net positive.

Using m2 as an example, with the early-clobber modeling the register allocator can pick any of 30 registers (for the vx forms). With your proposed form, we only get 16 possible registers. This is a net increase in register constraints and seems undesirable.

The ability to reuse a source register is only useful if that source has no other use; for cases where the operand has other uses, this patch would be a regression.

I think Craig's suggestion of a tied variant which is untied if needed might be reasonable. We could also consider biting the bullet and implementing the actual overlap constraint rules, though that would need some pretty major evidence of benefit to be worth the work.
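
To make the register counts discussed above explicit, the candidates implied by
the new classes follow directly from the sequence strides in the diff; the
30-of-32 figure for the old modeling is the one quoted in the comment above:

// VMM1: V0, V1, ..., V31   (32 candidates, LMUL <= 1)
// VMM2: V0, V2, ..., V30   (16 candidates, LMUL = 2)
// VMM4: V0, V4, ..., V28   ( 8 candidates, LMUL = 4)
// VMM8: V0, V8, V16, V24   ( 4 candidates, LMUL = 8)
// Old @earlyclobber modeling (vx/vi forms at LMUL=2): any VR not overlapping
// the single source group, i.e. 30 of the 32 registers.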
