[RISCV] Fix mgather -> riscv.masked.strided.load combine not extending indices #82506

lukel97 · 2024-02-21T16:32:24Z

This fixes the miscompile reported in #82430 by telling isSimpleVIDSequence to sign extend to XLen instead of the width of the indices, since the "sequence" of indices generated by a strided load will be at XLen.

This was the simplest way I could think of getting isSimpleVIDSequence to treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant integers" part from isSimpleVIDSequence and handle them as APInts so we can separately zero extend it.
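
For illustration only, here is a minimal standalone sketch of the width problem. It is not the code in RISCVISelLowering.cpp: `signExtend`, `StridedSeq` and `matchTwoElementSequence` are hypothetical names, and the two 8-bit indices are made-up values chosen so that the addend flips sign the same way the `addi` immediate does in the test diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: interpret the low `bits` bits of `v` as a signed value.
// For bits >= 64 the value is returned unchanged.
inline int64_t signExtend(uint64_t v, unsigned bits) {
  if (bits >= 64)
    return static_cast<int64_t>(v);
  const uint64_t mask = (uint64_t{1} << bits) - 1;
  const uint64_t sign = uint64_t{1} << (bits - 1);
  return static_cast<int64_t>(((v & mask) ^ sign) - sign);
}

// Hypothetical result type: element i of the sequence is addend + i * step.
struct StridedSeq {
  int64_t step;
  int64_t addend;
};

// Derive step and addend from two 8-bit indices. The indices are first
// zero-extended to 64 bits; `widthBits` is the width at which the step and
// the addend are then interpreted as signed numbers.
inline StridedSeq matchTwoElementSequence(uint8_t first, uint8_t second,
                                          unsigned widthBits) {
  const uint64_t v0 = first;   // zero-extended
  const uint64_t v1 = second;  // zero-extended
  const int64_t step = signExtend(v1 - v0, widthBits);
  const int64_t addend = signExtend(v0, widthBits);
  return {step, addend};
}

// Worked example with the made-up indices <128, 0>.
inline void checkWorkedExample() {
  // Interpreted at the index width (8 bits), 128 wraps around to -128.
  assert(matchTwoElementSequence(128, 0, 8).addend == -128);
  // Interpreted at a 64-bit XLen, the addend stays 128.
  assert(matchTwoElementSequence(128, 0, 64).addend == 128);
}
```

In this sketch the step is -128 at both widths, and only the addend changes sign.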

Fixes #82430

llvmbot · 2024-02-21T16:32:56Z

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

This fixes the miscompile reported in #82430 by telling isSimpleVIDSequence to
sign extend to XLen instead of the type of the indices, since the "sequence"
of indices generated by a strided load will be at XLen.

This was the simplest way I could think of getting isSimpleVIDSequence to
treat the indexes as if they were zero extended to XLenVT.

Another way we could do this is by refactoring out the "get constant integers"
part from isSimpleVIDSequence and handle them as APInts so we can separately
zero extend it.

Full diff: https://github.com/llvm/llvm-project/pull/82506.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+12-8)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll (+2-6)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f7275eb7c77bb3..75be97ff32bbe5 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -3240,7 +3240,8 @@ static std::optional<uint64_t> getExactInteger(const APFloat &APF,
 // Note that this method will also match potentially unappealing index
 // sequences, like <i32 0, i32 50939494>, however it is left to the caller to
 // determine whether this is worth generating code for.
-static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
+static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op,
+                                                      unsigned EltSizeInBits) {
   unsigned NumElts = Op.getNumOperands();
   assert(Op.getOpcode() == ISD::BUILD_VECTOR && "Unexpected BUILD_VECTOR");
   bool IsInteger = Op.getValueType().isInteger();
@@ -3248,7 +3249,7 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
   std::optional<unsigned> SeqStepDenom;
   std::optional<int64_t> SeqStepNum, SeqAddend;
   std::optional<std::pair<uint64_t, unsigned>> PrevElt;
-  unsigned EltSizeInBits = Op.getValueType().getScalarSizeInBits();
+  assert(EltSizeInBits >= Op.getValueType().getScalarSizeInBits());
   for (unsigned Idx = 0; Idx < NumElts; Idx++) {
     // Assume undef elements match the sequence; we just have to be careful
     // when interpolating across them.
@@ -3261,14 +3262,14 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
       if (!isa<ConstantSDNode>(Op.getOperand(Idx)))
         return std::nullopt;
       Val = Op.getConstantOperandVal(Idx) &
-            maskTrailingOnes<uint64_t>(EltSizeInBits);
+            maskTrailingOnes<uint64_t>(Op.getScalarValueSizeInBits());
     } else {
       // The BUILD_VECTOR must be all constants.
       if (!isa<ConstantFPSDNode>(Op.getOperand(Idx)))
         return std::nullopt;
       if (auto ExactInteger = getExactInteger(
               cast<ConstantFPSDNode>(Op.getOperand(Idx))->getValueAPF(),
-              EltSizeInBits))
+              Op.getScalarValueSizeInBits()))
         Val = *ExactInteger;
       else
         return std::nullopt;
@@ -3324,11 +3325,11 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
     uint64_t Val;
     if (IsInteger) {
       Val = Op.getConstantOperandVal(Idx) &
-            maskTrailingOnes<uint64_t>(EltSizeInBits);
+            maskTrailingOnes<uint64_t>(Op.getScalarValueSizeInBits());
     } else {
       Val = *getExactInteger(
           cast<ConstantFPSDNode>(Op.getOperand(Idx))->getValueAPF(),
-          EltSizeInBits);
+          Op.getScalarValueSizeInBits());
     }
     uint64_t ExpectedVal =
         (int64_t)(Idx * (uint64_t)*SeqStepNum) / *SeqStepDenom;
@@ -3598,7 +3599,7 @@ static SDValue lowerBuildVectorOfConstants(SDValue Op, SelectionDAG &DAG,
   // Try and match index sequences, which we can lower to the vid instruction
   // with optional modifications. An all-undef vector is matched by
   // getSplatValue, above.
-  if (auto SimpleVID = isSimpleVIDSequence(Op)) {
+  if (auto SimpleVID = isSimpleVIDSequence(Op, Op.getScalarValueSizeInBits())) {
     int64_t StepNumerator = SimpleVID->StepNumerator;
     unsigned StepDenominator = SimpleVID->StepDenominator;
     int64_t Addend = SimpleVID->Addend;
@@ -15978,7 +15979,10 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
 
     if (Index.getOpcode() == ISD::BUILD_VECTOR &&
         MGN->getExtensionType() == ISD::NON_EXTLOAD && isTypeLegal(VT)) {
-      if (std::optional<VIDSequence> SimpleVID = isSimpleVIDSequence(Index);
+      // The sequence will be XLenVT, not the type of Index. Tell
+      // isSimpleVIDSequence this so we avoid overflow.
+      if (std::optional<VIDSequence> SimpleVID =
+              isSimpleVIDSequence(Index, Subtarget.getXLen());
           SimpleVID && SimpleVID->StepDenominator == 1) {
         const int64_t StepNumerator = SimpleVID->StepNumerator;
         const int64_t Addend = SimpleVID->Addend;
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
index 1724b48dd6be9e..2628672ee6b722 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
@@ -15086,14 +15086,10 @@ define <32 x i64> @mgather_strided_split(ptr %base) {
   ret <32 x i64> %x
 }
 
-; FIXME: This is a miscompile triggered by the mgather ->
-; riscv.masked.strided.load combine. In order for it to trigger we need either a
-; strided gather that RISCVGatherScatterLowering doesn't pick up, or a new
-; strided gather generated by the widening sew combine.
 define <4 x i32> @masked_gather_widen_sew_negative_stride(ptr %base) {
 ; RV32V-LABEL: masked_gather_widen_sew_negative_stride:
 ; RV32V:       # %bb.0:
-; RV32V-NEXT:    addi a0, a0, -128
+; RV32V-NEXT:    addi a0, a0, 128
 ; RV32V-NEXT:    li a1, -128
 ; RV32V-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
 ; RV32V-NEXT:    vlse64.v v8, (a0), a1
@@ -15101,7 +15097,7 @@ define <4 x i32> @masked_gather_widen_sew_negative_stride(ptr %base) {
 ;
 ; RV64V-LABEL: masked_gather_widen_sew_negative_stride:
 ; RV64V:       # %bb.0:
-; RV64V-NEXT:    addi a0, a0, -128
+; RV64V-NEXT:    addi a0, a0, 128
 ; RV64V-NEXT:    li a1, -128
 ; RV64V-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
 ; RV64V-NEXT:    vlse64.v v8, (a0), a1

topperc · 2024-02-21T18:21:25Z

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll

 define <4 x i32> @masked_gather_widen_sew_negative_stride(ptr %base) {
 ; RV32V-LABEL: masked_gather_widen_sew_negative_stride:
 ; RV32V:       # %bb.0:
-; RV32V-NEXT:    addi a0, a0, -128
+; RV32V-NEXT:    addi a0, a0, 128
 ; RV32V-NEXT:    li a1, -128


I think I would like to see something closer to the original test case where we also calculated the wrong stride. Here the stride was correct.

lukel97 · 2024-02-22

I've updated the test in 11d115d to make the stride larger than 128.
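
To make the stride case concrete, here is a small standalone sketch, assuming 8-bit indices that are treated as zero extended (as the PR description puts it). `signExtend` and `commonStep` are hypothetical helpers and the index values are made up; none of this is taken from the updated test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: interpret the low `bits` bits of `v` as a signed value.
inline int64_t signExtend(uint64_t v, unsigned bits) {
  if (bits >= 64)
    return static_cast<int64_t>(v);
  const uint64_t mask = (uint64_t{1} << bits) - 1;
  const uint64_t sign = uint64_t{1} << (bits - 1);
  return static_cast<int64_t>(((v & mask) ^ sign) - sign);
}

// Returns the step shared by all consecutive index pairs when each difference
// is interpreted as a signed `widthBits`-bit number, or std::nullopt if the
// differences disagree (or there are fewer than two indices).
inline std::optional<int64_t> commonStep(const std::vector<uint8_t> &indices,
                                         unsigned widthBits) {
  if (indices.size() < 2)
    return std::nullopt;
  std::optional<int64_t> step;
  for (std::size_t i = 1; i < indices.size(); ++i) {
    // Zero-extend both indices before subtracting.
    const uint64_t diff = uint64_t{indices[i]} - uint64_t{indices[i - 1]};
    const int64_t s = signExtend(diff, widthBits);
    if (step && *step != s)
      return std::nullopt;
    step = s;
  }
  return step;
}

inline void checkStrideExample() {
  // A step of 200 does not fit in a signed 8-bit value, so it reads as -56.
  assert(commonStep({0, 200}, 8) == std::optional<int64_t>(-56));
  // At 64 bits the step is the intended 200.
  assert(commonStep({0, 200}, 64) == std::optional<int64_t>(200));
}
```

A related effect: `<0, 200, 144>` has a consistent step of -56 at 8 bits, because 144 is 400 modulo 256, but it is not a constant-step sequence at 64 bits, where the differences are 200 and -56.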

topperc

LGTM

wangpc-pp

LGTM.

wangpc-pp · 2024-02-22T03:48:18Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

@@ -3261,14 +3262,14 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
      if (!isa<ConstantSDNode>(Op.getOperand(Idx)))
        return std::nullopt;
      Val = Op.getConstantOperandVal(Idx) &
-            maskTrailingOnes<uint64_t>(EltSizeInBits);
+            maskTrailingOnes<uint64_t>(Op.getScalarValueSizeInBits());


Make Op.getScalarValueSizeInBits() a variable? It has a lot of usages.

lukel97 · 2024-02-22

I think we can also reduce the number of usages by only creating the list of integers once. I'll try and create a PR for this.

lukel97 · 2024-02-22

I've opened #82590; hopefully it removes some duplication.
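
As a rough idea of what "creating the list of integers once" could look like, here is a standalone sketch. It is not taken from #82590: `Operand` is a hypothetical stand-in for a BUILD_VECTOR operand and `collectElements` is a made-up name. Each element is extracted and masked in a single pass, so later passes never need to query the source element width again.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one BUILD_VECTOR operand.
struct Operand {
  enum class Kind { Undef, IntConstant, NonConstant };
  Kind kind;
  uint64_t value = 0;
};

// Extract every element exactly once, masked to `srcBits`.
// An inner std::nullopt marks an undef element; an outer std::nullopt means
// some operand was not a constant, so no sequence can be matched.
inline std::optional<std::vector<std::optional<uint64_t>>>
collectElements(const std::vector<Operand> &ops, unsigned srcBits) {
  const uint64_t mask =
      srcBits >= 64 ? ~uint64_t{0} : (uint64_t{1} << srcBits) - 1;
  std::vector<std::optional<uint64_t>> elts;
  elts.reserve(ops.size());
  for (const Operand &op : ops) {
    switch (op.kind) {
    case Operand::Kind::Undef:
      elts.push_back(std::nullopt);
      break;
    case Operand::Kind::IntConstant:
      elts.push_back(op.value & mask);
      break;
    case Operand::Kind::NonConstant:
      return std::nullopt;
    }
  }
  return elts;
}

inline void checkCollectExample() {
  const std::vector<Operand> ops = {{Operand::Kind::IntConstant, 0x180},
                                    {Operand::Kind::Undef},
                                    {Operand::Kind::IntConstant, 3}};
  const auto elts = collectElements(ops, 8);
  assert(elts.has_value());
  assert(elts->size() == 3);
}
```

Both the loop that derives the step and the loop that verifies it could then iterate over the returned vector instead of re-reading the operands.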

lukel97 requested review from preames, topperc and wangpc-pp February 21, 2024 16:32

llvmbot added the backend:RISC-V label Feb 21, 2024

topperc reviewed Feb 21, 2024

lukel97 added a commit that referenced this pull request Feb 22, 2024

[RISCV] Adjust test case to show wrong stride. NFC

11d115d

See #82506 (comment)

lukel97 force-pushed the fix-mgather-combine-isSimpleVIDSequence branch from 0330d18 to b9f27be Compare February 22, 2024 03:15

topperc approved these changes Feb 22, 2024

lukel97 merged commit 815644b into llvm:main Feb 22, 2024
3 of 4 checks passed

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 22, 2024

[RISCV] Adjust test case to show wrong stride. NFC

37dc7c0

See llvm#82506 (comment) (cherry picked from commit 11d115d)

wangpc-pp reviewed Feb 22, 2024

llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Mar 19, 2024

[RISCV] Adjust test case to show wrong stride. NFC

a9d4ed7

See llvm#82506 (comment) (cherry picked from commit 11d115d)

pointhex mentioned this pull request May 7, 2024

getStyleDiagHandler #91314

Closed

aemerson mentioned this pull request May 9, 2024

release/18.x: [AArc64][GlobalISel] Fix legalizer assert for G_INSERT_VECTOR_ELT - manual merge #91672

Merged
