
[GISEL] Add IRTranslation for shufflevector on scalable vector types #80378

Merged (7 commits) on Mar 7, 2024

Conversation

michaelmaitland
Contributor

@michaelmaitland commented Feb 2, 2024

This patch is stacked on #80372, #80307, and #80306.

A ShuffleVector on scalable vector types gets IRTranslated to G_SPLAT_VECTOR, because a ShuffleVector that operates on scalable vectors is a splat whose value is the 0th element of the first operand: the index mask operand is zeroinitializer (undef and poison are treated as zeroinitializer here). This is analogous to how SelectionDAG handles ShuffleVector.

buildSplatVector is renamed to buildBuildVectorSplatVector. I did not make this a separate patch because it would not be possible to revert that change without reverting this change too.

@llvmbot
Collaborator

llvmbot commented Feb 2, 2024

@llvm/pr-subscribers-backend-aarch64
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-globalisel

Author: Michael Maitland (michaelmaitland)

Changes

This patch is stacked on #80372, #80307, and #80306.


Patch is 108.12 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80378.diff

7 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/Utils.h (+10-4)
  • (modified) llvm/lib/CodeGen/GlobalISel/CallLowering.cpp (+7-6)
  • (modified) llvm/lib/CodeGen/GlobalISel/Utils.cpp (+100-61)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+27-12)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+3-3)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/shufflevector.ll (+1524)
  • (modified) llvm/unittests/CodeGen/GlobalISel/GISelUtilsTest.cpp (+143)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/Utils.h b/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
index bf02911e19351..f8900f3434cca 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/Utils.h
@@ -343,10 +343,13 @@ Register getFunctionLiveInPhysReg(MachineFunction &MF,
                                   const TargetRegisterClass &RC,
                                   const DebugLoc &DL, LLT RegTy = LLT());
 
-/// Return the least common multiple type of \p OrigTy and \p TargetTy, by changing the
-/// number of vector elements or scalar bitwidth. The intent is a
+/// Return the least common multiple type of \p OrigTy and \p TargetTy, by
+/// changing the number of vector elements or scalar bitwidth. The intent is a
 /// G_MERGE_VALUES, G_BUILD_VECTOR, or G_CONCAT_VECTORS can be constructed from
-/// \p OrigTy elements, and unmerged into \p TargetTy
+/// \p OrigTy elements, and unmerged into \p TargetTy. It is an error to call
+/// this function where one argument is a fixed vector and the other is a
+/// scalable vector, since it is illegal to build a G_{MERGE|UNMERGE}_VALUES
+/// between fixed and scalable vectors.
 LLVM_READNONE
 LLT getLCMType(LLT OrigTy, LLT TargetTy);
 
@@ -365,7 +368,10 @@ LLT getCoverTy(LLT OrigTy, LLT TargetTy);
 /// If these are vectors with different element types, this will try to produce
 /// a vector with a compatible total size, but the element type of \p OrigTy. If
 /// this can't be satisfied, this will produce a scalar smaller than the
-/// original vector elements.
+/// original vector elements. It is an error to call this function where
+/// one argument is a fixed vector and the other is a scalable vector, since it
+/// is illegal to build a G_{MERGE|UNMERGE}_VALUES between fixed and scalable
+/// vectors.
 ///
 /// In the worst case, this returns LLT::scalar(1)
 LLVM_READNONE
diff --git a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
index ccd9b13d730b6..3bd1542eeb746 100644
--- a/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
@@ -412,7 +412,7 @@ static void buildCopyFromRegs(MachineIRBuilder &B, ArrayRef<Register> OrigRegs,
     // size, e.g. PartLLT == v2s64 and LLTy is v3s32, then first coerce it to
     // have the same elt type, i.e. v4s32.
     // TODO: Extend this coersion to element multiples other than just 2.
-    if (PartLLT.getSizeInBits() > LLTy.getSizeInBits() &&
+    if (TypeSize::isKnownGT(PartLLT.getSizeInBits(), LLTy.getSizeInBits()) &&
         PartLLT.getScalarSizeInBits() == LLTy.getScalarSizeInBits() * 2 &&
         Regs.size() == 1) {
       LLT NewTy = PartLLT.changeElementType(LLTy.getElementType())
@@ -529,7 +529,7 @@ static void buildCopyToRegs(MachineIRBuilder &B, ArrayRef<Register> DstRegs,
   // We could just insert a regular copy, but this is unreachable at the moment.
   assert(SrcTy != PartTy && "identical part types shouldn't reach here");
 
-  const unsigned PartSize = PartTy.getSizeInBits();
+  const TypeSize PartSize = PartTy.getSizeInBits();
 
   if (PartTy.isVector() == SrcTy.isVector() &&
       PartTy.getScalarSizeInBits() > SrcTy.getScalarSizeInBits()) {
@@ -539,7 +539,7 @@ static void buildCopyToRegs(MachineIRBuilder &B, ArrayRef<Register> DstRegs,
   }
 
   if (SrcTy.isVector() && !PartTy.isVector() &&
-      PartSize > SrcTy.getElementType().getSizeInBits()) {
+      TypeSize::isKnownGT(PartSize, SrcTy.getElementType().getSizeInBits())) {
     // Vector was scalarized, and the elements extended.
     auto UnmergeToEltTy = B.buildUnmerge(SrcTy.getElementType(), SrcReg);
     for (int i = 0, e = DstRegs.size(); i != e; ++i)
@@ -548,9 +548,10 @@ static void buildCopyToRegs(MachineIRBuilder &B, ArrayRef<Register> DstRegs,
   }
 
   if (SrcTy.isVector() && PartTy.isVector() &&
-      PartTy.getScalarSizeInBits() == SrcTy.getScalarSizeInBits() &&
-      SrcTy.getNumElements() < PartTy.getNumElements()) {
-    // A coercion like: v2f32 -> v4f32.
+      PartTy.getSizeInBits() == SrcTy.getSizeInBits() &&
+      ElementCount::isKnownLT(SrcTy.getElementCount(),
+                              PartTy.getElementCount())) {
+    // A coercion like: v2f32 -> v4f32 or nxv2f32 -> nxv4f32
     Register DstReg = DstRegs.front();
     B.buildPadVectorWithUndefElements(DstReg, SrcReg);
     return;
diff --git a/llvm/lib/CodeGen/GlobalISel/Utils.cpp b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
index aed826a9cbc54..73da8e61972eb 100644
--- a/llvm/lib/CodeGen/GlobalISel/Utils.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Utils.cpp
@@ -1071,49 +1071,73 @@ void llvm::getSelectionDAGFallbackAnalysisUsage(AnalysisUsage &AU) {
 }
 
 LLT llvm::getLCMType(LLT OrigTy, LLT TargetTy) {
-  const unsigned OrigSize = OrigTy.getSizeInBits();
-  const unsigned TargetSize = TargetTy.getSizeInBits();
-
-  if (OrigSize == TargetSize)
+  if (OrigTy.getSizeInBits() == TargetTy.getSizeInBits())
     return OrigTy;
 
-  if (OrigTy.isVector()) {
-    const LLT OrigElt = OrigTy.getElementType();
-
-    if (TargetTy.isVector()) {
-      const LLT TargetElt = TargetTy.getElementType();
+  if (OrigTy.isVector() && TargetTy.isVector()) {
+    LLT OrigElt = OrigTy.getElementType();
+    LLT TargetElt = TargetTy.getElementType();
 
-      if (OrigElt.getSizeInBits() == TargetElt.getSizeInBits()) {
-        int GCDElts =
-            std::gcd(OrigTy.getNumElements(), TargetTy.getNumElements());
-        // Prefer the original element type.
-        ElementCount Mul = OrigTy.getElementCount() * TargetTy.getNumElements();
-        return LLT::vector(Mul.divideCoefficientBy(GCDElts),
-                           OrigTy.getElementType());
-      }
-    } else {
-      if (OrigElt.getSizeInBits() == TargetSize)
-        return OrigTy;
+    // TODO: The docstring for this function says the intention is to use this
+    // function to build MERGE/UNMERGE instructions. It won't be the case that
+    // we generate a MERGE/UNMERGE between fixed and scalable vector types. We
+    // could implement getLCMType between the two in the future if there was a
+    // need, but it is not worth it now as this function should not be used in
+    // that way.
+    if ((OrigTy.isScalableVector() && TargetTy.isFixedVector()) ||
+        (OrigTy.isFixedVector() && TargetTy.isScalableVector()))
+      llvm_unreachable(
+          "getLCMType not implemented between fixed and scalable vectors.");
+
+    if (OrigElt.getSizeInBits() == TargetElt.getSizeInBits()) {
+      int GCDMinElts = std::gcd(OrigTy.getElementCount().getKnownMinValue(),
+                                TargetTy.getElementCount().getKnownMinValue());
+      // Prefer the original element type.
+      ElementCount Mul = OrigTy.getElementCount().multiplyCoefficientBy(
+          TargetTy.getElementCount().getKnownMinValue());
+      return LLT::vector(Mul.divideCoefficientBy(GCDMinElts),
+                         OrigTy.getElementType());
     }
-
-    unsigned LCMSize = std::lcm(OrigSize, TargetSize);
-    return LLT::fixed_vector(LCMSize / OrigElt.getSizeInBits(), OrigElt);
+    unsigned LCM = std::lcm(OrigTy.getElementCount().getKnownMinValue() *
+                                OrigElt.getSizeInBits().getFixedValue(),
+                            TargetTy.getElementCount().getKnownMinValue() *
+                                TargetElt.getSizeInBits().getFixedValue());
+    return LLT::vector(
+        ElementCount::get(LCM / OrigElt.getSizeInBits(), OrigTy.isScalable()),
+        OrigElt);
   }
 
-  if (TargetTy.isVector()) {
-    unsigned LCMSize = std::lcm(OrigSize, TargetSize);
-    return LLT::fixed_vector(LCMSize / OrigSize, OrigTy);
+  // One type is scalar, one type is vector
+  if (OrigTy.isVector() || TargetTy.isVector()) {
+    LLT VecTy = OrigTy.isVector() ? OrigTy : TargetTy;
+    LLT ScalarTy = OrigTy.isVector() ? TargetTy : OrigTy;
+    LLT EltTy = VecTy.getElementType();
+    LLT OrigEltTy = OrigTy.isVector() ? OrigTy.getElementType() : OrigTy;
+
+    // Prefer scalar type from OrigTy.
+    if (EltTy.getSizeInBits() == ScalarTy.getSizeInBits())
+      return LLT::vector(VecTy.getElementCount(), OrigEltTy);
+
+    // Different size scalars. Create vector with the same total size.
+    // LCM will take fixed/scalable from VecTy.
+    unsigned LCM = std::lcm(EltTy.getSizeInBits().getFixedValue() *
+                                VecTy.getElementCount().getKnownMinValue(),
+                            ScalarTy.getSizeInBits().getFixedValue());
+    // Prefer type from OrigTy
+    return LLT::vector(ElementCount::get(LCM / OrigEltTy.getSizeInBits(),
+                                         VecTy.getElementCount().isScalable()),
+                       OrigEltTy);
   }
 
-  unsigned LCMSize = std::lcm(OrigSize, TargetSize);
-
+  // At this point, both types are scalars of different size
+  unsigned LCM = std::lcm(OrigTy.getSizeInBits().getFixedValue(),
+                          TargetTy.getSizeInBits().getFixedValue());
   // Preserve pointer types.
-  if (LCMSize == OrigSize)
+  if (LCM == OrigTy.getSizeInBits())
     return OrigTy;
-  if (LCMSize == TargetSize)
+  if (LCM == TargetTy.getSizeInBits())
     return TargetTy;
-
-  return LLT::scalar(LCMSize);
+  return LLT::scalar(LCM);
 }
 
 LLT llvm::getCoverTy(LLT OrigTy, LLT TargetTy) {
@@ -1132,45 +1156,60 @@ LLT llvm::getCoverTy(LLT OrigTy, LLT TargetTy) {
 }
 
 LLT llvm::getGCDType(LLT OrigTy, LLT TargetTy) {
-  const unsigned OrigSize = OrigTy.getSizeInBits();
-  const unsigned TargetSize = TargetTy.getSizeInBits();
-
-  if (OrigSize == TargetSize)
+  if (OrigTy.getSizeInBits() == TargetTy.getSizeInBits())
     return OrigTy;
 
-  if (OrigTy.isVector()) {
+  if (OrigTy.isVector() && TargetTy.isVector()) {
     LLT OrigElt = OrigTy.getElementType();
-    if (TargetTy.isVector()) {
-      LLT TargetElt = TargetTy.getElementType();
-      if (OrigElt.getSizeInBits() == TargetElt.getSizeInBits()) {
-        int GCD = std::gcd(OrigTy.getNumElements(), TargetTy.getNumElements());
-        return LLT::scalarOrVector(ElementCount::getFixed(GCD), OrigElt);
-      }
-    } else {
-      // If the source is a vector of pointers, return a pointer element.
-      if (OrigElt.getSizeInBits() == TargetSize)
-        return OrigElt;
-    }
+    LLT TargetElt = TargetTy.getElementType();
 
-    unsigned GCD = std::gcd(OrigSize, TargetSize);
+    // TODO: The docstring for this function says the intention is to use this
+    // function to build MERGE/UNMERGE instructions. It won't be the case that
+    // we generate a MERGE/UNMERGE between fixed and scalable vector types. We
+    // could implement getGCDType between the two in the future if there was a
+    // need, but it is not worth it now as this function should not be used in
+    // that way.
+    if ((OrigTy.isScalableVector() && TargetTy.isFixedVector()) ||
+        (OrigTy.isFixedVector() && TargetTy.isScalableVector()))
+      llvm_unreachable(
+          "getGCDType not implemented between fixed and scalable vectors.");
+
+    unsigned GCD = std::gcd(OrigTy.getElementCount().getKnownMinValue() *
+                                OrigElt.getSizeInBits().getFixedValue(),
+                            TargetTy.getElementCount().getKnownMinValue() *
+                                TargetElt.getSizeInBits().getFixedValue());
     if (GCD == OrigElt.getSizeInBits())
-      return OrigElt;
+      return LLT::scalarOrVector(ElementCount::get(1, OrigTy.isScalable()),
+                                 OrigElt);
 
-    // If we can't produce the original element type, we have to use a smaller
-    // scalar.
+    // Cannot produce original element type, but both have vscale in common.
     if (GCD < OrigElt.getSizeInBits())
-      return LLT::scalar(GCD);
-    return LLT::fixed_vector(GCD / OrigElt.getSizeInBits(), OrigElt);
-  }
+      return LLT::scalarOrVector(ElementCount::get(1, OrigTy.isScalable()),
+                                 GCD);
 
-  if (TargetTy.isVector()) {
-    // Try to preserve the original element type.
-    LLT TargetElt = TargetTy.getElementType();
-    if (TargetElt.getSizeInBits() == OrigSize)
-      return OrigTy;
+    return LLT::vector(
+        ElementCount::get(GCD / OrigElt.getSizeInBits().getFixedValue(),
+                          OrigTy.isScalable()),
+        OrigElt);
   }
 
-  unsigned GCD = std::gcd(OrigSize, TargetSize);
+  // If one type is vector and the element size matches the scalar size, then
+  // the gcd is the scalar type.
+  if (OrigTy.isVector() &&
+      OrigTy.getElementType().getSizeInBits() == TargetTy.getSizeInBits())
+    return OrigTy.getElementType();
+  if (TargetTy.isVector() &&
+      TargetTy.getElementType().getSizeInBits() == OrigTy.getSizeInBits())
+    return OrigTy;
+
+  // At this point, both types are either scalars of different type or one is a
+  // vector and one is a scalar. If both types are scalars, the GCD type is the
+  // GCD between the two scalar sizes. If one is vector and one is scalar, then
+  // the GCD type is the GCD between the scalar and the vector element size.
+  LLT OrigScalar = OrigTy.isVector() ? OrigTy.getElementType() : OrigTy;
+  LLT TargetScalar = TargetTy.isVector() ? TargetTy.getElementType() : TargetTy;
+  unsigned GCD = std::gcd(OrigScalar.getSizeInBits().getFixedValue(),
+                          TargetScalar.getSizeInBits().getFixedValue());
   return LLT::scalar(GCD);
 }
 
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index b182000a3d705..f72584839f01b 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1400,7 +1400,8 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
     if (DstTy.isVector()) {
       // This case is the converse of G_CONCAT_VECTORS.
       if (!SrcTy.isVector() || SrcTy.getScalarType() != DstTy.getScalarType() ||
-          SrcTy.getNumElements() != NumDsts * DstTy.getNumElements())
+          SrcTy.isScalableVector() != DstTy.isScalableVector() ||
+          SrcTy.getSizeInBits() != NumDsts * DstTy.getSizeInBits())
         report("G_UNMERGE_VALUES source operand does not match vector "
                "destination operands",
                MI);
@@ -1477,8 +1478,8 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
     for (const MachineOperand &MO : llvm::drop_begin(MI->operands(), 2))
       if (MRI->getType(MI->getOperand(1).getReg()) != MRI->getType(MO.getReg()))
         report("G_CONCAT_VECTOR source operand types are not homogeneous", MI);
-    if (DstTy.getNumElements() !=
-        SrcTy.getNumElements() * (MI->getNumOperands() - 1))
+    if (DstTy.getElementCount() !=
+        SrcTy.getElementCount() * (MI->getNumOperands() - 1))
       report("G_CONCAT_VECTOR num dest and source elements should match", MI);
     break;
   }
@@ -1617,20 +1618,34 @@ void MachineVerifier::verifyPreISelGenericInstruction(const MachineInstr *MI) {
 
     // Don't check that all operands are vector because scalars are used in
     // place of 1 element vectors.
-    int SrcNumElts = Src0Ty.isVector() ? Src0Ty.getNumElements() : 1;
-    int DstNumElts = DstTy.isVector() ? DstTy.getNumElements() : 1;
+    ElementCount SrcNumElts = Src0Ty.isVector() ? Src0Ty.getElementCount()
+                                                : ElementCount::getFixed(1);
+    ElementCount DstNumElts =
+        DstTy.isVector() ? DstTy.getElementCount() : ElementCount::getFixed(1);
 
     ArrayRef<int> MaskIdxes = MaskOp.getShuffleMask();
 
-    if (static_cast<int>(MaskIdxes.size()) != DstNumElts)
+    // For scalable vectors, there is an entry in the Mask for each
+    // KnownMinValue.
+    if (MaskIdxes.size() != DstNumElts.getKnownMinValue())
       report("Wrong result type for shufflemask", MI);
 
-    for (int Idx : MaskIdxes) {
-      if (Idx < 0)
-        continue;
-
-      if (Idx >= 2 * SrcNumElts)
-        report("Out of bounds shuffle index", MI);
+    if (Src0Ty.isScalableVector()) {
+      if (!llvm::all_of(MaskIdxes,
+                       [&MaskIdxes](int M) { return M == MaskIdxes[0]; }))
+        report("Elements of a scalable G_SHUFFLE_VECTOR mask must match", MI);
+      if (MaskIdxes[0] != 0 && MaskIdxes[0] != -1)
+      report("Elements of a scalable G_SHUFFLE_VECTOR mask must be zero or undef",
+               MI);
+    } else {
+      // Idxes for fixed vectors must be in bounds or undef, which is
+      // represented as -1.
+      for (int Idx : MaskIdxes) {
+        if (Idx < 0)
+          continue;
+        if ((unsigned)Idx >= 2 * SrcNumElts.getFixedValue())
+          report("Out of bounds shuffle index", MI);
+      }
     }
 
     break;
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index b5db41197a35a..a7807ead7a2d9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -20485,11 +20485,11 @@ unsigned RISCVTargetLowering::getCustomCtpopCost(EVT VT,
 
 bool RISCVTargetLowering::fallBackToDAGISel(const Instruction &Inst) const {
 
-  // GISel support is in progress or complete for G_ADD, G_SUB, G_AND, G_OR, and
-  // G_XOR.
+  // GISel support is in progress or complete for these opcodes.
   unsigned Op = Inst.getOpcode();
   if (Op == Instruction::Add || Op == Instruction::Sub ||
-      Op == Instruction::And || Op == Instruction::Or || Op == Instruction::Xor)
+      Op == Instruction::And || Op == Instruction::Or ||
+      Op == Instruction::Xor || Op == Instruction::ShuffleVector)
     return false;
 
   if (Inst.getType()->isScalableTy())
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/shufflevector.ll b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/shufflevector.ll
new file mode 100644
index 0000000000000..741f791ff3d05
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/shufflevector.ll
@@ -0,0 +1,1524 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -mtriple=riscv32 -mattr=+v -global-isel -stop-after=irtranslator \
+; RUN:   -verify-machineinstrs < %s | FileCheck -check-prefixes=RV32 %s
+; RUN: llc -mtriple=riscv64 -mattr=+v -global-isel -stop-after=irtranslator \
+; RUN:   -verify-machineinstrs < %s | FileCheck -check-prefixes=RV64 %s
+
+define <vscale x 1 x i1> @shufflevector_nxv1i1_0() {
+  ; RV32-LABEL: name: shufflevector_nxv1i1_0
+  ; RV32: bb.1 (%ir-block.0):
+  ; RV32-NEXT:   [[DEF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_IMPLICIT_DEF
+  ; RV32-NEXT:   [[SHUF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_SHUFFLE_VECTOR [[DEF]](<vscale x 1 x s1>), [[DEF]], shufflemask(undef)
+  ; RV32-NEXT:   $v0 = COPY [[SHUF]](<vscale x 1 x s1>)
+  ; RV32-NEXT:   PseudoRET implicit $v0
+  ;
+  ; RV64-LABEL: name: shufflevector_nxv1i1_0
+  ; RV64: bb.1 (%ir-block.0):
+  ; RV64-NEXT:   [[DEF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_IMPLICIT_DEF
+  ; RV64-NEXT:   [[SHUF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_SHUFFLE_VECTOR [[DEF]](<vscale x 1 x s1>), [[DEF]], shufflemask(undef)
+  ; RV64-NEXT:   $v0 = COPY [[SHUF]](<vscale x 1 x s1>)
+  ; RV64-NEXT:   PseudoRET implicit $v0
+  %a = shufflevector <vscale x 1 x i1> poison, <vscale x 1 x i1> poison, <vscale x 1 x i32> poison
+  ret <vscale x 1 x i1> %a
+}
+
+define <vscale x 1 x i1> @shufflevector_nxv1i1_1() {
+  ; RV32-LABEL: name: shufflevector_nxv1i1_1
+  ; RV32: bb.1 (%ir-block.0):
+  ; RV32-NEXT:   [[DEF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_IMPLICIT_DEF
+  ; RV32-NEXT:   [[SHUF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_SHUFFLE_VECTOR [[DEF]](<vscale x 1 x s1>), [[DEF]], shufflemask(undef)
+  ; RV32-NEXT:   $v0 = COPY [[SHUF]](<vscale x 1 x s1>)
+  ; RV32-NEXT:   PseudoRET implicit $v0
+  ;
+  ; RV64-LABEL: name: shufflevector_nxv1i1_1
+  ; RV64: bb.1 (%ir-block.0):
+  ; RV64-NEXT:   [[DEF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_IMPLICIT_DEF
+  ; RV64-NEXT:   [[SHUF:%[0-9]+]]:_(<vscale x 1 x s1>) = G_SHUFFLE_VECTOR [[DEF]](<vscale x 1 x s1>), [[DEF]], shufflemask(undef)
+  ; RV64-NEXT:   $v0 = COPY [[SHUF]](<vscale x 1 x s1>)
+  ; RV64-NEXT:   PseudoRET implicit $v0
+  %a = shufflevector <vscale x 1 x i1> undef, <vscale x 1 x i1> undef, <vscale x 1 x i32> undef
+  ret <vscale x 1 x i1> %a
+}
+
+define <vscale x 1 x i1> @shufflevector_nxv1i1_2(<vscale x 1 x i1> %a) {
+  ; RV32-LABEL: name: shufflevector_nxv1i1_2
+  ; RV32: bb.1 (%ir-block.0):
+  ; RV32-NEXT:   liveins: $v0
+  ; RV32-NEXT: {{ ...
[truncated]


github-actions bot commented Feb 2, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@michaelmaitland force-pushed the gisel-shufflevector-ir branch 2 times, most recently from 2b06f80 to a670300 on February 2, 2024
agozillon pushed a commit to agozillon/llvm-project that referenced this pull request Feb 5, 2024
@topperc
Collaborator

topperc commented Feb 6, 2024

SelectionDAG doesn't support ISD::SHUFFLEVECTOR for scalable vectors. It only supports ISD::SPLAT_VECTOR. Are we trying to do something different for GISel for some reason?

@michaelmaitland
Contributor Author

SelectionDAG doesn't support ISD::SHUFFLEVECTOR for scalable vectors. It only supports ISD::SPLAT_VECTOR. Are we trying to do something different for GISel for some reason?

I intend to write a patch after this that combines g_insert_element + g_shuffle_vector into g_splat_vector where appropriate. For scalable vectors, that is in all cases since the mask index vector is always zeroinit (undef case can be treated as zeroinit).

I thought there was a case where fixed vectors can generate a non-splat shuffle, which is why we kept the opcode. But I will defer to someone who knows more about this to answer why it's currently different from SDAG.

@topperc
Collaborator

topperc commented Feb 6, 2024

SelectionDAG doesn't support ISD::SHUFFLEVECTOR for scalable vectors. It only supports ISD::SPLAT_VECTOR. Are we trying to do something different for GISel for some reason?

I intend to write a patch after this that combines g_insert_element + g_shuffle_vector into g_splat_vector where appropriate. For scalable vectors, that is in all cases since the mask index vector is always zeroinit (undef case can be treated as zeroinit).

What if the G_INSERT_ELT doesn't exist? SelectionDAG always creates an extract+splat_vector when it translates a scalable shufflevector.

@michaelmaitland
Contributor Author

SelectionDAG doesn't support ISD::SHUFFLEVECTOR for scalable vectors. It only supports ISD::SPLAT_VECTOR. Are we trying to do something different for GISel for some reason?

I intend to write a patch after this that combines g_insert_element + g_shuffle_vector into g_splat_vector where appropriate. For scalable vectors, that is in all cases since the mask index vector is always zeroinit (undef case can be treated as zeroinit).

What if the G_INSERT_ELT doesn't exist? SelectionDAG always creates an extract+splat_vector when it translates a scalable shufflevector.

Now that I think more about it, I am pretty sure that G_SHUFFLE_VECTOR for scalable vectors is the same thing as a splat, since the mask is always taking from index zero. If there is no insertelement before it, then we are splatting whatever is in index 0 of the input vector.

@topperc
Collaborator

topperc commented Feb 6, 2024

SelectionDAG doesn't support ISD::SHUFFLEVECTOR for scalable vectors. It only supports ISD::SPLAT_VECTOR. Are we trying to do something different for GISel for some reason?

I intend to write a patch after this that combines g_insert_element + g_shuffle_vector into g_splat_vector where appropriate. For scalable vectors, that is in all cases since the mask index vector is always zeroinit (undef case can be treated as zeroinit).

What if the G_INSERT_ELT doesn't exist? SelectionDAG always creates an extract+splat_vector when it translates a scalable shufflevector.

Now that I think more about it, I am pretty sure that G_SHUFFLE_VECTOR for scalable vectors is the same thing as a splat, since the mask is always taking from index zero. If there is no insertelement before it, then we are splatting whatever is in index 0 of the input vector.

Yes the only scalable shufflevector supported is a splat of element 0. The difference between ISD::SHUFFLE_VECTOR and ISD::SPLAT_VECTOR is that ISD::SHUFFLE_VECTOR takes a vector input and ISD::SPLAT_VECTOR takes a scalar input.

@michaelmaitland
Contributor Author

Yes the only scalable shufflevector supported is a splat of element 0. The difference between ISD::SHUFFLE_VECTOR and ISD::SPLAT_VECTOR is that ISD::SHUFFLE_VECTOR takes a vector input and ISD::SPLAT_VECTOR takes a scalar input.

Do you think it would be reasonable to deal with converting a G_SHUFFLE_VECTOR into a G_SPLAT_VECTOR during legalization? Or are you asking for it to occur here during IRTranslation? FYI there is no G_SPLAT_VECTOR opcode today -- we would need to add that.

@tschuett
Member

tschuett commented Feb 6, 2024

GlobalIsel has no splat_vector yet. I was hoping that #74502 would make progress.

@michaelmaitland
Contributor Author

GlobalIsel has no splat_vector yet. I was hoping that #74502 would make progress.

We might still want a G_SPLAT_VECTOR in the case that the splat is not constant. WDYT?

@tschuett
Member

tschuett commented Feb 6, 2024

@aemerson We would still need G_FSPLAT_VECTOR and G_SPLAT_VECTOR. If there are constants or operations?!?

@tschuett
Member

tschuett commented Feb 6, 2024

IR gets splats as constants. The IR translator eventually has to translate them into GlobalIsel.

@topperc
Collaborator

topperc commented Feb 6, 2024

@aemerson We would still need G_FSPLAT_VECTOR and G_SPLAT_VECTOR. If there are constants or operations?!?

Why would we need two? There's only one G_BUILD_VECTOR isn't there? Though I do see a G_BUILD_VECTOR_TRUNC. So maybe we need G_SPLAT_VECTOR and G_SPLAT_VECTOR_TRUNC?

@tschuett
Member

tschuett commented Feb 6, 2024

How would you otherwise distinguish between integer and float splats? If a splat vector only gets an immediate?

@topperc
Collaborator

topperc commented Feb 6, 2024

How would you otherwise distinguish between integer and float splats? If a splat vector only gets an immediate?

How do we do it for a BUILD_VECTOR of immediates?

@arsenm
Contributor

arsenm commented Feb 6, 2024

How would you otherwise distinguish between integer and float splats? If a splat vector only gets an immediate?

There's no need to, but you can always inspect the operands. In fact, I think we should get rid of G_FCONSTANT altogether; it just adds unnecessary hassle.

@michaelmaitland
Contributor Author

How would you otherwise distinguish between integer and float splats? If a splat vector only gets an immediate?

How do we distinguish today between integers and floats for G_SHUFFLE_VECTOR? We don't have a G_FSHUFFLE_VECTOR.

My guess would be G_SPLAT_VECTOR (G_CONSTANT 8) vs G_SPLAT_VECTOR (G_FCONSTANT 3.2) allows us to look into the operand and determine whether it is a float or not.

@michaelmaitland force-pushed the gisel-shufflevector-ir branch 2 times, most recently from c0146e2 to 2f246e2 on March 7, 2024
@michaelmaitland merged commit 2b8aaef into llvm:main Mar 7, 2024
4 of 5 checks passed
michaelmaitland added a commit that referenced this pull request Mar 7, 2024
…r types" (#84330)

Reverts #80378

causing Buildbot failures that did not show up with check-llvm or CI.
michaelmaitland added a commit that referenced this pull request Mar 7, 2024
…80378)

Recommits #80378 which was reverted in
#84330. The problem was that the change in
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir used
217 as an opcode instead of a regex.
@adrian-prantl
Collaborator

@michaelmaitland I'm getting crashes of tblgen when building ToT main and reverting this patch seems to fix it. Could you please revert this once more and investigate?

@adrian-prantl
Collaborator

@michaelmaitland Sorry for the noise — after a clean rebuild I can't reproduce this any more. Unfortunately green.lab.llvm.org is not reachable so I can't point at any bots either. I think you can ignore this.

VyacheslavLevytskyy added a commit that referenced this pull request Mar 13, 2024
…types processing (#84766)

This PR:
* adds support for G_SPLAT_VECTOR generic opcode that may be legally
generated instead of G_BUILD_VECTOR by previous passes of the translator
(see #80378 for the source of
breaking changes);
* improves deduction of types for opaque pointers.

This PR also fixes the following issues:
* if a function has ptr argument(s), two functions that have different
SPIR-V type definitions may get identical LLVM function types and break
agreements of global register and duplicate checker;
* checks for pointer types do not account for TypedPointerType.

Update of tests:
* A test case is added to cover the issue with function ptr parameters.
* The first case, that is support for G_SPLAT_VECTOR generic opcode, is
covered by existing test cases.
* Multiple additional checks by `spirv-val` is added to cover more
possibilities of generation of invalid code.

7 participants