
[CodeGen] Update for scalable MemoryType in MMO #70452

Merged: 1 commit into llvm:main on Mar 23, 2024

Conversation

harviniriawan (Contributor)

This is a continuation of #69152.

  Remove the getSizeOrUnknown call when a MachineMemOperand is created.
  For a scalable TypeSize, the MemoryType created becomes a
  scalable_vector.

  Two MMOs that have scalable memory accesses can then use the updated
  BasicAA, which understands scalable LocationSize.
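
As a rough illustration (not part of the patch itself, and with header paths
approximated for the LLVM version at the time), the intended mapping from a
TypeSize to the MMO's memory type looks much like the new
MachineFunction::getMachineMemOperand overload in the diff below:

  #include "llvm/CodeGenTypes/LowLevelType.h"
  #include "llvm/Support/TypeSize.h"

  using namespace llvm;

  // Sketch: a scalable TypeSize becomes a scalable_vector LLT with a known
  // minimum size, while a fixed TypeSize stays a plain scalar LLT.
  static LLT memoryTypeFor(TypeSize TS) {
    return TS.isScalable()
               ? LLT::scalable_vector(1, 8 * TS.getKnownMinValue())
               : LLT::scalar(8 * TS.getKnownMinValue());
  }

So a load of <vscale x 4 x i32> (minimum 16 bytes) would get a memory type of
<vscale x 1 x s128> rather than an "unknown" size.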

@llvmbot (Collaborator) commented Oct 27, 2023

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-systemz
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-arm
@llvm/pr-subscribers-debuginfo
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-backend-aarch64

Author: Harvin Iriawan (harviniriawan)

Changes

This is a continuation of #69152.

  Remove the getSizeOrUnknown call when a MachineMemOperand is created.
  For a scalable TypeSize, the MemoryType created becomes a
  scalable_vector.

  Two MMOs that have scalable memory accesses can then use the updated
  BasicAA, which understands scalable LocationSize.

Patch is 59.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/70452.diff

15 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemoryLocation.h (+7-2)
  • (modified) llvm/include/llvm/CodeGen/MachineFunction.h (+21)
  • (modified) llvm/include/llvm/CodeGen/MachineMemOperand.h (+12-4)
  • (modified) llvm/lib/Analysis/BasicAliasAnalysis.cpp (+154-47)
  • (modified) llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp (+26-13)
  • (modified) llvm/lib/CodeGen/MachineFunction.cpp (+20)
  • (modified) llvm/lib/CodeGen/MachineInstr.cpp (+30-13)
  • (modified) llvm/lib/CodeGen/MachineOperand.cpp (+21-3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+46-24)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+8-11)
  • (modified) llvm/test/Analysis/AliasSet/memloc-vscale.ll (+2-1)
  • (modified) llvm/test/Analysis/BasicAA/vscale.ll (+69-42)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-sme2-asm.ll (+3-3)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops-mir.ll (+4-4)
  • (modified) llvm/test/Transforms/GVN/vscale.ll (+3-8)
diff --git a/llvm/include/llvm/Analysis/MemoryLocation.h b/llvm/include/llvm/Analysis/MemoryLocation.h
index b72a27cab86b349..bea96827a7c9e39 100644
--- a/llvm/include/llvm/Analysis/MemoryLocation.h
+++ b/llvm/include/llvm/Analysis/MemoryLocation.h
@@ -291,8 +291,9 @@ class MemoryLocation {
     return MemoryLocation(Ptr, LocationSize::beforeOrAfterPointer(), AATags);
   }
 
-  // Return the exact size if the exact size is known at compiletime,
-  // otherwise return MemoryLocation::UnknownSize.
+  // TODO: Remove getSizeOrUnknown
+  // interface once llvm/unittests/CodeGen/SelectionDAGAddressAnalysisTest is
+  // updated
   static uint64_t getSizeOrUnknown(const TypeSize &T) {
     return T.isScalable() ? UnknownSize : T.getFixedValue();
   }
@@ -303,6 +304,10 @@ class MemoryLocation {
                           const AAMDNodes &AATags = AAMDNodes())
       : Ptr(Ptr), Size(Size), AATags(AATags) {}
 
+  explicit MemoryLocation(const Value *Ptr, TypeSize Size,
+                          const AAMDNodes &AATags = AAMDNodes())
+      : Ptr(Ptr), Size(LocationSize::precise(Size)), AATags(AATags) {}
+
   MemoryLocation getWithNewPtr(const Value *NewPtr) const {
     MemoryLocation Copy(*this);
     Copy.Ptr = NewPtr;
diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h b/llvm/include/llvm/CodeGen/MachineFunction.h
index 8f1651c2958e591..53c066bccffa812 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -1021,6 +1021,13 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
       AtomicOrdering Ordering = AtomicOrdering::NotAtomic,
       AtomicOrdering FailureOrdering = AtomicOrdering::NotAtomic);
 
+  MachineMemOperand *getMachineMemOperand(
+      MachinePointerInfo PtrInfo, MachineMemOperand::Flags f, TypeSize ts,
+      Align base_alignment, const AAMDNodes &AAInfo = AAMDNodes(),
+      const MDNode *Ranges = nullptr, SyncScope::ID SSID = SyncScope::System,
+      AtomicOrdering Ordering = AtomicOrdering::NotAtomic,
+      AtomicOrdering FailureOrdering = AtomicOrdering::NotAtomic);
+
   MachineMemOperand *getMachineMemOperand(
       MachinePointerInfo PtrInfo, MachineMemOperand::Flags f, LLT MemTy,
       Align base_alignment, const AAMDNodes &AAInfo = AAMDNodes(),
@@ -1040,6 +1047,17 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
         MMO, Offset, Size == ~UINT64_C(0) ? LLT() : LLT::scalar(8 * Size));
   }
 
+  MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,
+                                          int64_t Offset, TypeSize ts) {
+    return getMachineMemOperand(
+        MMO, Offset,
+        ts.getKnownMinValue() == ~UINT64_C(0)
+            ? LLT()
+            : ts.isScalable()
+                  ? LLT::scalable_vector(1, 8 * ts.getKnownMinValue())
+                  : LLT::scalar(8 * ts.getKnownMinValue()));
+  }
+
   /// getMachineMemOperand - Allocate a new MachineMemOperand by copying
   /// an existing one, replacing only the MachinePointerInfo and size.
   /// MachineMemOperands are owned by the MachineFunction and need not be
@@ -1047,6 +1065,9 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,
                                           const MachinePointerInfo &PtrInfo,
                                           uint64_t Size);
+  MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,
+                                          const MachinePointerInfo &PtrInfo,
+                                          TypeSize ts);
   MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,
                                           const MachinePointerInfo &PtrInfo,
                                           LLT Ty);
diff --git a/llvm/include/llvm/CodeGen/MachineMemOperand.h b/llvm/include/llvm/CodeGen/MachineMemOperand.h
index da7fd7cdf02958d..4a575b2d9b32da9 100644
--- a/llvm/include/llvm/CodeGen/MachineMemOperand.h
+++ b/llvm/include/llvm/CodeGen/MachineMemOperand.h
@@ -190,6 +190,12 @@ class MachineMemOperand {
                     SyncScope::ID SSID = SyncScope::System,
                     AtomicOrdering Ordering = AtomicOrdering::NotAtomic,
                     AtomicOrdering FailureOrdering = AtomicOrdering::NotAtomic);
+  MachineMemOperand(MachinePointerInfo PtrInfo, Flags flags, TypeSize ts,
+                    Align a, const AAMDNodes &AAInfo = AAMDNodes(),
+                    const MDNode *Ranges = nullptr,
+                    SyncScope::ID SSID = SyncScope::System,
+                    AtomicOrdering Ordering = AtomicOrdering::NotAtomic,
+                    AtomicOrdering FailureOrdering = AtomicOrdering::NotAtomic);
   MachineMemOperand(MachinePointerInfo PtrInfo, Flags flags, LLT type, Align a,
                     const AAMDNodes &AAInfo = AAMDNodes(),
                     const MDNode *Ranges = nullptr,
@@ -233,13 +239,15 @@ class MachineMemOperand {
   LLT getMemoryType() const { return MemoryType; }
 
   /// Return the size in bytes of the memory reference.
-  uint64_t getSize() const {
-    return MemoryType.isValid() ? MemoryType.getSizeInBytes() : ~UINT64_C(0);
+  TypeSize getSize() const {
+    return MemoryType.isValid() ? MemoryType.getSizeInBytes()
+                                : TypeSize::Fixed(~UINT64_C(0));
   }
 
   /// Return the size in bits of the memory reference.
-  uint64_t getSizeInBits() const {
-    return MemoryType.isValid() ? MemoryType.getSizeInBits() : ~UINT64_C(0);
+  TypeSize getSizeInBits() const {
+    return MemoryType.isValid() ? MemoryType.getSizeInBits()
+                                : TypeSize::Fixed(~UINT64_C(0));
   }
 
   LLT getType() const {
diff --git a/llvm/lib/Analysis/BasicAliasAnalysis.cpp b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
index 2acd5c47b7681e8..74fb52386ee1dcd 100644
--- a/llvm/lib/Analysis/BasicAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/BasicAliasAnalysis.cpp
@@ -44,6 +44,7 @@
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Intrinsics.h"
 #include "llvm/IR/Operator.h"
+#include "llvm/IR/PatternMatch.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/User.h"
 #include "llvm/IR/Value.h"
@@ -63,6 +64,7 @@
 #define DEBUG_TYPE "basicaa"
 
 using namespace llvm;
+using namespace llvm::PatternMatch;
 
 /// Enable analysis of recursive PHI nodes.
 static cl::opt<bool> EnableRecPhiAnalysis("basic-aa-recphi", cl::Hidden,
@@ -344,13 +346,20 @@ struct LinearExpression {
 
 /// Analyzes the specified value as a linear expression: "A*V + B", where A and
 /// B are constant integers.
-static LinearExpression GetLinearExpression(
-    const CastedValue &Val,  const DataLayout &DL, unsigned Depth,
-    AssumptionCache *AC, DominatorTree *DT) {
+static LinearExpression GetLinearExpression(const CastedValue &Val,
+                                            const DataLayout &DL,
+                                            unsigned Depth, AssumptionCache *AC,
+                                            DominatorTree *DT) {
   // Limit our recursion depth.
   if (Depth == 6)
     return Val;
 
+  // If llvm.vscale is matched, set linear expression with scale 1 and offset 0
+  if (match(Val.V, m_VScale())) {
+    return LinearExpression(Val, APInt(Val.getBitWidth(), 1),
+                            APInt(Val.getBitWidth(), 0), true);
+  }
+
   if (const ConstantInt *Const = dyn_cast<ConstantInt>(Val.V))
     return LinearExpression(Val, APInt(Val.getBitWidth(), 0),
                             Val.evaluateWith(Const->getValue()), true);
@@ -457,6 +466,9 @@ struct VariableGEPIndex {
   CastedValue Val;
   APInt Scale;
 
+  // A value representing vscale quantity in a GEP expression
+  bool IsVScale;
+
   // Context instruction to use when querying information about this index.
   const Instruction *CxtI;
 
@@ -479,13 +491,10 @@ struct VariableGEPIndex {
     dbgs() << "\n";
   }
   void print(raw_ostream &OS) const {
-    OS << "(V=" << Val.V->getName()
-       << ", zextbits=" << Val.ZExtBits
-       << ", sextbits=" << Val.SExtBits
-       << ", truncbits=" << Val.TruncBits
-       << ", scale=" << Scale
-       << ", nsw=" << IsNSW
-       << ", negated=" << IsNegated << ")";
+    OS << "(V=" << Val.V->getName() << "  IsVScale=" << IsVScale
+       << ", zextbits=" << Val.ZExtBits << ", sextbits=" << Val.SExtBits
+       << ", truncbits=" << Val.TruncBits << ", scale=" << Scale
+       << ", nsw=" << IsNSW << ", negated=" << IsNegated << ")";
   }
 };
 }
@@ -606,6 +615,7 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
     for (User::const_op_iterator I = GEPOp->op_begin() + 1, E = GEPOp->op_end();
          I != E; ++I, ++GTI) {
       const Value *Index = *I;
+      const bool ScalableGEP = isa<ScalableVectorType>(GTI.getIndexedType());
       // Compute the (potentially symbolic) offset in bytes for this index.
       if (StructType *STy = GTI.getStructTypeOrNull()) {
         // For a struct, add the member offset.
@@ -617,27 +627,18 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
         continue;
       }
 
+      TypeSize AllocTypeSize = DL.getTypeAllocSize(GTI.getIndexedType());
       // For an array/pointer, add the element offset, explicitly scaled.
+      // Skip adding to constant offset if GEP index is marked as scalable
+      // they are handled below as variable offset
       if (const ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
         if (CIdx->isZero())
           continue;
-
-        // Don't attempt to analyze GEPs if the scalable index is not zero.
-        TypeSize AllocTypeSize = DL.getTypeAllocSize(GTI.getIndexedType());
-        if (AllocTypeSize.isScalable()) {
-          Decomposed.Base = V;
-          return Decomposed;
+        if (!ScalableGEP) {
+          Decomposed.Offset += AllocTypeSize.getFixedValue() *
+                               CIdx->getValue().sextOrTrunc(MaxIndexSize);
+          continue;
         }
-
-        Decomposed.Offset += AllocTypeSize.getFixedValue() *
-                             CIdx->getValue().sextOrTrunc(MaxIndexSize);
-        continue;
-      }
-
-      TypeSize AllocTypeSize = DL.getTypeAllocSize(GTI.getIndexedType());
-      if (AllocTypeSize.isScalable()) {
-        Decomposed.Base = V;
-        return Decomposed;
       }
 
       GepHasConstantOffset = false;
@@ -647,22 +648,55 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
       unsigned Width = Index->getType()->getIntegerBitWidth();
       unsigned SExtBits = IndexSize > Width ? IndexSize - Width : 0;
       unsigned TruncBits = IndexSize < Width ? Width - IndexSize : 0;
-      LinearExpression LE = GetLinearExpression(
-          CastedValue(Index, 0, SExtBits, TruncBits), DL, 0, AC, DT);
+      // Scalable GEP decomposition
+      // Allow Scalable GEP to be decomposed in the case of
+      //    1. getelementptr <4 x vscale x i32> with 1st index as a constant
+      //    2. Index which have a leaf of @llvm.vscale
+      // In both cases, essentially CastedValue of VariableGEPIndex is Vscale,
+      // however in the 1st case, CastedValue is of type constant, hence another
+      // flag in VariableGEPIndex is created in this case, IsVScale If GEP is
+      // Scalable type, e.g. <4 x vscale x i32>, the first index will have
+      // vscale as a variable index, create a LE in this case
+      LinearExpression LE(CastedValue(Index, 0, SExtBits, TruncBits));
+      if (ScalableGEP) {
+        if (const ConstantInt *CIdx = dyn_cast<ConstantInt>(Index)) {
+          LE = LinearExpression(
+              CastedValue(Index, 0, SExtBits, TruncBits),
+              CastedValue(Index, 0, SExtBits, TruncBits)
+                  .evaluateWith(CIdx->getValue()),
+              APInt(CastedValue(Index, 0, SExtBits, TruncBits).getBitWidth(),
+                    0),
+              true);
+          assert(LE.Offset.isZero() && "For Scalable GEP constant first index, "
+                                       "the offset of LE should be 0");
+        } else {
+          // if first index is not a constant, a single variable gep will
+          // contain 2 variables, bail in this case
+          Decomposed.Base = V;
+          return Decomposed;
+        }
+      } else
+        LE = GetLinearExpression(CastedValue(Index, 0, SExtBits, TruncBits), DL,
+                                 0, AC, DT);
 
       // Scale by the type size.
-      unsigned TypeSize = AllocTypeSize.getFixedValue();
+      unsigned TypeSize = AllocTypeSize.getKnownMinValue();
       LE = LE.mul(APInt(IndexSize, TypeSize), GEPOp->isInBounds());
       Decomposed.Offset += LE.Offset.sext(MaxIndexSize);
       APInt Scale = LE.Scale.sext(MaxIndexSize);
+      bool LEhasVscale = match(LE.Val.V, m_VScale());
 
       // If we already had an occurrence of this index variable, merge this
       // scale into it.  For example, we want to handle:
       //   A[x][x] -> x*16 + x*4 -> x*20
       // This also ensures that 'x' only appears in the index list once.
+      // Only add to IsVScale VariableGEPIndex if it's @llvm.vscale or gep
+      // vscale index
       for (unsigned i = 0, e = Decomposed.VarIndices.size(); i != e; ++i) {
-        if (Decomposed.VarIndices[i].Val.V == LE.Val.V &&
-            Decomposed.VarIndices[i].Val.hasSameCastsAs(LE.Val)) {
+        if (Decomposed.VarIndices[i].Val.hasSameCastsAs(LE.Val) &&
+            ((Decomposed.VarIndices[i].IsVScale &&
+              (ScalableGEP || LEhasVscale)) ||
+             Decomposed.VarIndices[i].Val.V == LE.Val.V)) {
           Scale += Decomposed.VarIndices[i].Scale;
           LE.IsNSW = false; // We cannot guarantee nsw for the merge.
           Decomposed.VarIndices.erase(Decomposed.VarIndices.begin() + i);
@@ -672,10 +706,21 @@ BasicAAResult::DecomposeGEPExpression(const Value *V, const DataLayout &DL,
 
       // Make sure that we have a scale that makes sense for this target's
       // index size.
+      // Only allow variableGEP decomposition for constants, in the case of
+      // vscale
       Scale = adjustToIndexSize(Scale, IndexSize);
+      bool InvalidVarVScale = (ScalableGEP && LEhasVscale) ||
+                              (ScalableGEP && !isa<ConstantInt>(LE.Val.V));
+
+      assert(!InvalidVarVScale &&
+             "Variable GEP index contains VScale and another variable");
 
       if (!!Scale) {
-        VariableGEPIndex Entry = {LE.Val, Scale, CxtI, LE.IsNSW,
+        VariableGEPIndex Entry = {LE.Val,
+                                  Scale,
+                                  ScalableGEP || LEhasVscale,
+                                  CxtI,
+                                  LE.IsNSW,
                                   /* IsNegated */ false};
         Decomposed.VarIndices.push_back(Entry);
       }
@@ -1058,19 +1103,17 @@ AliasResult BasicAAResult::aliasGEP(
 
   // If an inbounds GEP would have to start from an out of bounds address
   // for the two to alias, then we can assume noalias.
-  // TODO: Remove !isScalable() once BasicAA fully support scalable location
-  // size
   if (*DecompGEP1.InBounds && DecompGEP1.VarIndices.empty() &&
-      V2Size.hasValue() && !V2Size.isScalable() &&
-      DecompGEP1.Offset.sge(V2Size.getValue()) &&
+      V2Size.hasValue() &&
+      DecompGEP1.Offset.sge(V2Size.getValue().getKnownMinValue()) &&
       isBaseOfObject(DecompGEP2.Base))
     return AliasResult::NoAlias;
 
   if (isa<GEPOperator>(V2)) {
     // Symmetric case to above.
     if (*DecompGEP2.InBounds && DecompGEP1.VarIndices.empty() &&
-        V1Size.hasValue() && !V1Size.isScalable() &&
-        DecompGEP1.Offset.sle(-V1Size.getValue()) &&
+        V1Size.hasValue() &&
+        DecompGEP1.Offset.sle(-V1Size.getValue().getKnownMinValue()) &&
         isBaseOfObject(DecompGEP1.Base))
       return AliasResult::NoAlias;
   }
@@ -1094,10 +1137,6 @@ AliasResult BasicAAResult::aliasGEP(
     return BaseAlias;
   }
 
-  // Bail on analysing scalable LocationSize
-  if (V1Size.isScalable() || V2Size.isScalable())
-    return AliasResult::MayAlias;
-
   // If there is a constant difference between the pointers, but the difference
   // is less than the size of the associated memory object, then we know
   // that the objects are partially overlapping.  If the difference is
@@ -1124,16 +1163,16 @@ AliasResult BasicAAResult::aliasGEP(
       Off = -Off;
     }
 
-    if (!VLeftSize.hasValue())
+    if (!VLeftSize.hasValue() || VLeftSize.isScalable())
       return AliasResult::MayAlias;
 
-    const uint64_t LSize = VLeftSize.getValue();
+    const uint64_t LSize = VLeftSize.getValue().getKnownMinValue();
     if (Off.ult(LSize)) {
       // Conservatively drop processing if a phi was visited and/or offset is
       // too big.
       AliasResult AR = AliasResult::PartialAlias;
       if (VRightSize.hasValue() && Off.ule(INT32_MAX) &&
-          (Off + VRightSize.getValue()).ule(LSize)) {
+          (Off + VRightSize.getValue().getKnownMinValue()).ule(LSize)) {
         // Memory referenced by right pointer is nested. Save the offset in
         // cache. Note that originally offset estimated as GEP1-V2, but
         // AliasResult contains the shift that represents GEP1+Offset=V2.
@@ -1149,12 +1188,71 @@ AliasResult BasicAAResult::aliasGEP(
   if (!V1Size.hasValue() || !V2Size.hasValue())
     return AliasResult::MayAlias;
 
+  // VScale Alias Analysis
+  // GEPs with Vscale will have the expression A*Vscale + B (1 variable index
+  // and constant offset) The difference between two GEPs and Scalable
+  // LocationSize can then be analysed as they have the form of
+  //     LSize                SubtractDecomposedGEP output
+  //   A * Vscale                   B * Vscale + C
+  // Since VScale is strictly a positive number (Vscale >= 1), the larger GEP
+  // can be known
+  // TODO: Use knowledge of vscale_range to make the analysis more accurate
+  if (DecompGEP1.VarIndices.size() == 1 && DecompGEP1.VarIndices[0].IsVScale &&
+      (V1Size.isScalable() || V2Size.isScalable())) {
+    const VariableGEPIndex &ScalableVar = DecompGEP1.VarIndices[0];
+    bool StrictlyPos = false, StrictlyNeg = false;
+    APInt &Off = DecompGEP1.Offset;
+    if (!ScalableVar.IsNegated) {
+      if (Off.isNegative())
+        StrictlyPos = ScalableVar.Scale.ugt(Off.abs());
+      else
+        StrictlyPos = true;
+    } else
+      StrictlyPos = Off.isNonNegative();
+
+    if (ScalableVar.IsNegated) {
+      if (Off.isNonNegative())
+        StrictlyNeg = Off.ult(ScalableVar.Scale.abs());
+      else
+        StrictlyNeg = true;
+    } else
+      StrictlyNeg = Off.isNegative();
+
+    if (StrictlyPos || StrictlyNeg) {
+      LocationSize VLeftSize = V2Size;
+      LocationSize VRightSize = V1Size;
+      const bool Swapped = StrictlyNeg;
+
+      if (Swapped) {
+        std::swap(VLeftSize, VRightSize);
+        Off = -Off;
+      }
+
+      const uint64_t LSize = VLeftSize.getValue().getKnownMinValue();
+      if (VLeftSize.isScalable() && ScalableVar.Scale.ult(LSize) &&
+          (ScalableVar.Scale + DecompGEP1.Offset).ult(LSize))
+        return AliasResult::PartialAlias;
+
+      if ((ScalableVar.Scale.uge(LSize) && VLeftSize.isScalable()) ||
+          ((ScalableVar.Scale + DecompGEP1.Offset).uge(LSize) &&
+           !VLeftSize.isScalable()))
+        return AliasResult::NoAlias;
+    }
+  }
+
+  // Bail on Scalable location size from now onwards
+  if (V1Size.isScalable() || V2Size.isScalable())
+    return AliasResult::MayAlias;
+
   APInt GCD;
   ConstantRange OffsetRange = ConstantRange(DecompGEP1.Offset);
   for (unsigned i = 0, e = DecompGEP1.VarIndices.size(); i != e; ++i) {
     const VariableGEPIndex &Index = DecompGEP1.VarIndices[i];
     const APInt &Scale = Index.Scale;
     APInt ScaleForGCD = Scale;
+    assert((!Index.IsVScale || match(Index.Val.V, m_VScale()) ||
+            isa<ConstantInt>(Index.Val.V)) &&
+           "Not allowed to have non-constant values if IsVScale is set");
     if (!Index.IsNSW)
       ScaleForGCD =
           APInt::getOneBitSet(Scale.getBitWidth(), Scale.countr_zero());
@@ -1727,7 +1825,12 @@ void BasicAAResult::subtractDecomposedGEPs(DecomposedGEP &DestGEP,
     bool Found = false;
     for (auto I : enumerate(DestGEP.VarIndices)) {
       VariableGEPIndex &Dest = I.value();
-      if (!isValueEqualInPotentialCycles(Dest.Val.V, Src.Val.V, AAQI) ||
+      if (Dest.IsVScale != Src.IsVScale)
+        continue;
+      const bool Sr...
[truncated]

@github-actions bot commented Oct 27, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@davemgreen (Collaborator)

Hello. With Harvin having rotated back to a hardware team, I've pushed some extra changes to clean this up a little, remove the patch from #69152 and rebase.

@davemgreen changed the title from "[SelectionDAG] Update for scalable MemoryType in MMO" to "[CodeGen] Update for scalable MemoryType in MMO" on Nov 21, 2023
return MemoryType.isValid() ? MemoryType.getSizeInBits() : ~UINT64_C(0);
TypeSize getSizeInBits() const {
return MemoryType.isValid() ? MemoryType.getSizeInBits()
: TypeSize::Fixed(~UINT64_C(0));
Collaborator

Is there any equivalent of MemoryLocation::UnknownSize we can use here?

Collaborator

Hmm. Not that I can see in TypeSize, I'm afraid. Some places appear to pass MemoryLocation::UnknownSize assuming they are the same size.

I could try to use MemoryLocation::UnknownSize in more places, or try to change it to an optional<> if that is better? That looks like it might be quite a bit of work, though.

Contributor

Maybe this should be using LocationSize rather than TypeSize, which can represent an unknown size? I'm not sure to what degree the different cases that LocationSize can represent are relevant at the MI level.
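
For context, a minimal sketch of the distinction being discussed (using the existing MemoryLocation.h API; assumptions are noted in the comments): a TypeSize is always a concrete, possibly scalable, quantity, while a LocationSize can also represent "unknown".

  #include "llvm/Analysis/MemoryLocation.h"

  using namespace llvm;

  void sizeKinds(TypeSize TS) {
    // An exact size, which may be scalable (e.g. vscale x 16 bytes).
    LocationSize Known = LocationSize::precise(TS);
    // No usable size at all; TypeSize has no equivalent, which is why the
    // sentinel TypeSize::Fixed(~UINT64_C(0)) appears in the patch above.
    LocationSize Unknown = LocationSize::beforeOrAfterPointer();
    if (!Unknown.hasValue()) {
      // Callers have to check hasValue() before asking for the size.
    }
    (void)Known;
  }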

return true;

const bool BothNotScalable =
!(MUC0.NumBytes.isScalable() || MUC1.NumBytes.isScalable());
Contributor

!isScalable && !isScalable

@davemgreen (Collaborator)

I've been having a look into making it a LocationSize. The size is stored internally as part of an LLT, so it is only the input to and output from the MMO that would handle LocationSize. (And afterPointer would not be supported, but that is probably fine.)

I've had a go at making it use LocationSize. There are obviously a lot more changes, as it can no longer rely upon the implicit conversions from TypeSize->unsigned or from UnknownSize->unsigned. Hopefully I handled them all correctly; some places assume that values cannot be Unknown, or handle it strangely if they do.

@arsenm (Contributor) left a comment

The reason the MMO uses an LLT is that we also need the number of vector elements to correctly legalize vector loads and stores. Doesn't switching that to LocationSize lose that information?

@davemgreen (Collaborator)

Do you mean if we changed it internally too? Yeah that would lose the info I think. I'm not sure if it could come from the type of the load/store instead?

This patch is just adding extra external interfaces, though; the internal type is still an LLT and can still be accessed directly through GISel.

@arsenm (Contributor) commented Jan 9, 2024

Do you mean if we changed it internally too? Yeah that would lose the info I think. I'm not sure if it could come from the type of the load/store instead?

It cannot in the case of an extending vector load or truncating vector store. The current legalization for load/store is a total mess. I've had a WIP patch rewriting it for about 4 years, which I really need to get back to.
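
To make that point concrete, here is a hedged sketch (the types are made up for illustration): for an extending vector load the memory type cannot be derived from the register type, and a raw byte count is ambiguous.

  #include "llvm/CodeGenTypes/LowLevelType.h"

  using namespace llvm;

  void extendingLoadExample() {
    // Register value is <4 x s32>, but memory only holds <4 x s16> (8 bytes).
    LLT RegTy = LLT::fixed_vector(4, 32);
    LLT MemTy = LLT::fixed_vector(4, 16);
    // A plain size of 8 bytes could equally be <4 x s16>, <2 x s32>, or s64,
    // which is exactly the element-count information legalization needs.
    (void)RegTy;
    (void)MemTy;
  }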

davemgreen added a commit that referenced this pull request Mar 6, 2024
This is similar to #83017 but for the areas in GlobalISel's
LoadStoreOpt, and should help simplify #70452 a little. It will likely
change a little again once the sizes can be scalable.
davemgreen added a commit that referenced this pull request Mar 6, 2024
…#83875)

This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantage on its own is not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and, in the future, scalable).
@davemgreen (Collaborator)

I've given this a rebase.

davemgreen added a commit to davemgreen/llvm-project that referenced this pull request Mar 11, 2024
This is part of llvm#70452 that changes the type used for the external interface of
MMO to LocationSize as opposed to uint64_t. This means the constructors take
LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The
getSize methods return a LocationSize.

This allows us to be more precise with unknown sizes, not accidentally treating
them as unsigned values, and in the future should allow us to add proper
scalable vector support but none of that is included in this patch. It should
mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and is
not expected to see unknown sizes for generic operations. Most of the changes
are hopefully fairly mechanical.
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Mar 13, 2024
… to LocationSize. (llvm#83017)

This is another smaller step of llvm#70452, changing the signature of
computeAliasing() from optional<uint64_t> to LocationSize, with follow-up
changes in DAGCombiner::mayAlias(). There are some test changes due to
the previous AA->isNoAlias call incorrectly using an unknown size
(~UINT64_C(0)). This should then be improved again in llvm#70452 when the
types are known to be scalable.

(cherry picked from commit 6e41d60)
davemgreen added a commit to davemgreen/llvm-project that referenced this pull request Mar 17, 2024
This is part of llvm#70452 that changes the type used for the external interface of
MMO to LocationSize as opposed to uint64_t. This means the constructors take
LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The
getSize methods return a LocationSize.

This allows us to be more precise with unknown sizes, not accidentally treating
them as unsigned values, and in the future should allow us to add proper
scalable vector support but none of that is included in this patch. It should
mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and is
not expected to see unknown sizes for generic operations. Most of the changes
are hopefully fairly mechanical.
davemgreen added a commit that referenced this pull request Mar 17, 2024
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.

This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and is
not expected to see unknown sizes for generic operations. Most of the changes
are hopefully fairly mechanical, adding a lot of getValue() calls and
protecting them with hasValue() where needed.
Remove the getSizeOrUnknown call when a MachineMemOperand is created. For a
scalable TypeSize, the MemoryType created becomes a scalable_vector.

Two MMOs that have scalable memory accesses can then use the updated BasicAA
that understands scalable LocationSize.

Original patch by Harvin Iriawan
@davemgreen (Collaborator)

Thanks. I've rebased to remove all the other changes from #84751, now leaving the Scalable vector parts here.

@davemgreen merged commit 57146da into llvm:main on Mar 23, 2024
4 checks passed
@hiraditya (Collaborator)

Can this patch cause #88799?

@hiraditya (Collaborator)

Reverting this patch does fix the crash.

hiraditya added a commit that referenced this pull request Apr 15, 2024
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Apr 30, 2024
In llvm#70452 DAGCombiner::mayAlias was taught to handle scalable vectors, but when it checks via AA->isNoAlias it didn't take into account the case where the size is scalable but there was an offset too.

For the fixed length case the offset was just accounted for by adding to the LocationSize, but for the scalable case there doesn't seem to be a way to represent both a scalable and fixed part in it. So this patch works around it by bailing if there is an offset.

Fixes llvm#90559
lukel97 added a commit that referenced this pull request Apr 30, 2024
…ets (#90573)

In #70452 DAGCombiner::mayAlias was taught to handle scalable sizes, but
when it checks via AA->isNoAlias it didn't take into account the case
where the size is scalable but there was an offset too.

For the fixed length case the offset was just accounted for by adding to
the LocationSize, but for the scalable case there doesn't seem to be a
way to represent both a scalable and fixed part in it. So this patch
works around it by bailing if there is an offset.

Fixes #90559
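
A hypothetical sketch of the workaround described above (the function name is illustrative, not the actual DAGCombiner code): a LocationSize cannot carry a fixed offset plus a vscale-scaled size at the same time, so the check bails conservatively whenever both are present.

  #include "llvm/Analysis/MemoryLocation.h"

  using namespace llvm;

  // Returns true if we must conservatively assume the two accesses may alias.
  static bool mustBailOnScalableOffset(LocationSize Size, int64_t Offset) {
    return Size.hasValue() && Size.isScalable() && Offset != 0;
  }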