Skip to content

[LV] Vectorize selecting last IV of min/max element. #141431

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented May 25, 2025

Add support for vectorizing loops that select the index of the minimum or
maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction.

When creating reduction recipes, we process any reduction that has other
reduction users. The reduction using the min/max reduction needs adjusting
to compute the correct result:

  1. We need to find the first IV for which the condition based on the
    min/max reduction is true,
  2. Compare the partial min/max reduction result to its final value and,
  3. Select the lanes of the partial FindLastIV reductions which correspond
    to the lanes matching the min/max reduction result.

Depends on #140451

@llvmbot
Copy link
Member

llvmbot commented May 25, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Add support for vectorizing loops that select the index of the minimum or
maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction.

When creating reduction recipes, we process any reduction that has other
reduction users. The reduction using the min/max reduction needs adjusting
to compute the correct result:

  1. We need to find the first IV for which the condition based on the
    min/max reduction is true,
  2. Compare the partial min/max reduction result to its final value and,
  3. Select the lanes of the partial FindLastIV reductions which correspond
    to the lanes matching the min/max reduction result.

Depends on #140451


Patch is 140.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141431.diff

13 Files Affected:

  • (modified) llvm/include/llvm/Analysis/IVDescriptors.h (+39-9)
  • (modified) llvm/lib/Analysis/IVDescriptors.cpp (+90-24)
  • (modified) llvm/lib/Transforms/Utils/LoopUtils.cpp (+10-6)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+95-26)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+6)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+7-2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (+1-1)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+19-11)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/if-reduction.ll (+76-10)
  • (modified) llvm/test/Transforms/LoopVectorize/iv-select-cmp.ll (+290-32)
  • (modified) llvm/test/Transforms/LoopVectorize/select-min-index.ll (+461-76)
  • (modified) llvm/test/Transforms/LoopVectorize/vplan-printing-reductions.ll (+1-1)
diff --git a/llvm/include/llvm/Analysis/IVDescriptors.h b/llvm/include/llvm/Analysis/IVDescriptors.h
index d94ffa7287db3..80b4b8390749b 100644
--- a/llvm/include/llvm/Analysis/IVDescriptors.h
+++ b/llvm/include/llvm/Analysis/IVDescriptors.h
@@ -56,6 +56,13 @@ enum class RecurKind {
   FindLastIV, ///< FindLast reduction with select(cmp(),x,y) where one of
               ///< (x,y) is increasing loop induction, and both x and y are
               ///< integer type.
+  FindFirstIVUMin, /// FindFirst reduction with select(icmp(),x,y) where one of
+                   ///< (x,y) is a decreasing loop induction, and both x and y
+                   ///< are integer type.
+  FindFirstIVSMin /// FindFirst reduction with select(icmp(),x,y) where one of
+                  ///< (x,y) is a decreasing loop induction, and both x and y
+                  ///< are integer type.
+
   // clang-format on
   // TODO: Any_of and FindLast reduction need not be restricted to integer type
   // only.
@@ -160,12 +167,13 @@ class RecurrenceDescriptor {
   /// Returns a struct describing whether the instruction is either a
   ///   Select(ICmp(A, B), X, Y), or
   ///   Select(FCmp(A, B), X, Y)
-  /// where one of (X, Y) is an increasing loop induction variable, and the
-  /// other is a PHI value.
+  /// where one of (X, Y) is an increasing (FindLast) or decreasing (FindFirst)
+  /// loop induction variable, and the other is a PHI value.
   // TODO: Support non-monotonic variable. FindLast does not need be restricted
   // to increasing loop induction variables.
-  static InstDesc isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
-                                      Instruction *I, ScalarEvolution &SE);
+  static InstDesc isFindIVPattern(RecurKind Kind, Loop *TheLoop,
+                                  PHINode *OrigPhi, Instruction *I,
+                                  ScalarEvolution &SE);
 
   /// Returns a struct describing if the instruction is a
   /// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
@@ -259,19 +267,37 @@ class RecurrenceDescriptor {
     return Kind == RecurKind::FindLastIV;
   }
 
+  /// Returns true if the recurrence kind is of the form
+  ///   select(cmp(),x,y) where one of (x,y) is an increasing or decreasing loop
+  ///   induction.
+  static bool isFindIVRecurrenceKind(RecurKind Kind) {
+    return Kind == RecurKind::FindLastIV ||
+           Kind == RecurKind::FindFirstIVUMin ||
+           Kind == RecurKind::FindFirstIVSMin;
+  }
+
   /// Returns the type of the recurrence. This type can be narrower than the
   /// actual type of the Phi if the recurrence has been type-promoted.
   Type *getRecurrenceType() const { return RecurrenceType; }
 
-  /// Returns the sentinel value for FindLastIV recurrences to replace the start
-  /// value.
+  /// Returns the sentinel value for FindFirstIV &FindLastIV recurrences to
+  /// replace the start value.
   Value *getSentinelValue() const {
-    assert(isFindLastIVRecurrenceKind(Kind) && "Unexpected recurrence kind");
     Type *Ty = StartValue->getType();
-    return ConstantInt::get(Ty,
-                            APInt::getSignedMinValue(Ty->getIntegerBitWidth()));
+    if (isFindLastIVRecurrenceKind(Kind)) {
+      return ConstantInt::get(
+          Ty, APInt::getSignedMinValue(Ty->getIntegerBitWidth()));
+    } else if (Kind == RecurKind::FindFirstIVSMin) {
+      return ConstantInt::get(
+          Ty, APInt::getSignedMaxValue(Ty->getIntegerBitWidth()));
+    } else {
+      assert(Kind == RecurKind::FindFirstIVUMin);
+      return ConstantInt::get(Ty, APInt::getMaxValue(Ty->getIntegerBitWidth()));
+    }
   }
 
+  void setKind(RecurKind NewKind) { Kind = NewKind; }
+
   /// Returns a reference to the instructions used for type-promoting the
   /// recurrence.
   const SmallPtrSet<Instruction *, 8> &getCastInsts() const { return CastInsts; }
@@ -303,6 +329,10 @@ class RecurrenceDescriptor {
   /// AddReductionVar method, this field will be assigned the last met store.
   StoreInst *IntermediateStore = nullptr;
 
+  /// True if this recurrence is used by another recurrence in the loop. Users
+  /// need to ensure that the final code-gen accounts for the use in the loop.
+  bool IsUsedByOtherRecurrence = false;
+
 private:
   // The starting value of the recurrence.
   // It does not have to be zero!
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index b7c7bcab168cc..c1886935884e5 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -51,6 +51,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
   case RecurKind::UMin:
   case RecurKind::AnyOf:
   case RecurKind::FindLastIV:
+  case RecurKind::FindFirstIVUMin:
+  case RecurKind::FindFirstIVSMin:
     return true;
   }
   return false;
@@ -253,7 +255,7 @@ bool RecurrenceDescriptor::AddReductionVar(
   // Data used for determining if the recurrence has been type-promoted.
   Type *RecurrenceType = Phi->getType();
   SmallPtrSet<Instruction *, 4> CastInsts;
-  unsigned MinWidthCastToRecurrenceType;
+  unsigned MinWidthCastToRecurrenceType = -1ull;
   Instruction *Start = Phi;
   bool IsSigned = false;
 
@@ -308,6 +310,7 @@ bool RecurrenceDescriptor::AddReductionVar(
   //    This is either:
   //      * An instruction type other than PHI or the reduction operation.
   //      * A PHI in the header other than the initial PHI.
+  bool IsUsedByOtherRecurrence = false;
   while (!Worklist.empty()) {
     Instruction *Cur = Worklist.pop_back_val();
 
@@ -369,15 +372,37 @@ bool RecurrenceDescriptor::AddReductionVar(
 
     // Any reduction instruction must be of one of the allowed kinds. We ignore
     // the starting value (the Phi or an AND instruction if the Phi has been
-    // type-promoted).
+    // type-promoted) and other in-loop users, if they form a FindLastIV
+    // reduction. In the latter case, the user of the IVDescriptors must account
+    // for that during codegen.
     if (Cur != Start) {
       ReduxDesc =
           isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
       ExactFPMathInst = ExactFPMathInst == nullptr
                             ? ReduxDesc.getExactFPMathInst()
                             : ExactFPMathInst;
-      if (!ReduxDesc.isRecurrence())
+      if (!ReduxDesc.isRecurrence()) {
+        if (isMinMaxRecurrenceKind(Kind)) {
+          // If the current recurrence is Min/Max, check if the current user is
+          // a select that is a FindLastIV reduction. During codegen, this
+          // recurrence needs to be turned into one that finds the first IV, as
+          // the value to compare against is a Min/Max recurrence.
+          auto *Sel = dyn_cast<SelectInst>(Cur);
+          if (!Sel || !Sel->getType()->isIntegerTy())
+            return false;
+          auto *OtherPhi = dyn_cast<PHINode>(Sel->getOperand(2));
+          if (!OtherPhi)
+            return false;
+          auto NewReduxDesc =
+              isRecurrenceInstr(TheLoop, OtherPhi, Cur, RecurKind::FindLastIV,
+                                ReduxDesc, FuncFMF, SE);
+          if (NewReduxDesc.isRecurrence()) {
+            IsUsedByOtherRecurrence = true;
+            continue;
+          }
+        }
         return false;
+      }
       // FIXME: FMF is allowed on phi, but propagation is not handled correctly.
       if (isa<FPMathOperator>(ReduxDesc.getPatternInst()) && !IsAPhi) {
         FastMathFlags CurFMF = ReduxDesc.getPatternInst()->getFastMathFlags();
@@ -501,7 +526,7 @@ bool RecurrenceDescriptor::AddReductionVar(
   // pattern or more than just a select and cmp. Zero implies that we saw a
   // llvm.min/max intrinsic, which is always OK.
   if (isMinMaxRecurrenceKind(Kind) && NumCmpSelectPatternInst != 2 &&
-      NumCmpSelectPatternInst != 0)
+      NumCmpSelectPatternInst != 0 && !IsUsedByOtherRecurrence)
     return false;
 
   if (isAnyOfRecurrenceKind(Kind) && NumCmpSelectPatternInst != 1)
@@ -533,7 +558,13 @@ bool RecurrenceDescriptor::AddReductionVar(
       ExitInstruction = cast<Instruction>(IntermediateStore->getValueOperand());
   }
 
-  if (!FoundStartPHI || !FoundReduxOp || !ExitInstruction)
+  if (!FoundStartPHI || !FoundReduxOp)
+    return false;
+
+  if (IsUsedByOtherRecurrence) {
+    if (ExitInstruction)
+      return false;
+  } else if (!ExitInstruction)
     return false;
 
   const bool IsOrdered =
@@ -584,8 +615,9 @@ bool RecurrenceDescriptor::AddReductionVar(
   //       without needing a white list of instructions to ignore.
   //       This may also be useful for the inloop reductions, if it can be
   //       kept simple enough.
-  collectCastInstrs(TheLoop, ExitInstruction, RecurrenceType, CastInsts,
-                    MinWidthCastToRecurrenceType);
+  if (ExitInstruction)
+    collectCastInstrs(TheLoop, ExitInstruction, RecurrenceType, CastInsts,
+                      MinWidthCastToRecurrenceType);
 
   // We found a reduction var if we have reached the original phi node and we
   // only have a single instruction with out-of-loop users.
@@ -598,7 +630,7 @@ bool RecurrenceDescriptor::AddReductionVar(
                           FMF, ExactFPMathInst, RecurrenceType, IsSigned,
                           IsOrdered, CastInsts, MinWidthCastToRecurrenceType);
   RedDes = RD;
-
+  RedDes.IsUsedByOtherRecurrence = IsUsedByOtherRecurrence;
   return true;
 }
 
@@ -683,8 +715,9 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
 // value of the data type or a non-constant value by using mask and multiple
 // reduction operations.
 RecurrenceDescriptor::InstDesc
-RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
-                                          Instruction *I, ScalarEvolution &SE) {
+RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop,
+                                      PHINode *OrigPhi, Instruction *I,
+                                      ScalarEvolution &SE) {
   // TODO: Support the vectorization of FindLastIV when the reduction phi is
   // used by more than one select instruction. This vectorization is only
   // performed when the SCEV of each increasing induction variable used by the
@@ -700,7 +733,7 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
                                      m_Value(NonRdxPhi)))))
     return InstDesc(false, I);
 
-  auto IsIncreasingLoopInduction = [&](Value *V) {
+  auto IsSupportedLoopInduction = [&](Value *V, RecurKind Kind) {
     Type *Ty = V->getType();
     if (!SE.isSCEVable(Ty))
       return false;
@@ -710,21 +743,39 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
       return false;
 
     const SCEV *Step = AR->getStepRecurrence(SE);
-    if (!SE.isKnownPositive(Step))
+    if (Kind == RecurKind::FindFirstIVUMin ||
+        Kind == RecurKind::FindFirstIVSMin) {
+      if (!SE.isKnownNegative(Step))
+        return false;
+    } else if (!SE.isKnownPositive(Step))
       return false;
 
     const ConstantRange IVRange = SE.getSignedRange(AR);
     unsigned NumBits = Ty->getIntegerBitWidth();
-    // Keep the minimum value of the recurrence type as the sentinel value.
-    // The maximum acceptable range for the increasing induction variable,
-    // called the valid range, will be defined as
+    // Keep the minimum (FindLast) or maximum (FindFirst) value of the
+    // recurrence type as the sentinel value. The maximum acceptable range for
+    // the induction variable, called the valid range, will be defined as
     //   [<sentinel value> + 1, <sentinel value>)
-    // where <sentinel value> is SignedMin(<recurrence type>)
+    // where <sentinel value> is SignedMin(<recurrence type>) for FindLast or
+    // UnsignedMax(<recurrence type>) for FindFirst.
     // TODO: This range restriction can be lifted by adding an additional
     // virtual OR reduction.
-    const APInt Sentinel = APInt::getSignedMinValue(NumBits);
-    const ConstantRange ValidRange =
-        ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
+    const APInt Sentinel = Kind == RecurKind::FindFirstIVUMin
+                               ? APInt::getMaxValue(NumBits)
+                               : (Kind == RecurKind::FindFirstIVSMin
+                                      ? APInt::getSignedMaxValue(NumBits)
+                                      : APInt::getSignedMinValue(NumBits));
+    ConstantRange ValidRange = ConstantRange::getEmpty(NumBits);
+    if (Kind == RecurKind::FindFirstIVSMin)
+      ValidRange =
+          ConstantRange::getNonEmpty(APInt::getSignedMinValue(NumBits),
+                                     APInt::getSignedMaxValue(NumBits) - 1);
+    else {
+      const APInt Sentinel = Kind == RecurKind::FindFirstIVUMin
+                                 ? APInt::getMaxValue(NumBits)
+                                 : APInt::getSignedMinValue(NumBits);
+      ValidRange = ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
+    }
     LLVM_DEBUG(dbgs() << "LV: FindLastIV valid range is " << ValidRange
                       << ", and the signed range of " << *AR << " is "
                       << IVRange << "\n");
@@ -736,11 +787,18 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
   // We are looking for selects of the form:
   //   select(cmp(), phi, increasing_loop_induction) or
   //   select(cmp(), increasing_loop_induction, phi)
-  // TODO: Support for monotonically decreasing induction variable
-  if (!IsIncreasingLoopInduction(NonRdxPhi))
+  if (Kind == RecurKind::FindLastIV) {
+    if (IsSupportedLoopInduction(NonRdxPhi, Kind))
+      return InstDesc(I, Kind);
     return InstDesc(false, I);
+  }
+
+  if (IsSupportedLoopInduction(NonRdxPhi, RecurKind::FindFirstIVSMin))
+    return InstDesc(I, RecurKind::FindFirstIVSMin);
+  if (IsSupportedLoopInduction(NonRdxPhi, RecurKind::FindFirstIVUMin))
+    return InstDesc(I, RecurKind::FindFirstIVUMin);
 
-  return InstDesc(I, RecurKind::FindLastIV);
+  return InstDesc(false, I);
 }
 
 RecurrenceDescriptor::InstDesc
@@ -875,8 +933,8 @@ RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
     if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
         Kind == RecurKind::Add || Kind == RecurKind::Mul)
       return isConditionalRdxPattern(Kind, I);
-    if (isFindLastIVRecurrenceKind(Kind) && SE)
-      return isFindLastIVPattern(L, OrigPhi, I, *SE);
+    if (isFindIVRecurrenceKind(Kind) && SE)
+      return isFindIVPattern(Kind, L, OrigPhi, I, *SE);
     [[fallthrough]];
   case Instruction::FCmp:
   case Instruction::ICmp:
@@ -990,6 +1048,12 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
     LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n");
     return true;
   }
+  if (AddReductionVar(Phi, RecurKind::FindFirstIVUMin, TheLoop, FMF, RedDes, DB,
+                      AC, DT, SE)) {
+    LLVM_DEBUG(dbgs() << "Found a FindFirstV reduction PHI." << *Phi << "\n");
+    return true;
+  }
+
   if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
                       SE)) {
     LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
@@ -1153,6 +1217,8 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
   case RecurKind::SMin:
   case RecurKind::UMax:
   case RecurKind::UMin:
+  case RecurKind::FindFirstIVUMin:
+  case RecurKind::FindFirstIVSMin:
     return Instruction::ICmp;
   case RecurKind::FMax:
   case RecurKind::FMin:
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 2fff9521017ff..973f33685beed 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -1244,12 +1244,16 @@ Value *llvm::createAnyOfReduction(IRBuilderBase &Builder, Value *Src,
 Value *llvm::createFindLastIVReduction(IRBuilderBase &Builder, Value *Src,
                                        Value *Start,
                                        const RecurrenceDescriptor &Desc) {
-  assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
-             Desc.getRecurrenceKind()) &&
-         "Unexpected reduction kind");
+  assert(
+      RecurrenceDescriptor::isFindIVRecurrenceKind(Desc.getRecurrenceKind()) &&
+      "Unexpected reduction kind");
   Value *Sentinel = Desc.getSentinelValue();
   Value *MaxRdx = Src->getType()->isVectorTy()
-                      ? Builder.CreateIntMaxReduce(Src, true)
+                      ? (Desc.getRecurrenceKind() == RecurKind::FindLastIV
+                             ? Builder.CreateIntMaxReduce(Src, true)
+                             : Builder.CreateIntMinReduce(
+                                   Src, Desc.getRecurrenceKind() ==
+                                            RecurKind::FindFirstIVSMin))
                       : Src;
   // Correct the final reduction result back to the start value if the maximum
   // reduction is sentinel value.
@@ -1345,8 +1349,8 @@ Value *llvm::createSimpleReduction(IRBuilderBase &Builder, Value *Src,
 Value *llvm::createSimpleReduction(VectorBuilder &VBuilder, Value *Src,
                                    RecurKind Kind) {
   assert(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
-         !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind) &&
-         "AnyOf or FindLastIV reductions are not supported.");
+         !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) &&
+         "AnyOf, FindFirstIV and FindLastIV reductions are not supported.");
   Intrinsic::ID Id = getReductionIntrinsicID(Kind);
   auto *SrcTy = cast<VectorType>(Src->getType());
   Type *SrcEltTy = SrcTy->getElementType();
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 8636550d4f644..1a8016d93221a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5176,7 +5176,7 @@ LoopVectorizationCostModel::selectInterleaveCount(VPlan &Plan, ElementCount VF,
           const RecurrenceDescriptor &RdxDesc = Reduction.second;
           RecurKind RK = RdxDesc.getRecurrenceKind();
           return RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
-                 RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK);
+                 RecurrenceDescriptor::isFindIVRecurrenceKind(RK);
         });
     if (HasSelectCmpReductions) {
       LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
@@ -7549,7 +7549,7 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
   auto *EpiRedResult = dyn_cast<VPInstruction>(R);
   if (!EpiRedResult ||
       (EpiRedResult->getOpcode() != VPInstruction::ComputeReductionResult &&
-       EpiRedResult->getOpcode() != VPInstruction::ComputeFindLastIVResult))
+       EpiRedResult->getOpcode() != VPInstruction::ComputeFindIVResult))
     return;
 
   auto *EpiRedHeaderPhi =
@@ -7567,7 +7567,7 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
            "AnyOf expected to start by comparing main resume value to original "
            "start value");
     MainResumeValue = Cmp->getOperand(0);
-  } else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(
+  } else if (RecurrenceDescriptor::isFindIVRecurrenceKind(
                  RdxDesc.getRecurrenceKind())) {
     using namespace llvm::PatternMatch;
     Value *Cmp, *OrigResumeV, *CmpOp;
@@ -8499,7 +8499,7 @@ bool VPRecipeBuilder::getScaledReductions(
     Instruction *PHI, Instruction *RdxExitInstr, VFRange &Range,
     SmallVectorImpl<std::pair<PartialReductionChain, unsigned>> &Chains) {
 
-  if (!CM.TheLoop->contains(RdxExitInstr))
+  if (!RdxExitInstr || !CM.TheLoop->contains(RdxExitInstr))
     return false;
 
   auto *Update = dyn_cast<BinaryOperator>(RdxExitInstr);
@@ -9360,7 +9360,7 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
     RecurKind Kind = RdxDesc.getRecurrenceKind();
     assert(
         !RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
-        !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind) &&
+        !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) &&
         "AnyOf and FindLast reductions are not allowed for in-loop reductions");
 
     // Collect the chain of "link" recipes for the reduction starting at PhiR.
@@ -9517,7 +9517,7 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
                (cast<VPInstru...
[truncated]

Copy link
Contributor

@Mel-Chen Mel-Chen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good idea that this patch avoids introducing a new RecurKind specifically for min/max with index.
However, I believe it would be more appropriate not to reuse AddReductionVars when idiom detection. The scalar phi for min/max doesn't really align with the current definition of a reduction. A reduction phi shouldn't be used by operations inside the loop that are not part of the recurrence chain, and it also needs to have users outside the loop.
So I think it should be detected independently.
Of course, a better approach would be to first identify the recurrence chain, and then determine whether it's a reduction.

Here's the source code we use for the detection part: #141467
As for the IR generation, I'd like to keep it internal for a while longer to test things before releasing it.

@fhahn
Copy link
Contributor Author

fhahn commented Jun 1, 2025

So I think it should be detected independently. Of course, a better approach would be to first identify the recurrence chain, and then determine whether it's a reduction.

Yep that's a good point! I was initially planning to split off detection of the pattern separately as follow-up, but it seems better to do it to start with. I started sharing a few refactoring patches to make it simpler to create reduction phi recipes in VPlan (final one in the series is #142322) and will update the PR here in the next few days.

@fhahn fhahn changed the title [LV] Vectorize selecting index of min/max element. [LV] Vectorize selecting last IV of min/max element. Jun 9, 2025
@fhahn fhahn force-pushed the lv-find-min-max-index branch from 0554871 to 9019ec3 Compare June 9, 2025 21:42
@fhahn
Copy link
Contributor Author

fhahn commented Jun 9, 2025

So I think it should be detected independently. Of course, a better approach would be to first identify the recurrence chain, and then determine whether it's a reduction.

Yep that's a good point! I was initially planning to split off detection of the pattern separately as follow-up, but it seems better to do it to start with. I started sharing a few refactoring patches to make it simpler to create reduction phi recipes in VPlan (final one in the series is #142322) and will update the PR here in the next few days.

Updated the latest version to perform the check for the pattern in VPlan.

The main idea is that we allow unclassified phis in legal and hand off the VPlan to check if we can the remaining VPWidenPHIRecipes to min/max reductions used by FindLast/FirstIV. The current patch just supports the FindLastIV case, to not make it depend on #140451.

It now depends on #142322 to make it easy to create VPReductionPHIRecipes from VPlan w/o RecurrenceDescriptor. WDYT?

@Mel-Chen
Copy link
Contributor

Updated the latest version to perform the check for the pattern in VPlan.

The main idea is that we allow unclassified phis in legal and hand off the VPlan to check if we can the remaining VPWidenPHIRecipes to min/max reductions used by FindLast/FirstIV. The current patch just supports the FindLastIV case, to not make it depend on #140451.

It now depends on #142322 to make it easy to create VPReductionPHIRecipes from VPlan w/o RecurrenceDescriptor. WDYT?

#142322 is good and will definitely be useful in the future. However, I’m a bit confused about VPlanTransforms::legalizeUnclassifiedPhis. Do we have any plans to move legality checks into the VPlan transformation phase? What are the benefits of doing this?
I’m concerned that identifying min/max recurrences so late might limit future support for in-loop min/max reductions. Also, when min/max recurrences have LoopExitInstr, would this approach prevent us from using tail folding for vectorization?

Separately, I submitted #141467 and #142335, but they haven’t received any review yet. I’d like to understand if there are any major design issues with my patches that are preventing them from being accepted.

@fhahn
Copy link
Contributor Author

fhahn commented Jun 10, 2025

#142322 is good and will definitely be useful in the future. However, I’m a bit confused about VPlanTransforms::legalizeUnclassifiedPhis. Do we have any plans to move legality checks into the VPlan transformation phase? What are the benefits of doing this?

We already do some legality checks in VPlan. The main one is for fixed-order recurrences, where we bail out if we fail to re-order the users of a recurrence. A number of VPlan transformations also check legality on VPlan before applying a transformation (including convertToSingleScalar). There are also some transforms that are required for legality, but work independently of LoopVectorizationLegality (e.g. dropPoisonGeneratingRecipes and the existing adjustRecipesForReductions, that converts recipes as needed for in-loop reductions).

I see the motivation similar to moving the cost model to be VPlan-based. Performing legality checks in VPlan allows combining checks directly with the transformations, which can be simpler/more direct due to not first requiring to collect the information separately in one place and then trying to translate it later to the already constructed VPlan based on IR references. It can also encourage more composable recipes. Many/most improvements over the last few years built on top of VPlan instead of the existing legacy code, and overall the approach has been very successful in my opinion.

In general, being able to perform legality checks directly on VPlan can also allow more flexibility, as we do not have to commit to a particular strategy (or check all possible strategies) early on (e.g. tail-folding vs. not folding the tail), and some legality issues can be resolved by other transforms that can be applied directly (e.g sinking/hoisting for fixed-order recurrences, peeling, distribution).

I think new capabilities serve as a great opportunity to improve and evolve the infrastructure to make sure the current VPlan representation is flexible and expressive enough to drive them

I’m concerned that identifying min/max recurrences so late might limit future support for in-loop min/max reductions. Also, when min/max recurrences have LoopExitInstr, would this approach prevent us from using tail folding for vectorization?

In our current setup, the transform happens after adjustReciepsForReductions. I think we can’t easily create in-loop reductions after that as-is. But I think this is just a limitation of the current order. As follow-ups, we should be able to improve the order like this: 1) Add ComputeReductionResults and update exit users, 2) Legalize the remaining phis, and 3) Convert recipes for in-loop reductions.

As for tail folding with exit users, this won’t work with the current canFoldTailByMasking. But as I mentioned earlier, this could be a good chance to rethink/evolve how and when we commit to tail folding and check its legality.

Separately, I submitted #141467 and #142335, but they haven’t received any review yet. I’d like to understand if there are any major design issues with my patches that are preventing them from being accepted.

Yep thanks! I originally shared the patch to get the ball rolling and to better understand if performing the checks directly based on VPlan would be feasible as an option, so we can consider the trade-offs more concretely.

#141467 and #142335 fit into the current system. One potential drawback is that it requires carrying state collected in IVDescriptors via LoopVectorizationLegality to LoopVectorize, which adds complexity. For this particular case, there may still be some limitations to VPlan currently that make it desirable not to use a VPlan-first approach here, but as I mentioned earlier, it may serve as a good opportunity to evolve the infrastructure and pave the road for further modularization/VPlan-ization of more legality checks.

fhahn added 3 commits June 17, 2025 22:45
Instead of looking up the narrower reduction type via getRecurrenceType
we can generate the needed extend directly at constructiond re-use the
truncated value from the loop.
Replace VPReductionPHIRecipe with RecurKind in VPReductionPHIRecipe, as
all VPlan analyses and codegen only require the recurrence kind. This
enables creating new VPReductionPHIRecipe directly in LV, without
needing to construction a whole RecurrenceDescriptor object.

Depends on
llvm#141860
llvm#141932
llvm#142290
llvm#142291
Add support for vectorizing loops that select the index of the minimum or
maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.

It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction.

When creating reduction recipes, we process any reduction that has other
reduction users. The reduction using the min/max reduction needs adjusting
to compute the correct result:
 1. We need to find the first IV for which the condition based on the
    min/max reduction is true,
 2. Compare the partial min/max reduction result to its final value and,
 3. Select the lanes of the partial FindLastIV reductions which correspond
    to the lanes matching the min/max reduction result.
@fhahn fhahn force-pushed the lv-find-min-max-index branch from 9019ec3 to e064211 Compare June 17, 2025 21:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants