New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable #67812
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-vectorizers ChangesConsider the following loop:
We can vectorize this loop if This patch added new RecurKind enums - IFindLastIV and FFindLastIV. Patch is 230.36 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67812.diff 11 Files Affected:
diff --git a/llvm/include/llvm/Analysis/IVDescriptors.h b/llvm/include/llvm/Analysis/IVDescriptors.h
index 0ee3f4fed8c976d..622060df0d8cd6c 100644
--- a/llvm/include/llvm/Analysis/IVDescriptors.h
+++ b/llvm/include/llvm/Analysis/IVDescriptors.h
@@ -52,9 +52,16 @@ enum class RecurKind {
FMulAdd, ///< Sum of float products with llvm.fmuladd(a * b + sum).
IAnyOf, ///< Any_of reduction with select(icmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
- FAnyOf ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
+ FAnyOf, ///< Any_of reduction with select(fcmp(),x,y) where one of (x,y) is
///< loop invariant, and both x and y are integer type.
- // TODO: Any_of reduction need not be restricted to integer type only.
+ IFindLastIV, ///< FindLast reduction with select(icmp(),x,y) where one of
+ ///< (x,y) is increasing loop induction PHI, and both x and y are
+ ///< integer type.
+ FFindLastIV ///< FindLast reduction with select(fcmp(),x,y) where one of (x,y)
+ ///< is increasing loop induction PHI, and both x and y are
+ ///< integer type.
+ // TODO: Any_of and FindLast reduction need not be restricted to integer type
+ // only.
};
/// The RecurrenceDescriptor is used to identify recurrences variables in a
@@ -126,7 +133,7 @@ class RecurrenceDescriptor {
/// the returned struct.
static InstDesc isRecurrenceInstr(Loop *L, PHINode *Phi, Instruction *I,
RecurKind Kind, InstDesc &Prev,
- FastMathFlags FuncFMF);
+ FastMathFlags FuncFMF, ScalarEvolution *SE);
/// Returns true if instruction I has multiple uses in Insts
static bool hasMultipleUsesOf(Instruction *I,
@@ -153,6 +160,16 @@ class RecurrenceDescriptor {
static InstDesc isAnyOfPattern(Loop *Loop, PHINode *OrigPhi, Instruction *I,
InstDesc &Prev);
+ /// Returns a struct describing whether the instruction is either a
+ /// Select(ICmp(A, B), X, Y), or
+ /// Select(FCmp(A, B), X, Y)
+ /// where one of (X, Y) is an increasing loop induction variable, and the
+ /// other is a PHI value.
+ // TODO: FindLast does not need be restricted to increasing loop induction
+ // variables.
+ static InstDesc isFindLastIVPattern(Loop *Loop, PHINode *OrigPhi,
+ Instruction *I, ScalarEvolution *SE);
+
/// Returns a struct describing if the instruction is a
/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
static InstDesc isConditionalRdxPattern(RecurKind Kind, Instruction *I);
@@ -241,6 +258,12 @@ class RecurrenceDescriptor {
return Kind == RecurKind::IAnyOf || Kind == RecurKind::FAnyOf;
}
+ /// Returns true if the recurrence kind is of the form
+ /// select(cmp(),x,y) where one of (x,y) is increasing loop induction.
+ static bool isFindLastIVRecurrenceKind(RecurKind Kind) {
+ return Kind == RecurKind::IFindLastIV || Kind == RecurKind::FFindLastIV;
+ }
+
/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }
diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index 0d99249be413762..b83cfbe07ec59d8 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -372,6 +372,12 @@ CmpInst::Predicate getMinMaxReductionPredicate(RecurKind RK);
Value *createAnyOfOp(IRBuilderBase &Builder, Value *StartVal, RecurKind RK,
Value *Left, Value *Right);
+/// See RecurrenceDescriptor::isFindLastIVPattern for a description of the
+/// pattern we are trying to match. In this pattern, since the selected set of
+/// values forms an increasing sequence, we are selecting the maximum value from
+/// \p Left and \p Right.
+Value *createFindLastIVOp(IRBuilderBase &Builder, Value *Left, Value *Right);
+
/// Returns a Min/Max operation corresponding to MinMaxRecurrenceKind.
/// The Builder's fast-math-flags must be set to propagate the expected values.
Value *createMinMaxOp(IRBuilderBase &Builder, RecurKind RK, Value *Left,
@@ -402,6 +408,12 @@ Value *createAnyOfTargetReduction(IRBuilderBase &B, Value *Src,
const RecurrenceDescriptor &Desc,
PHINode *OrigPhi);
+/// Create a target reduction of the given vector \p Src for a reduction of the
+/// kind RecurKind::IFindLastIV or RecurKind::FFindLastIV. The reduction
+/// operation is described by \p Desc.
+Value *createFindLastIVTargetReduction(IRBuilderBase &B, Value *Src,
+ const RecurrenceDescriptor &Desc);
+
/// Create a generic target reduction using a recurrence descriptor \p Desc
/// The target is queried to determine if intrinsics or shuffle sequences are
/// required to implement the reduction.
@@ -415,6 +427,15 @@ Value *createOrderedReduction(IRBuilderBase &B,
const RecurrenceDescriptor &Desc, Value *Src,
Value *Start);
+/// Returns a set of cmp and select instructions as shown below:
+/// Select(Cmp(NE, Rdx, Iden), Rdx, InitVal)
+/// where \p Rdx is a scalar value generated by target reduction, Iden is the
+/// sentinel value of the recurrence descriptor \p Desc, and InitVal is the
+/// start value of the recurrence descriptor \p Desc.
+Value *createSentinelValueHandling(IRBuilderBase &Builder,
+ const RecurrenceDescriptor &Desc,
+ Value *Rdx);
+
/// Get the intersection (logical and) of all of the potential IR flags
/// of each scalar operation (VL) that will be converted into a vector (I).
/// If OpValue is non-null, we only consider operations similar to OpValue
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index 46629e381bc3665..19f594d0453c574 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -54,6 +54,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
case RecurKind::UMin:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
return true;
}
return false;
@@ -375,7 +377,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// type-promoted).
if (Cur != Start) {
ReduxDesc =
- isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF);
+ isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
ExactFPMathInst = ExactFPMathInst == nullptr
? ReduxDesc.getExactFPMathInst()
: ExactFPMathInst;
@@ -662,6 +664,96 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
: RecurKind::FAnyOf);
}
+// We are looking for loops that do something like this:
+// int r = 0;
+// for (int i = 0; i < n; i++) {
+// if (src[i] > 3)
+// r = i;
+// }
+// The reduction value (r) is derived from either the values of an increasing
+// induction variable (i) sequence, or from the start value (0).
+// The LLVM IR generated for such loops would be as follows:
+// for.body:
+// %r = phi i32 [ %spec.select, %for.body ], [ 0, %entry ]
+// %i = phi i32 [ %inc, %for.body ], [ 0, %entry ]
+// ...
+// %cmp = icmp sgt i32 %5, 3
+// %spec.select = select i1 %cmp, i32 %i, i32 %r
+// %inc = add nsw i32 %i, 1
+// ...
+// Since 'i' is an increasing induction variable, the reduction value after the
+// loop will be the maximum value of 'i' that the condition (src[i] > 3) is
+// satisfied, or the start value (0 in the example above). When the start value
+// of the increasing induction variable 'i' is greater than the minimum value of
+// the data type, we can use the minimum value of the data type as a sentinel
+// value to replace the start value. This allows us to perform a single
+// reduction max operation to obtain the final reduction result.
+// TODO: It is possible to solve the case where the start value is the minimum
+// value of the data type or a non-constant value by using mask and multiple
+// reduction operations.
+RecurrenceDescriptor::InstDesc
+RecurrenceDescriptor::isFindLastIVPattern(Loop *Loop, PHINode *OrigPhi,
+ Instruction *I, ScalarEvolution *SE) {
+ // Only match select with single use cmp condition.
+ // TODO: Only handle single use for now.
+ CmpInst::Predicate Pred;
+ if (!match(I, m_Select(m_OneUse(m_Cmp(Pred, m_Value(), m_Value())), m_Value(),
+ m_Value())))
+ return InstDesc(false, I);
+
+ SelectInst *SI = cast<SelectInst>(I);
+ Value *NonRdxPhi = nullptr;
+
+ if (OrigPhi == dyn_cast<PHINode>(SI->getTrueValue()))
+ NonRdxPhi = SI->getFalseValue();
+ else if (OrigPhi == dyn_cast<PHINode>(SI->getFalseValue()))
+ NonRdxPhi = SI->getTrueValue();
+ else
+ return InstDesc(false, I);
+
+ auto IsIncreasingLoopInduction = [&SE, &Loop](Value *V) {
+ if (!SE)
+ return false;
+
+ Type *Ty = V->getType();
+ if (!SE->isSCEVable(Ty))
+ return false;
+
+ auto *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(V));
+ if (!AR)
+ return false;
+
+ const SCEV *Step = AR->getStepRecurrence(*SE);
+ if (!SE->isKnownPositive(Step))
+ return false;
+
+ const ConstantRange IVRange = SE->getSignedRange(AR);
+ unsigned NumBits = Ty->getIntegerBitWidth();
+ // Keep the minmum value of the recurrence type as the sentinel value.
+ // The maximum acceptable range for the increasing induction variable,
+ // called the valid range, will be defined as
+ // [<sentinel value> + 1, SignedMin(<recurrence type>))
+ // TODO: This range restriction can be lifted by adding an additional
+ // virtual OR reduction.
+ const APInt Sentinel = APInt::getSignedMinValue(NumBits);
+ const ConstantRange ValidRange = ConstantRange::getNonEmpty(
+ Sentinel + 1, APInt::getSignedMinValue(NumBits));
+ LLVM_DEBUG(dbgs() << "LV: FindLastIV valid range is " << ValidRange
+ << ", and the signed range of " << *AR << " is "
+ << IVRange << "\n");
+ return ValidRange.contains(IVRange);
+ };
+
+ // We are looking for selects of the form:
+ // select(cmp(), phi, loop_induction) or
+ // select(cmp(), loop_induction, phi)
+ if (!IsIncreasingLoopInduction(NonRdxPhi))
+ return InstDesc(false, I);
+
+ return InstDesc(I, isa<ICmpInst>(I->getOperand(0)) ? RecurKind::IFindLastIV
+ : RecurKind::FFindLastIV);
+}
+
RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isMinMaxPattern(Instruction *I, RecurKind Kind,
const InstDesc &Prev) {
@@ -765,10 +857,9 @@ RecurrenceDescriptor::isConditionalRdxPattern(RecurKind Kind, Instruction *I) {
return InstDesc(true, SI);
}
-RecurrenceDescriptor::InstDesc
-RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
- Instruction *I, RecurKind Kind,
- InstDesc &Prev, FastMathFlags FuncFMF) {
+RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
+ Loop *L, PHINode *OrigPhi, Instruction *I, RecurKind Kind, InstDesc &Prev,
+ FastMathFlags FuncFMF, ScalarEvolution *SE) {
assert(Prev.getRecKind() == RecurKind::None || Prev.getRecKind() == Kind);
switch (I->getOpcode()) {
default:
@@ -798,6 +889,8 @@ RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,
if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
Kind == RecurKind::Add || Kind == RecurKind::Mul)
return isConditionalRdxPattern(Kind, I);
+ if (isFindLastIVRecurrenceKind(Kind))
+ return isFindLastIVPattern(L, OrigPhi, I, SE);
[[fallthrough]];
case Instruction::FCmp:
case Instruction::ICmp:
@@ -902,6 +995,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
<< *Phi << "\n");
return true;
}
+ if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC,
+ DT, SE)) {
+ LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n");
+ return true;
+ }
if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
@@ -1091,6 +1189,9 @@ Value *RecurrenceDescriptor::getRecurrenceIdentity(RecurKind K, Type *Tp,
case RecurKind::FAnyOf:
return getRecurrenceStartValue();
break;
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
+ return getRecurrenceIdentity(RecurKind::SMax, Tp, FMF);
default:
llvm_unreachable("Unknown recurrence kind");
}
@@ -1118,12 +1219,14 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
case RecurKind::UMax:
case RecurKind::UMin:
case RecurKind::IAnyOf:
+ case RecurKind::IFindLastIV:
return Instruction::ICmp;
case RecurKind::FMax:
case RecurKind::FMin:
case RecurKind::FMaximum:
case RecurKind::FMinimum:
case RecurKind::FAnyOf:
+ case RecurKind::FFindLastIV:
return Instruction::FCmp;
default:
llvm_unreachable("Unknown recurrence operation");
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 21affe7bdce406e..192e93a2f9b455d 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -942,6 +942,11 @@ Value *llvm::createAnyOfOp(IRBuilderBase &Builder, Value *StartVal,
return Builder.CreateSelect(Cmp, Left, Right, "rdx.select");
}
+Value *llvm::createFindLastIVOp(IRBuilderBase &Builder, Value *Left,
+ Value *Right) {
+ return createMinMaxOp(Builder, RecurKind::SMax, Left, Right);
+}
+
Value *llvm::createMinMaxOp(IRBuilderBase &Builder, RecurKind RK, Value *Left,
Value *Right) {
Type *Ty = Left->getType();
@@ -1062,6 +1067,14 @@ Value *llvm::createAnyOfTargetReduction(IRBuilderBase &Builder, Value *Src,
return Builder.CreateSelect(Cmp, NewVal, InitVal, "rdx.select");
}
+Value *llvm::createFindLastIVTargetReduction(IRBuilderBase &Builder, Value *Src,
+ const RecurrenceDescriptor &Desc) {
+ assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
+ Desc.getRecurrenceKind()) &&
+ "Unexpected reduction kind");
+ return Builder.CreateIntMaxReduce(Src, true);
+}
+
Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder, Value *Src,
RecurKind RdxKind) {
auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();
@@ -1115,6 +1128,8 @@ Value *llvm::createTargetReduction(IRBuilderBase &B,
RecurKind RK = Desc.getRecurrenceKind();
if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
return createAnyOfTargetReduction(B, Src, Desc, OrigPhi);
+ if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ return createFindLastIVTargetReduction(B, Src, Desc);
return createSimpleTargetReduction(B, Src, RK);
}
@@ -1131,6 +1146,16 @@ Value *llvm::createOrderedReduction(IRBuilderBase &B,
return B.CreateFAddReduce(Start, Src);
}
+Value *llvm::createSentinelValueHandling(IRBuilderBase &Builder,
+ const RecurrenceDescriptor &Desc,
+ Value *Rdx) {
+ Value *InitVal = Desc.getRecurrenceStartValue();
+ Value *Iden = Desc.getRecurrenceIdentity(
+ Desc.getRecurrenceKind(), Rdx->getType(), Desc.getFastMathFlags());
+ Value *Cmp = Builder.CreateCmp(CmpInst::ICMP_NE, Rdx, Iden, "rdx.select.cmp");
+ return Builder.CreateSelect(Cmp, Rdx, InitVal, "rdx.select");
+}
+
void llvm::propagateIRFlags(Value *I, ArrayRef<Value *> VL, Value *OpValue,
bool IncludeWrapFlags) {
auto *VecOp = dyn_cast<Instruction>(I);
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index cc17d91d4f43727..35be8deeb542c56 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -3901,6 +3901,8 @@ void InnerLoopVectorizer::fixReduction(VPReductionPHIRecipe *PhiR,
else if (RecurrenceDescriptor::isAnyOfRecurrenceKind(RK))
ReducedPartRdx = createAnyOfOp(Builder, ReductionStartValue, RK,
ReducedPartRdx, RdxPart);
+ else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ ReducedPartRdx = createFindLastIVOp(Builder, ReducedPartRdx, RdxPart);
else
ReducedPartRdx = createMinMaxOp(Builder, RK, ReducedPartRdx, RdxPart);
}
@@ -3919,6 +3921,10 @@ void InnerLoopVectorizer::fixReduction(VPReductionPHIRecipe *PhiR,
: Builder.CreateZExt(ReducedPartRdx, PhiTy);
}
+ if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK))
+ ReducedPartRdx =
+ createSentinelValueHandling(Builder, RdxDesc, ReducedPartRdx);
+
PHINode *ResumePhi =
dyn_cast<PHINode>(PhiR->getStartValue()->getUnderlyingValue());
@@ -5822,8 +5828,9 @@ LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
HasReductions &&
any_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {
const RecurrenceDescriptor &RdxDesc = Reduction.second;
- return RecurrenceDescriptor::isAnyOfRecurrenceKind(
- RdxDesc.getRecurrenceKind());
+ RecurKind RK = RdxDesc.getRecurrenceKind();
+ return RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
+ RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK);
});
if (HasSelectCmpReductions) {
LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
@@ -8973,8 +8980,10 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
for (VPReductionPHIRecipe *PhiR : InLoopReductionPhis) {
const RecurrenceDescriptor &RdxDesc = PhiR->getRecurrenceDescriptor();
RecurKind Kind = RdxDesc.getRecurrenceKind();
- assert(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
- "AnyOf reductions are not allowed for in-loop reductions");
+ assert(
+ (!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
+ !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind)) &&
+ "AnyOf and FindLast reductions are not allowed for in-loop reductions");
// Collect the chain of "link" recipes for the reduction starting at PhiR.
SetVector<VPRecipeBase *> Worklist;
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 572c4399b8b55cd..3ffba0113cbca03 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -14352,6 +14352,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for repeated scalar.");
}
@@ -14441,6 +14443,8 @@ class HorizontalReduction {
case RecurKind::FMulAdd:
case RecurKind::IAnyOf:
case RecurKind::FAnyOf:
+ case RecurKind::IFindLastIV:
+ case RecurKind::FFindLastIV:
case RecurKind::None:
llvm_unreachable("Unexpected reduction kind for reused scalars.");
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 2a1213a98095907..86e93f909bf747c 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1589,6 +1589,22 @@ void VPReductionPHIRecipe::execute(VPTransformState &State) {
StartV = Iden =
Builder.CreateVectorSplat(State.VF, StartV, "minmax.ident");
}
+ } ...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, apart from minor nits.
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK | ||
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=1 -S < %s | FileCheck %s --check-prefix=CHECK | ||
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3 | ||
; RUN: opt -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK-VF4IC1 --check-prefix=CHECK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileCheck %s --check-prefixes=CHECK,CHECK-VF4IC1
.
llvm/lib/Analysis/IVDescriptors.cpp
Outdated
// virtual OR reduction. | ||
const APInt Sentinel = APInt::getSignedMinValue(NumBits); | ||
const ConstantRange ValidRange = ConstantRange::getNonEmpty( | ||
Sentinel + 1, APInt::getSignedMinValue(NumBits)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConstantRange::getNonEmpty(Sentinel + 1, Sentinel)
.
Extend the idea in llvm#67812 to support vectorizion of decreasing IV in select-cmp patterns. llvm#67812 enabled vectorization of the following example: long src[20000] = {4, 5, 2}; long r = 331; for (long i = 0; i < 20000; i++) { if (src[i] > 3) r = i; } return r; This patch extends the above idea to also vectorize: long src[20000] = {4, 5, 2}; long r = 331; for (long i = 20000 - 1; i >= 0; i--) { if (src[i] > 3) r = i; } return r;
✅ With the latest revision this PR passed the C/C++ code formatter. |
Although IFindLastIV can be fixed to FFindLastIV when parsing instructions, could we reject IFindLastIV with FCmpInst and AddReductionVar for FFindLastIV explicitily? |
12801ba
to
ed0e0a1
Compare
This is possible, but it may cause going through the pattern identification process one more time. |
@ayalz @fhahn @david-arm Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this patch - it looks like a useful addition. I'm still looking through all the tests, but I thought I'd leave some comments I have so far!
@@ -900,6 +993,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop, | |||
<< *Phi << "\n"); | |||
return true; | |||
} | |||
if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we not also need one for FFindLastIV?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. We don't need to call AddReductionVar
again for FFindLastIV.
This is because at the end of function isFindLastIVPattern
, IFindLastIV can be transformed into FFindLastIV if the predicate is an fcmp instruction.
llvm/lib/Analysis/IVDescriptors.cpp
Outdated
return InstDesc(false, I); | ||
|
||
auto IsIncreasingLoopInduction = [&](Value *V) { | ||
if (!SE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it's better to bail out at the start of isFindLastIVPattern if SE is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
952b5df
Nice suggestion. I pass SE by reference directly. Thanks.
llvm/lib/Analysis/IVDescriptors.cpp
Outdated
return false; | ||
|
||
const SCEV *Step = AR->getStepRecurrence(*SE); | ||
if (!SE->isKnownPositive(Step)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume there isn't a fundamental reason why we can't support known negative steps too? Is it worth adding a TODO for future work? It sounds like we'd just need a sentinel that was the max positive value and use smin instead of smax for the final reduction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ec9217d
Add the TODO for the decreasing induction.
// Keep the minimum value of the recurrence type as the sentinel value. | ||
// The maximum acceptable range for the increasing induction variable, | ||
// called the valid range, will be defined as | ||
// [<sentinel value> + 1, SignedMin(<recurrence type>)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this comment a little confusing as it suggests <sentinel value>
and SignedMin(<recurrence type>)
are different, yet the code below suggests they are the same, i.e.
Sentinel = APInt::getSignedMinValue(NumBits);
Isn't this just
// [<sentinel value> + 1, <sentinel value>)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<sentinel value>
and SignedMin(<recurrence type>)
are the same.
The valid range refers to the acceptable range of the increasing induction variable.
Since <sentinel value>
cannot be equal to any possible value of the increasing induction variable, the acceptable range of the increasing induction variable is limited to [<sentinel value> + 1, <sentinel value>)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks for explaining. Can you update the comment to show that please? i.e.
// called the valid range, will be defined as
// [<sentinel value> + 1, <sentinel value>)
// where <sentinel value> == SignedMin(<recurrence type>)
// TODO: This range restriction can be lifted by adding an additional | ||
// virtual OR reduction. | ||
const APInt Sentinel = APInt::getSignedMinValue(NumBits); | ||
const ConstantRange ValidRange = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just avoid calling difference and specify the range explicitly? I'll be honest I struggle to understand the ConstantRange stuff so please forgive my lack of knowledge! I was thinking it might be something like:
ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your answer is correct. :)
Actually, that's my initial version. The current version has been modified based on previous comments. 2466df5
integer induction variable Consider the following loop: int rdx = init; for (int i = 0; i < n; ++i) rdx = (a[i] > b[i]) ? i : rdx; We can vectorize this loop if `i` is an increasing induction variable. The final reduced value will be the maximum of `i` that the condition `a[i] > b[i]` is satisfied, or the start value `init`. This patch added new RecurKind enums - IFindLastIV and FFindLastIV.
This patch applys range analysis. It will exclude cases where the range of induction variable cannot be fully contained within [<sentinel value> + 1, <minimum value of recurrence type>) This approach also handles truncated induction variable cases well.
Comment from Alexey Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>
comment from Alexey Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>
Co-authored-by: Alexey Bataev <5361294+alexey-bataev@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've almost finished reviewing! Still have some tests to work through, but I thought I'd leave the comments I have so far. It looks like you've done a thorough job at testing all the edge cases though.
RecurrenceDescriptor::isFindLastIVPattern(PHINode *OrigPhi, Instruction *I, | ||
ScalarEvolution &SE) { | ||
// Only match select with single use cmp condition. | ||
// TODO: Only handle single use for now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This TODO line says the same thing as above. Maybe just have a single TODO line, i.e. something like:
// TODO: Match selects with multi-use cmp conditions.
// Keep the minimum value of the recurrence type as the sentinel value. | ||
// The maximum acceptable range for the increasing induction variable, | ||
// called the valid range, will be defined as | ||
// [<sentinel value> + 1, SignedMin(<recurrence type>)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks for explaining. Can you update the comment to show that please? i.e.
// called the valid range, will be defined as
// [<sentinel value> + 1, <sentinel value>)
// where <sentinel value> == SignedMin(<recurrence type>)
@@ -900,6 +991,11 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop, | |||
<< *Phi << "\n"); | |||
return true; | |||
} | |||
if (AddReductionVar(Phi, RecurKind::IFindLastIV, TheLoop, FMF, RedDes, DB, AC, | |||
DT, SE)) { | |||
LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be a good idea to indicate whether we've matched IFindLastIV or FFindLastIV because FindLastIV
doesn't tell you which one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just delete this file, since it looks practically the same as llvm/test/Transforms/LoopVectorize/iv-select-cmp-no-wrap.ll and doesn't seem to offer any extra value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, just one last comment. I've finished reviewing now!
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3 | ||
; RUN: opt -passes=loop-vectorize -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK-VF4IC1 --check-prefix=CHECK | ||
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=4 -S < %s | FileCheck %s --check-prefix=CHECK-VF4IC4 --check-prefix=CHECK | ||
; RUN: opt -passes=loop-vectorize -force-vector-interleave=4 -force-vector-width=1 -S < %s | FileCheck %s --check-prefix=CHECK-VF1IC4 --check-prefix=CHECK | ||
|
||
define i64 @select_icmp_const_1(ptr %a, i64 %n) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both @select_icmp_const_1
and @select_icmp_const_2
look similar to test select_icmp_nuw_nsw
in Transforms/LoopVectorize/iv-select-cmp-no-wrap.ll.
Also, I see the only difference between @select_icmp_const_1
and @select_icmp_const_2
is the operands to the select are swapped. I'm not sure having both versions really adds much value. Perhaps you can remove both of them and leave the one in iv-select-cmp-no-wrap.ll
Consider the following loop:
We can vectorize this loop if
i
is an increasing induction variable.The final reduced value will be the maximum of
i
that the conditiona[i] > b[i]
is satisfied, or the start valueinit
.This patch added new RecurKind enums - IFindLastIV and FFindLastIV.