-
Notifications
You must be signed in to change notification settings - Fork 14k
[LV] Vectorize selecting last IV of min/max element. #141431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-analysis Author: Florian Hahn (fhahn) ChangesAdd support for vectorizing loops that select the index of the minimum or It extends matching Min/Max reductions to allow in-loop users that are When creating reduction recipes, we process any reduction that has other
Depends on #140451 Patch is 140.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141431.diff 13 Files Affected:
diff --git a/llvm/include/llvm/Analysis/IVDescriptors.h b/llvm/include/llvm/Analysis/IVDescriptors.h
index d94ffa7287db3..80b4b8390749b 100644
--- a/llvm/include/llvm/Analysis/IVDescriptors.h
+++ b/llvm/include/llvm/Analysis/IVDescriptors.h
@@ -56,6 +56,13 @@ enum class RecurKind {
FindLastIV, ///< FindLast reduction with select(cmp(),x,y) where one of
///< (x,y) is increasing loop induction, and both x and y are
///< integer type.
+ FindFirstIVUMin, /// FindFirst reduction with select(icmp(),x,y) where one of
+ ///< (x,y) is a decreasing loop induction, and both x and y
+ ///< are integer type.
+ FindFirstIVSMin /// FindFirst reduction with select(icmp(),x,y) where one of
+ ///< (x,y) is a decreasing loop induction, and both x and y
+ ///< are integer type.
+
// clang-format on
// TODO: Any_of and FindLast reduction need not be restricted to integer type
// only.
@@ -160,12 +167,13 @@ class RecurrenceDescriptor {
/// Returns a struct describing whether the instruction is either a
/// Select(ICmp(A, B), X, Y), or
/// Select(FCmp(A, B), X, Y)
- /// where one of (X, Y) is an increasing loop induction variable, and the
- /// other is a PHI value.
+ /// where one of (X, Y) is an increasing (FindLast) or decreasing (FindFirst)
+ /// loop induction variable, and the other is a PHI value.
// TODO: Support non-monotonic variable. FindLast does not need be restricted
// to increasing loop induction variables.
- static InstDesc isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
- Instruction *I, ScalarEvolution &SE);
+ static InstDesc isFindIVPattern(RecurKind Kind, Loop *TheLoop,
+ PHINode *OrigPhi, Instruction *I,
+ ScalarEvolution &SE);
/// Returns a struct describing if the instruction is a
/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
@@ -259,19 +267,37 @@ class RecurrenceDescriptor {
return Kind == RecurKind::FindLastIV;
}
+ /// Returns true if the recurrence kind is of the form
+ /// select(cmp(),x,y) where one of (x,y) is an increasing or decreasing loop
+ /// induction.
+ static bool isFindIVRecurrenceKind(RecurKind Kind) {
+ return Kind == RecurKind::FindLastIV ||
+ Kind == RecurKind::FindFirstIVUMin ||
+ Kind == RecurKind::FindFirstIVSMin;
+ }
+
/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }
- /// Returns the sentinel value for FindLastIV recurrences to replace the start
- /// value.
+ /// Returns the sentinel value for FindFirstIV &FindLastIV recurrences to
+ /// replace the start value.
Value *getSentinelValue() const {
- assert(isFindLastIVRecurrenceKind(Kind) && "Unexpected recurrence kind");
Type *Ty = StartValue->getType();
- return ConstantInt::get(Ty,
- APInt::getSignedMinValue(Ty->getIntegerBitWidth()));
+ if (isFindLastIVRecurrenceKind(Kind)) {
+ return ConstantInt::get(
+ Ty, APInt::getSignedMinValue(Ty->getIntegerBitWidth()));
+ } else if (Kind == RecurKind::FindFirstIVSMin) {
+ return ConstantInt::get(
+ Ty, APInt::getSignedMaxValue(Ty->getIntegerBitWidth()));
+ } else {
+ assert(Kind == RecurKind::FindFirstIVUMin);
+ return ConstantInt::get(Ty, APInt::getMaxValue(Ty->getIntegerBitWidth()));
+ }
}
+ void setKind(RecurKind NewKind) { Kind = NewKind; }
+
/// Returns a reference to the instructions used for type-promoting the
/// recurrence.
const SmallPtrSet<Instruction *, 8> &getCastInsts() const { return CastInsts; }
@@ -303,6 +329,10 @@ class RecurrenceDescriptor {
/// AddReductionVar method, this field will be assigned the last met store.
StoreInst *IntermediateStore = nullptr;
+ /// True if this recurrence is used by another recurrence in the loop. Users
+ /// need to ensure that the final code-gen accounts for the use in the loop.
+ bool IsUsedByOtherRecurrence = false;
+
private:
// The starting value of the recurrence.
// It does not have to be zero!
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index b7c7bcab168cc..c1886935884e5 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -51,6 +51,8 @@ bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
case RecurKind::UMin:
case RecurKind::AnyOf:
case RecurKind::FindLastIV:
+ case RecurKind::FindFirstIVUMin:
+ case RecurKind::FindFirstIVSMin:
return true;
}
return false;
@@ -253,7 +255,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// Data used for determining if the recurrence has been type-promoted.
Type *RecurrenceType = Phi->getType();
SmallPtrSet<Instruction *, 4> CastInsts;
- unsigned MinWidthCastToRecurrenceType;
+ unsigned MinWidthCastToRecurrenceType = -1ull;
Instruction *Start = Phi;
bool IsSigned = false;
@@ -308,6 +310,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// This is either:
// * An instruction type other than PHI or the reduction operation.
// * A PHI in the header other than the initial PHI.
+ bool IsUsedByOtherRecurrence = false;
while (!Worklist.empty()) {
Instruction *Cur = Worklist.pop_back_val();
@@ -369,15 +372,37 @@ bool RecurrenceDescriptor::AddReductionVar(
// Any reduction instruction must be of one of the allowed kinds. We ignore
// the starting value (the Phi or an AND instruction if the Phi has been
- // type-promoted).
+ // type-promoted) and other in-loop users, if they form a FindLastIV
+ // reduction. In the latter case, the user of the IVDescriptors must account
+ // for that during codegen.
if (Cur != Start) {
ReduxDesc =
isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
ExactFPMathInst = ExactFPMathInst == nullptr
? ReduxDesc.getExactFPMathInst()
: ExactFPMathInst;
- if (!ReduxDesc.isRecurrence())
+ if (!ReduxDesc.isRecurrence()) {
+ if (isMinMaxRecurrenceKind(Kind)) {
+ // If the current recurrence is Min/Max, check if the current user is
+ // a select that is a FindLastIV reduction. During codegen, this
+ // recurrence needs to be turned into one that finds the first IV, as
+ // the value to compare against is a Min/Max recurrence.
+ auto *Sel = dyn_cast<SelectInst>(Cur);
+ if (!Sel || !Sel->getType()->isIntegerTy())
+ return false;
+ auto *OtherPhi = dyn_cast<PHINode>(Sel->getOperand(2));
+ if (!OtherPhi)
+ return false;
+ auto NewReduxDesc =
+ isRecurrenceInstr(TheLoop, OtherPhi, Cur, RecurKind::FindLastIV,
+ ReduxDesc, FuncFMF, SE);
+ if (NewReduxDesc.isRecurrence()) {
+ IsUsedByOtherRecurrence = true;
+ continue;
+ }
+ }
return false;
+ }
// FIXME: FMF is allowed on phi, but propagation is not handled correctly.
if (isa<FPMathOperator>(ReduxDesc.getPatternInst()) && !IsAPhi) {
FastMathFlags CurFMF = ReduxDesc.getPatternInst()->getFastMathFlags();
@@ -501,7 +526,7 @@ bool RecurrenceDescriptor::AddReductionVar(
// pattern or more than just a select and cmp. Zero implies that we saw a
// llvm.min/max intrinsic, which is always OK.
if (isMinMaxRecurrenceKind(Kind) && NumCmpSelectPatternInst != 2 &&
- NumCmpSelectPatternInst != 0)
+ NumCmpSelectPatternInst != 0 && !IsUsedByOtherRecurrence)
return false;
if (isAnyOfRecurrenceKind(Kind) && NumCmpSelectPatternInst != 1)
@@ -533,7 +558,13 @@ bool RecurrenceDescriptor::AddReductionVar(
ExitInstruction = cast<Instruction>(IntermediateStore->getValueOperand());
}
- if (!FoundStartPHI || !FoundReduxOp || !ExitInstruction)
+ if (!FoundStartPHI || !FoundReduxOp)
+ return false;
+
+ if (IsUsedByOtherRecurrence) {
+ if (ExitInstruction)
+ return false;
+ } else if (!ExitInstruction)
return false;
const bool IsOrdered =
@@ -584,8 +615,9 @@ bool RecurrenceDescriptor::AddReductionVar(
// without needing a white list of instructions to ignore.
// This may also be useful for the inloop reductions, if it can be
// kept simple enough.
- collectCastInstrs(TheLoop, ExitInstruction, RecurrenceType, CastInsts,
- MinWidthCastToRecurrenceType);
+ if (ExitInstruction)
+ collectCastInstrs(TheLoop, ExitInstruction, RecurrenceType, CastInsts,
+ MinWidthCastToRecurrenceType);
// We found a reduction var if we have reached the original phi node and we
// only have a single instruction with out-of-loop users.
@@ -598,7 +630,7 @@ bool RecurrenceDescriptor::AddReductionVar(
FMF, ExactFPMathInst, RecurrenceType, IsSigned,
IsOrdered, CastInsts, MinWidthCastToRecurrenceType);
RedDes = RD;
-
+ RedDes.IsUsedByOtherRecurrence = IsUsedByOtherRecurrence;
return true;
}
@@ -683,8 +715,9 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
// value of the data type or a non-constant value by using mask and multiple
// reduction operations.
RecurrenceDescriptor::InstDesc
-RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
- Instruction *I, ScalarEvolution &SE) {
+RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop,
+ PHINode *OrigPhi, Instruction *I,
+ ScalarEvolution &SE) {
// TODO: Support the vectorization of FindLastIV when the reduction phi is
// used by more than one select instruction. This vectorization is only
// performed when the SCEV of each increasing induction variable used by the
@@ -700,7 +733,7 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
m_Value(NonRdxPhi)))))
return InstDesc(false, I);
- auto IsIncreasingLoopInduction = [&](Value *V) {
+ auto IsSupportedLoopInduction = [&](Value *V, RecurKind Kind) {
Type *Ty = V->getType();
if (!SE.isSCEVable(Ty))
return false;
@@ -710,21 +743,39 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
return false;
const SCEV *Step = AR->getStepRecurrence(SE);
- if (!SE.isKnownPositive(Step))
+ if (Kind == RecurKind::FindFirstIVUMin ||
+ Kind == RecurKind::FindFirstIVSMin) {
+ if (!SE.isKnownNegative(Step))
+ return false;
+ } else if (!SE.isKnownPositive(Step))
return false;
const ConstantRange IVRange = SE.getSignedRange(AR);
unsigned NumBits = Ty->getIntegerBitWidth();
- // Keep the minimum value of the recurrence type as the sentinel value.
- // The maximum acceptable range for the increasing induction variable,
- // called the valid range, will be defined as
+ // Keep the minimum (FindLast) or maximum (FindFirst) value of the
+ // recurrence type as the sentinel value. The maximum acceptable range for
+ // the induction variable, called the valid range, will be defined as
// [<sentinel value> + 1, <sentinel value>)
- // where <sentinel value> is SignedMin(<recurrence type>)
+ // where <sentinel value> is SignedMin(<recurrence type>) for FindLast or
+ // UnsignedMax(<recurrence type>) for FindFirst.
// TODO: This range restriction can be lifted by adding an additional
// virtual OR reduction.
- const APInt Sentinel = APInt::getSignedMinValue(NumBits);
- const ConstantRange ValidRange =
- ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
+ const APInt Sentinel = Kind == RecurKind::FindFirstIVUMin
+ ? APInt::getMaxValue(NumBits)
+ : (Kind == RecurKind::FindFirstIVSMin
+ ? APInt::getSignedMaxValue(NumBits)
+ : APInt::getSignedMinValue(NumBits));
+ ConstantRange ValidRange = ConstantRange::getEmpty(NumBits);
+ if (Kind == RecurKind::FindFirstIVSMin)
+ ValidRange =
+ ConstantRange::getNonEmpty(APInt::getSignedMinValue(NumBits),
+ APInt::getSignedMaxValue(NumBits) - 1);
+ else {
+ const APInt Sentinel = Kind == RecurKind::FindFirstIVUMin
+ ? APInt::getMaxValue(NumBits)
+ : APInt::getSignedMinValue(NumBits);
+ ValidRange = ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
+ }
LLVM_DEBUG(dbgs() << "LV: FindLastIV valid range is " << ValidRange
<< ", and the signed range of " << *AR << " is "
<< IVRange << "\n");
@@ -736,11 +787,18 @@ RecurrenceDescriptor::isFindLastIVPattern(Loop *TheLoop, PHINode *OrigPhi,
// We are looking for selects of the form:
// select(cmp(), phi, increasing_loop_induction) or
// select(cmp(), increasing_loop_induction, phi)
- // TODO: Support for monotonically decreasing induction variable
- if (!IsIncreasingLoopInduction(NonRdxPhi))
+ if (Kind == RecurKind::FindLastIV) {
+ if (IsSupportedLoopInduction(NonRdxPhi, Kind))
+ return InstDesc(I, Kind);
return InstDesc(false, I);
+ }
+
+ if (IsSupportedLoopInduction(NonRdxPhi, RecurKind::FindFirstIVSMin))
+ return InstDesc(I, RecurKind::FindFirstIVSMin);
+ if (IsSupportedLoopInduction(NonRdxPhi, RecurKind::FindFirstIVUMin))
+ return InstDesc(I, RecurKind::FindFirstIVUMin);
- return InstDesc(I, RecurKind::FindLastIV);
+ return InstDesc(false, I);
}
RecurrenceDescriptor::InstDesc
@@ -875,8 +933,8 @@ RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
Kind == RecurKind::Add || Kind == RecurKind::Mul)
return isConditionalRdxPattern(Kind, I);
- if (isFindLastIVRecurrenceKind(Kind) && SE)
- return isFindLastIVPattern(L, OrigPhi, I, *SE);
+ if (isFindIVRecurrenceKind(Kind) && SE)
+ return isFindIVPattern(Kind, L, OrigPhi, I, *SE);
[[fallthrough]];
case Instruction::FCmp:
case Instruction::ICmp:
@@ -990,6 +1048,12 @@ bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
LLVM_DEBUG(dbgs() << "Found a FindLastIV reduction PHI." << *Phi << "\n");
return true;
}
+ if (AddReductionVar(Phi, RecurKind::FindFirstIVUMin, TheLoop, FMF, RedDes, DB,
+ AC, DT, SE)) {
+ LLVM_DEBUG(dbgs() << "Found a FindFirstV reduction PHI." << *Phi << "\n");
+ return true;
+ }
+
if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {
LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
@@ -1153,6 +1217,8 @@ unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
case RecurKind::SMin:
case RecurKind::UMax:
case RecurKind::UMin:
+ case RecurKind::FindFirstIVUMin:
+ case RecurKind::FindFirstIVSMin:
return Instruction::ICmp;
case RecurKind::FMax:
case RecurKind::FMin:
diff --git a/llvm/lib/Transforms/Utils/LoopUtils.cpp b/llvm/lib/Transforms/Utils/LoopUtils.cpp
index 2fff9521017ff..973f33685beed 100644
--- a/llvm/lib/Transforms/Utils/LoopUtils.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -1244,12 +1244,16 @@ Value *llvm::createAnyOfReduction(IRBuilderBase &Builder, Value *Src,
Value *llvm::createFindLastIVReduction(IRBuilderBase &Builder, Value *Src,
Value *Start,
const RecurrenceDescriptor &Desc) {
- assert(RecurrenceDescriptor::isFindLastIVRecurrenceKind(
- Desc.getRecurrenceKind()) &&
- "Unexpected reduction kind");
+ assert(
+ RecurrenceDescriptor::isFindIVRecurrenceKind(Desc.getRecurrenceKind()) &&
+ "Unexpected reduction kind");
Value *Sentinel = Desc.getSentinelValue();
Value *MaxRdx = Src->getType()->isVectorTy()
- ? Builder.CreateIntMaxReduce(Src, true)
+ ? (Desc.getRecurrenceKind() == RecurKind::FindLastIV
+ ? Builder.CreateIntMaxReduce(Src, true)
+ : Builder.CreateIntMinReduce(
+ Src, Desc.getRecurrenceKind() ==
+ RecurKind::FindFirstIVSMin))
: Src;
// Correct the final reduction result back to the start value if the maximum
// reduction is sentinel value.
@@ -1345,8 +1349,8 @@ Value *llvm::createSimpleReduction(IRBuilderBase &Builder, Value *Src,
Value *llvm::createSimpleReduction(VectorBuilder &VBuilder, Value *Src,
RecurKind Kind) {
assert(!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
- !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind) &&
- "AnyOf or FindLastIV reductions are not supported.");
+ !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) &&
+ "AnyOf, FindFirstIV and FindLastIV reductions are not supported.");
Intrinsic::ID Id = getReductionIntrinsicID(Kind);
auto *SrcTy = cast<VectorType>(Src->getType());
Type *SrcEltTy = SrcTy->getElementType();
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 8636550d4f644..1a8016d93221a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5176,7 +5176,7 @@ LoopVectorizationCostModel::selectInterleaveCount(VPlan &Plan, ElementCount VF,
const RecurrenceDescriptor &RdxDesc = Reduction.second;
RecurKind RK = RdxDesc.getRecurrenceKind();
return RecurrenceDescriptor::isAnyOfRecurrenceKind(RK) ||
- RecurrenceDescriptor::isFindLastIVRecurrenceKind(RK);
+ RecurrenceDescriptor::isFindIVRecurrenceKind(RK);
});
if (HasSelectCmpReductions) {
LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
@@ -7549,7 +7549,7 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
auto *EpiRedResult = dyn_cast<VPInstruction>(R);
if (!EpiRedResult ||
(EpiRedResult->getOpcode() != VPInstruction::ComputeReductionResult &&
- EpiRedResult->getOpcode() != VPInstruction::ComputeFindLastIVResult))
+ EpiRedResult->getOpcode() != VPInstruction::ComputeFindIVResult))
return;
auto *EpiRedHeaderPhi =
@@ -7567,7 +7567,7 @@ static void fixReductionScalarResumeWhenVectorizingEpilog(
"AnyOf expected to start by comparing main resume value to original "
"start value");
MainResumeValue = Cmp->getOperand(0);
- } else if (RecurrenceDescriptor::isFindLastIVRecurrenceKind(
+ } else if (RecurrenceDescriptor::isFindIVRecurrenceKind(
RdxDesc.getRecurrenceKind())) {
using namespace llvm::PatternMatch;
Value *Cmp, *OrigResumeV, *CmpOp;
@@ -8499,7 +8499,7 @@ bool VPRecipeBuilder::getScaledReductions(
Instruction *PHI, Instruction *RdxExitInstr, VFRange &Range,
SmallVectorImpl<std::pair<PartialReductionChain, unsigned>> &Chains) {
- if (!CM.TheLoop->contains(RdxExitInstr))
+ if (!RdxExitInstr || !CM.TheLoop->contains(RdxExitInstr))
return false;
auto *Update = dyn_cast<BinaryOperator>(RdxExitInstr);
@@ -9360,7 +9360,7 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
RecurKind Kind = RdxDesc.getRecurrenceKind();
assert(
!RecurrenceDescriptor::isAnyOfRecurrenceKind(Kind) &&
- !RecurrenceDescriptor::isFindLastIVRecurrenceKind(Kind) &&
+ !RecurrenceDescriptor::isFindIVRecurrenceKind(Kind) &&
"AnyOf and FindLast reductions are not allowed for in-loop reductions");
// Collect the chain of "link" recipes for the reduction starting at PhiR.
@@ -9517,7 +9517,7 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
(cast<VPInstru...
[truncated]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea that this patch avoids introducing a new RecurKind specifically for min/max with index.
However, I believe it would be more appropriate not to reuse AddReductionVars when idiom detection. The scalar phi for min/max doesn't really align with the current definition of a reduction. A reduction phi shouldn't be used by operations inside the loop that are not part of the recurrence chain, and it also needs to have users outside the loop.
So I think it should be detected independently.
Of course, a better approach would be to first identify the recurrence chain, and then determine whether it's a reduction.
Here's the source code we use for the detection part: #141467
As for the IR generation, I'd like to keep it internal for a while longer to test things before releasing it.
Yep that's a good point! I was initially planning to split off detection of the pattern separately as follow-up, but it seems better to do it to start with. I started sharing a few refactoring patches to make it simpler to create reduction phi recipes in VPlan (final one in the series is #142322) and will update the PR here in the next few days. |
0554871
to
9019ec3
Compare
Updated the latest version to perform the check for the pattern in VPlan. The main idea is that we allow unclassified phis in legal and hand off the VPlan to check if we can the remaining VPWidenPHIRecipes to min/max reductions used by FindLast/FirstIV. The current patch just supports the FindLastIV case, to not make it depend on #140451. It now depends on #142322 to make it easy to create VPReductionPHIRecipes from VPlan w/o RecurrenceDescriptor. WDYT? |
#142322 is good and will definitely be useful in the future. However, I’m a bit confused about VPlanTransforms::legalizeUnclassifiedPhis. Do we have any plans to move legality checks into the VPlan transformation phase? What are the benefits of doing this? Separately, I submitted #141467 and #142335, but they haven’t received any review yet. I’d like to understand if there are any major design issues with my patches that are preventing them from being accepted. |
We already do some legality checks in VPlan. The main one is for fixed-order recurrences, where we bail out if we fail to re-order the users of a recurrence. A number of VPlan transformations also check legality on VPlan before applying a transformation (including convertToSingleScalar). There are also some transforms that are required for legality, but work independently of LoopVectorizationLegality (e.g. dropPoisonGeneratingRecipes and the existing adjustRecipesForReductions, that converts recipes as needed for in-loop reductions). I see the motivation similar to moving the cost model to be VPlan-based. Performing legality checks in VPlan allows combining checks directly with the transformations, which can be simpler/more direct due to not first requiring to collect the information separately in one place and then trying to translate it later to the already constructed VPlan based on IR references. It can also encourage more composable recipes. Many/most improvements over the last few years built on top of VPlan instead of the existing legacy code, and overall the approach has been very successful in my opinion. In general, being able to perform legality checks directly on VPlan can also allow more flexibility, as we do not have to commit to a particular strategy (or check all possible strategies) early on (e.g. tail-folding vs. not folding the tail), and some legality issues can be resolved by other transforms that can be applied directly (e.g sinking/hoisting for fixed-order recurrences, peeling, distribution). I think new capabilities serve as a great opportunity to improve and evolve the infrastructure to make sure the current VPlan representation is flexible and expressive enough to drive them
In our current setup, the transform happens after adjustReciepsForReductions. I think we can’t easily create in-loop reductions after that as-is. But I think this is just a limitation of the current order. As follow-ups, we should be able to improve the order like this: 1) Add ComputeReductionResults and update exit users, 2) Legalize the remaining phis, and 3) Convert recipes for in-loop reductions. As for tail folding with exit users, this won’t work with the current canFoldTailByMasking. But as I mentioned earlier, this could be a good chance to rethink/evolve how and when we commit to tail folding and check its legality.
Yep thanks! I originally shared the patch to get the ball rolling and to better understand if performing the checks directly based on VPlan would be feasible as an option, so we can consider the trade-offs more concretely. #141467 and #142335 fit into the current system. One potential drawback is that it requires carrying state collected in IVDescriptors via LoopVectorizationLegality to LoopVectorize, which adds complexity. For this particular case, there may still be some limitations to VPlan currently that make it desirable not to use a VPlan-first approach here, but as I mentioned earlier, it may serve as a good opportunity to evolve the infrastructure and pave the road for further modularization/VPlan-ization of more legality checks. |
Instead of looking up the narrower reduction type via getRecurrenceType we can generate the needed extend directly at constructiond re-use the truncated value from the loop.
Replace VPReductionPHIRecipe with RecurKind in VPReductionPHIRecipe, as all VPlan analyses and codegen only require the recurrence kind. This enables creating new VPReductionPHIRecipe directly in LV, without needing to construction a whole RecurrenceDescriptor object. Depends on llvm#141860 llvm#141932 llvm#142290 llvm#142291
Add support for vectorizing loops that select the index of the minimum or maximum element. The patch implements vectorizing those patterns by combining Min/Max and FindFirstIV reductions. It extends matching Min/Max reductions to allow in-loop users that are FindLastIV reductions. It records a flag indicating that the Min/Max reduction is used by another reduction. When creating reduction recipes, we process any reduction that has other reduction users. The reduction using the min/max reduction needs adjusting to compute the correct result: 1. We need to find the first IV for which the condition based on the min/max reduction is true, 2. Compare the partial min/max reduction result to its final value and, 3. Select the lanes of the partial FindLastIV reductions which correspond to the lanes matching the min/max reduction result.
9019ec3
to
e064211
Compare
Add support for vectorizing loops that select the index of the minimum or
maximum element. The patch implements vectorizing those patterns by
combining Min/Max and FindFirstIV reductions.
It extends matching Min/Max reductions to allow in-loop users that are
FindLastIV reductions. It records a flag indicating that the Min/Max
reduction is used by another reduction.
When creating reduction recipes, we process any reduction that has other
reduction users. The reduction using the min/max reduction needs adjusting
to compute the correct result:
min/max reduction is true,
to the lanes matching the min/max reduction result.
Depends on #140451