6 changes: 6 additions & 0 deletions llvm/include/llvm/Analysis/IVDescriptors.h
@@ -28,6 +28,7 @@ class Loop;
class PredicatedScalarEvolution;
class ScalarEvolution;
class SCEV;
class SCEVAddRecExpr;
class StoreInst;

/// These are the kinds of recurrences that we support.
@@ -310,6 +311,11 @@ class RecurrenceDescriptor {
isFindLastIVRecurrenceKind(Kind);
}

/// Returns true if \p AR's range is valid for either FindFirstIV or
/// FindLastIV reductions i.e. if the sentinel value is outside \p AR's range.
Comment on lines +314 to +315 (Collaborator):

The sentinel value should be outside AR's range when FindFirstIV or FindLastIV works with a condition that may be false for all values inside AR's range, since in that case the out-of-bounds sentinel conveys that answer. But for argmin/argmax with strict/FirstIV or non-strict/LastIV, the condition is guaranteed to be true at least once inside AR's range. So having a sentinel inside AR's range should be fine?

Follow-up thought: even if the condition may be false for all values inside AR's range, when the final value to return in that case coincides with the natural sentinel inside AR's range, it could potentially also be optimized as above. E.g., suppose FindLastIV works with an arbitrary condition a[i]>0 in the range [0,100], where the value to return is 0 if all a's are zero.
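
To make the sentinel argument concrete, here is a minimal scalar sketch of a FindLastIV-style reduction. It is a plain C++ model, not LLVM code: `findLastPositive` and its parameters are hypothetical names, and the condition `a[i] > 0` is taken from the example above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical scalar model of a FindLastIV reduction (not LLVM code).
// The loop keeps a running max over the indices where the condition holds,
// starting from Sentinel. Afterwards, a result still equal to Sentinel is
// interpreted as "condition never held" and mapped to Start.
int32_t findLastPositive(const std::vector<int32_t> &A, int32_t Start,
                         int32_t Sentinel) {
  int32_t Rdx = Sentinel;
  for (int32_t I = 0; I < static_cast<int32_t>(A.size()); ++I)
    Rdx = A[I] > 0 ? std::max(Rdx, I) : Rdx;
  return Rdx == Sentinel ? Start : Rdx;
}
```

With `Sentinel` set to the signed minimum (outside the index range) the "never held" case is always distinguishable. With `Sentinel == 0` inside the range, a match at index 0 and "no match" both leave the reduction at 0, which is harmless only when `Start` is also 0; for example, `findLastPositive({5, 0, 0}, -1, 0)` returns -1 in this model although index 0 matched.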

Another thought: argmin/argmax is interesting not only because its condition is guaranteed to hold at least once in the range, but also because it supports FindFirstIV naturally (without negating the IV, which conceptually still means it's a FindLastIV but on a decreasing derived IV) by relying on the running min/max recurrence, whose strict comparison precludes repeated occurrences. This relates to a discussion we had on some earlier patch, where adding another, dependent (boolean "found") recurrence was found to have a significant negative impact.
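
The strict versus non-strict point can be illustrated with a small scalar model. This is a sketch only, not LLVM code: `argMinIndex` is a hypothetical helper, and the start values (signed max for the minimum, -1 for the index) are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical scalar model of an argmin loop (not LLVM code).
// Returns the index recorded alongside the running minimum.
int32_t argMinIndex(const std::vector<int32_t> &A, bool Strict) {
  int32_t Min = std::numeric_limits<int32_t>::max(); // assumed start value
  int32_t Idx = -1;                                  // assumed start value
  for (int32_t I = 0; I < static_cast<int32_t>(A.size()); ++I) {
    // A strict compare ignores later repeats of the current minimum;
    // a non-strict compare overwrites the index on every repeat.
    bool Update = Strict ? A[I] < Min : A[I] <= Min;
    if (Update) {
      Min = A[I];
      Idx = I;
    }
  }
  return Idx;
}
```

For the input `{3, 1, 4, 1, 5}` the strict variant records index 1 (first occurrence of the minimum) and the non-strict variant records index 3 (last occurrence), without any extra "found" recurrence.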

Contributor Author:

I am not sure the condition currently must hold at least once, unless I am missing something, because there are no restrictions on the start values. E.g., the minimum reduction does not have to start at the max value, so the IV may never get selected in the loop, and the minimum index reduction may start at a value that's different from the IV start value.

But IV start == min.idx start and condition known to be true at least once are good to optimize separately. Note that for the initial FindLastIV, the range is checked during the IVDescriptor analysis.
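
A small scalar sketch of the start-value concern, under the assumption that both start values are arbitrary loop inputs. This is not LLVM code: `minAndIndex` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical scalar model (not LLVM code): a min reduction and its index
// reduction, each with an independent, caller-provided start value.
std::pair<int32_t, int32_t> minAndIndex(const std::vector<int32_t> &A,
                                        int32_t MinStart, int32_t IdxStart) {
  int32_t Min = MinStart;
  int32_t Idx = IdxStart;
  for (int32_t I = 0; I < static_cast<int32_t>(A.size()); ++I) {
    if (A[I] < Min) { // only true if an element beats the current minimum
      Min = A[I];
      Idx = I;
    }
  }
  return {Min, Idx};
}
```

With `MinStart == 0` and input `{3, 1, 4}` the compare never holds, so the returned index is still `IdxStart` (for example 42) and no IV value was ever selected. Starting the minimum at the signed max value makes the compare fire for this input and yields index 1.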

static bool isValidIVRangeForFindIV(const SCEVAddRecExpr *AR, bool IsSigned,
bool IsFindFirstIV, ScalarEvolution &SE);

/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }
73 changes: 39 additions & 34 deletions llvm/lib/Analysis/IVDescriptors.cpp
@@ -715,6 +715,36 @@ RecurrenceDescriptor::isAnyOfPattern(Loop *Loop, PHINode *OrigPhi,
return InstDesc(I, RecurKind::AnyOf);
}

bool RecurrenceDescriptor::isValidIVRangeForFindIV(const SCEVAddRecExpr *AR,
bool IsSigned,
bool IsFindFirstIV,
ScalarEvolution &SE) {
const ConstantRange IVRange =
IsSigned ? SE.getSignedRange(AR) : SE.getUnsignedRange(AR);
unsigned NumBits = AR->getType()->getIntegerBitWidth();
ConstantRange ValidRange = ConstantRange::getEmpty(NumBits);

if (IsFindFirstIV) {
if (IsSigned)
ValidRange =
ConstantRange::getNonEmpty(APInt::getSignedMinValue(NumBits),
APInt::getSignedMaxValue(NumBits) - 1);
else
ValidRange = ConstantRange::getNonEmpty(APInt::getMinValue(NumBits),
APInt::getMaxValue(NumBits) - 1);
} else {
APInt Sentinel = IsSigned ? APInt::getSignedMinValue(NumBits)
: APInt::getMinValue(NumBits);
ValidRange = ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
}

LLVM_DEBUG(dbgs() << "LV: " << (IsFindFirstIV ? "FindFirstIV" : "FindLastIV")
<< " valid range is " << ValidRange << ", and the range of "
<< *AR << " is " << IVRange << "\n");

return ValidRange.contains(IVRange);
}

// We are looking for loops that do something like this:
// int r = 0;
// for (int i = 0; i < n; i++) {
@@ -792,49 +822,24 @@ RecurrenceDescriptor::isFindIVPattern(RecurKind Kind, Loop *TheLoop,
// [Signed|Unsigned]Max(<recurrence type>) for FindFirstIV.
// TODO: This range restriction can be lifted by adding an additional
// virtual OR reduction.
auto CheckRange = [&](bool IsSigned) {
const ConstantRange IVRange =
IsSigned ? SE.getSignedRange(AR) : SE.getUnsignedRange(AR);
unsigned NumBits = Ty->getIntegerBitWidth();
ConstantRange ValidRange = ConstantRange::getEmpty(NumBits);
if (isFindLastIVRecurrenceKind(Kind)) {
APInt Sentinel = IsSigned ? APInt::getSignedMinValue(NumBits)
: APInt::getMinValue(NumBits);
ValidRange = ConstantRange::getNonEmpty(Sentinel + 1, Sentinel);
} else {
if (IsSigned)
ValidRange =
ConstantRange::getNonEmpty(APInt::getSignedMinValue(NumBits),
APInt::getSignedMaxValue(NumBits) - 1);
else
ValidRange = ConstantRange::getNonEmpty(
APInt::getMinValue(NumBits), APInt::getMaxValue(NumBits) - 1);
}

LLVM_DEBUG(dbgs() << "LV: "
<< (isFindLastIVRecurrenceKind(Kind) ? "FindLastIV"
: "FindFirstIV")
<< " valid range is " << ValidRange
<< ", and the range of " << *AR << " is " << IVRange
<< "\n");

// Ensure the induction variable does not wrap around by verifying that
// its range is fully contained within the valid range.
return ValidRange.contains(IVRange);
};
bool IsFindFirstIV = isFindFirstIVRecurrenceKind(Kind);
if (isFindLastIVRecurrenceKind(Kind)) {
if (CheckRange(true))
if (RecurrenceDescriptor::isValidIVRangeForFindIV(
cast<SCEVAddRecExpr>(AR), /*IsSigned=*/true, IsFindFirstIV, SE))
return RecurKind::FindLastIVSMax;
if (CheckRange(false))
if (RecurrenceDescriptor::isValidIVRangeForFindIV(
cast<SCEVAddRecExpr>(AR), /*IsSigned=*/false, IsFindFirstIV, SE))
return RecurKind::FindLastIVUMax;
return std::nullopt;
}
assert(isFindFirstIVRecurrenceKind(Kind) &&
"Kind must either be a FindLastIV or FindFirstIV");

if (CheckRange(true))
if (RecurrenceDescriptor::isValidIVRangeForFindIV(
cast<SCEVAddRecExpr>(AR), /*IsSigned=*/true, IsFindFirstIV, SE))
return RecurKind::FindFirstIVSMin;
if (CheckRange(false))
if (RecurrenceDescriptor::isValidIVRangeForFindIV(
cast<SCEVAddRecExpr>(AR), /*IsSigned=*/false, IsFindFirstIV, SE))
return RecurKind::FindFirstIVUMin;
return std::nullopt;
};
3 changes: 1 addition & 2 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -8587,8 +8587,7 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(

// Apply mandatory transformation to handle reductions with multiple in-loop
// uses if possible, bail out otherwise.
if (!VPlanTransforms::runPass(VPlanTransforms::handleMultiUseReductions,
*Plan))
if (!VPlanTransforms::handleMultiUseReductions(*Plan, *PSE.getSE(), OrigLoop))
return nullptr;
// Apply mandatory transformation to handle FP maxnum/minnum reduction with
// NaNs if possible, bail out otherwise.
74 changes: 63 additions & 11 deletions llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -13,13 +13,16 @@

#include "LoopVectorizationPlanner.h"
#include "VPlan.h"
#include "VPlanAnalysis.h"
#include "VPlanCFG.h"
#include "VPlanDominatorTree.h"
#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
#include "VPlanUtils.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopIterator.h"
#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/MDBuilder.h"
#include "llvm/Transforms/Utils/LoopUtils.h"
@@ -997,7 +1000,48 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) {
return true;
}

bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) {
/// Try to convert FindLastIV to FindFirstIV reduction when using a strict
/// predicate. Returns the new FindFirstIVPhiR on success, nullptr on failure.
static VPReductionPHIRecipe *
tryConvertToFindFirstIV(VPlan &Plan, VPReductionPHIRecipe *FindLastIVPhiR,
VPValue *IVOp, ScalarEvolution &SE, const Loop *L) {
Type *Ty = VPTypeAnalysis(Plan).inferScalarType(FindLastIVPhiR);
unsigned NumBits = Ty->getIntegerBitWidth();

// Determine the reduction kind and sentinel based on the IV range.
RecurKind NewKind;
VPValue *NewSentinel;
auto *AR = cast<SCEVAddRecExpr>(vputils::getSCEVExprForVPValue(IVOp, SE, L));
if (RecurrenceDescriptor::isValidIVRangeForFindIV(
AR, /*IsSigned=*/true, /*IsFindFirstIV=*/true, SE)) {
NewKind = RecurKind::FindFirstIVSMin;
NewSentinel = Plan.getConstantInt(APInt::getSignedMaxValue(NumBits));
} else if (RecurrenceDescriptor::isValidIVRangeForFindIV(
AR, /*IsSigned=*/false, /*IsFindFirstIV=*/true, SE)) {
NewKind = RecurKind::FindFirstIVUMin;
NewSentinel = Plan.getConstantInt(APInt::getMaxValue(NumBits));
} else {
return nullptr;
}

// Create the new FindFirstIV reduction recipe.
assert(!FindLastIVPhiR->isInLoop() && !FindLastIVPhiR->isOrdered());
ReductionStyle Style = RdxUnordered{FindLastIVPhiR->getVFScaleFactor()};
auto *FindFirstIVPhiR =
new VPReductionPHIRecipe(nullptr, NewKind, *NewSentinel, Style,
FindLastIVPhiR->hasUsesOutsideReductionChain());
FindFirstIVPhiR->addOperand(FindLastIVPhiR->getBackedgeValue());

FindFirstIVPhiR->insertBefore(FindLastIVPhiR);
VPInstruction *FindLastIVResult =
findUserOf<VPInstruction::ComputeFindIVResult>(FindLastIVPhiR);
FindLastIVPhiR->replaceAllUsesWith(FindFirstIVPhiR);
FindLastIVResult->setOperand(2, NewSentinel);
return FindFirstIVPhiR;
}

bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan, ScalarEvolution &SE,
const Loop *L) {
for (auto &PhiR : make_early_inc_range(
Plan.getVectorLoopRegion()->getEntryBasicBlock()->phis())) {
auto *MinMaxPhiR = dyn_cast<VPReductionPHIRecipe>(&PhiR);
@@ -1080,33 +1124,41 @@ bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) {
FindIVPhiR->getRecurrenceKind()))
return false;

assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"cannot handle inloop/ordered reductions yet");

// TODO: Support cases where IVOp is the IV increment.
if (!match(IVOp, m_TruncOrSelf(m_VPValue(IVOp))) ||
!isa<VPWidenIntOrFpInductionRecipe>(IVOp))
return false;

CmpInst::Predicate RdxPredicate = [RdxKind]() {
// Check if the predicate is compatible with the reduction kind.
bool IsValidPredicate = [RdxKind, Pred]() {
switch (RdxKind) {
case RecurKind::UMin:
return CmpInst::ICMP_UGE;
return Pred == CmpInst::ICMP_UGE || Pred == CmpInst::ICMP_UGT;
case RecurKind::UMax:
return CmpInst::ICMP_ULE;
return Pred == CmpInst::ICMP_ULE || Pred == CmpInst::ICMP_ULT;
case RecurKind::SMax:
return CmpInst::ICMP_SLE;
return Pred == CmpInst::ICMP_SLE || Pred == CmpInst::ICMP_SLT;
case RecurKind::SMin:
return CmpInst::ICMP_SGE;
return Pred == CmpInst::ICMP_SGE || Pred == CmpInst::ICMP_SGT;
default:
llvm_unreachable("unhandled recurrence kind");
}
}();

// TODO: Strict predicates need to find the first IV value for which the
// predicate holds, not the last.
if (Pred != RdxPredicate)
if (!IsValidPredicate)
return false;

assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"cannot handle inloop/ordered reductions yet");
// For strict predicates, try to convert FindLastIV to
// FindFirstIV.
bool IsStrictPredicate = ICmpInst::isLT(Pred) || ICmpInst::isGT(Pred);
if (IsStrictPredicate) {
FindIVPhiR = tryConvertToFindFirstIV(Plan, FindIVPhiR, IVOp, SE, L);
if (!FindIVPhiR)
return false;
}

// The reduction using MinMaxPhiR needs adjusting to compute the correct
// result:
1 change: 1 addition & 0 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -163,6 +163,7 @@ bool VPRecipeBase::mayHaveSideEffects() const {
return cast<VPExpressionRecipe>(this)->mayHaveSideEffects();
case VPDerivedIVSC:
case VPFirstOrderRecurrencePHISC:
case VPReductionPHISC:
case VPPredInstPHISC:
case VPVectorEndPointerSC:
return false;
7 changes: 4 additions & 3 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -146,9 +146,10 @@ struct VPlanTransforms {
const TargetLibraryInfo &TLI);

/// Try to legalize reductions with multiple in-loop uses. Currently only
/// min/max reductions used by FindLastIV reductions are supported. Otherwise
/// return false.
static bool handleMultiUseReductions(VPlan &Plan);
/// min/max reductions used by FindLastIV and FindFirstIV reductions are
/// supported. Otherwise return false.
static bool handleMultiUseReductions(VPlan &Plan, ScalarEvolution &SE,
const Loop *L);

/// Try to have all users of fixed-order recurrences appear after the recipe
/// defining their previous value, by either sinking users or hoisting recipes