-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[VPlan] Detect and create partial reductions in VPlan. (NFCI) #167851
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesAs a first step, move the existing partial reduction detection logic to With this, partial reductions are detected and created together in a This allows forming partial reductions and bundling them up if Currently incldues #167846 and #167506 Patch is 187.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/167851.diff 11 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 835b0995cc4fc..cd614e5e877e7 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7969,177 +7969,7 @@ VPReplicateRecipe *VPRecipeBuilder::handleReplication(VPInstruction *VPI,
return Recipe;
}
-/// Find all possible partial reductions in the loop and track all of those that
-/// are valid so recipes can be formed later.
-void VPRecipeBuilder::collectScaledReductions(VFRange &Range) {
- // Find all possible partial reductions.
- SmallVector<std::pair<PartialReductionChain, unsigned>>
- PartialReductionChains;
- for (const auto &[Phi, RdxDesc] : Legal->getReductionVars()) {
- getScaledReductions(Phi, RdxDesc.getLoopExitInstr(), Range,
- PartialReductionChains);
- }
-
- // A partial reduction is invalid if any of its extends are used by
- // something that isn't another partial reduction. This is because the
- // extends are intended to be lowered along with the reduction itself.
-
- // Build up a set of partial reduction ops for efficient use checking.
- SmallPtrSet<User *, 4> PartialReductionOps;
- for (const auto &[PartialRdx, _] : PartialReductionChains)
- PartialReductionOps.insert(PartialRdx.ExtendUser);
-
- auto ExtendIsOnlyUsedByPartialReductions =
- [&PartialReductionOps](Instruction *Extend) {
- return all_of(Extend->users(), [&](const User *U) {
- return PartialReductionOps.contains(U);
- });
- };
-
- // Check if each use of a chain's two extends is a partial reduction
- // and only add those that don't have non-partial reduction users.
- for (auto Pair : PartialReductionChains) {
- PartialReductionChain Chain = Pair.first;
- if (ExtendIsOnlyUsedByPartialReductions(Chain.ExtendA) &&
- (!Chain.ExtendB || ExtendIsOnlyUsedByPartialReductions(Chain.ExtendB)))
- ScaledReductionMap.try_emplace(Chain.Reduction, Pair.second);
- }
-
- // Check that all partial reductions in a chain are only used by other
- // partial reductions with the same scale factor. Otherwise we end up creating
- // users of scaled reductions where the types of the other operands don't
- // match.
- for (const auto &[Chain, Scale] : PartialReductionChains) {
- auto AllUsersPartialRdx = [ScaleVal = Scale, this](const User *U) {
- auto *UI = cast<Instruction>(U);
- if (isa<PHINode>(UI) && UI->getParent() == OrigLoop->getHeader()) {
- return all_of(UI->users(), [ScaleVal, this](const User *U) {
- auto *UI = cast<Instruction>(U);
- return ScaledReductionMap.lookup_or(UI, 0) == ScaleVal;
- });
- }
- return ScaledReductionMap.lookup_or(UI, 0) == ScaleVal ||
- !OrigLoop->contains(UI->getParent());
- };
- if (!all_of(Chain.Reduction->users(), AllUsersPartialRdx))
- ScaledReductionMap.erase(Chain.Reduction);
- }
-}
-
-bool VPRecipeBuilder::getScaledReductions(
- Instruction *PHI, Instruction *RdxExitInstr, VFRange &Range,
- SmallVectorImpl<std::pair<PartialReductionChain, unsigned>> &Chains) {
- if (!CM.TheLoop->contains(RdxExitInstr))
- return false;
-
- auto *Update = dyn_cast<BinaryOperator>(RdxExitInstr);
- if (!Update)
- return false;
-
- Value *Op = Update->getOperand(0);
- Value *PhiOp = Update->getOperand(1);
- if (Op == PHI)
- std::swap(Op, PhiOp);
-
- // Try and get a scaled reduction from the first non-phi operand.
- // If one is found, we use the discovered reduction instruction in
- // place of the accumulator for costing.
- if (auto *OpInst = dyn_cast<Instruction>(Op)) {
- if (getScaledReductions(PHI, OpInst, Range, Chains)) {
- PHI = Chains.rbegin()->first.Reduction;
-
- Op = Update->getOperand(0);
- PhiOp = Update->getOperand(1);
- if (Op == PHI)
- std::swap(Op, PhiOp);
- }
- }
- if (PhiOp != PHI)
- return false;
-
- using namespace llvm::PatternMatch;
-
- // If the update is a binary operator, check both of its operands to see if
- // they are extends. Otherwise, see if the update comes directly from an
- // extend.
- Instruction *Exts[2] = {nullptr};
- BinaryOperator *ExtendUser = dyn_cast<BinaryOperator>(Op);
- std::optional<unsigned> BinOpc;
- Type *ExtOpTypes[2] = {nullptr};
- TTI::PartialReductionExtendKind ExtKinds[2] = {TTI::PR_None};
-
- auto CollectExtInfo = [this, &Exts, &ExtOpTypes,
- &ExtKinds](SmallVectorImpl<Value *> &Ops) -> bool {
- for (const auto &[I, OpI] : enumerate(Ops)) {
- const APInt *C;
- if (I > 0 && match(OpI, m_APInt(C)) &&
- canConstantBeExtended(C, ExtOpTypes[0], ExtKinds[0])) {
- ExtOpTypes[I] = ExtOpTypes[0];
- ExtKinds[I] = ExtKinds[0];
- continue;
- }
- Value *ExtOp;
- if (!match(OpI, m_ZExtOrSExt(m_Value(ExtOp))))
- return false;
- Exts[I] = cast<Instruction>(OpI);
-
- // TODO: We should be able to support live-ins.
- if (!CM.TheLoop->contains(Exts[I]))
- return false;
-
- ExtOpTypes[I] = ExtOp->getType();
- ExtKinds[I] = TTI::getPartialReductionExtendKind(Exts[I]);
- }
- return true;
- };
- if (ExtendUser) {
- if (!ExtendUser->hasOneUse())
- return false;
-
- // Use the side-effect of match to replace BinOp only if the pattern is
- // matched, we don't care at this point whether it actually matched.
- match(ExtendUser, m_Neg(m_BinOp(ExtendUser)));
-
- SmallVector<Value *> Ops(ExtendUser->operands());
- if (!CollectExtInfo(Ops))
- return false;
-
- BinOpc = std::make_optional(ExtendUser->getOpcode());
- } else if (match(Update, m_Add(m_Value(), m_Value()))) {
- // We already know the operands for Update are Op and PhiOp.
- SmallVector<Value *> Ops({Op});
- if (!CollectExtInfo(Ops))
- return false;
-
- ExtendUser = Update;
- BinOpc = std::nullopt;
- } else
- return false;
-
- PartialReductionChain Chain(RdxExitInstr, Exts[0], Exts[1], ExtendUser);
-
- TypeSize PHISize = PHI->getType()->getPrimitiveSizeInBits();
- TypeSize ASize = ExtOpTypes[0]->getPrimitiveSizeInBits();
- if (!PHISize.hasKnownScalarFactor(ASize))
- return false;
- unsigned TargetScaleFactor = PHISize.getKnownScalarFactor(ASize);
-
- if (LoopVectorizationPlanner::getDecisionAndClampRange(
- [&](ElementCount VF) {
- InstructionCost Cost = TTI->getPartialReductionCost(
- Update->getOpcode(), ExtOpTypes[0], ExtOpTypes[1],
- PHI->getType(), VF, ExtKinds[0], ExtKinds[1], BinOpc,
- CM.CostKind);
- return Cost.isValid();
- },
- Range)) {
- Chains.emplace_back(Chain, TargetScaleFactor);
- return true;
- }
-
- return false;
-}
VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
VFRange &Range) {
@@ -8167,12 +7997,11 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
assert(RdxDesc.getRecurrenceStartValue() ==
Phi->getIncomingValueForBlock(OrigLoop->getLoopPreheader()));
- // If the PHI is used by a partial reduction, set the scale factor.
- unsigned ScaleFactor =
- getScalingForReduction(RdxDesc.getLoopExitInstr()).value_or(1);
- PhiRecipe = new VPReductionPHIRecipe(
- Phi, RdxDesc.getRecurrenceKind(), *StartV, CM.isInLoopReduction(Phi),
- CM.useOrderedReductions(RdxDesc), ScaleFactor);
+ // Always create with scale factor 1. Partial reductions will be created
+ // later in createPartialReductions transform.
+ PhiRecipe = new VPReductionPHIRecipe(Phi, RdxDesc.getRecurrenceKind(),
+ *StartV, CM.isInLoopReduction(Phi),
+ CM.useOrderedReductions(RdxDesc));
} else {
// TODO: Currently fixed-order recurrences are modeled as chains of
// first-order recurrences. If there are no users of the intermediate
@@ -8208,9 +8037,6 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
VPI->getOpcode() == Instruction::Store)
return tryToWidenMemory(VPI, Range);
- if (std::optional<unsigned> ScaleFactor = getScalingForReduction(Instr))
- return tryToCreatePartialReduction(VPI, ScaleFactor.value());
-
if (!shouldWiden(Instr, Range))
return nullptr;
@@ -8230,40 +8056,6 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(VPSingleDefRecipe *R,
return tryToWiden(VPI);
}
-VPRecipeBase *
-VPRecipeBuilder::tryToCreatePartialReduction(VPInstruction *Reduction,
- unsigned ScaleFactor) {
- assert(Reduction->getNumOperands() == 2 &&
- "Unexpected number of operands for partial reduction");
-
- VPValue *BinOp = Reduction->getOperand(0);
- VPValue *Accumulator = Reduction->getOperand(1);
- if (isa<VPReductionPHIRecipe>(BinOp) || isa<VPPartialReductionRecipe>(BinOp))
- std::swap(BinOp, Accumulator);
-
- assert(ScaleFactor ==
- vputils::getVFScaleFactor(Accumulator->getDefiningRecipe()) &&
- "all accumulators in chain must have same scale factor");
-
- unsigned ReductionOpcode = Reduction->getOpcode();
- auto *ReductionI = Reduction->getUnderlyingInstr();
- if (ReductionOpcode == Instruction::Sub) {
- auto *const Zero = ConstantInt::get(ReductionI->getType(), 0);
- SmallVector<VPValue *, 2> Ops;
- Ops.push_back(Plan.getOrAddLiveIn(Zero));
- Ops.push_back(BinOp);
- BinOp = new VPWidenRecipe(*ReductionI, Ops);
- Builder.insert(BinOp->getDefiningRecipe());
- ReductionOpcode = Instruction::Add;
- }
-
- VPValue *Cond = nullptr;
- if (CM.blockNeedsPredicationForAnyReason(ReductionI->getParent()))
- Cond = getBlockInMask(Builder.getInsertBlock());
- return new VPPartialReductionRecipe(ReductionOpcode, Accumulator, BinOp, Cond,
- ScaleFactor, ReductionI);
-}
-
void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
ElementCount MaxVF) {
if (ElementCount::isKnownGT(MinVF, MaxVF))
@@ -8390,9 +8182,8 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
// Construct wide recipes and apply predication for original scalar
// VPInstructions in the loop.
// ---------------------------------------------------------------------------
- VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, &TTI, Legal, CM, PSE,
- Builder, BlockMaskCache, LVer);
- RecipeBuilder.collectScaledReductions(Range);
+ VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, Legal, CM, PSE, Builder,
+ BlockMaskCache, LVer);
// Scan the body of the loop in a topological order to visit each basic block
// after having visited its predecessor basic blocks.
@@ -8501,10 +8292,9 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(
*Plan))
return nullptr;
- // Transform recipes to abstract recipes if it is legal and beneficial and
- // clamp the range for better cost estimation.
- // TODO: Enable following transform when the EVL-version of extended-reduction
- // and mulacc-reduction are implemented.
+ // Create partial reduction recipes for scaled reductions.
+ VPlanTransforms::createPartialReductions(*Plan, Range, &TTI, CM.CostKind);
+
if (!CM.foldTailWithEVL()) {
VPCostContext CostCtx(CM.TTI, *CM.TLI, *Plan, CM, CM.CostKind,
*CM.PSE.getSE(), OrigLoop);
@@ -8586,8 +8376,8 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlan(VFRange &Range) {
// Collect mapping of IR header phis to header phi recipes, to be used in
// addScalarResumePhis.
DenseMap<VPBasicBlock *, VPValue *> BlockMaskCache;
- VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, &TTI, Legal, CM, PSE,
- Builder, BlockMaskCache, nullptr /*LVer*/);
+ VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, Legal, CM, PSE, Builder,
+ BlockMaskCache, nullptr /*LVer*/);
for (auto &R : Plan->getVectorLoopRegion()->getEntryBasicBlock()->phis()) {
if (isa<VPCanonicalIVPHIRecipe>(&R))
continue;
@@ -8941,11 +8731,7 @@ void LoopVectorizationPlanner::adjustRecipesForReductions(
VPBuilder PHBuilder(Plan->getVectorPreheader());
VPValue *Iden = Plan->getOrAddLiveIn(
getRecurrenceIdentity(RK, PhiTy, RdxDesc.getFastMathFlags()));
- // If the PHI is used by a partial reduction, set the scale factor.
- unsigned ScaleFactor =
- RecipeBuilder.getScalingForReduction(RdxDesc.getLoopExitInstr())
- .value_or(1);
- auto *ScaleFactorVPV = Plan->getConstantInt(32, ScaleFactor);
+ auto *ScaleFactorVPV = Plan->getConstantInt(32, 1);
VPValue *StartV = PHBuilder.createNaryOp(
VPInstruction::ReductionStartVector,
{PhiR->getStartValue(), Iden, ScaleFactorVPV},
diff --git a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
index a7000aff06379..da3fde64368f4 100644
--- a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+++ b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -19,30 +19,9 @@ namespace llvm {
class LoopVectorizationLegality;
class LoopVectorizationCostModel;
class TargetLibraryInfo;
-class TargetTransformInfo;
struct HistogramInfo;
struct VFRange;
-/// A chain of instructions that form a partial reduction.
-/// Designed to match either:
-/// reduction_bin_op (extend (A), accumulator), or
-/// reduction_bin_op (bin_op (extend (A), (extend (B))), accumulator).
-struct PartialReductionChain {
- PartialReductionChain(Instruction *Reduction, Instruction *ExtendA,
- Instruction *ExtendB, Instruction *ExtendUser)
- : Reduction(Reduction), ExtendA(ExtendA), ExtendB(ExtendB),
- ExtendUser(ExtendUser) {}
- /// The top-level binary operation that forms the reduction to a scalar
- /// after the loop body.
- Instruction *Reduction;
- /// The extension of each of the inner binary operation's operands.
- Instruction *ExtendA;
- Instruction *ExtendB;
-
- /// The user of the extends that is then reduced.
- Instruction *ExtendUser;
-};
-
/// Helper class to create VPRecipies from IR instructions.
class VPRecipeBuilder {
/// The VPlan new recipes are added to.
@@ -54,9 +33,6 @@ class VPRecipeBuilder {
/// Target Library Info.
const TargetLibraryInfo *TLI;
- // Target Transform Info.
- const TargetTransformInfo *TTI;
-
/// The legality analysis.
LoopVectorizationLegality *Legal;
@@ -81,9 +57,6 @@ class VPRecipeBuilder {
/// created.
SmallVector<VPHeaderPHIRecipe *, 4> PhisToFix;
- /// A mapping of partial reduction exit instructions to their scaling factor.
- DenseMap<const Instruction *, unsigned> ScaledReductionMap;
-
/// Loop versioning instance for getting noalias metadata guaranteed by
/// runtime checks.
LoopVersioning *LVer;
@@ -125,50 +98,21 @@ class VPRecipeBuilder {
VPHistogramRecipe *tryToWidenHistogram(const HistogramInfo *HI,
VPInstruction *VPI);
- /// Examines reduction operations to see if the target can use a cheaper
- /// operation with a wider per-iteration input VF and narrower PHI VF.
- /// Each element within Chains is a pair with a struct containing reduction
- /// information and the scaling factor between the number of elements in
- /// the input and output.
- /// Recursively calls itself to identify chained scaled reductions.
- /// Returns true if this invocation added an entry to Chains, otherwise false.
- /// i.e. returns false in the case that a subcall adds an entry to Chains,
- /// but the top-level call does not.
- bool getScaledReductions(
- Instruction *PHI, Instruction *RdxExitInstr, VFRange &Range,
- SmallVectorImpl<std::pair<PartialReductionChain, unsigned>> &Chains);
-
public:
VPRecipeBuilder(VPlan &Plan, Loop *OrigLoop, const TargetLibraryInfo *TLI,
- const TargetTransformInfo *TTI,
LoopVectorizationLegality *Legal,
LoopVectorizationCostModel &CM,
PredicatedScalarEvolution &PSE, VPBuilder &Builder,
DenseMap<VPBasicBlock *, VPValue *> &BlockMaskCache,
LoopVersioning *LVer)
- : Plan(Plan), OrigLoop(OrigLoop), TLI(TLI), TTI(TTI), Legal(Legal),
- CM(CM), PSE(PSE), Builder(Builder), BlockMaskCache(BlockMaskCache),
- LVer(LVer) {}
-
- std::optional<unsigned> getScalingForReduction(const Instruction *ExitInst) {
- auto It = ScaledReductionMap.find(ExitInst);
- return It == ScaledReductionMap.end() ? std::nullopt
- : std::make_optional(It->second);
+ : Plan(Plan), OrigLoop(OrigLoop), TLI(TLI), Legal(Legal), CM(CM),
+ PSE(PSE), Builder(Builder), BlockMaskCache(BlockMaskCache), LVer(LVer) {
}
- /// Find all possible partial reductions in the loop and track all of those
- /// that are valid so recipes can be formed later.
- void collectScaledReductions(VFRange &Range);
-
/// Create and return a widened recipe for \p R if one can be created within
/// the given VF \p Range.
VPRecipeBase *tryToCreateWidenRecipe(VPSingleDefRecipe *R, VFRange &Range);
- /// Create and return a partial reduction recipe for a reduction instruction
- /// along with binary operation and reduction phi operands.
- VPRecipeBase *tryToCreatePartialReduction(VPInstruction *Reduction,
- unsigned ScaleFactor);
-
/// Set the recipe created for given ingredient.
void setRecipe(Instruction *I, VPRecipeBase *R) {
assert(!Ingredient2Recipe.contains(I) &&
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 08f77b75400bd..d986688377cbb 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -2386,6 +2386,9 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
/// Get the factor that the VF of this recipe's output should be scaled by.
unsigned getVFScaleFactor() const { return VFScaleFactor; }
+ /// Set the factor that the VF of this recipe's output should be scaled by.
+ void setVFScaleFactor(unsigned Factor) { VFScaleFactor = Factor; }
+
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
void print(raw_ostream &O, const Twine &Indent,
@@ -2808,8 +2811,7 @@ class VPPartialReductionRecipe : public VPReductionRecipe {
VPPartialReductionRecipe *clone() override {
return new VPPartialReductionRecipe(Opcode, getOperand(0), getOperand(1),
- getCondOp(), VFScaleFactor,
- getUnderlyingInstr());
+ getCondOp(), VFScaleFactor);
}
VP_CLASSOF_IMPL(VPDef::VPPartialReductionSC)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 92ff0dcf67927..85be269055955 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -12,14 +12,19 @@
//===----------------------------------------------------------------------===//
#include "LoopVectorizationPlanner.h"
+#include "VPRecipeBuilder.h"
#include "VPlan.h"
+#include "VPlanAnalysis.h"
#include "VPlanCFG.h"
#include "VPlanDominatorTree.h"
+#include "VPlanHelpers.h"
#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
+#include "VPlanUtils.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopIterator.h"
#include "llvm/Analysis/ScalarEvolution.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/MDBuilder.h"
@@ -965,3 +970,372 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) {
MiddleTerm->setOperand(0, NewCond);
return true;
}
+
+namespace {
+/// A VPlan-based chain of recipes that form a partial reduction.
+/// Designed to match either:
+/// reduction_bin_op (extend (A), accumulator), or
+/// reduction_bin_op (bin_op (exten...
[truncated]
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
|
Could you use user branches to stack the PR, rather than including the raw changes? That would make it easier to see what this PR does. |
6e07036 to
5ace1fd
Compare
The initial just moves the code to form partial reductions just before
I've updated the PR to handle predicated partial reductions after closing #167506, and removed #167846 as it turned out to be independent. So now it should only include the relevant changes. |
As a first step, move the existing partial reduction detection logic to VPlan, trying to preserve the existing code structure & behavior as closely as possible. With this, partial reductions are detected and created together in a single step. This allows forming partial reductions and bundling them up if profitable together in a follow-up.
5ace1fd to
d2cbd8f
Compare
🐧 Linux x64 Test Results
|
SamTebbs33
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking good and I agree that we should move towards creating partial reductions at the same time as bundling into VPExpressionRecipe.
| // Transform recipes to abstract recipes if it is legal and beneficial and | ||
| // clamp the range for better cost estimation. | ||
| // TODO: Enable following transform when the EVL-version of extended-reduction | ||
| // and mulacc-reduction are implemented. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to remove the whole comment?
| VPWidenCastRecipe *ExtendB; | ||
|
|
||
| /// The user of the extends that is then reduced. | ||
| VPWidenRecipe *ExtendUser; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a good idea to use optionals for these two variables, to make it easy to handle the null case?
| VPValue *ExitValue = nullptr; | ||
| if (auto *RedPhiR = dyn_cast<VPReductionPHIRecipe>(Accumulator)) { | ||
| ExitValue = findComputeReductionResult(RedPhiR)->getOperand(1); | ||
| match(ExitValue, m_Select(m_VPValue(Cond), m_VPValue(), m_VPValue())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we care about matching a select here? As far as I can see, none of the removed code cares about selects, and the select for the cond is added at execution time, so I'm not sure why we need to care about selects here.
Same for line 1090.
As a first step, move the existing partial reduction detection logic to
VPlan, trying to preserve the existing code structure & behavior as
closely as possible.
With this, partial reductions are detected and created together in a
single step.
This allows forming partial reductions and bundling them up if
profitable together in a follow-up.