Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VPlan] Process simplifyRecipes via a worklist #133977

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Apr 1, 2025

Stacked on #134073

Currently simplifyRecipes just traverses each instruction once in order, so if a simplification reveals a new possible combine it will be missed.

This moves it to an InstCombine style worklist which adds the new result and its users to the worklist when a simplification happens.

The transform is somewhat of a mix of InstCombine and InstSimplify, i.e. some simplifications just return values, so any new recipes need to be inserted in simplifyRecipe.

This is the first step to addressing the "Split up into simpler, modular combines" TODO.

I needed to split out the VPBlendRecipe simplifications into a separate transform. Which mask is discarded during normalisation is determined by the number of uses it has. So this makes sure that the masks are simplified completely first before deciding.

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Apr 1, 2025
Stacked on llvm#133977

When looking at some EVL tail folded code in SPEC CPU 2017 I noticed we sometimes have both VPBlendRecipes and select VPInstructions in the same plan:

    EMIT vp<%active.lane.mask> = active lane mask vp<%5>, vp<%3>
    EMIT vp<%7> = icmp ...
    EMIT vp<%8> = logical-and vp<%active.lane.mask>, vp<%7>
    BLEND ir<%8> = ir<%n.015> ir<%foo>/vp<%8>
    EMIT vp<%9> = select vp<%active.lane.mask>, ir<%8>, ir<%n.015>

Since a blend will ultimately generate a chain of selects, we could fold the blend into the select:

    EMIT vp<%active.lane.mask> = active lane mask vp<%5>, vp<%3>
    EMIT vp<%7> = icmp ...
    EMIT vp<%8> = logical-and vp<%active.lane.mask>, vp<%7>
    EMIT ir<%8> = select vp<%8>, ir<%foo>, ir<%n.015>

So this patch canonicalizes blends to a series of select instructions, which allows them to be simplified further with other select instructions.

Eventually we may be able to remove VPBlendRecipes altogether and emit the select VPInstructions directly, but I've left that out of this patch for now as it needs the normalization logic to be rewritten in terms of selects.

The `BLEND %a, %b, Not(%mask) -> BLEND %b, %a, %mask` is converted to selects.

A `select c1, (select c2, x, y), x -> select (c1 & c2), x, y` combine is also added to prevent regressions, without it we end up with extra selects for some reason.

We also need to mark VPInstructions with underlying values as generating vectors in willGenerateVectors to prevent a regression.
VPRecipeBase *R = Worklist.pop_back_val();
if (VPValue *Result = simplifyRecipe(*R, TypeInfo)) {
R->getVPSingleValue()->replaceAllUsesWith(Result);
R->eraseFromParent();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think erasing R might invalidate the VTypeAnalysis cache, I'll take a look to see if this is a problem

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we need eraseFromParent, as there is recursivelyRemoveDeadRecipes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've manually erased the deleted value from the cache in 4e05ab9.

This was an existing issue with the simplification pass but it looks like it didn't trigger in the wild. I think it's probably easiest to fix it in this PR now that we erase all replaced recipes, which should prevent #120252.

I've also moved the VTypeAnalysis assertion up so it should run for all transforms, not just the trunc combine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@artagnon erasing the old instruction decreases the number of uses which can affect what transforms kick in, e.g. the blend simplifications are sensitive to the number of uses.

Also if the old recipe uses the new result, it will get added back to the worklist which will loop

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should mention I took the structure from this from InstructionCombining.cpp! There's a few things there that I think we could eventually copy over, e.g. adding any operands of the recipe to the worklist now that they have one less use.

@lukel97 lukel97 changed the title [VPlan] Process simplifyRecipes via a worklist. NFCI [VPlan] Process simplifyRecipes via a worklist Apr 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: Luke Lau (lukel97)

Changes

Currently simplifyRecipes just traverses each instruction once in order, so if a simplification reveals a new possible combine it will be missed.

This moves it to an InstCombine style worklist which adds the new result and its users to the worklist when a simplification happens. It also processes instructions in reverse which should allow for larger patterns to be matched first.

The transform is somewhat of a mix of InstCombine and InstSimplify, i.e. some simplifications just return values, so any new recipes need to be inserted in simplifyRecipe.

This is the first step to addressing the "Split up into simpler, modular combines" TODO.

I needed to split out the VPBlendRecipe simplifications into a separate transform. Now that the instructions are processed in reverse order we were normalising the blends before other simplifications kicked in, but which mask is discarded during normalisation is determined by the number of uses it has. So this makes sure that the masks are simplified first before deciding.


Full diff: https://github.com/llvm/llvm-project/pull/133977.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.h (+3)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+122-96)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+3)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
index cc21870bee2e3..ac2a8d997d2e9 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.h
@@ -63,6 +63,9 @@ class VPTypeAnalysis {
 
   /// Return the LLVMContext used by the analysis.
   LLVMContext &getContext() { return Ctx; }
+
+  /// Remove \p V from the cache. You must call this after a value is erased.
+  void erase(VPValue *V) { CachedTypes.erase(V); }
 };
 
 // Collect a VPlan's ephemeral recipes (those used only by an assume).
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 9a041c83438dc..7fe01657248a3 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -923,85 +923,16 @@ static void recursivelyDeleteDeadRecipes(VPValue *V) {
 }
 
 /// Try to simplify recipe \p R.
-static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
+static VPValue *simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
   using namespace llvm::VPlanPatternMatch;
 
-  if (auto *Blend = dyn_cast<VPBlendRecipe>(&R)) {
-    // Try to remove redundant blend recipes.
-    SmallPtrSet<VPValue *, 4> UniqueValues;
-    if (Blend->isNormalized() || !match(Blend->getMask(0), m_False()))
-      UniqueValues.insert(Blend->getIncomingValue(0));
-    for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I)
-      if (!match(Blend->getMask(I), m_False()))
-        UniqueValues.insert(Blend->getIncomingValue(I));
-
-    if (UniqueValues.size() == 1) {
-      Blend->replaceAllUsesWith(*UniqueValues.begin());
-      Blend->eraseFromParent();
-      return;
-    }
-
-    if (Blend->isNormalized())
-      return;
-
-    // Normalize the blend so its first incoming value is used as the initial
-    // value with the others blended into it.
-
-    unsigned StartIndex = 0;
-    for (unsigned I = 0; I != Blend->getNumIncomingValues(); ++I) {
-      // If a value's mask is used only by the blend then is can be deadcoded.
-      // TODO: Find the most expensive mask that can be deadcoded, or a mask
-      // that's used by multiple blends where it can be removed from them all.
-      VPValue *Mask = Blend->getMask(I);
-      if (Mask->getNumUsers() == 1 && !match(Mask, m_False())) {
-        StartIndex = I;
-        break;
-      }
-    }
-
-    SmallVector<VPValue *, 4> OperandsWithMask;
-    OperandsWithMask.push_back(Blend->getIncomingValue(StartIndex));
-
-    for (unsigned I = 0; I != Blend->getNumIncomingValues(); ++I) {
-      if (I == StartIndex)
-        continue;
-      OperandsWithMask.push_back(Blend->getIncomingValue(I));
-      OperandsWithMask.push_back(Blend->getMask(I));
-    }
-
-    auto *NewBlend = new VPBlendRecipe(
-        cast<PHINode>(Blend->getUnderlyingValue()), OperandsWithMask);
-    NewBlend->insertBefore(&R);
-
-    VPValue *DeadMask = Blend->getMask(StartIndex);
-    Blend->replaceAllUsesWith(NewBlend);
-    Blend->eraseFromParent();
-    recursivelyDeleteDeadRecipes(DeadMask);
-
-    /// Simplify BLEND %a, %b, Not(%mask) -> BLEND %b, %a, %mask.
-    VPValue *NewMask;
-    if (NewBlend->getNumOperands() == 3 &&
-        match(NewBlend->getMask(1), m_Not(m_VPValue(NewMask)))) {
-      VPValue *Inc0 = NewBlend->getOperand(0);
-      VPValue *Inc1 = NewBlend->getOperand(1);
-      VPValue *OldMask = NewBlend->getOperand(2);
-      NewBlend->setOperand(0, Inc1);
-      NewBlend->setOperand(1, Inc0);
-      NewBlend->setOperand(2, NewMask);
-      if (OldMask->getNumUsers() == 0)
-        cast<VPInstruction>(OldMask)->eraseFromParent();
-    }
-    return;
-  }
-
   // VPScalarIVSteps can only be simplified after unrolling. VPScalarIVSteps for
   // part 0 can be replaced by their start value, if only the first lane is
   // demanded.
   if (auto *Steps = dyn_cast<VPScalarIVStepsRecipe>(&R)) {
     if (Steps->getParent()->getPlan()->isUnrolled() && Steps->isPart0() &&
         vputils::onlyFirstLaneUsed(Steps)) {
-      Steps->replaceAllUsesWith(Steps->getOperand(0));
-      return;
+      return Steps->getOperand(0);
     }
   }
 
@@ -1011,11 +942,11 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
     Type *TruncTy = TypeInfo.inferScalarType(Trunc);
     Type *ATy = TypeInfo.inferScalarType(A);
     if (TruncTy == ATy) {
-      Trunc->replaceAllUsesWith(A);
+      return A;
     } else {
       // Don't replace a scalarizing recipe with a widened cast.
       if (isa<VPReplicateRecipe>(&R))
-        return;
+        return nullptr;
       if (ATy->getScalarSizeInBits() < TruncTy->getScalarSizeInBits()) {
 
         unsigned ExtOpcode = match(R.getOperand(0), m_SExt(m_VPValue()))
@@ -1028,25 +959,13 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
           VPC->setUnderlyingValue(UnderlyingExt);
         }
         VPC->insertBefore(&R);
-        Trunc->replaceAllUsesWith(VPC);
+        return VPC;
       } else if (ATy->getScalarSizeInBits() > TruncTy->getScalarSizeInBits()) {
         auto *VPC = new VPWidenCastRecipe(Instruction::Trunc, A, TruncTy);
         VPC->insertBefore(&R);
-        Trunc->replaceAllUsesWith(VPC);
+        return VPC;
       }
     }
-#ifndef NDEBUG
-    // Verify that the cached type info is for both A and its users is still
-    // accurate by comparing it to freshly computed types.
-    VPTypeAnalysis TypeInfo2(
-        R.getParent()->getPlan()->getCanonicalIV()->getScalarType());
-    assert(TypeInfo.inferScalarType(A) == TypeInfo2.inferScalarType(A));
-    for (VPUser *U : A->users()) {
-      auto *R = cast<VPRecipeBase>(U);
-      for (VPValue *VPV : R->definedValues())
-        assert(TypeInfo.inferScalarType(VPV) == TypeInfo2.inferScalarType(VPV));
-    }
-#endif
   }
 
   // Simplify (X && Y) || (X && !Y) -> X.
@@ -1056,17 +975,14 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
   VPValue *X, *Y;
   if (match(&R,
             m_c_BinaryOr(m_LogicalAnd(m_VPValue(X), m_VPValue(Y)),
-                         m_LogicalAnd(m_Deferred(X), m_Not(m_Deferred(Y)))))) {
-    R.getVPSingleValue()->replaceAllUsesWith(X);
-    R.eraseFromParent();
-    return;
-  }
+                         m_LogicalAnd(m_Deferred(X), m_Not(m_Deferred(Y))))))
+    return X;
 
   if (match(&R, m_c_Mul(m_VPValue(A), m_SpecificInt(1))))
-    return R.getVPSingleValue()->replaceAllUsesWith(A);
+    return A;
 
   if (match(&R, m_Not(m_Not(m_VPValue(A)))))
-    return R.getVPSingleValue()->replaceAllUsesWith(A);
+    return A;
 
   // Remove redundant DerviedIVs, that is 0 + A * 1 -> A and 0 + 0 * x -> 0.
   if ((match(&R,
@@ -1075,16 +991,125 @@ static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
              m_DerivedIV(m_SpecificInt(0), m_SpecificInt(0), m_VPValue()))) &&
       TypeInfo.inferScalarType(R.getOperand(1)) ==
           TypeInfo.inferScalarType(R.getVPSingleValue()))
-    return R.getVPSingleValue()->replaceAllUsesWith(R.getOperand(1));
+    return R.getOperand(1);
+
+  return nullptr;
 }
 
 void VPlanTransforms::simplifyRecipes(VPlan &Plan, Type &CanonicalIVTy) {
   ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<VPBlockBase *>> RPOT(
       Plan.getEntry());
   VPTypeAnalysis TypeInfo(&CanonicalIVTy);
+  SetVector<VPRecipeBase *> Worklist;
+  for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(RPOT))
+    for (VPRecipeBase &R : make_early_inc_range(*VPBB))
+      Worklist.insert(&R);
+
+  while (!Worklist.empty()) {
+    VPRecipeBase *R = Worklist.pop_back_val();
+    if (VPValue *Result = simplifyRecipe(*R, TypeInfo)) {
+      R->getVPSingleValue()->replaceAllUsesWith(Result);
+      TypeInfo.erase(R->getVPSingleValue());
+      R->eraseFromParent();
+      if (VPRecipeBase *ResultR = Result->getDefiningRecipe())
+        Worklist.insert(ResultR);
+      for (VPUser *U : Result->users())
+        if (auto *UR = dyn_cast<VPRecipeBase>(U))
+          if (UR != R)
+            Worklist.insert(UR);
+
+#ifndef NDEBUG
+      // Verify that the cached type info is for both Result and its users is
+      // still accurate by comparing it to freshly computed types.
+      VPTypeAnalysis TypeInfo2(&CanonicalIVTy);
+      assert(TypeInfo.inferScalarType(Result) ==
+             TypeInfo2.inferScalarType(Result));
+      for (VPUser *U : Result->users()) {
+        auto *R = cast<VPRecipeBase>(U);
+        for (VPValue *VPV : R->definedValues())
+          assert(TypeInfo.inferScalarType(VPV) ==
+                 TypeInfo2.inferScalarType(VPV));
+      }
+#endif
+    }
+  }
+}
+
+void VPlanTransforms::simplifyBlends(VPlan &Plan) {
+  using namespace llvm::VPlanPatternMatch;
+  ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<VPBlockBase *>> RPOT(
+      Plan.getEntry());
+  SetVector<VPRecipeBase *> Worklist;
   for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(RPOT)) {
     for (VPRecipeBase &R : make_early_inc_range(*VPBB)) {
-      simplifyRecipe(R, TypeInfo);
+      auto *Blend = dyn_cast<VPBlendRecipe>(&R);
+      if (!Blend)
+        continue;
+
+      // Try to remove redundant blend recipes.
+      SmallPtrSet<VPValue *, 4> UniqueValues;
+      if (Blend->isNormalized() || !match(Blend->getMask(0), m_False()))
+        UniqueValues.insert(Blend->getIncomingValue(0));
+      for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I)
+        if (!match(Blend->getMask(I), m_False()))
+          UniqueValues.insert(Blend->getIncomingValue(I));
+
+      if (UniqueValues.size() == 1) {
+        Blend->replaceAllUsesWith(*UniqueValues.begin());
+        Blend->eraseFromParent();
+        continue;
+      }
+
+      if (Blend->isNormalized())
+        continue;
+
+      // Normalize the blend so its first incoming value is used as the initial
+      // value with the others blended into it.
+
+      unsigned StartIndex = 0;
+      for (unsigned I = 0; I != Blend->getNumIncomingValues(); ++I) {
+        // If a value's mask is used only by the blend then is can be deadcoded.
+        // TODO: Find the most expensive mask that can be deadcoded, or a mask
+        // that's used by multiple blends where it can be removed from them all.
+        VPValue *Mask = Blend->getMask(I);
+        if (Mask->getNumUsers() == 1 && !match(Mask, m_False())) {
+          StartIndex = I;
+          break;
+        }
+      }
+
+      SmallVector<VPValue *, 4> OperandsWithMask;
+      OperandsWithMask.push_back(Blend->getIncomingValue(StartIndex));
+
+      for (unsigned I = 0; I != Blend->getNumIncomingValues(); ++I) {
+        if (I == StartIndex)
+          continue;
+        OperandsWithMask.push_back(Blend->getIncomingValue(I));
+        OperandsWithMask.push_back(Blend->getMask(I));
+      }
+
+      auto *NewBlend = new VPBlendRecipe(
+          cast<PHINode>(Blend->getUnderlyingValue()), OperandsWithMask);
+      NewBlend->insertBefore(&R);
+
+      VPValue *DeadMask = Blend->getMask(StartIndex);
+      Blend->replaceAllUsesWith(NewBlend);
+      Blend->eraseFromParent();
+      recursivelyDeleteDeadRecipes(DeadMask);
+
+      /// Simplify BLEND %a, %b, Not(%mask) -> BLEND %b, %a, %mask.
+      VPValue *NewMask;
+      if (NewBlend->getNumOperands() == 3 &&
+          match(NewBlend->getMask(1), m_Not(m_VPValue(NewMask)))) {
+        VPValue *Inc0 = NewBlend->getOperand(0);
+        VPValue *Inc1 = NewBlend->getOperand(1);
+        VPValue *OldMask = NewBlend->getOperand(2);
+        NewBlend->setOperand(0, Inc1);
+        NewBlend->setOperand(1, Inc0);
+        NewBlend->setOperand(2, NewMask);
+        if (OldMask->getNumUsers() == 0)
+          cast<VPInstruction>(OldMask)->eraseFromParent();
+      }
     }
   }
 }
@@ -1684,6 +1709,7 @@ void VPlanTransforms::optimize(VPlan &Plan) {
   runPass(removeRedundantInductionCasts, Plan);
 
   runPass(simplifyRecipes, Plan, *Plan.getCanonicalIV()->getScalarType());
+  runPass(simplifyBlends, Plan);
   runPass(removeDeadRecipes, Plan);
   runPass(legalizeAndOptimizeInductions, Plan);
   runPass(removeRedundantExpandSCEVRecipes, Plan);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index c23ff38265670..8fdb3f0025a05 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -183,6 +183,9 @@ struct VPlanTransforms {
   /// CanonicalIVTy as type for all un-typed live-ins in VPTypeAnalysis.
   static void simplifyRecipes(VPlan &Plan, Type &CanonicalIVTy);
 
+  /// Normalize and simplify VPBlendRecipes.
+  static void simplifyBlends(VPlan &Plan);
+
   /// If there's a single exit block, optimize its phi recipes that use exiting
   /// IV values by feeding them precomputed end values instead, possibly taken
   /// one step backwards.

@lukel97 lukel97 requested a review from paulwalker-arm April 2, 2025 11:41
Copy link
Contributor

@artagnon artagnon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the overall approach. I gave up on my previous attempt, but happy to see the work being picked up again!

}

void VPlanTransforms::simplifyRecipes(VPlan &Plan, Type &CanonicalIVTy) {
ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<VPBlockBase *>> RPOT(
Plan.getEntry());
VPTypeAnalysis TypeInfo(&CanonicalIVTy);
SetVector<VPRecipeBase *> Worklist;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be a std::deque, because we want to pop_front, and push_back. See also: #93998.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh thanks for sharing that, I didn't know about that previous PR. Why do we need it to be pop_front/push_back? I believe InstCombine uses pop_back/push_back, i.e. immediately processes anything added to the list.

I also just realised that InstCombine pushes the initial instructions onto the worklist in order, so the perceived order is actually top to bottom. I've updated that here 3ca3643

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter yet, but matters when you will put up a patch with functional changes: I believe the order of simplification matters unlike in InstCombine, but I'm happy to address the point in the follow-up patch. Let's keep this thread open in case other reviewers have something to say about this.

VPRecipeBase *R = Worklist.pop_back_val();
if (VPValue *Result = simplifyRecipe(*R, TypeInfo)) {
R->getVPSingleValue()->replaceAllUsesWith(Result);
R->eraseFromParent();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we need eraseFromParent, as there is recursivelyRemoveDeadRecipes.

Comment on lines 1027 to 1022
for (VPUser *U : Result->users())
if (auto *UR = dyn_cast<VPRecipeBase>(U))
if (UR != R)
Worklist.insert(UR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the reasoning is behind this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I think I originally needed the UR != R check to prevent loops but this was before I switched to a SetVector, I've removed it in 67ce00f

@@ -1684,6 +1706,7 @@ void VPlanTransforms::optimize(VPlan &Plan) {
runPass(removeRedundantInductionCasts, Plan);

runPass(simplifyRecipes, Plan, *Plan.getCanonicalIV()->getScalarType());
runPass(simplifyBlends, Plan);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can split out this refactoring and land it first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, split out in #134073

… NFC

This is split off from llvm#133977

VPBlendRecipe normalisation is sensitive to the number of users a mask has, so should probably be run after the masks are simplified as much as possible.

Note this could be run after removeDeadRecipes but this causes test diffs, some regressions, so this is left to a later patch.
@lukel97 lukel97 force-pushed the loop-vectorize/simplify-worklist branch from 3ca3643 to 9cf2368 Compare April 2, 2025 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants