[VPlan] Expand VPExpandSCEVRecipes to VPInstructions before CSE.#197643

Merged: fhahn merged 6 commits into llvm:main from fhahn:vplan-expand-scev-to-vpinstructions-before-cse on May 29, 2026

@fhahn fhahn commented May 14, 2026

Add expandSCEVExpressions transform that converts VPExpandSCEVRecipes
to VPInstructions where possible, running before CSE so duplicates with
other SCEV expansions (e.g., from addMinimumIterationCheck) are
eliminated. This also reuses existing loop-invariant IR values via
ScalarEvolution::getSCEVValues to avoid redundant computation.

Currently limited to SCEVMulExpr (along with constants, unknowns, and
vscale). Support for SCEVAddExpr and SCEVUDivExpr will follow in
subsequent patches.

Depends on #189455
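
A toy sketch of why the ordering matters; it is not code from this patch. It assumes a simplified model in which each instruction is an opcode plus operand indices, and `Inst`, `emitVScaleTimes`, and `cse` are hypothetical names invented for the illustration. Two independent expansions of `vscale * 4` emit the same three instructions twice. Once both exist in one instruction list, a value-numbering CSE can fold the second copy into the first, which is roughly the effect visible in the test updates where a second `llvm.vscale` call disappears.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a VPInstruction: opcode, immediate, operand ids.
// Real VPlan models constants as live-ins; a "const" opcode keeps the toy small.
struct Inst {
  std::string Opcode;        // "vscale", "const", or "mul"
  long Imm;                  // payload, only meaningful for "const"
  std::vector<int> Operands; // indices of earlier instructions in the block
  bool operator<(const Inst &O) const {
    return std::tie(Opcode, Imm, Operands) <
           std::tie(O.Opcode, O.Imm, O.Operands);
  }
};

// Emit a fresh chain computing vscale * Factor; returns the index of the mul.
int emitVScaleTimes(std::vector<Inst> &Block, long Factor) {
  Block.push_back(Inst{"vscale", 0, {}});
  int VScale = static_cast<int>(Block.size()) - 1;
  Block.push_back(Inst{"const", Factor, {}});
  int Const = static_cast<int>(Block.size()) - 1;
  Block.push_back(Inst{"mul", 0, {VScale, Const}});
  return static_cast<int>(Block.size()) - 1;
}

// Minimal CSE by value numbering. Writes the deduplicated instructions to Out
// and returns a map from each old index to its index in Out.
std::vector<int> cse(const std::vector<Inst> &Block, std::vector<Inst> &Out) {
  std::map<Inst, int> Seen;
  std::vector<int> Remap(Block.size());
  for (std::size_t I = 0; I < Block.size(); ++I) {
    Inst Canon = Block[I];
    // Rewrite operands to their canonical ids so equal chains compare equal.
    for (int &Op : Canon.Operands) {
      assert(Op >= 0 && static_cast<std::size_t>(Op) < I &&
             "operand must be defined earlier in the block");
      Op = Remap[Op];
    }
    std::pair<std::map<Inst, int>::iterator, bool> Res =
        Seen.insert(std::make_pair(Canon, static_cast<int>(Out.size())));
    if (Res.second)
      Out.push_back(Canon);
    Remap[I] = Res.first->second;
  }
  return Remap;
}
```

Calling `emitVScaleTimes(Block, 4)` twice leaves six instructions in `Block`; `cse` reduces them to three and maps both multiplies to the same result. If one of the two expansions were instead kept as an opaque recipe and only lowered to IR after CSE, the pass would have nothing to match against.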

@fhahn fhahn requested review from Mel-Chen, aniragil, artagnon and ayalz May 14, 2026 09:55
@llvmorg-github-actions llvmorg-github-actions Bot added the backend:RISC-V, vectorizers, llvm:analysis, and llvm:transforms labels May 14, 2026

llvmorg-github-actions Bot commented May 14, 2026

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-backend-risc-v

Author: Florian Hahn (fhahn)

Changes

Add expandSCEVExpressions transform that converts VPExpandSCEVRecipes
to VPInstructions where possible, running before CSE so duplicates with
other SCEV expansions (e.g., from addMinimumIterationCheck) are
eliminated. This also reuses existing loop-invariant IR values via
ScalarEvolution::getSCEVValues to avoid redundant computation.

Currently limited to SCEVMulExpr (along with constants, unknowns, and
vscale). Support for SCEVAddExpr and SCEVUDivExpr will follow in
subsequent patches.

Depends on #189455 (included in PR)


Patch is 297.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/197643.diff

108 Files Affected:

  • (modified) llvm/include/llvm/Analysis/ScalarEvolution.h (+2)
  • (modified) llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h (+5)
  • (modified) llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp (+24-18)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+4-1)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp (+4-17)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+26-6)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.h (+15-9)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUtils.cpp (+62)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUtils.h (+23)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll (+12-24)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/findlast-epilogue-loop.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/gather-do-not-vectorize-addressing.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/induction-costs-sve.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/invalid-costs.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/load-cast-context.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/low_trip_count_predicates.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/masked_ldst_sme.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/outer_loop_prefer_scalable.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-chained.ll (+18-36)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-dot-product.ll (+9-18)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-fdot-product.ll (+12-24)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-incomplete-chains.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-sub-epilogue-vec.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-with-invariant-stores.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce.ll (+6-12)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/pr60831-sve-inv-store-crash.ll (+5-10)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/reduction-recurrence-costs-sve.ll (+3-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/reverse-load-scatter.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-avoid-scalarization.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-reduction-inloop-cond.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll (+15-30)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/single-early-exit-interleave.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/store-costs-sve.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/struct-return-cost.ll (-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll (+3-6)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-inloop-reductions.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-reductions.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect-strict-reductions.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vect.ll (+11-17)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-epilog-vscale-fixed.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-fneg.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+6-11)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions-unusual-types.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-inductions.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+13-23)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+4-8)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-inv-store.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-live-out-pointer-induction.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-multiexit.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-predicated-costs.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-extractvalue.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll (+7-10)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-epilogue.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+10-19)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/tail-folding-styles.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-scalable.ll (+6-9)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/type-shrinkage-zext-costs.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/vector-loop-backedge-elimination-epilogue.ll (+6-9)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/conditional-scalar-assignment.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/early-exit-live-out.ll (+3-6)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/inloop-reduction.ll (+6-12)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/interleaved-masked-access.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/lmul.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/partial-reduce-dot-product.ll (+8-16)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll (+4-8)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/sink-to-early-exit.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll (+8-16)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cast-intrinsics.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-complex-mask.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cond-reduction.ll (+8-16)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-div.ll (+8-16)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-fixed-order-recurrence.ll (+5-10)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-inloop-reduction.ll (+14-28)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll (+5-10)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-intermediate-store.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-iv32.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-masked-loadstore.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-ordered-reduction.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reduction.ll (+14-28)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reverse-load-store.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-safe-dep-distance.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/transform-narrow-interleave-to-widen-memory.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/truncate-to-minimal-bitwidth-evl-crash.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/vectorize-vp-intrinsics.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/VPlan/expand-scev.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence-scalable-vf1.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+22-28)
  • (modified) llvm/test/Transforms/LoopVectorize/if-reduction.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/narrow-interleave-groups-scalable-vf.ll (+2-3)
  • (modified) llvm/test/Transforms/LoopVectorize/narrow-to-single-scalar-widen-gep-scalable.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/outer_loop_scalable.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/pointer-induction.ll (-1)
  • (modified) llvm/test/Transforms/LoopVectorize/pr31190.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-assume.ll (+3-6)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-first-order-recurrence.ll (+12-24)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-inductions.ll (+8-12)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-iv-outside-user.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-lifetime.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-loop-unpredicated-body-scalar-tail.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-reduction-inloop.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/scalable-trunc-min-bitwidth.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/vectorize-force-tail-with-evl.ll (+3-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/sve-interleave-vectorization.ll (+1-2)
diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index fd3a7fab1fd66..96f09b191c21e 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -63,6 +63,7 @@ class SCEVUnknown;
 class StructType;
 class TargetLibraryInfo;
 class Type;
+class VPSCEVExpander;
 enum SCEVTypes : unsigned short;
 
 LLVM_ABI extern bool VerifySCEV;
@@ -1636,6 +1637,7 @@ class ScalarEvolution {
   friend class SCEVCallbackVH;
   friend class SCEVExpander;
   friend class SCEVUnknown;
+  friend class VPSCEVExpander;
 
   /// The function we are analyzing.
   Function &F;
diff --git a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
index 42355f5841eab..5ff730b3755d0 100644
--- a/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
+++ b/llvm/include/llvm/Transforms/Utils/ScalarEvolutionExpander.h
@@ -311,6 +311,11 @@ class SCEVExpander : public SCEVUseVisitor<SCEVExpander, Value *> {
   LLVM_ABI bool isSafeToExpandAt(const SCEV *S,
                                  const Instruction *InsertionPoint) const;
 
+  /// Drop poison-generating flags from \p I, then try re-infer via SCEV.
+  LLVM_ABI static void
+  dropPoisonGeneratingAnnotationsAndReinfer(ScalarEvolution &SE,
+                                            Instruction *I);
+
   /// Insert code to directly compute the specified SCEV expression into the
   /// program.  The code is inserted into the specified block.
   LLVM_ABI Value *expandCodeFor(SCEVUse SH, Type *Ty, BasicBlock::iterator I);
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index 8877688207548..3a65a0405a05b 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -1684,24 +1684,7 @@ Value *SCEVExpander::expand(SCEVUse S) {
   } else {
     for (Instruction *I : DropPoisonGeneratingInsts) {
       rememberFlags(I);
-      I->dropPoisonGeneratingAnnotations();
-      // See if we can re-infer from first principles any of the flags we just
-      // dropped.
-      if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(I))
-        if (auto Flags = SE.getStrengthenedNoWrapFlagsFromBinOp(OBO)) {
-          auto *BO = cast<BinaryOperator>(I);
-          BO->setHasNoUnsignedWrap(
-            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNUW) == SCEV::FlagNUW);
-          BO->setHasNoSignedWrap(
-            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNSW) == SCEV::FlagNSW);
-        }
-      if (auto *NNI = dyn_cast<PossiblyNonNegInst>(I)) {
-        auto *Src = NNI->getOperand(0);
-        if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
-                                    Constant::getNullValue(Src->getType()), I,
-                                    DL).value_or(false))
-          NNI->setNonNeg(true);
-      }
+      dropPoisonGeneratingAnnotationsAndReinfer(SE, I);
     }
   }
   // Remember the expanded value for this SCEV at this location.
@@ -1729,6 +1712,29 @@ void SCEVExpander::rememberFlags(Instruction *I) {
   OrigFlags.try_emplace(I, PoisonFlags(I));
 }
 
+void SCEVExpander::dropPoisonGeneratingAnnotationsAndReinfer(
+    ScalarEvolution &SE, Instruction *I) {
+  I->dropPoisonGeneratingAnnotations();
+  // See if we can re-infer from first principles any of the flags we just
+  // dropped.
+  if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(I))
+    if (auto Flags = SE.getStrengthenedNoWrapFlagsFromBinOp(OBO)) {
+      auto *BO = cast<BinaryOperator>(I);
+      BO->setHasNoUnsignedWrap(
+          ScalarEvolution::maskFlags(*Flags, SCEV::FlagNUW) == SCEV::FlagNUW);
+      BO->setHasNoSignedWrap(
+          ScalarEvolution::maskFlags(*Flags, SCEV::FlagNSW) == SCEV::FlagNSW);
+    }
+  if (auto *NNI = dyn_cast<PossiblyNonNegInst>(I)) {
+    auto *Src = NNI->getOperand(0);
+    if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
+                                Constant::getNullValue(Src->getType()), I,
+                                SE.getDataLayout())
+            .value_or(false))
+      NNI->setNonNeg(true);
+  }
+}
+
 void SCEVExpander::replaceCongruentIVInc(
     PHINode *&Phi, PHINode *&OrigPhi, Loop *L, const DominatorTree *DT,
     SmallVectorImpl<WeakTrackingVH> &DeadInsts) {
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 7213d4ae795ec..be1e6d7a17023 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -6163,6 +6163,9 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
       CM.requiresScalarEpilogue(BestVF.isVector()), &BestVPlan.getVFxUF(),
       MaxRuntimeStep);
   VPlanTransforms::materializeFactors(BestVPlan, VectorPH, BestVF);
+  // Limit expansions to VPInstruction to when not vectorizing the main epilogue loop.
+  if (EpilogueVecKind == EpilogueVectorizationKind::None)
+    VPlanTransforms::expandSCEVExpressions(BestVPlan, *PSE.getSE(), *OrigLoop);
   VPlanTransforms::cse(BestVPlan);
   VPlanTransforms::simplifyRecipes(BestVPlan);
   VPlanTransforms::simplifyKnownEVL(BestVPlan, BestVF, PSE);
@@ -7324,7 +7327,7 @@ void LoopVectorizationPlanner::addMinimumIterationCheck(
                  CM.requiresScalarEpilogue(VF.isVector()),
                  CM.foldTailByMasking(), OrigLoop, BranchWeights,
                  OrigLoop->getLoopPredecessor()->getTerminator()->getDebugLoc(),
-                 PSE, /*CheckBlock=*/nullptr);
+                 PSE, Plan.getEntry());
 }
 
 // Determine how to lower the epilogue, which depends on 1) optimising
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 7c54a223f9793..4587856e9b9cc 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -1448,16 +1448,6 @@ void VPlanTransforms::attachCheckBlock(VPlan &Plan, Value *Cond,
   addBypassBranch(Plan, CheckBlockVPBB, CondVPV, AddBranchWeights);
 }
 
-/// Return an insert point in \p EntryVPBB after existing VPIRPhi,
-/// VPIRInstruction and VPExpandSCEVRecipe recipes.
-static VPBasicBlock::iterator getExpandSCEVInsertPt(VPBasicBlock *EntryVPBB) {
-  auto InsertPt = EntryVPBB->begin();
-  while (InsertPt != EntryVPBB->end() &&
-         isa<VPExpandSCEVRecipe, VPIRPhi, VPIRInstruction>(&*InsertPt))
-    ++InsertPt;
-  return InsertPt;
-}
-
 void VPlanTransforms::addMinimumIterationCheck(
     VPlan &Plan, ElementCount VF, unsigned UF,
     ElementCount MinProfitableTripCount, bool RequiresScalarEpilogue,
@@ -1489,10 +1479,7 @@ void VPlanTransforms::addMinimumIterationCheck(
     return SE.getUMaxExpr(MinProfitableTripCountSCEV, VFxUF);
   };
 
-  VPBasicBlock *EntryVPBB = Plan.getEntry();
-  // Place compare and branch in CheckBlock if given, ExpandSCEVs in Entry.
-  VPBasicBlock *CheckVPBB = CheckBlock ? CheckBlock : EntryVPBB;
-  VPBuilder Builder(CheckVPBB);
+  VPBuilder Builder(CheckBlock);
   VPValue *TripCountCheck = Plan.getFalse();
   const SCEV *Step = GetMinTripCount();
   // TripCountCheck = false, folding tail implies positive vector trip
@@ -1510,9 +1497,9 @@ void VPlanTransforms::addMinimumIterationCheck(
                                     TripCount, Step)) {
       // Generate the minimum iteration check only if we cannot prove the
       // check is known to be true, or known to be false.
-      // ExpandSCEV must be placed in Entry.
-      VPBuilder SCEVBuilder(EntryVPBB, getExpandSCEVInsertPt(EntryVPBB));
-      VPValue *MinTripCountVPV = SCEVBuilder.createExpandSCEV(Step);
+      VPValue *MinTripCountVPV =
+          VPSCEVExpander(Builder, Plan, *PSE.getSE(), *OrigLoop, DL)
+              .expand(Step);
       TripCountCheck = Builder.createICmp(
           CmpPred, TripCountVPV, MinTripCountVPV, DL, "min.iters.check");
     } // else step known to be < trip count, use TripCountCheck preset to false.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 673355ffb1c96..e73a44740902f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -5016,6 +5016,28 @@ void VPlanTransforms::materializeFactors(VPlan &Plan, VPBasicBlock *VectorPH,
   VFxUF.replaceAllUsesWith(MulByUF);
 }
 
+void VPlanTransforms::expandSCEVExpressions(VPlan &Plan, ScalarEvolution &SE,
+                                            Loop &OrigLoop) {
+  auto *Entry = cast<VPIRBasicBlock>(Plan.getEntry());
+  VPBuilder Builder(Entry, Entry->begin());
+  VPSCEVExpander Expander(Builder, Plan, SE, OrigLoop);
+
+  // Expand VPExpandSCEVRecipes to VPInstructions using VPSCEVExpander. During
+  // the transition, unsupported SCEV expressions are still expanded to
+  // VPExpandSCEVRecipes.
+  for (VPRecipeBase &R : make_early_inc_range(*Entry)) {
+    auto *ExpSCEV = dyn_cast<VPExpandSCEVRecipe>(&R);
+    if (!ExpSCEV)
+      continue;
+    Builder.setInsertPoint(ExpSCEV);
+    VPValue *Expanded = Expander.expand(ExpSCEV->getSCEV());
+    ExpSCEV->replaceAllUsesWith(Expanded);
+    if (Plan.getTripCount() == ExpSCEV)
+      Plan.resetTripCount(Expanded);
+    ExpSCEV->eraseFromParent();
+  }
+}
+
 DenseMap<const SCEV *, Value *>
 VPlanTransforms::expandSCEVs(VPlan &Plan, ScalarEvolution &SE) {
   SCEVExpander Expander(SE, "induction", /*PreserveLCSSA=*/false);
@@ -5023,16 +5045,15 @@ VPlanTransforms::expandSCEVs(VPlan &Plan, ScalarEvolution &SE) {
   auto *Entry = cast<VPIRBasicBlock>(Plan.getEntry());
   BasicBlock *EntryBB = Entry->getIRBasicBlock();
   DenseMap<const SCEV *, Value *> ExpandedSCEVs;
+  // Expand remaining VPExpandSCEVRecipes to IR instructions using SCEVExpander.
   for (VPRecipeBase &R : make_early_inc_range(*Entry)) {
-    if (isa<VPIRInstruction, VPIRPhi>(&R))
-      continue;
     auto *ExpSCEV = dyn_cast<VPExpandSCEVRecipe>(&R);
     if (!ExpSCEV)
-      break;
+      continue;
     const SCEV *Expr = ExpSCEV->getSCEV();
     Value *Res =
         Expander.expandCodeFor(Expr, Expr->getType(), EntryBB->getTerminator());
-    ExpandedSCEVs[ExpSCEV->getSCEV()] = Res;
+    ExpandedSCEVs[Expr] = Res;
     VPValue *Exp = Plan.getOrAddLiveIn(Res);
     ExpSCEV->replaceAllUsesWith(Exp);
     if (Plan.getTripCount() == ExpSCEV)
@@ -5040,8 +5061,7 @@ VPlanTransforms::expandSCEVs(VPlan &Plan, ScalarEvolution &SE) {
     ExpSCEV->eraseFromParent();
   }
   assert(none_of(*Entry, IsaPred<VPExpandSCEVRecipe>) &&
-         "VPExpandSCEVRecipes must be at the beginning of the entry block, "
-         "before any VPIRInstructions");
+         "all VPExpandSCEVRecipes must have been expanded");
   // Add IR instructions in the entry basic block but not in the VPIRBasicBlock
   // to the VPIRBasicBlock.
   auto EI = Entry->begin();
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
index c66d83d3177d3..bc8d5ba879f39 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@@ -164,15 +164,13 @@ struct VPlanTransforms {
   /// be added to the middle block.
   LLVM_ABI_FOR_TEST static void addMiddleCheck(VPlan &Plan, bool TailFolded);
 
-  // Create a check to \p Plan to see if the vector loop should be executed.
-  // If \p CheckBlock is non-null, the compare and branch are placed there;
-  // ExpandSCEV recipes are always placed in Entry.
+  // Create a check in \p CheckBlock to see if the vector loop should be
+  // executed.
   static void addMinimumIterationCheck(
       VPlan &Plan, ElementCount VF, unsigned UF,
       ElementCount MinProfitableTripCount, bool RequiresScalarEpilogue,
       bool TailFolded, Loop *OrigLoop, const uint32_t *MinItersBypassWeights,
-      DebugLoc DL, PredicatedScalarEvolution &PSE,
-      VPBasicBlock *CheckBlock = nullptr);
+      DebugLoc DL, PredicatedScalarEvolution &PSE, VPBasicBlock *CheckBlock);
 
   /// Add a new check block before the vector preheader to \p Plan to check if
   /// the main vector loop should be executed (TC >= VF * UF).
@@ -429,10 +427,18 @@ struct VPlanTransforms {
   static void materializeFactors(VPlan &Plan, VPBasicBlock *VectorPH,
                                  ElementCount VF);
 
-  /// Expand VPExpandSCEVRecipes in \p Plan's entry block. Each
-  /// VPExpandSCEVRecipe is replaced with a live-in wrapping the expanded IR
-  /// value. A mapping from SCEV expressions to their expanded IR value is
-  /// returned.
+  /// Try to expand VPExpandSCEVRecipes in \p Plan's entry block to
+  /// VPInstructions. Recipes that cannot be expanded (casts, min/max) are kept
+  /// for later IR-level expansion by expandSCEVs. Should run before CSE so
+  /// that duplicate expansions are eliminated. Existing loop-invariant IR
+  /// values are reused as live-ins.
+  static void expandSCEVExpressions(VPlan &Plan, ScalarEvolution &SE,
+                                    Loop &OrigLoop);
+
+  /// Expand remaining VPExpandSCEVRecipes in \p Plan's entry block using
+  /// SCEVExpander. Each VPExpandSCEVRecipe is replaced with a live-in wrapping
+  /// the expanded IR value. A mapping from SCEV expressions to their expanded
+  /// IR value is returned.
   static DenseMap<const SCEV *, Value *> expandSCEVs(VPlan &Plan,
                                                      ScalarEvolution &SE);
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp b/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
index 24adcea1040b5..38e0f18284899 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
@@ -7,14 +7,17 @@
 //===----------------------------------------------------------------------===//
 
 #include "VPlanUtils.h"
+#include "LoopVectorizationPlanner.h"
 #include "VPlanAnalysis.h"
 #include "VPlanCFG.h"
 #include "VPlanDominatorTree.h"
 #include "VPlanPatternMatch.h"
 #include "llvm/ADT/TypeSwitch.h"
+#include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/MemoryLocation.h"
 #include "llvm/Analysis/ScalarEvolutionExpressions.h"
 #include "llvm/Analysis/ScalarEvolutionPatternMatch.h"
+#include "llvm/Transforms/Utils/ScalarEvolutionExpander.h"
 
 using namespace llvm;
 using namespace llvm::VPlanPatternMatch;
@@ -839,3 +842,62 @@ bool vputils::isUsedByLoadStoreAddress(const VPValue *V) {
   }
   return false;
 }
+
+/// Try to find a loop-invariant IR value for \p S in \p OrigLoop's preheader
+/// that can be reused. Returns the corresponding live-in VPValue, or nullptr
+/// if no reusable IR value is found.
+VPValue *VPSCEVExpander::tryToReuseIRValue(const SCEV *S) {
+  if (isa<SCEVConstant, SCEVUnknown>(S))
+    return nullptr;
+  BasicBlock *PH = OrigLoop.getLoopPreheader();
+  if (!PH)
+    return nullptr;
+  for (Value *V : SE.getSCEVValues(S)) {
+    if (V->getType() != S->getType())
+      continue;
+    // Non-instruction values (arguments, globals) are always reusable.
+    auto *I = dyn_cast<Instruction>(V);
+    if (!I)
+      return Plan.getOrAddLiveIn(V);
+    // Only reuse instructions in the loop preheader, as instructions in
+    // sibling branches may not dominate this loop's preheader.
+    if (I->getParent() != PH)
+      continue;
+    SmallVector<Instruction *> DropPoisonGeneratingInsts;
+    if (!SE.canReuseInstruction(S, I, DropPoisonGeneratingInsts))
+      continue;
+    for (Instruction *DropI : DropPoisonGeneratingInsts)
+      SCEVExpander::dropPoisonGeneratingAnnotationsAndReinfer(SE, DropI);
+    return Plan.getOrAddLiveIn(V);
+  }
+  return nullptr;
+}
+
+VPValue *VPSCEVExpander::expand(const SCEV *S) {
+  if (VPValue *V = tryToReuseIRValue(S))
+    return V;
+
+  switch (S->getSCEVType()) {
+  case scConstant:
+    return Plan.getOrAddLiveIn(cast<SCEVConstant>(S)->getValue());
+  case scUnknown:
+    return Plan.getOrAddLiveIn(cast<SCEVUnknown>(S)->getValue());
+  case scVScale:
+    return Builder.createNaryOp(VPInstruction::VScale, {}, S->getType());
+  case scMulExpr: {
+    auto *Mul = cast<SCEVMulExpr>(S);
+    VPIRFlags::WrapFlagsTy WrapFlags(Mul->hasNoUnsignedWrap(),
+                                     Mul->hasNoSignedWrap());
+    VPValue *Result = expand(Mul->getOperand(0));
+    for (const SCEVUse &Op : drop_begin(Mul->operands()))
+      Result = Builder.createOverflowingOp(Instruction::Mul,
+                                           {Result, expand(Op)}, WrapFlags, DL);
+    return Result;
+  }
+  default:
+    // Unsupported SCEV kind; fall back to VPExpandSCEVRecipe.
+    assert(Builder.getInsertBlock() == Plan.getEntry() &&
+           "VPExpandSCEVRecipe fallback requires insertion in the entry block");
+    return Builder.createExpandSCEV(S);
+  }
+}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUtils.h b/llvm/lib/Transforms/Vectorize/VPlanUtils.h
index 21da1864d5d6a..fbcb972370c36 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUtils.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanUtils.h
@@ -176,6 +176,29 @@ VPSingleDefRecipe *findHeaderMask(VPlan &Plan);
 
 } // namespace vputils
 
+/// Lightweight SCEV-to-VPlan expander. Converts SCEV expressions into
+/// VPInstructions where possible, falling back to VPExpandSCEVRecipe for
+/// unsupported expressions (casts, min/max).
+class VPSCEVExpander {
+  VPBuilder &Builder;
+  VPlan &Plan;
+  ScalarEvolution &SE;
+  Loop &OrigLoop;
+  DebugLoc DL;
+
+  /// Try to find a loop-invariant IR value in OrigLoop's preheader whose
+  /// SCEV matches \p S. Returns the corresponding live-in VPValue, or nullptr
+  /// if none is found.
+  VPValue *tryToReuseIRValue(const SCEV *S);
+
+public:
+  VPSCEVExpander(VPBuilder &Builder, VPlan &Plan, ScalarEvolution &SE,
+                 Loop &OrigLoop, DebugLoc DL = DebugLoc())
+      : Builder(Builder), Plan(Plan), SE(SE), OrigLoop(OrigLoop), DL(DL) {}
+
+  /// Expand \p S into VPlan recipes using the builder.
+  VPValue *expand(const SCEV *S);
+};
 //===----------------------------------------------------------------------===//
 // Utilities for modifying predecessors and successors of VPlan blocks.
 //===----------------------------------------------------------------------===//
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
index 690a61e3e05c2..f877f0ba9cfee 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll
@@ -542,8 +542,7 @@ define void @multiple_exit_conditions(ptr %src, ptr noalias %dst) #1 {
 ; DEFAULT-NEXT:    [[MIN_ITERS_CHECK1:%.*]] = icmp ult i64 257, [[TMP3]]
 ; DEFAULT-NEXT:    br i1 [[MIN_ITERS_CHECK1]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
 ; DEFAULT:       [[VECTOR_PH]]:
-; DEFAULT-NEXT:    [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; DEFAULT-NEXT:    [[TMP11:%.*]] = shl nuw i64 [[TMP4]], 2
+; DEFAULT-NEXT:    [[TMP11:%.*]] = shl nuw i64 [[TMP2]], 2
 ; DEFAULT-NEXT:    [[TMP5:%.*]] = shl nuw i64 [[TMP11]], 2
 ; DEFAULT-NEXT:    [[N_MOD_VF:%.*]] = urem i64 257, [[TMP5]]
 ; DEFAULT-NEXT:    [[N_VEC:%.*]] = sub i64 257, [[N_MOD_VF]]
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll
index 9fc9f03461b69..fcc646cb49137 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/conditional-scalar-assignment.ll
@@ -31,8 +31,7 @@ define i32 @simple_csa_int_select(i64 %N, ptr %data, i32 %a) {
 ; SVE-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N]], [[TMP1]]
 ; SVE-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
 ; SVE:       [[VECTOR_PH]]:
-; SVE-NEXT:    [[TMP2:%.*]] = call i64 @llvm.vscale.i64()
-; SVE-NEXT:    [[TMP3:%.*]] = shl nuw i64 [[TMP2]], 2
+; SVE-NEXT:    [[TMP3:%.*]] = shl nuw i64 [[TMP0]], 2
 ; SVE-NEXT:    [[N_MOD_VF:%.*]] = urem i64 [[N]], [[TMP3]]
 ; SVE-NEXT:    [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
 ; SVE-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[A]], i64 0
@@ -120,8 +119,7 @@ define ptr @simple_csa_ptr_select(i64 %N, ptr %data, i64 %a, ptr %init) ...
[truncated]

@github-actions

github-actions Bot commented May 14, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@fhahn fhahn force-pushed the vplan-expand-scev-to-vpinstructions-before-cse branch from 9b7eeb8 to 4e13a7d Compare May 14, 2026 09:57
Comment thread llvm/include/llvm/Analysis/ScalarEvolution.h
Comment thread llvm/lib/Transforms/Vectorize/LoopVectorize.cpp Outdated
Comment on lines +5025 to +5027
// Expand VPExpandSCEVRecipes to VPInstructions using VPSCEVExpander. During
// the transition, unsupported SCEV expressions are still expanded to
// VPExpandSCEVRecipes.
Contributor

Hm, this is highly confusing! We don't "expand" to ExpandSCEVRecipes; we simply fall back to not expanding the ExpandSCEVRecipe to VPInstructions, and use SCEVExpander to expand to Instructions?

Contributor Author

The early transform now only expands VPExpandSCEV to VPInstructions (and skips if it cannot be expanded).

Contributor

Would be good to update this comment?

Contributor Author

updated thanks

Comment on lines +898 to +901
// Unsupported SCEV kind; fall back to VPExpandSCEVRecipe.
assert(Builder.getInsertBlock() == Plan.getEntry() &&
"VPExpandSCEVRecipe fallback requires insertion in the entry block");
return Builder.createExpandSCEV(S);
Contributor

Just return nullptr here?

Contributor Author

Updated to do that for now; this also requires adding a bailout above if any operand is nullptr, though. Updated in #189455.
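The pattern discussed in this thread (return nullptr for an unsupported expression kind, and bail out of a parent expansion as soon as any operand comes back as nullptr) can be sketched as follows. This is a toy illustration only: `Expr`, `Value`, `ToyExpander` and `tryExpand` are hypothetical stand-ins, not the SCEV or VPlan classes touched by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins; these are not LLVM's SCEV or VPlan types.
enum class Kind { Constant, Unknown, Mul, Add };

struct Expr {
  Kind K;
  std::string Name;              // Printable name, used for leaves only.
  std::vector<const Expr *> Ops; // Operands, used for n-ary nodes only.
};

struct Value {
  std::string Text;
};

class ToyExpander {
  std::vector<std::unique_ptr<Value>> Storage; // Owns every created value.

  Value *make(std::string Text) {
    Storage.push_back(std::make_unique<Value>(Value{std::move(Text)}));
    return Storage.back().get();
  }

public:
  // Returns the expanded value, or nullptr if E or any operand is unsupported.
  Value *tryExpand(const Expr &E) {
    switch (E.K) {
    case Kind::Constant:
    case Kind::Unknown:
      return make(E.Name);
    case Kind::Mul: {
      assert(!E.Ops.empty() && "a product needs at least one operand");
      std::vector<Value *> Ops;
      for (const Expr *Op : E.Ops) {
        Value *V = tryExpand(*Op);
        if (!V)
          return nullptr; // Bail out: one operand could not be expanded.
        Ops.push_back(V);
      }
      std::string Text = "(";
      for (std::size_t I = 0; I < Ops.size(); ++I)
        Text += (I ? " * " : "") + Ops[I]->Text;
      return make(Text + ")");
    }
    default:
      return nullptr; // Unsupported kind; the caller keeps its fallback.
    }
  }
};
```

In this sketch, a product whose operands are all leaves expands to a single value, while a product containing an `Add` operand yields nullptr, so the caller can leave the original expression untouched.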

VPValue *VPSCEVExpander::tryToReuseIRValue(const SCEV *S) {
if (isa<SCEVConstant, SCEVUnknown>(S))
return nullptr;
BasicBlock *PH = OrigLoop.getLoopPreheader();
Contributor

Hm, can we not use Plan.getScalarPreheader()? Confused about why we need the OrigLoop here?

Contributor Author

We can use the entry block, which is a VPIRBasicBlock at this point. The scalar preheader retrieved from VPlan only dominates the scalar loop, but not the vector loop.

Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
Comment thread llvm/lib/Transforms/Vectorize/VPlanTransforms.h
@fhahn fhahn force-pushed the vplan-expand-scev-to-vpinstructions-before-cse branch 2 times, most recently from 49bb901 to c3f77d5 Compare May 16, 2026 21:07
Contributor

@artagnon artagnon left a comment

Let's attack the patch this is dependent on for now -- I've left a review.

Comment thread llvm/lib/Transforms/Vectorize/VPlanTransforms.h Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.h Outdated
@fhahn fhahn force-pushed the vplan-expand-scev-to-vpinstructions-before-cse branch 3 times, most recently from c706cae to c726b5e Compare May 24, 2026 19:50
Add expandSCEVExpressions transform that converts VPExpandSCEVRecipes
to VPInstructions where possible, running before CSE so duplicates with
other SCEV expansions (e.g., from addMinimumIterationCheck) are
eliminated. This also reuses existing loop-invariant IR values via
ScalarEvolution::getSCEVValues to avoid redundant computation.

Currently limited to SCEVMulExpr (along with constants, unknowns, and
vscale). Support for SCEVAddExpr and SCEVUDivExpr will follow in
subsequent patches.
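To illustrate why expanding into plain instructions before CSE helps, here is a minimal sketch using a toy IR. All names (`Inst`, `ToyBuilder`, `emit`, `expandVScaleMul`) are hypothetical and not part of VPlan. For brevity the sketch folds deduplication into emission, whereas in the PR CSE is a separate transform that runs after the expansion.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical toy IR, not VPlan: an instruction is an opcode plus operand ids.
struct Inst {
  std::string Opcode;
  int LHS; // Operand id, or -1 if unused.
  int RHS; // Operand id, or -1 if unused.
};

class ToyBuilder {
  std::vector<Inst> Insts;
  std::map<std::tuple<std::string, int, int>, int> Seen; // Deduplication table.

public:
  // Emit an instruction, or return the id of an identical existing one.
  int emit(const std::string &Opcode, int LHS = -1, int RHS = -1) {
    assert(!Opcode.empty() && "opcode must be named");
    auto Key = std::make_tuple(Opcode, LHS, RHS);
    auto It = Seen.find(Key);
    if (It != Seen.end())
      return It->second;
    Insts.push_back({Opcode, LHS, RHS});
    int Id = static_cast<int>(Insts.size()) - 1;
    Seen.emplace(Key, Id);
    return Id;
  }

  // Expand the product "vscale * Factor" into explicit instructions.
  int expandVScaleMul(int Factor) {
    int VScale = emit("vscale");
    int C = emit("const " + std::to_string(Factor));
    return emit("mul", VScale, C);
  }

  std::size_t size() const { return Insts.size(); }
};
```

Expanding `vscale * 4` twice through the same builder yields the same id and leaves three instructions in total, which mirrors the test updates in this PR where a second `@llvm.vscale.i64()` call disappears from the vector preheader. An opaque "expand later" node would hide the shared subexpression from such a table.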
@fhahn fhahn force-pushed the vplan-expand-scev-to-vpinstructions-before-cse branch 2 times, most recently from a718890 to 9d14f49 Compare May 28, 2026 15:39
@fhahn fhahn force-pushed the vplan-expand-scev-to-vpinstructions-before-cse branch from 9d14f49 to 7229fd0 Compare May 28, 2026 15:45
Contributor Author

@fhahn fhahn left a comment

ping

Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.h Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Outdated
class VPSCEVExpander {
VPBuilder &Builder;
ScalarEvolution &SE;
DebugLoc DL;
Contributor

Suggested change
-DebugLoc DL;
+const DebugLoc &DL;

Contributor Author

DebugLoc is already a wrapper around a pointer; IIUC it is usually used directly by value to avoid another level of indirection.
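The by-value versus by-reference trade-off mentioned here can be sketched with a toy handle type. `Node`, `Loc`, `ByValue` and `ByRef` are hypothetical stand-ins chosen for illustration; this is not llvm::DebugLoc and makes no claim about its actual layout.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a handle type that wraps a single pointer.
struct Node {
  int Line;
};

class Loc {
  const Node *N = nullptr;

public:
  Loc() = default;
  explicit Loc(const Node *N) : N(N) {}
  int line() const { return N ? N->Line : 0; }
};

static_assert(sizeof(Loc) == sizeof(const Node *), "handle is one pointer wide");
static_assert(std::is_trivially_copyable<Loc>::value,
              "copying the handle is a plain pointer copy");

// Stores the handle by value: a single hop to the Node, and no dependence
// on the lifetime of the caller's Loc object.
class ByValue {
  Loc DL;

public:
  explicit ByValue(Loc DL) : DL(DL) {}
  int line() const { return DL.line(); }
};

// Stores a reference to the handle: one extra indirection, and the
// referenced Loc must outlive this object.
class ByRef {
  const Loc &DL;

public:
  explicit ByRef(const Loc &DL) : DL(DL) {}
  int line() const { return DL.line(); }
};
```

If the caller's `Loc` is later reassigned, `ByValue` keeps reporting the line it was constructed with, while `ByRef` observes the change. For a member of a long-lived helper class, the copy is typically the simpler choice.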

Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.h Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp Outdated
Contributor

@artagnon artagnon left a comment

LGTM, thanks! DL as a non-const-ref is also fine, in case there is some issue changing it?

Contributor Author

@fhahn fhahn left a comment

> LGTM, thanks! DL as a non-const-ref is also fine, in case there is some issue changing it?

Responded inline, but forgot to submit the comments.

DebugLoc is already a wrapper around a pointer; IIUC it is usually used directly by value to avoid another level of indirection.

Comment thread llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.cpp Outdated
Comment thread llvm/lib/Transforms/Vectorize/VPlanUtils.h Outdated
@fhahn fhahn merged commit ee87266 into llvm:main May 29, 2026
10 checks passed
@fhahn fhahn deleted the vplan-expand-scev-to-vpinstructions-before-cse branch May 29, 2026 20:14
llvm-upstreamsync Bot pushed a commit to qualcomm/cpullvm-toolchain that referenced this pull request May 29, 2026
llvm-sync Bot pushed a commit to arm/arm-toolchain that referenced this pull request May 29, 2026
@llvm-ci

llvm-ci commented May 29, 2026

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/35076

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests/171/430' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/unittests/Support/./SupportTests-LLVM-Unit-3022194-171-430.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=430 GTEST_SHARD_INDEX=171 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/unittests/Support/./SupportTests
--

Script:
--
/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/unittests/Support/./SupportTests --gtest_filter=ProgramEnvTest.TestLockFile
--
/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/unittests/Support/ProgramTest.cpp:571: Failure
Value of: Error.empty()
  Actual: false
Expected: true


/home/tcwg-buildbot/worker/clang-aarch64-quick/llvm/llvm/unittests/Support/ProgramTest.cpp:571
Value of: Error.empty()
  Actual: false
Expected: true



********************


@llvm-ci

llvm-ci commented May 29, 2026

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime-2 running on rocm-worker-hw-02 while building llvm at step 6 "test-openmp".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/10/builds/29148

Here is the relevant piece of the build log for the reference
Step 6 (test-openmp) failure: test (failure)
******************** TEST 'libarcher :: races/lock-unrelated.c' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 13
/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp  -gdwarf-4 -O1 -fsanitize=thread  -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src   /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp 2>&1 | tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/clang -fopenmp -gdwarf-4 -O1 -fsanitize=thread -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests -I /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/runtime/src /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c -o /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp -latomic
# note: command had no output on stdout or stderr
# executed command: env TSAN_OPTIONS=ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1 /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/deflake.bash /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp
# note: command had no output on stdout or stderr
# executed command: tee /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/runtimes/runtimes-bins/openmp/tools/archer/tests/races/Output/lock-unrelated.c.tmp.log
# note: command had no output on stdout or stderr
# executed command: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.build/./bin/FileCheck /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# .---command stderr------------
# | /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c:47:11: error: CHECK: expected string not found in input
# | // CHECK: ThreadSanitizer: reported {{[1-7]}} warnings
# |           ^
# | <stdin>:23:5: note: scanning from here
# | DONE
# |     ^
# | <stdin>:24:1: note: possible intended match here
# | ThreadSanitizer: thread T4 finished with ignores enabled, created at:
# | ^
# | 
# | Input file: <stdin>
# | Check file: /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             .
# |             .
# |             .
# |            18:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1082:3 (lock-unrelated.c.tmp+0xa557a) 
# |            19:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcbfa2) 
# |            20:  
# |            21: SUMMARY: ThreadSanitizer: data race /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/openmp/tools/archer/tests/races/lock-unrelated.c:31:8 in main.omp_outlined_debug__ 
# |            22: ================== 
# |            23: DONE 
# | check:47'0         X error: no match found
# |            24: ThreadSanitizer: thread T4 finished with ignores enabled, created at: 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# | check:47'1     ?                                                                      possible intended match
# |            25:  #0 pthread_create /home/botworker/builds/openmp-offload-amdgpu-runtime-2/llvm.src/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1082:3 (lock-unrelated.c.tmp+0xa557a) 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            26:  #1 __kmp_create_worker z_Linux_util.cpp (libomp.so+0xcbfa2) 
# | check:47'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# |            27:  
...
