LV: avoid printing erroneous zero TC #94164

artagnon · 2024-06-02T17:17:54Z

SCEV's getSmallConstantTripCount previously returned a special zero value for an invalid trip count (TC), and LoopVectorize (LV) didn't check for this before reporting in its debug output that a trip count was found. Now that the API makes an invalid trip count explicit, don't erroneously print a trip count of zero.
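The intended behaviour can be sketched outside of LLVM. This is only an illustration under stated assumptions: `describeTripCount` is a hypothetical helper, and a `std::ostringstream` stands in for LLVM's `LLVM_DEBUG(dbgs() << ...)` machinery. The point is that the "Found trip count" message is produced only when the optional actually holds a value.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical stand-in for the LV debug output; a string stream replaces
// LLVM_DEBUG/dbgs() so the sketch compiles without LLVM.
std::string describeTripCount(std::optional<unsigned> TC) {
  std::ostringstream OS;
  // Only claim that a trip count was found when one is actually present.
  if (TC)
    OS << "LV: Found trip count: " << *TC << '\n';
  return OS.str();
}
```

With an empty optional the helper returns an empty string, so no misleading "trip count: 0" line appears.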

-- 8< --
Based on #94162.

ScalarEvolution::getSmallConstantTripCount and getSmallConstantMaxTripCount use a special return value of zero to indicate that the trip count is unknown or that it wrapped around as an unsigned value. This can cause confusion if callers aren't careful. Change them to never wrap, and to return a std::optional that holds a value only for valid trip counts. This patch doesn't show the benefits of the change; it uses value_or(0) to migrate many callers in what is a non-functional change. Improvements are planned for future patches.
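As a rough sketch of the idea, assuming a plain `std::uint64_t` backedge-taken count in place of a `SCEVConstant`: both function names below are hypothetical and are not part of ScalarEvolution. The first returns the trip count (backedge-taken count plus one) only when it fits in `unsigned`; the second shows how `value_or(0)` recovers the old zero-sentinel behaviour for callers that have not been migrated.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: the trip count is one more than the backedge-taken
// count, and is returned only when it fits in `unsigned` without wrapping.
std::optional<unsigned> tripCountFromBackedgeCount(std::uint64_t BackedgeTaken) {
  if (BackedgeTaken < std::numeric_limits<unsigned>::max())
    return static_cast<unsigned>(BackedgeTaken) + 1;
  return std::nullopt;
}

// Hypothetical legacy-style caller: value_or(0) maps "no valid trip count"
// back onto the old zero sentinel, so existing checks keep working.
unsigned legacyTripCount(std::uint64_t BackedgeTaken) {
  return tripCountFromBackedgeCount(BackedgeTaken).value_or(0);
}
```

A backedge-taken count equal to the maximum `unsigned` value would wrap to zero when incremented, so the sketch reports it as absent rather than as a trip count of zero.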

llvmbot · 2024-06-02T17:18:25Z

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-risc-v

Author: Ramkumar Ramachandra (artagnon)

Patch is 25.68 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94164.diff

15 Files Affected:

(modified) llvm/include/llvm/Analysis/ScalarEvolution.h (+11-15)
(modified) llvm/lib/Analysis/Loads.cpp (+2-2)
(modified) llvm/lib/Analysis/LoopCacheAnalysis.cpp (+4-3)
(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+15-12)
(modified) llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp (+2-3)
(modified) llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp (+1-1)
(modified) llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp (+2-1)
(modified) llvm/lib/Transforms/Scalar/LoopFuse.cpp (+2-2)
(modified) llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp (+4-4)
(modified) llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp (+3-2)
(modified) llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp (+4-3)
(modified) llvm/lib/Transforms/Utils/LoopUnroll.cpp (+2-2)
(modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+33-25)
(modified) llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll (-2)
(modified) llvm/unittests/Analysis/UnrollAnalyzerTest.cpp (+1-1)

diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 72f3d94542496..6b51407f95a9b 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -810,27 +810,23 @@ class ScalarEvolution {
                                         const Loop *L);
 
   /// Returns the exact trip count of the loop if we can compute it, and
-  /// the result is a small constant.  '0' is used to represent an unknown
-  /// or non-constant trip count.  Note that a trip count is simply one more
+  /// the result is a small constant. Note that a trip count is simply one more
   /// than the backedge taken count for the loop.
-  unsigned getSmallConstantTripCount(const Loop *L);
+  std::optional<unsigned> getSmallConstantTripCount(const Loop *L);
 
   /// Return the exact trip count for this loop if we exit through ExitingBlock.
-  /// '0' is used to represent an unknown or non-constant trip count.  Note
-  /// that a trip count is simply one more than the backedge taken count for
-  /// the same exit.
-  /// This "trip count" assumes that control exits via ExitingBlock. More
-  /// precisely, it is the number of times that control will reach ExitingBlock
-  /// before taking the branch. For loops with multiple exits, it may not be
-  /// the number times that the loop header executes if the loop exits
-  /// prematurely via another branch.
-  unsigned getSmallConstantTripCount(const Loop *L,
-                                     const BasicBlock *ExitingBlock);
+  /// Note that a trip count is simply one more than the backedge taken count
+  /// for the same exit. This "trip count" assumes that control exits via
+  /// ExitingBlock. More precisely, it is the number of times that control will
+  /// reach ExitingBlock before taking the branch. For loops with multiple
+  /// exits, it may not be the number times that the loop header executes if the
+  /// loop exits prematurely via another branch.
+  std::optional<unsigned>
+  getSmallConstantTripCount(const Loop *L, const BasicBlock *ExitingBlock);
 
   /// Returns the upper bound of the loop trip count as a normal unsigned
   /// value.
-  /// Returns 0 if the trip count is unknown or not constant.
-  unsigned getSmallConstantMaxTripCount(const Loop *L);
+  std::optional<unsigned> getSmallConstantMaxTripCount(const Loop *L);
 
   /// Returns the largest constant divisor of the trip count as a normal
   /// unsigned value, if possible. This means that the actual trip count is
diff --git a/llvm/lib/Analysis/Loads.cpp b/llvm/lib/Analysis/Loads.cpp
index 478302d687b53..0549216c7a425 100644
--- a/llvm/lib/Analysis/Loads.cpp
+++ b/llvm/lib/Analysis/Loads.cpp
@@ -287,7 +287,7 @@ bool llvm::isDereferenceableAndAlignedInLoop(LoadInst *LI, Loop *L,
   if (!Step)
     return false;
 
-  auto TC = SE.getSmallConstantMaxTripCount(L);
+  std::optional<unsigned> TC = SE.getSmallConstantMaxTripCount(L);
   if (!TC)
     return false;
 
@@ -301,7 +301,7 @@ bool llvm::isDereferenceableAndAlignedInLoop(LoadInst *LI, Loop *L,
   // same.
   // For patterns with gaps (i.e. non unit stride), we are
   // accessing EltSize bytes at every Step.
-  APInt AccessSize = TC * Step->getAPInt();
+  APInt AccessSize = *TC * Step->getAPInt();
 
   assert(SE.isLoopInvariant(AddRec->getStart(), L) &&
          "implied by addrec definition");
diff --git a/llvm/lib/Analysis/LoopCacheAnalysis.cpp b/llvm/lib/Analysis/LoopCacheAnalysis.cpp
index 7ca9f15ad5fca..27803365a12f8 100644
--- a/llvm/lib/Analysis/LoopCacheAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopCacheAnalysis.cpp
@@ -568,9 +568,10 @@ CacheCost::CacheCost(const LoopVectorTy &Loops, const LoopInfo &LI,
   assert(!Loops.empty() && "Expecting a non-empty loop vector.");
 
   for (const Loop *L : Loops) {
-    unsigned TripCount = SE.getSmallConstantTripCount(L);
-    TripCount = (TripCount == 0) ? DefaultTripCount : TripCount;
-    TripCounts.push_back({L, TripCount});
+    std::optional<unsigned> TripCount = SE.getSmallConstantTripCount(L);
+    if (!TripCount)
+      TripCount = DefaultTripCount;
+    TripCounts.push_back({L, *TripCount});
   }
 
   calculateCacheFootprint();
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index e46d7183a2a35..1927ae62cda8e 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -6463,7 +6463,7 @@ getRangeForUnknownRecurrence(const SCEVUnknown *U) {
     // TODO: Handle the power function forms some day.
     return FullSet;
 
-  unsigned TC = getSmallConstantMaxTripCount(L);
+  std::optional<unsigned> TC = getSmallConstantMaxTripCount(L);
   if (!TC || TC >= BitWidth)
     return FullSet;
 
@@ -6474,7 +6474,7 @@ getRangeForUnknownRecurrence(const SCEVUnknown *U) {
 
   // Compute total shift amount, being careful of overflow and bitwidths.
   auto MaxShiftAmt = KnownStep.getMaxValue();
-  APInt TCAP(BitWidth, TC-1);
+  APInt TCAP(BitWidth, *TC - 1);
   bool Overflow = false;
   auto TotalShift = MaxShiftAmt.umul_ov(TCAP, Overflow);
   if (Overflow)
@@ -8174,26 +8174,28 @@ const SCEV *ScalarEvolution::getTripCountFromExitCount(const SCEV *ExitCount,
   return getAddExpr(getTruncateOrZeroExtend(ExitCount, EvalTy), getOne(EvalTy));
 }
 
-static unsigned getConstantTripCount(const SCEVConstant *ExitCount) {
+static std::optional<unsigned>
+getConstantTripCount(const SCEVConstant *ExitCount) {
   if (!ExitCount)
-    return 0;
+    return std::nullopt;
 
   ConstantInt *ExitConst = ExitCount->getValue();
 
-  // Guard against huge trip counts.
-  if (ExitConst->getValue().getActiveBits() > 32)
-    return 0;
+  // Guaranteed to never overflow.
+  if (std::optional<uint64_t> V = ExitConst->getValue().tryZExtValue())
+    if (V < std::numeric_limits<unsigned>::max())
+      return ((unsigned)*V) + 1;
 
-  // In case of integer overflow, this returns 0, which is correct.
-  return ((unsigned)ExitConst->getZExtValue()) + 1;
+  return std::nullopt;
 }
 
-unsigned ScalarEvolution::getSmallConstantTripCount(const Loop *L) {
+std::optional<unsigned>
+ScalarEvolution::getSmallConstantTripCount(const Loop *L) {
   auto *ExitCount = dyn_cast<SCEVConstant>(getBackedgeTakenCount(L, Exact));
   return getConstantTripCount(ExitCount);
 }
 
-unsigned
+std::optional<unsigned>
 ScalarEvolution::getSmallConstantTripCount(const Loop *L,
                                            const BasicBlock *ExitingBlock) {
   assert(ExitingBlock && "Must pass a non-null exiting block!");
@@ -8204,7 +8206,8 @@ ScalarEvolution::getSmallConstantTripCount(const Loop *L,
   return getConstantTripCount(ExitCount);
 }
 
-unsigned ScalarEvolution::getSmallConstantMaxTripCount(const Loop *L) {
+std::optional<unsigned>
+ScalarEvolution::getSmallConstantMaxTripCount(const Loop *L) {
   const auto *MaxExitCount =
       dyn_cast<SCEVConstant>(getConstantMaxBackedgeTakenCount(L));
   return getConstantTripCount(MaxExitCount);
diff --git a/llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp b/llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
index f47fcff5d6025..82e843996ed2d 100644
--- a/llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
+++ b/llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
@@ -88,9 +88,8 @@ void HexagonTTIImpl::getPeelingPreferences(Loop *L, ScalarEvolution &SE,
                                            TTI::PeelingPreferences &PP) {
   BaseT::getPeelingPreferences(L, SE, PP);
   // Only try to peel innermost loops with small runtime trip counts.
-  if (L && L->isInnermost() && canPeel(L) &&
-      SE.getSmallConstantTripCount(L) == 0 &&
-      SE.getSmallConstantMaxTripCount(L) > 0 &&
+  if (L && L->isInnermost() && canPeel(L) && !SE.getSmallConstantTripCount(L) &&
+      SE.getSmallConstantMaxTripCount(L) &&
       SE.getSmallConstantMaxTripCount(L) <= 5) {
     PP.PeelCount = 2;
   }
diff --git a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
index 3fa35efc2d159..cea12d68a5b12 100644
--- a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
@@ -346,7 +346,7 @@ bool PPCTTIImpl::isHardwareLoopProfitable(Loop *L, ScalarEvolution &SE,
   SchedModel.init(ST);
 
   // Do not convert small short loops to CTR loop.
-  unsigned ConstTripCount = SE.getSmallConstantTripCount(L);
+  unsigned ConstTripCount = SE.getSmallConstantTripCount(L).value_or(0);
   if (ConstTripCount && ConstTripCount < SmallCTRLoopThreshold) {
     SmallPtrSet<const Value *, 32> EphValues;
     CodeMetrics::collectEphemeralValues(L, &AC, EphValues);
diff --git a/llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp b/llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp
index cc1f56014eee9..a135ee2515462 100644
--- a/llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp
@@ -316,7 +316,8 @@ bool LoopDataPrefetch::runOnLoop(Loop *L) {
   if (ItersAhead > getMaxPrefetchIterationsAhead())
     return MadeChange;
 
-  unsigned ConstantMaxTripCount = SE->getSmallConstantMaxTripCount(L);
+  unsigned ConstantMaxTripCount =
+      SE->getSmallConstantMaxTripCount(L).value_or(0);
   if (ConstantMaxTripCount && ConstantMaxTripCount < ItersAhead + 1)
     return MadeChange;
 
diff --git a/llvm/lib/Transforms/Scalar/LoopFuse.cpp b/llvm/lib/Transforms/Scalar/LoopFuse.cpp
index e0b224d5ef735..bf861d82925dd 100644
--- a/llvm/lib/Transforms/Scalar/LoopFuse.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopFuse.cpp
@@ -754,8 +754,8 @@ struct LoopFuser {
 
     // Currently only considering loops with a single exit point
     // and a non-constant trip count.
-    const unsigned TC0 = SE.getSmallConstantTripCount(FC0.L);
-    const unsigned TC1 = SE.getSmallConstantTripCount(FC1.L);
+    const unsigned TC0 = SE.getSmallConstantTripCount(FC0.L).value_or(0);
+    const unsigned TC1 = SE.getSmallConstantTripCount(FC1.L).value_or(0);
 
     // If any of the tripcounts are zero that means that loop(s) do not have
     // a single exit or a constant tripcount.
diff --git a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
index 35a17d6060c94..db938b86dc1b7 100644
--- a/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
@@ -3152,7 +3152,7 @@ void LSRInstance::CollectChains() {
 void LSRInstance::FinalizeChain(IVChain &Chain) {
   assert(!Chain.Incs.empty() && "empty IV chains are not allowed");
   LLVM_DEBUG(dbgs() << "Final Chain: " << *Chain.Incs[0].UserInst << "\n");
-  
+
   for (const IVInc &Inc : Chain) {
     LLVM_DEBUG(dbgs() << "        Inc: " << *Inc.UserInst << "\n");
     auto UseI = find(Inc.UserInst->operands(), Inc.IVOperand);
@@ -6352,7 +6352,7 @@ struct SCEVDbgValueBuilder {
       if (Op.getOp() != dwarf::DW_OP_LLVM_arg) {
         Op.appendToVector(DestExpr);
         continue;
-      } 
+      }
 
       DestExpr.push_back(dwarf::DW_OP_LLVM_arg);
       // `DW_OP_LLVM_arg n` represents the nth LocationOp in this SCEV,
@@ -6822,8 +6822,8 @@ canFoldTermCondOfLoop(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
   // the allowed cost with the loops trip count as best we can.
   const unsigned ExpansionBudget = [&]() {
     unsigned Budget = 2 * SCEVCheapExpansionBudget;
-    if (unsigned SmallTC = SE.getSmallConstantMaxTripCount(L))
-      return std::min(Budget, SmallTC);
+    if (std::optional<unsigned> SmallTC = SE.getSmallConstantMaxTripCount(L))
+      return std::min(Budget, *SmallTC);
     if (std::optional<unsigned> SmallTC = getLoopEstimatedTripCount(L))
       return std::min(Budget, *SmallTC);
     // Unknown trip count, assume long running by default.
diff --git a/llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp b/llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
index 7b4c54370e48a..a10ff525317f8 100644
--- a/llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopUnrollAndJamPass.cpp
@@ -363,9 +363,10 @@ tryToUnrollAndJamLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
   // Find trip count and trip multiple
   BasicBlock *Latch = L->getLoopLatch();
   BasicBlock *SubLoopLatch = SubLoop->getLoopLatch();
-  unsigned OuterTripCount = SE.getSmallConstantTripCount(L, Latch);
+  unsigned OuterTripCount = SE.getSmallConstantTripCount(L, Latch).value_or(0);
   unsigned OuterTripMultiple = SE.getSmallConstantTripMultiple(L, Latch);
-  unsigned InnerTripCount = SE.getSmallConstantTripCount(SubLoop, SubLoopLatch);
+  unsigned InnerTripCount =
+      SE.getSmallConstantTripCount(SubLoop, SubLoopLatch).value_or(0);
 
   // Decide if, and by how much, to unroll
   bool IsCountSetExplicitly = computeUnrollAndJamCount(
diff --git a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
index 10fc9e9303e89..b3506fc9ca9dd 100644
--- a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
@@ -1234,9 +1234,10 @@ tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI, ScalarEvolution &SE,
   SmallVector<BasicBlock *, 8> ExitingBlocks;
   L->getExitingBlocks(ExitingBlocks);
   for (BasicBlock *ExitingBlock : ExitingBlocks)
-    if (unsigned TC = SE.getSmallConstantTripCount(L, ExitingBlock))
+    if (std::optional<unsigned> TC =
+            SE.getSmallConstantTripCount(L, ExitingBlock))
       if (!TripCount || TC < TripCount)
-        TripCount = TripMultiple = TC;
+        TripCount = TripMultiple = *TC;
 
   if (!TripCount) {
     // If no exact trip count is known, determine the trip multiple of either
@@ -1269,7 +1270,7 @@ tryToUnrollLoop(Loop *L, DominatorTree &DT, LoopInfo *LI, ScalarEvolution &SE,
   unsigned MaxTripCount = 0;
   bool MaxOrZero = false;
   if (!TripCount) {
-    MaxTripCount = SE.getSmallConstantMaxTripCount(L);
+    MaxTripCount = SE.getSmallConstantMaxTripCount(L).value_or(0);
     MaxOrZero = SE.isBackedgeTakenCountMaxOrZero(L);
   }
 
diff --git a/llvm/lib/Transforms/Utils/LoopUnroll.cpp b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
index 1216538195fbd..d057b45a5bd1b 100644
--- a/llvm/lib/Transforms/Utils/LoopUnroll.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUnroll.cpp
@@ -477,7 +477,7 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
   L->getExitBlocks(ExitBlocks);
   std::vector<BasicBlock *> OriginalLoopBlocks = L->getBlocks();
 
-  const unsigned MaxTripCount = SE->getSmallConstantMaxTripCount(L);
+  const unsigned MaxTripCount = SE->getSmallConstantMaxTripCount(L).value_or(0);
   const bool MaxOrZero = SE->isBackedgeTakenCountMaxOrZero(L);
   unsigned EstimatedLoopInvocationWeight = 0;
   std::optional<unsigned> OriginalTripCount =
@@ -507,7 +507,7 @@ llvm::UnrollLoop(Loop *L, UnrollLoopOptions ULO, LoopInfo *LI,
       continue;
 
     ExitInfo &Info = ExitInfos.try_emplace(ExitingBlock).first->second;
-    Info.TripCount = SE->getSmallConstantTripCount(L, ExitingBlock);
+    Info.TripCount = SE->getSmallConstantTripCount(L, ExitingBlock).value_or(0);
     Info.TripMultiple = SE->getSmallConstantTripMultiple(L, ExitingBlock);
     if (Info.TripCount != 0) {
       Info.BreakoutTrip = Info.TripCount % ULO.Count;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6628f3d53f56a..275baf5a3adb3 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -428,7 +428,7 @@ static unsigned getReciprocalPredBlockProb() { return 2; }
 static std::optional<unsigned> getSmallBestKnownTC(ScalarEvolution &SE,
                                                    Loop *L) {
   // Check if exact trip count is known.
-  if (unsigned ExpectedTC = SE.getSmallConstantTripCount(L))
+  if (std::optional<unsigned> ExpectedTC = SE.getSmallConstantTripCount(L))
     return ExpectedTC;
 
   // Check if there is an expected trip count available from profile data.
@@ -437,7 +437,7 @@ static std::optional<unsigned> getSmallBestKnownTC(ScalarEvolution &SE,
       return *EstimatedTC;
 
   // Check if upper bound estimate is known.
-  if (unsigned ExpectedTC = SE.getSmallConstantMaxTripCount(L))
+  if (std::optional<unsigned> ExpectedTC = SE.getSmallConstantMaxTripCount(L))
     return ExpectedTC;
 
   return std::nullopt;
@@ -1629,14 +1629,14 @@ class LoopVectorizationCostModel {
   /// elements is a power-of-2 larger than zero. If scalable vectorization is
   /// disabled or unsupported, then the scalable part will be equal to
   /// ElementCount::getScalable(0).
-  FixedScalableVFPair computeFeasibleMaxVF(unsigned MaxTripCount,
+  FixedScalableVFPair computeFeasibleMaxVF(std::optional<unsigned> MaxTripCount,
                                            ElementCount UserVF,
                                            bool FoldTailByMasking);
 
   /// \return the maximized element count based on the targets vector
   /// registers and the loop trip-count, but limited to a maximum safe VF.
   /// This is a helper function of computeFeasibleMaxVF.
-  ElementCount getMaximizedVFForTarget(unsigned MaxTripCount,
+  ElementCount getMaximizedVFForTarget(std::optional<unsigned> MaxTripCount,
                                        unsigned SmallestType,
                                        unsigned WidestType,
                                        ElementCount MaxSafeVF,
@@ -2046,8 +2046,9 @@ class GeneratedRTChecks {
           unsigned BestTripCount = 2;
 
           // If exact trip count is known use that.
-          if (unsigned SmallTC = SE->getSmallConstantTripCount(OuterLoop))
-            BestTripCount = SmallTC;
+          if (std::optional<unsigned> SmallTC =
+                  SE->getSmallConstantTripCount(OuterLoop))
+            BestTripCount = *SmallTC;
           else if (LoopVectorizeWithBlockFrequency) {
             // Else use profile data if available.
             if (auto EstimatedTC = getLoopEstimatedTripCount(OuterLoop))
@@ -2382,7 +2383,7 @@ static bool isIndvarOverflowCheckKnownFalse(
   // We know the runtime overflow check is known false iff the (max) trip-count
   // is known and (max) trip-count + (VF * UF) does not overflow in the type of
   // the vector loop induction variable.
-  if (unsigned TC =
+  if (std::optional<unsigned> TC =
           Cost->PSE.getSE()->getSmallConstantMaxTripCount(Cost->TheLoop)) {
     uint64_t MaxVF = VF.getKnownMinValue();
     if (VF.isScalable()) {
@@ -2393,7 +2394,7 @@ static bool isIndvarOverflowCheckKnownFalse(
       MaxVF *= *MaxVScale;
     }
 
-    return (MaxUIntTripCount - TC).ugt(MaxVF * MaxUF);
+    return (MaxUIntTripCount - *TC).ugt(MaxVF * MaxUF);
   }
 
   return false;
@@ -4447,7 +4448,8 @@ LoopVectorizationCostModel::getMaxLegalScalableVF(unsigned MaxSafeElements) {
 }
 
 FixedScalableVFPair LoopVectorizationCostModel::computeFeasibleMaxVF(
-    unsigned MaxTripCount, ElementCount UserVF, bool FoldTailByMasking) {
+    std::optional<unsigned> MaxTripCount, ElementCount UserVF,
+    bool FoldTailByMasking) {
   MinBWs = computeMinimumValueSizes(TheLoop->getBlocks(), *DB, &TTI);
   unsigned SmallestType, WidestType;
   std::tie(SmallestType, WidestType) = getSmallestAndWidestTypes();
@@ -4563,9 +4565,13 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     return FixedScalableVFPair::getNone();
   }
 
-  unsigned TC = PSE.getSE()->getSmallConstantTripCount(TheLoop);
-  unsigned MaxTC = PSE.getSE()->getSmallConstantMaxTripCount(TheLoop);
-  LLVM_DEBUG(dbgs() << "LV: Found trip count: " << TC << '\n');
+  std::optional<unsigned> TC = PSE.getSE()->getSmallConstantTripCount(TheLoop);
+  std::optional<unsigned> MaxTC =
+      PSE.getSE()->getSmallConstantMaxTripCount(TheLoop);
+
+  if (TC)
+    LLVM_DEBUG(dbgs() << "LV: Found trip count: " << TC << '\n');
+
   if (TC == 1) {
     reportVectorizationFailure("Single iteration (non) loop",
         "loop trip count is one, irrelevant for vectorization",
@@ -4702,7 +4708,7 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
     return FixedScalableVFPair::getNone();
   }
 
-  if (TC == 0) {
+  if (!TC) {
     reportVectorizationFailure(
         "Unable to calculate the loop count due to complex control flow",
         "unable to calculate the loop count due to complex control flow",
@@ -4720,8 +4726,8 @@ LoopVectorizationCostModel::computeMaxVF(ElementCount UserVF, unsigned UserIC) {
 }
 
 Element...
[truncated]
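Several hunks in the diff above compare a `std::optional<unsigned>` directly against an `unsigned` (for example `TC == 1`, `TC >= BitWidth`, and the `<= 5` check in the Hexagon peeling code). The standard comparison operators treat an empty optional as less than any value, which is why those checks are paired with a presence test. The two predicates below are hypothetical and only illustrate that rule; they are not taken from the patch.

```cpp
#include <cassert>
#include <optional>

// Hypothetical predicate modelled on the "small max trip count" check:
// true only when a maximum trip count is known and is at most 5.
bool isSmallKnownMax(std::optional<unsigned> MaxTC) {
  return MaxTC && *MaxTC <= 5u;
}

// Unguarded variant, shown only to illustrate the pitfall: an empty
// optional compares less than any value, so this is true for std::nullopt.
bool unguardedSmallMax(std::optional<unsigned> MaxTC) {
  return MaxTC <= 5u;
}
```

Equality behaves the other way round: an empty optional never compares equal to a value, so a check such as `TC == 1` is simply false when no trip count is known.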

artagnon · 2024-07-31T12:22:28Z

Not pursuing this.

artagnon added 2 commits June 2, 2024 16:02

artagnon requested review from nikic and fhahn June 2, 2024 17:17

llvmbot added the backend:RISC-V, backend:PowerPC, vectorizers, llvm:analysis (includes value tracking, cost tables and constant folding), and llvm:transforms labels Jun 2, 2024

artagnon closed this Jul 31, 2024

artagnon deleted the lv-tc-zero branch July 31, 2024 12:22
