[TTI] Add cover functions for costing build and explode vectors [nfc] #85421
This is just an API cleanup at the moment. The newly added routines just proxy to the existing getScalarizationOverhead. I think the diff speaks for itself in terms of code clarity.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-llvm-analysis

Author: Philip Reames (preames)

Full diff: https://github.com/llvm/llvm-project/pull/85421.diff

3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index c43a1b5c1b2aa0..46ea102d0084b5 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -871,6 +871,26 @@ class TargetTransformInfo {
bool Insert, bool Extract,
TTI::TargetCostKind CostKind) const;
+ /// Estimate the cost of a build_vector of unknown elements at the indices
+ /// implied by the active lanes in DemandedElts. The default implementation
+ /// will simply cost a series of insertelements, but some targets can do
+ /// significantly better.
+ InstructionCost getBuildVectorCost(VectorType *Ty,
+ const APInt &DemandedElts,
+ TTI::TargetCostKind CostKind) const {
+ return getScalarizationOverhead(Ty, DemandedElts, true, false, CostKind);
+ }
+
+ /// Estimate the cost of exploding a vector of unknown elements at the
+ /// indices implied by the active lanes in DemandedElts into individual
+ /// scalar registers. The default implementation will simply cost a
+ /// series of extractelements, but some targets can do significantly better.
+ InstructionCost getExplodeVectorCost(VectorType *Ty,
+ const APInt &DemandedElts,
+ TTI::TargetCostKind CostKind) const {
+ return getScalarizationOverhead(Ty, DemandedElts, false, true, CostKind);
+ }
+
/// Estimate the overhead of scalarizing an instructions unique
/// non-constant operands. The (potentially vector) types to use for each of
/// argument are passes via Tys.
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 52b992b19e4b04..d999606836630c 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5846,10 +5846,9 @@ InstructionCost LoopVectorizationCostModel::computePredInstDiscount(
// and phi nodes.
TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
if (isScalarWithPredication(I, VF) && !I->getType()->isVoidTy()) {
- ScalarCost += TTI.getScalarizationOverhead(
+ ScalarCost += TTI.getBuildVectorCost(
cast<VectorType>(ToVectorTy(I->getType(), VF)),
- APInt::getAllOnes(VF.getFixedValue()), /*Insert*/ true,
- /*Extract*/ false, CostKind);
+ APInt::getAllOnes(VF.getFixedValue()), CostKind);
ScalarCost +=
VF.getFixedValue() * TTI.getCFInstrCost(Instruction::PHI, CostKind);
}
@@ -5865,10 +5864,9 @@ InstructionCost LoopVectorizationCostModel::computePredInstDiscount(
if (canBeScalarized(J))
Worklist.push_back(J);
else if (needsExtract(J, VF)) {
- ScalarCost += TTI.getScalarizationOverhead(
+ ScalarCost += TTI.getExplodeVectorCost(
cast<VectorType>(ToVectorTy(J->getType(), VF)),
- APInt::getAllOnes(VF.getFixedValue()), /*Insert*/ false,
- /*Extract*/ true, CostKind);
+ APInt::getAllOnes(VF.getFixedValue()), CostKind);
}
}
@@ -6011,9 +6009,8 @@ LoopVectorizationCostModel::getMemInstScalarizationCost(Instruction *I,
// Add the cost of an i1 extract and a branch
auto *Vec_i1Ty =
VectorType::get(IntegerType::getInt1Ty(ValTy->getContext()), VF);
- Cost += TTI.getScalarizationOverhead(
- Vec_i1Ty, APInt::getAllOnes(VF.getKnownMinValue()),
- /*Insert=*/false, /*Extract=*/true, CostKind);
+ Cost += TTI.getExplodeVectorCost(
+ Vec_i1Ty, APInt::getAllOnes(VF.getKnownMinValue()), CostKind);
Cost += TTI.getCFInstrCost(Instruction::Br, CostKind);
if (useEmulatedMaskMemRefHack(I, VF))
@@ -6386,10 +6383,9 @@ InstructionCost LoopVectorizationCostModel::getScalarizationOverhead(
Type *RetTy = ToVectorTy(I->getType(), VF);
if (!RetTy->isVoidTy() &&
(!isa<LoadInst>(I) || !TTI.supportsEfficientVectorElementLoadStore()))
- Cost += TTI.getScalarizationOverhead(
+ Cost += TTI.getBuildVectorCost(
cast<VectorType>(RetTy), APInt::getAllOnes(VF.getKnownMinValue()),
- /*Insert*/ true,
- /*Extract*/ false, CostKind);
+ CostKind);
// Some targets keep addresses scalar.
if (isa<LoadInst>(I) && !TTI.prefersVectorizedAddressing())
@@ -6827,9 +6823,8 @@ LoopVectorizationCostModel::getInstructionCost(Instruction *I, ElementCount VF,
auto *Vec_i1Ty =
VectorType::get(IntegerType::getInt1Ty(RetTy->getContext()), VF);
return (
- TTI.getScalarizationOverhead(
- Vec_i1Ty, APInt::getAllOnes(VF.getFixedValue()),
- /*Insert*/ false, /*Extract*/ true, CostKind) +
+ TTI.getExplodeVectorCost(
+ Vec_i1Ty, APInt::getAllOnes(VF.getFixedValue()), CostKind) +
(TTI.getCFInstrCost(Instruction::Br, CostKind) * VF.getFixedValue()));
} else if (I->getParent() == TheLoop->getLoopLatch() || VF.isScalar())
// The back-edge branch will remain, as will all scalar branches.
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index b4cce680e2876f..61013c2017f47a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -8715,9 +8715,7 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef<Value *> VectorizedVals,
assert(Offset < NumElts && "Failed to find vector index offset");
InstructionCost Cost = 0;
- Cost -= TTI->getScalarizationOverhead(SrcVecTy, DemandedElts,
- /*Insert*/ true, /*Extract*/ false,
- CostKind);
+ Cost -= TTI->getBuildVectorCost(SrcVecTy, DemandedElts, CostKind);
// First cost - resize to actual vector size if not identity shuffle or
// need to shift the vector.
@@ -9816,9 +9814,9 @@ InstructionCost BoUpSLP::getTreeCost(ArrayRef<Value *> VectorizedVals) {
MutableArrayRef(Vector.data(), Vector.size()), Base,
[](const TreeEntry *E) { return E->getVectorFactor(); }, ResizeToVF,
EstimateShufflesCost);
- InstructionCost InsertCost = TTI->getScalarizationOverhead(
+ InstructionCost InsertCost = TTI->getBuildVectorCost(
cast<FixedVectorType>(FirstUsers[I].first->getType()), DemandedElts[I],
- /*Insert*/ true, /*Extract*/ false, TTI::TCK_RecipThroughput);
+ TTI::TCK_RecipThroughput);
Cost -= InsertCost;
}
@@ -10531,9 +10529,7 @@ InstructionCost BoUpSLP::getGatherCost(ArrayRef<Value *> VL,
EstimateInsertCost(I, V);
}
if (ForPoisonSrc)
- Cost =
- TTI->getScalarizationOverhead(VecTy, ~ShuffledElements, /*Insert*/ true,
- /*Extract*/ false, CostKind);
+ Cost = TTI->getBuildVectorCost(VecTy, ~ShuffledElements, CostKind);
if (DuplicateNonConst)
Cost +=
TTI->getShuffleCost(TargetTransformInfo::SK_PermuteSingleSrc, VecTy);
You can test this locally with the following command:

git-clang-format --diff 33960c90258ed78b9b877b1a43e219d1cbc2efce 90ef5a77188af9b7d2ff922066a2868b78bfd937 -- llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

View the diff from clang-format here:

diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index 46ea102d00..6e5fa5c996 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -875,8 +875,7 @@ public:
/// implied by the active lanes in DemandedElts. The default implementation
/// will simply cost a series of insertelements, but some targets can do
/// significantly better.
- InstructionCost getBuildVectorCost(VectorType *Ty,
- const APInt &DemandedElts,
+ InstructionCost getBuildVectorCost(VectorType *Ty, const APInt &DemandedElts,
TTI::TargetCostKind CostKind) const {
return getScalarizationOverhead(Ty, DemandedElts, true, false, CostKind);
}
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index d999606836..0d36690ab8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -6383,9 +6383,9 @@ InstructionCost LoopVectorizationCostModel::getScalarizationOverhead(
Type *RetTy = ToVectorTy(I->getType(), VF);
if (!RetTy->isVoidTy() &&
(!isa<LoadInst>(I) || !TTI.supportsEfficientVectorElementLoadStore()))
- Cost += TTI.getBuildVectorCost(
- cast<VectorType>(RetTy), APInt::getAllOnes(VF.getKnownMinValue()),
- CostKind);
+ Cost += TTI.getBuildVectorCost(cast<VectorType>(RetTy),
+ APInt::getAllOnes(VF.getKnownMinValue()),
+ CostKind);
// Some targets keep addresses scalar.
if (isa<LoadInst>(I) && !TTI.prefersVectorizedAddressing())
Introduce utilities for costing build vector and explode vector operations inside the TTI target implementation logic. As can be seen, these are by far the most common operations actually performed. In case the goal isn't clear here: I plan to eliminate getScalarizationOverhead from the TTI interface layer. All of our targets cost a combined insert and extract as equivalent to an explode vector followed by a build vector, so the combined interface can be killed off. This is the inverse of llvm#85421. Once both patches land, only the actual meat of the change remains. One subtlety here: we have to be very careful to call the directly analogous cover function. We have a base class and a subclass involved, and it matters at times whether we call a method on the subclass or the base class. This is harder to follow because we have multiple getScalarizationOverhead variants with different signatures - most of which exist only on the base class, but some (not all) of which proxy back to the subclass.