[SLP] Compute a shuffle mask for SK_InsertSubvector #85408

preames · 2024-03-15T14:59:11Z

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left. I excluded it from this patch just because I thought it worthy of individual attention and review.
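
For readers following along, here is a minimal standalone sketch of the kind of mask this patch computes. It uses plain std::vector rather than LLVM types. The names buildInsertSubvectorMask and applyMask are hypothetical helpers made up for illustration, not functions in SLPVectorizer.cpp. The sketch assumes the usual two-source convention: indices below the vector width select lanes of the original vector, and indices at or above it select lanes of the inserted subvector. As a simplification, the subvector is kept at width VF instead of being widened.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LLVM): build a two-source shuffle mask that
// describes inserting a VF-wide subvector into chunk ChunkIdx of a
// NumElts-wide vector. Indices below NumElts pick lanes of the original
// vector; index NumElts + k picks lane k of the inserted subvector.
std::vector<int> buildInsertSubvectorMask(std::size_t NumElts, std::size_t VF,
                                          std::size_t ChunkIdx) {
  assert(VF > 0 && NumElts % VF == 0 && "expected whole chunks");
  assert(ChunkIdx < NumElts / VF && "chunk index out of range");
  std::vector<int> Mask(NumElts);
  for (std::size_t Idx = 0; Idx < NumElts; ++Idx)
    Mask[Idx] = Idx / VF == ChunkIdx ? static_cast<int>(NumElts + Idx % VF)
                                     : static_cast<int>(Idx);
  return Mask;
}

// Hypothetical helper: apply a two-source mask to plain integer vectors.
// Mask entries in [0, Base.size()) select from Base, larger ones from Sub.
std::vector<int> applyMask(const std::vector<int> &Base,
                           const std::vector<int> &Sub,
                           const std::vector<int> &Mask) {
  std::vector<int> Result;
  Result.reserve(Mask.size());
  const int N = static_cast<int>(Base.size());
  for (int M : Mask)
    Result.push_back(M < N ? Base[M] : Sub[M - N]);
  return Result;
}
```

With NumElts = 8, VF = 4 and ChunkIdx = 1, the mask is {0, 1, 2, 3, 8, 9, 10, 11}: the low four lanes come from the original vector and the high four from the subvector. Note that the helper takes a chunk index; a caller that iterates by element offset (0, VF, 2 * VF, ...) would pass Offset / VF.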


llvmbot · 2024-03-15T14:59:40Z

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)

Changes

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left. I excluded it from this patch just because I thought it worthy of individual attention and review.

Full diff: https://github.com/llvm/llvm-project/pull/85408.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+11-4)

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index b4cce680e2876f..b6868fb3f3ca3f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
               llvm_unreachable(
                   "Expected only consecutive, strided or masked gather loads.");
             }
+            SmallVector<int> ShuffleMask(VL.size());
+            for (int i = 0; i < VL.size(); i++)
+              ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
             VecLdCost +=
                 TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
-                                   std::nullopt, CostKind, I * VF, SubVecTy);
+                                   ShuffleMask, CostKind, I * VF, SubVecTy);
           }
           // If masked gather cost is higher - better to vectorize, so
           // consider it as a gather node. It will be better estimated
@@ -7454,7 +7457,7 @@ getShuffleCost(const TargetTransformInfo &TTI, TTI::ShuffleKind Kind,
         Index + NumSrcElts <= static_cast<int>(Mask.size()))
       return TTI.getShuffleCost(
           TTI::SK_InsertSubvector,
-          FixedVectorType::get(Tp->getElementType(), Mask.size()), std::nullopt,
+          FixedVectorType::get(Tp->getElementType(), Mask.size()), Mask,
           TTI::TCK_RecipThroughput, Index, Tp);
   }
   return TTI.getShuffleCost(Kind, Tp, Mask, CostKind, Index, SubTp, Args);
@@ -7727,9 +7730,13 @@ class BoUpSLP::ShuffleCostEstimator : public BaseShuffleAnalysis {
         }
         if (NeedInsertSubvectorAnalysis) {
           // Add the cost for the subvectors insert.
-          for (int I = VF, E = VL.size(); I < E; I += VF)
+          SmallVector<int> ShuffleMask(VL.size());
+          for (int I = VF, E = VL.size(); I < E; I += VF) {
+            for (int i = 0; i < E; i++)
+              ShuffleMask[i] = i / VF == I ? E + i % VF : i;
             GatherCost += TTI.getShuffleCost(TTI::SK_InsertSubvector, VecTy,
-                                             std::nullopt, CostKind, I, LoadTy);
+                                             ShuffleMask, CostKind, I, LoadTy);
+          }
         }
         GatherCost -= ScalarsCost;
       }

github-actions · 2024-03-15T15:01:55Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff 33960c90258ed78b9b877b1a43e219d1cbc2efce fdb6519fed34446194bec7d23155248302aed6ad -- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

View the diff from clang-format here.

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index d850c0ad78..50af772e30 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4332,8 +4332,8 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
             for (int Idx : seq<int>(0, VL.size()))
               ShuffleMask[Idx] = Idx / VF == I ? VL.size() + Idx % VF : Idx;
             VecLdCost +=
-                TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
-                                   ShuffleMask, CostKind, I * VF, SubVecTy);
+                TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy, ShuffleMask,
+                                   CostKind, I * VF, SubVecTy);
           }
           // If masked gather cost is higher - better to vectorize, so
           // consider it as a gather node. It will be better estimated


alexey-bataev · 2024-03-15T15:25:10Z

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
              llvm_unreachable(
                  "Expected only consecutive, strided or masked gather loads.");
            }
+            SmallVector<int> ShuffleMask(VL.size());
+            for (int Idx : seq<int>(0, VL.size()))
+              ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;


All occurrences of i must be replaced too.

alexey-bataev

LG

preames requested a review from alexey-bataev March 15, 2024 14:59

llvmbot added vectorization llvm:transforms labels Mar 15, 2024

alexey-bataev reviewed Mar 15, 2024


preames and others added 2 commits March 15, 2024 08:15

Update llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

aed09af

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>

Update llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

943a1d4

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>

alexey-bataev reviewed Mar 15, 2024


Fix build after applying suggestions

fdb6519

alexey-bataev approved these changes Mar 15, 2024


preames merged commit 45e41f9 into llvm:main Mar 15, 2024
2 of 4 checks passed

preames deleted the pr-slp-mask-in-insertsubvector branch March 15, 2024 15:32
