Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SLP] Compute a shuffle mask for SK_InsertSubvector #85408

Merged
merged 4 commits into from
Mar 15, 2024

Conversation

preames
Copy link
Collaborator

@preames preames commented Mar 15, 2024

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left. I excluded it from this patch just because I thought it worthy of individual attention and review.

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left.  I excluded it from this patch just because I thought it worthy of individual attention and review.
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 15, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)

Changes

This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask.

After this change, there is one SK_InsertSubvector case left. I excluded it from this patch just because I thought it worthy of individual attention and review.


Full diff: https://github.com/llvm/llvm-project/pull/85408.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+11-4)
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index b4cce680e2876f..b6868fb3f3ca3f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
               llvm_unreachable(
                   "Expected only consecutive, strided or masked gather loads.");
             }
+            SmallVector<int> ShuffleMask(VL.size());
+            for (int i = 0; i < VL.size(); i++)
+              ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
             VecLdCost +=
                 TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
-                                   std::nullopt, CostKind, I * VF, SubVecTy);
+                                   ShuffleMask, CostKind, I * VF, SubVecTy);
           }
           // If masked gather cost is higher - better to vectorize, so
           // consider it as a gather node. It will be better estimated
@@ -7454,7 +7457,7 @@ getShuffleCost(const TargetTransformInfo &TTI, TTI::ShuffleKind Kind,
         Index + NumSrcElts <= static_cast<int>(Mask.size()))
       return TTI.getShuffleCost(
           TTI::SK_InsertSubvector,
-          FixedVectorType::get(Tp->getElementType(), Mask.size()), std::nullopt,
+          FixedVectorType::get(Tp->getElementType(), Mask.size()), Mask,
           TTI::TCK_RecipThroughput, Index, Tp);
   }
   return TTI.getShuffleCost(Kind, Tp, Mask, CostKind, Index, SubTp, Args);
@@ -7727,9 +7730,13 @@ class BoUpSLP::ShuffleCostEstimator : public BaseShuffleAnalysis {
         }
         if (NeedInsertSubvectorAnalysis) {
           // Add the cost for the subvectors insert.
-          for (int I = VF, E = VL.size(); I < E; I += VF)
+          SmallVector<int> ShuffleMask(VL.size());
+          for (int I = VF, E = VL.size(); I < E; I += VF) {
+            for (int i = 0; i < E; i++)
+              ShuffleMask[i] = i / VF == I ? E + i % VF : i;
             GatherCost += TTI.getShuffleCost(TTI::SK_InsertSubvector, VecTy,
-                                             std::nullopt, CostKind, I, LoadTy);
+                                             ShuffleMask, CostKind, I, LoadTy);
+          }
         }
         GatherCost -= ScalarsCost;
       }

Copy link

github-actions bot commented Mar 15, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 33960c90258ed78b9b877b1a43e219d1cbc2efce fdb6519fed34446194bec7d23155248302aed6ad -- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index d850c0ad78..50af772e30 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4332,8 +4332,8 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
             for (int Idx : seq<int>(0, VL.size()))
               ShuffleMask[Idx] = Idx / VF == I ? VL.size() + Idx % VF : Idx;
             VecLdCost +=
-                TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
-                                   ShuffleMask, CostKind, I * VF, SubVecTy);
+                TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy, ShuffleMask,
+                                   CostKind, I * VF, SubVecTy);
           }
           // If masked gather cost is higher - better to vectorize, so
           // consider it as a gather node. It will be better estimated

preames and others added 2 commits March 15, 2024 08:15
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
llvm_unreachable(
"Expected only consecutive, strided or masked gather loads.");
}
SmallVector<int> ShuffleMask(VL.size());
for (int Idx : seq<int>(0, VL.size()))
ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All is must be replaced too

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@preames preames merged commit 45e41f9 into llvm:main Mar 15, 2024
2 of 4 checks passed
@preames preames deleted the pr-slp-mask-in-insertsubvector branch March 15, 2024 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants