[SLP] Compute a shuffle mask for SK_InsertSubvector #85408
This is the third of a series of small patches to compute shuffle masks for the couple of cases where we call getShuffleCost without one. My goal is to add an invariant that all calls to getShuffleCost for fixed length vectors have a mask. After this change, there is one SK_InsertSubvector case left. I excluded it from this patch just because I thought it worthy of individual attention and review.
@llvm/pr-subscribers-llvm-transforms
Author: Philip Reames (preames)
Full diff: https://github.com/llvm/llvm-project/pull/85408.diff
1 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index b4cce680e2876f..b6868fb3f3ca3f 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
llvm_unreachable(
"Expected only consecutive, strided or masked gather loads.");
}
+ SmallVector<int> ShuffleMask(VL.size());
+ for (int i = 0; i < VL.size(); i++)
+ ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
VecLdCost +=
TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
- std::nullopt, CostKind, I * VF, SubVecTy);
+ ShuffleMask, CostKind, I * VF, SubVecTy);
}
// If masked gather cost is higher - better to vectorize, so
// consider it as a gather node. It will be better estimated
@@ -7454,7 +7457,7 @@ getShuffleCost(const TargetTransformInfo &TTI, TTI::ShuffleKind Kind,
Index + NumSrcElts <= static_cast<int>(Mask.size()))
return TTI.getShuffleCost(
TTI::SK_InsertSubvector,
- FixedVectorType::get(Tp->getElementType(), Mask.size()), std::nullopt,
+ FixedVectorType::get(Tp->getElementType(), Mask.size()), Mask,
TTI::TCK_RecipThroughput, Index, Tp);
}
return TTI.getShuffleCost(Kind, Tp, Mask, CostKind, Index, SubTp, Args);
@@ -7727,9 +7730,13 @@ class BoUpSLP::ShuffleCostEstimator : public BaseShuffleAnalysis {
}
if (NeedInsertSubvectorAnalysis) {
// Add the cost for the subvectors insert.
- for (int I = VF, E = VL.size(); I < E; I += VF)
+ SmallVector<int> ShuffleMask(VL.size());
+ for (int I = VF, E = VL.size(); I < E; I += VF) {
+ for (int i = 0; i < E; i++)
+ ShuffleMask[i] = i / VF == I ? E + i % VF : i;
GatherCost += TTI.getShuffleCost(TTI::SK_InsertSubvector, VecTy,
- std::nullopt, CostKind, I, LoadTy);
+ ShuffleMask, CostKind, I, LoadTy);
+ }
}
GatherCost -= ScalarsCost;
}
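The mask expression added in the first hunk can be tried outside of LLVM. The following is a minimal standalone sketch, not code from the patch: `buildInsertSubvectorMask` is a hypothetical helper name, `std::vector<int>` stands in for `SmallVector<int>`, and `ChunkIdx` plays the role of `I` in `canVectorizeLoads`, where the insertion index passed to `getShuffleCost` is `I * VF`. It assumes the usual two-source shuffle convention in which indices at or above the vector length select lanes of the second source.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not part of LLVM. Builds a two-source shuffle mask
// that keeps the lanes of the first source everywhere except in chunk
// `ChunkIdx` (each chunk is `VF` lanes wide), where it takes lanes
// 0..VF-1 of the second source. Second-source lanes are numbered starting
// at `NumElts`, mirroring the `VL.size() + i % VF` expression in the diff.
std::vector<int> buildInsertSubvectorMask(int NumElts, int VF, int ChunkIdx) {
  assert(VF > 0 && NumElts % VF == 0 && "sketch assumes whole chunks");
  assert(ChunkIdx >= 0 && ChunkIdx < NumElts / VF && "chunk out of range");
  std::vector<int> Mask(NumElts);
  for (int Idx = 0; Idx < NumElts; ++Idx)
    Mask[Idx] = (Idx / VF == ChunkIdx) ? NumElts + Idx % VF : Idx;
  return Mask;
}
```

For example, `buildInsertSubvectorMask(8, 4, 1)` yields `{0, 1, 2, 3, 8, 9, 10, 11}`: lanes 0 to 3 come from the first source unchanged, and lanes 4 to 7 are filled from the first four lanes of the second source.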
You can test this locally with the following command:
git-clang-format --diff 33960c90258ed78b9b877b1a43e219d1cbc2efce fdb6519fed34446194bec7d23155248302aed6ad -- llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index d850c0ad78..50af772e30 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -4332,8 +4332,8 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
for (int Idx : seq<int>(0, VL.size()))
ShuffleMask[Idx] = Idx / VF == I ? VL.size() + Idx % VF : Idx;
VecLdCost +=
- TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy,
- ShuffleMask, CostKind, I * VF, SubVecTy);
+ TTI.getShuffleCost(TTI ::SK_InsertSubvector, VecTy, ShuffleMask,
+ CostKind, I * VF, SubVecTy);
}
// If masked gather cost is higher - better to vectorize, so
// consider it as a gather node. It will be better estimated
Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
@@ -4328,9 +4328,12 @@ BoUpSLP::LoadsState BoUpSLP::canVectorizeLoads(
llvm_unreachable(
"Expected only consecutive, strided or masked gather loads.");
}
SmallVector<int> ShuffleMask(VL.size());
for (int Idx : seq<int>(0, VL.size()))
ShuffleMask[i] = i / VF == I ? VL.size() + i % VF : i;
All `i`s must be replaced too
LG