-
Notifications
You must be signed in to change notification settings - Fork 10.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[VPlan] Ignore incoming values with constant false mask. #89384
Conversation
Ignore incoming values with constant false masks when trying to simplify VPBlendRecipes. As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped.
@llvm/pr-subscribers-llvm-transforms Author: Florian Hahn (fhahn) ChangesIgnore incoming values with constant false masks when trying to simplify VPBlendRecipes. As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped. Full diff: https://github.com/llvm/llvm-project/pull/89384.diff 3 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h b/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
index ae7717eb7cc97f..cbd53a2307a105 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
@@ -52,7 +52,7 @@ template <typename Class> struct bind_ty {
/// Match a specified integer value or vector of all elements of that
/// value.
-struct specific_intval {
+template <unsigned BitWidth = 0> struct specific_intval {
APInt Val;
specific_intval(APInt V) : Val(std::move(V)) {}
@@ -67,14 +67,17 @@ struct specific_intval {
CI = dyn_cast_or_null<ConstantInt>(
C->getSplatValue(/*AllowPoison=*/false));
- return CI && APInt::isSameValue(CI->getValue(), Val);
+ return CI && APInt::isSameValue(CI->getValue(), Val) &&
+ (BitWidth == 0 || CI->getBitWidth() == BitWidth);
}
};
-inline specific_intval m_SpecificInt(uint64_t V) {
- return specific_intval(APInt(64, V));
+inline specific_intval<0> m_SpecificInt(uint64_t V) {
+ return specific_intval<0>(APInt(64, V));
}
+inline specific_intval<1> m_False() { return specific_intval<1>(APInt(64, 0)); }
+
/// Matching combinators
template <typename LTy, typename RTy> struct match_combine_or {
LTy L;
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 007dc3f89b3fb9..c1dd04a2b1c67e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -884,18 +884,19 @@ void VPlanTransforms::clearReductionWrapFlags(VPlan &Plan) {
/// Try to simplify recipe \p R.
static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
+ using namespace llvm::VPlanPatternMatch;
// Try to remove redundant blend recipes.
if (auto *Blend = dyn_cast<VPBlendRecipe>(&R)) {
VPValue *Inc0 = Blend->getIncomingValue(0);
for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I)
- if (Inc0 != Blend->getIncomingValue(I))
+ if (Inc0 != Blend->getIncomingValue(I) &&
+ !match(Blend->getMask(I), m_False()))
return;
Blend->replaceAllUsesWith(Inc0);
Blend->eraseFromParent();
return;
}
- using namespace llvm::VPlanPatternMatch;
VPValue *A;
if (match(&R, m_Trunc(m_ZExtOrSExt(m_VPValue(A))))) {
VPValue *Trunc = R.getVPSingleValue();
diff --git a/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll b/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
index c622925510dd48..d0c74897f264bc 100644
--- a/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
+++ b/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
@@ -10,17 +10,13 @@ define void @test_not_first_lane_only_constant(ptr %A, ptr noalias %B) {
; CHECK-NEXT: entry:
; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
; CHECK: vector.ph:
-; CHECK-NEXT: [[BROADCAST_SPLATINSERT3:%.*]] = insertelement <4 x ptr> poison, ptr [[B]], i64 0
-; CHECK-NEXT: [[BROADCAST_SPLAT4:%.*]] = shufflevector <4 x ptr> [[BROADCAST_SPLATINSERT3]], <4 x ptr> poison, <4 x i32> zeroinitializer
; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
; CHECK: vector.body:
; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i16
; CHECK-NEXT: [[TMP0:%.*]] = add i16 [[OFFSET_IDX]], 0
; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i16, ptr [[A]], i16 [[TMP0]]
-; CHECK-NEXT: [[PREDPHI:%.*]] = select <4 x i1> zeroinitializer, <4 x ptr> poison, <4 x ptr> [[BROADCAST_SPLAT4]]
-; CHECK-NEXT: [[TMP12:%.*]] = extractelement <4 x ptr> [[PREDPHI]], i32 0
-; CHECK-NEXT: [[TMP13:%.*]] = load i16, ptr [[TMP12]], align 2
+; CHECK-NEXT: [[TMP13:%.*]] = load i16, ptr %B, align 2
; CHECK-NEXT: [[BROADCAST_SPLATINSERT5:%.*]] = insertelement <4 x i16> poison, i16 [[TMP13]], i64 0
; CHECK-NEXT: [[BROADCAST_SPLAT6:%.*]] = shufflevector <4 x i16> [[BROADCAST_SPLATINSERT5]], <4 x i16> poison, <4 x i32> zeroinitializer
; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i16, ptr [[TMP1]], i32 0
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, adding a couple of thoughts.
} | ||
|
||
inline specific_intval<1> m_False() { return specific_intval<1>(APInt(64, 0)); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better assert the bitwidth is 1 when one tries to match False or True, rather than mismatch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added an assert before checking APInt::isSameValue(CI->getValue(), Val)
, thanks!
// Try to remove redundant blend recipes. | ||
if (auto *Blend = dyn_cast<VPBlendRecipe>(&R)) { | ||
VPValue *Inc0 = Blend->getIncomingValue(0); | ||
for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I) | ||
if (Inc0 != Blend->getIncomingValue(I)) | ||
if (Inc0 != Blend->getIncomingValue(I) && | ||
!match(Blend->getMask(I), m_False())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but could be more general: first filter all operands having false masks by replacing the original blend with a simpler blend, then apply the existing check if all operands are the same? May deserve additional tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Ah, mentioned in the commit message as follow-up, very well.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do separately, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Long-term, optimizing masks may be addressed when generated by a VPlan2VPlan Predicator, rather than by a separate subsequent redundancy eliminator.
@@ -52,7 +52,7 @@ template <typename Class> struct bind_ty { | |||
|
|||
/// Match a specified integer value or vector of all elements of that | |||
/// value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: worth a comment that zero BitWidth represents an optional wildcard - that it need not be checked?
Any issues arise when building w/o asserts where Bidwidth is unused - but like a parameter rather than local variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment and also checked that there are no warnings with in release builds w/o assertions (at least with clang)
Ignore incoming values with constant false masks when trying to simplify VPBlendRecipes.
As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped.