Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[VPlan] Ignore incoming values with constant false mask. #89384

Merged
merged 5 commits into from
Apr 23, 2024

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Apr 19, 2024

Ignore incoming values with constant false masks when trying to simplify VPBlendRecipes.

As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped.

Ignore incoming values with constant false masks when trying to simplify
VPBlendRecipes.

As a follow-on optimization, we should also be able to drop all incoming
values with false masks by creating a new VPBlendRecipe with those
operands dropped.
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 19, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Ignore incoming values with constant false masks when trying to simplify VPBlendRecipes.

As a follow-on optimization, we should also be able to drop all incoming values with false masks by creating a new VPBlendRecipe with those operands dropped.


Full diff: https://github.com/llvm/llvm-project/pull/89384.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h (+7-4)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+3-2)
  • (modified) llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll (+1-5)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h b/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
index ae7717eb7cc97f..cbd53a2307a105 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
+++ b/llvm/lib/Transforms/Vectorize/VPlanPatternMatch.h
@@ -52,7 +52,7 @@ template <typename Class> struct bind_ty {
 
 /// Match a specified integer value or vector of all elements of that
 /// value.
-struct specific_intval {
+template <unsigned BitWidth = 0> struct specific_intval {
   APInt Val;
 
   specific_intval(APInt V) : Val(std::move(V)) {}
@@ -67,14 +67,17 @@ struct specific_intval {
         CI = dyn_cast_or_null<ConstantInt>(
             C->getSplatValue(/*AllowPoison=*/false));
 
-    return CI && APInt::isSameValue(CI->getValue(), Val);
+    return CI && APInt::isSameValue(CI->getValue(), Val) &&
+           (BitWidth == 0 || CI->getBitWidth() == BitWidth);
   }
 };
 
-inline specific_intval m_SpecificInt(uint64_t V) {
-  return specific_intval(APInt(64, V));
+inline specific_intval<0> m_SpecificInt(uint64_t V) {
+  return specific_intval<0>(APInt(64, V));
 }
 
+inline specific_intval<1> m_False() { return specific_intval<1>(APInt(64, 0)); }
+
 /// Matching combinators
 template <typename LTy, typename RTy> struct match_combine_or {
   LTy L;
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 007dc3f89b3fb9..c1dd04a2b1c67e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -884,18 +884,19 @@ void VPlanTransforms::clearReductionWrapFlags(VPlan &Plan) {
 
 /// Try to simplify recipe \p R.
 static void simplifyRecipe(VPRecipeBase &R, VPTypeAnalysis &TypeInfo) {
+  using namespace llvm::VPlanPatternMatch;
   // Try to remove redundant blend recipes.
   if (auto *Blend = dyn_cast<VPBlendRecipe>(&R)) {
     VPValue *Inc0 = Blend->getIncomingValue(0);
     for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I)
-      if (Inc0 != Blend->getIncomingValue(I))
+      if (Inc0 != Blend->getIncomingValue(I) &&
+          !match(Blend->getMask(I), m_False()))
         return;
     Blend->replaceAllUsesWith(Inc0);
     Blend->eraseFromParent();
     return;
   }
 
-  using namespace llvm::VPlanPatternMatch;
   VPValue *A;
   if (match(&R, m_Trunc(m_ZExtOrSExt(m_VPValue(A))))) {
     VPValue *Trunc = R.getVPSingleValue();
diff --git a/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll b/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
index c622925510dd48..d0c74897f264bc 100644
--- a/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
+++ b/llvm/test/Transforms/LoopVectorize/unused-blend-mask-for-first-operand.ll
@@ -10,17 +10,13 @@ define void @test_not_first_lane_only_constant(ptr %A, ptr noalias %B)  {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; CHECK:       vector.ph:
-; CHECK-NEXT:    [[BROADCAST_SPLATINSERT3:%.*]] = insertelement <4 x ptr> poison, ptr [[B]], i64 0
-; CHECK-NEXT:    [[BROADCAST_SPLAT4:%.*]] = shufflevector <4 x ptr> [[BROADCAST_SPLATINSERT3]], <4 x ptr> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    br label [[VECTOR_BODY:%.*]]
 ; CHECK:       vector.body:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i32 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = trunc i32 [[INDEX]] to i16
 ; CHECK-NEXT:    [[TMP0:%.*]] = add i16 [[OFFSET_IDX]], 0
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds i16, ptr [[A]], i16 [[TMP0]]
-; CHECK-NEXT:    [[PREDPHI:%.*]] = select <4 x i1> zeroinitializer, <4 x ptr> poison, <4 x ptr> [[BROADCAST_SPLAT4]]
-; CHECK-NEXT:    [[TMP12:%.*]] = extractelement <4 x ptr> [[PREDPHI]], i32 0
-; CHECK-NEXT:    [[TMP13:%.*]] = load i16, ptr [[TMP12]], align 2
+; CHECK-NEXT:    [[TMP13:%.*]] = load i16, ptr %B, align 2
 ; CHECK-NEXT:    [[BROADCAST_SPLATINSERT5:%.*]] = insertelement <4 x i16> poison, i16 [[TMP13]], i64 0
 ; CHECK-NEXT:    [[BROADCAST_SPLAT6:%.*]] = shufflevector <4 x i16> [[BROADCAST_SPLATINSERT5]], <4 x i16> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i16, ptr [[TMP1]], i32 0

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, adding a couple of thoughts.

}

inline specific_intval<1> m_False() { return specific_intval<1>(APInt(64, 0)); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better assert the bitwidth is 1 when one tries to match False or True, rather than mismatch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an assert before checking APInt::isSameValue(CI->getValue(), Val), thanks!

// Try to remove redundant blend recipes.
if (auto *Blend = dyn_cast<VPBlendRecipe>(&R)) {
VPValue *Inc0 = Blend->getIncomingValue(0);
for (unsigned I = 1; I != Blend->getNumIncomingValues(); ++I)
if (Inc0 != Blend->getIncomingValue(I))
if (Inc0 != Blend->getIncomingValue(I) &&
!match(Blend->getMask(I), m_False()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but could be more general: first filter all operands having false masks by replacing the original blend with a simpler blend, then apply the existing check if all operands are the same? May deserve additional tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Ah, mentioned in the commit message as follow-up, very well.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do separately, thanks!

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Long-term, optimizing masks may be addressed when generated by a VPlan2VPlan Predicator, rather than by a separate subsequent redundancy eliminator.

@@ -52,7 +52,7 @@ template <typename Class> struct bind_ty {

/// Match a specified integer value or vector of all elements of that
/// value.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: worth a comment that zero BitWidth represents an optional wildcard - that it need not be checked?

Any issues arise when building w/o asserts where Bidwidth is unused - but like a parameter rather than local variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment and also checked that there are no warnings with in release builds w/o assertions (at least with clang)

@fhahn fhahn merged commit dadf6f2 into llvm:main Apr 23, 2024
3 of 4 checks passed
@fhahn fhahn deleted the vplan-simplify-blend-false-mask branch April 23, 2024 12:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants