[InstCombine] Treat umax as select(icmp eq x, 0), 1, x) in binop select fold. #65978

davemgreen · 2023-09-11T16:48:58Z

There is an existing instcombine for folding:
(A ? B : C) binop (A ? E : F) -> A ? (B binop E) : (C binop F).
However this will not combine if the select (x>=1 ? x : 1) has
been converted to a umax(x, 1). This adds code to treat the umax
as a select, allowing binops to fold into the select.

This patch handles the more special case where the predicate is the limit
extreme (such as x==0 ? x : 1), meaning it will use a ==0 predicate as
opposed to the >= predicate usually applied to a umax. According to alive this
is valid so long as the variables are not undef, are known to be non-equal
(https://alive2.llvm.org/ce/z/VAYw3J) or known to be the same value
(https://alive2.llvm.org/ce/z/4TSd7v).

Here are some proofs for different min/maxes:
umax: https://alive2.llvm.org/ce/z/kLw4J9
umin: https://alive2.llvm.org/ce/z/nvPa7p
smin: https://alive2.llvm.org/ce/z/HS8V7V
smax: https://alive2.llvm.org/ce/z/D2ocju

llvmbot · 2023-09-11T16:49:56Z

@llvm/pr-subscribers-llvm-transforms

Changes

There is an existing instcombine in SimplifySelectsFeedingBinaryOp for folding `(A ? B : C) binop (A ? E : F) -> A ? (B binop E) : (C binop F)`. However this will not combine if the select `(x>=1 ? x : 1)` has been converted to a `umax(x, 1)`. This adds code to treat the umax as a `select(x==0,1,x)` if it matches the other operand, allowing binops to fold into the select/umax pair.

Full diff: https://github.com/llvm/llvm-project/pull/65978.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+20)
(modified) llvm/test/Transforms/InstCombine/binop-select.ll (+58)

diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index ed8709ea4c051f7..dc551a8dba1900f 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1103,9 +1103,29 @@ Value *InstCombinerImpl::SimplifySelectsFeedingBinaryOp(BinaryOperator &I,
   Value *A, *B, *C, *D, *E, *F;
   bool LHSIsSelect = match(LHS, m_Select(m_Value(A), m_Value(B), m_Value(C)));
   bool RHSIsSelect = match(RHS, m_Select(m_Value(D), m_Value(E), m_Value(F)));
+
   if (!LHSIsSelect && !RHSIsSelect)
     return nullptr;
 
+  // Treat umax(x, 1) as select(icmp(eq, x, 0), 1, x), if it matches the other
+  // predicate.
+  auto TryMatchSelectFromUMax = [](bool LHSIsSelect, Value *RHS, bool &RHSIsSelect,
+                                   Value *A, Value *B, Value *C, Value *&D,
+                                   Value *&E, Value *&F) {
+    CmpInst::Predicate Pred;
+    Value *X;
+    if (LHSIsSelect && !RHSIsSelect &&
+        match(RHS, m_UMax(m_Value(X), m_One())) &&
+        match(A, m_c_ICmp(Pred, m_Specific(X), m_Zero())) &&
+        Pred == ICmpInst::ICMP_EQ) {
+      RHSIsSelect = true;
+      match(RHS, m_UMax(m_Value(F), m_Value(E)));
+      D = A;
+    }
+  };
+  TryMatchSelectFromUMax(LHSIsSelect, RHS, RHSIsSelect, A, B, C, D, E, F);
+  TryMatchSelectFromUMax(RHSIsSelect, LHS, LHSIsSelect, D, E, F, A, B, C);
+
   FastMathFlags FMF;
   BuilderTy::FastMathFlagGuard Guard(Builder);
   if (isa(&I)) {
diff --git a/llvm/test/Transforms/InstCombine/binop-select.ll b/llvm/test/Transforms/InstCombine/binop-select.ll
index a59e19897f061d1..b064b3447e8135b 100644
--- a/llvm/test/Transforms/InstCombine/binop-select.ll
+++ b/llvm/test/Transforms/InstCombine/binop-select.ll
@@ -403,3 +403,61 @@ define i32 @ashr_sel_op1_use(i1 %b) {
   %r = ashr i32 -2, %s
   ret i32 %r
 }
+
+
+define i32 @umax_as_select_sub(i32 %a) {
+; CHECK-LABEL: @umax_as_select_sub(
+; CHECK-NEXT:    [[C_NOT:%.*]] = icmp eq i32 [[A:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = sub i32 2, [[A]]
+; CHECK-NEXT:    [[B:%.*]] = select i1 [[C_NOT]], i32 -1, i32 [[TMP1]]
+; CHECK-NEXT:    ret i32 [[B]]
+;
+  %c = icmp ugt i32 %a, 0
+  %s = select i1 %c, i32 2, i32 0
+  %m = call i32 @llvm.umax.i32(i32 %a, i32 1)
+  %b = sub i32 %s, %m
+  ret i32 %b
+}
+
+define i32 @umax_as_select_sub_c(i32 %a) {
+; CHECK-LABEL: @umax_as_select_sub_c(
+; CHECK-NEXT:    [[C_NOT:%.*]] = icmp eq i32 [[A:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = add i32 [[A]], -2
+; CHECK-NEXT:    [[B:%.*]] = select i1 [[C_NOT]], i32 1, i32 [[TMP1]]
+; CHECK-NEXT:    ret i32 [[B]]
+;
+  %c = icmp ugt i32 %a, 0
+  %s = select i1 %c, i32 2, i32 0
+  %m = call i32 @llvm.umax.i32(i32 %a, i32 1)
+  %b = sub i32 %m, %s
+  ret i32 %b
+}
+
+define i32 @umax_as_select_add(i32 %a, i32 %x, i32 %y) {
+; CHECK-LABEL: @umax_as_select_add(
+; CHECK-NEXT:    [[C_NOT:%.*]] = icmp eq i32 [[A:%.*]], 0
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[C_NOT]], i32 [[Y:%.*]], i32 [[X:%.*]]
+; CHECK-NEXT:    [[M:%.*]] = call i32 @llvm.umax.i32(i32 [[A]], i32 1)
+; CHECK-NEXT:    [[B:%.*]] = add i32 [[S]], [[M]]
+; CHECK-NEXT:    ret i32 [[B]]
+;
+  %c = icmp ugt i32 %a, 0
+  %s = select i1 %c, i32 %x, i32 %y
+  %m = call i32 @llvm.umax.i32(i32 %a, i32 1)
+  %b = add i32 %s, %m
+  ret i32 %b
+}
+
+define i32 @umax_as_select_mul(i32 %a) {
+; CHECK-LABEL: @umax_as_select_mul(
+; CHECK-NEXT:    [[TMP1:%.*]] = shl i32 [[A:%.*]], 1
+; CHECK-NEXT:    ret i32 [[TMP1]]
+;
+  %c = icmp ugt i32 %a, 0
+  %s = select i1 %c, i32 2, i32 0
+  %m = select i1 %c, i32 %a, i32 1
+  %b = mul i32 %s, %m
+  ret i32 %b
+}
+
+declare i32 @llvm.umax.i32(i32, i32)

nikic

Why is this specific to umin(1)? Doesn't the same work for, say, umin(2)? https://alive2.llvm.org/ce/z/gQqtpt

Or really any min/max intrinsic decomposed into select form.

davemgreen · 2023-11-16T18:41:24Z

Hi - thanks for taking a look and sorry this took a while to get back to. This was causing some regressions in my motivating case due to knock-on register allocation. That should hopefully be resolved now and so it leads to an improvement as you would expect.

I've been looking at trying to generalize it to more cases and deal with the undef. I'll update this patch with the more special case, which the general case can be added to but is more complex. I think I have it working, but have to check through some of the details.

Unfortunately it appears some cases are not valid with undef arguments. I've tried to limit it to valid cases (as opposed to adding freeze). I know that undef can be odd at times but I'm not sure it should be this odd.

…ct fold. There is an existing instcombine for folding: `(A ? B : C) binop (A ? E : F) -> A ? (B binop E) : (C binop F)`. However this will not combine if the select `(x>=1 ? x : 1)` has been converted to a `umax(x, 1)`. This adds code to treat the umax as a select, allowing binops to fold into the select. This patch handles the more special case where the predicate is the limit extreme (such as `x==0 ? x : 1`), meaning it will use a ==0 predicate as opposed to the >= predicate usually applied to a umax. According to alive this is valid so long as the variables are not undef, are known to be non-equal (https://alive2.llvm.org/ce/z/VAYw3J) or known to be the same value (https://alive2.llvm.org/ce/z/4TSd7v). Here are some proofs for different min/maxes: umax: https://alive2.llvm.org/ce/z/kLw4J9 umin: https://alive2.llvm.org/ce/z/nvPa7p smin: https://alive2.llvm.org/ce/z/HS8V7V smax: https://alive2.llvm.org/ce/z/D2ocju

This is a generalization to the code added in llvm#65978, attempting to handle more with predicates and min/max that match. It was originally trying to use matchDecomposedSelectPattern but I replaced that with just checking the predicates and SPF match. The opposite conditions are also handled, which just inverts the order of the E/F variables. TODO: Add some proofs.

davemgreen · 2023-11-16T18:51:14Z

The more general cases will hopefully be something like davemgreen@7ba90fb.

nikic · 2023-11-16T20:40:51Z

I'm not a fan of the special undef handling here. We should either insert the necessary freeze, or, if that covers your motivating case, we can start with guarding the transform under isGuaranteedNotToBeUndefOrPoison().

davemgreen requested review from nikic, rotateright and goldsteinn September 11, 2023 16:48

davemgreen requested a review from a team as a code owner September 11, 2023 16:48

llvmbot added the llvm:transforms label Sep 11, 2023

nikic reviewed Sep 15, 2023

View reviewed changes

[Instcombine] Add tests for treating umax as a select. NFC

aee299e

davemgreen force-pushed the gh-ic-umaxasselect branch from 8883173 to dcf0716 Compare November 16, 2023 18:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Treat umax as select(icmp eq x, 0), 1, x) in binop select fold. #65978

[InstCombine] Treat umax as select(icmp eq x, 0), 1, x) in binop select fold. #65978

davemgreen commented Sep 11, 2023 •

edited

llvmbot commented Sep 11, 2023

nikic left a comment

davemgreen commented Nov 16, 2023

davemgreen commented Nov 16, 2023

nikic commented Nov 16, 2023

[InstCombine] Treat umax as select(icmp eq x, 0), 1, x) in binop select fold. #65978

Are you sure you want to change the base?

[InstCombine] Treat umax as select(icmp eq x, 0), 1, x) in binop select fold. #65978

Conversation

davemgreen commented Sep 11, 2023 • edited

llvmbot commented Sep 11, 2023

nikic left a comment

Choose a reason for hiding this comment

davemgreen commented Nov 16, 2023

davemgreen commented Nov 16, 2023

nikic commented Nov 16, 2023

davemgreen commented Sep 11, 2023 •

edited