Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold select (A &/| B), T, F if select B, T, F is foldable #76621

Merged
merged 2 commits into from
Dec 31, 2023

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Dec 30, 2023

This patch does the following folds:

(select A && B, T, F) -> (select A, (select B, T, F), F)
(select A || B, T, F) -> (select A, T, (select B, T, F))

if (select B, T, F) can be folded into a value or a canonicalized SPF.
Alive2: https://alive2.llvm.org/ce/z/4Bdrbu

The original motivation of this patch is to simplify the following pattern:

%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%cmp9.i = icmp ugt i64 %add.i, 1152921504606846975
%or.cond.i = or i1 %cmp7.i, %cmp9.i
%cond.i = select i1 %or.cond.i, i64 1152921504606846975, i64 %add.i
->
%.sroa.speculated.i = tail call i64 @llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%max = call i64 @llvm.umax.i64(i64 %add.i, 1152921504606846975)
%cond.i = select i1 %cmp7.i, i64 1152921504606846975, i64 %max

The later form has a better codegen for some backends. It is also more analysis-friendly than the original one.
Godbolt: https://godbolt.org/z/eK6eb5jf1
Alive2: https://alive2.llvm.org/ce/z/VHlxL2

Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=7c71d3996a72b9b024622f23bf556539b961c88c&to=638ce8666fadaca1ab2639a3c2bc52a4a8508f40&stat=instructions:u

stage1-O3 stage1-ReleaseThinLTO stage1-ReleaseLTO-g stage1-O0-g stage2-O3 stage2-O0-g stage2-clang
+0.02% -0.00% +0.02% -0.03% -0.00% -0.05% -0.00%

It is an alternative to #76203 and #76363 because we can simplify select (icmp eq/ne a, b), a, b into b or a.
Fixes #75784.
Fixes #76043.

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 30, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

This patch does the following folds:

(select A && B, T, F) -> (select A, (select B, T, F), F)
(select A || B, T, F) -> (select A, T, (select B, T, F))

if (select B, T, F) can be folded into a value or a canonicalized SPF.
Alive2: https://alive2.llvm.org/ce/z/4Bdrbu

The original motivation of this patch is to simplify the following pattern:

%.sroa.speculated.i = tail call i64 @<!-- -->llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%cmp9.i = icmp ugt i64 %add.i, 1152921504606846975
%or.cond.i = or i1 %cmp7.i, %cmp9.i
%cond.i = select i1 %or.cond.i, i64 1152921504606846975, i64 %add.i
-&gt;
%.sroa.speculated.i = tail call i64 @<!-- -->llvm.umax.i64(i64 %sub.ptr.div.i.i, i64 1)
%add.i = add i64 %.sroa.speculated.i, %sub.ptr.div.i.i
%cmp7.i = icmp ult i64 %add.i, %sub.ptr.div.i.i
%max = call i64 @<!-- -->llvm.umax.i64(i64 %add.i, 1152921504606846975)
%cond.i = select i1 %cmp7.i, i64 1152921504606846975, i64 %max

The later form has a better codegen for some backends. It is also more analysis-friendly than the original one.
Godbolt: https://godbolt.org/z/zsz3bvGze
Alive2: https://alive2.llvm.org/ce/z/VHlxL2

Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=7c71d3996a72b9b024622f23bf556539b961c88c&amp;to=638ce8666fadaca1ab2639a3c2bc52a4a8508f40&amp;stat=instructions:u

stage1-O3 stage1-ReleaseThinLTO stage1-ReleaseLTO-g stage1-O0-g stage2-O3 stage2-O0-g stage2-clang
+0.02% -0.00% +0.02% -0.03% -0.00% -0.05% -0.00%

It is also an alternative to #76203 because we can simplify select (icmp eq/ne a, b), a, b into b or a.
Fixes #75784.


Patch is 28.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76621.diff

5 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+58-11)
  • (modified) llvm/test/Transforms/InstCombine/select-and-or.ll (+386-24)
  • (modified) llvm/test/Transforms/InstCombine/select-factorize.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/zext-or-icmp.ll (+3-5)
  • (modified) llvm/test/Transforms/Reassociate/basictest.ll (+1-3)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 3c6ce450c5bcfa..cf66f5be2d408f 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1171,14 +1171,15 @@ static Value *foldSelectCttzCtlz(ICmpInst *ICI, Value *TrueVal, Value *FalseVal,
   return nullptr;
 }
 
-static Instruction *canonicalizeSPF(SelectInst &Sel, ICmpInst &Cmp,
-                                    InstCombinerImpl &IC) {
+static Value *canonicalizeSPF(ICmpInst &Cmp, Value *TrueVal, Value *FalseVal,
+                              InstCombinerImpl &IC) {
   Value *LHS, *RHS;
   // TODO: What to do with pointer min/max patterns?
-  if (!Sel.getType()->isIntOrIntVectorTy())
+  if (!TrueVal->getType()->isIntOrIntVectorTy())
     return nullptr;
 
-  SelectPatternFlavor SPF = matchSelectPattern(&Sel, LHS, RHS).Flavor;
+  SelectPatternFlavor SPF =
+      matchDecomposedSelectPattern(&Cmp, TrueVal, FalseVal, LHS, RHS).Flavor;
   if (SPF == SelectPatternFlavor::SPF_ABS ||
       SPF == SelectPatternFlavor::SPF_NABS) {
     if (!Cmp.hasOneUse() && !RHS->hasOneUse())
@@ -1188,13 +1189,13 @@ static Instruction *canonicalizeSPF(SelectInst &Sel, ICmpInst &Cmp,
     bool IntMinIsPoison = SPF == SelectPatternFlavor::SPF_ABS &&
                           match(RHS, m_NSWNeg(m_Specific(LHS)));
     Constant *IntMinIsPoisonC =
-        ConstantInt::get(Type::getInt1Ty(Sel.getContext()), IntMinIsPoison);
+        ConstantInt::get(Type::getInt1Ty(Cmp.getContext()), IntMinIsPoison);
     Instruction *Abs =
         IC.Builder.CreateBinaryIntrinsic(Intrinsic::abs, LHS, IntMinIsPoisonC);
 
     if (SPF == SelectPatternFlavor::SPF_NABS)
-      return BinaryOperator::CreateNeg(Abs); // Always without NSW flag!
-    return IC.replaceInstUsesWith(Sel, Abs);
+      return IC.Builder.CreateNeg(Abs); // Always without NSW flag!
+    return Abs;
   }
 
   if (SelectPatternResult::isMinOrMax(SPF)) {
@@ -1215,8 +1216,7 @@ static Instruction *canonicalizeSPF(SelectInst &Sel, ICmpInst &Cmp,
     default:
       llvm_unreachable("Unexpected SPF");
     }
-    return IC.replaceInstUsesWith(
-        Sel, IC.Builder.CreateBinaryIntrinsic(IntrinsicID, LHS, RHS));
+    return IC.Builder.CreateBinaryIntrinsic(IntrinsicID, LHS, RHS);
   }
 
   return nullptr;
@@ -1677,8 +1677,9 @@ Instruction *InstCombinerImpl::foldSelectInstWithICmp(SelectInst &SI,
   if (Instruction *NewSel = foldSelectValueEquivalence(SI, *ICI))
     return NewSel;
 
-  if (Instruction *NewSPF = canonicalizeSPF(SI, *ICI, *this))
-    return NewSPF;
+  if (Value *V =
+          canonicalizeSPF(*ICI, SI.getTrueValue(), SI.getFalseValue(), *this))
+    return replaceInstUsesWith(SI, V);
 
   if (Value *V = foldSelectInstWithICmpConst(SI, ICI, Builder))
     return replaceInstUsesWith(SI, V);
@@ -3793,5 +3794,51 @@ Instruction *InstCombinerImpl::visitSelectInst(SelectInst &SI) {
   if (Instruction *I = foldBitCeil(SI, Builder))
     return I;
 
+  // Fold:
+  // (select A && B, T, F) -> (select A, (select B, T, F), F)
+  // (select A || B, T, F) -> (select A, T, (select B, T, F))
+  // if (select B, T, F) is foldable.
+  // TODO: preserve FMF flags
+  auto FoldSelectWithAndOrCond = [&](bool IsAnd, Value *A,
+                                     Value *B) -> Instruction * {
+    if (Value *V = simplifySelectInst(B, TrueVal, FalseVal,
+                                      SQ.getWithInstruction(&SI)))
+      return SelectInst::Create(A, IsAnd ? V : TrueVal, IsAnd ? FalseVal : V);
+
+    // Is (select B, T, F) a SPF?
+    if (CondVal->hasOneUse() && SelType->isIntOrIntVectorTy()) {
+      Value *LHS, *RHS;
+      if (ICmpInst *Cmp = dyn_cast<ICmpInst>(B))
+        if (Value *V = canonicalizeSPF(*Cmp, TrueVal, FalseVal, *this))
+          return SelectInst::Create(A, IsAnd ? V : TrueVal,
+                                    IsAnd ? FalseVal : V);
+    }
+
+    return nullptr;
+  };
+
+  Value *LHS, *RHS;
+  if (match(CondVal, m_And(m_Value(LHS), m_Value(RHS)))) {
+    if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ true, LHS, RHS))
+      return I;
+    if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ true, RHS, LHS))
+      return I;
+  } else if (match(CondVal, m_Or(m_Value(LHS), m_Value(RHS)))) {
+    if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ false, LHS, RHS))
+      return I;
+    if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ false, RHS, LHS))
+      return I;
+  } else {
+    // We cannot swap the operands of logical and/or.
+    // TODO: Can we swap the operands by inserting a freeze?
+    if (match(CondVal, m_LogicalAnd(m_Value(LHS), m_Value(RHS)))) {
+      if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ true, LHS, RHS))
+        return I;
+    } else if (match(CondVal, m_LogicalOr(m_Value(LHS), m_Value(RHS)))) {
+      if (Instruction *I = FoldSelectWithAndOrCond(/*IsAnd*/ false, LHS, RHS))
+        return I;
+    }
+  }
+
   return nullptr;
 }
diff --git a/llvm/test/Transforms/InstCombine/select-and-or.ll b/llvm/test/Transforms/InstCombine/select-and-or.ll
index 7edcd767b86ecb..ad63044cdc322a 100644
--- a/llvm/test/Transforms/InstCombine/select-and-or.ll
+++ b/llvm/test/Transforms/InstCombine/select-and-or.ll
@@ -613,9 +613,9 @@ define i1 @and_or2_wrong_operand(i1 %a, i1 %b, i1 %c, i1 %d) {
 
 define i1 @and_or3(i1 %a, i1 %b, i32 %x, i32 %y) {
 ; CHECK-LABEL: @and_or3(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], i1 true, i1 [[A:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = select i1 [[B:%.*]], i1 [[TMP1]], i1 false
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[TMP1]], i1 true, i1 [[A:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = select i1 [[B:%.*]], i1 [[TMP2]], i1 false
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %c = icmp eq i32 %x, %y
@@ -626,9 +626,9 @@ define i1 @and_or3(i1 %a, i1 %b, i32 %x, i32 %y) {
 
 define i1 @and_or3_commuted(i1 %a, i1 %b, i32 %x, i32 %y) {
 ; CHECK-LABEL: @and_or3_commuted(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], i1 true, i1 [[A:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = select i1 [[B:%.*]], i1 [[TMP1]], i1 false
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[TMP1]], i1 true, i1 [[A:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = select i1 [[B:%.*]], i1 [[TMP2]], i1 false
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %c = icmp eq i32 %x, %y
@@ -665,9 +665,9 @@ define i1 @and_or3_multiuse(i1 %a, i1 %b, i32 %x, i32 %y) {
 
 define <2 x i1> @and_or3_vec(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @and_or3_vec(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select <2 x i1> [[C]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[A:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[B:%.*]], <2 x i1> [[TMP1]], <2 x i1> zeroinitializer
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[A:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[B:%.*]], <2 x i1> [[TMP2]], <2 x i1> zeroinitializer
 ; CHECK-NEXT:    ret <2 x i1> [[R]]
 ;
   %c = icmp eq <2 x i32> %x, %y
@@ -678,9 +678,9 @@ define <2 x i1> @and_or3_vec(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %
 
 define <2 x i1> @and_or3_vec_commuted(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @and_or3_vec_commuted(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select <2 x i1> [[C]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[A:%.*]]
-; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[B:%.*]], <2 x i1> [[TMP1]], <2 x i1> zeroinitializer
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[A:%.*]]
+; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[B:%.*]], <2 x i1> [[TMP2]], <2 x i1> zeroinitializer
 ; CHECK-NEXT:    ret <2 x i1> [[R]]
 ;
   %c = icmp eq <2 x i32> %x, %y
@@ -877,9 +877,9 @@ entry:
 
 define i1 @or_and3(i1 %a, i1 %b, i32 %x, i32 %y) {
 ; CHECK-LABEL: @or_and3(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], i1 [[B:%.*]], i1 false
-; CHECK-NEXT:    [[R:%.*]] = select i1 [[A:%.*]], i1 true, i1 [[TMP1]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[TMP1]], i1 [[B:%.*]], i1 false
+; CHECK-NEXT:    [[R:%.*]] = select i1 [[A:%.*]], i1 true, i1 [[TMP2]]
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %c = icmp eq i32 %x, %y
@@ -890,9 +890,9 @@ define i1 @or_and3(i1 %a, i1 %b, i32 %x, i32 %y) {
 
 define i1 @or_and3_commuted(i1 %a, i1 %b, i32 %x, i32 %y) {
 ; CHECK-LABEL: @or_and3_commuted(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select i1 [[C]], i1 [[B:%.*]], i1 false
-; CHECK-NEXT:    [[R:%.*]] = select i1 [[A:%.*]], i1 true, i1 [[TMP1]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne i32 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[TMP1]], i1 [[B:%.*]], i1 false
+; CHECK-NEXT:    [[R:%.*]] = select i1 [[A:%.*]], i1 true, i1 [[TMP2]]
 ; CHECK-NEXT:    ret i1 [[R]]
 ;
   %c = icmp eq i32 %x, %y
@@ -929,9 +929,9 @@ define i1 @or_and3_multiuse(i1 %a, i1 %b, i32 %x, i32 %y) {
 
 define <2 x i1> @or_and3_vec(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @or_and3_vec(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select <2 x i1> [[C]], <2 x i1> [[B:%.*]], <2 x i1> zeroinitializer
-; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[A:%.*]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[TMP1]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i1> [[B:%.*]], <2 x i1> zeroinitializer
+; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[A:%.*]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[TMP2]]
 ; CHECK-NEXT:    ret <2 x i1> [[R]]
 ;
   %c = icmp eq <2 x i32> %x, %y
@@ -942,9 +942,9 @@ define <2 x i1> @or_and3_vec(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %
 
 define <2 x i1> @or_and3_vec_commuted(<2 x i1> %a, <2 x i1> %b, <2 x i32> %x, <2 x i32> %y) {
 ; CHECK-LABEL: @or_and3_vec_commuted(
-; CHECK-NEXT:    [[C:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[TMP1:%.*]] = select <2 x i1> [[C]], <2 x i1> [[B:%.*]], <2 x i1> zeroinitializer
-; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[A:%.*]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[TMP1]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp ne <2 x i32> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = select <2 x i1> [[TMP1]], <2 x i1> [[B:%.*]], <2 x i1> zeroinitializer
+; CHECK-NEXT:    [[R:%.*]] = select <2 x i1> [[A:%.*]], <2 x i1> <i1 true, i1 true>, <2 x i1> [[TMP2]]
 ; CHECK-NEXT:    ret <2 x i1> [[R]]
 ;
   %c = icmp eq <2 x i32> %x, %y
@@ -965,3 +965,365 @@ define i1 @or_and3_wrong_operand(i1 %a, i1 %b, i32 %x, i32 %y, i1 %d) {
   %r = select i1 %cond, i1 %d, i1 %b
   ret i1 %r
 }
+
+define i8 @test_or_umax(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_or_umax(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[X]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %or = select i1 %cond, i1 true, i1 %cmp
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_or_umin(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_or_umin(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umin.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[Y]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %or = select i1 %cond, i1 true, i1 %cmp
+  %ret = select i1 %or, i8 %y, i8 %x
+  ret i8 %ret
+}
+
+define i8 @test_and_umax(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_and_umax(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[TMP1]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %and = select i1 %cond, i1 %cmp, i1 false
+  %ret = select i1 %and, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_and_umin(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_and_umin(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umin.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[TMP1]], i8 [[X]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %and = select i1 %cond, i1 %cmp, i1 false
+  %ret = select i1 %and, i8 %y, i8 %x
+  ret i8 %ret
+}
+
+define i8 @test_or_umax_bitwise1(i8 %x, i8 %y, i8 %val) {
+; CHECK-LABEL: @test_or_umax_bitwise1(
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i8 [[VAL:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND]], i8 [[X]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cond = icmp eq i8 %val, 0 ; thwart complexity-based ordering
+  %cmp = icmp ugt i8 %x, %y
+  %or = or i1 %cond, %cmp
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_or_umax_bitwise2(i8 %x, i8 %y, i8 %val) {
+; CHECK-LABEL: @test_or_umax_bitwise2(
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i8 [[VAL:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND]], i8 [[X]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cond = icmp eq i8 %val, 0 ; thwart complexity-based ordering
+  %cmp = icmp ugt i8 %x, %y
+  %or = or i1 %cmp, %cond
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_and_umax_bitwise1(i8 %x, i8 %y, i8 %val) {
+; CHECK-LABEL: @test_and_umax_bitwise1(
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i8 [[VAL:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND]], i8 [[TMP1]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cond = icmp eq i8 %val, 0 ; thwart complexity-based ordering
+  %cmp = icmp ugt i8 %x, %y
+  %and = and i1 %cond, %cmp
+  %ret = select i1 %and, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_and_umax_bitwise2(i8 %x, i8 %y, i8 %val) {
+; CHECK-LABEL: @test_and_umax_bitwise2(
+; CHECK-NEXT:    [[COND:%.*]] = icmp eq i8 [[VAL:%.*]], 0
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.umax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND]], i8 [[TMP1]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cond = icmp eq i8 %val, 0 ; thwart complexity-based ordering
+  %cmp = icmp ugt i8 %x, %y
+  %and = and i1 %cmp, %cond
+  %ret = select i1 %and, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+; Other SPFs
+
+define i8 @test_or_smax(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_or_smax(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.smax.i8(i8 [[X:%.*]], i8 [[Y:%.*]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[X]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp sgt i8 %x, %y
+  %or = select i1 %cond, i1 true, i1 %cmp
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_or_abs(i8 %x, i1 %cond) {
+; CHECK-LABEL: @test_or_abs(
+; CHECK-NEXT:    [[TMP1:%.*]] = call i8 @llvm.abs.i8(i8 [[X:%.*]], i1 true)
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[COND:%.*]], i8 [[X]], i8 [[TMP1]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp sgt i8 %x, -1
+  %neg = sub nsw i8 0, %x
+  %or = select i1 %cond, i1 true, i1 %cmp
+  %ret = select i1 %or, i8 %x, i8 %neg
+  ret i8 %ret
+}
+
+; TODO: fold SPF_FMAXNUM
+define float @test_or_fmaxnum(float %x, float %y, i1 %cond) {
+; CHECK-LABEL: @test_or_fmaxnum(
+; CHECK-NEXT:    [[CMP:%.*]] = fcmp nnan ogt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[OR:%.*]] = select i1 [[COND:%.*]], i1 true, i1 [[CMP]]
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[OR]], float [[X]], float [[Y]]
+; CHECK-NEXT:    ret float [[RET]]
+;
+  %cmp = fcmp nnan ogt float %x, %y
+  %or = select i1 %cond, i1 true, i1 %cmp
+  %ret = select i1 %or, float %x, float %y
+  ret float %ret
+}
+
+; Negative tests
+
+define i8 @test_or_umax_invalid_logical(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_or_umax_invalid_logical(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[OR:%.*]] = select i1 [[CMP]], i1 true, i1 [[COND:%.*]]
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[OR]], i8 [[X]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %or = select i1 %cmp, i1 true, i1 %cond
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_and_umax_invalid_logical(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_and_umax_invalid_logical(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[AND:%.*]] = select i1 [[CMP]], i1 [[COND:%.*]], i1 false
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[AND]], i8 [[X]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %and = select i1 %cmp, i1 %cond, i1 false
+  %ret = select i1 %and, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+define i8 @test_or_umax_multiuse_cond(i8 %x, i8 %y, i1 %cond) {
+; CHECK-LABEL: @test_or_umax_multiuse_cond(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[OR:%.*]] = select i1 [[COND:%.*]], i1 true, i1 [[CMP]]
+; CHECK-NEXT:    call void @use(i1 [[OR]])
+; CHECK-NEXT:    [[RET:%.*]] = select i1 [[OR]], i8 [[X]], i8 [[Y]]
+; CHECK-NEXT:    ret i8 [[RET]]
+;
+  %cmp = icmp ugt i8 %x, %y
+  %or = select i1 %cond, i1 true, i1 %cmp
+  call void @use(i1 %or)
+  %ret = select i1 %or, i8 %x, i8 %y
+  ret i8 %ret
+}
+
+; Tests from PR76203
+
+define i8 @test_or_eq_a_b(i1 %other_cond, i8 %a, i8 %b)  {
+; CHECK-LABEL: @test_or_eq_a_b(
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[OTHER_COND:%.*]], i8 [[A:%.*]], i8 [[B:%.*]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp eq i8 %a, %b
+  %cond = or i1 %other_cond, %cmp
+  %select = select i1 %cond, i8 %a, i8 %b
+  ret i8 %select
+}
+
+define i8 @test_and_ne_a_b(i1 %other_cond, i8 %a, i8 %b)  {
+; CHECK-LABEL: @test_and_ne_a_b(
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[OTHER_COND:%.*]], i8 [[A:%.*]], i8 [[B:%.*]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp ne i8 %a, %b
+  %cond = and i1 %other_cond, %cmp
+  %select = select i1 %cond, i8 %a, i8 %b
+  ret i8 %select
+}
+
+define i8 @test_or_eq_a_b_commuted(i1 %other_cond, i8 %a, i8 %b)  {
+; CHECK-LABEL: @test_or_eq_a_b_commuted(
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[OTHER_COND:%.*]], i8 [[B:%.*]], i8 [[A:%.*]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp eq i8 %a, %b
+  %cond = or i1 %other_cond, %cmp
+  %select = select i1 %cond, i8 %b, i8 %a
+  ret i8 %select
+}
+
+define i8 @test_and_ne_a_b_commuted(i1 %other_cond, i8 %a, i8 %b)  {
+; CHECK-LABEL: @test_and_ne_a_b_commuted(
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[OTHER_COND:%.*]], i8 [[B:%.*]], i8 [[A:%.*]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp ne i8 %a, %b
+  %cond = and i1 %other_cond, %cmp
+  %select = select i1 %cond, i8 %b, i8 %a
+  ret i8 %select
+}
+
+define i8 @test_or_eq_different_operands(i8 %a, i8 %b, i8 %c)  {
+; CHECK-LABEL: @test_or_eq_different_operands(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[A:%.*]], [[C:%.*]]
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[CMP]], i8 [[A]], i8 [[B:%.*]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp eq i8 %a, %c
+  %cmp1 = icmp eq i8 %b, %a
+  %cond = or i1 %cmp, %cmp1
+  %select = select i1 %cond, i8 %a, i8 %b
+  ret i8 %select
+}
+
+define i8 @test_or_eq_a_b_multi_use(i1 %other_cond, i8 %a, i8 %b)  {
+; CHECK-LABEL: @test_or_eq_a_b_multi_use(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[A:%.*]], [[B:%.*]]
+; CHECK-NEXT:    [[COND:%.*]] = or i1 [[CMP]], [[OTHER_COND:%.*]]
+; CHECK-NEXT:    call void @use(i1 [[CMP]])
+; CHECK-NEXT:    call void @use(i1 [[COND]])
+; CHECK-NEXT:    [[SELECT:%.*]] = select i1 [[OTHER_COND]], i8 [[A]], i8 [[B]]
+; CHECK-NEXT:    ret i8 [[SELECT]]
+;
+  %cmp = icmp eq i...
[truncated]

return SelectInst::Create(A, IsAnd ? V : TrueVal, IsAnd ? FalseVal : V);

// Is (select B, T, F) a SPF?
if (CondVal->hasOneUse() && SelType->isIntOrIntVectorTy()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be profitable if we can fold more decomposed patterns here (e.g., foldAndOrOfICmps).
But I don't think it is a scalable approach.
See also #76623 (comment).

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
@@ -303,7 +303,7 @@ define i1 @and_logic_and_logic_or_not_one_use(i1 %c, i1 %a, i1 %b) {
; CHECK-LABEL: @and_logic_and_logic_or_not_one_use(
; CHECK-NEXT: [[AC:%.*]] = and i1 [[A:%.*]], [[C:%.*]]
; CHECK-NEXT: [[BC:%.*]] = select i1 [[B:%.*]], i1 [[C]], i1 false
; CHECK-NEXT: [[OR:%.*]] = select i1 [[BC]], i1 true, i1 [[AC]]
; CHECK-NEXT: [[OR:%.*]] = select i1 [[B]], i1 [[C]], i1 [[AC]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider this a regression, as logical or is more canonical. Skip if the outer select is already a logical op?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or maybe if the operand is already constant?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Though it would be an improvement if the condition weren't multi-use, so maybe it's okay to just ignore this.)

@nikic
Copy link
Contributor

nikic commented Dec 30, 2023

Does this also fix #76043?

dtcxzyw added a commit that referenced this pull request Dec 31, 2023
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@XChy XChy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@dtcxzyw dtcxzyw merged commit b23f59a into llvm:main Dec 31, 2023
4 checks passed
@dtcxzyw dtcxzyw deleted the perf/select-and-or-fold branch December 31, 2023 10:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants