Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Improve coverage of foldSelectValueEquivalence #88298

Closed

Conversation

goldsteinn
Copy link
Contributor

@goldsteinn goldsteinn commented Apr 10, 2024

  • [InstCombine] Add tests for expanding foldSelectValueEquivalence; NFC
  • [InstCombine] Improve coverage of foldSelectValueEquivalence
  1. We don't need the noundef check if the new simplification is a
    constant and doesn't contain undefs.

This cleans up regressions from folding multiuse:
(icmp eq/ne (sub/xor x, y), 0) -> (icmp eq/ne x, y).
in #87275

@goldsteinn goldsteinn requested a review from nikic as a code owner April 10, 2024 17:12
@goldsteinn goldsteinn requested a review from dtcxzyw April 10, 2024 17:12
@goldsteinn goldsteinn changed the title goldsteinn/cmp eq fold select op [InstCombine] Improve coverage of foldSelectValueEquivalence Apr 10, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 10, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes
  • [InstCombine] Add tests for expanding foldSelectValueEquivalence; NFC
  • [InstCombine] Improve coverage of foldSelectValueEquivalence

Full diff: https://github.com/llvm/llvm-project/pull/88298.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+80-24)
  • (modified) llvm/test/Transforms/InstCombine/abs-1.ll (+2-5)
  • (added) llvm/test/Transforms/InstCombine/select-cmp-eq-op-fold.ll (+137)
  • (modified) llvm/test/Transforms/InstCombine/select.ll (+2-4)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 2d78fcee1152d7..8a1eabef8d3589 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1285,22 +1285,76 @@ Instruction *InstCombinerImpl::foldSelectValueEquivalence(SelectInst &Sel,
     Swapped = true;
   }
 
-  // In X == Y ? f(X) : Z, try to evaluate f(Y) and replace the operand.
-  // Make sure Y cannot be undef though, as we might pick different values for
-  // undef in the icmp and in f(Y). Additionally, take care to avoid replacing
-  // X == Y ? X : Z with X == Y ? Y : Z, as that would lead to an infinite
-  // replacement cycle.
   Value *CmpLHS = Cmp.getOperand(0), *CmpRHS = Cmp.getOperand(1);
-  if (TrueVal != CmpLHS &&
-      isGuaranteedNotToBeUndefOrPoison(CmpRHS, SQ.AC, &Sel, &DT)) {
-    if (Value *V = simplifyWithOpReplaced(TrueVal, CmpLHS, CmpRHS, SQ,
-                                          /* AllowRefinement */ true))
-      // Require either the replacement or the simplification result to be a
-      // constant to avoid infinite loops.
-      // FIXME: Make this check more precise.
-      if (isa<Constant>(CmpRHS) || isa<Constant>(V))
-        return replaceOperand(Sel, Swapped ? 2 : 1, V);
+  auto ReplaceLHSOpWithRHSOp = [&](Value *OldOp,
+                                   Value *NewOp) -> Instruction * {
+    // In X == Y ? f(X) : Z, try to evaluate f(Y) and replace the operand.
+    // Take care to avoid replacing X == Y ? X : Z with X == Y ? Y : Z, as that
+    // would lead to an infinite replacement cycle.
+    // If we will be able to evaluate f(Y) to a constant, we can allow undef,
+    // otherwise Y cannot be undef as we might pick different values for undef
+    // in the icmp and in f(Y).
+    if (TrueVal == OldOp)
+      return nullptr;
+
+    std::optional<bool> IsNeverUndefCached;
+    auto IsNeverUndef = [&](Value *Op) {
+      if (!IsNeverUndefCached.has_value())
+        IsNeverUndefCached =
+            isGuaranteedNotToBeUndefOrPoison(Op, SQ.AC, &Sel, &DT);
+      return *IsNeverUndefCached;
+    };
 
+    if (Value *V = simplifyWithOpReplaced(TrueVal, OldOp, NewOp, SQ,
+                                          /* AllowRefinement */ true)) {
+      // Need some gurantees about the new simplified op to ensure we don't inf
+      // loop.
+      // If we simplify to a constant, replace.
+      bool ShouldReplace = match(V, m_ImmConstant());
+      bool NeedsNoUndef = !ShouldReplace;
+      // Or replace if either NewOp is a constant
+      if (!ShouldReplace && match(NewOp, m_ImmConstant()))
+        ShouldReplace = true;
+      // Or if we end up simplifying f(Y) -> Y i.e: Old & New -> New & New ->
+      // New.
+      if (!ShouldReplace && V == NewOp)
+        ShouldReplace = true;
+
+      // Finally, if we are going to create a new one-use instruction, replace.
+      if (!ShouldReplace && isa<Instruction>(OldOp) && OldOp->hasNUses(2) &&
+          (!isa<Instruction>(NewOp) || !NewOp->hasOneUse()))
+        ShouldReplace = true;
+
+      // Unless we simplify the new instruction to a constant, need to ensure Y
+      // is not undef.
+      if (NeedsNoUndef && ShouldReplace)
+        ShouldReplace = IsNeverUndef(NewOp);
+
+      if (ShouldReplace)
+        return replaceOperand(Sel, Swapped ? 2 : 1, V);
+    }
+    // If we can't simplify, but we will either:
+    //  1) Create a new binop where both ops are NewOp i.e (add x, y) is "worse"
+    //     than (add y, y) in this case, wait until the second call so we don't
+    //     miss a one-use simplification.
+    //  2) Create a new one-use instruction.
+    // proceed.
+    else if (match(TrueVal, m_c_BinOp(m_Specific(OldOp), m_Specific(NewOp))) ||
+             (isa<Instruction>(TrueVal) && TrueVal->hasOneUse() &&
+              isa<Instruction>(OldOp) && OldOp->hasNUses(2) &&
+              (!isa<Instruction>(NewOp) || !NewOp->hasOneUse()))) {
+      auto *TrueIns = cast<Instruction>(TrueVal);
+      for (unsigned OpIdx = 0; OpIdx < TrueIns->getNumOperands(); ++OpIdx) {
+        if (TrueIns->getOperand(OpIdx) == OldOp) {
+          // Need to ensure NewOp is noundef (same reason as above). Wait until
+          // the last moment to do this check as it can be relatively expensive.
+          if (!IsNeverUndef(NewOp))
+            break;
+          TrueIns->setOperand(OpIdx, NewOp);
+          return replaceOperand(Sel, Swapped ? 2 : 1, TrueIns);
+        }
+      }
+    }
     // Even if TrueVal does not simplify, we can directly replace a use of
     // CmpLHS with CmpRHS, as long as the instruction is not used anywhere
     // else and is safe to speculatively execute (we may end up executing it
@@ -1308,17 +1362,19 @@ Instruction *InstCombinerImpl::foldSelectValueEquivalence(SelectInst &Sel,
     // undefined behavior). Only do this if CmpRHS is a constant, as
     // profitability is not clear for other cases.
     // FIXME: Support vectors.
-    if (match(CmpRHS, m_ImmConstant()) && !match(CmpLHS, m_ImmConstant()) &&
-        !Cmp.getType()->isVectorTy())
-      if (replaceInInstruction(TrueVal, CmpLHS, CmpRHS))
+    if (OldOp == CmpLHS && match(NewOp, m_ImmConstant()) &&
+        !match(OldOp, m_ImmConstant()) && !Cmp.getType()->isVectorTy() &&
+        IsNeverUndef(NewOp))
+      if (replaceInInstruction(TrueVal, OldOp, NewOp))
         return &Sel;
-  }
-  if (TrueVal != CmpRHS &&
-      isGuaranteedNotToBeUndefOrPoison(CmpLHS, SQ.AC, &Sel, &DT))
-    if (Value *V = simplifyWithOpReplaced(TrueVal, CmpRHS, CmpLHS, SQ,
-                                          /* AllowRefinement */ true))
-      if (isa<Constant>(CmpLHS) || isa<Constant>(V))
-        return replaceOperand(Sel, Swapped ? 2 : 1, V);
+
+    return nullptr;
+  };
+
+  if (Instruction *R = ReplaceLHSOpWithRHSOp(CmpLHS, CmpRHS))
+    return R;
+  if (Instruction *R = ReplaceLHSOpWithRHSOp(CmpRHS, CmpLHS))
+    return R;
 
   auto *FalseInst = dyn_cast<Instruction>(FalseVal);
   if (!FalseInst)
diff --git a/llvm/test/Transforms/InstCombine/abs-1.ll b/llvm/test/Transforms/InstCombine/abs-1.ll
index 7355c560c820b2..71021d069f1350 100644
--- a/llvm/test/Transforms/InstCombine/abs-1.ll
+++ b/llvm/test/Transforms/InstCombine/abs-1.ll
@@ -852,11 +852,8 @@ define i8 @abs_diff_signed_sgt_nuw_extra_use3(i8 %a, i8 %b) {
 
 define i32 @abs_diff_signed_slt_swap_wrong_pred1(i32 %a, i32 %b) {
 ; CHECK-LABEL: @abs_diff_signed_slt_swap_wrong_pred1(
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[A:%.*]], [[B:%.*]]
-; CHECK-NEXT:    [[SUB_BA:%.*]] = sub nsw i32 [[B]], [[A]]
-; CHECK-NEXT:    [[SUB_AB:%.*]] = sub nsw i32 [[A]], [[B]]
-; CHECK-NEXT:    [[COND:%.*]] = select i1 [[CMP]], i32 [[SUB_BA]], i32 [[SUB_AB]]
-; CHECK-NEXT:    ret i32 [[COND]]
+; CHECK-NEXT:    [[SUB_AB:%.*]] = sub nsw i32 [[A:%.*]], [[B:%.*]]
+; CHECK-NEXT:    ret i32 [[SUB_AB]]
 ;
   %cmp = icmp eq i32 %a, %b
   %sub_ba = sub nsw i32 %b, %a
diff --git a/llvm/test/Transforms/InstCombine/select-cmp-eq-op-fold.ll b/llvm/test/Transforms/InstCombine/select-cmp-eq-op-fold.ll
new file mode 100644
index 00000000000000..e8ecd0c166ee1e
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/select-cmp-eq-op-fold.ll
@@ -0,0 +1,137 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+declare void @use.i1(i1)
+declare void @use.i8(i8)
+define i8 @replace_with_y_noundef(i8 %x, i8 noundef %y, i8 %z) {
+; CHECK-LABEL: @replace_with_y_noundef(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[Y]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %cmp = icmp eq i8 %x, %y
+  %and = and i8 %x, %y
+  %sel = select i1 %cmp, i8 %and, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_x_noundef(i8 noundef %x, i8 %y, i8 %z) {
+; CHECK-LABEL: @replace_with_x_noundef(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    call void @use.i1(i1 [[CMP]])
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[Z:%.*]], i8 [[X]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %cmp = icmp ne i8 %x, %y
+  call void @use.i1(i1 %cmp)
+  %and = or i8 %x, %y
+  %sel = select i1 %cmp, i8 %z, i8 %and
+  ret i8 %sel
+}
+
+define i8 @replace_with_x_maybe_undef_fail(i8 %x, i8 %y, i8 %z) {
+; CHECK-LABEL: @replace_with_x_maybe_undef_fail(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ne i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    call void @use.i1(i1 [[CMP]])
+; CHECK-NEXT:    [[AND:%.*]] = or i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[Z:%.*]], i8 [[AND]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %cmp = icmp ne i8 %x, %y
+  call void @use.i1(i1 %cmp)
+  %and = or i8 %x, %y
+  %sel = select i1 %cmp, i8 %z, i8 %and
+  ret i8 %sel
+}
+
+define i8 @replace_with_y_for_new_oneuse(i8 noundef %xx, i8 noundef %y, i8 %z) {
+; CHECK-LABEL: @replace_with_y_for_new_oneuse(
+; CHECK-NEXT:    [[X:%.*]] = mul i8 [[XX:%.*]], 13
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], [[Y:%.*]]
+; CHECK-NEXT:    [[ADD:%.*]] = shl nuw i8 [[Y]], 1
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[ADD]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %x = mul i8 %xx, 13
+  %cmp = icmp eq i8 %x, %y
+  %add = add nuw i8 %x, %y
+  %sel = select i1 %cmp, i8 %add, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_x_for_new_oneuse(i8 noundef %xx, i8 noundef %yy, i8 %z, i8 %w) {
+; CHECK-LABEL: @replace_with_x_for_new_oneuse(
+; CHECK-NEXT:    [[X:%.*]] = mul i8 [[XX:%.*]], 13
+; CHECK-NEXT:    [[Y:%.*]] = add i8 [[YY:%.*]], [[W:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[MUL:%.*]] = mul i8 [[X]], [[X]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[MUL]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %x = mul i8 %xx, 13
+  %y = add i8 %yy, %w
+  %cmp = icmp eq i8 %x, %y
+  %mul = mul i8 %x, %y
+  %sel = select i1 %cmp, i8 %mul, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_x_for_simple_binop(i8 noundef %xx, i8 %yy, i8 %z, i8 %w) {
+; CHECK-LABEL: @replace_with_x_for_simple_binop(
+; CHECK-NEXT:    [[X:%.*]] = mul i8 [[XX:%.*]], 13
+; CHECK-NEXT:    [[Y:%.*]] = add i8 [[YY:%.*]], [[W:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[MUL:%.*]] = mul i8 [[X]], [[X]]
+; CHECK-NEXT:    call void @use.i8(i8 [[Y]])
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[MUL]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %x = mul i8 %xx, 13
+  %y = add i8 %yy, %w
+  %cmp = icmp eq i8 %x, %y
+  %mul = mul i8 %x, %y
+  call void @use.i8(i8 %y)
+  %sel = select i1 %cmp, i8 %mul, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_none_for_new_oneuse_fail_maybe_undef(i8 %xx, i8 %y, i8 %z) {
+; CHECK-LABEL: @replace_with_none_for_new_oneuse_fail_maybe_undef(
+; CHECK-NEXT:    [[X:%.*]] = mul i8 [[XX:%.*]], 13
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X]], [[Y:%.*]]
+; CHECK-NEXT:    [[MUL:%.*]] = mul i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[MUL]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %x = mul i8 %xx, 13
+  %cmp = icmp eq i8 %x, %y
+  %mul = mul i8 %x, %y
+  %sel = select i1 %cmp, i8 %mul, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_y_for_simple_binop(i8 %x, i8 noundef %y, i8 %z) {
+; CHECK-LABEL: @replace_with_y_for_simple_binop(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[MUL:%.*]] = mul nsw i8 [[Y]], [[Y]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[MUL]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %cmp = icmp eq i8 %x, %y
+  %mul = mul nsw i8 %x, %y
+  %sel = select i1 %cmp, i8 %mul, i8 %z
+  ret i8 %sel
+}
+
+define i8 @replace_with_y_for_simple_binop_fail(i8 %x, i8 noundef %y, i8 %z, i8 %q) {
+; CHECK-LABEL: @replace_with_y_for_simple_binop_fail(
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT:    [[MUL:%.*]] = mul nsw i8 [[X]], [[Q:%.*]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[MUL]], i8 [[Z:%.*]]
+; CHECK-NEXT:    ret i8 [[SEL]]
+;
+  %cmp = icmp eq i8 %x, %y
+  %mul = mul nsw i8 %x, %q
+  %sel = select i1 %cmp, i8 %mul, i8 %z
+  ret i8 %sel
+}
diff --git a/llvm/test/Transforms/InstCombine/select.ll b/llvm/test/Transforms/InstCombine/select.ll
index 05fcf662352954..b577def762d53c 100644
--- a/llvm/test/Transforms/InstCombine/select.ll
+++ b/llvm/test/Transforms/InstCombine/select.ll
@@ -2787,8 +2787,7 @@ define <2 x i8> @select_replacement_add_eq_vec_nonuniform(<2 x i8> %x, <2 x i8>
 define <2 x i8> @select_replacement_add_eq_vec_poison(<2 x i8> %x, <2 x i8> %y) {
 ; CHECK-LABEL: @select_replacement_add_eq_vec_poison(
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq <2 x i8> [[X:%.*]], <i8 1, i8 poison>
-; CHECK-NEXT:    [[ADD:%.*]] = add <2 x i8> [[X]], <i8 1, i8 1>
-; CHECK-NEXT:    [[SEL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[ADD]], <2 x i8> [[Y:%.*]]
+; CHECK-NEXT:    [[SEL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> <i8 2, i8 poison>, <2 x i8> [[Y:%.*]]
 ; CHECK-NEXT:    ret <2 x i8> [[SEL]]
 ;
   %cmp = icmp eq <2 x i8> %x, <i8 1, i8 poison>
@@ -2839,8 +2838,7 @@ define i8 @select_replacement_sub_noundef(i8 %x, i8 noundef %y, i8 %z) {
 define i8 @select_replacement_sub(i8 %x, i8 %y, i8 %z) {
 ; CHECK-LABEL: @select_replacement_sub(
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[X:%.*]], [[Y:%.*]]
-; CHECK-NEXT:    [[SUB:%.*]] = sub i8 [[X]], [[Y]]
-; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 [[SUB]], i8 [[Z:%.*]]
+; CHECK-NEXT:    [[SEL:%.*]] = select i1 [[CMP]], i8 0, i8 [[Z:%.*]]
 ; CHECK-NEXT:    ret i8 [[SEL]]
 ;
   %cmp = icmp eq i8 %x, %y

@goldsteinn
Copy link
Contributor Author

ping

@goldsteinn goldsteinn force-pushed the goldsteinn/cmp-eq-fold-select-op branch from b302265 to 4b0f05a Compare May 5, 2024 02:51
@goldsteinn
Copy link
Contributor Author

ping

@dtcxzyw
Copy link
Member

dtcxzyw commented May 11, 2024

Can you copy the commit message into the PR description? I know you always merge your patchs locally. But it may be more convenient for me to review :)

@goldsteinn
Copy link
Contributor Author

ping

1 similar comment
@goldsteinn
Copy link
Contributor Author

ping

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This patch is too complicated to me. I can do some tests to see the ir diff.

@dtcxzyw
Copy link
Member

dtcxzyw commented Jun 4, 2024

This patch is too complicated to me. I can do some tests to see the ir diff.

Unfortunately, I have to upgrade my datasets to remove icmp/fcmp constant expressions :(

@goldsteinn
Copy link
Contributor Author

This patch is too complicated to me. I can do some tests to see the ir diff.

Would it help if I split it into multiple commits? If not any suggestions for simplifying to make it reviewable?

@dtcxzyw
Copy link
Member

dtcxzyw commented Jun 4, 2024

If not any suggestions for simplifying to make it reviewable?

I guess I can provide some suggestions after the ir diff is available :)

@goldsteinn
Copy link
Contributor Author

If not any suggestions for simplifying to make it reviewable?

I guess I can provide some suggestions after the ir diff is available :)

Okay, thanks

@nikic
Copy link
Contributor

nikic commented Jun 4, 2024

What was the motivating case from #87275?

@goldsteinn
Copy link
Contributor Author

What was the motivating case from #87275?

Yes, although we are waiting to get @dtcxzyw's results to see if it has other benefits.

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Jun 5, 2024
@nikic
Copy link
Contributor

nikic commented Jun 5, 2024

Sorry, what I meant is, which of the test cases / changes in this PR is the one that originally motivated this patch?

@dtcxzyw
Copy link
Member

dtcxzyw commented Jun 5, 2024

If not any suggestions for simplifying to make it reviewable?

I guess I can provide some suggestions after the ir diff is available :)

See https://alive2.llvm.org/ce/z/EUTKhF. I think we can do this simplification iff simplifyWithOpReplaced returns a constant.

@goldsteinn
Copy link
Contributor Author

If not any suggestions for simplifying to make it reviewable?

I guess I can provide some suggestions after the ir diff is available :)

See https://alive2.llvm.org/ce/z/EUTKhF. I think we can do this simplification iff simplifyWithOpReplaced returns a constant.

Looking at the diff, it seems we get some benefit from the non-const simplification.
Ill split to multiple commits. If that doesn't simplify review ill split to multiple PRs.

@goldsteinn goldsteinn force-pushed the goldsteinn/cmp-eq-fold-select-op branch from 4fd7f78 to e3d8065 Compare June 5, 2024 19:33
@goldsteinn
Copy link
Contributor Author

@nikic and @dtcxzyw

I simplified the code a bit. Hopefully that helps. Also I split changes for folding constants / non-constants into two seperate commits.

If its still too complicated I will split to 2 seperate PRs. That said I hope for both of the changes to get in. The non-const change seems to be the more common in @dtcxzyw's tests and seems to have independent value.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first two commits LGTM. Please file a separate PR for the remaining changes. If there are some problems in the third commit, we don't need to revert all the commits.


// If we simplified the TrueArm -> NewOp then replace.
// This handles things like `select`/`min`/`max`/`or`/`and`/etc...
if (NewOp == V)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering whether we will enter an infinite loop here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do some stress tests (e.g., csmith) for 3-4 days (~10 million test cases on a 128C server)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe a case like (select (icmp eq (or x, y), y), (or x, y), Q). Where we replace y with (or x, y) and then (or x, y), x/y) -> (or x, y).

I think for it to inf loop through would need NewOp == TrueArm. I think we can probably safely just return null for that case.

Either way, ill run csmith for a bit.

What do you mean 128C server?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe 128C == 128 Cores ?
Could be 128 C° though.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the second commit is not quite correct. I'll try to come up with an example.

bool NewOpNeverUndef = false;

if (Value *V = simplifyWithOpReplaced(TrueVal, OldOp, NewOp, SQ,
/* AllowRefinement= */ true)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* AllowRefinement= */ true)) {
/* AllowRefinement=*/true)) {

@nikic
Copy link
Contributor

nikic commented Jun 6, 2024

See select_replacement_add_eq_vec_undef for the problematic case. The issue is with undef values, not poison values, and that problem exists even if the result is a constant. I've clarified the implementation to that effect.

@goldsteinn
Copy link
Contributor Author

See select_replacement_add_eq_vec_undef for the problematic case. The issue is with undef values, not poison values, and that problem exists even if the result is a constant. I've clarified the implementation to that effect.

So I think this can be fixed if the constant result doesn't contain any undefs.

@goldsteinn goldsteinn force-pushed the goldsteinn/cmp-eq-fold-select-op branch from e3d8065 to 750b54f Compare June 6, 2024 21:00
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please drop the last commit and its tests from this PR?

@goldsteinn
Copy link
Contributor Author

Can you please drop the last commit and its tests from this PR?

Done

@goldsteinn goldsteinn force-pushed the goldsteinn/cmp-eq-fold-select-op branch from 750b54f to 3d0149b Compare June 6, 2024 21:32
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Comment on lines 1308 to 1313
if (match(V, m_ImmConstant())) {
if (isGuaranteedNotToBeUndef(V, SQ.AC, &Sel, &DT) ||
isGuaranteedNotToBeUndef(NewOp, SQ.AC, &Sel, &DT))
return replaceOperand(Sel, Swapped ? 2 : 1, V);
return nullptr;
}
Copy link
Contributor

@nikic nikic Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (match(V, m_ImmConstant())) {
if (isGuaranteedNotToBeUndef(V, SQ.AC, &Sel, &DT) ||
isGuaranteedNotToBeUndef(NewOp, SQ.AC, &Sel, &DT))
return replaceOperand(Sel, Swapped ? 2 : 1, V);
return nullptr;
}
if (match(V, m_ImmConstant()) &&
isGuaranteedNotToBeUndef(V, SQ.AC, &Sel, &DT))
return replaceOperand(Sel, Swapped ? 2 : 1, V);

Would all tests still pass if you did this? I think you added select_replacement_add_eq_vec_undef_okay to cover that second isGuaranteedNotToBeUndef call, but I think that test would also be covered by falling through to the following case. I'd also be fine with just regressing that one entirely, as the folding result being a constant with undef isn't something we expect to see in practice (but have to guard against as a matter of correctness).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. Let me try and update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Going to push this in a bit.

…nstants

We don't need the `noundef` check if the new simplification is a
constant.

This cleans up regressions from folding multiuse:
    `(icmp eq/ne (sub/xor x, y), 0)` -> `(icmp eq/ne x, y)`.
@goldsteinn goldsteinn force-pushed the goldsteinn/cmp-eq-fold-select-op branch from 3d0149b to a422148 Compare June 6, 2024 22:30
@goldsteinn goldsteinn closed this in 7e7c29b Jun 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants