Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold xored one-complemented operand comparisons #69882

Merged
merged 2 commits into from
Nov 14, 2023

Conversation

elhewaty
Copy link
Contributor

@elhewaty elhewaty commented Oct 22, 2023

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 22, 2023

@llvm/pr-subscribers-llvm-transforms

Author: None (elhewaty)

Changes
  • [InstCombine] Add test coverage for comparisons of operands including one-complemented oparands(NFC).
  • [InstCombine] Fold xored one-complemented operand comparisons.

Full diff: https://github.com/llvm/llvm-project/pull/69882.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+29)
  • (modified) llvm/test/Transforms/InstCombine/icmp-of-xor-x.ll (+112)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 66e2b6c72cce46c..4c19edfb27d2f2b 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -6929,6 +6929,23 @@ static Instruction *foldReductionIdiom(ICmpInst &I,
   return nullptr;
 }
 
+// Gets the inverse of the predicate, but not the full predicate,
+// it doesn't change the equality, e.g SLE <-> SGE, SLT <-> SGT,
+// ULE <-> UGE, ULT <-> UGT
+static ICmpInst::Predicate ConvertPred(ICmpInst::Predicate Pred) {
+  switch(Pred) {
+    case ICmpInst::ICMP_SLE: return ICmpInst::ICMP_SGE;
+    case ICmpInst::ICMP_SGE: return ICmpInst::ICMP_SLE;
+    case ICmpInst::ICMP_SLT: return ICmpInst::ICMP_SGT;
+    case ICmpInst::ICMP_SGT: return ICmpInst::ICMP_SLT;
+    case ICmpInst::ICMP_ULE: return ICmpInst::ICMP_UGE;
+    case ICmpInst::ICMP_UGE: return ICmpInst::ICMP_ULE;
+    case ICmpInst::ICMP_ULT: return ICmpInst::ICMP_UGT;
+    case ICmpInst::ICMP_UGT: return ICmpInst::ICMP_ULT;
+    default: llvm_unreachable("Invalid Predicate");
+  }
+}
+
 Instruction *InstCombinerImpl::visitICmpInst(ICmpInst &I) {
   bool Changed = false;
   const SimplifyQuery Q = SQ.getWithInstruction(&I);
@@ -7127,6 +7144,18 @@ Instruction *InstCombinerImpl::visitICmpInst(ICmpInst &I) {
       return new ICmpInst(I.getInversePredicate(), Builder.CreateAnd(A, B),
                           Op1);
 
+    // Transform (~A ^ B) s< ~A --> (A ^ B) s> A,
+    //           (~A ^ B) s> ~A --> (A ^ B) s< A,
+    //           (~A ^ B) s<= ~A --> (A ^ B) s>= A,
+    //           (~A ^ B) s>= ~A --> (A ^ B) s<= A,
+    //           (~A ^ B) u< ~A --> (A ^ B) u< A,
+    //           (~A ^ B) u> ~A --> (A ^ B) u< A,
+    //           (~A ^ B) u<= ~A --> (A ^ B) u>= A,
+    // and       (~A ^ B) u>= ~A --> (A ^ B) <= A
+    if (match(Op0, m_Xor(m_Not(m_Value(A)), m_Value(B))) &&
+        match(Op1, m_Not(m_Value(A))) && !I.isEquality())
+      return new ICmpInst(ConvertPred(Pred), Builder.CreateXor(A, B), A);
+
     // ~X < ~Y --> Y < X
     // ~X < C -->  X > ~C
     if (match(Op0, m_Not(m_Value(A)))) {
diff --git a/llvm/test/Transforms/InstCombine/icmp-of-xor-x.ll b/llvm/test/Transforms/InstCombine/icmp-of-xor-x.ll
index 9b6572697cf5e8f..893fb868e6adc77 100644
--- a/llvm/test/Transforms/InstCombine/icmp-of-xor-x.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-of-xor-x.ll
@@ -4,6 +4,118 @@
 declare void @llvm.assume(i1)
 declare void @barrier()
 
+define i32 @test_slt_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_slt_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp sgt i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp slt i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_sle_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_sle_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp sge i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp sle i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_sgt_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_sgt_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp slt i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp sgt i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_sge_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_sge_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp sle i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp sge i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_ult_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_ult_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp ugt i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp ult i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_ule_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_ule_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp uge i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp ule i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_ugt_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_ugt_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp ult i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp ugt i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
+define i32 @test_uge_xor(i32 %0, i32 %1) {
+; CHECK-LABEL: @test_uge_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP0:%.*]], [[TMP1:%.*]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp ule i32 [[TMP3]], [[TMP0]]
+; CHECK-NEXT:    [[TMP5:%.*]] = zext i1 [[TMP4]] to i32
+; CHECK-NEXT:    ret i32 [[TMP5]]
+;
+  %3 = xor i32 %0, -1
+  %4 = xor i32 %3, %1
+  %5 = icmp uge i32 %4, %3
+  %6 = zext i1 %5 to i32
+  ret i32 %6
+}
+
 define i1 @xor_uge(i8 %x, i8 %y) {
 ; CHECK-LABEL: @xor_uge(
 ; CHECK-NEXT:    [[YNZ:%.*]] = icmp ne i8 [[Y:%.*]], 0

@elhewaty
Copy link
Contributor Author

@dtcxzyw please have a look.

@dtcxzyw dtcxzyw changed the title instcombine [InstCombine] Fold xored one-complemented operand comparisons Oct 22, 2023
@elhewaty
Copy link
Contributor Author

(~A ^ B) s< ~A --> (A ^ B) s> A
https://alive2.llvm.org/ce/z/kptBzu

(~A ^ B) s> ~A --> (A ^ B) s< A
https://alive2.llvm.org/ce/z/3_86kB

(~A ^ B) s<= ~A --> (A ^ B) s>= A
https://alive2.llvm.org/ce/z/qS7atw

(~A ^ B) s>= ~A --> (A ^ B) s<= A
https://alive2.llvm.org/ce/z/qS7atw

(~A ^ B) u< ~A --> (A ^ B) u< A
https://alive2.llvm.org/ce/z/4Z5Mpi

(~A ^ B) u> ~A --> (A ^ B) u< A
https://alive2.llvm.org/ce/z/mpY1Vb

(~A ^ B) u<= ~A --> (A ^ B) u>= A
https://alive2.llvm.org/ce/z/udZM2U

(~A ^ B) u>= ~A --> (A ^ B) <= A
https://alive2.llvm.org/ce/z/22u3Zm

@github-actions
Copy link

github-actions bot commented Oct 22, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 01893b54835ba8c0e97d67361a7b6128cd775565 b348c4631aa4da367e34a77c57c074fd3499f28a -- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 83cb65aa0d..588b16a8c6 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -7328,13 +7328,12 @@ Instruction *InstCombinerImpl::foldFCmpIntToFPConst(FCmpInst &I,
   // [0, UMAX], but it may still be fractional.  See if it is fractional by
   // casting the FP value to the integer value and back, checking for equality.
   // Don't do this for zero, because -0.0 is not fractional.
-  Constant *RHSInt = LHSUnsigned
-    ? ConstantExpr::getFPToUI(RHSC, IntTy)
-    : ConstantExpr::getFPToSI(RHSC, IntTy);
+  Constant *RHSInt = LHSUnsigned ? ConstantExpr::getFPToUI(RHSC, IntTy)
+                                 : ConstantExpr::getFPToSI(RHSC, IntTy);
   if (!RHS.isZero()) {
     bool Equal = LHSUnsigned
-      ? ConstantExpr::getUIToFP(RHSInt, RHSC->getType()) == RHSC
-      : ConstantExpr::getSIToFP(RHSInt, RHSC->getType()) == RHSC;
+                     ? ConstantExpr::getUIToFP(RHSInt, RHSC->getType()) == RHSC
+                     : ConstantExpr::getSIToFP(RHSInt, RHSC->getType()) == RHSC;
     if (!Equal) {
       // If we had a comparison against a fractional value, we have to adjust
       // the compare predicate and sometimes the value.  RHSC is rounded towards

@dtcxzyw dtcxzyw requested a review from arsenm October 22, 2023 15:59
%4 = xor i32 %3, %1
%5 = icmp uge i32 %4, %3
%6 = zext i1 %5 to i32
ret i32 %6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you swap the operand order in some of the tests. I.e all your tests of of the form:
~A ^ B pred A. Can you:

  1. Add tests where its form: A pred Xor
  2. Add tests where the xor is of form A ^ ~B
  3. Can you add tests for ~A ^ B pred ~C

You don't need to add new tests for all these cases, just flip around operands in some of your current tests.

@elhewaty
Copy link
Contributor Author

Alive2: https://alive2.llvm.org/ce/z/9DoJpD

@elhewaty
Copy link
Contributor Author

@dtcxzyw why did the build fail?

@dtcxzyw
Copy link
Member

dtcxzyw commented Oct 24, 2023

@dtcxzyw why did the build fail?

You should add Value *C to shadow the definition const APInt *C in the outer scope.

@goldsteinn
Copy link
Contributor

@dtcxzyw why did the build fail?

You should add Value *C to shadow the definition const APInt *C in the outer scope.

Or instead rename variables to X/Y/Z as shadow variables can cause confusion and in llvm C usually means constant which isn't the case here.

@elhewaty elhewaty force-pushed the instcombine branch 2 times, most recently from f58c447 to a7ec75f Compare October 24, 2023 18:59
@nikic
Copy link
Contributor

nikic commented Oct 25, 2023

@elhewaty This is due to an incorrect rebase around the processUMulZExtIdiom() call. Can you please rebase another time over ea99df2, which should remove all the diffs unrelated to your change?

@elhewaty
Copy link
Contributor Author

@dtcxzyw updated.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the value name fix in test_slt_xor and test_sle_xor.

@elhewaty
Copy link
Contributor Author

@dtcxzyw I used git clang-format but the check failed

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 13, 2023

@dtcxzyw I used git clang-format but the check failed

Emm, it looks like an internal failure of the buildbot.

@elhewaty
Copy link
Contributor Author

So, What is the future of the pull?
Will it land?

@elhewaty
Copy link
Contributor Author

@dtcxzyw Please take a look.

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Please wait for additional approval from other reviewers.

@elhewaty
Copy link
Contributor Author

@nikic @goldsteinn @arsenm

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dtcxzyw dtcxzyw merged commit daddf40 into llvm:main Nov 14, 2023
2 of 3 checks passed
@elhewaty elhewaty deleted the instcombine branch November 14, 2023 14:20
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…9882)

- [InstCombine] Add test coverage for comparisons of operands including
one-complemented oparands(NFC).
- [InstCombine] Fold xored one-complemented operand comparisons.
Alive2: https://alive2.llvm.org/ce/z/PZMJeB
Fixes llvm#69803.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clang is suboptimal for (~a ^ b) < ~a
5 participants