Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C) #87180

Closed
wants to merge 2 commits into from

Conversation

goldsteinn
Copy link
Contributor

@goldsteinn goldsteinn commented Mar 31, 2024

  • [ValueTracking] Add tests for computing knownbits from (icmp upred (add/sub nuw X, Y), C); NFC

  • [ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C)

    (icmp ule/ult (add nuw X, Y), C) implies both (icmp ule/ult X, C) and
    (icmp ule/ult Y, C). We can use this to deduce leading zeros in X/Y.

    (icmp uge/ugt (sub nuw X, Y), C) implies (icmp uge/uge X, C) . We
    can use this to deduce leading ones in X.

    Proofs: https://alive2.llvm.org/ce/z/sc5k22

@goldsteinn goldsteinn requested a review from nikic as a code owner March 31, 2024 03:42
@goldsteinn goldsteinn changed the title goldsteinn/kb nuw add [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C) Mar 31, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 31, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes
  • [ValueTracking] Add tests for computing knownbits from (icmp ult/ule (add nuw X, Y), C); NFC
  • [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C)

Based on: #87180


Full diff: https://github.com/llvm/llvm-project/pull/87180.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+28-12)
  • (modified) llvm/test/Transforms/InstCombine/known-bits.ll (+98-5)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index b5e8a1d22f264b..5bc2d605d3a1b1 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -698,13 +698,19 @@ static void computeKnownBitsFromCmp(const Value *V, CmpInst::Predicate Pred,
     break;
   }
   default:
-    const APInt *Offset = nullptr;
-    if (match(LHS, m_CombineOr(m_V, m_AddLike(m_V, m_APInt(Offset)))) &&
-        match(RHS, m_APInt(C))) {
-      ConstantRange LHSRange = ConstantRange::makeAllowedICmpRegion(Pred, *C);
-      if (Offset)
-        LHSRange = LHSRange.sub(*Offset);
-      Known = Known.unionWith(LHSRange.toKnownBits());
+    if (match(RHS, m_APInt(C))) {
+      const APInt *Offset = nullptr;
+      if (match(LHS, m_CombineOr(m_V, m_AddLike(m_V, m_APInt(Offset))))) {
+        ConstantRange LHSRange = ConstantRange::makeAllowedICmpRegion(Pred, *C);
+        if (Offset)
+          LHSRange = LHSRange.sub(*Offset);
+        Known = Known.unionWith(LHSRange.toKnownBits());
+      }
+      if ((Pred == ICmpInst::ICMP_ULE || Pred == ICmpInst::ICMP_ULT) &&
+          match(LHS, m_c_NUWAdd(m_V, m_Value()))) {
+        Known.Zero.setHighBits(
+            (*C - (Pred == ICmpInst::ICMP_ULT)).countLeadingZeros());
+      }
     }
     break;
   }
@@ -9283,11 +9289,21 @@ void llvm::findValuesAffectedByCondition(
             AddAffected(X);
         }
       } else {
-        // Handle (A + C1) u< C2, which is the canonical form of
-        // A > C3 && A < C4.
-        if (match(A, m_AddLike(m_Value(X), m_ConstantInt())) &&
-            match(B, m_ConstantInt()))
-          AddAffected(X);
+        if (match(B, m_ConstantInt())) {
+          // Handle (A + C1) u< C2, which is the canonical form of
+          // A > C3 && A < C4.
+          if (match(A, m_AddLike(m_Value(X), m_ConstantInt())))
+            AddAffected(X);
+
+          Value *Y;
+          // X & Y u> C -> X >u C && Y >u C
+          // X | Y u< C -> X u< C && Y u< C
+          if (ICmpInst::isUnsigned(Pred) &&
+              match(A, m_NUWAdd(m_Value(X), m_Value(Y)))) {
+            AddAffected(X);
+            AddAffected(Y);
+          }
+        }
 
         // Handle icmp slt/sgt (bitcast X to int), 0/-1, which is supported
         // by computeKnownFPClass().
diff --git a/llvm/test/Transforms/InstCombine/known-bits.ll b/llvm/test/Transforms/InstCombine/known-bits.ll
index 5305c78f691231..0ed63512f8cb49 100644
--- a/llvm/test/Transforms/InstCombine/known-bits.ll
+++ b/llvm/test/Transforms/InstCombine/known-bits.ll
@@ -124,7 +124,6 @@ exit:
   ret i8 %or2
 }
 
-
 define i8 @test_cond_and_bothways(i8 %x) {
 ; CHECK-LABEL: @test_cond_and_bothways(
 ; CHECK-NEXT:    [[AND:%.*]] = and i8 [[X:%.*]], 91
@@ -181,8 +180,6 @@ exit:
   ret i8 %or2
 }
 
-
-
 define i8 @test_cond_and_commuted(i8 %x, i1 %c1, i1 %c2) {
 ; CHECK-LABEL: @test_cond_and_commuted(
 ; CHECK-NEXT:    [[AND:%.*]] = and i8 [[X:%.*]], 3
@@ -343,7 +340,7 @@ exit:
   ret i8 %or2
 }
 
-define i32 @test_icmp_trunc1(i32 %x){
+define i32 @test_icmp_trunc1(i32 %x) {
 ; CHECK-LABEL: @test_icmp_trunc1(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[Y:%.*]] = trunc i32 [[X:%.*]] to i16
@@ -365,7 +362,7 @@ else:
   ret i32 0
 }
 
-define i32 @test_icmp_trunc_assume(i32 %x){
+define i32 @test_icmp_trunc_assume(i32 %x) {
 ; CHECK-LABEL: @test_icmp_trunc_assume(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[Y:%.*]] = trunc i32 [[X:%.*]] to i16
@@ -532,7 +529,103 @@ if.else:
   ret i1 %other
 }
 
+define i8 @test_icmp_add(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[N_ADD]], 32
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    ret i8 0
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ult i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
+
+define i8 @test_icmp_add2(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[N_ADD]], 14
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 0
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp uge i8 %n_add, 15
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  ret i8 %other
+if.else:
+  %r = and i8 %n, 32
+  ret i8 %r
 
+}
+
+define i8 @test_icmp_add_fail_bad_range(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add_fail_bad_range(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[N_ADD]], 33
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[R:%.*]] = and i8 [[N]], 32
+; CHECK-NEXT:    ret i8 [[R]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ule i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
+
+define i8 @test_icmp_add_fail_bad_pred(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add_fail_bad_pred(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[N_ADD]], 32
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[R:%.*]] = and i8 [[N]], 32
+; CHECK-NEXT:    ret i8 [[R]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ugt i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
 
 declare void @use(i1)
 declare void @sink(i8)

@goldsteinn goldsteinn requested a review from dtcxzyw March 31, 2024 03:43
llvm/lib/Analysis/ValueTracking.cpp Outdated Show resolved Hide resolved
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 31, 2024
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 31, 2024
@dtcxzyw
Copy link
Member

dtcxzyw commented Apr 1, 2024

Can you have a look at the regression dtcxzyw/llvm-opt-benchmark#465 (comment)?

@goldsteinn
Copy link
Contributor Author

Can you have a look at the regression dtcxzyw/llvm-opt-benchmark#465 (comment)?

Okay, reduced test:

define void @foo(i32 %x, i32 %y) {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:
  %sub_x = sub i32 255, %x
  br label %L1

L1:
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:
  ret void
}

The issue occurs due to the InstCombine fold at:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp#L2280-L2288

We produce:

Bad:

define void @foo(i32 %x, i32 %y) local_unnamed_addr {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:                                               ; preds = %0
  %sub_x = xor i32 %x, 255
  br label %L1

L1:                                               ; preds = %L1, %L0
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:                                               ; preds = %L1, %0
  ret void
}

vs
Good:

define void @foo(i32 %x, i32 %y) local_unnamed_addr {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:                                               ; preds = %0
  %sub_x = sub i32 255, %x
  br label %L1

L1:                                               ; preds = %L1, %L0
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:                                               ; preds = %L1, %0
  ret void
}

The issue then arises because we fold (icmp eq/ne (sub C0, x), C1) -> (icmp eq/ne x, C2) even if the sub has multi-use but we don't do the same for xor:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp#L3463-L3474

Ill add a patch for the xor case.

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 1, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions assosiated with llvm#87180
goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 4, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions assosiated with llvm#87180
@goldsteinn
Copy link
Contributor Author

My preference would be to not have this blocked by the missing multi-use xor fold. It seems pretty unrelated that we end up with that pattern w/ this patch but not w/ (in that one case) and handling multi-use xor seems independently valuable.

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 4, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions assosiated with llvm#87180
@goldsteinn goldsteinn changed the title [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C) [ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C) Apr 5, 2024
@goldsteinn
Copy link
Contributor Author

@dtcxzyw i update with support for sub nuw as well.

@goldsteinn
Copy link
Contributor Author

rebased

@nikic
Copy link
Contributor

nikic commented Apr 25, 2024

Needs another rebase.

@dtcxzyw i update with support for sub nuw as well.

Why?

@goldsteinn
Copy link
Contributor Author

Needs another rebase.

@dtcxzyw i update with support for sub nuw as well.

Why?

I guess the add nuw mirrors the or case and the sub nuw mirrors the and case. I had meant to leave it as a folllowup todo but it seemed to fit well enough.

@goldsteinn
Copy link
Contributor Author

ping

1 similar comment
@goldsteinn
Copy link
Contributor Author

ping

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


if.else:
ret i8 %other
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negative tests without nuw?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done + rebased/

…), C)`

`(icmp ule/ult (add nuw X, Y), C)` implies both `(icmp ule/ult X, C)` and
`(icmp ule/ult Y, C)`. We can use this to deduce leading zeros in `X`/`Y`.

`(icmp uge/ugt (sub nuw X, Y), C)` implies `(icmp uge/uge X, C)` . We
can use this to deduce leading ones in `X`.

Proofs: https://alive2.llvm.org/ce/z/sc5k22
goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Jun 7, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions assosiated with llvm#87180
goldsteinn added a commit that referenced this pull request Jun 7, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions assosiated with #87180

Closes #87275
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants