Skip to content

Allow folding icmp eq (add X, C2), C when there is more than one-use when we can compute the range #144566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

AZero13
Copy link
Contributor

@AZero13 AZero13 commented Jun 17, 2025

If there are multiple uses of an add, we can fold a comparison anyway if we can compute a constant range for it, which should happen in cases such as saturated add.

Alive2: https://alive2.llvm.org/ce/z/Y4eHav

@llvmbot
Copy link
Member

llvmbot commented Jun 17, 2025

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: AZero13 (AZero13)

Changes

This is because this actually is just:

// ((X + Y) u< X) ? -1 : (X + Y) --> uadd.sat(X, Y)
// ((X + Y) u< Y) ? -1 : (X + Y) --> uadd.sat(X, Y)


Full diff: https://github.com/llvm/llvm-project/pull/144566.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+9-2)
  • (modified) llvm/test/Transforms/InstCombine/saturating-add-sub.ll (+11)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 73ba0f78e8053..d733a37efd68e 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1013,12 +1013,19 @@ static Value *canonicalizeSaturatedAdd(ICmpInst *Cmp, Value *TVal, Value *FVal,
 
   // uge -1 is canonicalized to eq -1 and requires special handling
   // (a == -1) ? -1 : a + 1 -> uadd.sat(a, 1)
+  // ult 1 is canonicalized to eq 0
+  // (a + 1 == 0) ? -1 : a + 1 -> uadd.sat(a, 1)
   if (Pred == ICmpInst::ICMP_EQ) {
     if (match(FVal, m_Add(m_Specific(Cmp0), m_One())) &&
-        match(Cmp1, m_AllOnes())) {
+        match(Cmp1, m_AllOnes()))
       return Builder.CreateBinaryIntrinsic(
           Intrinsic::uadd_sat, Cmp0, ConstantInt::get(Cmp0->getType(), 1));
-    }
+
+    if (match(Cmp1, m_Zero()) && match(Cmp0, m_Add(m_Value(X), m_One())) &&
+        match(FVal, m_Add(m_Specific(X), m_One())))
+      return Builder.CreateBinaryIntrinsic(Intrinsic::uadd_sat, X,
+                                           ConstantInt::get(X->getType(), 1));
+
     return nullptr;
   }
 
diff --git a/llvm/test/Transforms/InstCombine/saturating-add-sub.ll b/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
index cfd679c0cc592..392551defb7ef 100644
--- a/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
+++ b/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
@@ -2350,4 +2350,15 @@ define i8 @fold_add_umax_to_usub_multiuse(i8 %a) {
   ret i8 %sel
 }
 
+define i32 @add_check_zero(i32 %num) {
+; CHECK-LABEL: @add_check_zero(
+; CHECK-NEXT:    [[COND:%.*]] = call i32 @llvm.uadd.sat.i32(i32 [[ADD:%.*]], i32 1)
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+  %add = add i32 %num, 1
+  %cmp = icmp eq i32 %add, 0
+  %cond = select i1 %cmp, i32 -1, i32 %add
+  ret i32 %cond
+}
+
 declare void @usei8(i8)

@AZero13 AZero13 force-pushed the sub-add branch 3 times, most recently from c00d050 to 978106e Compare June 17, 2025 18:29
@topperc
Copy link
Collaborator

topperc commented Jun 17, 2025

I don't think this is the right place to fix this. We need to rewrite

  %add = add i32 %num, 1
  %cmp = icmp eq i32 %add, 0

to

  %cmp = icmp eq i32 %num, -1

This is similar to

  %add = add i32 %num, 2
  %cmp = icmp ult i32 %add, 2
-->
  %cmp = icmp uge %num,2

which is handled by InstCombinerImpl::foldICmpAddConstant using this code, but we already excluded equality comparisons by the time we get there.

  auto CR = ConstantRange::makeExactICmpRegion(Pred, C).subtract(*C2);           
  const APInt &Upper = CR.getUpper();                                            
  const APInt &Lower = CR.getLower();                                            
  if (Cmp.isSigned()) {                                                          
    if (Lower.isSignMask())                                                      
      return new ICmpInst(ICmpInst::ICMP_SLT, X, ConstantInt::get(Ty, Upper));   
    if (Upper.isSignMask())                                                      
      return new ICmpInst(ICmpInst::ICMP_SGE, X, ConstantInt::get(Ty, Lower));   
  } else {                                                                       
    if (Lower.isMinValue())                                                      
      return new ICmpInst(ICmpInst::ICMP_ULT, X, ConstantInt::get(Ty, Upper));   
    if (Upper.isMinValue())                                                      
      return new ICmpInst(ICmpInst::ICMP_UGE, X, ConstantInt::get(Ty, Lower));   
  } 

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

I don't think this is the right place to fix this. We need to rewrite

  %add = add i32 %num, 1
  %cmp = icmp eq i32 %add, 0

to

  %cmp = icmp eq i32 %num, -1

Then the rest just works.
unsigned square2eer(unsigned a) {
return a + 7 == 7;
}

folds fine, but when we have multiple uses, it is not simplifying because it is reusing the a + 7

@topperc
Copy link
Collaborator

topperc commented Jun 17, 2025

I don't think this is the right place to fix this. We need to rewrite

  %add = add i32 %num, 1
  %cmp = icmp eq i32 %add, 0

to

  %cmp = icmp eq i32 %num, -1

Then the rest just works.
unsigned square2eer(unsigned a) {
return a + 7 == 7;
}

folds fine, but when we have multiple uses, it is not simplifying because it is reusing the a + 7

I don't understand why we're willing to fold an ugt compare when the add has multiple uses, but not an eq compare.

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

I don't think this is the right place to fix this. We need to rewrite

  %add = add i32 %num, 1
  %cmp = icmp eq i32 %add, 0

to

  %cmp = icmp eq i32 %num, -1

Then the rest just works.
unsigned square2eer(unsigned a) {
return a + 7 == 7;
}

folds fine, but when we have multiple uses, it is not simplifying because it is reusing the a + 7

I don't understand why we're willing to fold an ugt compare when the add has multiple uses, but not an eq compare.

Beats me, this is just my observation.

@topperc
Copy link
Collaborator

topperc commented Jun 17, 2025

I don't think this is the right place to fix this. We need to rewrite

  %add = add i32 %num, 1
  %cmp = icmp eq i32 %add, 0

to

  %cmp = icmp eq i32 %num, -1

Then the rest just works.
unsigned square2eer(unsigned a) {
return a + 7 == 7;
}

folds fine, but when we have multiple uses, it is not simplifying because it is reusing the a + 7

I don't understand why we're willing to fold an ugt compare when the add has multiple uses, but not an eq compare.

Beats me, this is just my observation.

There's a FIXME for it in InstCombinerImpl::foldICmpBinOpEqualityWithConstant

  case Instruction::Add: {                                                       
    // (A + C2) == C --> A == (C - C2)                                           
    // (A + C2) != C --> A != (C - C2)                                           
    // TODO: Remove the one-use limitation? See discussion in D58633

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

InstCombinerImpl::foldICmpBinOpEqualityWithConstant

Oh I did a different thing:

// (X + C1) == C) --> X == C - C1
// (X + C1) != C) --> X != C - C1
Constant *C1;
if (Cmp.isEquality() && match(Y, m_ImmConstant(C1))) {
return new ICmpInst(Pred, Y,
ConstantExpr::getSub(ConstantInt::get(Ty, C), C1));
}

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

Screenshot 2025-06-17 at 3 44 44 PM This was my plan

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

// Match special-case for increment-by-1.
if (Pred == ICmpInst::ICMP_EQ) {
  // (a + 1) == 0
  // (1 + a) == 0
  if (AddExpr.match(ICmpLHS) && m_ZeroInt().match(ICmpRHS) &&
      (m_One().match(AddLHS) || m_One().match(AddRHS)))
    return L.match(AddLHS) && R.match(AddRHS) && S.match(ICmpLHS);
  // 0 == (a + 1)
  // 0 == (1 + a)
  if (m_ZeroInt().match(ICmpLHS) && AddExpr.match(ICmpRHS) &&
      (m_One().match(AddLHS) || m_One().match(AddRHS)))
    return L.match(AddLHS) && R.match(AddRHS) && S.match(ICmpRHS);
}

So does this mean I can remove this from UAddWithOverflow_match?

@AZero13
Copy link
Contributor Author

AZero13 commented Jun 17, 2025

@topperc yeah this is going to require a lot of fixes. Can I just go back to what I did at the start because this change is otherwise gonna affect 50 test files, many of which regressed and should be addressed in another PR.

@AZero13 AZero13 force-pushed the sub-add branch 2 times, most recently from 978106e to 8a12f91 Compare June 17, 2025 20:50
@topperc
Copy link
Collaborator

topperc commented Jun 17, 2025

This patch gets what you want with only 1 test update. So it seems like only handling the (a + 1) == 0 case doesn't require a lot.

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 084e7fbaa268..e6542223d695 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3160,7 +3160,7 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
       return replaceInstUsesWith(Cmp, Cond);
   }
   const APInt *C2;
-  if (Cmp.isEquality() || !match(Y, m_APInt(C2)))
+  if (!match(Y, m_APInt(C2)))
     return nullptr;
 
   // Fold icmp pred (add X, C2), C.
@@ -3184,7 +3184,7 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
       return new ICmpInst(Pred, X, ConstantInt::get(Ty, NewC));
   }
 
-  if (ICmpInst::isUnsigned(Pred) && Add->hasNoSignedWrap() &&
+  if (!Cmp.isEquality() && ICmpInst::isUnsigned(Pred) && Add->hasNoSignedWrap() &&
       C.isNonNegative() && (C - *C2).isNonNegative() &&
       computeConstantRange(X, /*ForSigned=*/true).add(*C2).isAllNonNegative())
     return new ICmpInst(ICmpInst::getSignedPredicate(Pred), X,
@@ -3205,6 +3205,9 @@ Instruction *InstCombinerImpl::foldICmpAddConstant(ICmpInst &Cmp,
       return new ICmpInst(ICmpInst::ICMP_UGE, X, ConstantInt::get(Ty, Lower));
   }
 
+  if (Cmp.isEquality())
+    return nullptr;
+
   // This set of folds is intentionally placed after folds that use no-wrapping
   // flags because those folds are likely better for later analysis/codegen.
   const APInt SMax = APInt::getSignedMaxValue(Ty->getScalarSizeInBits());
diff --git a/llvm/test/Transforms/InstCombine/uaddo.ll b/llvm/test/Transforms/InstCombine/uaddo.ll
index ae7a07ec8000..89a9569bdb7c 100644
--- a/llvm/test/Transforms/InstCombine/uaddo.ll
+++ b/llvm/test/Transforms/InstCombine/uaddo.ll
@@ -158,7 +158,7 @@ define i1 @uaddo_1(i8 %x, ptr %p) {
 ; CHECK-LABEL: @uaddo_1(
 ; CHECK-NEXT:    [[A:%.*]] = add i8 [[X:%.*]], 1
 ; CHECK-NEXT:    store i8 [[A]], ptr [[P:%.*]], align 1
-; CHECK-NEXT:    [[C:%.*]] = icmp eq i8 [[A]], 0
+; CHECK-NEXT:    [[C:%.*]] = icmp eq i8 [[X]], -1
 ; CHECK-NEXT:    ret i1 [[C]]
 ;
   %a = add i8 %x, 1

@AZero13 AZero13 changed the title [InstCombine] Canonicalize (a + 1 == 0) ? -1 : a + 1 -> uadd.sat(a, 1) Allow folding icmp eq (add X, C2), C when there is more than one-use when we can compute the range Jun 17, 2025
Copy link

github-actions bot commented Jun 17, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

…when we can compute the range

If there are multiple uses of an add, we can fold a comparison anyway if we can compute a constant range for it, which should happen in cases such as saturated add.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants