Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ValueTracking] Convert isKnownNonZero to use SimplifyQuery #85863

Merged
merged 4 commits into from
Apr 12, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Mar 19, 2024

This patch converts isKnownNonZero to use SimplifyQuery. Then we can use the context information from DomCondCache.

Fixes #85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 19, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

Fixes #85823.
Alive2: https://alive2.llvm.org/ce/z/QUvHVj


Full diff: https://github.com/llvm/llvm-project/pull/85863.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+6-3)
  • (modified) llvm/test/Transforms/InstCombine/icmp-dom.ll (+85)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 0dce0077bf1588..7cb5e98dea2581 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -1771,11 +1771,14 @@ Instruction *InstCombinerImpl::foldICmpAndConstConst(ICmpInst &Cmp,
       return new ICmpInst(NewPred, X, Zero);
     }
 
-    APInt NewC2 = *C2;
     KnownBits Know = computeKnownBits(And->getOperand(0), 0, And);
+    if (Know.One.intersects(*C2))
+      return replaceInstUsesWith(
+          Cmp, ConstantInt::getBool(Cmp.getType(), isICMP_NE));
+
     // Set high zeros of C2 to allow matching negated power-of-2.
-    NewC2 = *C2 | APInt::getHighBitsSet(C2->getBitWidth(),
-                                        Know.countMinLeadingZeros());
+    APInt NewC2 = *C2 | APInt::getHighBitsSet(C2->getBitWidth(),
+                                              Know.countMinLeadingZeros());
 
     // Restrict this fold only for single-use 'and' (PR10267).
     // ((%x & C) == 0) --> %x u< (-C)  iff (-C) is power of two.
diff --git a/llvm/test/Transforms/InstCombine/icmp-dom.ll b/llvm/test/Transforms/InstCombine/icmp-dom.ll
index f4b9022d14349b..d97a2e707ae235 100644
--- a/llvm/test/Transforms/InstCombine/icmp-dom.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-dom.ll
@@ -403,3 +403,88 @@ truelabel:
 falselabel:
   ret i8 0
 }
+
+define i1 @and_mask1_eq(i32 %conv) {
+; CHECK-LABEL: @and_mask1_eq(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:    br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:       then:
+; CHECK-NEXT:    ret i1 false
+; CHECK:       else:
+; CHECK-NEXT:    call void @dummy()
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  call void @dummy()
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask1_ne(i32 %conv) {
+; CHECK-LABEL: @and_mask1_ne(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[CONV:%.*]], 1
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:    br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:       then:
+; CHECK-NEXT:    ret i1 false
+; CHECK:       else:
+; CHECK-NEXT:    call void @dummy()
+; CHECK-NEXT:    ret i1 true
+;
+entry:
+  %and = and i32 %conv, 1
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  call void @dummy()
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp ne i32 %and1, 0
+  ret i1 %cmp1
+}
+
+define i1 @and_mask2(i32 %conv) {
+; CHECK-LABEL: @and_mask2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[AND:%.*]] = and i32 [[CONV:%.*]], 4
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT:    br i1 [[CMP]], label [[THEN:%.*]], label [[ELSE:%.*]]
+; CHECK:       then:
+; CHECK-NEXT:    ret i1 false
+; CHECK:       else:
+; CHECK-NEXT:    call void @dummy()
+; CHECK-NEXT:    [[AND1:%.*]] = and i32 [[CONV]], 3
+; CHECK-NEXT:    [[CMP1:%.*]] = icmp eq i32 [[AND1]], 0
+; CHECK-NEXT:    ret i1 [[CMP1]]
+;
+entry:
+  %and = and i32 %conv, 4
+  %cmp = icmp eq i32 %and, 0
+  br i1 %cmp, label %then, label %else
+
+then:
+  ret i1 0
+
+else:
+  call void @dummy()
+  %and1 = and i32 %conv, 3
+  %cmp1 = icmp eq i32 %and1, 0
+  ret i1 %cmp1
+}
+
+declare void @dummy()

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 19, 2024
@dtcxzyw dtcxzyw changed the title [InstCombine] Fold icmp eq/ne (X and C), 0 when partial bits are known [InstCombine] Fold icmp eq/ne (X and C), 0 when partial bits are known Mar 19, 2024
@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 20, 2024

Oh, simplifyICmpWithZero in InstSimplify should handle this.

/// Try hard to fold icmp with zero RHS because this is a common case.
static Value *simplifyICmpWithZero(CmpInst::Predicate Pred, Value *LHS,
Value *RHS, const SimplifyQuery &Q) {
if (!match(RHS, m_Zero()))
return nullptr;
Type *ITy = getCompareTy(LHS); // The return type.
switch (Pred) {
default:
llvm_unreachable("Unknown ICmp predicate!");
case ICmpInst::ICMP_ULT:
return getFalse(ITy);
case ICmpInst::ICMP_UGE:
return getTrue(ITy);
case ICmpInst::ICMP_EQ:
case ICmpInst::ICMP_ULE:
if (isKnownNonZero(LHS, Q.DL, 0, Q.AC, Q.CxtI, Q.DT, Q.IIQ.UseInstrInfo))
return getFalse(ITy);

I will fix it later.

@dtcxzyw dtcxzyw changed the title [InstCombine] Fold icmp eq/ne (X and C), 0 when partial bits are known [ValueTracking] Convert isKnownNonZero to use SimplifyQuery Mar 21, 2024
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 21, 2024
@nikic
Copy link
Contributor

nikic commented Mar 21, 2024

Can you please fix the clang build?

@@ -641,7 +641,7 @@ LazyValueInfoImpl::solveBlockValueImpl(Value *Val, BasicBlock *BB) {
// instruction is placed, even if it could legally be hoisted much higher.
// That is unfortunate.
PointerType *PT = dyn_cast<PointerType>(BBI->getType());
if (PT && isKnownNonZero(BBI, DL))
if (PT && isKnownNonZero(BBI, /*Depth=*/0, SimplifyQuery(DL)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, you could just pass DL here and other places that use only DL. It will be converted implicitly. But this is fine.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Mar 21, 2024
@dtcxzyw
Copy link
Member Author

dtcxzyw commented Mar 21, 2024

Can you please fix the clang build?

Done.

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 11, 2024

Ping.

AssumptionCache *AC = nullptr,
const Instruction *CxtI = nullptr,
const DominatorTree *DT = nullptr,
bool UseInstrInfo = true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you also need to erase the old API? computeKnownBits has both.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason to keep the old API. It is unlikely to break LLVM downstream users' build (except some static analysis tools).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few places it makes the code simpler (i.e the cases with just DL and depth==0).

%and1 = and i32 %conv, 7
%cmp1 = icmp eq i32 %and1, 0
ret i1 %cmp1
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nicer if this where NFC...

@goldsteinn
Copy link
Contributor

LGTM. Wait on nikic

@nikic
Copy link
Contributor

nikic commented Apr 11, 2024

Looks like the clang build is failing again?

@@ -645,7 +645,7 @@ LazyValueInfoImpl::solveBlockValueImpl(Value *Val, BasicBlock *BB) {
// instruction is placed, even if it could legally be hoisted much higher.
// That is unfortunate.
PointerType *PT = dyn_cast<PointerType>(BBI->getType());
if (PT && isKnownNonZero(BBI, DL))
if (PT && isKnownNonZero(BBI, /*Depth=*/0, DL))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note: We're currently inconsistent about this, but I think it would be best if we moved all the depth parameters after SQ, because end-users of the API don't need to specify depth.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you aren't going to implement that, I will after this patch lands.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this PR changes it so that isKnownNonZero(BBI, DL) is valid in LLVM 18, used to be valid in LLVM 19, is currently an error in LLVM 19, but will become valid in LLVM 19 again is unfortunate and difficult to account for in downstream projects. @goldsteinn If you are going to change that, do you think it is worth updating downstream code to support the current API, or is that change going to go in fast enough that we can just wait?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't make the llvm19 cutoff, I'll drop it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I had been slightly optimistic that this was just going to be a few days, but it sounds like it's going to be longer. In that case I will check what is easiest for us (which might be to submit that LLVM change ourselves).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #88873 for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the PR :)

@dtcxzyw
Copy link
Member Author

dtcxzyw commented Apr 12, 2024

Looks like the clang build is failing again?

Done.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

@dtcxzyw dtcxzyw merged commit e0a6287 into llvm:main Apr 12, 2024
4 checks passed
@dtcxzyw dtcxzyw deleted the perf/icmp-and-c-zero branch April 12, 2024 15:47
hvdijk added a commit that referenced this pull request Apr 16, 2024
Prior to #85863, the required parameters of llvm::isKnownNonZero were
Value and DataLayout. After, they are Value, Depth, and SimplifyQuery,
where SimplifyQuery is implicitly constructible from DataLayout. The
change to move Depth before SimplifyQuery needed callers to be updated
unnecessarily, and as commented in #85863, we actually want Depth to be
after SimplifyQuery anyway so that it can be defaulted and the caller
does not need to specify it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ValueTracking] Missed optimization: Infer known bits from a & mask == 0
5 participants