[InstCombine] Fold minmax intrinsic using KnownBits information #76242

dtcxzyw · 2023-12-22T14:49:37Z

This patch folds min/max intrinsics when the operand ranges computed by computeConstantRangeIncludingKnownBits prove that one operand always wins the comparison, so the call can be replaced by that operand.
It fixes a regression in _karatsuba_rec (cpython/Modules/_decimal/libmpdec/mpdecimal.c) that was introduced by #71396.
See also dtcxzyw/llvm-opt-benchmark#16 (comment).

Alive2 for splat vectors with undef: https://alive2.llvm.org/ce/z/J8hKWd
It is an alternative to #76221.
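
For readers unfamiliar with the idea, here is a minimal, self-contained sketch of the fold decision for the unsigned case on 8-bit values. It does not use the LLVM API: `KnownBits8`, `URange8`, `rangeFromKnownBits`, `foldUMax` and `foldUMin` are hypothetical stand-ins for `KnownBits`, `ConstantRange` and the range comparison in the patch, and the interval model is an assumption that ignores wrapped ranges.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for llvm::KnownBits on an 8-bit value.
struct KnownBits8 {
  uint8_t Zero; // bits known to be 0
  uint8_t One;  // bits known to be 1
};

// Hypothetical stand-in for an unsigned ConstantRange: closed interval [Lo, Hi].
struct URange8 {
  uint8_t Lo, Hi;
};

// The smallest possible value sets only the known-one bits; the largest
// possible value clears only the known-zero bits.
URange8 rangeFromKnownBits(KnownBits8 K) {
  assert((K.Zero & K.One) == 0 && "a bit cannot be known 0 and known 1");
  return {K.One, static_cast<uint8_t>(~K.Zero)};
}

enum class Operand { LHS, RHS };

// umax(L, R) is L when every L >= every R, and R when every R >= every L.
std::optional<Operand> foldUMax(URange8 L, URange8 R) {
  if (L.Lo >= R.Hi) return Operand::LHS;
  if (R.Lo >= L.Hi) return Operand::RHS;
  return std::nullopt; // ranges overlap: keep the intrinsic
}

// umin(L, R) is L when every L <= every R, and R when every R <= every L.
std::optional<Operand> foldUMin(URange8 L, URange8 R) {
  if (L.Hi <= R.Lo) return Operand::LHS;
  if (R.Hi <= L.Lo) return Operand::RHS;
  return std::nullopt;
}
```

Under these assumptions, a value whose lowest bit is known to be 1 has range [1, 255], so `foldUMax` against the constant range [1, 1] selects the left operand; a value whose two lowest bits are known to be 1 has range [3, 255], so `foldUMin` against [3, 3] selects the constant.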

llvmbot · 2023-12-22T14:50:05Z

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Full diff: https://github.com/llvm/llvm-project/pull/76242.diff

4 Files Affected:

(modified) llvm/include/llvm/Analysis/ValueTracking.h (+5)
(modified) llvm/lib/Analysis/ValueTracking.cpp (+4-4)
(modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+13)
(modified) llvm/test/Transforms/InstCombine/minmax-intrinsics.ll (+63)

diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index a3186e61b94adf..baa16306ebf5df 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -863,6 +863,11 @@ ConstantRange computeConstantRange(const Value *V, bool ForSigned,
                                    const DominatorTree *DT = nullptr,
                                    unsigned Depth = 0);
 
+/// Combine constant ranges from computeConstantRange() and computeKnownBits().
+ConstantRange
+computeConstantRangeIncludingKnownBits(const WithCache<const Value *> &V,
+                                       bool ForSigned, const SimplifyQuery &SQ);
+
 /// Return true if this function can prove that the instruction I will
 /// always transfer execution to one of its successors (including the next
 /// instruction that follows within a basic block). E.g. this is not
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 769d921eb1e8d1..cac2602d455f9d 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -6289,10 +6289,10 @@ static OverflowResult mapOverflowResult(ConstantRange::OverflowResult OR) {
 }
 
 /// Combine constant ranges from computeConstantRange() and computeKnownBits().
-static ConstantRange
-computeConstantRangeIncludingKnownBits(const WithCache<const Value *> &V,
-                                       bool ForSigned,
-                                       const SimplifyQuery &SQ) {
+ConstantRange
+llvm::computeConstantRangeIncludingKnownBits(const WithCache<const Value *> &V,
+                                             bool ForSigned,
+                                             const SimplifyQuery &SQ) {
   ConstantRange CR1 =
       ConstantRange::fromKnownBits(V.getKnownBits(SQ), ForSigned);
   ConstantRange CR2 = computeConstantRange(V, ForSigned, SQ.IIQ.UseInstrInfo);
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index a272357fa04a41..76000270e868eb 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -1796,6 +1796,19 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
     if (Instruction *NewMinMax = factorizeMinMaxTree(II))
        return NewMinMax;
 
+    // Try to fold minmax based on range information
+    ICmpInst::Predicate Pred =
+        ICmpInst::getNonStrictPredicate(MinMaxIntrinsic::getPredicate(IID));
+    bool IsSigned = MinMaxIntrinsic::isSigned(IID);
+    const auto LHS_CR = llvm::computeConstantRangeIncludingKnownBits(
+        I0, IsSigned, SQ.getWithInstruction(II));
+    const auto RHS_CR = llvm::computeConstantRangeIncludingKnownBits(
+        I1, IsSigned, SQ.getWithInstruction(II));
+    if (LHS_CR.icmp(Pred, RHS_CR))
+      return replaceInstUsesWith(*II, I0);
+    if (RHS_CR.icmp(Pred, LHS_CR))
+      return replaceInstUsesWith(*II, I1);
+
     break;
   }
   case Intrinsic::bitreverse: {
diff --git a/llvm/test/Transforms/InstCombine/minmax-intrinsics.ll b/llvm/test/Transforms/InstCombine/minmax-intrinsics.ll
index f3833a420ee835..500ea0aa73f741 100644
--- a/llvm/test/Transforms/InstCombine/minmax-intrinsics.ll
+++ b/llvm/test/Transforms/InstCombine/minmax-intrinsics.ll
@@ -2489,3 +2489,66 @@ define i1 @PR57986() {
   %umin = call i1 @llvm.umin.i1(i1 ptrtoint (ptr @g to i1), i1 true)
   ret i1 %umin
 }
+
+define i8 @fold_umax_with_knownbits_info(i8 %a, i8 %b) {
+; CHECK-LABEL: @fold_umax_with_knownbits_info(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[A1:%.*]] = or i8 [[A:%.*]], 1
+; CHECK-NEXT:    [[A2:%.*]] = shl i8 [[B:%.*]], 1
+; CHECK-NEXT:    [[SUB:%.*]] = sub i8 [[A1]], [[A2]]
+; CHECK-NEXT:    ret i8 [[SUB]]
+;
+entry:
+  %a1 = or i8 %a, 1
+  %a2 = shl i8 %b, 1
+  %sub = sub i8 %a1, %a2
+  %val = call i8 @llvm.umax.i8(i8 %sub, i8 1)
+  ret i8 %val
+}
+
+define i8 @fold_umin_with_knownbits_info(i8 %a, i8 %b) {
+; CHECK-LABEL: @fold_umin_with_knownbits_info(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    ret i8 3
+;
+entry:
+  %a1 = or i8 %a, 3
+  %a2 = shl i8 %b, 2
+  %sub = sub i8 %a1, %a2
+  %val = call i8 @llvm.umin.i8(i8 %sub, i8 3)
+  ret i8 %val
+}
+
+define i8 @fold_umax_with_knownbits_info_fail(i8 %a, i8 %b) {
+; CHECK-LABEL: @fold_umax_with_knownbits_info_fail(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[A1:%.*]] = or i8 [[A:%.*]], 2
+; CHECK-NEXT:    [[A2:%.*]] = shl i8 [[B:%.*]], 1
+; CHECK-NEXT:    [[SUB:%.*]] = sub i8 [[A1]], [[A2]]
+; CHECK-NEXT:    [[VAL:%.*]] = call i8 @llvm.umax.i8(i8 [[SUB]], i8 1)
+; CHECK-NEXT:    ret i8 [[VAL]]
+;
+entry:
+  %a1 = or i8 %a, 2
+  %a2 = shl i8 %b, 1
+  %sub = sub i8 %a1, %a2
+  %val = call i8 @llvm.umax.i8(i8 %sub, i8 1)
+  ret i8 %val
+}
+
+define i8 @fold_umin_with_knownbits_info_fail(i8 %a, i8 %b) {
+; CHECK-LABEL: @fold_umin_with_knownbits_info_fail(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[A1:%.*]] = or i8 [[A:%.*]], 1
+; CHECK-NEXT:    [[A2:%.*]] = shl i8 [[B:%.*]], 2
+; CHECK-NEXT:    [[SUB:%.*]] = sub i8 [[A1]], [[A2]]
+; CHECK-NEXT:    [[VAL:%.*]] = call i8 @llvm.umin.i8(i8 [[SUB]], i8 3)
+; CHECK-NEXT:    ret i8 [[VAL]]
+;
+entry:
+  %a1 = or i8 %a, 1
+  %a2 = shl i8 %b, 2
+  %sub = sub i8 %a1, %a2
+  %val = call i8 @llvm.umin.i8(i8 %sub, i8 3)
+  ret i8 %val
+}

PR Link: llvm/llvm-project#76242
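
The four new tests in minmax-intrinsics.ll can be sanity-checked by brute force, since every input is an i8. The sketch below is an illustration only and not part of the patch; `subOfOrAndShl`, `umaxAlwaysPicksSub` and `uminAlwaysPicksConst` are hypothetical helper names. It evaluates `sub = (a | OrMask) - (b << Shift)` with wrapping 8-bit arithmetic for all 65536 input pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Evaluates sub = (a | OrMask) - (b << Shift) in wrapping i8 arithmetic,
// mirroring the or/shl/sub sequence used by the tests.
uint8_t subOfOrAndShl(uint8_t A, uint8_t B, uint8_t OrMask, unsigned Shift) {
  assert(Shift < 8 && "shift amount must be smaller than the bit width");
  uint8_t A1 = static_cast<uint8_t>(A | OrMask);
  uint8_t A2 = static_cast<uint8_t>(B << Shift);
  return static_cast<uint8_t>(A1 - A2);
}

// True if umax(sub, C) == sub for every pair (a, b).
bool umaxAlwaysPicksSub(uint8_t OrMask, unsigned Shift, uint8_t C) {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t Sub = subOfOrAndShl(static_cast<uint8_t>(A),
                                  static_cast<uint8_t>(B), OrMask, Shift);
      if (std::max(Sub, C) != Sub)
        return false;
    }
  return true;
}

// True if umin(sub, C) == C for every pair (a, b).
bool uminAlwaysPicksConst(uint8_t OrMask, unsigned Shift, uint8_t C) {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      uint8_t Sub = subOfOrAndShl(static_cast<uint8_t>(A),
                                  static_cast<uint8_t>(B), OrMask, Shift);
      if (std::min(Sub, C) != C)
        return false;
    }
  return true;
}
```

With the constants from the tests, `umaxAlwaysPicksSub(1, 1, 1)` and `uminAlwaysPicksConst(3, 2, 3)` hold because `%sub` is always odd in the first case and always congruent to 3 modulo 4 in the second. The `_fail` variants have counterexamples: a = 0, b = 1 gives `%sub` = 0 for the umax case, and a = 0, b = 0 gives `%sub` = 1 for the umin case.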

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

nikic

LGTM

dtcxzyw requested a review from goldsteinn December 22, 2023 14:49

dtcxzyw requested a review from nikic as a code owner December 22, 2023 14:49

llvmbot added llvm:analysis llvm:transforms labels Dec 22, 2023

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Dec 22, 2023

pre-commit: test PR76242

b8bf27d

dtcxzyw mentioned this pull request Dec 22, 2023

pre-commit: test PR76242 dtcxzyw/llvm-opt-benchmark#26

Closed

nikic reviewed Dec 22, 2023

dtcxzyw force-pushed the perf/fold-minmax-with-knownbits branch from 5297a54 to 94c65a1 Compare December 22, 2023 16:32

[InstCombine] Add pre-commit tests. NFC.

bd763f6

dtcxzyw force-pushed the perf/fold-minmax-with-knownbits branch from 94c65a1 to 1700573 Compare December 22, 2023 19:00

nikic approved these changes Dec 22, 2023

[InstCombine] Fold minmax intrinsic using KnownBits information

de719bd

dtcxzyw force-pushed the perf/fold-minmax-with-knownbits branch from 1700573 to de719bd Compare December 22, 2023 20:01

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Dec 22, 2023

pre-commit: test PR76242

320b70f

dtcxzyw merged commit 345d7b1 into llvm:main Dec 22, 2023
3 of 4 checks passed

dtcxzyw deleted the perf/fold-minmax-with-knownbits branch December 22, 2023 20:41
