[SCCP] Extend `visitBinaryOperator` to overflowing binary ops #84470

antoniofrighetto · 2024-03-08T12:27:40Z

Leverage more refined range results when handling overflowing binary operators.

llvmbot · 2024-03-08T12:28:10Z

@llvm/pr-subscribers-function-specialization

@llvm/pr-subscribers-llvm-transforms

Author: Antonio Frighetto (antoniofrighetto)

Changes

Leverage more refined range results when handling overflowing binary operators.

Full diff: https://github.com/llvm/llvm-project/pull/84470.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Utils/SCCPSolver.cpp (+12-1)
(modified) llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll (+29)

diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
index a185e8cd371c60..a0e522d9e555c7 100644
--- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp
+++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
@@ -1486,7 +1486,18 @@ void SCCPInstVisitor::visitBinaryOperator(Instruction &I) {
   // Try to simplify to a constant range.
   ConstantRange A = getConstantRange(V1State, I.getType());
   ConstantRange B = getConstantRange(V2State, I.getType());
-  ConstantRange R = A.binaryOp(cast<BinaryOperator>(&I)->getOpcode(), B);
+
+  auto *BO = cast<BinaryOperator>(&I);
+  ConstantRange R = ConstantRange::getEmpty(I.getType()->getScalarSizeInBits());
+  if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(BO)) {
+    bool HasNUW = OBO->hasNoUnsignedWrap();
+    bool HasNSW = OBO->hasNoSignedWrap();
+    unsigned Flags = (HasNUW ? OverflowingBinaryOperator::NoUnsignedWrap : 0) |
+                     (HasNSW ? OverflowingBinaryOperator::NoSignedWrap : 0);
+    R = A.overflowingBinaryOp(BO->getOpcode(), B, Flags);
+  } else {
+    R = A.binaryOp(BO->getOpcode(), B);
+  }
   mergeInValue(&I, ValueLatticeElement::getRange(R));
 
   // TODO: Currently we do not exploit special values that produce something
diff --git a/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll b/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
index b8f5d5dba0c4b2..05d9acd1919629 100644
--- a/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
+++ b/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
@@ -240,3 +240,32 @@ then:
 else:
   ret i16 0
 }
+
+define i1 @test_add_nuw_sub(i32 %a) {
+; CHECK-LABEL: @test_add_nuw_sub(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ADD:%.*]] = add nuw i32 [[A:%.*]], 10000
+; CHECK-NEXT:    [[SUB:%.*]] = add i32 [[ADD]], -5000
+; CHECK-NEXT:    ret i1 false
+;
+entry:
+  %add = add nuw i32 %a, 10000
+  %sub = add i32 %add, -5000
+  %cond = icmp ult i32 %sub, 5000
+  ret i1 %cond
+}
+
+define i1 @test_add_nsw_sub(i32 %a) {
+; CHECK-LABEL: @test_add_nsw_sub(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[A:%.*]], 10000
+; CHECK-NEXT:    [[SUB:%.*]] = add nsw i32 [[ADD]], -5000
+; CHECK-NEXT:    [[COND:%.*]] = icmp ult i32 [[SUB]], 5000
+; CHECK-NEXT:    ret i1 [[COND]]
+;
+entry:
+  %add = add nsw i32 %a, 10000
+  %sub = add i32 %add, -5000
+  %cond = icmp ult i32 %sub, 5000
+  ret i1 %cond
+}
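
The two new tests can be sanity-checked outside of LLVM. The sketch below is a standalone model, not SCCP or `ConstantRange` code: `evalNuw`, `evalNsw`, and `checkSamples` are hypothetical helpers that evaluate the test IR for one concrete `%a` and return `std::nullopt` where the flagged `add` would be poison. Under `nuw`, `%add` is at least 10000, so `%sub` is at least 5000 and the comparison is always false. Under `nsw`, `%add` can be any small non-negative value, so the comparison depends on `%a` and cannot be folded.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Model of @test_add_nuw_sub for one concrete %a.
// std::nullopt means `add nuw` wrapped, i.e. the IR value is poison.
std::optional<bool> evalNuw(uint32_t A) {
  if (A > std::numeric_limits<uint32_t>::max() - 10000u)
    return std::nullopt;           // unsigned wrap
  uint32_t Add = A + 10000u;       // add nuw i32 %a, 10000
  uint32_t Sub = Add - 5000u;      // add i32 %add, -5000 (modulo 2^32)
  return Sub < 5000u;              // icmp ult i32 %sub, 5000
}

// Model of @test_add_nsw_sub for one concrete %a.
std::optional<bool> evalNsw(int32_t A) {
  if (A > std::numeric_limits<int32_t>::max() - 10000)
    return std::nullopt;           // signed wrap
  int32_t Add = A + 10000;         // add nsw i32 %a, 10000
  uint32_t Sub = static_cast<uint32_t>(Add) - 5000u;
  return Sub < 5000u;
}

// Spot checks on boundary inputs (a sample, not a proof).
void checkSamples() {
  const uint32_t MaxA = std::numeric_limits<uint32_t>::max() - 10000u;
  const uint32_t Samples[] = {0u, 1u, 4999u, 5000u, 0x7fffffffu, MaxA};
  for (uint32_t A : Samples) {
    std::optional<bool> R = evalNuw(A);
    assert(R.has_value() && !*R);  // nuw: always false when defined
  }
  assert(*evalNsw(-5000));         // nsw: %sub == 0, comparison is true
  assert(!*evalNsw(0));            // nsw: %sub == 5000, comparison is false
}
```

Calling `checkSamples()` exercises both functions; the `nsw` case yields both outcomes, which matches the `CHECK` lines that keep the `icmp`.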

nikic · 2024-03-08T16:02:39Z

Some compile-time overhead, but probably acceptable: https://llvm-compile-time-tracker.com/compare.php?from=eb8f379567e8d014194faefe02ce92813e237afc&to=52f204492d88fad02ef9d510e23be3ceee63671c&stat=instructions:u Wonder whether there is any optimization potential in the ConstantRange implementation.

antoniofrighetto · 2024-03-08T16:13:19Z

Maybe acceptable, although it doesn't look that good to me either. OTOH, addWithNoWrap and subWithNoWrap already look quite optimized :/

dtcxzyw

LGTM.
BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)?
Alive2: https://alive2.llvm.org/ce/z/H2u9si

dtcxzyw · 2024-03-08T19:22:16Z

Another regression: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/338/files#r1518193574
Alive2: https://alive2.llvm.org/ce/z/pZwdiS

nikic · 2024-03-08T20:57:20Z

> LGTM. BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)? Alive2: https://alive2.llvm.org/ce/z/H2u9si

I was a bit confused about this because it does not seem like an improvement -- but you're only talking about the case where %b is constant, right?

dtcxzyw · 2024-03-09T05:41:38Z

> > LGTM. BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)? Alive2: https://alive2.llvm.org/ce/z/H2u9si
>
> I was a bit confused about this because it does not seem like an improvement -- but you're only talking about the case where %b is constant, right?

Yeah, %b should be a positive constant.

XChy · 2024-03-09T06:40:49Z

> Some compile-time overhead, but probably acceptable: https://llvm-compile-time-tracker.com/compare.php?from=eb8f379567e8d014194faefe02ce92813e237afc&to=52f204492d88fad02ef9d510e23be3ceee63671c&stat=instructions:u Wonder whether there is any optimization potential in the ConstantRange implementation.

Out of curiosity, I'm wondering why such a minor modification to SCCP brings obvious compile-time overhead. From my perspective, this patch only replaces add/sub with the NoWrap implementations.

nikic · 2024-03-09T08:57:36Z

> > Some compile-time overhead, but probably acceptable: https://llvm-compile-time-tracker.com/compare.php?from=eb8f379567e8d014194faefe02ce92813e237afc&to=52f204492d88fad02ef9d510e23be3ceee63671c&stat=instructions:u Wonder whether there is any optimization potential in the ConstantRange implementation.
>
> Out of curiosity, I'm wondering why such a minor modification to SCCP brings obvious compile-time overhead. From my perspective, this patch only replaces add/sub with the NoWrap implementations.

Basically yes, but the NoWrap implementations are just more expensive. add() on ConstantRange is just two additions. With nowrap flags, you take a normal add() and combine it with multiple saturating additions and range intersections.
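
To make that cost difference concrete, here is a toy model of the two code paths on unsigned 32-bit intervals. It is only a sketch of the idea described above and not LLVM's `ConstantRange`: the `URange` type and the helpers `addPlain`, `uaddSat`, `intersect`, `addNuw`, and `checkRanges` are hypothetical, ranges are inclusive and never wrapped, and the plain add conservatively returns the full range whenever the upper bound might wrap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Inclusive, non-wrapped unsigned interval (a simplification).
struct URange {
  uint32_t Lo;
  uint32_t Hi;
};

constexpr uint32_t Max = UINT32_MAX;
constexpr URange Full{0, Max};

// Plain add: add both bounds; give up if the sum may wrap.
URange addPlain(URange A, URange B) {
  uint64_t Hi = uint64_t{A.Hi} + B.Hi;
  if (Hi > Max)
    return Full;
  return {static_cast<uint32_t>(A.Lo + B.Lo), static_cast<uint32_t>(Hi)};
}

// Saturating unsigned add of two scalars.
uint32_t uaddSat(uint32_t X, uint32_t Y) {
  uint64_t S = uint64_t{X} + Y;
  return S > Max ? Max : static_cast<uint32_t>(S);
}

URange intersect(URange A, URange B) {
  return {std::max(A.Lo, B.Lo), std::min(A.Hi, B.Hi)};
}

// nuw add: the plain result intersected with the saturated bounds.
// This does strictly more work than addPlain alone.
URange addNuw(URange A, URange B) {
  URange Sat{uaddSat(A.Lo, B.Lo), uaddSat(A.Hi, B.Hi)};
  return intersect(addPlain(A, B), Sat);
}

void checkRanges() {
  URange C{10000, 10000};
  URange P = addPlain(Full, C);   // may wrap: nothing is known
  URange N = addNuw(Full, C);     // no wrap allowed: [10000, Max]
  assert(P.Lo == 0 && P.Hi == Max);
  assert(N.Lo == 10000 && N.Hi == Max);
}
```

In this model the flagged version is the plain add plus two saturating adds and an intersection, which is the extra work per visited instruction; the payoff is the tighter lower bound that lets the `nuw` test in this PR fold.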

antoniofrighetto · 2024-03-09T10:59:13Z

> LGTM.
> BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)?
> Alive2: https://alive2.llvm.org/ce/z/H2u9si

Would it make sense to handle this in ConstraintElimination?

nikic · 2024-03-09T11:34:23Z

> > LGTM.
> > BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)?
> > Alive2: https://alive2.llvm.org/ce/z/H2u9si
>
> Would it make sense to handle this in ConstraintElimination?

This looks more suited to InstCombine to me.

dtcxzyw · 2024-03-09T11:40:16Z

> > > LGTM.
> > > BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)?
> > > Alive2: https://alive2.llvm.org/ce/z/H2u9si
> >
> > Would it make sense to handle this in ConstraintElimination?
>
> This looks more suited to InstCombine to me.

Yeah, we should fold sub 0, (udiv nneg X, nneg C) into sdiv nneg X, -C in InstCombine.
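
The arithmetic behind that fold can be spot-checked with a small standalone program. This is a sketch under the assumptions stated in the thread (X is non-negative and C is a positive constant), using 32-bit integers for illustration; `negUDiv`, `sdivNeg`, `foldHoldsOnGrid`, and `checkFold` are hypothetical names, and the grid search is a sanity check rather than a proof (the Alive2 link above is the authoritative check).

```cpp
#include <cassert>
#include <cstdint>

// Source pattern: sub 0, (udiv X, C), with X >= 0 and C > 0.
int32_t negUDiv(int32_t X, int32_t C) {
  uint32_t Q = static_cast<uint32_t>(X) / static_cast<uint32_t>(C);
  return -static_cast<int32_t>(Q);  // Q <= X <= INT32_MAX, so this is safe
}

// Target pattern: sdiv X, -C (C++ signed division truncates toward zero,
// like LLVM's sdiv).
int32_t sdivNeg(int32_t X, int32_t C) { return X / -C; }

// Compare both forms for 0 <= X <= MaxX and 1 <= C <= MaxC.
bool foldHoldsOnGrid(int32_t MaxX, int32_t MaxC) {
  for (int32_t X = 0; X <= MaxX; ++X)
    for (int32_t C = 1; C <= MaxC; ++C)
      if (negUDiv(X, C) != sdivNeg(X, C))
        return false;
  return true;
}

void checkFold() {
  assert(foldHoldsOnGrid(2000, 50));
  // Boundary operands.
  assert(negUDiv(INT32_MAX, 1) == sdivNeg(INT32_MAX, 1));
  assert(negUDiv(INT32_MAX, INT32_MAX) == sdivNeg(INT32_MAX, INT32_MAX));
}
```

The non-negativity requirement matters: only then do the unsigned and signed quotients agree, and only for positive C is `-C` representable without overflow.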

antoniofrighetto · 2024-03-09T16:33:32Z

> Another regression: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/338/files#r1518193574
> Alive2: https://alive2.llvm.org/ce/z/pZwdiS

Note that this already happens to be handled when isSignMask adheres to the type bitwidth (https://alive2.llvm.org/ce/z/KL7JYt) here:

llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp

Lines 1768 to 1772 in 2fe81ed

if (C2->isSignMask()) {
  Constant *Zero = Constant::getNullValue(X->getType());
  auto NewPred = isICMP_NE ? ICmpInst::ICMP_SLT : ICmpInst::ICMP_SGE;
  return new ICmpInst(NewPred, X, Zero);
}

If it makes sense, we can extend it to handle the following missing case:

define i1 @src(i64 %a) {
entry:
  %rem817 = urem i64 %a, 1000
  %rem817.neg = sub nsw i64 0, %rem817
  %1 = and i64 %rem817.neg, 2147483648
  %cmp10.not = icmp eq i64 %1, 0
  ret i1 %cmp10.not
}

However, I think this holds as long as %rem817.neg is bounded up to a certain constant, and the output for computeKnownBits(And->getOperand(0), 0, And) for this instance might be suboptimal (zeroes 0, zeroes 1)? Is there a better way to handle this? Can take a look at VT, but would like to know if this makes sense.
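
The bounded-value intuition can be checked on this instance with a short standalone program. It is a sketch, not InstCombine code: `src`, `tgt`, `agreeOnAllRemainders`, and `checkBit31` are hypothetical names, and the target form (a signed `>= 0` test on `%rem817.neg`) is an assumption made by analogy with the existing sign-mask fold quoted above. The underlying fact is that any 64-bit value in [-2^31, 2^31 - 1] has bit 31 equal to its sign bit, and here `%rem817.neg` lies in [-999, 0].

```cpp
#include <cassert>
#include <cstdint>

// Mirrors @src: test bit 31 of -(a urem 1000).
bool src(uint64_t A) {
  int64_t Neg = -static_cast<int64_t>(A % 1000);  // in [-999, 0]
  return (static_cast<uint64_t>(Neg) & (uint64_t{1} << 31)) == 0;
}

// Assumed target: a plain sign test on the negated remainder.
bool tgt(uint64_t A) {
  int64_t Neg = -static_cast<int64_t>(A % 1000);
  return Neg >= 0;
}

// Every possible remainder is covered by A in [0, 999].
bool agreeOnAllRemainders() {
  for (uint64_t A = 0; A < 1000; ++A)
    if (src(A) != tgt(A))
      return false;
  return true;
}

void checkBit31() {
  assert(agreeOnAllRemainders());
  assert(src(0) && !src(1));
}
```

Because the result depends only on `A % 1000`, checking the 1000 remainders covers every input for this particular function; it says nothing about masks or bounds outside that range.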

antoniofrighetto · 2024-03-14T13:23:13Z

@nikic, think we can merge this?

nikic · 2024-03-14T14:00:05Z

Yes, let's merge it.

antoniofrighetto · 2024-03-14T15:01:01Z

Merging it. I think we can still solve the aforementioned regression if needed (also, the computeKnownBits output looks fine; I missed that we only have the nsw flag).

antoniofrighetto requested a review from nikic March 8, 2024 12:27

llvmbot added function-specialization llvm:transforms labels Mar 8, 2024

dtcxzyw reviewed Mar 8, 2024

antoniofrighetto force-pushed the feature/sccp-binop-overflowing branch from 018fdfb to 2b9945e Compare March 8, 2024 16:09

dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 8, 2024

pre-commit: test PR84470

2ce093a

PR Link: llvm/llvm-project#84470

dtcxzyw mentioned this pull request Mar 8, 2024

pre-commit: test PR84470 dtcxzyw/llvm-opt-benchmark#338

Closed

dtcxzyw approved these changes Mar 8, 2024

dtcxzyw mentioned this pull request Mar 9, 2024

Missed Optimization: Aligned Pointer Optimizations Can't Happen With Prefered OR Instead of ADD #84401

Open

[SCCP] Extend visitBinaryOperator to overflowing binary ops

6ae4fcf

Leverage more refined range results when handling overflowing binary operators.

antoniofrighetto force-pushed the feature/sccp-binop-overflowing branch from 2b9945e to 6ae4fcf Compare March 14, 2024 15:02

antoniofrighetto merged commit 6ae4fcf into llvm:main Mar 14, 2024
3 of 4 checks passed
