[SCCP] Extend visitBinaryOperator
to overflowing binary ops
#84470
@llvm/pr-subscribers-function-specialization @llvm/pr-subscribers-llvm-transforms
Author: Antonio Frighetto (antoniofrighetto)
Changes
Leverage more refined ranges results when handling overflowing binary operators.
Full diff: https://github.com/llvm/llvm-project/pull/84470.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
index a185e8cd371c60..a0e522d9e555c7 100644
--- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp
+++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp
@@ -1486,7 +1486,18 @@ void SCCPInstVisitor::visitBinaryOperator(Instruction &I) {
   // Try to simplify to a constant range.
   ConstantRange A = getConstantRange(V1State, I.getType());
   ConstantRange B = getConstantRange(V2State, I.getType());
-  ConstantRange R = A.binaryOp(cast<BinaryOperator>(&I)->getOpcode(), B);
+
+  auto *BO = cast<BinaryOperator>(&I);
+  ConstantRange R = ConstantRange::getEmpty(I.getType()->getScalarSizeInBits());
+  if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(BO)) {
+    bool HasNUW = OBO->hasNoUnsignedWrap();
+    bool HasNSW = OBO->hasNoSignedWrap();
+    unsigned Flags = (HasNUW ? OverflowingBinaryOperator::NoUnsignedWrap : 0) |
+                     (HasNSW ? OverflowingBinaryOperator::NoSignedWrap : 0);
+    R = A.overflowingBinaryOp(BO->getOpcode(), B, Flags);
+  } else {
+    R = A.binaryOp(BO->getOpcode(), B);
+  }
   mergeInValue(&I, ValueLatticeElement::getRange(R));
   // TODO: Currently we do not exploit special values that produce something
diff --git a/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll b/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
index b8f5d5dba0c4b2..05d9acd1919629 100644
--- a/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
+++ b/llvm/test/Transforms/SCCP/add-nuw-nsw-flags.ll
@@ -240,3 +240,32 @@ then:
 else:
   ret i16 0
 }
+
+define i1 @test_add_nuw_sub(i32 %a) {
+; CHECK-LABEL: @test_add_nuw_sub(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[ADD:%.*]] = add nuw i32 [[A:%.*]], 10000
+; CHECK-NEXT: [[SUB:%.*]] = add i32 [[ADD]], -5000
+; CHECK-NEXT: ret i1 false
+;
+entry:
+  %add = add nuw i32 %a, 10000
+  %sub = add i32 %add, -5000
+  %cond = icmp ult i32 %sub, 5000
+  ret i1 %cond
+}
+
+define i1 @test_add_nsw_sub(i32 %a) {
+; CHECK-LABEL: @test_add_nsw_sub(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[ADD:%.*]] = add nsw i32 [[A:%.*]], 10000
+; CHECK-NEXT: [[SUB:%.*]] = add nsw i32 [[ADD]], -5000
+; CHECK-NEXT: [[COND:%.*]] = icmp ult i32 [[SUB]], 5000
+; CHECK-NEXT: ret i1 [[COND]]
+;
+entry:
+  %add = add nsw i32 %a, 10000
+  %sub = add i32 %add, -5000
+  %cond = icmp ult i32 %sub, 5000
+  ret i1 %cond
+}
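The two new tests can be checked by hand. The following is a minimal standalone sketch, not LLVM code, that models the i32 arithmetic with `uint32_t`; the helper names `addTenThousand`, `condition`, and `checkNuwBoundaries` are hypothetical and exist only for this illustration. With `nuw`, `%add` is at least 10000 whenever it is not poison, so `%sub` is at least 5000 and the `icmp ult` is always false. With only `nsw`, `%add` can still be 5000 (for `%a = -5000`), so the comparison is not constant.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Models `%add = add [nuw] i32 %a, 10000`.
// Returns std::nullopt when nuw is requested and the unsigned addition wraps
// (that case is poison in LLVM IR, so the optimizer may ignore it).
std::optional<uint32_t> addTenThousand(uint32_t A, bool NoUnsignedWrap) {
  uint32_t Sum = A + 10000u;  // unsigned arithmetic wraps modulo 2^32
  bool Wrapped = Sum < A;     // true exactly when the addition overflowed
  if (NoUnsignedWrap && Wrapped)
    return std::nullopt;
  return Sum;
}

// Models `%sub = add i32 %add, -5000` followed by `icmp ult i32 %sub, 5000`.
bool condition(uint32_t Add) {
  uint32_t Sub = Add + static_cast<uint32_t>(-5000);
  return Sub < 5000u;
}

// Checks the two extremes of the non-poison nuw range [10000, UINT32_MAX].
void checkNuwBoundaries() {
  assert(!condition(*addTenThousand(0u, true)));                   // %add = 10000
  assert(!condition(*addTenThousand(UINT32_MAX - 10000u, true)));  // %add = UINT32_MAX
}
```

Without the flag, `%a = UINT32_MAX - 4999` wraps `%add` around to 5000 and the comparison becomes true, which is why the fold needs the `nuw` information.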
Some compile-time overhead, but probably acceptable: https://llvm-compile-time-tracker.com/compare.php?from=eb8f379567e8d014194faefe02ce92813e237afc&to=52f204492d88fad02ef9d510e23be3ceee63671c&stat=instructions:u
I wonder whether there is any optimization potential in the ConstantRange implementation.
Force-pushed: 018fdfb to 2b9945e
Maybe acceptable, although it doesn't look that good to me either, OTOH
LGTM.
BTW, would you like to fix the regression dtcxzyw/llvm-opt-benchmark#338 (comment)?
Alive2: https://alive2.llvm.org/ce/z/H2u9si
Another regression: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/338/files#r1518193574
I was a bit confused about this because it does not seem like an improvement -- but you're only talking about the case where
Yeah,
Out of curiosity, I'm wondering why such a minor modification to SCCP brings obvious compile-time overhead. From my perspective, this patch only replaces
Basically yes, but the NoWrap implementations are just more expensive. add() on ConstantRange is just two additions. With nowrap flags, you take a normal add() and combine it with multiple saturating additions and range intersections.
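To make the cost difference concrete, here is a toy model of the "normal add, then intersect with a saturating add" idea for the unsigned case. It is a sketch under stated assumptions, not LLVM's ConstantRange: ranges are modelled as explicit sets of 8-bit values in a `std::bitset`, whereas ConstantRange works on interval endpoints, and the names `Set8`, `makeRange`, `addWrap`, `addSatUnsigned`, and `addNoUnsignedWrap` are hypothetical.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>

// Toy stand-in for a range: the set of possible 8-bit unsigned values.
using Set8 = std::bitset<256>;

// Builds the set {Lo, ..., Hi}.
Set8 makeRange(unsigned Lo, unsigned Hi) {
  assert(Lo <= Hi && Hi < 256 && "bounds must fit in 8 bits");
  Set8 S;
  for (unsigned V = Lo; V <= Hi; ++V)
    S.set(V);
  return S;
}

// Plain wrapping addition: every sum is reduced modulo 256.
Set8 addWrap(const Set8 &A, const Set8 &B) {
  Set8 R;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      if (A[X] && B[Y])
        R.set((X + Y) & 0xFFu);
  return R;
}

// Unsigned saturating addition: every sum is clamped to 255.
Set8 addSatUnsigned(const Set8 &A, const Set8 &B) {
  Set8 R;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      if (A[X] && B[Y])
        R.set(std::min(X + Y, 255u));
  return R;
}

// No-unsigned-wrap addition: one plain add, one saturating add, and an
// intersection. The result over-approximates the non-wrapping sums.
Set8 addNoUnsignedWrap(const Set8 &A, const Set8 &B) {
  return addWrap(A, B) & addSatUnsigned(A, B);
}
```

For A = [100, 200] and B = {100}, the plain add yields 101 possible values (200..255 plus the wrapped 0..44), while the no-wrap variant keeps only 200..255. The tighter result costs roughly one extra add plus an intersection per flag, which is consistent with the overhead described in the comment above.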
Would it make sense to handle this in ConstraintElimination?
This looks more suited to InstCombine to me.
Yeah, we should fold
Note that this already happens to be handled when llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Lines 1768 to 1772 in 2fe81ed
If it makes sense, we can extend it to handle the following missing case:
define i1 @src(i64 %a) {
entry:
  %rem817 = urem i64 %a, 1000
  %rem817.neg = sub nsw i64 0, %rem817
  %1 = and i64 %rem817.neg, 2147483648
  %cmp10.not = icmp eq i64 %1, 0
  ret i1 %cmp10.not
}
However, I think this holds as long as
@nikic, think we can merge this?
Yes, let's merge it.
Merging it. I think we can still solve the aforementioned regression though if needed (also,
Leverage more refined ranges results when handling overflowing binary operators.
Force-pushed: 2b9945e to 6ae4fcf