-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[InstCombine]: Eliminate redundant modulus for urem #157644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: None (kper) ChangesFolds Alive: https://alive2.llvm.org/ce/z/b_GQJX Fixes #157370 Full diff: https://github.com/llvm/llvm-project/pull/157644.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
index a9aacc707cc20..5288be9c84383 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
@@ -2473,6 +2473,14 @@ Instruction *InstCombinerImpl::visitURem(BinaryOperator &I) {
}
}
+ Value *A;
+ Value *B;
+ // urem(urem(A, B), Op1) -> urem(A, Op1)
+ if (match(Op0, m_URem(m_Value(A), m_Value(B)))) {
+ Value *Fold = Builder.CreateURem(A, Op1);
+ return replaceInstUsesWith(I, Fold);
+ }
+
return nullptr;
}
diff --git a/llvm/test/Transforms/InstCombine/urem.ll b/llvm/test/Transforms/InstCombine/urem.ll
new file mode 100644
index 0000000000000..751dc4991907a
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/urem.ll
@@ -0,0 +1,82 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+declare void @use(i8)
+
+define i8 @fold_urem_constants(i64 %arg0) {
+; CHECK-LABEL: @fold_urem_constants(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V2:%.*]] = urem i8 [[V0]], 5
+; CHECK-NEXT: ret i8 [[V2]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, 25
+ %v2 = urem i8 %v1, 5
+ ret i8 %v2
+}
+
+define i8 @fold_urem_variables(i64 %arg0, i8 %A, i8 %B) {
+; CHECK-LABEL: @fold_urem_variables(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V2:%.*]] = urem i8 [[V0]], [[B:%.*]]
+; CHECK-NEXT: ret i8 [[V2]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, %A
+ %v2 = urem i8 %v1, %B
+ ret i8 %v2
+}
+
+define i8 @fold_urem_constants_multi_use(i64 %arg0) {
+; CHECK-LABEL: @fold_urem_constants_multi_use(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V1:%.*]] = urem i8 [[V0]], 25
+; CHECK-NEXT: call void @use(i8 [[V1]])
+; CHECK-NEXT: [[V2:%.*]] = urem i8 [[V0]], 5
+; CHECK-NEXT: ret i8 [[V2]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, 25
+ call void @use(i8 %v1)
+ %v2 = urem i8 %v1, 5
+ ret i8 %v2
+}
+
+define i8 @fold_urem_variables_multi_use(i64 %arg0, i8 %A, i8 %B) {
+; CHECK-LABEL: @fold_urem_variables_multi_use(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V1:%.*]] = urem i8 [[V0]], [[A:%.*]]
+; CHECK-NEXT: call void @use(i8 [[V1]])
+; CHECK-NEXT: [[V2:%.*]] = urem i8 [[V0]], [[B:%.*]]
+; CHECK-NEXT: ret i8 [[V2]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, %A
+ call void @use(i8 %v1)
+ %v2 = urem i8 %v1, %B
+ ret i8 %v2
+}
+
+define i8 @fold_urem_constants_negative_test(i64 %arg0) {
+; CHECK-LABEL: @fold_urem_constants_negative_test(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V2:%.*]] = urem i8 [[V0]], 25
+; CHECK-NEXT: ret i8 [[V2]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, 25
+ %v2 = urem i8 %v0, 5 ; removes this because not used
+ ret i8 %v1
+}
+
+define i8 @fold_urem_variables_negative_test(i64 %arg0, i8 %A, i8 %B) {
+; CHECK-LABEL: @fold_urem_variables_negative_test(
+; CHECK-NEXT: [[V0:%.*]] = trunc i64 [[ARG0:%.*]] to i8
+; CHECK-NEXT: [[V1:%.*]] = urem i8 [[V0]], [[A:%.*]]
+; CHECK-NEXT: ret i8 [[V1]]
+;
+ %v0 = trunc i64 %arg0 to i8
+ %v1 = urem i8 %v0, %A
+ %v2 = urem i8 %v0, %B ; removes this because not used
+ ret i8 %v1
+}
|
7cabd0b
to
f648f02
Compare
f648f02
to
b6529db
Compare
b6529db
to
5a7f62d
Compare
5a7f62d
to
b9559a7
Compare
const APInt *Op1Cst, *BCst; | ||
// urem(urem(A, BCst), Op1Cst) -> urem(A, Op1Cst) | ||
// iff urem(BCst, Op1Cst) == 0 | ||
if (match(Op0, m_URem(m_Value(A), m_APInt(BCst))) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we need a one-use check on the inner urem. Otherwise we often end up breaking up div+rem patterns, and also performing the urem on a wider type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, can you rerun the benchmark? (I am not sure whether I need rights for that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like with this change the fold never triggers anymore...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems so that it's always this div+rem pair in the pattern posted in the original issue. (actually found it only once)
https://github.com/dtcxzyw/llvm-opt-benchmark/blob/main/bench/ffmpeg/optimized/ac3dec_fixed.ll#L7580
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure what's here left for me to do
Should I close this PR because it seems to be unprofitable to do the optimisation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we should close this issue as "won't fix". It's a valid transform, but it doesn't look like it will be profitable in practice.
b9559a7
to
e6ed237
Compare
e6ed237
to
d575ebb
Compare
Folds
urem
when first modulus is redundant.Alive: https://alive2.llvm.org/ce/z/b_GQJX
Fixes #157370