[InstCombine] Fold switch(rol(x, C1)) case C2:
to switch(x) case rol(C2, -C1):
#86307
base: main
@llvm/pr-subscribers-llvm-transforms
Author: Monad (YanWQ-monad)
Changes: This solves #86161. It is worth mentioning that, as @dtcxzyw pointed out, there is an inverse fold in llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp (lines 6911 to 6996 in 90454a6), so the IR may flip back and forth between the two forms in the optimization pipeline. SimplifyCFG will have the upper hand, so there won't be any major problems. But in some rare cases, it might cause a regression (in @dtcxzyw's opt-benchmark).
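The fold rests on rotation being a bijection: `rol(x, C1) == C2` holds exactly when `x == ror(C2, C1)`, so each case constant can be pre-rotated and the rotate dropped from the switch condition. A minimal standalone sketch of that identity for 32-bit values follows; the helper names `rotl32`, `rotr32`, and `foldAgrees` are illustrative only and are not part of the patch or of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value left by n bits (n is taken modulo 32).
constexpr uint32_t rotl32(uint32_t v, unsigned n) {
  n &= 31;
  return n == 0 ? v : (v << n) | (v >> (32 - n));
}

// Rotate a 32-bit value right by n bits (n is taken modulo 32).
constexpr uint32_t rotr32(uint32_t v, unsigned n) {
  n &= 31;
  return n == 0 ? v : (v >> n) | (v << (32 - n));
}

// Hypothetical check: comparing rol(x, c1) against c2 gives the same
// answer as comparing x against the pre-rotated constant ror(c2, c1).
constexpr bool foldAgrees(uint32_t x, unsigned c1, uint32_t c2) {
  return (rotl32(x, c1) == c2) == (x == rotr32(c2, c1));
}
```

With the values from the test in this PR (rotate amount 30, case constant 5), `rotr32(5, 30)` yields 20, which matches the rewritten case value in the CHECK lines.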
Full diff: https://github.com/llvm/llvm-project/pull/86307.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 7c40fb4fc86082..b6611cfbbfc1f4 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -3645,6 +3645,16 @@ Instruction *InstCombinerImpl::visitSwitchInst(SwitchInst &SI) {
}
}
+ // Fold 'switch(rol(x, C1)) case C2:' to 'switch(x) case rol(C2, -C1):'
+ if (match(Cond,
+ m_FShl(m_Value(Op0), m_Deferred(Op0), m_ConstantInt(ShiftAmt)))) {
+ for (auto &Case : SI.cases()) {
+ const APInt NewCase = Case.getCaseValue()->getValue().rotr(ShiftAmt);
+ Case.setValue(ConstantInt::get(SI.getContext(), NewCase));
+ }
+ return replaceOperand(SI, 0, Op0);
+ }
+
KnownBits Known = computeKnownBits(Cond, 0, &SI);
unsigned LeadingKnownZeros = Known.countMinLeadingZeros();
unsigned LeadingKnownOnes = Known.countMinLeadingOnes();
diff --git a/llvm/test/Transforms/InstCombine/switch-rol.ll b/llvm/test/Transforms/InstCombine/switch-rol.ll
new file mode 100644
index 00000000000000..1cd55ff91c9492
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/switch-rol.ll
@@ -0,0 +1,33 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+declare void @dummy()
+
+define i32 @switch_rol(i32 %a) #0 {
+; CHECK-LABEL: define i32 @switch_rol(
+; CHECK-SAME: i32 [[A:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: switch i32 [[A]], label [[DEFAULT:%.*]] [
+; CHECK-NEXT: i32 0, label [[TRAP_EXIT:%.*]]
+; CHECK-NEXT: i32 20, label [[TRAP_EXIT]]
+; CHECK-NEXT: ]
+; CHECK: default:
+; CHECK-NEXT: call void @dummy()
+; CHECK-NEXT: br label [[TRAP_EXIT]]
+; CHECK: trap.exit:
+; CHECK-NEXT: ret i32 0
+;
+entry:
+ %rol = call i32 @llvm.fshl.i32(i32 %a, i32 %a, i32 30)
+ switch i32 %rol, label %default [
+ i32 0, label %trap.exit
+ i32 5, label %trap.exit
+ ]
+
+default:
+ call void @dummy()
+ br label %trap.exit
+
+trap.exit:
+ ret i32 0
+}
|
This change looks fine, although it would be nice if we had a helper that generically detects whether an op is reversible, so we could generalize all these cases. |
Would also be useful for things like |
It has been supported. llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp Lines 3621 to 3633 in bbcfe6f
|
Err I don't mean that to necessarily new support in |
LGTM. |
See #86346 for what I mean. |
@YanWQ-monad I'd like to move the logic into SimplifyCFG. Then we will run into an infinite loop if the existing code in SimplifyCFG reverts your fold :) |
Shouldn't that also apply for all the other cases we handle here? |
No. We should leave other optimizations in InstCombine because they don't modify DomTree and some of them need DT information. I suggest we implement this fold in SimplifyCFG so as to make it easier to see the impact. |
I don't really see your point. How is this fold different from the |
I don't really like the idea of SimplifyCFG and InstCombine changing the code back and forth. Maybe we should limit the fold to the case where it will not make the switch less dense? |
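One way to picture the "not less dense" restriction is to compare the span of the case values before and after the rotation. The sketch below is only an illustration under stated assumptions: `caseSpan` and `foldKeepsDensity` are hypothetical helpers, the span of the case values is a simplified stand-in for density, and none of this is the heuristic SimplifyCFG or the backend actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Rotate a 32-bit value right by n bits (n is taken modulo 32).
constexpr uint32_t rotr32(uint32_t v, unsigned n) {
  n &= 31;
  return n == 0 ? v : (v >> n) | (v << (32 - n));
}

// Width of the smallest interval covering all case values. For a fixed
// number of cases, a smaller span means a denser switch.
// Assumption: the case list is non-empty.
uint64_t caseSpan(const std::vector<uint32_t> &cases) {
  auto [lo, hi] = std::minmax_element(cases.begin(), cases.end());
  return uint64_t(*hi) - uint64_t(*lo) + 1;
}

// Hypothetical guard: allow the fold only if rewriting every case C2 to
// ror(C2, c1) does not widen the span of the case values.
bool foldKeepsDensity(const std::vector<uint32_t> &cases, unsigned c1) {
  if (cases.empty())
    return true; // nothing to rewrite
  std::vector<uint32_t> rotated;
  rotated.reserve(cases.size());
  for (uint32_t c : cases)
    rotated.push_back(rotr32(c, c1));
  return caseSpan(rotated) <= caseSpan(cases);
}
```

Under this simplified metric, the test case in this PR (cases 0 and 5, rotate amount 30) would be rejected, since the rewritten cases 0 and 20 cover a wider span. A successor-count limit is a different kind of restriction.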
It's rare that the fold will make the switch dense, at least as the issue shows, and in dtcxzyw's benchmark. So a more reasonable solution would be to limit the fold so that it won't affect If so, |
I think folding it if the number of reachable successors <= 3 is OK, because the backend doesn't create a lookup table in this case, and the real-world case in the original issue also has only 3 successors. |