[SimplifyCFG] Simplify conditional branches on const icmp eq's #73334
@llvm/pr-subscribers-coroutines @llvm/pr-subscribers-llvm-transforms
Author: None (yonillasky)
Changes
The issue that is being addressed here is shown in the @intersection_block_with_dead_predecessor test.
Before this fix, what happens there is that:
I am fixing that by explicitly doing the necessary constant-folding for this specific case.
Full diff: https://github.com/llvm/llvm-project/pull/73334.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 3bcd896639a8ec2..2fe0c281662aa36 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -3578,6 +3578,13 @@ static bool FoldTwoEntryPHINode(PHINode *PN, const TargetTransformInfo &TTI,
   return true;
 }
+static BranchInst *decayCondBranchToUncondBranch(IRBuilderBase &Builder, BranchInst *BI, bool Eval) {
+  unsigned SuccessorIdx = (Eval) ? 0 : 1;
+  auto *NewBI = Builder.CreateBr(BI->getSuccessor(SuccessorIdx));
+  BI->eraseFromParent();
+  return NewBI;
+}
+
 static Value *createLogicalOp(IRBuilderBase &Builder,
                               Instruction::BinaryOps Opc, Value *LHS,
                               Value *RHS, const Twine &Name = "") {
@@ -7325,6 +7332,17 @@ bool SimplifyCFGOpt::simplifyCondBranch(BranchInst *BI, IRBuilder<> &Builder) {
if (mergeConditionalStores(PBI, BI, DTU, DL, TTI))
return requestResimplify();
+  // Check if the condition is an equality between two constants. This can form due to other
+  // CFGSimplify steps, and may prevent further simplification if we don't deal with it here.
+  if (auto ICmp = dyn_cast<ICmpInst>(BI->getCondition()))
+    if (ICmp->getPredicate() == CmpInst::ICMP_EQ)
+      if (auto *LHS = dyn_cast<ConstantInt>(ICmp->getOperand(0)))
+        if (auto *RHS = dyn_cast<ConstantInt>(ICmp->getOperand(1))) {
+          bool CondEval = LHS->getZExtValue() == RHS->getZExtValue();
+          decayCondBranchToUncondBranch(Builder, BI, CondEval);
+          return requestResimplify();
+        }
+
   return false;
 }
diff --git a/llvm/test/Transforms/SimplifyCFG/constant-valued-cond-br.ll b/llvm/test/Transforms/SimplifyCFG/constant-valued-cond-br.ll
new file mode 100644
index 000000000000000..51262dab3ab0436
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/constant-valued-cond-br.ll
@@ -0,0 +1,47 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes=simplifycfg -simplifycfg-require-and-preserve-domtree=1 -S | FileCheck %s
+
+define void @const_valued_cond_br(ptr %P) {
+; CHECK-LABEL: define void @const_valued_cond_br(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 42, 42
+; CHECK-NEXT: store i32 123, ptr [[P]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+  %cond = icmp eq i32 42, 42
+  br i1 %cond, label %a, label %b
+a:
+  store i32 123, ptr %P
+  br label %b
+b:
+  ret void
+}
+
+
+
+define void @intersection_block_with_dead_predecessor(ptr %P) {
+; CHECK-LABEL: define void @intersection_block_with_dead_predecessor(
+; CHECK-SAME: ptr [[P:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 1, 1
+; CHECK-NEXT: store i32 321, ptr [[P]], align 4
+; CHECK-NEXT: ret void
+;
+entry:
+  br label %b
+b:
+  %x = phi i32 [1, %entry], [2, %a]
+  switch i32 %x, label %c [
+  i32 1, label %d
+  ]
+c:
+  store i32 123, ptr %P
+  ret void
+d:
+  store i32 321, ptr %P
+  ret void
+a: ; unreachable
+  br label %b
+}
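As a rough illustration of the second SimplifyCFG.cpp hunk above, the fold can be modelled outside LLVM in plain C++. The types `ToyICmpEq` and `ToyCondBranch` and the function `foldConstEqBranch` are hypothetical stand-ins invented for this sketch. They are not LLVM classes, and the real patch operates on `BranchInst`, `ICmpInst` and `ConstantInt` instead.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for IR concepts; none of these are LLVM types.
struct ToyICmpEq {
  std::optional<uint64_t> LHS; // std::nullopt models a non-constant operand
  std::optional<uint64_t> RHS;
};

struct ToyCondBranch {
  ToyICmpEq Cond;
  int TrueSucc;  // block id taken when the condition is true
  int FalseSucc; // block id taken when the condition is false
};

// Returns the only successor that can be reached when both operands of the
// equality are constants, or std::nullopt when the branch must stay
// conditional.
std::optional<int> foldConstEqBranch(const ToyCondBranch &BI) {
  if (!BI.Cond.LHS || !BI.Cond.RHS)
    return std::nullopt;
  bool CondEval = *BI.Cond.LHS == *BI.Cond.RHS;
  return CondEval ? BI.TrueSucc : BI.FalseSucc;
}
```

With operands 42 and 42, the model keeps only the true successor, which mirrors how @const_valued_cond_br ends up with just the %a path in the CHECK lines.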
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from b2af341 to dd500ad.
What is the broader motivation for this? That is, why is it important to handle this within a single SimplifyCFG invocation? We'd probably want to have a PhaseOrdering test for that.
I'm not entirely sure whether this is the best approach. A possible alternative would be to instead perform additional simplification when removing the unreachable block and simplifying the phi to a constant. We could try to simplify all instructions that used the phi at that point.
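The alternative described above (simplify the phi when its dead predecessor is removed, then immediately revisit the phi's users) can be sketched in plain C++. This is a minimal model under stated assumptions: `ToyPhi`, `removePredAndSimplify` and `foldSingleCaseSwitch` are hypothetical names for this sketch only, block ids are plain integers, and nothing here is LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical model of a phi whose incoming values are all constants: each
// entry is (predecessor block id, incoming constant). Not an LLVM type.
struct ToyPhi {
  std::vector<std::pair<int, uint64_t>> Incoming;
};

// Drops the entries that come from DeadPred. If every remaining entry carries
// the same constant, the phi is equivalent to that constant and it is
// returned; otherwise std::nullopt.
std::optional<uint64_t> removePredAndSimplify(ToyPhi &Phi, int DeadPred) {
  auto &In = Phi.Incoming;
  In.erase(std::remove_if(In.begin(), In.end(),
                          [DeadPred](const std::pair<int, uint64_t> &E) {
                            return E.first == DeadPred;
                          }),
           In.end());
  if (In.empty())
    return std::nullopt;
  uint64_t First = In.front().second;
  for (const auto &E : In)
    if (E.second != First)
      return std::nullopt;
  return First;
}

// Models re-simplifying a user of the phi right away: a single-case switch
// on the phi picks CaseSucc when the now-constant value matches CaseValue,
// and DefaultSucc otherwise.
int foldSingleCaseSwitch(uint64_t PhiValue, uint64_t CaseValue, int CaseSucc,
                         int DefaultSucc) {
  return PhiValue == CaseValue ? CaseSucc : DefaultSucc;
}
```

Applied to the shape of @intersection_block_with_dead_predecessor, removing the edge from %a leaves the phi with the single value 1, so the switch on it can be resolved to %d in the same step.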
Force-pushed from dd500ad to f52b501.
Pay close attention to the test. To fully simplify the test case with existing code you'd need to run SimplifyCFG first, then a pass that folds constants, then SimplifyCFG again.
I like the idea. However, how about specifically attempting to constant-fold the successor's BranchInst condition, if it has one, then?
So ... the suggested alternative is to replace all calls to
It is generally necessary to run SimplifyCFG, InstCombine and SimplifyCFG again to get most optimization opportunities. It can make sense to short-cut this in specific cases for phase ordering reasons, but we should have specific motivation for that. That's why I'm asking what your original motivation here is. Another possibility is to make InstCombine slightly stronger so it can fold the phi away without a prior SimplifyCFG run. In fact, it already does so for blocks that are dynamically unreachable, but fails to do that for blocks that are statically unreachable. That difference is not intended and is just an implementation artifact.
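The phase-ordering point above can be pictured with a deliberately tiny state machine. This is only a sketch under strong assumptions: the `Cond` states and the functions `toySimplifyCFG` and `toyConstantFold` are hypothetical labels made up here, and they abstract away almost everything the real SimplifyCFG and InstCombine passes do.

```cpp
#include <cassert>

// Hypothetical, heavily simplified states of one branch condition. These are
// labels for this sketch only and do not correspond to LLVM data structures.
enum class Cond {
  ComparesPhi,       // icmp eq %phi, C, with a dead incoming edge on %phi
  ComparesConstants, // icmp eq C1, C2, after the phi collapsed to a constant
  LiteralBool,       // true or false, after constant folding
  BranchRemoved      // the conditional branch became unconditional
};

// Toy CFG simplifier: removes the dead edge (collapsing the phi) and removes
// branches on literal booleans, but never folds a comparison itself.
Cond toySimplifyCFG(Cond C) {
  switch (C) {
  case Cond::ComparesPhi:
    return Cond::ComparesConstants;
  case Cond::LiteralBool:
    return Cond::BranchRemoved;
  default:
    return C;
  }
}

// Toy constant folder: only folds comparisons whose operands are constants.
Cond toyConstantFold(Cond C) {
  return C == Cond::ComparesConstants ? Cond::LiteralBool : C;
}
```

In this model, running `toySimplifyCFG` twice in a row stalls at `ComparesConstants`, while the sequence simplify, fold, simplify reaches `BranchRemoved`. That is the gap a short-cut inside a single SimplifyCFG invocation would close.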
The scenario that prompted me to make the change was in CoroCleanup, which explicitly makes a SimplifyCFG call on all affected coroutines; I saw it fail to fully optimize the CFG in a certain scenario. Should it expect that InstCombine / SimplifyCFG will run afterwards and fix whatever problems remain? I thought the simplify call being there meant that the coro pipeline was expected to output "cleaned up" code already, since there are very few passes after it, but honestly I'm not that familiar with all the various pipelines and pass schedules that LLVM has.
I think I can make the change you suggested without too much trouble. I'll do it and update the PR shortly.
I've sort of managed to implement the CR suggestion.
Force-pushed from f52b501 to df232c2.
Force-pushed from 2b8d95c to 7702be4.
Force-pushed from 7702be4 to 73edb08.
The issue that is being addressed here is shown in the @intersection_block_with_dead_predecessor test.
Before this fix, what happens there is that:
%a is dead, and deletes it
%c is a dead block)
I am fixing that by explicitly doing the necessary constant-folding for this specific case.