-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[SimplifyCFG] Don't perform "redirecting phis between unmergeable BBs and successor BB" optimization when not using jump tables. #160560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… and successor BB" optimization when not using jump tables. This optimization works well for most cases but implicitly assumes that the backend will be able to generate jump tables (see "And lookup table optimization should convert it into add %arg 1." in PR #67275 description). When -fno-jump-tables is used however this regresses code size, which is especially painful in targets like armv7 where there's already high register pressure. rdar://157858055
This stack of pull requests is managed by Graphite. Learn more about stacking. |
@llvm/pr-subscribers-llvm-transforms Author: Amara Emerson (aemerson) ChangesThis optimization works well for most cases but implicitly assumes that the When -fno-jump-tables is used however this regresses code size, which is rdar://157858055 Full diff: https://github.com/llvm/llvm-project/pull/160560.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/Local.cpp b/llvm/lib/Transforms/Utils/Local.cpp
index 123881e276584..1a3d543160137 100644
--- a/llvm/lib/Transforms/Utils/Local.cpp
+++ b/llvm/lib/Transforms/Utils/Local.cpp
@@ -1159,6 +1159,12 @@ bool llvm::TryToSimplifyUncondBranchFromEmptyBlock(BasicBlock *BB,
bool BBPhisMergeable = BBKillable || CanRedirectPredsOfEmptyBBToSucc(
BB, Succ, BBPreds, CommonPred);
+ // If the function has the "no-jump-tables" attribute, merging phis
+ // can make size worse at -Oz.
+ auto *MF = BB->getParent();
+ if (MF->hasMinSize() && MF->hasFnAttribute("no-jump-tables"))
+ BBPhisMergeable = false;
+
if ((!BBKillable && !BBPhisMergeable) || introduceTooManyPhiEntries(BB, Succ))
return false;
diff --git a/llvm/test/Transforms/SimplifyCFG/branch-fold.ll b/llvm/test/Transforms/SimplifyCFG/branch-fold.ll
index 8e7b91ea172be..2f4c5d9590e0e 100644
--- a/llvm/test/Transforms/SimplifyCFG/branch-fold.ll
+++ b/llvm/test/Transforms/SimplifyCFG/branch-fold.ll
@@ -145,8 +145,42 @@ Succ:
ret i8 %phi2
}
+define i8 @common_pred_no_jt(i8 noundef %arg, i1 %c1, i1 %c2) #0 {
+; CHECK-LABEL: @common_pred_no_jt(
+; CHECK-NEXT: Pred:
+; CHECK-NEXT: call void @dummy()
+; CHECK-NEXT: br i1 [[C1:%.*]], label [[COMMONPRED:%.*]], label [[SUCC:%.*]]
+; CHECK: CommonPred:
+; CHECK-NEXT: call void @dummy()
+; CHECK-NEXT: br i1 [[C2:%.*]], label [[SUCC1:%.*]], label [[SUCC]]
+; CHECK: BB:
+; CHECK-NEXT: [[PHI1:%.*]] = phi i8 [ 0, [[PRED:%.*]] ], [ 1, [[COMMONPRED]] ]
+; CHECK-NEXT: br label [[SUCC1]]
+; CHECK: Succ:
+; CHECK-NEXT: [[PHI2:%.*]] = phi i8 [ [[PHI1]], [[SUCC]] ], [ 4, [[COMMONPRED]] ]
+; CHECK-NEXT: ret i8 [[PHI2]]
+;
+Pred:
+call void @dummy()
+ br i1 %c1, label %CommonPred, label %BB
+
+CommonPred:
+call void @dummy()
+ br i1 %c2, label %Succ, label %BB
+
+BB:
+ %phi1 = phi i8 [ 0, %Pred ], [1, %CommonPred]
+ br label %Succ
+
+Succ:
+ %phi2 = phi i8 [ %phi1, %BB ], [ 4, %CommonPred ]
+ ret i8 %phi2
+}
declare void @dummy()
+attributes #0 = { minsize "no-jump-tables"="true" }
+
+
!0 = !{!"branch_weights", i32 3, i32 7}
!1 = !{!"branch_weights", i32 11, i32 4}
;.
|
I'm not really clear about what no-jump-tables has to do with this transform. The specific example from #67275 does benefit from a switch to arithmetic conversion -- which, since recently, is not actually affected by I also checked the codegen for you test case with and without this transform on armv7, and it seems to be the same: https://llvm.godbolt.org/z/673dbfGKs This particular example doesn't even generate a code structure where jump tables would be relevant. |
Right, sorry I should have made it clearer that the size issue isn't demonstrated by the test case, that was just trying to ensure that the general transform doesn't happen without jump tables. I've got a reduced test case that shows a significant (7%+) size difference when compiling with and without this change. I'm attaching the interestingness script used to reduce this down. GitHub won't let me attach IR so I've put it here: https://godbolt.org/z/7baoK9j1a Compiling that with:
should show the issue. |
Since this probably isn't the right solution I'm going to close this. |
This optimization works well for most cases but implicitly assumes that the
backend will be able to generate jump tables (see "And lookup table optimization
should convert it into add %arg 1." in PR #67275 description).
When -fno-jump-tables is used however this regresses code size, which is
especially painful in targets like armv7 where there's already high register
pressure.
rdar://157858055