[GlobalMerge] add a command to force global merge #168231
@llvm/pr-subscribers-backend-arm
Author: Austin (Zhenhang1213)
Changes
I've found that in certain performance scenarios, particularly at -O2, this PR can significantly improve the efficiency of loading global variables.
Full diff: https://github.com/llvm/llvm-project/pull/168231.diff
2 Files Affected:
diff --git a/llvm/lib/CodeGen/GlobalMerge.cpp b/llvm/lib/CodeGen/GlobalMerge.cpp
index e58d7e344c28b..ef7b1c758ff71 100644
--- a/llvm/lib/CodeGen/GlobalMerge.cpp
+++ b/llvm/lib/CodeGen/GlobalMerge.cpp
@@ -111,6 +111,11 @@ EnableGlobalMerge("enable-global-merge", cl::Hidden,
cl::desc("Enable the global merge pass"),
cl::init(true));
+static cl::opt<bool>
+ForceEnableGlobalMerge("force-enable-global-merge", cl::Hidden,
+ cl::desc("Force enable the global merge, regardless of the optimization level"),
+ cl::init(false));
+
static cl::opt<unsigned>
GlobalMergeMaxOffset("global-merge-max-offset", cl::Hidden,
cl::desc("Set maximum offset for global merge pass"),
@@ -374,7 +379,8 @@ bool GlobalMergeImpl::doMerge(SmallVectorImpl<GlobalVariable *> &Globals,
Function *ParentFn = I->getParent()->getParent();
// If we're only optimizing for size, ignore non-minsize functions.
- if (Opt.SizeOnly && !ParentFn->hasMinSize())
+ // And add a config to force global merge
+ if (!ForceEnableGlobalMerge && (Opt.SizeOnly && !ParentFn->hasMinSize()))
continue;
size_t UGSIdx = GlobalUsesByFunction[ParentFn];
diff --git a/llvm/test/CodeGen/ARM/force-global-merge.ll b/llvm/test/CodeGen/ARM/force-global-merge.ll
new file mode 100644
index 0000000000000..a7b791dc0a634
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/force-global-merge.ll
@@ -0,0 +1,23 @@
+; RUN: llc -mtriple=arm-eabi -force-enable-global-merge %s -o - | FileCheck %s
+
+@g_value1 = dso_local local_unnamed_addr global i32 0, align 4
+@g_value2 = dso_local local_unnamed_addr global i32 0, align 4
+@g_value3 = dso_local local_unnamed_addr global i32 0, align 4
+@g_value4 = dso_local local_unnamed_addr global i32 0, align 4
+
+define dso_local i32 @foo1() local_unnamed_addr {
+entry:
+ %0 = load i32, ptr @g_value1, align 4
+ %1 = load i32, ptr @g_value2, align 4
+ %2 = load i32, ptr @g_value3, align 4
+ %3 = load i32, ptr @g_value4, align 4
+ %call = tail call i32 @foo(i32 %0, i32 %1, i32 %2, i32 %3)
+ ret i32 %call
+}
+
+declare i32 @foo(i32, i32, i32, i32)
+
+; CHECK: ldr [[BASE:r[0-9]+]], .LCPI0_0
+; CHECK: ldm [[BASE]], {[[R0:r[0-9]+]], [[R1:r[0-9]+]], [[R2:r[0-9]+]], [[R3:r[0-9]+]]}
+; CHECK: .LCPI0_0:
+; CHECK-NEXT: .long .L_MergedGlobals
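The changed condition in `doMerge` can be read as a small truth table. The sketch below is only a standalone model of that one predicate, not LLVM code: `MergeQuery` and `skipsUse` are hypothetical names introduced here, and the three fields stand in for `ForceEnableGlobalMerge`, `Opt.SizeOnly`, and `ParentFn->hasMinSize()` from the diff.

```cpp
// Hypothetical stand-ins for the three values the patched check reads.
struct MergeQuery {
  bool ForceEnable; // models the new -force-enable-global-merge option
  bool SizeOnly;    // models Opt.SizeOnly
  bool HasMinSize;  // models ParentFn->hasMinSize()
};

// Returns true when the use is skipped, i.e. the `continue` branch is taken.
constexpr bool skipsUse(MergeQuery Q) {
  return !Q.ForceEnable && (Q.SizeOnly && !Q.HasMinSize);
}

// Without the new option, size-only mode skips uses in non-minsize functions.
static_assert(skipsUse({false, true, false}), "skipped before the patch");
// With the option set, the same use is no longer skipped.
static_assert(!skipsUse({true, true, false}), "forced on");
// Minsize functions and non-size-only mode are unaffected by the option.
static_assert(!skipsUse({false, true, true}), "minsize function is kept");
static_assert(!skipsUse({false, false, false}), "not size-only, kept");
```

Under this model the option only changes the outcome in one case: size-only mode with a function that is not marked minsize.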
✅ With the latest revision this PR passed the C/C++ code formatter.
db5321b to ea12a0a
ea12a0a to 1ee11e4
cc @davemgreen
Does -arm-global-merge already do this by disabling OnlyOptimizeForSize?
Thanks, I got it.