[AArch64] Merge globals when optimising for size
Merging extern globals reduces code size. It may well help performance too,
but one benchmark regresses and still needs investigating, so for now we
enable it only when optimising for size.

Patch by Ramakota Reddy and Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D61947

llvm-svn: 363130
Sjoerd Meijer committed Jun 12, 2019
1 parent f763102 commit de73404
Showing 2 changed files with 35 additions and 1 deletion.
15 changes: 14 additions & 1 deletion llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
@@ -462,7 +462,20 @@ bool AArch64PassConfig::addPreISel() {
       EnableGlobalMerge == cl::BOU_TRUE) {
     bool OnlyOptimizeForSize = (TM->getOptLevel() < CodeGenOpt::Aggressive) &&
                                (EnableGlobalMerge == cl::BOU_UNSET);
-    addPass(createGlobalMergePass(TM, 4095, OnlyOptimizeForSize));
+
+    // Merging of extern globals is enabled by default on non-Mach-O as we
+    // expect it to be generally either beneficial or harmless. On Mach-O it
+    // is disabled as we emit the .subsections_via_symbols directive which
+    // means that merging extern globals is not safe.
+    bool MergeExternalByDefault = !TM->getTargetTriple().isOSBinFormatMachO();
+
+    // FIXME: extern global merging is only enabled when we optimise for size
+    // because there are some regressions with it also enabled for performance.
+    if (!OnlyOptimizeForSize)
+      MergeExternalByDefault = false;
+
+    addPass(createGlobalMergePass(TM, 4095, OnlyOptimizeForSize,
+                                  MergeExternalByDefault));
   }
 
   return false;
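
The flag logic in this hunk can be restated as a small standalone function. This is only a sketch for reading the patch, not LLVM code: `OptLevel`, `BoolOrUnset`, `MergeFlags`, `decideMergeFlags` and `checkExamples` are hypothetical stand-ins for `CodeGenOpt`, `cl::boolOrDefault` and the values passed to `createGlobalMergePass`, and the enclosing `if` condition, which is only partly visible in the hunk, is deliberately not modelled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the LLVM enums used in the hunk.
enum class OptLevel { None, Less, Default, Aggressive };
enum class BoolOrUnset { Unset, True, False };

// The two booleans the hunk passes to the pass constructor.
struct MergeFlags {
  bool OnlyOptimizeForSize;
  bool MergeExternalByDefault;
};

// Mirrors the body of the if-block in the hunk above.
MergeFlags decideMergeFlags(OptLevel Opt, BoolOrUnset EnableGlobalMerge,
                            bool IsMachO) {
  MergeFlags F;
  // Below the aggressive level, and with no explicit user choice, merging is
  // restricted to functions optimised for size.
  F.OnlyOptimizeForSize = (Opt < OptLevel::Aggressive) &&
                          (EnableGlobalMerge == BoolOrUnset::Unset);
  // Extern merging is considered unsafe on Mach-O.
  F.MergeExternalByDefault = !IsMachO;
  // The FIXME in the patch: extern merging only in the size-only mode.
  if (!F.OnlyOptimizeForSize)
    F.MergeExternalByDefault = false;
  return F;
}

// A few sanity checks on the cases discussed in the commit message.
void checkExamples() {
  // Default level, flag unset, ELF: size-only mode with extern merging.
  MergeFlags A = decideMergeFlags(OptLevel::Default, BoolOrUnset::Unset, false);
  assert(A.OnlyOptimizeForSize && A.MergeExternalByDefault);
  // Aggressive level: not size-only, so extern merging stays off.
  MergeFlags B = decideMergeFlags(OptLevel::Aggressive, BoolOrUnset::Unset, false);
  assert(!B.OnlyOptimizeForSize && !B.MergeExternalByDefault);
  // Mach-O: extern merging is off regardless of the size-only mode.
  MergeFlags C = decideMergeFlags(OptLevel::Default, BoolOrUnset::Unset, true);
  assert(C.OnlyOptimizeForSize && !C.MergeExternalByDefault);
  // Explicitly enabled by the user: not size-only, so extern merging is off.
  MergeFlags D = decideMergeFlags(OptLevel::Default, BoolOrUnset::True, false);
  assert(!D.OnlyOptimizeForSize && !D.MergeExternalByDefault);
}
```

Read this way, the patch only ever sets `MergeExternalByDefault` when the target is not Mach-O and the pass is in its size-only mode.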
21 changes: 21 additions & 0 deletions llvm/test/CodeGen/AArch64/global-merge-minsize.ll
@@ -0,0 +1,21 @@
+; RUN: llc %s -o - -verify-machineinstrs | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-arm-none-eabi"
+
+@global0 = dso_local local_unnamed_addr global i32 0, align 4
+@global1 = dso_local local_unnamed_addr global i32 0, align 4
+
+define dso_local i32 @func() minsize optsize {
+; CHECK-LABEL: @func
+; CHECK:       adrp x8, .L_MergedGlobals
+; CHECK-NEXT:  add x8, x8, :lo12:.L_MergedGlobals
+; CHECK-NEXT:  ldp w9, w8, [x8]
+; CHECK-NEXT:  add w0, w8, w9
+; CHECK-NEXT:  ret
+entry:
+  %0 = load i32, i32* @global0, align 4
+  %1 = load i32, i32* @global1, align 4
+  %add = add nsw i32 %1, %0
+  ret i32 %add
+}
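
The CHECK lines expect a single `adrp`/`add` pair that forms the address of `.L_MergedGlobals`, followed by one `ldp` that loads both values. A rough C++ picture of what the merge amounts to is sketched below. The `MergedGlobals` struct and the function names are hypothetical and only illustrate the idea of placing both variables in one object; they are not what the pass literally emits.

```cpp
#include <cstddef>
#include <cstdint>

// Before merging: two independent globals, each addressed through its own
// symbol.
int32_t global0 = 0;
int32_t global1 = 0;

int32_t funcSeparate() { return global1 + global0; }

// After merging, conceptually: both variables live in one object (a
// hypothetical stand-in for .L_MergedGlobals), so one base address is enough.
struct MergedGlobals {
  int32_t global0;
  int32_t global1;
};
MergedGlobals merged = {0, 0};

// The two fields are adjacent 4-byte values, which is what lets a single
// load-pair instruction fetch both.
static_assert(offsetof(MergedGlobals, global1) == 4,
              "global1 sits directly after global0");

int32_t funcMerged() {
  const MergedGlobals *Base = &merged;   // one address computation
  return Base->global1 + Base->global0;  // both loads use fixed offsets
}
```

Both functions compute the same sum; the merged form just needs one address materialisation instead of two, which is where the code-size saving comes from.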
