Skip to content

Commit

Permalink
[ThinLTO] Ensure we always select the same function copy to import
Browse files Browse the repository at this point in the history
In order to always import the same copy of a linkonce function,
even when encountering it with different thresholds (a higher one then a
lower one), keep track of the summary we decided to import.
This ensures that the backend only gets a single definition to import
for each GUID, so that it doesn't need to choose one.

Move the largest threshold the GUID was considered for import into the
current module out of the ImportMap (which is part of a larger map
maintained across the whole index), and into a new map just maintained
for the current module we are computing imports for. This saves some
memory since we no longer have the thresholds maintained across the
whole index (and throughout the in-process backends when doing a normal
non-distributed ThinLTO build), at the cost of some additional
information being maintained for each invocation of ComputeImportForModule
(the selected summary pointer for each import).

There is an additional map lookup for each callee being considered for
importing, however, this was able to subsume a map lookup in the
Worklist iteration that invokes computeImportForFunction. We also are
able to avoid calling selectCallee if we already failed to import at the
same or higher threshold.

I compared the run time and peak memory for the SPEC2006 471.omnetpp
benchmark (running in-process ThinLTO backends), as well as for a large
internal benchmark with a distributed ThinLTO build (so just looking at
the thin link time/memory). Across a number of runs with and without
this change there was no significant change in the time and memory.

(I tried a few other variations of the change but they also didn't
improve time or peak memory).

Reviewers: davidxl

Subscribers: mehdi_amini, inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D48670

llvm-svn: 337050
  • Loading branch information
teresajohnson committed Jul 13, 2018
1 parent d66941f commit d94c059
Show file tree
Hide file tree
Showing 7 changed files with 192 additions and 77 deletions.
16 changes: 11 additions & 5 deletions llvm/include/llvm/Transforms/IPO/FunctionImport.h
Expand Up @@ -33,11 +33,17 @@ class Module;
/// based on the provided summary informations.
class FunctionImporter {
public:
/// Set of functions to import from a source module. Each entry is a map
/// containing all the functions to import for a source module.
/// The keys is the GUID identifying a function to import, and the value
/// is the threshold applied when deciding to import it.
using FunctionsToImportTy = std::map<GlobalValue::GUID, unsigned>;
/// Set of functions to import from a source module. Each entry is a set
/// containing all the GUIDs of all functions to import for a source module.
using FunctionsToImportTy = std::unordered_set<GlobalValue::GUID>;

/// Map of callee GUID considered for import into a given module to a pair
/// consisting of the largest threshold applied when deciding whether to
/// import it and, if we decided to import, a pointer to the summary instance
/// imported. If we decided not to import, the summary will be nullptr.
using ImportThresholdsTy =
DenseMap<GlobalValue::GUID,
std::pair<unsigned, const GlobalValueSummary *>>;

/// The map contains an entry for every module to import from, the key being
/// the module identifier to pass to the ModuleLoader. The value is the set of
Expand Down
4 changes: 2 additions & 2 deletions llvm/lib/LTO/LTO.cpp
Expand Up @@ -156,7 +156,7 @@ static void computeCacheKey(

AddUint64(Entry.second.size());
for (auto &Fn : Entry.second)
AddUint64(Fn.first);
AddUint64(Fn);
}

// Include the hash for the resolved ODR.
Expand Down Expand Up @@ -221,7 +221,7 @@ static void computeCacheKey(
// so we need to collect their used resolutions as well.
for (auto &ImpM : ImportList)
for (auto &ImpF : ImpM.second)
AddUsedThings(Index.findSummaryInModule(ImpF.first, ImpM.first()));
AddUsedThings(Index.findSummaryInModule(ImpF, ImpM.first()));

auto AddTypeIdSummary = [&](StringRef TId, const TypeIdSummary &S) {
AddString(TId);
Expand Down
157 changes: 88 additions & 69 deletions llvm/lib/Transforms/IPO/FunctionImport.cpp
Expand Up @@ -262,7 +262,7 @@ static void computeImportForReferencedGlobals(
!RefSummary->modulePath().empty() &&
!GlobalValue::isInterposableLinkage(RefSummary->linkage()) &&
RefSummary->refs().empty()) {
ImportList[RefSummary->modulePath()][VI.getGUID()] = 1;
ImportList[RefSummary->modulePath()].insert(VI.getGUID());
if (ExportLists)
(*ExportLists)[RefSummary->modulePath()].insert(VI.getGUID());
break;
Expand All @@ -278,7 +278,8 @@ static void computeImportForFunction(
const unsigned Threshold, const GVSummaryMapTy &DefinedGVSummaries,
SmallVectorImpl<EdgeInfo> &Worklist,
FunctionImporter::ImportMapTy &ImportList,
StringMap<FunctionImporter::ExportSetTy> *ExportLists = nullptr) {
StringMap<FunctionImporter::ExportSetTy> *ExportLists,
FunctionImporter::ImportThresholdsTy &ImportThresholds) {
computeImportForReferencedGlobals(Summary, DefinedGVSummaries, ImportList,
ExportLists);
static int ImportCount = 0;
Expand Down Expand Up @@ -315,19 +316,85 @@ static void computeImportForFunction(
const auto NewThreshold =
Threshold * GetBonusMultiplier(Edge.second.getHotness());

auto *CalleeSummary = selectCallee(Index, VI.getSummaryList(), NewThreshold,
Summary.modulePath());
if (!CalleeSummary) {
LLVM_DEBUG(
dbgs() << "ignored! No qualifying callee with summary found.\n");
continue;
}
auto IT = ImportThresholds.insert(
std::make_pair(VI.getGUID(), std::make_pair(NewThreshold, nullptr)));
bool PreviouslyVisited = !IT.second;
auto &ProcessedThreshold = IT.first->second.first;
auto &CalleeSummary = IT.first->second.second;

const FunctionSummary *ResolvedCalleeSummary = nullptr;
if (CalleeSummary) {
assert(PreviouslyVisited);
// Since the traversal of the call graph is DFS, we can revisit a function
// a second time with a higher threshold. In this case, it is added back
// to the worklist with the new threshold (so that its own callee chains
// can be considered with the higher threshold).
if (NewThreshold <= ProcessedThreshold) {
LLVM_DEBUG(
dbgs() << "ignored! Target was already imported with Threshold "
<< ProcessedThreshold << "\n");
continue;
}
// Update with new larger threshold.
ProcessedThreshold = NewThreshold;
ResolvedCalleeSummary = cast<FunctionSummary>(CalleeSummary);
} else {
// If we already rejected importing a callee at the same or higher
// threshold, don't waste time calling selectCallee.
if (PreviouslyVisited && NewThreshold <= ProcessedThreshold) {
LLVM_DEBUG(
dbgs() << "ignored! Target was already rejected with Threshold "
<< ProcessedThreshold << "\n");
continue;
}

// "Resolve" the summary
const auto *ResolvedCalleeSummary = cast<FunctionSummary>(CalleeSummary->getBaseObject());
CalleeSummary = selectCallee(Index, VI.getSummaryList(), NewThreshold,
Summary.modulePath());
if (!CalleeSummary) {
// Update with new larger threshold if this was a retry (otherwise
// we would have already inserted with NewThreshold above).
if (PreviouslyVisited)
ProcessedThreshold = NewThreshold;
LLVM_DEBUG(
dbgs() << "ignored! No qualifying callee with summary found.\n");
continue;
}

assert(ResolvedCalleeSummary->instCount() <= NewThreshold &&
"selectCallee() didn't honor the threshold");
// "Resolve" the summary
CalleeSummary = CalleeSummary->getBaseObject();
ResolvedCalleeSummary = cast<FunctionSummary>(CalleeSummary);

assert(ResolvedCalleeSummary->instCount() <= NewThreshold &&
"selectCallee() didn't honor the threshold");

auto ExportModulePath = ResolvedCalleeSummary->modulePath();
auto ILI = ImportList[ExportModulePath].insert(VI.getGUID());
// We previously decided to import this GUID definition if it was already
// inserted in the set of imports from the exporting module.
bool PreviouslyImported = !ILI.second;

// Make exports in the source module.
if (ExportLists) {
auto &ExportList = (*ExportLists)[ExportModulePath];
ExportList.insert(VI.getGUID());
if (!PreviouslyImported) {
// This is the first time this function was exported from its source
// module, so mark all functions and globals it references as exported
// to the outside if they are defined in the same source module.
// For efficiency, we unconditionally add all the referenced GUIDs
// to the ExportList for this module, and will prune out any not
// defined in the module later in a single pass.
for (auto &Edge : ResolvedCalleeSummary->calls()) {
auto CalleeGUID = Edge.first.getGUID();
ExportList.insert(CalleeGUID);
}
for (auto &Ref : ResolvedCalleeSummary->refs()) {
auto GUID = Ref.getGUID();
ExportList.insert(GUID);
}
}
}
}

auto GetAdjustedThreshold = [](unsigned Threshold, bool IsHotCallsite) {
// Adjust the threshold for next level of imported functions.
Expand All @@ -342,44 +409,8 @@ static void computeImportForFunction(
Edge.second.getHotness() == CalleeInfo::HotnessType::Hot;
const auto AdjThreshold = GetAdjustedThreshold(Threshold, IsHotCallsite);

auto ExportModulePath = ResolvedCalleeSummary->modulePath();
auto &ProcessedThreshold = ImportList[ExportModulePath][VI.getGUID()];
/// Since the traversal of the call graph is DFS, we can revisit a function
/// a second time with a higher threshold. In this case, it is added back to
/// the worklist with the new threshold.
if (ProcessedThreshold && ProcessedThreshold >= AdjThreshold) {
LLVM_DEBUG(dbgs() << "ignored! Target was already seen with Threshold "
<< ProcessedThreshold << "\n");
continue;
}
bool PreviouslyImported = ProcessedThreshold != 0;
// Mark this function as imported in this module, with the current Threshold
ProcessedThreshold = AdjThreshold;

ImportCount++;

// Make exports in the source module.
if (ExportLists) {
auto &ExportList = (*ExportLists)[ExportModulePath];
ExportList.insert(VI.getGUID());
if (!PreviouslyImported) {
// This is the first time this function was exported from its source
// module, so mark all functions and globals it references as exported
// to the outside if they are defined in the same source module.
// For efficiency, we unconditionally add all the referenced GUIDs
// to the ExportList for this module, and will prune out any not
// defined in the module later in a single pass.
for (auto &Edge : ResolvedCalleeSummary->calls()) {
auto CalleeGUID = Edge.first.getGUID();
ExportList.insert(CalleeGUID);
}
for (auto &Ref : ResolvedCalleeSummary->refs()) {
auto GUID = Ref.getGUID();
ExportList.insert(GUID);
}
}
}

// Insert the newly imported function to the worklist.
Worklist.emplace_back(ResolvedCalleeSummary, AdjThreshold, VI.getGUID());
}
Expand All @@ -395,6 +426,7 @@ static void ComputeImportForModule(
// Worklist contains the list of function imported in this module, for which
// we will analyse the callees and may import further down the callgraph.
SmallVector<EdgeInfo, 128> Worklist;
FunctionImporter::ImportThresholdsTy ImportThresholds;

// Populate the worklist with the import for the functions in the current
// module
Expand All @@ -416,25 +448,18 @@ static void ComputeImportForModule(
LLVM_DEBUG(dbgs() << "Initialize import for " << VI << "\n");
computeImportForFunction(*FuncSummary, Index, ImportInstrLimit,
DefinedGVSummaries, Worklist, ImportList,
ExportLists);
ExportLists, ImportThresholds);
}

// Process the newly imported functions and add callees to the worklist.
while (!Worklist.empty()) {
auto FuncInfo = Worklist.pop_back_val();
auto *Summary = std::get<0>(FuncInfo);
auto Threshold = std::get<1>(FuncInfo);
auto GUID = std::get<2>(FuncInfo);

// Check if we later added this summary with a higher threshold.
// If so, skip this entry.
auto ExportModulePath = Summary->modulePath();
auto &LatestProcessedThreshold = ImportList[ExportModulePath][GUID];
if (LatestProcessedThreshold > Threshold)
continue;

computeImportForFunction(*Summary, Index, Threshold, DefinedGVSummaries,
Worklist, ImportList, ExportLists);
Worklist, ImportList, ExportLists,
ImportThresholds);
}
}

Expand All @@ -451,11 +476,6 @@ static bool isGlobalVarSummary(const ModuleSummaryIndex &Index,

static GlobalValue::GUID getGUID(GlobalValue::GUID G) { return G; }

static GlobalValue::GUID
getGUID(const std::pair<const GlobalValue::GUID, unsigned> &P) {
return P.first;
}

template <class T>
static unsigned numGlobalVarSummaries(const ModuleSummaryIndex &Index,
T &Cont) {
Expand Down Expand Up @@ -574,9 +594,8 @@ void llvm::ComputeCrossModuleImportForModuleFromIndex(
// e.g. record required linkage changes.
if (Summary->modulePath() == ModulePath)
continue;
// Doesn't matter what value we plug in to the map, just needs an entry
// to provoke importing by thinBackend.
ImportList[Summary->modulePath()][GUID] = 1;
// Add an entry to provoke importing by thinBackend.
ImportList[Summary->modulePath()].insert(GUID);
}
#ifndef NDEBUG
dumpImportListForModule(Index, ModulePath, ImportList);
Expand Down Expand Up @@ -698,10 +717,10 @@ void llvm::gatherImportedSummariesForModule(
const auto &DefinedGVSummaries =
ModuleToDefinedGVSummaries.lookup(ILI.first());
for (auto &GI : ILI.second) {
const auto &DS = DefinedGVSummaries.find(GI.first);
const auto &DS = DefinedGVSummaries.find(GI);
assert(DS != DefinedGVSummaries.end() &&
"Expected a defined summary for imported global value");
SummariesForIndex[GI.first] = DS->second;
SummariesForIndex[GI] = DS->second;
}
}
}
Expand Down
34 changes: 34 additions & 0 deletions llvm/test/Transforms/FunctionImport/Inputs/funcimport_resolved1.ll
@@ -0,0 +1,34 @@
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.11.0"

define void @foo() {
call void @linkonceodrfunc()
call void @linkonceodrfunc2()
ret void
}

define linkonce_odr void @linkonceodrfunc() {
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
ret void
}

define linkonce_odr void @linkonceodrfunc2() {
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
call void @f()
ret void
}

define internal void @f() {
ret void
}
@@ -0,0 +1,6 @@
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.11.0"

define linkonce_odr void @linkonceodrfunc() {
ret void
}
50 changes: 50 additions & 0 deletions llvm/test/Transforms/FunctionImport/funcimport_resolved.ll
@@ -0,0 +1,50 @@
; Test to ensure that we always select the same copy of a linkonce function
; when it is encountered with different thresholds. When we encounter the
; copy in funcimport_resolved1.ll with a higher threshold via the direct call
; from main(), it will be selected for importing. When we encounter it with a
; lower threshold by reaching it from the deeper call chain via foo(), it
; won't be selected for importing. We don't want to select both the copy from
; funcimport_resolved1.ll and the smaller one from funcimport_resolved2.ll,
; leaving it up to the backend to figure out which one to actually import.
; The linkonce_odr may have different instruction counts in practice due to
; different inlines in the compile step.

; Require asserts so we can use -debug-only
; REQUIRES: asserts

; RUN: opt -module-summary %s -o %t.bc
; RUN: opt -module-summary %p/Inputs/funcimport_resolved1.ll -o %t2.bc
; RUN: opt -module-summary %p/Inputs/funcimport_resolved2.ll -o %t3.bc

; First do a sanity check that all callees are imported with the default
; instruction limit
; RUN: llvm-lto2 run %t.bc %t2.bc %t3.bc -o %t4 -r=%t.bc,_main,pl -r=%t.bc,_linkonceodrfunc,l -r=%t.bc,_foo,l -r=%t2.bc,_foo,pl -r=%t2.bc,_linkonceodrfunc,pl -r=%t2.bc,_linkonceodrfunc2,pl -r=%t3.bc,_linkonceodrfunc,l -thinlto-threads=1 -debug-only=function-import 2>&1 | FileCheck %s --check-prefix=INSTLIMDEFAULT
; INSTLIMDEFAULT: Is importing function {{.*}} foo from {{.*}}funcimport_resolved1.ll
; INSTLIMDEFAULT: Is importing function {{.*}} linkonceodrfunc from {{.*}}funcimport_resolved1.ll
; INSTLIMDEFAULT: Is importing function {{.*}} linkonceodrfunc2 from {{.*}}funcimport_resolved1.ll
; INSTLIMDEFAULT: Is importing function {{.*}} f from {{.*}}funcimport_resolved1.ll
; INSTLIMDEFAULT-NOT: Is importing function {{.*}} linkonceodrfunc from {{.*}}funcimport_resolved2.ll

; Now run with the lower threshold that will only allow linkonceodrfunc to be
; imported from funcimport_resolved1.ll when encountered via the direct call
; from main(). Ensure we don't also select the copy in funcimport_resolved2.ll
; when it is encountered via the deeper call chain.
; RUN: llvm-lto2 run %t.bc %t2.bc %t3.bc -o %t4 -r=%t.bc,_main,pl -r=%t.bc,_linkonceodrfunc,l -r=%t.bc,_foo,l -r=%t2.bc,_foo,pl -r=%t2.bc,_linkonceodrfunc,pl -r=%t2.bc,_linkonceodrfunc2,pl -r=%t3.bc,_linkonceodrfunc,l -thinlto-threads=1 -debug-only=function-import -import-instr-limit=8 2>&1 | FileCheck %s --check-prefix=INSTLIM8
; INSTLIM8: Is importing function {{.*}} foo from {{.*}}funcimport_resolved1.ll
; INSTLIM8: Is importing function {{.*}} linkonceodrfunc from {{.*}}funcimport_resolved1.ll
; INSTLIM8: Not importing function {{.*}} linkonceodrfunc2 from {{.*}}funcimport_resolved1.ll
; INSTLIM8: Is importing function {{.*}} f from {{.*}}funcimport_resolved1.ll
; INSTLIM8-NOT: Is importing function {{.*}} linkonceodrfunc from {{.*}}funcimport_resolved2.ll

target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.11.0"

define i32 @main() #0 {
entry:
call void (...) @foo()
call void (...) @linkonceodrfunc()
ret i32 0
}

declare void @foo(...) #1
declare void @linkonceodrfunc(...) #1
2 changes: 1 addition & 1 deletion llvm/tools/llvm-link/llvm-link.cpp
Expand Up @@ -262,7 +262,7 @@ static bool importFunctions(const char *argv0, Module &DestModule) {
errs() << "Importing " << FunctionName << " from " << FileName << "\n";

auto &Entry = ImportList[FileName];
Entry.insert(std::make_pair(F->getGUID(), /* (Unused) threshold */ 1.0));
Entry.insert(F->getGUID());
}
auto CachedModuleLoader = [&](StringRef Identifier) {
return ModuleLoaderCache.takeModule(Identifier);
Expand Down

0 comments on commit d94c059

Please sign in to comment.