[InstrProf] Add options to profile function groups
Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i`, which partition functions into `N` groups and instrument only the functions in group `i`. Similar options were added to XRay in https://reviews.llvm.org/D87953, and the goal is the same: to reduce the size overhead of instrumentation by spreading it across multiple builds. Raw profiles from different groups can be merged as usual with the `llvm-profdata merge` command.

Reviewed By: ianlevesque

Differential Revision: https://reviews.llvm.org/D129594
ellishg committed Jul 14, 2022
1 parent 140bfdc commit af58684
Showing 10 changed files with 143 additions and 6 deletions.
4 changes: 4 additions & 0 deletions clang/docs/ClangCommandLineReference.rst
@@ -2329,6 +2329,10 @@ Use instrumentation data for profile-guided optimization

Filename defining the list of functions/files to instrument

.. option:: -fprofile-function-groups=<N>, -fprofile-selected-function-group=<i>

Partition functions into <N> groups and select only functions in group <i> to be instrumented

.. option:: -fprofile-remapping-file=<file>

Use the remappings described in <file> to match the profile data against names in the program
26 changes: 26 additions & 0 deletions clang/docs/UsersManual.rst
@@ -2513,6 +2513,32 @@ When the file contains only excludes, all files and functions except for the
excluded ones will be instrumented. Otherwise, only the files and functions
specified will be instrumented.

Instrument function groups
^^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes it is desirable to minimize the size overhead of instrumented
binaries. One way to do this is to partition functions into groups and only
instrument functions in a specified group. This can be done using the
`-fprofile-function-groups` and `-fprofile-selected-function-group` options.

.. option:: -fprofile-function-groups=<N>, -fprofile-selected-function-group=<i>

The following example uses 3 groups:

.. code-block:: console

  $ clang++ -Oz -fprofile-generate=group_0/ -fprofile-function-groups=3 -fprofile-selected-function-group=0 code.cc -o code.0
  $ clang++ -Oz -fprofile-generate=group_1/ -fprofile-function-groups=3 -fprofile-selected-function-group=1 code.cc -o code.1
  $ clang++ -Oz -fprofile-generate=group_2/ -fprofile-function-groups=3 -fprofile-selected-function-group=2 code.cc -o code.2

After collecting raw profiles from the three binaries, they can be merged into
a single profile as usual.

.. code-block:: console

  $ llvm-profdata merge -output=code.profdata group_*/*.profraw

Profile remapping
^^^^^^^^^^^^^^^^^

4 changes: 4 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.def
@@ -213,6 +213,10 @@ CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
/// Choose profile kind for PGO use compilation.
ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
/// Partition functions into N groups and select only functions in group i to be
/// instrumented. Selected group numbers can be 0 to N-1 inclusive.
VALUE_CODEGENOPT(ProfileTotalFunctionGroups, 32, 1)
VALUE_CODEGENOPT(ProfileSelectedFunctionGroup, 32, 0)
CODEGENOPT(CoverageMapping , 1, 0) ///< Generate coverage mapping regions to
///< enable code coverage analysis.
CODEGENOPT(DumpCoverageMapping , 1, 0) ///< Dump the generated coverage mapping
9 changes: 9 additions & 0 deletions clang/include/clang/Driver/Options.td
@@ -1333,6 +1333,15 @@ def fprofile_list_EQ : Joined<["-"], "fprofile-list=">,
Group<f_Group>, Flags<[CC1Option, CoreOption]>,
HelpText<"Filename defining the list of functions/files to instrument">,
MarshallingInfoStringVector<LangOpts<"ProfileListFiles">>;
def fprofile_function_groups : Joined<["-"], "fprofile-function-groups=">,
Group<f_Group>, Flags<[CC1Option]>, MetaVarName<"<N>">,
HelpText<"Partition functions into N groups and select only functions in group i to be instrumented using -fprofile-selected-function-group">,
MarshallingInfoInt<CodeGenOpts<"ProfileTotalFunctionGroups">, "1">;
def fprofile_selected_function_group :
Joined<["-"], "fprofile-selected-function-group=">, Group<f_Group>,
Flags<[CC1Option]>, MetaVarName<"<i>">,
HelpText<"Partition functions into N groups using -fprofile-function-groups and select only functions in group i to be instrumented. The valid range is 0 to N-1 inclusive">,
MarshallingInfoInt<CodeGenOpts<"ProfileSelectedFunctionGroup">>;
def fswift_async_fp_EQ : Joined<["-"], "fswift-async-fp=">,
Group<f_Group>, Flags<[CC1Option, CC1AsOption, CoreOption]>, MetaVarName<"<option>">,
HelpText<"Control emission of Swift async extended frame info">,
2 changes: 1 addition & 1 deletion clang/lib/CodeGen/CodeGenFunction.cpp
@@ -852,7 +852,7 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
}

if (CGM.getCodeGenOpts().getProfileInstr() != CodeGenOptions::ProfileNone)
- if (CGM.isProfileInstrExcluded(Fn, Loc))
+ if (CGM.isFunctionBlockedFromProfileInstr(Fn, Loc))
Fn->addFnAttr(llvm::Attribute::NoProfile);

unsigned Count, Offset;
19 changes: 17 additions & 2 deletions clang/lib/CodeGen/CodeGenModule.cpp
@@ -58,6 +58,7 @@
#include "llvm/IR/Module.h"
#include "llvm/IR/ProfileSummary.h"
#include "llvm/ProfileData/InstrProfReader.h"
#include "llvm/Support/CRC.h"
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ConvertUTF.h"
@@ -2843,8 +2844,8 @@ bool CodeGenModule::imbueXRayAttrs(llvm::Function *Fn, SourceLocation Loc,
return true;
}

- bool CodeGenModule::isProfileInstrExcluded(llvm::Function *Fn,
- SourceLocation Loc) const {
+ bool CodeGenModule::isFunctionBlockedByProfileList(llvm::Function *Fn,
+ SourceLocation Loc) const {
const auto &ProfileList = getContext().getProfileList();
// If the profile list is empty, then instrument everything.
if (ProfileList.isEmpty())
@@ -2871,6 +2872,20 @@ bool CodeGenModule::isProfileInstrExcluded(llvm::Function *Fn,
return ProfileList.getDefault();
}

bool CodeGenModule::isFunctionBlockedFromProfileInstr(
    llvm::Function *Fn, SourceLocation Loc) const {
  if (isFunctionBlockedByProfileList(Fn, Loc))
    return true;

  auto NumGroups = getCodeGenOpts().ProfileTotalFunctionGroups;
  if (NumGroups > 1) {
    auto Group = llvm::crc32(arrayRefFromStringRef(Fn->getName())) % NumGroups;
    if (Group != getCodeGenOpts().ProfileSelectedFunctionGroup)
      return true;
  }
  return false;
}

bool CodeGenModule::MustBeEmitted(const ValueDecl *Global) {
// Never defer when EmitAllDecls is specified.
if (LangOpts.EmitAllDecls)
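
The group check added in `isFunctionBlockedFromProfileInstr` can be sketched outside of LLVM as follows. This is a minimal illustration, not the code from the patch: the helper names `functionGroup` and `isBlockedByGroup` are hypothetical, and the hand-written `crc32` assumes that `llvm::crc32` computes the common zlib/IEEE CRC-32. In Clang the hashed string is `Fn->getName()`, the name of the LLVM function, whereas the sketch accepts any string.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320 (zlib/IEEE
// variant). Assumption: llvm::crc32 yields the same checksum.
inline std::uint32_t crc32(std::string_view data) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: the group index a function name hashes to.
inline std::uint32_t functionGroup(std::string_view name,
                                   std::uint32_t numGroups) {
  assert(numGroups >= 1 && "need at least one group");
  return crc32(name) % numGroups;
}

// Hypothetical helper mirroring the group check in the patch: with more
// than one group, a function is blocked from instrumentation unless its
// group equals the selected one. With a single group nothing is blocked.
inline bool isBlockedByGroup(std::string_view name, std::uint32_t numGroups,
                             std::uint32_t selectedGroup) {
  if (numGroups <= 1)
    return false;
  return functionGroup(name, numGroups) != selectedGroup;
}
```

Because the group depends only on the name and `N`, every build that uses the same `N` assigns a function to the same group, so each function is instrumented in exactly one of the `N` builds.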
12 changes: 9 additions & 3 deletions clang/lib/CodeGen/CodeGenModule.h
@@ -1340,9 +1340,15 @@ class CodeGenModule : public CodeGenTypeCache {
bool imbueXRayAttrs(llvm::Function *Fn, SourceLocation Loc,
StringRef Category = StringRef()) const;

- /// Returns true if function at the given location should be excluded from
- /// profile instrumentation.
- bool isProfileInstrExcluded(llvm::Function *Fn, SourceLocation Loc) const;
+ /// \returns true if \p Fn at \p Loc should be excluded from profile
+ /// instrumentation by the SCL passed by \p -fprofile-list.
+ bool isFunctionBlockedByProfileList(llvm::Function *Fn,
+ SourceLocation Loc) const;
+
+ /// \returns true if \p Fn at \p Loc should be excluded from profile
+ /// instrumentation.
+ bool isFunctionBlockedFromProfileInstr(llvm::Function *Fn,
+ SourceLocation Loc) const;

SanitizerMetadata *getSanitizerMetadata() {
return SanitizerMD.get();
21 changes: 21 additions & 0 deletions clang/lib/Driver/ToolChains/Clang.cpp
@@ -956,6 +956,27 @@ static void addPGOAndCoverageFlags(const ToolChain &TC, Compilation &C,
CmdArgs.push_back("-fprofile-update=atomic");
}

  int FunctionGroups = 1;
  int SelectedFunctionGroup = 0;
  if (const auto *A = Args.getLastArg(options::OPT_fprofile_function_groups)) {
    StringRef Val = A->getValue();
    if (Val.getAsInteger(0, FunctionGroups) || FunctionGroups < 1)
      D.Diag(diag::err_drv_invalid_int_value) << A->getAsString(Args) << Val;
  }
  if (const auto *A =
          Args.getLastArg(options::OPT_fprofile_selected_function_group)) {
    StringRef Val = A->getValue();
    if (Val.getAsInteger(0, SelectedFunctionGroup) ||
        SelectedFunctionGroup < 0 || SelectedFunctionGroup >= FunctionGroups)
      D.Diag(diag::err_drv_invalid_int_value) << A->getAsString(Args) << Val;
  }
  if (FunctionGroups != 1)
    CmdArgs.push_back(Args.MakeArgString("-fprofile-function-groups=" +
                                         Twine(FunctionGroups)));
  if (SelectedFunctionGroup != 0)
    CmdArgs.push_back(Args.MakeArgString("-fprofile-selected-function-group=" +
                                         Twine(SelectedFunctionGroup)));

// Leave -fprofile-dir= an unused argument unless .gcda emission is
// enabled. To be polite, with '-fprofile-arcs -fno-profile-arcs' consider
// the flag used. There is no -fno-profile-dir, so the user has no
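
The range checks the driver applies to the two options can be restated as a standalone sketch. Everything here is hypothetical illustration rather than driver code: `FunctionGroupConfig`, `parseInt`, and `validateFunctionGroups` are invented names, `std::from_chars` accepts only decimal input whereas `getAsInteger(0, ...)` auto-detects the radix, and the sketch returns an empty optional where the driver emits `err_drv_invalid_int_value` and continues.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical result type; the defaults match the driver's (N = 1, i = 0).
struct FunctionGroupConfig {
  int totalGroups = 1;
  int selectedGroup = 0;
};

// Hypothetical helper: parse a whole string as a decimal int.
inline std::optional<int> parseInt(std::string_view text) {
  int value = 0;
  const char *first = text.data();
  const char *last = first + text.size();
  auto [ptr, ec] = std::from_chars(first, last, value);
  if (ec != std::errc() || ptr != last)
    return std::nullopt;
  return value;
}

// Returns the validated configuration, or nullopt when the driver would
// diagnose: N must be at least 1, and i must satisfy 0 <= i < N.
inline std::optional<FunctionGroupConfig>
validateFunctionGroups(std::optional<std::string_view> groupsArg,
                       std::optional<std::string_view> selectedArg) {
  FunctionGroupConfig config;
  if (groupsArg) {
    std::optional<int> n = parseInt(*groupsArg);
    if (!n || *n < 1)
      return std::nullopt;
    config.totalGroups = *n;
  }
  if (selectedArg) {
    std::optional<int> i = parseInt(*selectedArg);
    if (!i || *i < 0 || *i >= config.totalGroups)
      return std::nullopt;
    config.selectedGroup = *i;
  }
  assert(config.selectedGroup < config.totalGroups);
  return config;
}
```

One consequence of the defaults: passing a nonzero `-fprofile-selected-function-group` without `-fprofile-function-groups` is rejected, since `i` must be smaller than the default `N` of 1.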
24 changes: 24 additions & 0 deletions clang/test/CodeGen/profile-function-groups.c
@@ -0,0 +1,24 @@
// RUN: %clang -fprofile-generate -fprofile-function-groups=3 -fprofile-selected-function-group=0 -emit-llvm -S %s -o - | FileCheck %s --check-prefixes=CHECK,SELECT0
// RUN: %clang -fprofile-generate -fprofile-function-groups=3 -fprofile-selected-function-group=1 -emit-llvm -S %s -o - | FileCheck %s --check-prefixes=CHECK,SELECT1
// RUN: %clang -fprofile-generate -fprofile-function-groups=3 -fprofile-selected-function-group=2 -emit-llvm -S %s -o - | FileCheck %s --check-prefixes=CHECK,SELECT2

// Group 0
// SELECT0-NOT: noprofile
// SELECT1: noprofile
// SELECT2: noprofile
// CHECK: define {{.*}} @hoo()
void hoo() {}

// Group 1
// SELECT0: noprofile
// SELECT1-NOT: noprofile
// SELECT2: noprofile
// CHECK: define {{.*}} @goo()
void goo() {}

// Group 2
// SELECT0: noprofile
// SELECT1: noprofile
// SELECT2-NOT: noprofile
// CHECK: define {{.*}} @boo()
void boo() {}
28 changes: 28 additions & 0 deletions compiler-rt/test/profile/instrprof-groups.c
@@ -0,0 +1,28 @@
// RUN: %clang_pgogen -fprofile-function-groups=3 -fprofile-selected-function-group=0 %s -o %t.0.out
// RUN: %clang_pgogen -fprofile-function-groups=3 -fprofile-selected-function-group=1 %s -o %t.1.out
// RUN: %clang_pgogen -fprofile-function-groups=3 -fprofile-selected-function-group=2 %s -o %t.2.out
// RUN: env LLVM_PROFILE_FILE=%t.0.profraw %run %t.0.out
// RUN: env LLVM_PROFILE_FILE=%t.1.profraw %run %t.1.out
// RUN: env LLVM_PROFILE_FILE=%t.2.profraw %run %t.2.out
// RUN: llvm-profdata merge -o %t.profdata %t.*.profraw
// RUN: llvm-profdata show %t.profdata --all-functions | FileCheck %s

int foo(int i) { return 4 * i + 1; }
int bar(int i) { return 4 * i + 2; }
int goo(int i) { return 4 * i + 3; }

int main(int argc, char *argv[]) {
foo(5);
bar(6);
goo(7);
return 0;
}

// Even though we ran this code three times, we expect all counts to be one if
// functions were partitioned into groups correctly.

// CHECK: Counters: 1
// CHECK: Counters: 1
// CHECK: Counters: 1
// CHECK: Counters: 1
// CHECK: Total functions: 4
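
The expectation in this test, that every merged count is one even though the program ran three times, can be illustrated with a small simulation. This is only a sketch under stated assumptions: `Profile`, `groupOf`, `runBuild`, and `mergeProfiles` are hypothetical names, `std::hash` stands in for the CRC-32 the patch uses, and merging is modeled as summing counts per function, which is roughly what `llvm-profdata merge` does for matching records.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical profile model: function name -> entry count.
using Profile = std::map<std::string, unsigned long>;

// Stand-in for the compiler's assignment. The patch hashes with CRC-32;
// std::hash is used here only to keep the illustration short.
inline std::size_t groupOf(const std::string &name, std::size_t numGroups) {
  assert(numGroups >= 1 && "need at least one group");
  return std::hash<std::string>{}(name) % numGroups;
}

// Simulate one run of a binary built for the selected group: every function
// is called once, but only functions in that group record a count.
inline Profile runBuild(const std::vector<std::string> &functions,
                        std::size_t numGroups, std::size_t selected) {
  Profile profile;
  for (const auto &fn : functions)
    if (groupOf(fn, numGroups) == selected)
      profile[fn] += 1;
  return profile;
}

// Merge profiles by summing the counts recorded for each function.
inline Profile mergeProfiles(const std::vector<Profile> &profiles) {
  Profile merged;
  for (const auto &profile : profiles)
    for (const auto &[name, count] : profile)
      merged[name] += count;
  return merged;
}
```

Running `runBuild` once per group over `foo`, `bar`, `goo`, and `main` and merging the results gives four entries, each with a count of one, because each function belongs to exactly one group. A count above one would indicate that a function was instrumented in more than one build.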
