-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added feature in llvm-profdata merge to filter functions from the profile #78378
Conversation
profile --function=<regex> Include functions matching regex in the output --no-function=<regex> Exclude functions matching regex from the output If both are specified, --no-function has a higher precedence if a function name matches both regexes
@llvm/pr-subscribers-llvm-binary-utilities @llvm/pr-subscribers-pgo Author: William Junda Huang (huangjd) Changes
If both are specified, Full diff: https://github.com/llvm/llvm-project/pull/78378.diff 3 Files Affected:
diff --git a/llvm/include/llvm/ProfileData/SampleProf.h b/llvm/include/llvm/ProfileData/SampleProf.h
index 66aaf602d0e1d9..8ac84d4b933f20 100644
--- a/llvm/include/llvm/ProfileData/SampleProf.h
+++ b/llvm/include/llvm/ProfileData/SampleProf.h
@@ -1330,6 +1330,8 @@ class SampleProfileMap
}
size_t erase(const key_type &Key) { return base_type::erase(Key); }
+
+ iterator erase(iterator It) { return base_type::erase(It); }
};
using NameFunctionSamples = std::pair<hash_code, const FunctionSamples *>;
diff --git a/llvm/test/tools/llvm-profdata/merge-filter.test b/llvm/test/tools/llvm-profdata/merge-filter.test
new file mode 100644
index 00000000000000..8b24cbd6d54110
--- /dev/null
+++ b/llvm/test/tools/llvm-profdata/merge-filter.test
@@ -0,0 +1,62 @@
+Test llvm-profdata merge with function filters.
+
+RUN: llvm-profdata merge --sample %p/Inputs/sample-profile.proftext --text --function="_Z3.*" | FileCheck %s --check-prefix=CHECK-FILTER1
+RUN: llvm-profdata merge --sample %p/Inputs/sample-profile.proftext --text --no-function="main" | FileCheck %s --check-prefix=CHECK-FILTER1
+CHECK-FILTER1: _Z3bari:20301:1437
+CHECK-NEXT: 1: 1437
+CHECK-NEXT: _Z3fooi:7711:610
+CHECK-NEXT: 1: 610
+CHECK-NOT: main
+
+RUN: llvm-profdata merge --sample %p/Inputs/sample-profile.proftext --text --function="_Z3.*" --no-function="fooi$" | FileCheck %s --check-prefix=CHECK-FILTER2
+CHECK-FILTER2: _Z3bari:20301:1437
+CHECK-NEXT: 1: 1437
+CHECK-NOT: main
+CHECK-NOT: _Z3fooi
+
+RUN: llvm-profdata merge --instr %p/Inputs/basic.proftext --text --function="foo" | FileCheck %s --check-prefix=CHECK-FILTER3
+RUN: llvm-profdata merge --instr %p/Inputs/basic.proftext --text --no-function="main" | FileCheck %s --check-prefix=CHECK-FILTER3
+CHECK-FILTER3: foo
+CHECK-NEXT: # Func Hash:
+CHECK-NEXT: 10
+CHECK-NEXT: # Num Counters:
+CHECK-NEXT: 2
+CHECK-NEXT: # Counter Values:
+CHECK-NEXT: 499500
+CHECK-NEXT: 179900
+CHECK-NEXT:
+CHECK-NEXT: foo2
+CHECK-NEXT: # Func Hash:
+CHECK-NEXT: 10
+CHECK-NEXT: # Num Counters:
+CHECK-NEXT: 2
+CHECK-NEXT: # Counter Values:
+CHECK-NEXT: 500500
+CHECK-NEXT: 180100
+
+RUN: llvm-profdata merge --instr %p/Inputs/basic.proftext --text --function="foo" --no-function="^foo$" | FileCheck %s --check-prefix=CHECK-FILTER4
+CHECK-FILTER4: foo2
+CHECK-NEXT: # Func Hash:
+CHECK-NEXT: 10
+CHECK-NEXT: # Num Counters:
+CHECK-NEXT: 2
+CHECK-NEXT: # Counter Values:
+CHECK-NEXT: 500500
+CHECK-NEXT: 180100
+
+RUN: llvm-profdata merge --sample %p/Inputs/cs-sample.proftext --text --function="main.*@.*_Z5funcBi" | FileCheck %s --check-prefix=CHECK-FILTER5
+CHECK-FILTER5: [main:3.1 @ _Z5funcBi:1 @ _Z8funcLeafi]:500853:20
+CHECK-NEXT: 0: 15
+CHECK-NEXT: 1: 15
+CHECK-NEXT: 3: 74946
+CHECK-NEXT: 4: 74941 _Z3fibi:82359
+CHECK-NEXT: 10: 23324
+CHECK-NEXT: 11: 23327 _Z3fibi:25228
+CHECK-NEXT: 15: 11
+CHECK-NEXT: !Attributes: 1
+CHECK-NEXT: [main:3.1 @ _Z5funcBi]:120:19
+CHECK-NEXT: 0: 19
+CHECK-NEXT: 1: 19 _Z8funcLeafi:20
+CHECK-NEXT: 3: 12
+CHECK-NEXT: !Attributes: 1
+
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 12b81d411cfa91..4c7bf90adb9226 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -35,6 +35,7 @@
#include "llvm/Support/MD5.h"
#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Path.h"
+#include "llvm/Support/Regex.h"
#include "llvm/Support/ThreadPool.h"
#include "llvm/Support/Threading.h"
#include "llvm/Support/VirtualFileSystem.h"
@@ -132,9 +133,11 @@ cl::opt<std::string>
cl::sub(MergeSubcommand));
cl::opt<std::string> FuncNameFilter(
"function",
- cl::desc("Details for matching functions. For overlapping CSSPGO, this "
- "takes a function name with calling context."),
- cl::sub(ShowSubcommand), cl::sub(OverlapSubcommand));
+ cl::desc("Only functions matching the filter are shown in the output. For "
+ "overlapping CSSPGO, this takes a function name with calling "
+ "context."),
+ cl::sub(ShowSubcommand), cl::sub(OverlapSubcommand),
+ cl::sub(MergeSubcommand));
// TODO: Consider creating a template class (e.g., MergeOption, ShowOption) to
// factor out the common cl::sub in cl::opt constructor for subcommand-specific
@@ -244,6 +247,10 @@ cl::opt<uint64_t> TemporalProfMaxTraceLength(
cl::sub(MergeSubcommand),
cl::desc("The maximum length of a single temporal profile trace "
"(default: 10000)"));
+cl::opt<std::string> FuncNameNegativeFilter(
+ "no-function", cl::init(""),
+ cl::sub(MergeSubcommand),
+ cl::desc("Exclude functions matching the filter from the output."));
cl::opt<FailureMode>
FailMode("failure-mode", cl::init(failIfAnyAreInvalid),
@@ -760,6 +767,45 @@ static void mergeWriterContexts(WriterContext *Dst, WriterContext *Src) {
});
}
+static StringRef
+getFuncName(const StringMap<InstrProfWriter::ProfilingData>::value_type &Val) {
+ return Val.first();
+}
+
+static std::string
+getFuncName(const SampleProfileMap::value_type &Val) {
+ return Val.second.getContext().toString();
+}
+
+template <typename T>
+static void filterFunctions(T &ProfileMap) {
+ bool hasFilter = !FuncNameFilter.empty();
+ bool hasNegativeFilter = !FuncNameNegativeFilter.empty();
+ if (hasFilter || hasNegativeFilter) {
+ size_t Count = ProfileMap.size();
+
+ llvm::Regex Pattern(FuncNameFilter);
+ llvm::Regex NegativePattern(FuncNameNegativeFilter);
+ std::string Error;
+ if (hasFilter && !Pattern.isValid(Error))
+ exitWithError(Error);
+ if (hasNegativeFilter && !NegativePattern.isValid(Error))
+ exitWithError(Error);
+
+ for (auto I = ProfileMap.begin(); I != ProfileMap.end();) {
+ auto Tmp = I++;
+ const auto &FuncName = getFuncName(*Tmp);
+ // Negative filter has higher precedence than positive filter.
+ if ((hasNegativeFilter && NegativePattern.match(FuncName)) ||
+ (hasFilter && !Pattern.match(FuncName)))
+ ProfileMap.erase(Tmp);
+ }
+
+ llvm::dbgs() << Count - ProfileMap.size() << " of " << Count << " functions"
+ << " in the original profile are filtered.\n";
+ }
+}
+
static void writeInstrProfile(StringRef OutputFilename,
ProfileFormat OutputFormat,
InstrProfWriter &Writer) {
@@ -879,6 +925,8 @@ static void mergeInstrProfile(const WeightedFileVector &Inputs,
(NumErrors > 0 && FailMode == failIfAnyAreInvalid))
exitWithError("no profile can be merged");
+ filterFunctions(Contexts[0]->Writer.getProfileData());
+
writeInstrProfile(OutputFilename, OutputFormat, Contexts[0]->Writer);
}
@@ -1459,6 +1507,8 @@ static void mergeSampleProfile(const WeightedFileVector &Inputs,
ProfileIsCS = FunctionSamples::ProfileIsCS = false;
}
+ filterFunctions(ProfileMap);
+
auto WriterOrErr =
SampleProfileWriter::create(OutputFilename, FormatMap[OutputFormat]);
if (std::error_code EC = WriterOrErr.getError())
|
You can test this locally with the following command:git-clang-format --diff a01b58aef0e42fb1b52e358adf4c56678a884d37 979e27ab6f4e23522c40a8bdbf5ccda48b2a3067 -- llvm/include/llvm/ProfileData/SampleProf.h llvm/tools/llvm-profdata/llvm-profdata.cpp View the diff from clang-format here.diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index d00ce49701..4d8c8126fa 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -248,8 +248,7 @@ cl::opt<uint64_t> TemporalProfMaxTraceLength(
cl::desc("The maximum length of a single temporal profile trace "
"(default: 10000)"));
cl::opt<std::string> FuncNameNegativeFilter(
- "no-function", cl::init(""),
- cl::sub(MergeSubcommand),
+ "no-function", cl::init(""), cl::sub(MergeSubcommand),
cl::desc("Exclude functions matching the filter from the output."));
cl::opt<FailureMode>
@@ -772,13 +771,11 @@ getFuncName(const StringMap<InstrProfWriter::ProfilingData>::value_type &Val) {
return Val.first();
}
-static std::string
-getFuncName(const SampleProfileMap::value_type &Val) {
+static std::string getFuncName(const SampleProfileMap::value_type &Val) {
return Val.second.getContext().toString();
}
-template <typename T>
-static void filterFunctions(T &ProfileMap) {
+template <typename T> static void filterFunctions(T &ProfileMap) {
bool hasFilter = !FuncNameFilter.empty();
bool hasNegativeFilter = !FuncNameNegativeFilter.empty();
if (!hasFilter && !hasNegativeFilter)
|
Purpose of this feature is to manually exclude certain functions from the profile for investigating compilation bug caused by the profile, or using this as a workaround |
Thanks for the change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please also update llvm-profdata.rst file in CommandGuide.
static void filterFunctions(T &ProfileMap) { | ||
bool hasFilter = !FuncNameFilter.empty(); | ||
bool hasNegativeFilter = !FuncNameNegativeFilter.empty(); | ||
if (hasFilter || hasNegativeFilter) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use early return to avoid the nesting.
Added documentation
When specifying a positive match filter, this will basically strip all inlinees of a matched function because its inlinees are likely to have different names so they will not be matched. We will need another way to specify the filter for inlined functions, as it is probably not intuitive to make the logic of positive and negative filter unexpectedly different. |
--function=<regex>
Include functions matching regex in the output--no-function=<regex>
Exclude functions matching regex from the outputIf both are specified,
--no-function
has a higher precedence if a function name matches both filters