Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PGO] Exposing PGO's Counter Reset and File Dumping APIs #76471

Merged
merged 13 commits into from
Jan 9, 2024
Original file line number Diff line number Diff line change
Expand Up @@ -100,7 +100,7 @@ ExpandModularHeadersPPCallbacks::ExpandModularHeadersPPCallbacks(
/*OwnsHeaderSearch=*/false);
PP->Initialize(Compiler.getTarget(), Compiler.getAuxTarget());
InitializePreprocessor(*PP, *PO, Compiler.getPCHContainerReader(),
Compiler.getFrontendOpts());
Compiler.getFrontendOpts(), Compiler.getCodeGenOpts());
ApplyHeaderSearchOptions(*HeaderInfo, *HSO, LangOpts,
Compiler.getTarget().getTriple());
}
Expand Down
6 changes: 6 additions & 0 deletions clang/include/clang/Basic/CodeGenOptions.h
Original file line number Diff line number Diff line change
Expand Up @@ -494,6 +494,12 @@ class CodeGenOptions : public CodeGenOptionsBase {
return getProfileInstr() == ProfileCSIRInstr;
}

/// Check if any form of instrumentation is on.
bool hasProfileInstr() const {
return hasProfileClangInstr() || hasProfileIRInstr() ||
hasProfileCSIRInstr();
}

/// Check if Clang profile use is on.
bool hasProfileClangUse() const {
return getProfileUse() == ProfileClangInstr;
Expand Down
4 changes: 3 additions & 1 deletion clang/include/clang/Frontend/Utils.h
Original file line number Diff line number Diff line change
Expand Up @@ -43,12 +43,14 @@ class PCHContainerReader;
class Preprocessor;
class PreprocessorOptions;
class PreprocessorOutputOptions;
class CodeGenOptions;

/// InitializePreprocessor - Initialize the preprocessor getting it and the
/// environment ready to process a single file.
void InitializePreprocessor(Preprocessor &PP, const PreprocessorOptions &PPOpts,
const PCHContainerReader &PCHContainerRdr,
const FrontendOptions &FEOpts);
const FrontendOptions &FEOpts,
const CodeGenOptions &CodeGenOpts);

/// DoPrintPreprocessedInput - Implement -E mode.
void DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
Expand Down
2 changes: 1 addition & 1 deletion clang/lib/Frontend/CompilerInstance.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -470,7 +470,7 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {

// Predefine macros and configure the preprocessor.
InitializePreprocessor(*PP, PPOpts, getPCHContainerReader(),
getFrontendOpts());
getFrontendOpts(), getCodeGenOpts());

// Initialize the header search object. In CUDA compilations, we use the aux
// triple (the host triple) to initialize our header search, since we need to
Expand Down
23 changes: 19 additions & 4 deletions clang/lib/Frontend/InitPreprocessor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
TI.getTargetDefines(LangOpts, Builder);
}

static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts,
qiongsiwu marked this conversation as resolved.
Show resolved Hide resolved
MacroBuilder &Builder) {
if (CodeGenOpts.hasProfileInstr())
Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE");
qiongsiwu marked this conversation as resolved.
Show resolved Hide resolved

if (CodeGenOpts.hasProfileIRUse() || CodeGenOpts.hasProfileClangUse())
Builder.defineMacro("__LLVM_INSTR_PROFILE_USE");
}

/// InitializePreprocessor - Initialize the preprocessor getting it and the
/// environment ready to process a single file.
void clang::InitializePreprocessor(
Preprocessor &PP, const PreprocessorOptions &InitOpts,
const PCHContainerReader &PCHContainerRdr,
const FrontendOptions &FEOpts) {
void clang::InitializePreprocessor(Preprocessor &PP,
const PreprocessorOptions &InitOpts,
const PCHContainerReader &PCHContainerRdr,
const FrontendOptions &FEOpts,
const CodeGenOptions &CodeGenOpts) {
const LangOptions &LangOpts = PP.getLangOpts();
std::string PredefineBuffer;
PredefineBuffer.reserve(4080);
Expand Down Expand Up @@ -1416,6 +1426,11 @@ void clang::InitializePreprocessor(
InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(),
FEOpts, Builder);

// The PGO instrumentation profile macros are driven by options
// -fprofile[-instr]-generate/-fcs-profile-generate/-fprofile[-instr]-use,
// hence they are not guarded by InitOpts.UsePredefines.
InitializePGOProfileMacros(CodeGenOpts, Builder);

// Add on the predefines from the driver. Wrap in a #line directive to report
// that they come from the command line.
Builder.append("# 1 \"<command line>\" 1");
Expand Down
9 changes: 9 additions & 0 deletions clang/test/Profile/c-general.c
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,15 @@
// Also check compatibility with older profiles.
// RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-general.c %s -o - -emit-llvm -fprofile-instrument-use-path=%S/Inputs/c-general.profdata.v1 | FileCheck -allow-deprecated-dag-overlap -check-prefix=PGOUSE %s

// RUN: %clang -fprofile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
// RUN: %clang -fprofile-instr-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
// RUN: %clang -fcs-profile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
//
// RUN: %clang -fprofile-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s

// PROFGENMACRO:#define __LLVM_INSTR_PROFILE_GENERATE 1
// PROFUSEMACRO:#define __LLVM_INSTR_PROFILE_USE 1

// PGOGEN: @[[SLC:__profc_simple_loops]] = private global [4 x i64] zeroinitializer
// PGOGEN: @[[IFC:__profc_conditionals]] = private global [13 x i64] zeroinitializer
// PGOGEN: @[[EEC:__profc_early_exits]] = private global [9 x i64] zeroinitializer
Expand Down
1 change: 1 addition & 0 deletions compiler-rt/include/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC)
if (COMPILER_RT_BUILD_PROFILE)
set(PROFILE_HEADERS
profile/InstrProfData.inc
profile/instr_prof_interface.h
snehasish marked this conversation as resolved.
Show resolved Hide resolved
)
endif(COMPILER_RT_BUILD_PROFILE)

Expand Down
66 changes: 66 additions & 0 deletions compiler-rt/include/profile/instr_prof_interface.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
/*===---- instr_prof_interface.h - Instrumentation PGO User Program API ----===
*
* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
* See https://llvm.org/LICENSE.txt for license information.
* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
*
*===-----------------------------------------------------------------------===
*
* This header provides a public interface for user programs to provide
* fine-grained control of counter reset and profile dumping.
*
\*===---------------------------------------------------------------------===*/

#ifndef COMPILER_RT_INSTR_PROFILING
#define COMPILER_RT_INSTR_PROFILING

#ifdef __cplusplus
extern "C" {
#endif

#ifdef __LLVM_INSTR_PROFILE_GENERATE
// Profile file reset and dump interfaces.
// Only defined when `-fprofile-generate` is in effect.

/*!
* \brief Interface to set all PGO counters to zero for the current process.
*
*/
void __llvm_profile_reset_counters(void);

/*!
* \brief this is a wrapper interface to \c __llvm_profile_write_file.
* After this interface is invoked, an already dumped flag will be set
* so that profile won't be dumped again during program exit.
* Invocation of interface __llvm_profile_reset_counters will clear
* the flag. This interface is designed to be used to collect profile
* data from user selected hot regions. The use model is
* __llvm_profile_reset_counters();
* ... hot region 1
* __llvm_profile_dump();
* .. some other code
* __llvm_profile_reset_counters();
* ... hot region 2
* __llvm_profile_dump();
*
* It is expected that on-line profile merging is on with \c %m specifier
* used in profile filename . If merging is not turned on, user is expected
* to invoke __llvm_profile_set_filename to specify different profile names
qiongsiwu marked this conversation as resolved.
Show resolved Hide resolved
* for different regions before dumping to avoid profile write clobbering.
*/
int __llvm_profile_dump(void);

// Interface to dump the current process' order file to disk.
int __llvm_orderfile_dump(void);

#else
#define __llvm_profile_reset_counters()
#define __llvm_profile_dump()
#define __llvm_orderfile_dump()
#endif

#ifdef __cplusplus
} // extern "C"
#endif

#endif
32 changes: 3 additions & 29 deletions compiler-rt/lib/profile/InstrProfiling.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
#include "InstrProfilingPort.h"
#include <stdio.h>

#define __LLVM_INSTR_PROFILE_GENERATE
#include "profile/instr_prof_interface.h"

#define INSTR_PROF_VISIBILITY COMPILER_RT_VISIBILITY
#include "profile/InstrProfData.inc"

Expand Down Expand Up @@ -100,12 +103,6 @@ ValueProfNode *__llvm_profile_begin_vnodes();
ValueProfNode *__llvm_profile_end_vnodes();
uint32_t *__llvm_profile_begin_orderfile();

/*!
* \brief Clear profile counters to zero.
*
*/
void __llvm_profile_reset_counters(void);
qiongsiwu marked this conversation as resolved.
Show resolved Hide resolved

/*!
* \brief Merge profile data from buffer.
*
Expand Down Expand Up @@ -156,29 +153,6 @@ void __llvm_profile_instrument_target_value(uint64_t TargetValue, void *Data,
int __llvm_profile_write_file(void);

int __llvm_orderfile_write_file(void);
/*!
* \brief this is a wrapper interface to \c __llvm_profile_write_file.
* After this interface is invoked, an already dumped flag will be set
* so that profile won't be dumped again during program exit.
* Invocation of interface __llvm_profile_reset_counters will clear
* the flag. This interface is designed to be used to collect profile
* data from user selected hot regions. The use model is
* __llvm_profile_reset_counters();
* ... hot region 1
* __llvm_profile_dump();
* .. some other code
* __llvm_profile_reset_counters();
* ... hot region 2
* __llvm_profile_dump();
*
* It is expected that on-line profile merging is on with \c %m specifier
* used in profile filename . If merging is not turned on, user is expected
* to invoke __llvm_profile_set_filename to specify different profile names
* for different regions before dumping to avoid profile write clobbering.
*/
int __llvm_profile_dump(void);

int __llvm_orderfile_dump(void);

/*!
* \brief Set the filename for writing instrumentation data.
Expand Down
26 changes: 26 additions & 0 deletions compiler-rt/test/profile/instrprof-api.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
// RUN: %clang_profgen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN
// RUN: %clang_profgen -o %t %s
// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need a REQUIRES: ppc for this test otherwise it will fail when this line is executed on non-PPC build bots?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading your comment, I was about to say the binary is compiled with ppc64le and requires host-byteorder-little-endian but realized the --target=ppc64le-unknown-linux-gnu is only used with -emit-llvm, and the RUN lines that build executables doesn't specify a target.

Nevertheless, I wonder test coverage for %clang_pgogen should be added here as well.

Relatedly, I do get compilation errors on ppc big-endian systems for clang_pgogen in another compiler-rt test (commit and buildbot link)

From the error message, I think lack of ABI implementation should manifest whether it's clang_pgogen or clang_profgen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, can we drop the --target=ppc64le-unknown-linux-gnu flag? It shouldn't make a difference to what we want to test here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevertheless, I wonder test coverage for %clang_pgogen should be added here as well.

Ah good catch! Thanks! I had missed that. The %clang_pgogen tests are added.

Good point, can we drop the --target=ppc64le-unknown-linux-gnu flag? It shouldn't make a difference to what we want to test here.

Sounds good! The test is revised to avoid any target specific flags.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relatedly, I do get compilation errors on ppc big-endian systems for clang_pgogen in another compiler-rt test (commit and buildbot link)

From the error message, I think lack of ABI implementation should manifest whether it's clang_pgogen or clang_profgen.

Hmmm thanks for pointing this out! I don't have access to a BE machine at the moment to reproduce but I will go find one and take a look.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have access to a BE machine at the moment to reproduce but I will go find one and take a look.

Or maybe just add REQUIRES: host-byteorder-little-endian since this test passes on little-endian systems.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks for the proposal!

We still want to keep this test working for both BE and LE because we want this test to run on AIX as well. In its current shape, the test passes on AIX. If it is fine, given this comment, I can be a bit speculative and let the buildbot alert me if this test is failing on Linux ppc BE.

// RUN: llvm-profdata merge -o %t.profdata %t.profraw
// RUN: %clang_profuse=%t.profdata %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFUSE
#include "profile/instr_profiling.h"

__attribute__((noinline)) int bar() { return 4; }

int foo() {
__llvm_profile_reset_counters();
qiongsiwu marked this conversation as resolved.
Show resolved Hide resolved
// PROFGEN: call void @__llvm_profile_reset_counters()
// PROFUSE-NOT: call void @__llvm_profile_reset_counters()
return bar();
}

int main() {
int z = foo() + 3;
__llvm_profile_dump();
// PROFGEN: %call1 = call signext i32 @__llvm_profile_dump()
// PROFUSE-NOT: %call1 = call signext i32 @__llvm_profile_dump()
__llvm_orderfile_dump();
// PROFGEN: %call2 = call signext i32 @__llvm_orderfile_dump()
// PROFUSE-NOT: %call2 = call signext i32 @__llvm_orderfile_dump()
return z + bar() - 11;
}