Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PGO] Reland PGO's Counter Reset and File Dumping APIs #76471 #78285

Merged
merged 3 commits into from Jan 22, 2024

Conversation

qiongsiwu
Copy link
Contributor

@qiongsiwu qiongsiwu commented Jan 16, 2024

#76471 caused buildbot failures on Windows. For more details, see #77546.

This PR revises the test and relands #76471.

@llvmbot llvmbot added clang Clang issues not falling into any other category compiler-rt clang-tidy clang:frontend Language frontend issues, e.g. anything involving "Sema" PGO Profile Guided Optimizations labels Jan 16, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 16, 2024

@llvm/pr-subscribers-clang-tools-extra
@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang

Author: Qiongsi Wu (qiongsiwu)

Changes

#76471 caused buildbot failure on Windows. For more details, see #77546.

This PR revises the test and relands #76471.


Patch is 21.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78285.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp (+1-1)
  • (modified) clang/docs/UsersManual.rst (+104)
  • (modified) clang/include/clang/Basic/CodeGenOptions.h (+3)
  • (modified) clang/include/clang/Frontend/Utils.h (+3-1)
  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+1-1)
  • (modified) clang/lib/Frontend/InitPreprocessor.cpp (+19-4)
  • (modified) clang/test/Profile/c-general.c (+10)
  • (modified) compiler-rt/include/CMakeLists.txt (+1)
  • (added) compiler-rt/include/profile/instr_prof_interface.h (+92)
  • (modified) compiler-rt/lib/profile/InstrProfiling.h (+11-50)
  • (added) compiler-rt/test/profile/Inputs/instrprof-api.c.profdata ()
  • (added) compiler-rt/test/profile/Linux/instrprof-weak-symbol.c (+16)
  • (added) compiler-rt/test/profile/instrprof-api.c (+39)
diff --git a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
index e414ac8c770508f..5ecd4fb19131e43 100644
--- a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
+++ b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
@@ -100,7 +100,7 @@ ExpandModularHeadersPPCallbacks::ExpandModularHeadersPPCallbacks(
                                               /*OwnsHeaderSearch=*/false);
   PP->Initialize(Compiler.getTarget(), Compiler.getAuxTarget());
   InitializePreprocessor(*PP, *PO, Compiler.getPCHContainerReader(),
-                         Compiler.getFrontendOpts());
+                         Compiler.getFrontendOpts(), Compiler.getCodeGenOpts());
   ApplyHeaderSearchOptions(*HeaderInfo, *HSO, LangOpts,
                            Compiler.getTarget().getTriple());
 }
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7c30570437e8b01..27c629a1ffc6576 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2809,6 +2809,110 @@ indexed format, regardeless whether it is produced by frontend or the IR pass.
   overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
   by the target, or ``single`` otherwise.
 
+Fine Tuning Profile Collection
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The PGO infrastructure provides user program knobs to fine tune profile
+collection. Specifically, the PGO runtime provides the following functions
+that can be used to control the regions in the program where profiles should
+be collected.
+
+ * ``void __llvm_profile_set_filename(const char *Name)``: changes the name of
+   the profile file to ``Name``.
+ * ``void __llvm_profile_reset_counters(void)``: resets all counters to zero.
+ * ``int __llvm_profile_dump(void)``: write the profile data to disk.
+ * ``int __llvm_orderfile_dump(void)``: write the order file to disk.
+
+For example, the following pattern can be used to skip profiling program
+initialization, profile two specific hot regions, and skip profiling program
+cleanup:
+
+.. code-block:: c
+
+    int main() {
+      initialize();
+
+      // Reset all profile counters to 0 to omit profile collected during
+      // initialize()'s execution.
+      __llvm_profile_reset_counters();
+      ... hot region 1
+      // Dump the profile for hot region 1.
+      __llvm_profile_set_filename("region1.profraw");
+      __llvm_profile_dump();
+
+      // Reset counters before proceeding to hot region 2.
+      __llvm_profile_reset_counters();
+      ... hot region 2
+      // Dump the profile for hot region 2.
+      __llvm_profile_set_filename("region2.profraw");
+      __llvm_profile_dump();
+
+      // Since the profile has been dumped, no further profile data
+      // will be collected beyond the above __llvm_profile_dump().
+      cleanup();
+      return 0;
+    }
+
+These APIs' names can be introduced to user programs in two ways.
+They can be declared as weak symbols on platforms which support
+treating weak symbols as ``null`` during linking. For example, the user can
+have
+
+.. code-block:: c
+
+    __attribute__((weak)) int __llvm_profile_dump(void);
+
+    // Then later in the same source file
+    if (__llvm_profile_dump)
+      if (__llvm_profile_dump() != 0) { ... }
+    // The first if condition tests if the symbol is actually defined.
+    // Profile dumping only happens if the symbol is defined. Hence,
+    // the user program works correctly during normal (not profile-generate)
+    // executions.
+
+Alternatively, the user program can include the header
+``profile/instr_prof_interface.h``, which contains the API names. For example,
+
+.. code-block:: c
+
+    #include "profile/instr_prof_interface.h"
+
+    // Then later in the same source file
+    if (__llvm_profile_dump() != 0) { ... }
+
+The user code does not need to check if the API names are defined, because
+these names are automatically replaced by ``(0)`` or the equivalence of noop
+if the ``clang`` is not compiling for profile generation.
+
+Such replacement can happen because ``clang`` adds one of two macros depending
+on the ``-fprofile-generate`` and the ``-fprofile-use`` flags.
+
+ * ``__LLVM_INSTR_PROFILE_GENERATE``: defined when one of
+   ``-fprofile[-instr]-generate``/``-fcs-profile-generate`` is in effect.
+ * ``__LLVM_INSTR_PROFILE_USE``: defined when one of
+   ``-fprofile-use``/``-fprofile-instr-use`` is in effect.
+
+The two macros can be used to provide more flexibiilty so a user program
+can execute code specifically intended for profile generate or profile use.
+For example, a user program can have special logging during profile generate:
+
+.. code-block:: c
+
+    #if __LLVM_INSTR_PROFILE_GENERATE
+    expensive_logging_of_full_program_state();
+    #endif
+
+The logging is automatically excluded during a normal build of the program,
+hence it does not impact performance during a normal execution.
+
+It is advised to use such fine tuning only in a program's cold regions. The weak
+symbols can introduce extra control flow (the ``if`` checks), while the macros
+(hence declarations they guard in ``profile/instr_prof_interface.h``)
+can change the control flow of the functions that use them between profile
+generation and profile use (which can lead to discarded counters in such
+functions). Using these APIs in the program's cold regions introduces less
+overhead and leads to more optimized code.
+
 Disabling Instrumentation
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index 6952b48e898a819..3f8fe385fef3dff 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -494,6 +494,9 @@ class CodeGenOptions : public CodeGenOptionsBase {
     return getProfileInstr() == ProfileCSIRInstr;
   }
 
+  /// Check if any form of instrumentation is on.
+  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+
   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
     return getProfileUse() == ProfileClangInstr;
diff --git a/clang/include/clang/Frontend/Utils.h b/clang/include/clang/Frontend/Utils.h
index 143cf4359f00b50..604e42067a3f1e8 100644
--- a/clang/include/clang/Frontend/Utils.h
+++ b/clang/include/clang/Frontend/Utils.h
@@ -43,12 +43,14 @@ class PCHContainerReader;
 class Preprocessor;
 class PreprocessorOptions;
 class PreprocessorOutputOptions;
+class CodeGenOptions;
 
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
 void InitializePreprocessor(Preprocessor &PP, const PreprocessorOptions &PPOpts,
                             const PCHContainerReader &PCHContainerRdr,
-                            const FrontendOptions &FEOpts);
+                            const FrontendOptions &FEOpts,
+                            const CodeGenOptions &CodeGenOpts);
 
 /// DoPrintPreprocessedInput - Implement -E mode.
 void DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index 56bbef9697b6500..ea44a26b6db7da4 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -470,7 +470,7 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {
 
   // Predefine macros and configure the preprocessor.
   InitializePreprocessor(*PP, PPOpts, getPCHContainerReader(),
-                         getFrontendOpts());
+                         getFrontendOpts(), getCodeGenOpts());
 
   // Initialize the header search object.  In CUDA compilations, we use the aux
   // triple (the host triple) to initialize our header search, since we need to
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp b/clang/lib/Frontend/InitPreprocessor.cpp
index d83128adb511ef4..fe0fd3614113c45 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
   TI.getTargetDefines(LangOpts, Builder);
 }
 
+static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts,
+                                       MacroBuilder &Builder) {
+  if (CodeGenOpts.hasProfileInstr())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE");
+
+  if (CodeGenOpts.hasProfileIRUse() || CodeGenOpts.hasProfileClangUse())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_USE");
+}
+
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
-void clang::InitializePreprocessor(
-    Preprocessor &PP, const PreprocessorOptions &InitOpts,
-    const PCHContainerReader &PCHContainerRdr,
-    const FrontendOptions &FEOpts) {
+void clang::InitializePreprocessor(Preprocessor &PP,
+                                   const PreprocessorOptions &InitOpts,
+                                   const PCHContainerReader &PCHContainerRdr,
+                                   const FrontendOptions &FEOpts,
+                                   const CodeGenOptions &CodeGenOpts) {
   const LangOptions &LangOpts = PP.getLangOpts();
   std::string PredefineBuffer;
   PredefineBuffer.reserve(4080);
@@ -1416,6 +1426,11 @@ void clang::InitializePreprocessor(
   InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(),
                                      FEOpts, Builder);
 
+  // The PGO instrumentation profile macros are driven by options
+  // -fprofile[-instr]-generate/-fcs-profile-generate/-fprofile[-instr]-use,
+  // hence they are not guarded by InitOpts.UsePredefines.
+  InitializePGOProfileMacros(CodeGenOpts, Builder);
+
   // Add on the predefines from the driver.  Wrap in a #line directive to report
   // that they come from the command line.
   Builder.append("# 1 \"<command line>\" 1");
diff --git a/clang/test/Profile/c-general.c b/clang/test/Profile/c-general.c
index b841f9c3d2a1d1e..2f621ec9b0bf9db 100644
--- a/clang/test/Profile/c-general.c
+++ b/clang/test/Profile/c-general.c
@@ -9,6 +9,16 @@
 // Also check compatibility with older profiles.
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-general.c %s -o - -emit-llvm -fprofile-instrument-use-path=%S/Inputs/c-general.profdata.v1 | FileCheck -allow-deprecated-dag-overlap  -check-prefix=PGOUSE %s
 
+// RUN: %clang -fprofile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fprofile-instr-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fcs-profile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+//
+// RUN: %clang -fprofile-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+// RUN: %clang -fprofile-instr-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+
+// PROFGENMACRO:#define __LLVM_INSTR_PROFILE_GENERATE 1
+// PROFUSEMACRO:#define __LLVM_INSTR_PROFILE_USE 1
+
 // PGOGEN: @[[SLC:__profc_simple_loops]] = private global [4 x i64] zeroinitializer
 // PGOGEN: @[[IFC:__profc_conditionals]] = private global [13 x i64] zeroinitializer
 // PGOGEN: @[[EEC:__profc_early_exits]] = private global [9 x i64] zeroinitializer
diff --git a/compiler-rt/include/CMakeLists.txt b/compiler-rt/include/CMakeLists.txt
index 78427beedb3cc4e..7a100c66bbcfda8 100644
--- a/compiler-rt/include/CMakeLists.txt
+++ b/compiler-rt/include/CMakeLists.txt
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC)
 if (COMPILER_RT_BUILD_PROFILE)
   set(PROFILE_HEADERS
     profile/InstrProfData.inc
+    profile/instr_prof_interface.h
     )
 endif(COMPILER_RT_BUILD_PROFILE)
 
diff --git a/compiler-rt/include/profile/instr_prof_interface.h b/compiler-rt/include/profile/instr_prof_interface.h
new file mode 100644
index 000000000000000..be40f2685934bea
--- /dev/null
+++ b/compiler-rt/include/profile/instr_prof_interface.h
@@ -0,0 +1,92 @@
+/*===---- instr_prof_interface.h - Instrumentation PGO User Program API ----===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ *
+ * This header provides a public interface for fine-grained control of counter
+ * reset and profile dumping. These interface functions can be directly called
+ * in user programs.
+ *
+\*===---------------------------------------------------------------------===*/
+
+#ifndef COMPILER_RT_INSTR_PROFILING
+#define COMPILER_RT_INSTR_PROFILING
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __LLVM_INSTR_PROFILE_GENERATE
+// Profile file reset and dump interfaces.
+// When `-fprofile[-instr]-generate`/`-fcs-profile-generate` is in effect,
+// clang defines __LLVM_INSTR_PROFILE_GENERATE to pick up the API calls.
+
+/*!
+ * \brief Set the filename for writing instrumentation data.
+ *
+ * Sets the filename to be used for subsequent calls to
+ * \a __llvm_profile_write_file().
+ *
+ * \c Name is not copied, so it must remain valid.  Passing NULL resets the
+ * filename logic to the default behaviour.
+ *
+ * Note: There may be multiple copies of the profile runtime (one for each
+ * instrumented image/DSO). This API only modifies the filename within the
+ * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
+ */
+void __llvm_profile_set_filename(const char *Name);
+
+/*!
+ * \brief Interface to set all PGO counters to zero for the current process.
+ *
+ */
+void __llvm_profile_reset_counters(void);
+
+/*!
+ * \brief this is a wrapper interface to \c __llvm_profile_write_file.
+ * After this interface is invoked, an already dumped flag will be set
+ * so that profile won't be dumped again during program exit.
+ * Invocation of interface __llvm_profile_reset_counters will clear
+ * the flag. This interface is designed to be used to collect profile
+ * data from user selected hot regions. The use model is
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 1
+ *      __llvm_profile_dump();
+ *      .. some other code
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 2
+ *      __llvm_profile_dump();
+ *
+ *  It is expected that on-line profile merging is on with \c %m specifier
+ *  used in profile filename . If merging is not turned on, user is expected
+ *  to invoke __llvm_profile_set_filename to specify different profile names
+ *  for different regions before dumping to avoid profile write clobbering.
+ */
+int __llvm_profile_dump(void);
+
+// Interface to dump the current process' order file to disk.
+int __llvm_orderfile_dump(void);
+
+#else
+
+#define __llvm_profile_set_filename(Name)
+#define __llvm_profile_reset_counters()
+#define __llvm_profile_dump() (0)
+#define __llvm_orderfile_dump() (0)
+
+#endif
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif
diff --git a/compiler-rt/lib/profile/InstrProfiling.h b/compiler-rt/lib/profile/InstrProfiling.h
index 137115996748ce3..012390833691877 100644
--- a/compiler-rt/lib/profile/InstrProfiling.h
+++ b/compiler-rt/lib/profile/InstrProfiling.h
@@ -12,6 +12,17 @@
 #include "InstrProfilingPort.h"
 #include <stdio.h>
 
+// Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before
+// including instr_prof_interface.h so the interface functions are
+// declared correctly for the runtime.
+// __LLVM_INSTR_PROFILE_GENERATE is always `#undef`ed after the header,
+// because compiler-rt does not support profiling the profiling runtime itself.
+#ifndef __LLVM_INSTR_PROFILE_GENERATE
+#define __LLVM_INSTR_PROFILE_GENERATE
+#endif
+#include "profile/instr_prof_interface.h"
+#undef __LLVM_INSTR_PROFILE_GENERATE
+
 #define INSTR_PROF_VISIBILITY COMPILER_RT_VISIBILITY
 #include "profile/InstrProfData.inc"
 
@@ -100,12 +111,6 @@ ValueProfNode *__llvm_profile_begin_vnodes();
 ValueProfNode *__llvm_profile_end_vnodes();
 uint32_t *__llvm_profile_begin_orderfile();
 
-/*!
- * \brief Clear profile counters to zero.
- *
- */
-void __llvm_profile_reset_counters(void);
-
 /*!
  * \brief Merge profile data from buffer.
  *
@@ -156,50 +161,6 @@ void __llvm_profile_instrument_target_value(uint64_t TargetValue, void *Data,
 int __llvm_profile_write_file(void);
 
 int __llvm_orderfile_write_file(void);
-/*!
- * \brief this is a wrapper interface to \c __llvm_profile_write_file.
- * After this interface is invoked, an already dumped flag will be set
- * so that profile won't be dumped again during program exit.
- * Invocation of interface __llvm_profile_reset_counters will clear
- * the flag. This interface is designed to be used to collect profile
- * data from user selected hot regions. The use model is
- *      __llvm_profile_reset_counters();
- *      ... hot region 1
- *      __llvm_profile_dump();
- *      .. some other code
- *      __llvm_profile_reset_counters();
- *       ... hot region 2
- *      __llvm_profile_dump();
- *
- *  It is expected that on-line profile merging is on with \c %m specifier
- *  used in profile filename . If merging is  not turned on, user is expected
- *  to invoke __llvm_profile_set_filename  to specify different profile names
- *  for different regions before dumping to avoid profile write clobbering.
- */
-int __llvm_profile_dump(void);
-
-int __llvm_orderfile_dump(void);
-
-/*!
- * \brief Set the filename for writing instrumentation data.
- *
- * Sets the filename to be used for subsequent calls to
- * \a __llvm_profile_write_file().
- *
- * \c Name is not copied, so it must remain valid.  Passing NULL resets the
- * filename logic to the default behaviour.
- *
- * Note: There may be multiple copies of the profile runtime (one for each
- * instrumented image/DSO). This API only modifies the filename within the
- * copy of the runtime available to the calling image.
- *
- * Warning: This is a no-op if continuous mode (\ref
- * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
- * that in continuous mode, profile counters are mmap()'d to the profile at
- * program initialization time. Support for transferring the mmap'd profile
- * counts to a new file has not been implemented.
- */
-void __llvm_profile_set_filename(const char *Name);
 
 /*!
  * \brief Set the FILE object for writing instrumentation data. Return 0 if set
diff --git a/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata b/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata
new file mode 100644
index 000000000000000..a0bc9f6e7aa0994
Binary files /dev/null and b/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata differ
diff --git a/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c b/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c
new file mode 100644
index 000000000000000..eda299cb6610e41
--- /dev/null
+++ b/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c
@@ -0,0 +1,16 @@
+// Test the linker feature that treats undefined weak symbols as null values.
+
+// RUN: %clang_pgogen -o %t %s
+// RUN: not %t
+// RUN: %clang -o %t %s
+// RUN: %t
+
+__attribute__((weak)) void __llvm_profile_reset_counters(void);
+
+int main() {
+  if (__llvm_profile_reset_counters) {
+    __llvm_profile_reset_counters();
+    return 1;
+  }
+  return 0;
+}
diff --git a/compiler-rt/test/profile/instrprof-api.c b/compiler-rt/test/profile/instrprof-api.c
new file mode 100644
index 000000000000000..19928bf33a3eb14
--- /dev/null
+++ b/compiler-rt/test/profile/instrprof-api.c
@@ -0,0 +1,39 @@
+// Testing profile generate.
+// RUN: %clang_profgen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN
+// RUN: %clang_pgogen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN
+
+// Testing profile use.
+// RUN: %clang_pgouse=%S/Inputs/instrprof-api.c.profdata %s -S -emit-llvm -o - \
+// RUN:   | FileCheck %s --check-prefix=PROFUSE
+
+#include "profile/instr_prof_interface.h"
+
+__attribute__((noinline)) int bar() { retu...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 16, 2024

@llvm/pr-subscribers-pgo

Author: Qiongsi Wu (qiongsiwu)

Changes

#76471 caused buildbot failure on Windows. For more details, see #77546.

This PR revises the test and relands #76471.


Patch is 21.08 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78285.diff

13 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp (+1-1)
  • (modified) clang/docs/UsersManual.rst (+104)
  • (modified) clang/include/clang/Basic/CodeGenOptions.h (+3)
  • (modified) clang/include/clang/Frontend/Utils.h (+3-1)
  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+1-1)
  • (modified) clang/lib/Frontend/InitPreprocessor.cpp (+19-4)
  • (modified) clang/test/Profile/c-general.c (+10)
  • (modified) compiler-rt/include/CMakeLists.txt (+1)
  • (added) compiler-rt/include/profile/instr_prof_interface.h (+92)
  • (modified) compiler-rt/lib/profile/InstrProfiling.h (+11-50)
  • (added) compiler-rt/test/profile/Inputs/instrprof-api.c.profdata ()
  • (added) compiler-rt/test/profile/Linux/instrprof-weak-symbol.c (+16)
  • (added) compiler-rt/test/profile/instrprof-api.c (+39)
diff --git a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
index e414ac8c770508f..5ecd4fb19131e43 100644
--- a/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
+++ b/clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp
@@ -100,7 +100,7 @@ ExpandModularHeadersPPCallbacks::ExpandModularHeadersPPCallbacks(
                                               /*OwnsHeaderSearch=*/false);
   PP->Initialize(Compiler.getTarget(), Compiler.getAuxTarget());
   InitializePreprocessor(*PP, *PO, Compiler.getPCHContainerReader(),
-                         Compiler.getFrontendOpts());
+                         Compiler.getFrontendOpts(), Compiler.getCodeGenOpts());
   ApplyHeaderSearchOptions(*HeaderInfo, *HSO, LangOpts,
                            Compiler.getTarget().getTriple());
 }
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 7c30570437e8b01..27c629a1ffc6576 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2809,6 +2809,110 @@ indexed format, regardeless whether it is produced by frontend or the IR pass.
   overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
   by the target, or ``single`` otherwise.
 
+Fine Tuning Profile Collection
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The PGO infrastructure provides user program knobs to fine tune profile
+collection. Specifically, the PGO runtime provides the following functions
+that can be used to control the regions in the program where profiles should
+be collected.
+
+ * ``void __llvm_profile_set_filename(const char *Name)``: changes the name of
+   the profile file to ``Name``.
+ * ``void __llvm_profile_reset_counters(void)``: resets all counters to zero.
+ * ``int __llvm_profile_dump(void)``: write the profile data to disk.
+ * ``int __llvm_orderfile_dump(void)``: write the order file to disk.
+
+For example, the following pattern can be used to skip profiling program
+initialization, profile two specific hot regions, and skip profiling program
+cleanup:
+
+.. code-block:: c
+
+    int main() {
+      initialize();
+
+      // Reset all profile counters to 0 to omit profile collected during
+      // initialize()'s execution.
+      __llvm_profile_reset_counters();
+      ... hot region 1
+      // Dump the profile for hot region 1.
+      __llvm_profile_set_filename("region1.profraw");
+      __llvm_profile_dump();
+
+      // Reset counters before proceeding to hot region 2.
+      __llvm_profile_reset_counters();
+      ... hot region 2
+      // Dump the profile for hot region 2.
+      __llvm_profile_set_filename("region2.profraw");
+      __llvm_profile_dump();
+
+      // Since the profile has been dumped, no further profile data
+      // will be collected beyond the above __llvm_profile_dump().
+      cleanup();
+      return 0;
+    }
+
+These APIs' names can be introduced to user programs in two ways.
+They can be declared as weak symbols on platforms which support
+treating weak symbols as ``null`` during linking. For example, the user can
+have
+
+.. code-block:: c
+
+    __attribute__((weak)) int __llvm_profile_dump(void);
+
+    // Then later in the same source file
+    if (__llvm_profile_dump)
+      if (__llvm_profile_dump() != 0) { ... }
+    // The first if condition tests if the symbol is actually defined.
+    // Profile dumping only happens if the symbol is defined. Hence,
+    // the user program works correctly during normal (not profile-generate)
+    // executions.
+
+Alternatively, the user program can include the header
+``profile/instr_prof_interface.h``, which contains the API names. For example,
+
+.. code-block:: c
+
+    #include "profile/instr_prof_interface.h"
+
+    // Then later in the same source file
+    if (__llvm_profile_dump() != 0) { ... }
+
+The user code does not need to check if the API names are defined, because
+these names are automatically replaced by ``(0)`` or the equivalence of noop
+if the ``clang`` is not compiling for profile generation.
+
+Such replacement can happen because ``clang`` adds one of two macros depending
+on the ``-fprofile-generate`` and the ``-fprofile-use`` flags.
+
+ * ``__LLVM_INSTR_PROFILE_GENERATE``: defined when one of
+   ``-fprofile[-instr]-generate``/``-fcs-profile-generate`` is in effect.
+ * ``__LLVM_INSTR_PROFILE_USE``: defined when one of
+   ``-fprofile-use``/``-fprofile-instr-use`` is in effect.
+
+The two macros can be used to provide more flexibiilty so a user program
+can execute code specifically intended for profile generate or profile use.
+For example, a user program can have special logging during profile generate:
+
+.. code-block:: c
+
+    #if __LLVM_INSTR_PROFILE_GENERATE
+    expensive_logging_of_full_program_state();
+    #endif
+
+The logging is automatically excluded during a normal build of the program,
+hence it does not impact performance during a normal execution.
+
+It is advised to use such fine tuning only in a program's cold regions. The weak
+symbols can introduce extra control flow (the ``if`` checks), while the macros
+(hence declarations they guard in ``profile/instr_prof_interface.h``)
+can change the control flow of the functions that use them between profile
+generation and profile use (which can lead to discarded counters in such
+functions). Using these APIs in the program's cold regions introduces less
+overhead and leads to more optimized code.
+
 Disabling Instrumentation
 ^^^^^^^^^^^^^^^^^^^^^^^^^
 
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index 6952b48e898a819..3f8fe385fef3dff 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -494,6 +494,9 @@ class CodeGenOptions : public CodeGenOptionsBase {
     return getProfileInstr() == ProfileCSIRInstr;
   }
 
+  /// Check if any form of instrumentation is on.
+  bool hasProfileInstr() const { return getProfileInstr() != ProfileNone; }
+
   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
     return getProfileUse() == ProfileClangInstr;
diff --git a/clang/include/clang/Frontend/Utils.h b/clang/include/clang/Frontend/Utils.h
index 143cf4359f00b50..604e42067a3f1e8 100644
--- a/clang/include/clang/Frontend/Utils.h
+++ b/clang/include/clang/Frontend/Utils.h
@@ -43,12 +43,14 @@ class PCHContainerReader;
 class Preprocessor;
 class PreprocessorOptions;
 class PreprocessorOutputOptions;
+class CodeGenOptions;
 
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
 void InitializePreprocessor(Preprocessor &PP, const PreprocessorOptions &PPOpts,
                             const PCHContainerReader &PCHContainerRdr,
-                            const FrontendOptions &FEOpts);
+                            const FrontendOptions &FEOpts,
+                            const CodeGenOptions &CodeGenOpts);
 
 /// DoPrintPreprocessedInput - Implement -E mode.
 void DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index 56bbef9697b6500..ea44a26b6db7da4 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -470,7 +470,7 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {
 
   // Predefine macros and configure the preprocessor.
   InitializePreprocessor(*PP, PPOpts, getPCHContainerReader(),
-                         getFrontendOpts());
+                         getFrontendOpts(), getCodeGenOpts());
 
   // Initialize the header search object.  In CUDA compilations, we use the aux
   // triple (the host triple) to initialize our header search, since we need to
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp b/clang/lib/Frontend/InitPreprocessor.cpp
index d83128adb511ef4..fe0fd3614113c45 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
   TI.getTargetDefines(LangOpts, Builder);
 }
 
+static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts,
+                                       MacroBuilder &Builder) {
+  if (CodeGenOpts.hasProfileInstr())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE");
+
+  if (CodeGenOpts.hasProfileIRUse() || CodeGenOpts.hasProfileClangUse())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_USE");
+}
+
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
-void clang::InitializePreprocessor(
-    Preprocessor &PP, const PreprocessorOptions &InitOpts,
-    const PCHContainerReader &PCHContainerRdr,
-    const FrontendOptions &FEOpts) {
+void clang::InitializePreprocessor(Preprocessor &PP,
+                                   const PreprocessorOptions &InitOpts,
+                                   const PCHContainerReader &PCHContainerRdr,
+                                   const FrontendOptions &FEOpts,
+                                   const CodeGenOptions &CodeGenOpts) {
   const LangOptions &LangOpts = PP.getLangOpts();
   std::string PredefineBuffer;
   PredefineBuffer.reserve(4080);
@@ -1416,6 +1426,11 @@ void clang::InitializePreprocessor(
   InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(),
                                      FEOpts, Builder);
 
+  // The PGO instrumentation profile macros are driven by options
+  // -fprofile[-instr]-generate/-fcs-profile-generate/-fprofile[-instr]-use,
+  // hence they are not guarded by InitOpts.UsePredefines.
+  InitializePGOProfileMacros(CodeGenOpts, Builder);
+
   // Add on the predefines from the driver.  Wrap in a #line directive to report
   // that they come from the command line.
   Builder.append("# 1 \"<command line>\" 1");
diff --git a/clang/test/Profile/c-general.c b/clang/test/Profile/c-general.c
index b841f9c3d2a1d1e..2f621ec9b0bf9db 100644
--- a/clang/test/Profile/c-general.c
+++ b/clang/test/Profile/c-general.c
@@ -9,6 +9,16 @@
 // Also check compatibility with older profiles.
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-general.c %s -o - -emit-llvm -fprofile-instrument-use-path=%S/Inputs/c-general.profdata.v1 | FileCheck -allow-deprecated-dag-overlap  -check-prefix=PGOUSE %s
 
+// RUN: %clang -fprofile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fprofile-instr-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fcs-profile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+//
+// RUN: %clang -fprofile-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+// RUN: %clang -fprofile-instr-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+
+// PROFGENMACRO:#define __LLVM_INSTR_PROFILE_GENERATE 1
+// PROFUSEMACRO:#define __LLVM_INSTR_PROFILE_USE 1
+
 // PGOGEN: @[[SLC:__profc_simple_loops]] = private global [4 x i64] zeroinitializer
 // PGOGEN: @[[IFC:__profc_conditionals]] = private global [13 x i64] zeroinitializer
 // PGOGEN: @[[EEC:__profc_early_exits]] = private global [9 x i64] zeroinitializer
diff --git a/compiler-rt/include/CMakeLists.txt b/compiler-rt/include/CMakeLists.txt
index 78427beedb3cc4e..7a100c66bbcfda8 100644
--- a/compiler-rt/include/CMakeLists.txt
+++ b/compiler-rt/include/CMakeLists.txt
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC)
 if (COMPILER_RT_BUILD_PROFILE)
   set(PROFILE_HEADERS
     profile/InstrProfData.inc
+    profile/instr_prof_interface.h
     )
 endif(COMPILER_RT_BUILD_PROFILE)
 
diff --git a/compiler-rt/include/profile/instr_prof_interface.h b/compiler-rt/include/profile/instr_prof_interface.h
new file mode 100644
index 000000000000000..be40f2685934bea
--- /dev/null
+++ b/compiler-rt/include/profile/instr_prof_interface.h
@@ -0,0 +1,92 @@
+/*===---- instr_prof_interface.h - Instrumentation PGO User Program API ----===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ *
+ * This header provides a public interface for fine-grained control of counter
+ * reset and profile dumping. These interface functions can be directly called
+ * in user programs.
+ *
+\*===---------------------------------------------------------------------===*/
+
+#ifndef COMPILER_RT_INSTR_PROFILING
+#define COMPILER_RT_INSTR_PROFILING
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __LLVM_INSTR_PROFILE_GENERATE
+// Profile file reset and dump interfaces.
+// When `-fprofile[-instr]-generate`/`-fcs-profile-generate` is in effect,
+// clang defines __LLVM_INSTR_PROFILE_GENERATE to pick up the API calls.
+
+/*!
+ * \brief Set the filename for writing instrumentation data.
+ *
+ * Sets the filename to be used for subsequent calls to
+ * \a __llvm_profile_write_file().
+ *
+ * \c Name is not copied, so it must remain valid.  Passing NULL resets the
+ * filename logic to the default behaviour.
+ *
+ * Note: There may be multiple copies of the profile runtime (one for each
+ * instrumented image/DSO). This API only modifies the filename within the
+ * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
+ */
+void __llvm_profile_set_filename(const char *Name);
+
+/*!
+ * \brief Interface to set all PGO counters to zero for the current process.
+ *
+ */
+void __llvm_profile_reset_counters(void);
+
+/*!
+ * \brief this is a wrapper interface to \c __llvm_profile_write_file.
+ * After this interface is invoked, an already dumped flag will be set
+ * so that profile won't be dumped again during program exit.
+ * Invocation of interface __llvm_profile_reset_counters will clear
+ * the flag. This interface is designed to be used to collect profile
+ * data from user selected hot regions. The use model is
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 1
+ *      __llvm_profile_dump();
+ *      .. some other code
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 2
+ *      __llvm_profile_dump();
+ *
+ *  It is expected that on-line profile merging is on with \c %m specifier
+ *  used in profile filename . If merging is not turned on, user is expected
+ *  to invoke __llvm_profile_set_filename to specify different profile names
+ *  for different regions before dumping to avoid profile write clobbering.
+ */
+int __llvm_profile_dump(void);
+
+// Interface to dump the current process' order file to disk.
+int __llvm_orderfile_dump(void);
+
+#else
+
+#define __llvm_profile_set_filename(Name)
+#define __llvm_profile_reset_counters()
+#define __llvm_profile_dump() (0)
+#define __llvm_orderfile_dump() (0)
+
+#endif
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif
diff --git a/compiler-rt/lib/profile/InstrProfiling.h b/compiler-rt/lib/profile/InstrProfiling.h
index 137115996748ce3..012390833691877 100644
--- a/compiler-rt/lib/profile/InstrProfiling.h
+++ b/compiler-rt/lib/profile/InstrProfiling.h
@@ -12,6 +12,17 @@
 #include "InstrProfilingPort.h"
 #include <stdio.h>
 
+// Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before
+// including instr_prof_interface.h so the interface functions are
+// declared correctly for the runtime.
+// __LLVM_INSTR_PROFILE_GENERATE is always `#undef`ed after the header,
+// because compiler-rt does not support profiling the profiling runtime itself.
+#ifndef __LLVM_INSTR_PROFILE_GENERATE
+#define __LLVM_INSTR_PROFILE_GENERATE
+#endif
+#include "profile/instr_prof_interface.h"
+#undef __LLVM_INSTR_PROFILE_GENERATE
+
 #define INSTR_PROF_VISIBILITY COMPILER_RT_VISIBILITY
 #include "profile/InstrProfData.inc"
 
@@ -100,12 +111,6 @@ ValueProfNode *__llvm_profile_begin_vnodes();
 ValueProfNode *__llvm_profile_end_vnodes();
 uint32_t *__llvm_profile_begin_orderfile();
 
-/*!
- * \brief Clear profile counters to zero.
- *
- */
-void __llvm_profile_reset_counters(void);
-
 /*!
  * \brief Merge profile data from buffer.
  *
@@ -156,50 +161,6 @@ void __llvm_profile_instrument_target_value(uint64_t TargetValue, void *Data,
 int __llvm_profile_write_file(void);
 
 int __llvm_orderfile_write_file(void);
-/*!
- * \brief this is a wrapper interface to \c __llvm_profile_write_file.
- * After this interface is invoked, an already dumped flag will be set
- * so that profile won't be dumped again during program exit.
- * Invocation of interface __llvm_profile_reset_counters will clear
- * the flag. This interface is designed to be used to collect profile
- * data from user selected hot regions. The use model is
- *      __llvm_profile_reset_counters();
- *      ... hot region 1
- *      __llvm_profile_dump();
- *      .. some other code
- *      __llvm_profile_reset_counters();
- *       ... hot region 2
- *      __llvm_profile_dump();
- *
- *  It is expected that on-line profile merging is on with \c %m specifier
- *  used in profile filename . If merging is  not turned on, user is expected
- *  to invoke __llvm_profile_set_filename  to specify different profile names
- *  for different regions before dumping to avoid profile write clobbering.
- */
-int __llvm_profile_dump(void);
-
-int __llvm_orderfile_dump(void);
-
-/*!
- * \brief Set the filename for writing instrumentation data.
- *
- * Sets the filename to be used for subsequent calls to
- * \a __llvm_profile_write_file().
- *
- * \c Name is not copied, so it must remain valid.  Passing NULL resets the
- * filename logic to the default behaviour.
- *
- * Note: There may be multiple copies of the profile runtime (one for each
- * instrumented image/DSO). This API only modifies the filename within the
- * copy of the runtime available to the calling image.
- *
- * Warning: This is a no-op if continuous mode (\ref
- * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
- * that in continuous mode, profile counters are mmap()'d to the profile at
- * program initialization time. Support for transferring the mmap'd profile
- * counts to a new file has not been implemented.
- */
-void __llvm_profile_set_filename(const char *Name);
 
 /*!
  * \brief Set the FILE object for writing instrumentation data. Return 0 if set
diff --git a/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata b/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata
new file mode 100644
index 000000000000000..a0bc9f6e7aa0994
Binary files /dev/null and b/compiler-rt/test/profile/Inputs/instrprof-api.c.profdata differ
diff --git a/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c b/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c
new file mode 100644
index 000000000000000..eda299cb6610e41
--- /dev/null
+++ b/compiler-rt/test/profile/Linux/instrprof-weak-symbol.c
@@ -0,0 +1,16 @@
+// Test the linker feature that treats undefined weak symbols as null values.
+
+// RUN: %clang_pgogen -o %t %s
+// RUN: not %t
+// RUN: %clang -o %t %s
+// RUN: %t
+
+__attribute__((weak)) void __llvm_profile_reset_counters(void);
+
+int main() {
+  if (__llvm_profile_reset_counters) {
+    __llvm_profile_reset_counters();
+    return 1;
+  }
+  return 0;
+}
diff --git a/compiler-rt/test/profile/instrprof-api.c b/compiler-rt/test/profile/instrprof-api.c
new file mode 100644
index 000000000000000..19928bf33a3eb14
--- /dev/null
+++ b/compiler-rt/test/profile/instrprof-api.c
@@ -0,0 +1,39 @@
+// Testing profile generate.
+// RUN: %clang_profgen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN
+// RUN: %clang_pgogen %s -S -emit-llvm -o - | FileCheck %s --check-prefix=PROFGEN
+
+// Testing profile use.
+// RUN: %clang_pgouse=%S/Inputs/instrprof-api.c.profdata %s -S -emit-llvm -o - \
+// RUN:   | FileCheck %s --check-prefix=PROFUSE
+
+#include "profile/instr_prof_interface.h"
+
+__attribute__((noinline)) int bar() { retu...
[truncated]

@minglotus-6
Copy link
Contributor

thanks for the fix!

One test Clang :: Preprocessor/init.c failed (https://lab.llvm.org/buildbot/#/builders/74/builds/24843) and the error is due to mismatched macros (if I'm reading correctly wants __LONG_LONG_MAX__ right after __LLONG_WIDTH__ but sees __LLVM_INSTR_PROFILE_GENERATE in the middle). I'm wondering if the test (and other tests like Preprocessor/init-aarch64.c) needs update (or if they should be fixed by #77546 (comment))

@qiongsiwu
Copy link
Contributor Author

thanks for the fix!

One test Clang :: Preprocessor/init.c failed (https://lab.llvm.org/buildbot/#/builders/74/builds/24843) and the error is due to mismatched macros (if I'm reading correctly wants __LONG_LONG_MAX__ right after __LLONG_WIDTH__ but sees __LLVM_INSTR_PROFILE_GENERATE in the middle). I'm wondering if the test (and other tests like Preprocessor/init-aarch64.c) needs update (or if they should be fixed by #77546 (comment))

Thanks for the fast response @minglotus-6 ! The preprocessor init tests were failing due to a mis-optimization introduced recently. Due to the misoptimization, hasProfileInstr() (here) always returned true and hence the __LLVM_INSTR_PROFILE_GENERATE macros were always inserted when it should not show up. As noted in the comment, #77831 fixed the mis-optimization. hasProfileInstr() is slightly rewritten in this PR as well to avoid triggering the pattern that was mis-optimized.

That said this PR still has some issues. Four seemingly unrelated tests are failing the pre-commit CI on Linux and I am investigating. I will update this PR once these failures are sorted out! Thanks for your patience!

Failed Tests (4):
  cfi-devirt-lld-thinlto-x86_64 :: cross-dso/icall/dlopen.cpp
  cfi-devirt-lld-x86_64 :: cross-dso/icall/dlopen.cpp
  cfi-standalone-lld-thinlto-x86_64 :: cross-dso/icall/dlopen.cpp
  cfi-standalone-lld-x86_64 :: cross-dso/icall/dlopen.cpp

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Jan 17, 2024

It seems that either the precommit CI has some problems, or there are existing compiler-rt breakages. #78037 is also failing the same test cases. The changes in #78037 do not look harmful either. I took a brief look at the current buildbots and I don't see any similar failures. Additionally, I am not able to reproduce these failures locally.

I will see what can be investigated next, but I suspect at this PR is not the root cause of the dlopen.cpp failures.

@qiongsiwu
Copy link
Contributor Author

Ok I think I have a good theory on why these tests are failing. Most likely the test machine does not have the latest lld during the test. If we build lld together with the project, these tests pass.

@w2yehia
Copy link
Contributor

w2yehia commented Jan 22, 2024

@qiongsiwu explained to me offline that the issue on Windows is that calls __llvm_orderfile_dump fail so he's not able to execute the instrumented program if it calls that function.
Regarding the compiler-rt/test/profile/instrprof-api.c test. The disadvantage of having a profdata file is that it's hard to debug/reproduce; the Inputs directory currently has no profdata files, it's all source/text files. So, it's better to avoid storing the profdata.
We can disable the test on Windows.
Or keep the Windows coverage and #ifdef the __llvm_orderfile_dump API call on Windows (which will require versioning the expected output using LIT substitution magic).

@qiongsiwu
Copy link
Contributor Author

Thanks for to comments/suggestions @w2yehia ! The test is modified so that

  1. The profile files are generated.
  2. The test does not support Windows to avoid known limitations on Windows.

@qiongsiwu
Copy link
Contributor Author

Hi @vitalybuka ! I am landing this PR. The precommit CI failures do not seem to related to changes in this PR. Please let me know if you see failures and I will revert. Thanks so much!

@qiongsiwu qiongsiwu merged commit 4f21fb8 into llvm:main Jan 22, 2024
5 of 6 checks passed
@rorth
Copy link
Collaborator

rorth commented Jan 23, 2024

This patch broke both the Solaris/sparcv9 and Solaris/amd64 buildbots.

If it really requires recent lld, this is guaranteed to not work: lld hasn't even been ported to Solaris. Similar issues most likely will exist on other targets where the bot either doesn't build lld or it isn't supported at all.

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Jan 23, 2024

This patch broke both the Solaris/sparcv9 and Solaris/amd64 buildbots.

If it really requires recent lld, this is guaranteed to not work: lld hasn't even been ported to Solaris. Similar issues most likely will exist on other targets where the bot either doesn't build lld or it isn't supported at all.

Thanks for the note @rorth . I will investigate today. This PR in itself does not require lld. The test seems to be failing for a different reason. Most likely the test is failing because __llvm_orderfile_dump() is not supported on Solaris. I will revert sometime today if I cannot post a reasonable update.

@qiongsiwu
Copy link
Contributor Author

@w2yehia I think a reasonable way to go forward is to remove the testing of __llvm_orderfile_dump from the main test, and add a separate test that runs only on Darwin.

What do you think?

@qiongsiwu
Copy link
Contributor Author

@w2yehia and I discussed, and we will remove __llvm_orderfile_dump from the main test for now.

@qiongsiwu
Copy link
Contributor Author

#79150 is posted to avoid calling _llvm_orderfile_dump() in the test.

qiongsiwu added a commit that referenced this pull request Jan 23, 2024
…test (#79150)

#78285 added a test which calls
`__llvm_orderfile_dump()`, a functionality that is not supported on many
platforms. This PR removes the call to `__llvm_orderfile_dump()` to
avoid it failing on unsupported platforms, and turn on the test for
Windows, so we test the rest of the API that are supported.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category clang-tidy clang-tools-extra compiler-rt PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants