Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstrProf] No linkage prefixes in IRPGO names #76994

Merged
merged 3 commits into from
Jan 5, 2024

Conversation

ellishg
Copy link
Contributor

@ellishg ellishg commented Jan 4, 2024

Change the format of IRPGO counter names to [<filepath>;]<mangled-name> which is computed by GlobalValue::getGlobalIdentifier() to fix #74565.

In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be [<filepath>;]<linkage-name> where <linkage-name> is basically F.getName() with some prefix, e.g., _ or l_ on Mach-O (yes, it is confusing that <linkage-name> is computed with Mangler().getNameWithPrefix() while <mangled-name> is just F.getName()). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix.

Since <mangled-name> may not match the <linkage-name> on some targets like Mach-O, we will need to post-process the output of llvm-profdata order before passing to the linker via -order_file.

Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.

Change the format of IRPGO counter names to `[<filepath>;]<mangled-name>` which is computed by `GlobalValue::getGlobalIdentifier()` to fix llvm#74565.

In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O (it is confusing that `<linkage-name>` is computed with `Mangler().getNameWithPrefix()` while `<mangled-name>` is just `F.getName()`). We discovered in llvm#74565 that this causes some missed import issues on some targets and llvm#74008 is a partial fix.

Since `<mangled-name>` may not match the `<linkage-name>` on some targets like Mach-O, we will need to post-process the output of `llvm-profdata order` before passing to the linker via `-order_file`.

Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch after the LLVM 18 cut which has not been released yet.
@llvmbot llvmbot added lld lld:MachO PGO Profile Guided Optimizations labels Jan 4, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 4, 2024

@llvm/pr-subscribers-lld-macho

@llvm/pr-subscribers-pgo

Author: Ellis Hoag (ellishg)

Changes

Change the format of IRPGO counter names to [&lt;filepath&gt;;]&lt;mangled-name&gt; which is computed by GlobalValue::getGlobalIdentifier() to fix #74565.

In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be [&lt;filepath&gt;;]&lt;linkage-name&gt; where &lt;linkage-name&gt; is basically F.getName() with some prefix, e.g., _ or l_ on Mach-O (yes, it is confusing that &lt;linkage-name&gt; is computed with Mangler().getNameWithPrefix() while &lt;mangled-name&gt; is just F.getName()). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix.

Since &lt;mangled-name&gt; may not match the &lt;linkage-name&gt; on some targets like Mach-O, we will need to post-process the output of llvm-profdata order before passing to the linker via -order_file.

Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.


Full diff: https://github.com/llvm/llvm-project/pull/76994.diff

4 Files Affected:

  • (modified) lld/test/MachO/pgo-warn-mismatch.ll (+1-1)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+18-34)
  • (modified) llvm/tools/llvm-profdata/llvm-profdata.cpp (+5-1)
  • (modified) llvm/unittests/ProfileData/InstrProfTest.cpp (+2-16)
diff --git a/lld/test/MachO/pgo-warn-mismatch.ll b/lld/test/MachO/pgo-warn-mismatch.ll
index 632adffb01fb3e..365eeaccb4b403 100644
--- a/lld/test/MachO/pgo-warn-mismatch.ll
+++ b/lld/test/MachO/pgo-warn-mismatch.ll
@@ -24,7 +24,7 @@ entry:
 
 ;--- cs.proftext
 :csir
-_foo
+foo
 # Func Hash:
 2277602155505015273
 # Num Counters:
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 134a400e639c4b..08126cff7d06be 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -14,7 +14,6 @@
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SetVector.h"
-#include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
@@ -27,7 +26,6 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/MDBuilder.h"
-#include "llvm/IR/Mangler.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
@@ -297,31 +295,6 @@ static StringRef getStrippedSourceFileName(const GlobalObject &GO) {
   return FileName;
 }
 
-// The PGO name has the format [<filepath>;]<linkage-name> where <filepath>; is
-// provided if linkage is local and <linkage-name> is the mangled function
-// name. The filepath is used to discriminate possibly identical function names.
-// ; is used because it is unlikely to be found in either <filepath> or
-// <linkage-name>.
-//
-// Older compilers used getPGOFuncName() which has the format
-// [<filepath>:]<function-name>. <filepath> is used to discriminate between
-// possibly identical function names when linkage is local and <function-name>
-// simply comes from F.getName(). This caused trouble for Objective-C functions
-// which commonly have :'s in their names. Also, since <function-name> is not
-// mangled, they cannot be passed to Mach-O linkers via -order_file. We still
-// need to compute this name to lookup functions from profiles built by older
-// compilers.
-static std::string
-getIRPGONameForGlobalObject(const GlobalObject &GO,
-                            GlobalValue::LinkageTypes Linkage,
-                            StringRef FileName) {
-  SmallString<64> Name;
-  // FIXME: Mangler's handling is kept outside of `getGlobalIdentifier` for now.
-  // For more details please check issue #74565.
-  Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
-}
-
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
   if (MD != nullptr) {
     StringRef S = cast<MDString>(MD->getOperand(0))->getString();
@@ -330,7 +303,17 @@ static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
   return {};
 }
 
-// Returns the PGO object name. This function has some special handling
+// The PGO name has the format [<filepath>;]<function-name> where <filepath>; is
+// provided if linkage is local and is used to discriminate possibly identical
+// function names. ";" is used because it is unlikely to be found in either
+// <filepath> or <linkage-name>.
+//
+// Older compilers used getPGOFuncName() which has the format
+// [<filepath>:]<function-name>. This caused trouble for Objective-C functions
+// which commonly have :'s in their names. We still need to compute this name to
+// lookup functions from profiles built by older compilers.
+//
+// This function has some special handling
 // when called in LTO optimization. The following only applies when calling in
 // LTO passes (when \c InLTO is true): LTO's internalization privatizes many
 // global linkage symbols. This happens after value profile annotation, but
@@ -347,7 +330,8 @@ static std::string getIRPGOObjectName(const GlobalObject &GO, bool InLTO,
                                       MDNode *PGONameMetadata) {
   if (!InLTO) {
     auto FileName = getStrippedSourceFileName(GO);
-    return getIRPGONameForGlobalObject(GO, GO.getLinkage(), FileName);
+    return GlobalValue::getGlobalIdentifier(GO.getName(), GO.getLinkage(),
+                                            FileName);
   }
 
   // In LTO mode (when InLTO is true), first check if there is a meta data.
@@ -357,7 +341,8 @@ static std::string getIRPGOObjectName(const GlobalObject &GO, bool InLTO,
   // If there is no meta data, the function must be a global before the value
   // profile annotation pass. Its current linkage may be internal if it is
   // internalized in LTO mode.
-  return getIRPGONameForGlobalObject(GO, GlobalValue::ExternalLinkage, "");
+  return GlobalValue::getGlobalIdentifier(GO.getName(),
+                                          GlobalValue::ExternalLinkage, "");
 }
 
 // Returns the IRPGO function name and does special handling when called
@@ -371,8 +356,8 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
 // The implementation is kept for profile matching from older profiles.
 // This is similar to `getIRPGOFuncName` except that this function calls
 // 'getPGOFuncName' to get a name and `getIRPGOFuncName` calls
-// 'getIRPGONameForGlobalObject'. See the difference between two callees in the
-// comments of `getIRPGONameForGlobalObject`.
+// 'getIRPGOObjectName'. See the difference between two callees in the
+// comments of `getIRPGOObjectName`.
 std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) {
   if (!InLTO) {
     auto FileName = getStrippedSourceFileName(F);
@@ -401,8 +386,7 @@ getParsedIRPGOFuncName(StringRef IRPGOFuncName) {
 StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef FileName) {
   if (FileName.empty())
     return PGOFuncName;
-  // Drop the file name including ':' or ';'. See getIRPGONameForGlobalObject as
-  // well.
+  // Drop the file name including ':' or ';'. See getIRPGOObjectName as well.
   if (PGOFuncName.starts_with(FileName))
     PGOFuncName = PGOFuncName.drop_front(FileName.size() + 1);
   return PGOFuncName;
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 12b81d411cfa91..0a611d199026d8 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -3157,7 +3157,11 @@ static int order_main(int argc, const char *argv[]) {
   BalancedPartitioning BP(Config);
   BP.run(Nodes);
 
-  WithColor::note() << "# Ordered " << Nodes.size() << " functions\n";
+  OS << "# Ordered " << Nodes.size() << " functions\n";
+  OS << "# Warning: Mach-O prefixes symbols with \"_\" or \"l_\" depending on "
+        "the linkage and this output does not take this into account. Some "
+        "post processing may be required before passing to the linker via "
+        "-order_file.\n";
   for (auto &N : Nodes) {
     auto [Filename, ParsedFuncName] =
         getParsedIRPGOFuncName(Reader->getSymtab().getFuncOrVarName(N.Id));
diff --git a/llvm/unittests/ProfileData/InstrProfTest.cpp b/llvm/unittests/ProfileData/InstrProfTest.cpp
index 6a71a975fbb12d..8ffb68de7a2d20 100644
--- a/llvm/unittests/ProfileData/InstrProfTest.cpp
+++ b/llvm/unittests/ProfileData/InstrProfTest.cpp
@@ -542,22 +542,14 @@ TEST_F(InstrProfTest, test_memprof_merge) {
 TEST_F(InstrProfTest, test_irpgo_function_name) {
   LLVMContext Ctx;
   auto M = std::make_unique<Module>("MyModule.cpp", Ctx);
-  // Use Mach-O mangling so that non-private symbols get a `_` prefix.
-  M->setDataLayout(DataLayout("m:o"));
   auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), /*isVarArg=*/false);
 
   std::vector<std::tuple<StringRef, Function::LinkageTypes, StringRef>> Data;
-  Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "_ExternalFoo");
+  Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "ExternalFoo");
   Data.emplace_back("InternalFoo", Function::InternalLinkage,
-                    "MyModule.cpp;_InternalFoo");
-  Data.emplace_back("PrivateFoo", Function::PrivateLinkage,
-                    "MyModule.cpp;l_PrivateFoo");
-  Data.emplace_back("WeakODRFoo", Function::WeakODRLinkage, "_WeakODRFoo");
-  // Test Objective-C symbols
+                    "MyModule.cpp;InternalFoo");
   Data.emplace_back("\01-[C dynamicFoo:]", Function::ExternalLinkage,
                     "-[C dynamicFoo:]");
-  Data.emplace_back("-<C directFoo:>", Function::ExternalLinkage,
-                    "_-<C directFoo:>");
   Data.emplace_back("\01-[C internalFoo:]", Function::InternalLinkage,
                     "MyModule.cpp;-[C internalFoo:]");
 
@@ -590,10 +582,6 @@ TEST_F(InstrProfTest, test_pgo_function_name) {
   Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "ExternalFoo");
   Data.emplace_back("InternalFoo", Function::InternalLinkage,
                     "MyModule.cpp:InternalFoo");
-  Data.emplace_back("PrivateFoo", Function::PrivateLinkage,
-                    "MyModule.cpp:PrivateFoo");
-  Data.emplace_back("WeakODRFoo", Function::WeakODRLinkage, "WeakODRFoo");
-  // Test Objective-C symbols
   Data.emplace_back("\01-[C externalFoo:]", Function::ExternalLinkage,
                     "-[C externalFoo:]");
   Data.emplace_back("\01-[C internalFoo:]", Function::InternalLinkage,
@@ -611,8 +599,6 @@ TEST_F(InstrProfTest, test_pgo_function_name) {
 TEST_F(InstrProfTest, test_irpgo_read_deprecated_names) {
   LLVMContext Ctx;
   auto M = std::make_unique<Module>("MyModule.cpp", Ctx);
-  // Use Mach-O mangling so that non-private symbols get a `_` prefix.
-  M->setDataLayout(DataLayout("m:o"));
   auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), /*isVarArg=*/false);
   auto *InternalFooF =
       Function::Create(FTy, Function::InternalLinkage, "InternalFoo", M.get());

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 4, 2024

@llvm/pr-subscribers-lld

Author: Ellis Hoag (ellishg)

Changes

Change the format of IRPGO counter names to [&lt;filepath&gt;;]&lt;mangled-name&gt; which is computed by GlobalValue::getGlobalIdentifier() to fix #74565.

In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be [&lt;filepath&gt;;]&lt;linkage-name&gt; where &lt;linkage-name&gt; is basically F.getName() with some prefix, e.g., _ or l_ on Mach-O (yes, it is confusing that &lt;linkage-name&gt; is computed with Mangler().getNameWithPrefix() while &lt;mangled-name&gt; is just F.getName()). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix.

Since &lt;mangled-name&gt; may not match the &lt;linkage-name&gt; on some targets like Mach-O, we will need to post-process the output of llvm-profdata order before passing to the linker via -order_file.

Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.


Full diff: https://github.com/llvm/llvm-project/pull/76994.diff

4 Files Affected:

  • (modified) lld/test/MachO/pgo-warn-mismatch.ll (+1-1)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+18-34)
  • (modified) llvm/tools/llvm-profdata/llvm-profdata.cpp (+5-1)
  • (modified) llvm/unittests/ProfileData/InstrProfTest.cpp (+2-16)
diff --git a/lld/test/MachO/pgo-warn-mismatch.ll b/lld/test/MachO/pgo-warn-mismatch.ll
index 632adffb01fb3e..365eeaccb4b403 100644
--- a/lld/test/MachO/pgo-warn-mismatch.ll
+++ b/lld/test/MachO/pgo-warn-mismatch.ll
@@ -24,7 +24,7 @@ entry:
 
 ;--- cs.proftext
 :csir
-_foo
+foo
 # Func Hash:
 2277602155505015273
 # Num Counters:
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 134a400e639c4b..08126cff7d06be 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -14,7 +14,6 @@
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/SetVector.h"
-#include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
@@ -27,7 +26,6 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/MDBuilder.h"
-#include "llvm/IR/Mangler.h"
 #include "llvm/IR/Metadata.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
@@ -297,31 +295,6 @@ static StringRef getStrippedSourceFileName(const GlobalObject &GO) {
   return FileName;
 }
 
-// The PGO name has the format [<filepath>;]<linkage-name> where <filepath>; is
-// provided if linkage is local and <linkage-name> is the mangled function
-// name. The filepath is used to discriminate possibly identical function names.
-// ; is used because it is unlikely to be found in either <filepath> or
-// <linkage-name>.
-//
-// Older compilers used getPGOFuncName() which has the format
-// [<filepath>:]<function-name>. <filepath> is used to discriminate between
-// possibly identical function names when linkage is local and <function-name>
-// simply comes from F.getName(). This caused trouble for Objective-C functions
-// which commonly have :'s in their names. Also, since <function-name> is not
-// mangled, they cannot be passed to Mach-O linkers via -order_file. We still
-// need to compute this name to lookup functions from profiles built by older
-// compilers.
-static std::string
-getIRPGONameForGlobalObject(const GlobalObject &GO,
-                            GlobalValue::LinkageTypes Linkage,
-                            StringRef FileName) {
-  SmallString<64> Name;
-  // FIXME: Mangler's handling is kept outside of `getGlobalIdentifier` for now.
-  // For more details please check issue #74565.
-  Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
-}
-
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
   if (MD != nullptr) {
     StringRef S = cast<MDString>(MD->getOperand(0))->getString();
@@ -330,7 +303,17 @@ static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
   return {};
 }
 
-// Returns the PGO object name. This function has some special handling
+// The PGO name has the format [<filepath>;]<function-name> where <filepath>; is
+// provided if linkage is local and is used to discriminate possibly identical
+// function names. ";" is used because it is unlikely to be found in either
+// <filepath> or <linkage-name>.
+//
+// Older compilers used getPGOFuncName() which has the format
+// [<filepath>:]<function-name>. This caused trouble for Objective-C functions
+// which commonly have :'s in their names. We still need to compute this name to
+// lookup functions from profiles built by older compilers.
+//
+// This function has some special handling
 // when called in LTO optimization. The following only applies when calling in
 // LTO passes (when \c InLTO is true): LTO's internalization privatizes many
 // global linkage symbols. This happens after value profile annotation, but
@@ -347,7 +330,8 @@ static std::string getIRPGOObjectName(const GlobalObject &GO, bool InLTO,
                                       MDNode *PGONameMetadata) {
   if (!InLTO) {
     auto FileName = getStrippedSourceFileName(GO);
-    return getIRPGONameForGlobalObject(GO, GO.getLinkage(), FileName);
+    return GlobalValue::getGlobalIdentifier(GO.getName(), GO.getLinkage(),
+                                            FileName);
   }
 
   // In LTO mode (when InLTO is true), first check if there is a meta data.
@@ -357,7 +341,8 @@ static std::string getIRPGOObjectName(const GlobalObject &GO, bool InLTO,
   // If there is no meta data, the function must be a global before the value
   // profile annotation pass. Its current linkage may be internal if it is
   // internalized in LTO mode.
-  return getIRPGONameForGlobalObject(GO, GlobalValue::ExternalLinkage, "");
+  return GlobalValue::getGlobalIdentifier(GO.getName(),
+                                          GlobalValue::ExternalLinkage, "");
 }
 
 // Returns the IRPGO function name and does special handling when called
@@ -371,8 +356,8 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
 // The implementation is kept for profile matching from older profiles.
 // This is similar to `getIRPGOFuncName` except that this function calls
 // 'getPGOFuncName' to get a name and `getIRPGOFuncName` calls
-// 'getIRPGONameForGlobalObject'. See the difference between two callees in the
-// comments of `getIRPGONameForGlobalObject`.
+// 'getIRPGOObjectName'. See the difference between two callees in the
+// comments of `getIRPGOObjectName`.
 std::string getPGOFuncName(const Function &F, bool InLTO, uint64_t Version) {
   if (!InLTO) {
     auto FileName = getStrippedSourceFileName(F);
@@ -401,8 +386,7 @@ getParsedIRPGOFuncName(StringRef IRPGOFuncName) {
 StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef FileName) {
   if (FileName.empty())
     return PGOFuncName;
-  // Drop the file name including ':' or ';'. See getIRPGONameForGlobalObject as
-  // well.
+  // Drop the file name including ':' or ';'. See getIRPGOObjectName as well.
   if (PGOFuncName.starts_with(FileName))
     PGOFuncName = PGOFuncName.drop_front(FileName.size() + 1);
   return PGOFuncName;
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 12b81d411cfa91..0a611d199026d8 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -3157,7 +3157,11 @@ static int order_main(int argc, const char *argv[]) {
   BalancedPartitioning BP(Config);
   BP.run(Nodes);
 
-  WithColor::note() << "# Ordered " << Nodes.size() << " functions\n";
+  OS << "# Ordered " << Nodes.size() << " functions\n";
+  OS << "# Warning: Mach-O prefixes symbols with \"_\" or \"l_\" depending on "
+        "the linkage and this output does not take this into account. Some "
+        "post processing may be required before passing to the linker via "
+        "-order_file.\n";
   for (auto &N : Nodes) {
     auto [Filename, ParsedFuncName] =
         getParsedIRPGOFuncName(Reader->getSymtab().getFuncOrVarName(N.Id));
diff --git a/llvm/unittests/ProfileData/InstrProfTest.cpp b/llvm/unittests/ProfileData/InstrProfTest.cpp
index 6a71a975fbb12d..8ffb68de7a2d20 100644
--- a/llvm/unittests/ProfileData/InstrProfTest.cpp
+++ b/llvm/unittests/ProfileData/InstrProfTest.cpp
@@ -542,22 +542,14 @@ TEST_F(InstrProfTest, test_memprof_merge) {
 TEST_F(InstrProfTest, test_irpgo_function_name) {
   LLVMContext Ctx;
   auto M = std::make_unique<Module>("MyModule.cpp", Ctx);
-  // Use Mach-O mangling so that non-private symbols get a `_` prefix.
-  M->setDataLayout(DataLayout("m:o"));
   auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), /*isVarArg=*/false);
 
   std::vector<std::tuple<StringRef, Function::LinkageTypes, StringRef>> Data;
-  Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "_ExternalFoo");
+  Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "ExternalFoo");
   Data.emplace_back("InternalFoo", Function::InternalLinkage,
-                    "MyModule.cpp;_InternalFoo");
-  Data.emplace_back("PrivateFoo", Function::PrivateLinkage,
-                    "MyModule.cpp;l_PrivateFoo");
-  Data.emplace_back("WeakODRFoo", Function::WeakODRLinkage, "_WeakODRFoo");
-  // Test Objective-C symbols
+                    "MyModule.cpp;InternalFoo");
   Data.emplace_back("\01-[C dynamicFoo:]", Function::ExternalLinkage,
                     "-[C dynamicFoo:]");
-  Data.emplace_back("-<C directFoo:>", Function::ExternalLinkage,
-                    "_-<C directFoo:>");
   Data.emplace_back("\01-[C internalFoo:]", Function::InternalLinkage,
                     "MyModule.cpp;-[C internalFoo:]");
 
@@ -590,10 +582,6 @@ TEST_F(InstrProfTest, test_pgo_function_name) {
   Data.emplace_back("ExternalFoo", Function::ExternalLinkage, "ExternalFoo");
   Data.emplace_back("InternalFoo", Function::InternalLinkage,
                     "MyModule.cpp:InternalFoo");
-  Data.emplace_back("PrivateFoo", Function::PrivateLinkage,
-                    "MyModule.cpp:PrivateFoo");
-  Data.emplace_back("WeakODRFoo", Function::WeakODRLinkage, "WeakODRFoo");
-  // Test Objective-C symbols
   Data.emplace_back("\01-[C externalFoo:]", Function::ExternalLinkage,
                     "-[C externalFoo:]");
   Data.emplace_back("\01-[C internalFoo:]", Function::InternalLinkage,
@@ -611,8 +599,6 @@ TEST_F(InstrProfTest, test_pgo_function_name) {
 TEST_F(InstrProfTest, test_irpgo_read_deprecated_names) {
   LLVMContext Ctx;
   auto M = std::make_unique<Module>("MyModule.cpp", Ctx);
-  // Use Mach-O mangling so that non-private symbols get a `_` prefix.
-  M->setDataLayout(DataLayout("m:o"));
   auto *FTy = FunctionType::get(Type::getVoidTy(Ctx), /*isVarArg=*/false);
   auto *InternalFooF =
       Function::Create(FTy, Function::InternalLinkage, "InternalFoo", M.get());

SmallString<64> Name;
// FIXME: Mangler's handling is kept outside of `getGlobalIdentifier` for now.
// For more details please check issue #74565.
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we just keep getIRPGONameForGlobalObject method here and simply replace line 321 with another dummy helper which is a wrapper to getName() (with some comments)? This allows more platform specific handling flexibility in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@minglotus-6 minglotus-6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Overall the approach looks good.

Just to confirm my understanding, will the profile data flow be something like this with this approach?

  • raw profiles -> llvm-profdata order -> <new-post-processing> which does rename by combining linkage information -> output used in -order_file

llvm/lib/ProfileData/InstrProf.cpp Outdated Show resolved Hide resolved
@minglotus-6
Copy link
Contributor

It just occurred to me, the compiler-rt test (

) needs an update with this pull request.

@ellishg
Copy link
Contributor Author

ellishg commented Jan 4, 2024

Just to confirm my understanding, will the profile data flow be something like this with this approach?

* raw profiles -> `llvm-profdata order` -> `<new-post-processing> which does rename by combining linkage information` -> output used in `-order_file`

Yes. In fact, I have a small python script that does this where all_symbols comes from nm -pU <binary> | cut -d' ' -f3- and ordered_symbols comes from llvm-profdata order <profile>.

fixed_ordered_symbols = [
    next(
        (
            prefix + symbol
            for prefix in ["", "_", "l_"]
            if prefix + symbol in all_symbols
        ),
        symbol,
    )
    for symbol in ordered_symbols
]

I'd also like to automatically order functions at build time when the profile is passed with -fprofile-generate. I think I'll need to do some processing to make sure the names are properly passed to the linker.

@ellishg ellishg merged commit 9a2df55 into llvm:main Jan 5, 2024
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler-rt lld:MachO lld PGO Profile Guided Optimizations
Projects
None yet
4 participants