Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ThinLTO][TypeProfiling]Add annotated vtable GUID as referenced variables in per function summary, and update bitcode writer to create value-ids for these referenced vtables #79234

Conversation

minglotus-6
Copy link
Contributor

@minglotus-6 minglotus-6 commented Jan 24, 2024

Test Plan: llvm regression rest. The code is also tested on WSC workloads.

Created using spr 1.3.4
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 24, 2024

@llvm/pr-subscribers-lto

@llvm/pr-subscribers-llvm-analysis

Author: Mingming Liu (minglotus-6)

Changes

Test Plan: llvm regression rest. The code is also tested on WSC workloads.


Full diff: https://github.com/llvm/llvm-project/pull/79234.diff

5 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+11-1)
  • (modified) llvm/lib/Analysis/ModuleSummaryAnalysis.cpp (+16)
  • (modified) llvm/lib/Bitcode/Writer/BitcodeWriter.cpp (+11-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+46-21)
  • (modified) llvm/test/Bitcode/thinlto-function-summary-vtableref-pgo.ll (+18-19)
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index d936f38741e1e3a..248f62c7a810592 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -283,7 +283,7 @@ void annotateValueSite(Module &M, Instruction &Inst,
 
 /// Extract the value profile data from \p Inst which is annotated with
 /// value profile meta data. Return false if there is no value data annotated,
-/// otherwise  return true.
+/// otherwise return true.
 bool getValueProfDataFromInst(const Instruction &Inst,
                               InstrProfValueKind ValueKind,
                               uint32_t MaxNumValueData,
@@ -291,6 +291,16 @@ bool getValueProfDataFromInst(const Instruction &Inst,
                               uint32_t &ActualNumValueData, uint64_t &TotalC,
                               bool GetNoICPValue = false);
 
+/// Extract the value profile data from \p Inst and returns them if \p Inst is
+/// annotated with value profile data. Returns nullptr otherwise. It's similar
+/// to `getValueProfDataFromInst` above except that an array is allocated only
+/// after a preliminary checking that the value profiles of kind `ValueKind`
+/// exist.
+std::unique_ptr<InstrProfValueData[]>
+getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
+                         uint32_t MaxNumValueData, uint32_t &ActualNumValueData,
+                         uint64_t &TotalC, bool GetNoICPValue = false);
+
 inline StringRef getPGOFuncNameMetadataName() { return "PGOFuncName"; }
 
 /// Return the PGOFuncName meta data associated with a function.
diff --git a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
index 1f15e94783240a7..cb074edeb92488d 100644
--- a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
+++ b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
@@ -124,6 +124,22 @@ static bool findRefEdges(ModuleSummaryIndex &Index, const User *CurUser,
         Worklist.push_back(Operand);
     }
   }
+
+  const Instruction *I = dyn_cast<Instruction>(CurUser);
+  if (I) {
+    uint32_t ActualNumValueData = 0;
+    uint64_t TotalCount = 0;
+    auto ValueDataArray = getValueProfDataFromInst(
+        *I, IPVK_VTableTarget, 24 /* MaxNumValueData */, ActualNumValueData,
+        TotalCount);
+
+    if (ValueDataArray.get()) {
+      for (uint32_t j = 0; j < ActualNumValueData; j++) {
+        RefEdges.insert(Index.getOrInsertValueInfo(
+            ValueDataArray[j].Value /* VTableGUID */));
+      }
+    }
+  }
   return HasBlockAddress;
 }
 
diff --git a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
index a5fc267b1883bfe..432a99d35efeab4 100644
--- a/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -199,7 +199,7 @@ class ModuleBitcodeWriterBase : public BitcodeWriterBase {
     for (const auto &GUIDSummaryLists : *Index)
       // Examine all summaries for this GUID.
       for (auto &Summary : GUIDSummaryLists.second.SummaryList)
-        if (auto FS = dyn_cast<FunctionSummary>(Summary.get()))
+        if (auto FS = dyn_cast<FunctionSummary>(Summary.get())) {
           // For each call in the function summary, see if the call
           // is to a GUID (which means it is for an indirect call,
           // otherwise we would have a Value for it). If so, synthesize
@@ -207,6 +207,15 @@ class ModuleBitcodeWriterBase : public BitcodeWriterBase {
           for (auto &CallEdge : FS->calls())
             if (!CallEdge.first.haveGVs() || !CallEdge.first.getValue())
               assignValueId(CallEdge.first.getGUID());
+
+          // For each referenced variables in the function summary, see if the
+          // variable is represented by a GUID (as opposed to a symbol to
+          // declarations or definitions in the module). If so, sythesize a
+          // value id.
+          for (auto &RefEdge : FS->refs())
+            if ((!RefEdge.haveGVs() || !RefEdge.getValue()))
+              assignValueId(RefEdge.getGUID());
+        }
   }
 
 protected:
@@ -4071,7 +4080,7 @@ void ModuleBitcodeWriterBase::writePerModuleFunctionSummaryRecord(
   NameVals.push_back(SpecialRefCnts.second); // worefcnt
 
   for (auto &RI : FS->refs())
-    NameVals.push_back(VE.getValueID(RI.getValue()));
+    NameVals.push_back(getValueId(RI));
 
   const bool UseRelBFRecord =
       WriteRelBFToSummary && !F.hasProfileData() &&
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 598f3fafb2bd90f..925e0fb67e1ee20 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -1285,46 +1285,44 @@ void annotateValueSite(Module &M, Instruction &Inst,
   Inst.setMetadata(LLVMContext::MD_prof, MDNode::get(Ctx, Vals));
 }
 
-bool getValueProfDataFromInst(const Instruction &Inst,
-                              InstrProfValueKind ValueKind,
-                              uint32_t MaxNumValueData,
-                              InstrProfValueData ValueData[],
-                              uint32_t &ActualNumValueData, uint64_t &TotalC,
-                              bool GetNoICPValue) {
+MDNode *mayHaveValueProfileOfKind(const Instruction &Inst,
+                                  InstrProfValueKind ValueKind) {
   MDNode *MD = Inst.getMetadata(LLVMContext::MD_prof);
   if (!MD)
-    return false;
+    return nullptr;
 
-  unsigned NOps = MD->getNumOperands();
+  if (MD->getNumOperands() < 5)
+    return nullptr;
 
-  if (NOps < 5)
-    return false;
-
-  // Operand 0 is a string tag "VP":
   MDString *Tag = cast<MDString>(MD->getOperand(0));
-  if (!Tag)
-    return false;
-
-  if (!Tag->getString().equals("VP"))
-    return false;
+  if (!Tag || !Tag->getString().equals("VP"))
+    return nullptr;
 
   // Now check kind:
   ConstantInt *KindInt = mdconst::dyn_extract<ConstantInt>(MD->getOperand(1));
   if (!KindInt)
-    return false;
+    return nullptr;
   if (KindInt->getZExtValue() != ValueKind)
-    return false;
+    return nullptr;
+
+  return MD;
+}
 
+static bool getValueProfDataFromInst(const MDNode *const MD,
+                                     const uint32_t MaxNumDataWant,
+                                     InstrProfValueData ValueData[],
+                                     uint32_t &ActualNumValueData,
+                                     uint64_t &TotalC, bool GetNoICPValue) {
+  const unsigned NOps = MD->getNumOperands();
   // Get total count
   ConstantInt *TotalCInt = mdconst::dyn_extract<ConstantInt>(MD->getOperand(2));
   if (!TotalCInt)
     return false;
   TotalC = TotalCInt->getZExtValue();
-
   ActualNumValueData = 0;
 
   for (unsigned I = 3; I < NOps; I += 2) {
-    if (ActualNumValueData >= MaxNumValueData)
+    if (ActualNumValueData >= MaxNumDataWant)
       break;
     ConstantInt *Value = mdconst::dyn_extract<ConstantInt>(MD->getOperand(I));
     ConstantInt *Count =
@@ -1341,6 +1339,33 @@ bool getValueProfDataFromInst(const Instruction &Inst,
   return true;
 }
 
+std::unique_ptr<InstrProfValueData[]>
+getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
+                         uint32_t MaxNumValueData, uint32_t &ActualNumValueData,
+                         uint64_t &TotalC, bool GetNoICPValue) {
+  MDNode *MD = mayHaveValueProfileOfKind(Inst, ValueKind);
+  if (!MD)
+    return nullptr;
+  auto ValueDataArray = std::make_unique<InstrProfValueData[]>(MaxNumValueData);
+  if (!getValueProfDataFromInst(MD, MaxNumValueData, ValueDataArray.get(),
+                                ActualNumValueData, TotalC, GetNoICPValue))
+    return nullptr;
+  return ValueDataArray;
+}
+
+bool getValueProfDataFromInst(const Instruction &Inst,
+                              InstrProfValueKind ValueKind,
+                              uint32_t MaxNumValueData,
+                              InstrProfValueData ValueData[],
+                              uint32_t &ActualNumValueData, uint64_t &TotalC,
+                              bool GetNoICPValue) {
+  MDNode *MD = mayHaveValueProfileOfKind(Inst, ValueKind);
+  if (!MD)
+    return false;
+  return getValueProfDataFromInst(MD, MaxNumValueData, ValueData,
+                                  ActualNumValueData, TotalC, GetNoICPValue);
+}
+
 MDNode *getPGOFuncNameMetadata(const Function &F) {
   return F.getMetadata(getPGOFuncNameMetadataName());
 }
diff --git a/llvm/test/Bitcode/thinlto-function-summary-vtableref-pgo.ll b/llvm/test/Bitcode/thinlto-function-summary-vtableref-pgo.ll
index a68f10b6b61d3ec..a23bbad87e049b9 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-vtableref-pgo.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-vtableref-pgo.ll
@@ -1,41 +1,40 @@
 ; RUN: opt -module-summary %s -o %t.o
 
-; RUN: llvm-bcanalyzer -dump %t.o | FileCheck %s
+; RUN llvm-bcanalyzer -dump %t.o | FileCheck %s
 
 ; RUN: llvm-dis -o - %t.o | FileCheck %s --check-prefix=DIS
-
+; Round trip it through llvm-as
+; RUN: llvm-dis -o - %t.o | llvm-as -o - | llvm-dis -o - | FileCheck %s --check-prefix=DIS
 
 ; CHECK: <GLOBALVAL_SUMMARY_BLOCK
 ; CHECK-NEXT:   <VERSION op0=9/>
 ; CHECK-NEXT:   <FLAGS op0=0/>
-; The `VALUE_GUID` below represents the "_ZN4Base4funcEv" referenced by the
-; indirect call instruction.
-; CHECK-NEXT:   <VALUE_GUID op0=8 op1=5459407273543877811/>
-; <PERMODULE_PROFILE> has the format [valueid, flags, instcount, funcflags,
-;                                     numrefs, rorefcnt, worefcnt,
-;                                     n x (valueid, hotness+tailcall)]
-; CHECK-NEXT:   <PERMODULE_PROFILE abbrevid=4 op0=0 op1=0 op2=4 op3=256 op4=0 op5=0 op6=0 op7=8 op8=2/>
+; "_ZTV4Base" referenced by the instruction that loads vtable from object.
+; CHECK-NEXT:   <VALUE_GUID op0=8 op1=1960855528937986108/>
+; "_ZN4Base4funcEv" referenced by the indirect call instruction.
+; CHECK-NEXT:   <VALUE_GUID op0=7 op1=5459407273543877811/>
+; CHECK-NEXT:   <PERMODULE abbrevid=5 op0=0 op1=0 op2=4 op3=256 op4=1 op5=1 op6=0 op7=8 op8=7/>
 ; CHECK-NEXT:  </GLOBALVAL_SUMMARY_BLOCK>
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
 
-define i32 @_Z4testP4Base(ptr %0) !prof !0 {
-  %2 = load ptr, ptr %0, !prof !1
+define i32 @_Z4testP4Base(ptr %0) {
+  %2 = load ptr, ptr %0, !prof !0
   %3 = load ptr, ptr %2
-  %4 = tail call i32 %3(ptr %0), !prof !2
+  %4 = tail call i32 %3(ptr %0), !prof !1
   ret i32 %4
 }
 
-!0 = !{!"function_entry_count", i32 150}
 ; 1960855528937986108 is the MD5 hash of _ZTV4Base
-!1 = !{!"VP", i32 2, i64 1600, i64 1960855528937986108, i64 1600}
+!0 = !{!"VP", i32 2, i64 1600, i64 1960855528937986108, i64 1600}
 ; 5459407273543877811 is the MD5 hash of _ZN4Base4funcEv
-!2 = !{!"VP", i32 0, i64 1600, i64 5459407273543877811, i64 1600}
+!1 = !{!"VP", i32 0, i64 1600, i64 5459407273543877811, i64 1600}
 
 ; ModuleSummaryIndex stores <guid, global-value summary> map in std::map; so
 ; global value summares are printed out in the order that gv's guid increases.
-; DIS: ^0 = module: (path: "/usr/local/google/home/mingmingl/llvm-fork/llvm-project/build/test/Bitcode/Output/thinlto-function-summary-vtableref-pgo.ll.tmp.o", hash: (0, 0, 0, 0, 0))
-; DIS: ^1 = gv: (guid: 5459407273543877811)
-; DIS: ^2 = gv: (name: "_Z4testP4Base", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 0, canAutoHide: 0), insts: 4, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 1, mustBeUnreachable: 0), calls: ((callee: ^1, hotness: none))))) ; guid = 15857150948103218965
-; DIS: ^3 = blockcount: 0
+; DIS: ^0 = module: (path: "{{.*}}", hash: (0, 0, 0, 0, 0))
+; DIS-NEXT: ^1 = gv: (guid: 1960855528937986108)
+; DIS-NEXT: ^2 = gv: (guid: 5459407273543877811)
+; DIS-NEXT: ^3 = gv: (name: "_Z4testP4Base", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 0, canAutoHide: 0), insts: 4, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 1, mustBeUnreachable: 0), calls: ((callee: ^2, hotness: none)), refs: (readonly ^1)))) ; guid = 15857150948103218965
+; DIS-NEXT: ^4 = blockcount: 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A precommit test case is in #79233, to showcase the diff clearer.

Created using spr 1.3.4
@minglotus-6 minglotus-6 added the LTO Link time optimization (regular/full LTO or ThinLTO) label Jan 24, 2024
@minglotus-6
Copy link
Contributor Author

Hi,
Sorry for the message but I'm about to re-create a new PR for review with shorter title name, so that branch name is shorter.

I get a message that the length branch name (thereby length file names) causes some build-bot failures on Windows where there is a 256-character filename length limit (they are fixing the bot to not fetch from branch users/<github-user-id>/spr meanwhile). p.s. I just filed a feature request to spr to support explicit specification of branch name.

@minglotus-6
Copy link
Contributor Author

I'll abandon this pr in favor of #79381 (which has a shorter branch name). Going to delete this lengthy branch from llvm/llvm-project shortly after.

Sorry for the noise and inconvenience!

@minglotus-6 minglotus-6 deleted the users/minglotus-6/spr/thinltotypeprofilingadd-annotated-vtable-guid-as-referenced-variables-in-per-function-summary-and-update-bitcode-writer-to-create-value-ids-for-these-referenced-vtables branch January 24, 2024 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis LTO Link time optimization (regular/full LTO or ThinLTO)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants