Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[memprof] Reduce schema for Version2 #89876

Merged
merged 4 commits into from
Apr 24, 2024

Conversation

kazutakahirata
Copy link
Contributor

Curently, the compiler only uses several fields of MemoryInfoBlock.
Serializing all fields into the indexed MemProf file simply wastes
storage.

This patch limits the schema down to four fields for Version2 by
default. It retains the old behavior of serializing all fields via:

llvm-profdata merge --memprof-version=2 --memprof-full-schema

This patch reduces the size of the indexed MemProf profile I have by
40% (1.6GB down to 1.0GB).

Curently, the compiler only uses several fields of MemoryInfoBlock.
Serializing all fields into the indexed MemProf file simply wastes
storage.

This patch limits the schema down to four fields for Version2 by
default.  It retains the old behavior of serializing all fields via:

  llvm-profdata merge --memprof-version=2 --memprof-full-schema

This patch reduces the size of the indexed MemProf profile I have by
40% (1.6GB down to 1.0GB).
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Apr 24, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 24, 2024

@llvm/pr-subscribers-pgo

Author: Kazu Hirata (kazutakahirata)

Changes

Curently, the compiler only uses several fields of MemoryInfoBlock.
Serializing all fields into the indexed MemProf file simply wastes
storage.

This patch limits the schema down to four fields for Version2 by
default. It retains the old behavior of serializing all fields via:

llvm-profdata merge --memprof-version=2 --memprof-full-schema

This patch reduces the size of the indexed MemProf profile I have by
40% (1.6GB down to 1.0GB).


Full diff: https://github.com/llvm/llvm-project/pull/89876.diff

5 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/InstrProfWriter.h (+5-1)
  • (modified) llvm/include/llvm/ProfileData/MemProf.h (+8-1)
  • (modified) llvm/lib/ProfileData/InstrProfWriter.cpp (+16-9)
  • (modified) llvm/test/tools/llvm-profdata/memprof-merge-v0.test (+3)
  • (modified) llvm/tools/llvm-profdata/llvm-profdata.cpp (+6-1)
diff --git a/llvm/include/llvm/ProfileData/InstrProfWriter.h b/llvm/include/llvm/ProfileData/InstrProfWriter.h
index b0ae8f364fcafb..7202ce67f0ddfd 100644
--- a/llvm/include/llvm/ProfileData/InstrProfWriter.h
+++ b/llvm/include/llvm/ProfileData/InstrProfWriter.h
@@ -85,11 +85,15 @@ class InstrProfWriter {
   // The MemProf version we should write.
   memprof::IndexedVersion MemProfVersionRequested;
 
+  // Whether to serialize the full schema.
+  bool MemProfFullSchema;
+
 public:
   InstrProfWriter(
       bool Sparse = false, uint64_t TemporalProfTraceReservoirSize = 0,
       uint64_t MaxTemporalProfTraceLength = 0, bool WritePrevVersion = false,
-      memprof::IndexedVersion MemProfVersionRequested = memprof::Version0);
+      memprof::IndexedVersion MemProfVersionRequested = memprof::Version0,
+      bool MemProfFullSchema = false);
   ~InstrProfWriter();
 
   StringMap<ProfilingData> &getProfileData() { return FunctionData; }
diff --git a/llvm/include/llvm/ProfileData/MemProf.h b/llvm/include/llvm/ProfileData/MemProf.h
index 37019bcab5448d..9557f50ea4e65b 100644
--- a/llvm/include/llvm/ProfileData/MemProf.h
+++ b/llvm/include/llvm/ProfileData/MemProf.h
@@ -117,7 +117,7 @@ struct PortableMemInfoBlock {
   void clear() { *this = PortableMemInfoBlock(); }
 
   // Returns the full schema currently in use.
-  static MemProfSchema getSchema() {
+  static MemProfSchema getFullSchema() {
     MemProfSchema List;
 #define MIBEntryDef(NameTag, Name, Type) List.push_back(Meta::Name);
 #include "llvm/ProfileData/MIBEntryDef.inc"
@@ -125,6 +125,13 @@ struct PortableMemInfoBlock {
     return List;
   }
 
+  // Returns the schema consisting of the fields currently consumed by the
+  // compiler.
+  static MemProfSchema getSchema() {
+    return {Meta::AllocCount, Meta::TotalSize, Meta::TotalLifetime,
+            Meta::TotalLifetimeAccessDensity};
+  }
+
   bool operator==(const PortableMemInfoBlock &Other) const {
 #define MIBEntryDef(NameTag, Name, Type)                                       \
   if (Other.get##Name() != get##Name())                                        \
diff --git a/llvm/lib/ProfileData/InstrProfWriter.cpp b/llvm/lib/ProfileData/InstrProfWriter.cpp
index 4a6fc9d64b6900..55023aa9b6cfd3 100644
--- a/llvm/lib/ProfileData/InstrProfWriter.cpp
+++ b/llvm/lib/ProfileData/InstrProfWriter.cpp
@@ -184,12 +184,13 @@ class InstrProfRecordWriterTrait {
 InstrProfWriter::InstrProfWriter(
     bool Sparse, uint64_t TemporalProfTraceReservoirSize,
     uint64_t MaxTemporalProfTraceLength, bool WritePrevVersion,
-    memprof::IndexedVersion MemProfVersionRequested)
+    memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema)
     : Sparse(Sparse), MaxTemporalProfTraceLength(MaxTemporalProfTraceLength),
       TemporalProfTraceReservoirSize(TemporalProfTraceReservoirSize),
       InfoObj(new InstrProfRecordWriterTrait()),
       WritePrevVersion(WritePrevVersion),
-      MemProfVersionRequested(MemProfVersionRequested) {}
+      MemProfVersionRequested(MemProfVersionRequested),
+      MemProfFullSchema(MemProfFullSchema) {}
 
 InstrProfWriter::~InstrProfWriter() { delete InfoObj; }
 
@@ -507,7 +508,7 @@ static Error writeMemProfV0(
   OS.write(0ULL); // Reserve space for the memprof frame payload offset.
   OS.write(0ULL); // Reserve space for the memprof frame table offset.
 
-  auto Schema = memprof::PortableMemInfoBlock::getSchema();
+  auto Schema = memprof::PortableMemInfoBlock::getFullSchema();
   writeMemProfSchema(OS, Schema);
 
   uint64_t RecordTableOffset =
@@ -533,7 +534,7 @@ static Error writeMemProfV1(
   OS.write(0ULL); // Reserve space for the memprof frame payload offset.
   OS.write(0ULL); // Reserve space for the memprof frame table offset.
 
-  auto Schema = memprof::PortableMemInfoBlock::getSchema();
+  auto Schema = memprof::PortableMemInfoBlock::getFullSchema();
   writeMemProfSchema(OS, Schema);
 
   uint64_t RecordTableOffset =
@@ -554,7 +555,8 @@ static Error writeMemProfV2(
         &MemProfRecordData,
     llvm::MapVector<memprof::FrameId, memprof::Frame> &MemProfFrameData,
     llvm::MapVector<memprof::CallStackId, llvm::SmallVector<memprof::FrameId>>
-        &MemProfCallStackData) {
+        &MemProfCallStackData,
+    bool MemProfFullSchema) {
   OS.write(memprof::Version2);
   uint64_t HeaderUpdatePos = OS.tell();
   OS.write(0ULL); // Reserve space for the memprof record table offset.
@@ -563,7 +565,11 @@ static Error writeMemProfV2(
   OS.write(0ULL); // Reserve space for the memprof call stack payload offset.
   OS.write(0ULL); // Reserve space for the memprof call stack table offset.
 
-  auto Schema = memprof::PortableMemInfoBlock::getSchema();
+  memprof::MemProfSchema Schema;
+  if (MemProfFullSchema)
+    Schema = memprof::PortableMemInfoBlock::getFullSchema();
+  else
+    Schema = memprof::PortableMemInfoBlock::getSchema();
   writeMemProfSchema(OS, Schema);
 
   uint64_t RecordTableOffset =
@@ -605,7 +611,7 @@ static Error writeMemProf(
     llvm::MapVector<memprof::FrameId, memprof::Frame> &MemProfFrameData,
     llvm::MapVector<memprof::CallStackId, llvm::SmallVector<memprof::FrameId>>
         &MemProfCallStackData,
-    memprof::IndexedVersion MemProfVersionRequested) {
+    memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema) {
 
   switch (MemProfVersionRequested) {
   case memprof::Version0:
@@ -614,7 +620,7 @@ static Error writeMemProf(
     return writeMemProfV1(OS, MemProfRecordData, MemProfFrameData);
   case memprof::Version2:
     return writeMemProfV2(OS, MemProfRecordData, MemProfFrameData,
-                          MemProfCallStackData);
+                          MemProfCallStackData, MemProfFullSchema);
   }
 
   return make_error<InstrProfError>(
@@ -733,7 +739,8 @@ Error InstrProfWriter::writeImpl(ProfOStream &OS) {
   if (static_cast<bool>(ProfileKind & InstrProfKind::MemProf)) {
     MemProfSectionStart = OS.tell();
     if (auto E = writeMemProf(OS, MemProfRecordData, MemProfFrameData,
-                              MemProfCallStackData, MemProfVersionRequested))
+                              MemProfCallStackData, MemProfVersionRequested,
+                              MemProfFullSchema))
       return E;
   }
 
diff --git a/llvm/test/tools/llvm-profdata/memprof-merge-v0.test b/llvm/test/tools/llvm-profdata/memprof-merge-v0.test
index 03ccbdd42efdad..28f65e0781bc63 100644
--- a/llvm/test/tools/llvm-profdata/memprof-merge-v0.test
+++ b/llvm/test/tools/llvm-profdata/memprof-merge-v0.test
@@ -16,6 +16,9 @@ RUN: llvm-profdata show %t.prof.v1 | FileCheck %s
 RUN: llvm-profdata merge %t.proftext %p/Inputs/basic.memprofraw --memprof-version=2 --profiled-binary %p/Inputs/basic.memprofexe -o %t.prof.v2
 RUN: llvm-profdata show %t.prof.v2 | FileCheck %s
 
+RUN: llvm-profdata merge %t.proftext %p/Inputs/basic.memprofraw --memprof-version=2 --memprof-full-schema --profiled-binary %p/Inputs/basic.memprofexe -o %t.prof.v2
+RUN: llvm-profdata show %t.prof.v2 | FileCheck %s
+
 For now we only check the validity of the instrumented profile since we don't
 have a way to display the contents of the memprof indexed format yet.
 
diff --git a/llvm/tools/llvm-profdata/llvm-profdata.cpp b/llvm/tools/llvm-profdata/llvm-profdata.cpp
index 78daf9f7dc109a..fa85bbc2e56a88 100644
--- a/llvm/tools/llvm-profdata/llvm-profdata.cpp
+++ b/llvm/tools/llvm-profdata/llvm-profdata.cpp
@@ -308,6 +308,11 @@ cl::opt<memprof::IndexedVersion> MemProfVersionRequested(
                clEnumValN(memprof::Version1, "1", "version 1"),
                clEnumValN(memprof::Version2, "2", "version 2")));
 
+cl::opt<bool> MemProfFullSchema("memprof-full-schema", cl::Hidden,
+                                cl::sub(MergeSubcommand),
+                                cl::desc("Serialize the full schema"),
+                                cl::init(false));
+
 // Options specific to overlap subcommand.
 cl::opt<std::string> BaseFilename(cl::Positional, cl::Required,
                                   cl::desc("<base profile file>"),
@@ -600,7 +605,7 @@ struct WriterContext {
                 SmallSet<instrprof_error, 4> &WriterErrorCodes,
                 uint64_t ReservoirSize = 0, uint64_t MaxTraceLength = 0)
       : Writer(IsSparse, ReservoirSize, MaxTraceLength, DoWritePrevVersion,
-               MemProfVersionRequested),
+               MemProfVersionRequested, MemProfFullSchema),
         ErrLock(ErrLock), WriterErrorCodes(WriterErrorCodes) {}
 };
 

@@ -16,6 +16,9 @@ RUN: llvm-profdata show %t.prof.v1 | FileCheck %s
RUN: llvm-profdata merge %t.proftext %p/Inputs/basic.memprofraw --memprof-version=2 --profiled-binary %p/Inputs/basic.memprofexe -o %t.prof.v2
RUN: llvm-profdata show %t.prof.v2 | FileCheck %s

RUN: llvm-profdata merge %t.proftext %p/Inputs/basic.memprofraw --memprof-version=2 --memprof-full-schema --profiled-binary %p/Inputs/basic.memprofexe -o %t.prof.v2
RUN: llvm-profdata show %t.prof.v2 | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to check the output to confirm that we either serialized 4 fields or all fields?

Also, just noticed that this test has "v0" in the name, should the name be changed to reflect the fact that it is testing all versions? (not in this PR but separately)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to check the output to confirm that we either serialized 4 fields or all fields?

Yes, let me try to extend llvm/unittests/ProfileData/InstrProfTest.cpp to verify that the roundtrip works in each case -- the 4 fields or all fields.

Also, just noticed that this test has "v0" in the name, should the name be changed to reflect the fact that it is testing all versions? (not in this PR but separately)

Yes, I just borrowed a test memprof-merge.test and added -v0 to it. I'll come up with a better name in a separate patch.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question about testing below.

It also occurs to me that we should do something on the profile use side in the compiler to detect which fields were in the profile, and give an error if we try to use other fields. I have a bit of concern that if someone tries to experiment with compiler changes on the use side that make use of additional fields and forgets about this change that they will get 0s for other fields and it silently won't behave as expected. E.g. maybe in llvm/include/llvm/ProfileData/MemProf.h possibly save the Schema used in deserialize, and assert in the get* functions if a field is used that wasn't in that set?

@kazutakahirata
Copy link
Contributor Author

It also occurs to me that we should do something on the profile use side in the compiler to detect which fields were in the profile, and give an error if we try to use other fields. I have a bit of concern that if someone tries to experiment with compiler changes on the use side that make use of additional fields and forgets about this change that they will get 0s for other fields and it silently won't behave as expected. E.g. maybe in llvm/include/llvm/ProfileData/MemProf.h possibly save the Schema used in deserialize, and assert in the get* functions if a field is used that wasn't in that set?

I understand your concern. Yes, get* functions seem to be the way to go. Let me think about the details.

llvm/include/llvm/ProfileData/MemProf.h Outdated Show resolved Hide resolved
@@ -117,14 +117,21 @@ struct PortableMemInfoBlock {
void clear() { *this = PortableMemInfoBlock(); }

// Returns the full schema currently in use.
static MemProfSchema getSchema() {
static MemProfSchema getFullSchema() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason that these are part of the PortableMemInfoBlock? IMO they should be peers of the Schema itself (under llvm::memprof).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll post a separate patch for that.

llvm/lib/ProfileData/InstrProfWriter.cpp Outdated Show resolved Hide resolved
llvm/tools/llvm-profdata/llvm-profdata.cpp Outdated Show resolved Hide resolved
@snehasish
Copy link
Contributor

lgtm, pending the unittest for the schema based roundtrip.

- Rename getSchema to getHotColdSchema.

- Add a unit test for the full/partial schema.

- Update the description for the memprof-full-schema option.
Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New unit test lgtm with one minor comment. Thanks!

llvm/unittests/ProfileData/InstrProfTest.cpp Show resolved Hide resolved
@kazutakahirata kazutakahirata merged commit 4c8ec8f into llvm:main Apr 24, 2024
3 of 4 checks passed
@kazutakahirata kazutakahirata deleted the pr_memprof_schema branch April 24, 2024 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants