Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ctx_profile] Profile reader and writer #91859

Merged
merged 5 commits into from
May 15, 2024
Merged

[ctx_profile] Profile reader and writer #91859

merged 5 commits into from
May 15, 2024

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented May 11, 2024

Utility converting a profile coming from compiler_rt to bitstream, and a reader.

PGOCtxProfileWriter::write would be used as the Writer parameter for __llvm_ctx_profile_fetch API. This is expected to happen in user code, for example in the RPC hanler tasked with collecting a profile, and would look like this:

// set up an output stream "Out", which could contain other stuff
{
  // constructing the Writer will start the section, in Out, containing
  // the collected contextual profiles.
  PGOCtxProfWriter Writer(Out);
  __llvm_ctx_profile_fetch(&Writer, +[](void* W, const ContextNode &N) {
    reinterpret_cast<PGOCtxProfWriter*>(W)->write(N);
  });
  // Writer going out of scope will finish up the section.
}

The reader produces a data structure suitable for maintenance during IPO transformations.

(Tracking Issue: #89287, RFC referenced there)

Utility converting a profile coming from `compiler_rt` to bitstream.

`PGOCtxProfileWriter::write` would be used as the `Writer` parameter for
`__llvm_ctx_profile_fetch` API. This is expected to happen in user code,
for example in the RPC hanler tasked with collecting a profile, and would
look like this:

```
// set up an output stream "Out", which could contain other stuff
{
  // constructing the Writer will start the section, in Out, containing
  // the collected contextual profiles.
  PGOCtxProfWriter Writer(Out);
  __llvm_ctx_profile_fetch(&Writer, +[](void* W, const ContextNode &N) {
    reinterpret_cast<PGOCtxProfWriter*>(W)->write(N);
  });
  // Writer going out of scope will finish up the section.
}
```
@mtrofin mtrofin marked this pull request as ready for review May 13, 2024 13:45
@llvmbot llvmbot added the PGO Profile Guided Optimizations label May 13, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 13, 2024

@llvm/pr-subscribers-pgo

Author: Mircea Trofin (mtrofin)

Changes

Utility converting a profile coming from compiler_rt to bitstream.

PGOCtxProfileWriter::write would be used as the Writer parameter for __llvm_ctx_profile_fetch API. This is expected to happen in user code, for example in the RPC hanler tasked with collecting a profile, and would look like this:

// set up an output stream "Out", which could contain other stuff
{
  // constructing the Writer will start the section, in Out, containing
  // the collected contextual profiles.
  PGOCtxProfWriter Writer(Out);
  __llvm_ctx_profile_fetch(&amp;Writer, +[](void* W, const ContextNode &amp;N) {
    reinterpret_cast&lt;PGOCtxProfWriter*&gt;(W)-&gt;write(N);
  });
  // Writer going out of scope will finish up the section.
}

Patch is 25.56 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91859.diff

7 Files Affected:

  • (added) llvm/include/llvm/ProfileData/PGOCtxProfReader.h (+91)
  • (added) llvm/include/llvm/ProfileData/PGOCtxProfWriter.h (+91)
  • (modified) llvm/lib/ProfileData/CMakeLists.txt (+2)
  • (added) llvm/lib/ProfileData/PGOCtxProfReader.cpp (+165)
  • (added) llvm/lib/ProfileData/PGOCtxProfWriter.cpp (+49)
  • (modified) llvm/unittests/ProfileData/CMakeLists.txt (+1)
  • (added) llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp (+266)
diff --git a/llvm/include/llvm/ProfileData/PGOCtxProfReader.h b/llvm/include/llvm/ProfileData/PGOCtxProfReader.h
new file mode 100644
index 0000000000000..e5219504163b6
--- /dev/null
+++ b/llvm/include/llvm/ProfileData/PGOCtxProfReader.h
@@ -0,0 +1,91 @@
+//===--- PGOCtxProfReader.h - Contextual profile reader ---------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+///
+/// \file
+///
+/// Reader for contextual iFDO profile, which comes in bitstream format.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_PROFILEDATA_CTXINSTRPROFILEREADER_H
+#define LLVM_PROFILEDATA_CTXINSTRPROFILEREADER_H
+
+#include "llvm/ADT/DenseSet.h"
+#include "llvm/Bitstream/BitstreamReader.h"
+#include "llvm/IR/GlobalValue.h"
+#include "llvm/ProfileData/PGOCtxProfWriter.h"
+#include "llvm/Support/Error.h"
+#include <map>
+#include <vector>
+
+namespace llvm {
+/// The loaded contextual profile, suitable for mutation during IPO passes. We
+/// generally expect a fraction of counters and of callsites to be populated.
+/// We continue to model counters as vectors, but callsites are modeled as a map
+/// of a map. The expectation is that, typically, there is a small number of
+/// indirect targets (usually, 1 for direct calls); but potentially a large
+/// number of callsites, and, as inlining progresses, the callsite count of a
+/// caller will grow.
+class PGOContextualProfile final {
+public:
+  using CallTargetMapTy = std::map<GlobalValue::GUID, PGOContextualProfile>;
+  using CallsiteMapTy = DenseMap<uint32_t, CallTargetMapTy>;
+
+private:
+  friend class PGOCtxProfileReader;
+  GlobalValue::GUID GUID = 0;
+  SmallVector<uint64_t, 16> Counters;
+  CallsiteMapTy Callsites;
+
+  PGOContextualProfile(GlobalValue::GUID G,
+                       SmallVectorImpl<uint64_t> &&Counters)
+      : GUID(G), Counters(std::move(Counters)) {}
+
+  Expected<PGOContextualProfile &>
+  getOrEmplace(uint32_t Index, GlobalValue::GUID G,
+               SmallVectorImpl<uint64_t> &&Counters);
+
+public:
+  PGOContextualProfile(const PGOContextualProfile &) = delete;
+  PGOContextualProfile &operator=(const PGOContextualProfile &) = delete;
+  PGOContextualProfile(PGOContextualProfile &&) = default;
+  PGOContextualProfile &operator=(PGOContextualProfile &&) = default;
+
+  GlobalValue::GUID guid() const { return GUID; }
+  const SmallVectorImpl<uint64_t> &counters() const { return Counters; }
+  const CallsiteMapTy &callsites() const { return Callsites; }
+  CallsiteMapTy &callsites() { return Callsites; }
+
+  bool hasCallsite(uint32_t I) const {
+    return Callsites.find(I) != Callsites.end();
+  }
+
+  const CallTargetMapTy &callsite(uint32_t I) const {
+    return Callsites.find(I)->second;
+  }
+  void getContainedGuids(DenseSet<GlobalValue::GUID> &Guids) const;
+};
+
+class PGOCtxProfileReader final {
+  BitstreamCursor &Cursor;
+  Expected<BitstreamEntry> advance();
+  Error readMetadata();
+  Error wrongValue(const Twine &);
+  Error unsupported(const Twine &);
+
+  Expected<std::pair<std::optional<uint32_t>, PGOContextualProfile>>
+  readContext(bool ExpectIndex);
+  bool canReadContext();
+
+public:
+  PGOCtxProfileReader(BitstreamCursor &Cursor) : Cursor(Cursor) {}
+
+  Expected<std::map<GlobalValue::GUID, PGOContextualProfile>> loadContexts();
+};
+} // namespace llvm
+#endif
\ No newline at end of file
diff --git a/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
new file mode 100644
index 0000000000000..65cd4d1b0d873
--- /dev/null
+++ b/llvm/include/llvm/ProfileData/PGOCtxProfWriter.h
@@ -0,0 +1,91 @@
+//===- PGOCtxProfWriter.h - Contextual Profile Writer -----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file declares a utility for writing a contextual profile to bitstream.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_PROFILEDATA_PGOCTXPROFWRITER_H_
+#define LLVM_PROFILEDATA_PGOCTXPROFWRITER_H_
+
+#include "llvm/Bitstream/BitstreamWriter.h"
+#include "llvm/ProfileData/CtxInstrContextNode.h"
+
+namespace llvm {
+enum PGOCtxProfileRecords { Invalid = 0, Version, Guid, CalleeIndex, Counters };
+
+enum PGOCtxProfileBlockIDs {
+  ProfileMetadataBlockID = 100,
+  ContextNodeBlockID = ProfileMetadataBlockID + 1
+};
+
+/// Write one or more ContextNodes to the provided raw_fd_stream.
+/// The caller must destroy the PGOCtxProfileWriter object before closing the
+/// stream.
+/// The design allows serializing a bunch of contexts embedded in some other
+/// file. The overall format is:
+///
+///  [... other data written to the stream...]
+///  SubBlock(ProfileMetadataBlockID)
+///   Version
+///   SubBlock(ContextNodeBlockID)
+///     [RECORDS]
+///     SubBlock(ContextNodeBlockID)
+///       [RECORDS]
+///       [... more SubBlocks]
+///     EndBlock
+///   EndBlock
+///
+/// The "RECORDS" are bitsream records. The IDs are in CtxProfileCodes (except)
+/// for Version, which is just for metadata). All contexts will have Guid and
+/// Counters, and all but the roots have CalleeIndex. The order in which the
+/// records appear does not matter, but they must precede any subcontexts,
+/// because that helps keep the reader code simpler.
+///
+/// Subblock containment captures the context->subcontext relationship. The
+/// "next()" relationship in the raw profile, between call targets of indirect
+/// calls, are just modeled as peer subblocks where the callee index is the
+/// same.
+///
+/// Versioning: the writer may produce additional records not known by the
+/// reader. The version number indicates a more structural change.
+/// The current version, in particular, is set up to expect optional extensions
+/// like value profiling - which would appear as additional records. For
+/// example, value profiling would produce a new record with a new record ID,
+/// containing the profiled values (much like the counters)
+class PGOCtxProfileWriter final {
+  SmallVector<char, 1 << 20> Buff;
+  BitstreamWriter Writer;
+
+  void writeCounters(const ctx_profile::ContextNode &Node);
+  void writeImpl(std::optional<uint32_t> CallerIndex,
+                 const ctx_profile::ContextNode &Node);
+
+public:
+  PGOCtxProfileWriter(raw_fd_stream &Out,
+                      std::optional<unsigned> VersionOverride = std::nullopt)
+      : Writer(Buff, &Out, 0) {
+    Writer.EnterSubblock(PGOCtxProfileBlockIDs::ProfileMetadataBlockID,
+                         CodeLen);
+    const auto Version = VersionOverride ? *VersionOverride : CurrentVersion;
+    Writer.EmitRecord(PGOCtxProfileRecords::Version,
+                      SmallVector<unsigned, 1>({Version}));
+  }
+
+  ~PGOCtxProfileWriter() { Writer.ExitBlock(); }
+
+  void write(const ctx_profile::ContextNode &);
+
+  // constants used in writing which a reader may find useful.
+  static constexpr unsigned CodeLen = 2;
+  static constexpr uint32_t CurrentVersion = 1;
+  static constexpr unsigned VBREncodingBits = 6;
+};
+
+} // namespace llvm
+#endif
\ No newline at end of file
diff --git a/llvm/lib/ProfileData/CMakeLists.txt b/llvm/lib/ProfileData/CMakeLists.txt
index 408f9ff01ec87..2397eebaf7b19 100644
--- a/llvm/lib/ProfileData/CMakeLists.txt
+++ b/llvm/lib/ProfileData/CMakeLists.txt
@@ -7,6 +7,8 @@ add_llvm_component_library(LLVMProfileData
   ItaniumManglingCanonicalizer.cpp
   MemProf.cpp
   MemProfReader.cpp
+  PGOCtxProfReader.cpp
+  PGOCtxProfWriter.cpp
   ProfileSummaryBuilder.cpp
   SampleProf.cpp
   SampleProfReader.cpp
diff --git a/llvm/lib/ProfileData/PGOCtxProfReader.cpp b/llvm/lib/ProfileData/PGOCtxProfReader.cpp
new file mode 100644
index 0000000000000..fa9f57f90faa5
--- /dev/null
+++ b/llvm/lib/ProfileData/PGOCtxProfReader.cpp
@@ -0,0 +1,165 @@
+//===- PGOCtxProfReader.cpp - Contextual Instrumentation profile reader ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Read a contextual profile into a datastructure suitable for maintenance
+// throughout IPO
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ProfileData/PGOCtxProfReader.h"
+#include "llvm/Bitstream/BitCodeEnums.h"
+#include "llvm/Bitstream/BitstreamReader.h"
+#include "llvm/ProfileData/PGOCtxProfWriter.h"
+#include "llvm/Support/Errc.h"
+#include "llvm/Support/Error.h"
+
+using namespace llvm;
+
+#define EXPECT_OR_RET(LHS, RHS)                                                \
+  auto LHS = RHS;                                                              \
+  if (!LHS)                                                                    \
+    return LHS.takeError();
+
+#define RET_ON_ERR(EXPR)                                                       \
+  if (auto Err = (EXPR))                                                       \
+    return Err;
+
+Expected<PGOContextualProfile &>
+PGOContextualProfile::getOrEmplace(uint32_t Index, GlobalValue::GUID G,
+                                   SmallVectorImpl<uint64_t> &&Counters) {
+  auto I = Callsites[Index].insert(
+      {G, PGOContextualProfile(G, std::move(Counters))});
+  if (!I.second)
+    return make_error<StringError>(llvm::errc::invalid_argument,
+                                   "Duplicate GUID for same callsite.");
+  return I.first->second;
+}
+
+void PGOContextualProfile::getContainedGuids(
+    DenseSet<GlobalValue::GUID> &Guids) const {
+  Guids.insert(GUID);
+  for (const auto &[_, Callsite] : Callsites)
+    for (const auto &[_, Callee] : Callsite)
+      Callee.getContainedGuids(Guids);
+}
+
+Expected<BitstreamEntry> PGOCtxProfileReader::advance() {
+  return Cursor.advance(BitstreamCursor::AF_DontAutoprocessAbbrevs);
+}
+
+Error PGOCtxProfileReader::wrongValue(const Twine &Msg) {
+  return make_error<StringError>(llvm::errc::invalid_argument, Msg);
+}
+
+Error PGOCtxProfileReader::unsupported(const Twine &Msg) {
+  return make_error<StringError>(llvm::errc::not_supported, Msg);
+}
+
+bool PGOCtxProfileReader::canReadContext() {
+  auto Blk = advance();
+  if (!Blk) {
+    consumeError(Blk.takeError());
+    return false;
+  }
+  return Blk->Kind == BitstreamEntry::SubBlock &&
+         Blk->ID == PGOCtxProfileBlockIDs::ContextNodeBlockID;
+}
+
+Expected<std::pair<std::optional<uint32_t>, PGOContextualProfile>>
+PGOCtxProfileReader::readContext(bool ExpectIndex) {
+  RET_ON_ERR(Cursor.EnterSubBlock(PGOCtxProfileBlockIDs::ContextNodeBlockID));
+
+  std::optional<ctx_profile::GUID> Guid;
+  std::optional<SmallVector<uint64_t, 16>> Counters;
+  std::optional<uint32_t> CallsiteIndex;
+
+  SmallVector<uint64_t, 1> RecordValues;
+
+  // We don't prescribe the order in which the records come in, and we are ok
+  // if other unsupported records appear. We seek in the current subblock until
+  // we get all we know.
+  while (!Guid || !Counters || (ExpectIndex && !CallsiteIndex)) {
+    RecordValues.clear();
+    EXPECT_OR_RET(Entry, advance());
+    if (Entry->Kind != BitstreamEntry::Record)
+      return unsupported(
+          "Expected records before encountering more subcontexts");
+    EXPECT_OR_RET(ReadRecord,
+                  Cursor.readRecord(bitc::UNABBREV_RECORD, RecordValues));
+    switch (*ReadRecord) {
+    case PGOCtxProfileRecords::Guid: {
+      if (RecordValues.size() != 1)
+        return wrongValue("The GUID record should have exactly one value");
+      Guid = RecordValues[0];
+      break;
+    }
+    case PGOCtxProfileRecords::Counters:
+      Counters = std::move(RecordValues);
+      if (Counters->empty())
+        return wrongValue("Empty counters. At least the entry counter (one "
+                          "value) was expected");
+      break;
+    case PGOCtxProfileRecords::CalleeIndex: {
+      if (!ExpectIndex)
+        return wrongValue("The root context should not have a callee index");
+      if (RecordValues.size() != 1)
+        return wrongValue("The callee index should have exactly one value");
+      CallsiteIndex = RecordValues[0];
+      break;
+    }
+    default:
+      // OK if we see records we do not understand.
+      break;
+    }
+  }
+
+  PGOContextualProfile Ret(*Guid, std::move(*Counters));
+
+  while (canReadContext()) {
+    EXPECT_OR_RET(SC, readContext(true));
+    if (!Ret.callsites()[*SC->first]
+             .insert({SC->second.guid(), std::move(SC->second)})
+             .second)
+      return wrongValue("Duplicate");
+  }
+  return std::make_pair(CallsiteIndex, std::move(Ret));
+}
+
+Error PGOCtxProfileReader::readMetadata() {
+  EXPECT_OR_RET(Blk, advance());
+  if (Blk->Kind != BitstreamEntry::SubBlock)
+    return unsupported("Expected Version record");
+  RET_ON_ERR(
+      Cursor.EnterSubBlock(PGOCtxProfileBlockIDs::ProfileMetadataBlockID));
+  EXPECT_OR_RET(MData, advance());
+  if (MData->Kind != BitstreamEntry::Record)
+    return unsupported("Expected Version record");
+
+  SmallVector<uint64_t, 1> Ver;
+  EXPECT_OR_RET(Code, Cursor.readRecord(bitc::UNABBREV_RECORD, Ver));
+  if (*Code != PGOCtxProfileRecords::Version)
+    return unsupported("Expected Version record");
+  if (Ver.size() != 1 || Ver[0] > PGOCtxProfileWriter::CurrentVersion)
+    return unsupported("Version " + std::to_string(*Code) +
+                       " is higher than supported version " +
+                       std::to_string(PGOCtxProfileWriter::CurrentVersion));
+  return Error::success();
+}
+
+Expected<std::map<GlobalValue::GUID, PGOContextualProfile>>
+PGOCtxProfileReader::loadContexts() {
+  std::map<GlobalValue::GUID, PGOContextualProfile> Ret;
+  RET_ON_ERR(readMetadata());
+  while (canReadContext()) {
+    EXPECT_OR_RET(E, readContext(false));
+    auto Key = E->second.guid();
+    if (!Ret.insert({Key, std::move(E->second)}).second)
+      return wrongValue("Duplicate roots");
+  }
+  return Ret;
+}
\ No newline at end of file
diff --git a/llvm/lib/ProfileData/PGOCtxProfWriter.cpp b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
new file mode 100644
index 0000000000000..e1cae92a78b8d
--- /dev/null
+++ b/llvm/lib/ProfileData/PGOCtxProfWriter.cpp
@@ -0,0 +1,49 @@
+//===- PGOCtxProfWriter.cpp - Contextual Instrumentation profile writer ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Write a contextual profile to bitstream.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/ProfileData/PGOCtxProfWriter.h"
+#include "llvm/Bitstream/BitCodeEnums.h"
+
+using namespace llvm;
+using namespace llvm::ctx_profile;
+
+void PGOCtxProfileWriter::writeCounters(const ContextNode &Node) {
+  Writer.EmitCode(bitc::UNABBREV_RECORD);
+  Writer.EmitVBR(PGOCtxProfileRecords::Counters, VBREncodingBits);
+  Writer.EmitVBR(Node.counters_size(), VBREncodingBits);
+  for (auto I = 0U; I < Node.counters_size(); ++I)
+    Writer.EmitVBR64(Node.counters()[I], VBREncodingBits);
+}
+
+// recursively write all the subcontexts. We do need to traverse depth first to
+// model the context->subcontext implicitly, and since this captures call
+// stacks, we don't really need to be worried about stack overflow and we can
+// keep the implementation simple.
+void PGOCtxProfileWriter::writeImpl(std::optional<uint32_t> CallerIndex,
+                                    const ContextNode &Node) {
+  Writer.EnterSubblock(PGOCtxProfileBlockIDs::ContextNodeBlockID, CodeLen);
+  Writer.EmitRecord(PGOCtxProfileRecords::Guid,
+                    SmallVector<uint64_t, 1>{Node.guid()});
+  if (CallerIndex)
+    Writer.EmitRecord(PGOCtxProfileRecords::CalleeIndex,
+                      SmallVector<uint64_t, 1>{*CallerIndex});
+  writeCounters(Node);
+  for (auto I = 0U; I < Node.callsites_size(); ++I)
+    for (const auto *Subcontext = Node.subContexts()[I]; Subcontext;
+         Subcontext = Subcontext->next())
+      writeImpl(I, *Subcontext);
+  Writer.ExitBlock();
+}
+
+void PGOCtxProfileWriter::write(const ContextNode &RootNode) {
+  writeImpl(std::nullopt, RootNode);
+}
\ No newline at end of file
diff --git a/llvm/unittests/ProfileData/CMakeLists.txt b/llvm/unittests/ProfileData/CMakeLists.txt
index ce3a0a45ccf18..c92642ded8282 100644
--- a/llvm/unittests/ProfileData/CMakeLists.txt
+++ b/llvm/unittests/ProfileData/CMakeLists.txt
@@ -13,6 +13,7 @@ add_llvm_unittest(ProfileDataTests
   InstrProfTest.cpp
   ItaniumManglingCanonicalizerTest.cpp
   MemProfTest.cpp
+  PGOCtxProfReaderWriterTest.cpp
   SampleProfTest.cpp
   SymbolRemappingReaderTest.cpp
   )
diff --git a/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp b/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp
new file mode 100644
index 0000000000000..727fb900b942b
--- /dev/null
+++ b/llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp
@@ -0,0 +1,266 @@
+//===-------------- PGOCtxProfReadWriteTest.cpp ---------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Bitstream/BitstreamReader.h"
+#include "llvm/ProfileData/CtxInstrContextNode.h"
+#include "llvm/ProfileData/PGOCtxProfReader.h"
+#include "llvm/ProfileData/PGOCtxProfWriter.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/raw_ostream.h"
+#include "llvm/Testing/Support/SupportHelpers.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+using namespace llvm::ctx_profile;
+
+class PGOCtxProfRWTest : public ::testing::Test {
+  std::vector<std::unique_ptr<char[]>> Nodes;
+  std::map<GUID, const ContextNode *> Roots;
+
+public:
+  ContextNode *createNode(GUID Guid, uint32_t NrCounters, uint32_t NrCallsites,
+                          ContextNode *Next = nullptr) {
+    auto AllocSize = ContextNode::getAllocSize(NrCounters, NrCallsites);
+    auto *Mem = Nodes.emplace_back(std::make_unique<char[]>(AllocSize)).get();
+    std::memset(Mem, 0, AllocSize);
+    auto *Ret = new (Mem) ContextNode(Guid, NrCounters, NrCallsites, Next);
+    return Ret;
+  }
+
+  void SetUp() override {
+    // Root (guid 1) has 2 callsites, one used for an indirect call to either
+    // guid 2 or 4.
+    // guid 2 calls guid 5
+    // guid 5 calls guid 2
+    // there's also a second root, guid3.
+    auto *Root1 = createNode(1, 2, 2);
+    Root1->counters()[0] = 10;
+    Root1->counters()[1] = 11;
+    Roots.insert({1, Root1});
+    auto *L1 = createNode(2, 1, 1);
+    L1->counters()[0] = 12;
+    Root1->subContexts()[1] = createNode(4, 3, 1, L1);
+    Root1->subContexts()[1]->counters()[0] = 13;
+    Root1->subContexts()[1]->counters()[1] = 14;
+    Root1->subContexts()[1]->counters()[2] = 15;
+
+    auto *L3 = createNode(5, 6, 3);
+    for (auto I = 0; I < 6; ++I)
+      L3->counters()[I] = 16 + I;
+    L1->subContexts()[0] = L3;
+    L3->subContexts()[2] = createNode(2, 1, 1);
+    L3->subContexts()[2]->counters()[0] = 30;
+    auto *Root2 = createNode(3, 1, 0);
+    Root2->counters()[0] = 40;
+    Roots.insert({3, Root2});
+  }
+
+  const std::map<GUID, const ContextNode *> &roots() const { return Roots; }
+};
+
+void checkSame(const ContextNode &Raw, const PGOContextualProfile &Profile) {
+  EXPECT_EQ(Raw.guid(), Profile.guid());
+  ASSERT_EQ(Raw.counters_size(), Profile.counters().size());
+  for (auto I = 0U; I < Raw.counters_size(); ++I)
+    EXPECT_EQ(Raw.counters()[I], Profile.counters...
[truncated]

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments, still reading the rest of the patch.

Writer.write(*R);
}
Out.flush();
Out.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so does the dtor, actually.

llvm/lib/ProfileData/PGOCtxProfReader.cpp Show resolved Hide resolved

Expected<std::map<GlobalValue::GUID, PGOContextualProfile>>
PGOCtxProfileReader::loadContexts() {
std::map<GlobalValue::GUID, PGOContextualProfile> Ret;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be an DenseMap? If we iterate on the contents, I'm thinking that non-determinism may be masked by integer guids being ordered by std::map. Also it has log(n) lookup which will usually be slower than DenseMap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be too many roots, and the use of map is because of pointer / iterator stability. Now, the set of roots shouldn't change, but this matches the other place std::map is used (for call targets).

/// example, value profiling would produce a new record with a new record ID,
/// containing the profiled values (much like the counters)
class PGOCtxProfileWriter final {
SmallVector<char, 1 << 20> Buff;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use a llvm::MemoryBuffer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how - it's for the ctor of BitstreamWriter. Not seeing it take a MemoryBuffer, what am I missing

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, I didn't look hard enough and assumed that the usage was concerned with just a buffer of char with a particular size.

}

Error PGOCtxProfileReader::wrongValue(const Twine &Msg) {
return make_error<StringError>(llvm::errc::invalid_argument, Msg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use InstrProfError instead of StringError here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. Also changed the API below accordingly.

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with some minor comments.

/// example, value profiling would produce a new record with a new record ID,
/// containing the profiled values (much like the counters)
class PGOCtxProfileWriter final {
SmallVector<char, 1 << 20> Buff;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My mistake, I didn't look hard enough and assumed that the usage was concerned with just a buffer of char with a particular size.

llvm/lib/ProfileData/PGOCtxProfReader.cpp Outdated Show resolved Hide resolved
auto I = Callsites[Index].insert(
{G, PGOContextualProfile(G, std::move(Counters))});
if (!I.second)
return make_error<StringError>(llvm::errc::invalid_argument,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be instrprof_error::invalid_prof?

CallsiteIndex = RecordValues[0];
break;
default:
// OK if we see records we do not understand, like records (profiles)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean s/profiles/value profiles/? (since I recall that mentioned elsewhere).

if (!Ret.callsites()[*SC->first]
.insert({SC->second.guid(), std::move(SC->second)})
.second)
return wrongValue("Duplicate");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on what duplicate is being reported as an error?

Comment on lines 127 to 129
if (!Ret.callsites()[*SC->first]
.insert({SC->second.guid(), std::move(SC->second)})
.second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you spell this out with some intermediate variables to make it easier to read?

// We don't prescribe the order in which the records come in, and we are ok
// if other unsupported records appear. We seek in the current subblock until
// we get all we know.
while (!Guid || !Counters || (ExpectIndex && !CallsiteIndex)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The negative booleans are a little hard to read. Consider this alternative loop structure:

while(true) {

... existing logic ...

// Exit the loop once we have gathered all the values we expect.
if (Guid.has_value() && Counters.has_value && (ExpectIndex && CallsiteIndex.has_value) 
  break;
}

llvm/lib/ProfileData/PGOCtxProfWriter.cpp Outdated Show resolved Hide resolved
llvm/unittests/ProfileData/PGOCtxProfReaderWriterTest.cpp Outdated Show resolved Hide resolved
Expected<PGOContextualProfile &>
PGOContextualProfile::getOrEmplace(uint32_t Index, GlobalValue::GUID G,
SmallVectorImpl<uint64_t> &&Counters) {
auto I = Callsites[Index].insert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a structured binding?

@mtrofin mtrofin merged commit dfdc3dc into llvm:main May 15, 2024
3 of 4 checks passed
@mtrofin mtrofin deleted the bitstream2 branch May 15, 2024 01:01
mtrofin added a commit that referenced this pull request May 15, 2024
mtrofin added a commit that referenced this pull request May 15, 2024
@mtrofin
Copy link
Member Author

mtrofin commented May 15, 2024

Part of the fix to enable reapplying is #92208.

The other is adding the bitcode lib dependency for profile data, so the dynamic linking build would pass.

Waiting to land #92208 first.

mtrofin added a commit that referenced this pull request May 16, 2024
This was an oversight in #91859. Using the subblock ID mechanism
other places that use the bitstream APIs (e.g. `BitstreamRemarkSerializer`) use.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants