Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[memprof] Omit the key length for the record table #89527

Merged
merged 1 commit into from
Apr 23, 2024

Conversation

kazutakahirata
Copy link
Contributor

The record table has a constant key length, so we don't need to
serialize or deserialize it for every key-data pair. Omitting the key
length saves 0.06% of the indexed MemProf file size.

Note that it's OK to change the format because Version2 is still under
development.

The record table has a constant key length, so we don't need to
serialize or deserialize it for every key-data pair.  Omitting the key
length saves 0.06% of the indexed MemProf file size.

Note that it's OK to change the format because Version2 is still under
development.
@llvmbot
Copy link
Collaborator

llvmbot commented Apr 21, 2024

@llvm/pr-subscribers-llvm-support

Author: Kazu Hirata (kazutakahirata)

Changes

The record table has a constant key length, so we don't need to
serialize or deserialize it for every key-data pair. Omitting the key
length saves 0.06% of the indexed MemProf file size.

Note that it's OK to change the format because Version2 is still under
development.


Full diff: https://github.com/llvm/llvm-project/pull/89527.diff

2 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/MemProf.h (+9-3)
  • (modified) llvm/include/llvm/Support/OnDiskHashTable.h (+1-1)
diff --git a/llvm/include/llvm/ProfileData/MemProf.h b/llvm/include/llvm/ProfileData/MemProf.h
index aa6cdf198485b0..f356e3a54a3645 100644
--- a/llvm/include/llvm/ProfileData/MemProf.h
+++ b/llvm/include/llvm/ProfileData/MemProf.h
@@ -471,12 +471,16 @@ class RecordLookupTrait {
 
   hash_value_type ComputeHash(uint64_t K) { return K; }
 
-  static std::pair<offset_type, offset_type>
+  std::pair<offset_type, offset_type>
   ReadKeyDataLength(const unsigned char *&D) {
     using namespace support;
 
+    // Starting with Version2, we don't read the key length because it is a
+    // constant.
     offset_type KeyLen =
-        endian::readNext<offset_type, llvm::endianness::little>(D);
+        Version < Version2
+            ? endian::readNext<offset_type, llvm::endianness::little>(D)
+            : sizeof(uint64_t);
     offset_type DataLen =
         endian::readNext<offset_type, llvm::endianness::little>(D);
     return std::make_pair(KeyLen, DataLen);
@@ -534,7 +538,9 @@ class RecordWriterTrait {
 
     endian::Writer LE(Out, llvm::endianness::little);
     offset_type N = sizeof(K);
-    LE.write<offset_type>(N);
+    // Starting with Version2, we omit the key length because it is a constant.
+    if (Version < Version2)
+      LE.write<offset_type>(N);
     offset_type M = V.serializedSize(Version);
     LE.write<offset_type>(M);
     return std::make_pair(N, M);
diff --git a/llvm/include/llvm/Support/OnDiskHashTable.h b/llvm/include/llvm/Support/OnDiskHashTable.h
index f6b4055e74de7e..b6dbea53f3da6d 100644
--- a/llvm/include/llvm/Support/OnDiskHashTable.h
+++ b/llvm/include/llvm/Support/OnDiskHashTable.h
@@ -377,7 +377,7 @@ template <typename Info> class OnDiskChainedHashTable {
 
       // Determine the length of the key and the data.
       const std::pair<offset_type, offset_type> &L =
-          Info::ReadKeyDataLength(Items);
+          InfoPtr->ReadKeyDataLength(Items);
       offset_type ItemLen = L.first + L.second;
 
       // Compare the hashes.  If they are not the same, skip the entry entirely.

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, the function GUID based key for records has worked well so far. I don't think we will change that.

@kazutakahirata kazutakahirata merged commit b28f4d4 into llvm:main Apr 23, 2024
6 checks passed
@kazutakahirata kazutakahirata deleted the pr_memprof_omit_key_record branch April 23, 2024 05:59
@RKSimon
Copy link
Collaborator

RKSimon commented Apr 23, 2024

@kazutakahirata This is causing problems on EXPENSIVE_CHECKS builds:
https://lab.llvm.org/buildbot/#/builders/16/builds/64480

RKSimon added a commit that referenced this pull request Apr 23, 2024
RKSimon added a commit that referenced this pull request Apr 23, 2024
…89527)"

Breaks on EXPENSIVE_CHECKS builds which still use the static ReadKeyDataLength implementation in several locations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants