[memprof] Omit the key length for the record table #89527
The record table has a constant key length, so we don't need to serialize or deserialize it for every key-data pair. Omitting the key length saves 0.06% of the indexed MemProf file size. Note that it's OK to change the format because Version2 is still under development.
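To make the format change concrete, here is a minimal standalone sketch of a version-gated length header. It does not use the LLVM APIs: `writeLE`, `readNextLE`, `emitKeyDataLength`, `readKeyDataLength`, and the `IndexedVersion` enum are hypothetical stand-ins, and an 8-byte `offset_type` is assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the MemProf version enum and offset type.
enum IndexedVersion : uint64_t { Version1 = 1, Version2 = 2 };
using offset_type = uint64_t; // assumption: 8-byte lengths

// Append V to Out in little-endian byte order.
template <typename T> void writeLE(std::vector<unsigned char> &Out, T V) {
  for (size_t I = 0; I < sizeof(T); ++I)
    Out.push_back(static_cast<unsigned char>((V >> (8 * I)) & 0xff));
}

// Read a little-endian T from D and advance D past it.
template <typename T> T readNextLE(const unsigned char *&D) {
  T V = 0;
  for (size_t I = 0; I < sizeof(T); ++I)
    V |= static_cast<T>(D[I]) << (8 * I);
  D += sizeof(T);
  return V;
}

// Writer side: the key length is emitted only for formats older than Version2.
std::pair<offset_type, offset_type>
emitKeyDataLength(std::vector<unsigned char> &Out, IndexedVersion Version,
                  offset_type DataLen) {
  offset_type KeyLen = sizeof(uint64_t); // the key is always a 64-bit value
  if (Version < Version2)
    writeLE<offset_type>(Out, KeyLen);
  writeLE<offset_type>(Out, DataLen);
  return {KeyLen, DataLen};
}

// Reader side: from Version2 on, the key length is implied, not read.
std::pair<offset_type, offset_type>
readKeyDataLength(const unsigned char *&D, IndexedVersion Version) {
  assert(D != nullptr && "cursor must point into a serialized table");
  offset_type KeyLen =
      Version < Version2 ? readNextLE<offset_type>(D) : sizeof(uint64_t);
  offset_type DataLen = readNextLE<offset_type>(D);
  return {KeyLen, DataLen};
}
```

Under these assumptions, each entry's length header takes 16 bytes before Version2 and 8 bytes from Version2 on, while the reader reports the same key and data lengths in both cases.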
@llvm/pr-subscribers-llvm-support
Author: Kazu Hirata (kazutakahirata)
Full diff: https://github.com/llvm/llvm-project/pull/89527.diff
2 Files Affected:
diff --git a/llvm/include/llvm/ProfileData/MemProf.h b/llvm/include/llvm/ProfileData/MemProf.h
index aa6cdf198485b0..f356e3a54a3645 100644
--- a/llvm/include/llvm/ProfileData/MemProf.h
+++ b/llvm/include/llvm/ProfileData/MemProf.h
@@ -471,12 +471,16 @@ class RecordLookupTrait {
hash_value_type ComputeHash(uint64_t K) { return K; }
- static std::pair<offset_type, offset_type>
+ std::pair<offset_type, offset_type>
ReadKeyDataLength(const unsigned char *&D) {
using namespace support;
+ // Starting with Version2, we don't read the key length because it is a
+ // constant.
offset_type KeyLen =
- endian::readNext<offset_type, llvm::endianness::little>(D);
+ Version < Version2
+ ? endian::readNext<offset_type, llvm::endianness::little>(D)
+ : sizeof(uint64_t);
offset_type DataLen =
endian::readNext<offset_type, llvm::endianness::little>(D);
return std::make_pair(KeyLen, DataLen);
@@ -534,7 +538,9 @@ class RecordWriterTrait {
endian::Writer LE(Out, llvm::endianness::little);
offset_type N = sizeof(K);
- LE.write<offset_type>(N);
+ // Starting with Version2, we omit the key length because it is a constant.
+ if (Version < Version2)
+ LE.write<offset_type>(N);
offset_type M = V.serializedSize(Version);
LE.write<offset_type>(M);
return std::make_pair(N, M);
diff --git a/llvm/include/llvm/Support/OnDiskHashTable.h b/llvm/include/llvm/Support/OnDiskHashTable.h
index f6b4055e74de7e..b6dbea53f3da6d 100644
--- a/llvm/include/llvm/Support/OnDiskHashTable.h
+++ b/llvm/include/llvm/Support/OnDiskHashTable.h
@@ -377,7 +377,7 @@ template <typename Info> class OnDiskChainedHashTable {
// Determine the length of the key and the data.
const std::pair<offset_type, offset_type> &L =
- Info::ReadKeyDataLength(Items);
+ InfoPtr->ReadKeyDataLength(Items);
offset_type ItemLen = L.first + L.second;
// Compare the hashes. If they are not the same, skip the entry entirely.
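The switch from `Info::ReadKeyDataLength(Items)` to `InfoPtr->ReadKeyDataLength(Items)` follows from the trait now consulting its own `Version` member. The sketch below illustrates the idea outside LLVM. `VersionedInfo`, `StaticInfo`, and `TinyTable` are hypothetical names, and lengths are stored as single bytes purely to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

using offset_type = uint64_t;

// Hypothetical trait with per-instance state: the reader must be non-static
// because the result depends on Version.
class VersionedInfo {
  uint64_t Version;

public:
  explicit VersionedInfo(uint64_t V) : Version(V) {}

  std::pair<offset_type, offset_type>
  ReadKeyDataLength(const unsigned char *&D) const {
    // Assumption for brevity: each length is one byte in this sketch.
    offset_type KeyLen = Version < 2 ? *D++ : sizeof(uint64_t);
    offset_type DataLen = *D++;
    return {KeyLen, DataLen};
  }
};

// Hypothetical trait that keeps a static reader, as older traits might.
struct StaticInfo {
  static std::pair<offset_type, offset_type>
  ReadKeyDataLength(const unsigned char *&D) {
    offset_type KeyLen = *D++;
    offset_type DataLen = *D++;
    return {KeyLen, DataLen};
  }
};

// Minimal table that calls the reader through an instance pointer.
template <typename Info> class TinyTable {
  const unsigned char *Items;
  Info *InfoPtr;

public:
  TinyTable(const unsigned char *Items, Info *InfoPtr)
      : Items(Items), InfoPtr(InfoPtr) {}

  // Combined key + data length of the first item.
  offset_type firstItemLength() const {
    const unsigned char *Cursor = Items;
    const std::pair<offset_type, offset_type> L =
        InfoPtr->ReadKeyDataLength(Cursor);
    return L.first + L.second;
  }
};
```

Because C++ allows a static member function to be called with `->` on an object pointer, `TinyTable` in this sketch compiles with both `VersionedInfo` and `StaticInfo`. That is why calling through the pointer can accommodate traits whose reader stays static.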
LGTM, the function GUID based key for records has worked well so far. I don't think we will change that.
@kazutakahirata This is causing problems on EXPENSIVE_CHECKS builds: