perf: packed hash encoding to reduce memory ~451 → ~240 B/key #276
Merged
replace the Compact(Vec<(CompactString, Bytes)>) hash representation with
Packed(Vec<u8>) using a listpack-style wire format:
[num_fields: u16][{name_len: u16, name: bytes, val_len: u32, val: bytes}...]
per-field overhead drops from 48 bytes (CompactString + Bytes tuple) to
6 bytes (2B name_len + 4B value_len). for the standard benchmark case
(5 fields, "f1":"value001"), this cuts total per-key memory from ~451
to ~240 bytes.
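a minimal sketch of the layout above, to make the size arithmetic concrete. the function names are hypothetical and little-endian byte order is an assumption (the description does not state one). for five fields with 2-byte names and 8-byte values, the buffer comes to 2 + 5 × (2 + 2 + 4 + 8) = 82 bytes, which matches the "~80 bytes" figure quoted for the benchmark case.

```rust
/// Encode fields as
/// [num_fields: u16][{name_len: u16, name, val_len: u32, val}...].
/// Byte order is assumed little-endian; the PR does not specify it.
/// Returns None if a count or length does not fit its prefix.
fn encode_packed(fields: &[(&str, &[u8])]) -> Option<Vec<u8>> {
    if fields.len() > u16::MAX as usize {
        return None;
    }
    let mut buf = Vec::new();
    buf.extend_from_slice(&(fields.len() as u16).to_le_bytes());
    for (name, val) in fields {
        if name.len() > u16::MAX as usize || val.len() > u32::MAX as usize {
            return None;
        }
        buf.extend_from_slice(&(name.len() as u16).to_le_bytes());
        buf.extend_from_slice(name.as_bytes());
        buf.extend_from_slice(&(val.len() as u32).to_le_bytes());
        buf.extend_from_slice(val);
    }
    Some(buf)
}

/// Expected buffer size: a 2-byte header plus 6 bytes of framing
/// (2B name_len + 4B val_len) and the payload for each field.
fn packed_size(fields: &[(&str, &[u8])]) -> usize {
    2 + fields
        .iter()
        .map(|(name, val)| 6 + name.len() + val.len())
        .sum::<usize>()
}
```

the framing for those five fields is 5 × 6 = 30 bytes, versus 5 × 48 = 240 bytes of per-field overhead in the tuple representation.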
key design decisions:
- all buffer reads use checked accessors (buf.get(), checked_add) so a
truncated or corrupt buffer stops iteration rather than panicking
- insert of a new field is append-only (no buffer rebuild)
- insert of an existing field, or a remove, rebuilds the buffer via splice/drain,
which is fine since the buffer is small (~80 bytes for 5 fields)
- get() now returns Option<&[u8]> instead of Option<&Bytes>; callers use
Bytes::copy_from_slice at the keyspace boundary (same cost as .cloned())
- HashMap promotion at >32 fields is unchanged
- no changes to command dispatch hot path, persistence, or replication
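the checked-read decision can be sketched as follows. this is an illustration under assumptions, not the code from the PR: the helper names (read_u16, read_u32, read_field) and the little-endian byte order are hypothetical. every offset goes through checked_add and every slice through buf.get(), so a truncated buffer yields None rather than a panic.

```rust
/// Read a little-endian u16 at `pos`; None if the buffer is too short.
fn read_u16(buf: &[u8], pos: usize) -> Option<u16> {
    let end = pos.checked_add(2)?;
    let b = buf.get(pos..end)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Read a little-endian u32 at `pos`; None if the buffer is too short.
fn read_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let b = buf.get(pos..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read one field starting at `pos`.
/// Returns (name, value, offset of the next field).
fn read_field(buf: &[u8], pos: usize) -> Option<(&[u8], &[u8], usize)> {
    let name_len = read_u16(buf, pos)? as usize;
    let name_start = pos.checked_add(2)?;
    let name_end = name_start.checked_add(name_len)?;
    let name = buf.get(name_start..name_end)?;

    let val_len = read_u32(buf, name_end)? as usize;
    let val_start = name_end.checked_add(4)?;
    let val_end = val_start.checked_add(val_len)?;
    let val = buf.get(val_start..val_end)?;

    Some((name, val, val_end))
}

/// Linear scan for `field`. The returned slice borrows from `buf`,
/// so no bytes are copied here.
fn get<'a>(buf: &'a [u8], field: &[u8]) -> Option<&'a [u8]> {
    let num_fields = read_u16(buf, 0)?;
    let mut pos = 2;
    for _ in 0..num_fields {
        // A corrupt or truncated entry stops the scan via `?`.
        let (name, val, next) = read_field(buf, pos)?;
        if name == field {
            return Some(val);
        }
        pos = next;
    }
    None
}
```

a caller at the keyspace boundary would then turn the borrowed slice into an owned value, for example with Bytes::copy_from_slice as described above.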
This was referenced Feb 24, 2026
summary
replaces HashValue::Compact(Vec<(CompactString, Bytes)>) with HashValue::Packed(Vec<u8>) using a listpack-style wire format. per-field overhead drops from 48 bytes to 6 bytes, cutting total per-key memory for the standard benchmark case (5 fields) from ~451 to ~240 bytes.
packed format:
[num_fields: u16][{name_len: u16, name: bytes, val_len: u32, val: bytes}...]
get() returns Option<&[u8]> (zero-copy into the packed buffer); callers use Bytes::copy_from_slice at the keyspace boundary
what was tested
- cargo test -p emberkv-core: all 507 tests pass (including new packed-specific edge cases: truncated buffer, middle field removal, in-place update, roundtrip via to_hash_map/from)
- cargo clippy -p emberkv-core: clean, no warnings
- cargo build --workspace: full workspace compiles
design considerations
the packed buffer trades O(1) field update for an O(n) scan + rebuild, which is acceptable because n ≤ 32 and the buffer is contiguous and small (~80 bytes for 5 fields), so it fits comfortably in L1 cache. the read path (HGET) benefits from better cache locality: it scans 6 bytes of framing per field instead of 48-byte tuples.
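the scan + rebuild trade-off might look roughly like this. it is a sketch under assumptions rather than the PR's implementation: locate, update and remove are hypothetical names, and little-endian byte order is assumed. both operations first scan for the field (O(n)), then shift the tail of the Vec with splice or drain.

```rust
use std::ops::Range;

fn read_u16(buf: &[u8], pos: usize) -> Option<u16> {
    let b = buf.get(pos..pos.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let b = buf.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Scan for `field`. Returns (byte range of the whole entry,
/// byte range of just its value), or None if absent or corrupt.
fn locate(buf: &[u8], field: &[u8]) -> Option<(Range<usize>, Range<usize>)> {
    let num_fields = read_u16(buf, 0)?;
    let mut pos = 2usize;
    for _ in 0..num_fields {
        let name_len = read_u16(buf, pos)? as usize;
        let name_start = pos.checked_add(2)?;
        let name_end = name_start.checked_add(name_len)?;
        let name = buf.get(name_start..name_end)?;
        let val_len = read_u32(buf, name_end)? as usize;
        let val_start = name_end.checked_add(4)?;
        let val_end = val_start.checked_add(val_len)?;
        buf.get(val_start..val_end)?; // bounds check only
        if name == field {
            return Some((pos..val_end, val_start..val_end));
        }
        pos = val_end;
    }
    None
}

/// Replace the value of an existing field: patch val_len, then
/// splice the new bytes over the old ones (the tail shifts as needed).
fn update(buf: &mut Vec<u8>, field: &[u8], new_val: &[u8]) -> bool {
    if new_val.len() > u32::MAX as usize {
        return false;
    }
    let (_, val) = match locate(buf.as_slice(), field) {
        Some(ranges) => ranges,
        None => return false,
    };
    let len_pos = val.start - 4; // val_len sits directly before the value
    buf[len_pos..val.start].copy_from_slice(&(new_val.len() as u32).to_le_bytes());
    // Dropping the Splice iterator performs the replacement.
    drop(buf.splice(val, new_val.iter().copied()));
    true
}

/// Remove a field: drain its entry and decrement the field count.
fn remove(buf: &mut Vec<u8>, field: &[u8]) -> bool {
    let (entry, _) = match locate(buf.as_slice(), field) {
        Some(ranges) => ranges,
        None => return false,
    };
    let count = match read_u16(buf.as_slice(), 0) {
        Some(c) => c,
        None => return false,
    };
    buf.drain(entry);
    // count >= 1 here because locate found an entry.
    buf[0..2].copy_from_slice(&(count - 1).to_le_bytes());
    true
}
```

with at most 32 fields, the tail that splice or drain has to move is short, which is the basis for calling the rebuild cost acceptable.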
write_u16 uses direct indexing, which is safe because it's only called on offsets that were just read successfully from the same buffer.
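to illustrate that invariant, here is a hedged sketch of an append-only insert that bumps the field count. only the name write_u16 comes from the description; its signature here, along with read_u16 and append_field, is hypothetical, and little-endian byte order is assumed. the checked read of the header happens first, so the unchecked write to the same offset cannot go out of bounds.

```rust
/// Checked read: None if fewer than 2 bytes are available at `pos`.
fn read_u16(buf: &[u8], pos: usize) -> Option<u16> {
    let b = buf.get(pos..pos.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Unchecked write: indexes directly and would panic on a bad offset,
/// so callers only pass an offset that read_u16 just accepted.
fn write_u16(buf: &mut [u8], pos: usize, value: u16) {
    buf[pos..pos + 2].copy_from_slice(&value.to_le_bytes());
}

/// Append a field that is known to be absent, without rebuilding
/// the existing entries. Returns false if the header is unreadable
/// or a length or the count would overflow its prefix.
fn append_field(buf: &mut Vec<u8>, name: &[u8], val: &[u8]) -> bool {
    if name.len() > u16::MAX as usize || val.len() > u32::MAX as usize {
        return false;
    }
    // Checked read of the header comes first...
    let count = match read_u16(buf.as_slice(), 0) {
        Some(c) => c,
        None => return false,
    };
    let new_count = match count.checked_add(1) {
        Some(c) => c,
        None => return false,
    };
    buf.extend_from_slice(&(name.len() as u16).to_le_bytes());
    buf.extend_from_slice(name);
    buf.extend_from_slice(&(val.len() as u32).to_le_bytes());
    buf.extend_from_slice(val);
    // ...so offset 0 is known to be valid for the direct-indexed write.
    write_u16(buf.as_mut_slice(), 0, new_count);
    true
}
```

appending only grows the buffer, so an offset validated before the append stays valid afterwards.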