perf: packed hash encoding to reduce memory ~451 → ~240 B/key #276

Merged
kacy merged 1 commit into main from perf/packed-hash-encoding on Feb 24, 2026
@kacy kacy commented Feb 24, 2026

summary

replaces HashValue::Compact(Vec<(CompactString, Bytes)>) with HashValue::Packed(Vec<u8>) using a listpack-style wire format. per-field overhead drops from 48 bytes to 6 bytes, cutting total per-key memory for the standard benchmark case (5 fields) from ~451 to ~240 bytes.

packed format: [num_fields: u16][{name_len: u16, name: bytes, val_len: u32, val: bytes}...]

  • all buffer reads use checked accessors — truncated/corrupt buffers stop iteration gracefully
  • get() returns Option<&[u8]> (zero-copy into the packed buffer); callers use Bytes::copy_from_slice at the keyspace boundary
  • new field insert is append-only (no rebuild); existing field update or remove uses splice/drain on a small contiguous buffer
  • HashMap promotion at >32 fields is unchanged
  • no changes to command dispatch hot path, persistence wire format, or replication
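the checked read path described above can be sketched roughly as follows. this is a minimal illustration, not the code from this PR: the helper names (`read_u16`, `read_u32`, `read_field`, `get`, `pack`) and the little-endian byte order are assumptions, since the description does not state either.

```rust
// hypothetical sketch of the packed layout:
// [num_fields: u16][{name_len: u16, name, val_len: u32, val}...]
// byte order is assumed little-endian; the PR does not specify it.

/// reads a u16 at `off`, or None if the buffer is too short.
fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// reads a u32 at `off`, or None if the buffer is too short.
fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// returns (name, value, offset of the next field) for the field at `off`.
/// any out-of-bounds length yields None instead of panicking.
fn read_field(buf: &[u8], off: usize) -> Option<(&[u8], &[u8], usize)> {
    let name_len = read_u16(buf, off)? as usize;
    let name_start = off.checked_add(2)?;
    let name_end = name_start.checked_add(name_len)?;
    let name = buf.get(name_start..name_end)?;
    let val_len = read_u32(buf, name_end)? as usize;
    let val_start = name_end.checked_add(4)?;
    let val_end = val_start.checked_add(val_len)?;
    let val = buf.get(val_start..val_end)?;
    Some((name, val, val_end))
}

/// zero-copy lookup: the returned slice borrows from the packed buffer.
fn get<'a>(buf: &'a [u8], field: &[u8]) -> Option<&'a [u8]> {
    let count = read_u16(buf, 0)?;
    let mut off = 2;
    for _ in 0..count {
        // a truncated or corrupt entry ends the scan with None
        let (name, val, next) = read_field(buf, off)?;
        if name == field {
            return Some(val);
        }
        off = next;
    }
    None
}

/// test helper (hypothetical): builds a packed buffer from string pairs.
fn pack(fields: &[(&str, &str)]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(fields.len() as u16).to_le_bytes());
    for (name, val) in fields {
        buf.extend_from_slice(&(name.len() as u16).to_le_bytes());
        buf.extend_from_slice(name.as_bytes());
        buf.extend_from_slice(&(val.len() as u32).to_le_bytes());
        buf.extend_from_slice(val.as_bytes());
    }
    buf
}
```

in this sketch a caller that needs an owned value would copy the returned slice at the boundary, which is the role the description assigns to Bytes::copy_from_slice.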

what was tested

  • cargo test -p emberkv-core — all 507 tests pass (including new packed-specific edge cases: truncated buffer, middle field removal, in-place update, roundtrip via to_hash_map/from)
  • cargo clippy -p emberkv-core — clean, no warnings
  • cargo build --workspace — full workspace compiles

design considerations

the packed buffer trades O(1) field update for O(n) scan + rebuild, which is acceptable because n ≤ 32 and the buffer is contiguous and small enough to stay in L1 cache (~80 bytes for 5 fields). the read path (HGET) benefits from better cache locality: it scans 6 bytes of framing per field instead of 48-byte tuples. write_u16 uses direct indexing, which is safe because it is only called on offsets that were just read successfully from the same buffer.
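to make the update/remove trade-off concrete, here is a rough sketch of how splice and drain could be applied to such a buffer. everything except the name write_u16 is hypothetical (`locate`, `Located`, `update`, `remove`, `pack`), and little-endian encoding is assumed. note how write_u16 indexes directly, and is only called after a checked read of the same offset has succeeded.

```rust
use std::ops::Range;

fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// direct indexing: callers pass only offsets a checked read already validated.
fn write_u16(buf: &mut [u8], off: usize, v: u16) {
    buf[off..off + 2].copy_from_slice(&v.to_le_bytes());
}

/// byte positions of one field: the whole entry, its val_len slot, its value.
struct Located {
    entry: Range<usize>,
    val_len_at: usize,
    val: Range<usize>,
}

/// O(n) scan for `field`; None if absent or if the buffer is malformed.
fn locate(buf: &[u8], field: &[u8]) -> Option<Located> {
    let count = read_u16(buf, 0)?;
    let mut off = 2usize;
    for _ in 0..count {
        let name_len = read_u16(buf, off)? as usize;
        let name_start = off.checked_add(2)?;
        let name_end = name_start.checked_add(name_len)?;
        let name = buf.get(name_start..name_end)?;
        let val_len = read_u32(buf, name_end)? as usize;
        let val_start = name_end.checked_add(4)?;
        let val_end = val_start.checked_add(val_len)?;
        buf.get(val_start..val_end)?; // bounds check on the value bytes
        if name == field {
            return Some(Located { entry: off..val_end, val_len_at: name_end, val: val_start..val_end });
        }
        off = val_end;
    }
    None
}

/// replaces the value of an existing field; returns false if it is absent.
fn update(buf: &mut Vec<u8>, field: &[u8], new_val: &[u8]) -> bool {
    let Some(loc) = locate(buf, field) else { return false };
    let Ok(len) = u32::try_from(new_val.len()) else { return false };
    buf[loc.val_len_at..loc.val_len_at + 4].copy_from_slice(&len.to_le_bytes());
    // splice shifts the tail of the buffer to fit the new value length
    drop(buf.splice(loc.val, new_val.iter().copied()));
    true
}

/// removes a field and decrements the field count; false if it is absent.
fn remove(buf: &mut Vec<u8>, field: &[u8]) -> bool {
    let Some(loc) = locate(buf, field) else { return false };
    let Some(count) = read_u16(buf, 0) else { return false };
    drop(buf.drain(loc.entry));
    write_u16(buf, 0, count - 1); // count >= 1 because a field was found
    true
}

/// test helper (hypothetical): builds a packed buffer from string pairs.
fn pack(fields: &[(&str, &str)]) -> Vec<u8> {
    let mut buf = (fields.len() as u16).to_le_bytes().to_vec();
    for (name, val) in fields {
        buf.extend_from_slice(&(name.len() as u16).to_le_bytes());
        buf.extend_from_slice(name.as_bytes());
        buf.extend_from_slice(&(val.len() as u32).to_le_bytes());
        buf.extend_from_slice(val.as_bytes());
    }
    buf
}
```

both operations move at most the bytes after the edited field, which for a buffer of this size is a single short memmove.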

replace the Compact(Vec<(CompactString, Bytes)>) hash representation with
Packed(Vec<u8>) using a listpack-style wire format:

  [num_fields: u16][{name_len: u16, name: bytes, val_len: u32, val: bytes}...]

per-field overhead drops from 48 bytes (CompactString + Bytes tuple) to
6 bytes (2B name_len + 4B value_len). for the standard benchmark case
(5 fields, "f1":"value001"), this cuts total per-key memory from ~451
to ~240 bytes.
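as a quick check of these numbers, the packed size can be computed directly from the format. the sketch below assumes all five benchmark fields have 2-byte names and 8-byte values (the message shows only "f1":"value001"); the function and constant names are illustrative.

```rust
/// size of the leading num_fields counter (u16).
const HEADER_BYTES: usize = 2;
/// per-field framing: 2-byte name_len plus 4-byte val_len.
const FRAMING_BYTES: usize = 2 + 4;
/// per-field overhead of the old tuple layout, as stated in the PR.
const OLD_OVERHEAD_BYTES: usize = 48;

/// total length of the packed buffer for the given fields.
fn packed_len(fields: &[(&str, &str)]) -> usize {
    HEADER_BYTES
        + fields
            .iter()
            .map(|(name, val)| FRAMING_BYTES + name.len() + val.len())
            .sum::<usize>()
}

/// framing bytes saved versus the old layout for `n` fields.
fn framing_saved(n: usize) -> usize {
    n * (OLD_OVERHEAD_BYTES - FRAMING_BYTES)
}
```

under that assumption the buffer is 2 + 5 × (6 + 2 + 8) = 82 bytes, in line with the ~80 bytes quoted, and the framing saving is 5 × (48 − 6) = 210 bytes, which roughly accounts for the drop from ~451 to ~240 bytes per key.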

key design decisions:
- all buffer reads use checked accessors (buf.get(), checked_add) so a
  truncated or corrupt buffer stops iteration rather than panicking
- insert of a new field is append-only (no buffer rebuild)
- insert of an existing field or remove rebuilds the buffer via splice/drain
  which is fine since the buffer is small (~80 bytes for 5 fields)
- get() now returns Option<&[u8]> instead of Option<&Bytes>; callers use
  Bytes::copy_from_slice at the keyspace boundary (same cost as .cloned())
- HashMap promotion at >32 fields is unchanged
- no changes to command dispatch hot path, persistence, or replication
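the append-only insert of a new field could look something like the sketch below. it assumes the caller has already scanned the buffer and confirmed the field is absent, and that promotion to a HashMap at >32 fields happens elsewhere. `append_field` is a hypothetical name, and little-endian encoding is again an assumption.

```rust
fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// direct indexing: only called on an offset that was just read successfully.
fn write_u16(buf: &mut [u8], off: usize, v: u16) {
    buf[off..off + 2].copy_from_slice(&v.to_le_bytes());
}

/// appends a field that is known to be absent; existing bytes are not moved.
/// returns false if the header is missing or a length does not fit its slot.
fn append_field(buf: &mut Vec<u8>, name: &[u8], val: &[u8]) -> bool {
    let Some(count) = read_u16(buf, 0) else { return false };
    let Some(new_count) = count.checked_add(1) else { return false };
    let (Ok(name_len), Ok(val_len)) =
        (u16::try_from(name.len()), u32::try_from(val.len()))
    else {
        return false;
    };
    buf.extend_from_slice(&name_len.to_le_bytes());
    buf.extend_from_slice(name);
    buf.extend_from_slice(&val_len.to_le_bytes());
    buf.extend_from_slice(val);
    // bump the field counter in place at offset 0
    write_u16(buf, 0, new_count);
    true
}
```

the only write outside the appended tail is the 2-byte counter at the front, so no existing field is rewritten.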
@kacy kacy merged commit 39d3046 into main Feb 24, 2026
6 of 7 checks passed
@kacy kacy deleted the perf/packed-hash-encoding branch February 24, 2026 19:28