Merged
the version field (u64, 8 bytes) existed on every entry but was only read during WATCH/EXEC, which is <1% of production workloads. every mutation unconditionally bumped it.

moved version tracking to a lazily-populated AHashMap on keyspace:
- key_version() inserts into the side table on first call (like WATCH)
- bump_version() only bumps if the key is tracked (almost never)
- version entries are cleaned up on delete/expire/evict/flush

this shrinks Entry from 52 to 44 bytes and reduces ENTRY_OVERHEAD from 128 to 120, saving ~8 bytes per key in sharded mode.
This was referenced Feb 25, 2026
kacy added a commit that referenced this pull request on Feb 25, 2026:
sharded string overhead is 180 B/key (was 208), hash is 215 B/key (was 243) after the entry struct optimizations in PRs #284-287.
kacy added a commit that referenced this pull request on Feb 25, 2026:
refresh all throughput, latency, and encryption numbers after entry struct optimization PRs (#284-287). remove concurrent mode references since it's being deprecated. add data type throughput section from bench-datatypes.sh. fix memtier header (50k → 10k req/client).
summary
moves the `version: u64` field from every `Entry` in the keyspace to a lazily-populated side table (`AHashMap<CompactString, u64>`) on `Keyspace`. this field was only meaningful for WATCH/EXEC optimistic locking (<1% of workloads) but consumed 8 bytes on every single entry.

- `key_version()` now inserts into the side table on first call (simulating WATCH registration)
- `bump_version()` only bumps the version if the key is already tracked; on the hot path this is a fast hash-miss on an empty map
- `ENTRY_OVERHEAD` reduced from 128 to 120 bytes

estimated savings: ~8 bytes per key in sharded mode (208 → ~200 B/key).
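the side-table behavior described above can be sketched roughly as follows. this is a minimal illustration, not the code from this PR: it uses std `HashMap<String, u64>` in place of `AHashMap<CompactString, u64>` so it compiles without extra crates, starts tracked keys at version 0 (an assumption), and `forget_version()` and `tracked_keys()` are hypothetical helper names standing in for the cleanup done on delete/expire/evict/flush.

```rust
use std::collections::HashMap;

/// Simplified stand-in for the real keyspace; only version tracking is modeled.
#[derive(Default)]
pub struct Keyspace {
    /// Side table: holds an entry only for keys whose version was requested.
    versions: HashMap<String, u64>,
}

impl Keyspace {
    /// Returns the key's current version and starts tracking it on first call.
    /// The initial value of 0 is an assumption made for this sketch.
    pub fn key_version(&mut self, key: &str) -> u64 {
        *self.versions.entry(key.to_owned()).or_insert(0)
    }

    /// Called on every mutation; a no-op unless the key is already tracked.
    pub fn bump_version(&mut self, key: &str) {
        if let Some(v) = self.versions.get_mut(key) {
            *v = v.wrapping_add(1);
        }
    }

    /// Hypothetical cleanup hook: drops the side-table entry so that
    /// deleted or expired keys do not accumulate in the map.
    pub fn forget_version(&mut self, key: &str) {
        self.versions.remove(key);
    }

    /// Hypothetical inspection helper: number of keys currently tracked.
    pub fn tracked_keys(&self) -> usize {
        self.versions.len()
    }
}
```

with no watchers the map stays empty, so `bump_version()` reduces to a failed lookup and writes nothing. how the real code makes a delete visible to an existing watcher is not covered by this sketch.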
what was tested
- `cargo test -p emberkv-core`
- `cargo test --all`
- `cargo clippy --all` clean (no new warnings)
- `entry_overhead_not_too_small` test validates the new constant against actual struct sizes

design considerations
the alternative was keeping version on `Entry` but using a smaller type (u32). that only saves 4 bytes with alignment and still writes 4 bytes on every mutation. the side table approach saves 8 bytes per entry and eliminates the unconditional write on the hot path entirely: mutations now only do a hash lookup against `self.versions`, which is almost always empty (fast miss).

the tradeoff is that WATCH/EXEC now has slightly different bookkeeping (lazy population on first `key_version()` call), but the semantics are identical: WATCH captures a snapshot version, EXEC detects changes.
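to make the "WATCH captures a snapshot version, EXEC detects changes" flow concrete, here is a small sketch of how a caller might use the lazily-populated table. `Versions` and `WatchSet` are hypothetical names invented for illustration, std `HashMap` stands in for `AHashMap`, and the real WATCH/EXEC plumbing in the codebase may look quite different.

```rust
use std::collections::HashMap;

/// Hypothetical per-key version side table (std HashMap instead of AHashMap).
#[derive(Default)]
struct Versions(HashMap<String, u64>);

impl Versions {
    /// Lazily starts tracking the key and returns its current version.
    fn key_version(&mut self, key: &str) -> u64 {
        *self.0.entry(key.to_owned()).or_insert(0)
    }

    /// Bumps the version only if the key is already tracked.
    fn bump_version(&mut self, key: &str) {
        if let Some(v) = self.0.get_mut(key) {
            *v += 1;
        }
    }
}

/// Versions captured by WATCH for one client (hypothetical type).
#[derive(Default)]
struct WatchSet(Vec<(String, u64)>);

impl WatchSet {
    /// WATCH: record the version the key has right now.
    fn watch(&mut self, versions: &mut Versions, key: &str) {
        let snapshot = versions.key_version(key);
        self.0.push((key.to_owned(), snapshot));
    }

    /// EXEC: the transaction may proceed only if no watched key has changed.
    fn unchanged(&self, versions: &mut Versions) -> bool {
        self.0
            .iter()
            .all(|(key, snapshot)| versions.key_version(key) == *snapshot)
    }
}
```

a mutation that happens before `watch()` leaves the table untouched, while one that happens after it bumps the tracked version and makes `unchanged()` return false, which is the same observable behavior as a per-entry version field.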