litt sharding simple sharding#3395
Merged
Kbhat1
approved these changes
May 11, 2026
blindchaser
reviewed
May 13, 2026
| @@ -233,18 +233,13 @@ the [value](#value) associated with a [key](#key) can be retrieved from disk. | |||
| An address is encoded in a 64-bit integer. It contains two pieces of information: | |||
Contributor
from ai: The implementation now serializes a 13-byte address, not a 64-bit integer, and the offset points at the value length prefix. Update this before merging.
Contributor
Author
Fixed wording:
## Address
An address partially describes the location on disk where a [value](#value) is stored. Together with a [key](#key),
the [value](#value) associated with a [key](#key) can be retrieved from disk.
An address contains the following information:
- the [segment](#segment) [index](#segment-index) where the [value](#value) is stored
- the [shard](#shard) within that segment that holds the [value](#value)
- the offset within the [value file](#segment-value-files) where the first byte of
the [value](#value) is stored
- the length of the [value](#value) in bytes
Retrieving a [value](#value) starting from an Address is a self-contained
operation that does not need to consult any segment-level metadata or recompute anything from the [key](#key).
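For illustration, the four fields listed above could be packed into the 13 bytes mentioned in the review comment as follows. This is only a sketch under stated assumptions: the field widths (4 + 1 + 4 + 4 bytes), the field order, the big-endian byte order, and the `Serialize` method are all hypothetical and are not taken from the PR's actual `types.Address`.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// AddressSerializedSize is the 13-byte size mentioned in the review.
// The split into 4+1+4+4 bytes below is an assumption chosen to total 13.
const AddressSerializedSize = 13

// Address is a hypothetical layout, not the PR's actual struct.
type Address struct {
	SegmentIndex uint32 // segment where the value is stored
	ShardID      uint8  // shard within that segment
	Offset       uint32 // offset of the value's first byte in the value file
	ValueSize    uint32 // length of the value in bytes
}

// Serialize packs the address into a fixed-size byte slice.
// Big-endian order is an assumption made for this sketch.
func (a Address) Serialize() []byte {
	buf := make([]byte, AddressSerializedSize)
	binary.BigEndian.PutUint32(buf[0:4], a.SegmentIndex)
	buf[4] = a.ShardID
	binary.BigEndian.PutUint32(buf[5:9], a.Offset)
	binary.BigEndian.PutUint32(buf[9:13], a.ValueSize)
	return buf
}

// DeserializeAddress converts a byte slice to an Address.
// The slice must be exactly AddressSerializedSize bytes.
func DeserializeAddress(b []byte) (Address, error) {
	if len(b) != AddressSerializedSize {
		return Address{}, fmt.Errorf("address must be %d bytes, got %d",
			AddressSerializedSize, len(b))
	}
	return Address{
		SegmentIndex: binary.BigEndian.Uint32(b[0:4]),
		ShardID:      b[4],
		Offset:       binary.BigEndian.Uint32(b[5:9]),
		ValueSize:    binary.BigEndian.Uint32(b[9:13]),
	}, nil
}

func main() {
	addr := Address{SegmentIndex: 7, ShardID: 3, Offset: 4096, ValueSize: 128}
	decoded, err := DeserializeAddress(addr.Serialize())
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", decoded)
}
```

Because the shard and the value length travel with the address in this layout, a reader would need neither a key hash nor segment metadata to locate the bytes, which matches the self-contained retrieval described above.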
blindchaser
reviewed
May 13, 2026
| } | ||
| // DeserializeAddress converts a byte slice to an Address. The slice must be exactly AddressSerializedSize bytes. | ||
| func DeserializeAddress(bytes []byte) (Address, error) { |
Contributor
Issue: DeserializeAddress accepts any byte as ShardID, but these paths index s.shards without validating it is < len(s.shards). A bad keymap entry or damaged key file can turn a read/restart into a runtime panic.
Contributor
Author
added defensive checks
blindchaser
approved these changes
May 13, 2026
Describe your changes and provide context
This PR simplifies how LittDB assigns values to shards. As a result, the siphash library is no longer needed; previously, LittDB used it for shard assignment.
Testing performed to validate your change
Benchmarked this particular schema when evaluating LittDB performance for block storage.
Note
High Risk
High risk because it changes LittDB’s on-disk formats (segment metadata, key files, and keymap Address serialization) and the shard selection logic, which can affect data compatibility and read/write correctness across restarts.
Overview
This PR replaces key-hash-based sharding (with per-segment salt) with round-robin shard assignment at write time, and makes reads fully address-driven by embedding `shardID` (and `valueSize`) into the on-disk `types.Address`.
It updates segment/key-file/metadata serialization to a new `LatestSegmentVersion` (dropping legacy versions and salt fields), tightens sharding-factor validation (now capped at `MaxShardingFactor = 256`), removes the siphash dependency, and adjusts/extends tests and docs to match the new wire formats and shard-selection behavior (including out-of-range `shardID` handling).
Reviewed by Cursor Bugbot for commit 9d2dd25.