refactor(hot): delaminate sharding from HistoryRead/HistoryWrite#68
Open
prestwich wants to merge 21 commits into
Open
refactor(hot): delaminate sharding from HistoryRead/HistoryWrite#68prestwich wants to merge 21 commits into
prestwich wants to merge 21 commits into
Conversation
Used by the upcoming merge_and_split helper to un-push the overflow block when splitting a shard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure function: takes ownership of (existing, additions), merges them, and splits off a tail iff the merged list exceeds max_bytes after roaring encoding. Single linear pass; allocates the tail shard only when a split is required. This is the backend-agnostic primitive that signet-hot-mdbx's HistoryWrite impl will use to maintain its size-bounded DUPSORT shards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ition The spec mandates a debug-only check that additions fit within max_bytes. Test 4 (merge_and_split_split_preserves_strict_ordering) was updated to use sparse input data (one element per 16-bit roaring container) so encoded sizes grow proportionally to count, keeping additions within budget while still provoking a split. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…X budget Two property-style tests assert roaring's encoded size stays under 1500 B for both the dense-pack (650 contiguous) and sparse-distinct-container (100 blocks across 100 16-bit chunks) worst cases, plus a round-trip test through merge_and_split at the realistic budget. These pin roaring's encoding behavior so a future crate update can't silently break MDBX's DUPSORT value cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nation HistoryRead -> LegacyHistoryRead UnsafeHistoryWrite -> LegacyUnsafeHistoryWrite HistoryWrite -> LegacyConsistentHistoryWrite Pure mechanical rename. The unprefixed names are reserved for the new logical traits introduced in the next commit, which will live in crate::db::history and expose only logical (no shard-leaking) operations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New module crate::db::history defines: - HistoryRead: blanket-impled on HotKvRead, default-impl-only. - HistoryWrite: required per-backend, with default impls for bulk ops. No callers yet; this is the trait scaffolding. Per-backend HistoryWrite impls land in subsequent commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace .ok() with .expect("history blocks strictly increasing") on the
two BlockNumberList::append calls inside blocks_changed_account and
blocks_changed_storage. The error case is only reachable on DB
corruption; silently suppressing it would convert corruption into
data loss in the read path. Loud panic is correct.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single shard per (addr) at subkey u64::MAX, or per (addr, slot) at ShardedKey(slot, u64::MAX). MemKv has no per-value size budget, so no splitting is needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MDBX backend's HistoryWrite uses signet_storage_types::merge_and_split with MAX_SHARD_BYTES = 1500, derived from MDBX's DUPSORT value cap minus key2 and per-node overhead. Load-merge-rewrite for appends; backwards-walk delete-and-re-key-tail for truncates. The u64::MAX-tail subkey convention is preserved structurally: appends always write the tail at u64::MAX; truncates re-key the surviving prefix to u64::MAX if the original tail was deleted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
setup_history_kv now uses append_account_history / append_storage_history instead of the shard-aware write_*_history methods. The at-height tests exercise the new HistoryRead default-impl path through get_account_at_height / get_storage_at_height. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
unwind_above's two shard-surgery blocks become single calls to truncate_account_history_above / truncate_storage_history_above. load_genesis uses append_account_history / append_storage_history. validate_chain_extension and append_blocks move to HistoryWrite as default impls. The LegacyConsistentHistoryWrite trait and crates/hot/src/db/consistent.rs are deleted; ADDRESS_MAX moves alongside the new default impls. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous commit accidentally moved hot(), cold(), reader(), revm_reader(), etc. into the impl block bounded by H::RwTx: HistoryWrite. These methods don't open a write transaction; the bound is unnecessary and tightens the public API for no reason. Split into two impl blocks matching the pre-refactor structure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rite Bulk helpers (write_plain_revert_sorted, write_state_changes, append_block_inconsistent, etc.) move as default impls onto crate::db::history::HistoryWrite. The legacy shard-aware methods (write_*_history, last_*_history, append_*_history_index, update_history_indices_inconsistent) stay on LegacyUnsafeHistoryWrite for now; they are deleted in a subsequent commit. Removes the temporary LegacyUnsafeHistoryWrite supertrait on HistoryWrite introduced in the previous commit — the moved helpers now provide everything the consistent-state default impls need. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All shard-shaped assertions (last_account_history returning specific shard keys, write_account_history with explicit subkeys) replaced with logical equivalents using HistoryRead::blocks_changed_account / HistoryWrite::append_account_history / truncate_account_history_above. Shard-structure assertions migrate to an MDBX-only structural test in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Multi-shard scenarios that previously paired write_account_history calls with explicit subkeys collapse into single append_account_history calls with unioned block lists. The multi-shard layout is now an MDBX-internal detail; conformance tests assert the logical union, not the layout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LegacyHistoryRead, LegacyUnsafeHistoryWrite, and the append_to_sharded_history free fn are all replaced by the logical HistoryRead / HistoryWrite traits in crate::db::history. Non-shard-aware header/range/check methods migrate as default impls onto the new HistoryRead. No more ShardedKey<_> in public return types. No more shard subkeys in method signatures. The splitter logic now lives in signet_storage_types::merge_and_split and is called only from the MDBX backend. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Count-based shard splitting was replaced by merge_and_split's size-based logic. The constant has no remaining users. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
200 sparse blocks (200 distinct roaring containers, written as two batches of 100 to satisfy the per-additions shard precondition) feed the MDBX HistoryWrite impl. Tests assert that at least 2 dup entries exist for the addr in AccountsHistory (and analogously for StorageHistory), and that the tail shard's subkey is u64::MAX. This is the only place ShardedKey<_> appears outside the abstraction crate by design. Also adds [[test]] required-features = ["test-utils"] to Cargo.toml so the integration test is skipped under --no-default-features. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wire the six new history conformance entrypoints into conformance() so they run against both MemKv and MDBX. Closes coverage gap on multi-batch appends + standalone truncates. The two update_history_indices tests use non-overlapping block windows (1-5 vs 1001-1005) to avoid re-indexing shared change-set entries on the shared conformance store. - Add empirical-size comment to worst_case_dense_pack_fits_in_dupsort_budget explaining the 650 vs 750 adjustment. - Drop a "shard key" terminology leak from a comment in db/history.rs. - Update setup_history_kv docstring in revm.rs to describe logical state rather than shard layout. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5b24c64 to
d6887c0
Compare
This was referenced May 22, 2026
prestwich
commented
May 22, 2026
| @@ -0,0 +1,1002 @@ | |||
| //! Logical history reads and writes. | |||
| //! | |||
| //! These traits replace the shard-leaking surface in `db::read` and | |||
Member
Author
There was a problem hiding this comment.
rustdoc should reference present behavior only, not past behavior
| /// `IntegerList::push` already does for roaring container growth. One | ||
| /// `IntegerList` allocation when a split occurs. | ||
| pub fn merge_and_split( | ||
| mut existing: IntegerList, |
Member
Author
There was a problem hiding this comment.
this should be a function on IntegerList.
and the additions should not be materialized
| let cold = ColdStorage::new(cold_backend, cancel_token); | ||
| Self::new(hot, cold) | ||
| } | ||
| } |
Member
Author
There was a problem hiding this comment.
why was this file changed?
|
|
||
| /// Logical reads against history + changeset tables. | ||
| /// | ||
| /// Default-impl-only. Backends cannot override — the blanket impl below |
Member
Author
There was a problem hiding this comment.
Remove this section of rustdoc. item-level rustdoc describes ONLY the behavior and how the downstream user interacts with it. archietecture goes in module or lib level
- merge_and_split moves from a free fn to IntegerList::merge_and_split, taking an iterator for additions (no materialization). Drops the .clone() at the MDBX call site. - HotKv::RwTx now bounds HistoryWrite, so UnifiedStorage and the conformance generics no longer need explicit `where H::RwTx: HistoryWrite` clauses. unified.rs reverts to a single impl block. - Module-level rustdoc in db/history.rs drops the historical "replaces" framing; trait-level rustdoc on HistoryRead drops the architecture paragraph (override prevention is a property of the blanket impl, not user-visible behavior). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes ENG-2287. Subsumes #66 (size-based shard splitter) — the splitter is included here as a backend-private detail of the MDBX impl.
Summary
Reshapes
HistoryRead/HistoryWriteinsignet-hotso that sharding is a backend-internal concern. The trait surface now exposes only logical history operations;ShardedKey<U256>no longer appears in any public method signature or return type.Trait layout:
crate::db::history::HistoryRead— blanket-impled onHotKvRead, default-impl-only. Backends cannot override (orphan rules + blanket impl structurally enforce this).crate::db::history::HistoryWrite— required per-backend, four methods:append_account_history,append_storage_history,truncate_account_history_above,truncate_storage_history_above. Plus default impls for bulk operations (update_history_indices,append_blocks,unwind_above,load_genesis, etc.).Backend impls:
MemKvwrites a single dup entry per(addr)atu64::MAX(or per(addr, slot)atShardedKey(slot, u64::MAX)). No splitter.signet-hot-mdbxuses the newsignet_storage_types::merge_and_splitwithMAX_SHARD_BYTES = 1500(constant private to the crate). Load-merge-rewrite for appends; backwards-walk delete-and-re-key-tail for truncates. Theu64::MAX-tail subkey convention is preserved structurally.Splitter (
signet_storage_types::merge_and_split):(existing, additions, max_bytes), returns(first, second)withsecond = Some(tail)iff a split occurred.IntegerListallocation on overflow.signet-storage-types(dense pack of 650 contiguous values, sparse 100 distinct 16-bit containers, and round-trip throughmerge_and_splitat the 1500 B budget).Deleted:
LegacyHistoryRead,LegacyUnsafeHistoryWrite,LegacyConsistentHistoryWrite.append_to_sharded_historyfree function incrates/hot/src/db/inconsistent.rs.ShardedKey::SHARD_COUNTconstant (count-based splitting is obsolete).crates/hot/src/db/consistent.rs(contents moved as default impls ontoHistoryWrite).Design + plan docs:
docs/superpowers/specs/2026-05-22-history-trait-delamination-design.mdanddocs/superpowers/plans/2026-05-22-history-trait-delamination.md(both gitignored; available locally on the branch).Why now
PR #66 surfaced two layering concerns that this refactor resolves:
append_to_sharded_historyreferencedMAX_SHARD_BYTES(an MDBX-specific cap) from the abstract crate. The constant now lives insignet-hot-mdbxand the splitter is a pure helper insignet-storage-types.ShardedKey<U256>appeared in public return types (last_storage_history), forcing every backend — even ones with no DUPSORT value cap — to mimic MDBX's sharded layout. The new logical interface lets each backend choose its strategy.Test plan
cargo +nightly fmt -- --checkcleancargo clippy --workspace --all-targets --all-features -- -D warningscleancargo clippy --workspace --all-targets --no-default-features -- -D warningscleanRUSTDOCFLAGS=\"-D warnings\" cargo doc --workspace --no-deps --all-featurescleancargo t --workspace— 196 tests pass across all cratessignet-hotconformance suite runs against both MemKv and MDBX backends (6 history conformance entrypoints now wired intoconformance())signet-hot-mdbxstructural test (history_sharding.rs) confirms MDBX splits oversized input into ≥2 dups and preserves theu64::MAX-tail subkeysignet-hotintegration tests (load_genesis for local, mainnet, parmigiana, pecorino genesis artifacts) passsignet-storage::unifiedend-to-end tests (append + drain_above + unwind) passsignet-hot-mdbxunwind conformance: byte-level comparison after unwind against a fresh-build replicasignet-storage-typespin encoded bytes vs the 1500 B budgetCommit list (21 commits)
Note: commits are unsigned in this draft; they will be re-signed via
git rebase --exec 'git commit --amend --no-edit -S'before merge.🤖 Generated with Claude Code