prune, state, snapshotsync: bound commitment history retention by block count #21021
MysticRyuujin wants to merge 10 commits into
…ck count

Adds --prune.commitment-history.distance.blocks=N for nodes running with --prune.include-commitment-history that want bounded snapshot disk usage. Composes with the existing bool: N must be ≤ --prune.distance because eth_getProof needs the underlying state history kept by the regular prune mode.

Implementation mirrors how transaction snapshots are bounded: a download-time filter that skips preverified segments older than the retention boundary, plus a local prune pass that removes files outside the window after each retire cycle. Runtime tightening is allowed (extra files are removed on the next prune cycle); widening requires a resync because the filtered files no longer exist on disk.

The retention value is persisted in kv.DatabaseInfo alongside the existing commitment-history bool. A mismatched config on restart errors out the same way the bool does.
Pull request overview
Adds a new commitment-history retention option (bounded by block count) for nodes running with commitment-history enabled, aiming to cap snapshot disk usage while preserving the eth_getProof dependency on state-history retention.
Changes:
- Introduces --prune.commitment-history.distance.blocks and persists its value in the datadir for restart consistency checks.
- Filters commitment-history snapshot downloads and adds a local prune path that deletes expired commitment-history snapshot files.
- Adds unit tests for filename expiration logic, DB persistence helpers, and backend startup validation.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| node/ethconfig/features/sync_features.go | Reads persisted commitment-history block-retention value into sync config. |
| node/ethconfig/config.go | Adds block-retention field and helper for comparing effective retention windows. |
| node/eth/backend.go | Persists retention blocks in DB and enforces “shrink-only” on restart (widen requires resync). |
| node/eth/backend_test.go | Adds test coverage for restart mismatch behavior (fresh/equal/shrink/expand). |
| node/cli/flags.go | Validates commitment-history retention does not exceed regular prune distance. |
| node/cli/default_flags.go | Registers the new CLI flag in default flag set. |
| cmd/utils/flags.go | Defines the new flag and enforces it requires commitment-history to be enabled. |
| db/kv/tables.go | Adds a new DatabaseInfo key for persisting commitment-history block-retention. |
| db/rawdb/accessors_chain.go | Adds rawdb helpers to read/write the persisted block-retention value. |
| db/rawdb/accessors_chain_test.go | Adds tests for the new rawdb read/write helpers (round-trip + malformed). |
| db/snapshotsync/snapshotsync.go | Adds download-side filtering to skip expired commitment-history segments. |
| db/snapshotsync/snapshotsync_test.go | Adds unit tests for commitment-history segment expiration logic. |
| execution/stagedsync/stage_snapshots.go | Adds a prune step that removes commitment-history snapshot files below the retention boundary. |
| db/state/aggregator.go | Implements file selection + deletion for expired commitment-history history/idx snapshot files. |
| docs/gitbook/src/interacting-with-erigon/eth.md | Documents bounded commitment-history retention and the invariant with --prune.distance. |
| docs/gitbook/src/fundamentals/configuring-erigon/README.md | Updates flag documentation and CLI help output with the new option. |
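The rawdb helpers listed in the table are described as tested for round-trip and malformed input. A rough sketch of what such a pair of helpers might look like is shown below. The 8-byte big-endian encoding and the names encodeRetention and decodeRetention are assumptions made for illustration; the PR's actual on-disk format is not shown in this page.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeRetention serializes the block count.
// Assumption: fixed 8-byte big-endian, chosen here only for illustration.
func encodeRetention(blocks uint64) []byte {
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, blocks)
	return buf
}

// decodeRetention parses a stored value.
// An empty value stands for a legacy datadir without the key and is tolerated.
// Any other length than 8 is reported as malformed.
func decodeRetention(raw []byte) (blocks uint64, ok bool, err error) {
	if len(raw) == 0 {
		return 0, false, nil
	}
	if len(raw) != 8 {
		return 0, false, fmt.Errorf("malformed retention value: got %d bytes, want 8", len(raw))
	}
	return binary.BigEndian.Uint64(raw), true, nil
}

func main() {
	// Round trip.
	blocks, ok, err := decodeRetention(encodeRetention(2_000_000))
	fmt.Println(blocks, ok, err)

	// Malformed length.
	_, _, err = decodeRetention([]byte{1, 2, 3})
	fmt.Println(err)

	// Legacy datadir: key absent.
	_, ok, err = decodeRetention(nil)
	fmt.Println(ok, err)
}
```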
- aggregator: notify the downloader (a.onFilesDelete) with relative paths before unlinking, matching the cleanAfterMerge ordering so the downloader cannot recreate files after local removal. Treat ENOENT from RemoveFile as success (deleteMergeFile may have already removed non-frozen files) and count only successful removals.
- snapshotsync: keep commitment-history segments that overlap the retention boundary instead of skipping them. Aligns the download filter with the local prune (endTxNum <= threshold) and ensures retention covers the full requested window.
- node/cli/flags: only enforce commit <= prune.distance when the bounded flag is actually set; otherwise existing --prune.include-commitment-history setups regress at startup. Extract validateCommitmentHistoryRetention for direct testing.

Adds focused tests for the downloader callback, the boundary-overlap download behavior, and the CLI validation matrix.
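To illustrate the boundary-overlap rule from the snapshotsync item, here is a small sketch in which the download filter and the local prune share one predicate. The segment type and the function names are hypothetical; only the comparison endTxNum <= threshold comes from the commit message.

```go
package main

import "fmt"

// segment is a hypothetical stand-in for a commitment-history snapshot file.
type segment struct {
	name     string
	endTxNum uint64
}

// expired is the single predicate shared by download filtering and local pruning:
// a segment is dropped only when it ends at or before the threshold.
// A segment that straddles the threshold is therefore kept.
func expired(endTxNum, threshold uint64) bool {
	return endTxNum <= threshold
}

// filterDownloads returns the segments that should still be downloaded.
func filterDownloads(segs []segment, threshold uint64) []segment {
	kept := make([]segment, 0, len(segs))
	for _, s := range segs {
		if expired(s.endTxNum, threshold) {
			continue
		}
		kept = append(kept, s)
	}
	return kept
}

func main() {
	segs := []segment{{"a", 100}, {"b", 200}, {"c", 300}}
	// With threshold 150, segment "b" overlaps the boundary and is kept.
	for _, s := range filterDownloads(segs, 150) {
		fmt.Println("download", s.name)
	}
}
```

Using one predicate on both paths means a file the downloader fetches is never one the pruner would immediately delete.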
I think I addressed the Copilot review findings. I also ran a PR review with opus 4.7 and GPT 5.5; both said it looked good.
Mark commitment-history files for deletion through the existing reader-owned cleanup path so retention cannot unlink mapped files while readers are alive.
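The idea of reader-owned cleanup can be sketched with a reference count and a deletion mark. This is only an illustration of the pattern under simplifying assumptions: snapshotFile and its methods are hypothetical, the remove callback is injected so nothing touches disk, and the real aggregator's lifecycle is more involved.

```go
package main

import (
	"fmt"
	"sync"
)

// snapshotFile is a hypothetical model of a file that may be memory-mapped by readers.
type snapshotFile struct {
	mu        sync.Mutex
	path      string
	readers   int
	canDelete bool
	removed   bool
	remove    func(path string) // injected removal, e.g. a wrapper around os.Remove
}

// Acquire pins the file for a reader. Files already marked for deletion are not handed out.
func (f *snapshotFile) Acquire() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.canDelete {
		return false
	}
	f.readers++
	return true
}

// Release unpins the file. The last reader out performs the deferred removal.
func (f *snapshotFile) Release() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.readers--
	f.removeIfUnusedLocked()
}

// MarkForDeletion is what a retention pass would call. It never unlinks a pinned file.
func (f *snapshotFile) MarkForDeletion() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.canDelete = true
	f.removeIfUnusedLocked()
}

// removeIfUnusedLocked removes the file at most once, and only with no readers left.
func (f *snapshotFile) removeIfUnusedLocked() {
	if f.canDelete && f.readers == 0 && !f.removed {
		f.removed = true
		f.remove(f.path)
	}
}

func main() {
	f := &snapshotFile{path: "example.v", remove: func(p string) { fmt.Println("removed", p) }}
	f.Acquire()         // a reader pins the file
	f.MarkForDeletion() // retention marks it; nothing is removed yet
	f.Release()         // the last reader leaves and the file is removed
	f.MarkForDeletion() // a repeated prune is a no-op
}
```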
Updated in
I did not reuse block/receipt pruning directly because those use the block snapshot subsystem, while commitment history is managed by Validation:
…stance-blocks
# Conflicts:
# docs/gitbook/src/fundamentals/configuring-erigon/README.md
…/site location

The gitbook docs where this flag was originally documented were removed on main. Port the same one-liner to docs/site/.
…stance-blocks
# Conflicts:
# db/state/dirty_files.go
# db/state/history.go
# db/state/inverted_index.go
Summary
Adds --prune.commitment-history.distance.blocks=N for nodes running with --prune.include-commitment-history that want bounded snapshot disk usage. Composes with the existing bool: N must be ≤ --prune.distance (validated at startup) because eth_getProof needs the underlying state history kept by the regular prune mode.
Behavior
- --prune.include-commitment-history only
- + --prune.commitment-history.distance.blocks=N
- --prune.commitment-history.distance.blocks without the bool
- N > --prune.distance

Runtime tightening between runs is allowed (extra files are cleaned up on the next prune cycle); widening requires a resync because the filtered files no longer exist on disk.
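The flag combinations above could be checked at startup roughly as in the sketch below. The function validateRetention and its parameters are hypothetical and are not the signature of the PR's validateCommitmentHistoryRetention; the rules encoded are the ones stated in this page: the bounded flag requires the bool, and N ≤ --prune.distance is enforced only when the bounded flag is actually set.

```go
package main

import (
	"errors"
	"fmt"
)

// validateRetention is a hypothetical startup check.
// include:       --prune.include-commitment-history
// retentionSet:  whether --prune.commitment-history.distance.blocks was passed
// retention:     its value N
// pruneDistance: --prune.distance
func validateRetention(include, retentionSet bool, retention, pruneDistance uint64) error {
	if !retentionSet {
		// Existing setups that never pass the bounded flag must keep working.
		return nil
	}
	if !include {
		return errors.New("--prune.commitment-history.distance.blocks requires --prune.include-commitment-history")
	}
	if retention > pruneDistance {
		// eth_getProof needs state history, which is only kept within --prune.distance.
		return fmt.Errorf("commitment-history retention %d exceeds --prune.distance %d", retention, pruneDistance)
	}
	return nil
}

func main() {
	fmt.Println(validateRetention(true, false, 0, 100_000))               // bool only
	fmt.Println(validateRetention(true, true, 2_000_000, 2_500_000))      // bounded, within prune distance
	fmt.Println(validateRetention(false, true, 2_000_000, 2_500_000))     // bounded flag without the bool
	fmt.Println(validateRetention(true, true, 3_000_000, 2_500_000))      // N > --prune.distance
}
```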
Implementation
- db/snapshotsync/snapshotsync.go: pre-computes commitmentMinStep once before the loop and skips commitment-history segments that end before the retention boundary, while keeping boundary-overlapping segments.
- execution/stagedsync/stage_snapshots.go + db/state/aggregator.go: pruneCommitmentHistorySnapshots runs alongside block snapshot pruning, but uses the state aggregator's file lifecycle rather than block/receipt pruning. Commitment-history files are removed from dirtyFiles, marked canDelete, and physically removed only by the reader-owned cleanup path so live mmap readers are not invalidated. Frozen-file deletion is opted in only for commitment history via deleteFrozen.
- db/rawdb/accessors_chain.go + db/kv/tables.go: new CommitmentLayoutBlocksKey in kv.DatabaseInfo. Legacy datadirs without the key are tolerated.
- node/cli/flags.go + node/eth/backend.go: startup check enforces commitment ≤ prune.distance; restart-mismatch check allows shrinks and rejects expands with an actionable error.
Tests
- db/snapshotsync/snapshotsync_test.go: boundary coverage for commitment-history download filtering.
- db/state/aggregator_test.go: downloader delete notification, idempotent no-op on repeat prune, immediate cleanup for unpinned dirty files, and deferred cleanup while frozen files have active readers.
- db/rawdb/accessors_chain_test.go: round-trip + malformed-length for the new persistence helpers.
- node/eth/backend_test.go: restart mismatch behavior for fresh / equal / shrink / expand cases.
Validation
End-to-end on Hoodi (~2.7M blocks):
- --prune.distance=2.5M --prune.commitment-history.distance.blocks=2M): commitment.0-128 segment filtered at download (~32 GB saved) while accounts/code/storage/receipt 0-128 all downloaded. The flag is the binding constraint for commitment only.
- =500K): the [snapshots] commitment history retention tightened warning fires at startup; on the next prune cycle, [snapshots] pruned commitment history files removed=4 deletes commitment.128-192 (32 GB) plus its inverted-index pair. The chain continues without disruption.
- eth_getProof succeeds at 9-36 ms for blocks inside the retention window and returns old data not available due to pruning: commitment start: <txnum> for blocks outside it. The reported commitment start value moves up after the shrink (50M → 75M tx-num), confirming the cleanup actually removed data.
Docs
Updated:
- docs/gitbook/src/fundamentals/configuring-erigon/README.md: flag table and verbatim help-text dump.
- docs/gitbook/src/interacting-with-erigon/eth.md: paragraph in the eth_getProof section explaining the bound and the ≤ --prune.distance invariant.
Test plan
- make lint
- go test ./db/state ./db/snapshotsync -count=1