[r3.4] cl/caplin: fix blob and data column pruning (never ran, wrong range, configurable keep window)#20380
Merged
AskAlexSharov merged 4 commits on Apr 8, 2026
The Prune() functions for both BlobStore and dataColumnStorageImpl had a bug where the loop started from (cutoff - minSlotsForBlobSidecarRequest) instead of 0. This meant only a narrow ~131K-slot window just before the pruning cutoff was ever iterated, leaving all data older than that window on disk indefinitely.

On a long-running Hoodi node this caused 444GB of blob accumulation and 1.1TB of PeerDAS data column accumulation (1.6TB total in caplin/).

Fix: start the delete loop from slot 0. Also add an underflow guard for the case where currentSlot < keepDistance (e.g. very early in sync).

Co-Authored-By: Claude
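A minimal sketch of the corrected range logic described in this commit, assuming an in-memory map in place of the on-disk store. `slotStore`, `pruneCutoff`, and the `keepDistance` parameter are hypothetical names for illustration and are not the identifiers used in the Erigon source.

```go
package main

import "fmt"

// slotStore is a hypothetical stand-in for the on-disk blob/column store,
// keyed by slot number.
type slotStore map[uint64][]byte

// pruneCutoff returns the first slot that must be kept.
// Every slot in [0, cutoff) is eligible for deletion.
func pruneCutoff(currentSlot, keepDistance uint64) uint64 {
	// Underflow guard: early in sync currentSlot can be smaller than
	// keepDistance, and an unsigned subtraction would wrap around.
	if currentSlot < keepDistance {
		return 0
	}
	return currentSlot - keepDistance
}

// prune deletes every stored slot below the cutoff and reports how many
// entries were removed. The loop starts at slot 0, not at
// cutoff - someWindow, so data of any age is covered.
func (s slotStore) prune(currentSlot, keepDistance uint64) int {
	cutoff := pruneCutoff(currentSlot, keepDistance)
	removed := 0
	for slot := uint64(0); slot < cutoff; slot++ {
		if _, ok := s[slot]; ok {
			delete(s, slot)
			removed++
		}
	}
	return removed
}

func main() {
	store := slotStore{}
	for slot := uint64(0); slot < 10; slot++ {
		store[slot] = []byte{byte(slot)}
	}
	removed := store.prune(9, 3) // keep the last 3 slots before slot 9
	fmt.Println("removed:", removed, "remaining:", len(store))
}
```

The sketch walks slot by slot only to keep the example short; a real store would more likely iterate over whatever directory or key granularity it uses on disk.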
Replace hardcoded 1_000_000 slot column keep distance with a configurable flag --caplin.columns-keep-slots (default: 131072 = MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS * SLOTS_PER_EPOCH = 4096 * 32, ~18 days).

The previous default of 1M slots (~138 days) caused unbounded disk growth on nodes where PeerDAS activated recently: all column data fell within the keep window so nothing was ever pruned.

Operators running DA oracle or rollup nodes that need longer column history can increase the value via the flag.

Co-Authored-By: Claude
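The day figures quoted above follow from simple arithmetic. The sketch below reproduces them, assuming a 12-second slot time (the mainnet value; the commit message does not state it). `keepWindow` and `days` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"time"
)

// keepWindow converts a keep distance in slots into wall-clock time.
// secondsPerSlot is an assumption supplied by the caller.
func keepWindow(slots, secondsPerSlot uint64) time.Duration {
	return time.Duration(slots*secondsPerSlot) * time.Second
}

// days expresses a duration in fractional days.
func days(d time.Duration) float64 {
	return d.Hours() / 24
}

func main() {
	const secondsPerSlot = 12 // assumed slot time

	specMinimum := uint64(4096 * 32) // MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS * SLOTS_PER_EPOCH
	oldDefault := uint64(1_000_000)  // previous hardcoded keep distance

	fmt.Printf("%d slots = %.1f days\n", specMinimum, days(keepWindow(specMinimum, secondsPerSlot)))
	fmt.Printf("%d slots = %.1f days\n", oldDefault, days(keepWindow(oldDefault, secondsPerSlot)))
}
```

Under that assumption the new default works out to about 18.2 days and the old one to about 138.9 days, matching the rounded values in the commit message.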
ForkChoice was transitioning directly to SleepForSlot, leaving CleanupAndPruning as a dead stage that was never executed. Blob and data column pruning therefore never ran on any node.

Fix ForkChoice to transition to CleanupAndPruning, which then transitions to SleepForSlot, matching the intended graph:

ForkChoice -> CleanupAndPruning -> SleepForSlot

Co-Authored-By: Claude
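To illustrate why the stage was dead, here is a simplified sketch that models only the three stages named in this commit as a successor map and checks reachability. The `stage` type, the two maps, and `reachable` are hypothetical; the actual Caplin stage graph has more stages and a different representation.

```go
package main

import "fmt"

type stage string

const (
	ForkChoice        stage = "ForkChoice"
	CleanupAndPruning stage = "CleanupAndPruning"
	SleepForSlot      stage = "SleepForSlot"
)

// buggy models the transitions before the fix: ForkChoice skips pruning.
var buggy = map[stage]stage{
	ForkChoice:        SleepForSlot,
	CleanupAndPruning: SleepForSlot,
}

// fixed models the intended graph.
var fixed = map[stage]stage{
	ForkChoice:        CleanupAndPruning,
	CleanupAndPruning: SleepForSlot,
}

// reachable follows successor edges from start and reports whether
// target is ever visited. It stops on a missing edge or a cycle.
func reachable(graph map[stage]stage, start, target stage) bool {
	seen := map[stage]bool{}
	cur := start
	for !seen[cur] {
		if cur == target {
			return true
		}
		seen[cur] = true
		next, ok := graph[cur]
		if !ok {
			return false
		}
		cur = next
	}
	return false
}

func main() {
	fmt.Println("before fix:", reachable(buggy, ForkChoice, CleanupAndPruning))
	fmt.Println("after fix: ", reachable(fixed, ForkChoice, CleanupAndPruning))
}
```

A reachability check of this kind could also serve as a regression test so that a stage with no incoming transition is caught early.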
…value

- Derive the column keep distance from beaconCfg when ColumnKeepSlots is 0 (i.e. not set via CLI), using MinEpochsForDataColumnSidecarsRequests * SlotsPerEpoch. This gives the correct spec minimum per chain: mainnet/Hoodi/Sepolia = 131072 slots (~18 days), Gnosis/Chiado = 65536 slots (~3.8 days).
- Guards against ColumnKeepSlots being zero (struct zero-value when config is constructed without CLI), which would otherwise delete all column data.

Co-Authored-By: Claude
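The fallback described in this commit can be sketched as follows. The `beaconConfig` struct and `effectiveColumnKeepSlots` function are hypothetical stand-ins; only the two field names and the resulting slot counts come from the commit message, and the per-chain field values (4096 epochs, 32 or 16 slots per epoch) are assumptions chosen to reproduce those counts.

```go
package main

import "fmt"

// beaconConfig is a hypothetical, trimmed-down stand-in for the chain config.
type beaconConfig struct {
	MinEpochsForDataColumnSidecarsRequests uint64
	SlotsPerEpoch                          uint64
}

// effectiveColumnKeepSlots returns the keep distance to use for pruning.
// A flag value of 0 means "not set", so the spec minimum for the chain is
// used instead of treating 0 as "keep nothing".
func effectiveColumnKeepSlots(flagValue uint64, cfg beaconConfig) uint64 {
	if flagValue == 0 {
		return cfg.MinEpochsForDataColumnSidecarsRequests * cfg.SlotsPerEpoch
	}
	return flagValue
}

func main() {
	// Assumed values that yield the slot counts quoted in the commit message.
	mainnetLike := beaconConfig{MinEpochsForDataColumnSidecarsRequests: 4096, SlotsPerEpoch: 32}
	gnosisLike := beaconConfig{MinEpochsForDataColumnSidecarsRequests: 4096, SlotsPerEpoch: 16}

	fmt.Println(effectiveColumnKeepSlots(0, mainnetLike))       // flag unset
	fmt.Println(effectiveColumnKeepSlots(0, gnosisLike))        // flag unset
	fmt.Println(effectiveColumnKeepSlots(500_000, mainnetLike)) // operator override
}
```

Treating zero as "unset" means an operator cannot request a keep distance of zero through this path, which fits the stated goal of never deleting all column data by accident.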
AskAlexSharov approved these changes on Apr 8, 2026
AskAlexSharov pushed a commit that referenced this pull request on Apr 23, 2026
…tion) (#20729)

## Summary

Documents the `--caplin.columns-keep-slots` flag introduced in #20380.

- Adds a new **PeerDAS Data Column Retention** subsection to `caplin.md`
- Flag: `--caplin.columns-keep-slots` (default: 131072, ~18 days)
- Explains use case for DA oracle / rollup nodes needing longer column history

## Test plan

- [ ] Verify flag appears in `erigon --help` on `release/3.4`
- [ ] Verify default value matches source: `MIN_EPOCHS_FOR_DATA_COLUMN_SIDECARS_REQUESTS * SLOTS_PER_EPOCH = 131072`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: bloxster <gianni.morselli@erigon.tech>
Cherry-pick of #20379 to `release/3.4`.

## Summary

Three bugs caused caplin blob and PeerDAS data column storage to grow unboundedly, filling disks on long-running nodes (observed: 1.6TB in `caplin/` on a Hoodi node).

## Bug 1: `CleanupAndPruning` stage was never reached

`ForkChoice` transitioned directly to `SleepForSlot`, making `CleanupAndPruning` a dead stage. Pruning never ran on any node, ever.

## Bug 2: Prune loop only covered a narrow window

Even if pruning had run, both `BlobStore.Prune()` and `dataColumnStorageImpl.Prune()` started the delete loop from `currentSlot - minSlotsForBlobSidecarRequest` instead of `0`. Data older than that window was never deleted.

## Bug 3: Column keep window was 1M slots (~138 days), too large

The hardcoded `pruneDistance = 1_000_000` for data columns meant that on networks where PeerDAS activated recently (e.g. Hoodi ~50 days ago), all column data fell within the keep window and nothing was pruned even after fixing bugs 1 and 2.

## Test plan

- `du -sh /erigon-data/caplin/*` decreases after reaching head
- `ls /erigon-data/caplin/blobs/ | sort -n | head -5` shows only recent subdirs after prune
- `--caplin.columns-keep-slots` flag is accepted and overrides the default
- `go test ./cl/persistence/blob_storage/...` passes

Co-Authored-By: Claude