[codex] Fix reth partial archive pruning and sizing#561
Merged
OisinKyne
approved these changes
May 28, 2026
    if !partialArchiveClients[executionClient] || scope.Kind == "" || scope.Kind == "all" {
        return ethereumStorageProfile{
            ExecutionSize: "4500Gi",
Just by chance, if you spot this, @apham0001 or @anadi2311: do we have archive ELs for our data pipeline (I guess geth)? Do you know how much disk space they take up?
Summary
This is a stacked follow-up to #559. It keeps the original archive-mode design, but tightens the two places where the rendered behavior did not match the user-facing partial archive contract:
- `--since=<duration>` now prunes all Reth history segments that Reth can prune by distance: account history, storage history, receipts, and bodies.
- `--since=<fork|duration|block>` now drives a single Ethereum storage profile that is reused by the disk preflight, generated `values.yaml`, PVC sizing, and the upstream `ethereum-node` persistence request.

The goal is to make these four surfaces agree:
Findings that motivated this
While reviewing #559, I found that `ArchiveScope` only affected the Reth args. Storage still treated every mainnet archive as a genesis archive:

- `--since=365d` generated `pruneKind: distance`, but PVCs still rendered as `4500Gi` execution and `500Gi` consensus.
- `5000 GB`, even though the docs and picker describe `365d` as a bounded partial archive.
- The `distance` branch emitted only account/storage history prune flags, leaving receipts and bodies unbounded.

That made the install internally inconsistent: the CLI claimed bounded archive history, but Kubernetes storage and parts of Reth history retention still behaved like a much larger archive.
Reth verification
I double-checked this against upstream Reth before changing the chart template.
- `v2.2.0`, which is the version pinned by feat(ethereum): support archive mainnet more gracefully #559.
- `main`, checked at `7c5168c17a966081fb2b4523dadabb76f5a3e6bb`.

Both `v2.2.0` and current `main` support:

- `--prune.account-history.before`
- `--prune.storage-history.before`
- `--prune.receipts.before`
- `--prune.bodies.before`
- `--prune.account-history.distance`
- `--prune.storage-history.distance`
- `--prune.receipts.distance`
- `--prune.bodies.distance`

Relevant upstream references:

Because Reth supports exact `.before` cutoffs for receipts and bodies, this PR removes the coarser `pre-merge` special case. A raw block like `--since=1000000` should mean “keep from block 1000000 forward,” not “keep only from the Merge forward” for receipts and bodies.

What changed
Storage profile
Added `resolveEthereumStorageProfile`, which centralizes the sizing decision for Ethereum installs.

Important detail: unsupported clients such as geth still get genesis archive sizing even if `--since` is present. That is deliberate. #559 warns that `--since` is wired only for Reth, so the storage profile must not underprovision clients that will ignore the partial-prune args.

Reth prune args
The Reth branch now renders all relevant prune segments:
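The two changes described above (one storage profile, and prune flags for every segment) can be sketched together in Go. This is an illustration under stated assumptions, not the PR's implementation: the real prune args are rendered by the chart template, `archiveScope`, `resolveStorageProfile`, `renderPruneArgs`, and the `ConsensusSize` field are hypothetical names, the `--flag=value` form is assumed, and the two partial sizes are simply the smoke-check values reported for `365d` and `merge` rather than a real sizing rule.

```go
package main

import (
	"fmt"
	"strings"
)

// archiveScope is a hypothetical stand-in for the PR's ArchiveScope.
// Kind is "", "all", "distance", or "before"; Value is a block count
// (for "distance") or a block number (for "before").
type archiveScope struct {
	Kind  string
	Value uint64
}

// ethereumStorageProfile borrows the struct name from the reviewed snippet.
// ConsensusSize is an assumed field name.
type ethereumStorageProfile struct {
	ExecutionSize string
	ConsensusSize string
}

// Only Reth is wired for --since, so only Reth may get partial sizing.
var partialArchiveClients = map[string]bool{"reth": true}

// The four history segments the PR says Reth can prune.
var pruneSegments = []string{"account-history", "storage-history", "receipts", "bodies"}

// resolveStorageProfile returns full archive sizing unless the client
// supports partial archives and the scope is actually bounded.
func resolveStorageProfile(executionClient string, scope archiveScope) ethereumStorageProfile {
	full := ethereumStorageProfile{ExecutionSize: "4500Gi", ConsensusSize: "500Gi"}
	if !partialArchiveClients[executionClient] || scope.Kind == "" || scope.Kind == "all" {
		return full
	}
	switch scope.Kind {
	case "distance":
		// Placeholder: the size reported for --since=365d.
		return ethereumStorageProfile{ExecutionSize: "800Gi", ConsensusSize: "500Gi"}
	case "before":
		// Placeholder: the size reported for --since=merge.
		return ethereumStorageProfile{ExecutionSize: "1900Gi", ConsensusSize: "500Gi"}
	}
	return full // unknown kinds fall back to the safe, larger size
}

// renderPruneArgs emits one flag per segment for bounded scopes,
// and nothing for a full archive.
func renderPruneArgs(scope archiveScope) []string {
	if scope.Kind != "distance" && scope.Kind != "before" {
		return nil
	}
	args := make([]string, 0, len(pruneSegments))
	for _, seg := range pruneSegments {
		args = append(args, fmt.Sprintf("--prune.%s.%s=%d", seg, scope.Kind, scope.Value))
	}
	return args
}

func main() {
	scope := archiveScope{Kind: "distance", Value: 2628000}
	fmt.Println(resolveStorageProfile("reth", scope))
	fmt.Println(resolveStorageProfile("geth", scope)) // geth ignores --since, so full sizing
	fmt.Println(strings.Join(renderPruneArgs(scope), " "))
}
```

Keeping the client check inside the profile function means a caller cannot accidentally give geth the smaller PVC just because `--since` was passed.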
Helm compatibility
The Helmfile template uses `index .Values "field" | default ""` for optional storage/prune fields so older generated values that lack the new keys can still render. I verified an older-style values file without `executionStorageSize`, `consensusStorageSize`, or `pruneKind` still renders the previous full archive sizes.

Validation
Automated checks:
    go test ./internal/network ./internal/embed ./cmd/obol
    git diff --check
    jq empty renovate.json

Rendered smoke checks:

- `--since=365d` now renders:
  - `800Gi` execution, `500Gi` consensus.
  - `distance=2628000`
- `--since=merge` now renders:
  - `1900Gi` execution, `500Gi` consensus.
  - `before=15537394`
- `4500Gi` execution, `500Gi` consensus
- `--execution-client=geth --since=365d` still writes full archive sizing and emits the existing warning that `--since` is only wired for Reth.

Expected impact
This makes partial archive installs materially less surprising: