[codex] Fix reth partial archive pruning and sizing#561

Merged: OisinKyne merged 1 commit into oisin/networks from codex/reth-archive-scope-fixes on May 28, 2026

@bussyjd bussyjd commented May 27, 2026

Summary

This is a stacked follow-up to #559. It keeps the original archive-mode design, but tightens the two places where the rendered behavior did not match the user-facing partial archive contract:

  • --since=<duration> now prunes all Reth history segments that Reth can prune by distance: account history, storage history, receipts, and bodies.
  • --since=<fork|duration|block> now drives a single Ethereum storage profile that is reused by the disk preflight, generated values.yaml, PVC sizing, and the upstream ethereum-node persistence request.

The goal is to make these four surfaces agree:

flowchart TD
  A[CLI flags: --mode and --since] --> B[ArchiveScope]
  B --> C[Ethereum storage profile]
  B --> D[Reth prune args]
  C --> E[Disk preflight]
  C --> F[Generated values.yaml]
  F --> G[PVC requests]
  F --> H[ethereum-node persistence.size]
  D --> I[Rendered reth command]

Findings that motivated this

While reviewing #559, I found that ArchiveScope only affected the Reth args. Storage still treated every mainnet archive as a genesis archive:

  • --since=365d generated pruneKind: distance, but PVCs still rendered as 4500Gi execution and 500Gi consensus.
  • The disk preflight still warned for roughly 5000 GB, even though the docs and picker describe 365d as a bounded partial archive.
  • The distance branch emitted only account/storage history prune flags, leaving receipts and bodies unbounded.

That made the install internally inconsistent: the CLI claimed bounded archive history, but Kubernetes storage and parts of Reth history retention still behaved like a much larger archive.

flowchart LR
  A[--since=365d] --> B[account-history.distance]
  A --> C[storage-history.distance]
  A -. missing before this PR .-> D[receipts.distance]
  A -. missing before this PR .-> E[bodies.distance]
  A -. previously still .-> F[4500Gi PVC]

Reth verification

I double-checked this against upstream Reth before changing the chart template.

Both v2.2.0 and current main support:

  • --prune.account-history.before
  • --prune.storage-history.before
  • --prune.receipts.before
  • --prune.bodies.before
  • --prune.account-history.distance
  • --prune.storage-history.distance
  • --prune.receipts.distance
  • --prune.bodies.distance

Relevant upstream references:

Because Reth supports exact .before cutoffs for receipts and bodies, this PR removes the coarser pre-merge special case. A raw block like --since=1000000 should mean “keep from block 1000000 forward,” not “keep only from the Merge forward” for receipts and bodies.

What changed

Storage profile

Added resolveEthereumStorageProfile, which centralizes the sizing decision for Ethereum installs.

flowchart TD
  A[mode != archive] --> A1[full profile]
  B[archive + unsupported client] --> B1[genesis archive profile]
  C[archive + reth + since=genesis] --> C1[genesis archive profile]
  D[archive + reth + since=fork] --> D1[partial hardfork profile]
  E[archive + reth + since=duration] --> E1[partial distance profile]
  F[archive + reth + raw block after known fork] --> F1[nearest conservative hardfork profile]
  G[archive + reth + raw block before merge] --> G1[genesis archive profile]

Important detail: unsupported clients such as geth still get genesis archive sizing even if --since is present. That is deliberate. #559 warns that --since is wired only for Reth, so the storage profile must not underprovision clients that will ignore the partial-prune args.
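
The archive branches of that decision can be sketched in Go as follows. This is a simplified illustration, not the merged implementation: `sketchArchiveProfile`, the `archiveScope` field set, and the `"before"` kind label are assumptions, and the non-archive (full) profile is left out because its sizes are not stated here. Only the sizes quoted in this PR (4500Gi, 1900Gi, 800Gi, 500Gi) are used; the 800Gi figure comes from the 365d smoke check, so other durations may resolve differently in the real code.

```go
package main

import "fmt"

// mergeBlock is the cutoff rendered for --since=merge in this PR.
const mergeBlock = 15537394

// archiveScope is a simplified stand-in for ArchiveScope; the field set and
// the "before" label are assumptions of this sketch.
type archiveScope struct {
	Kind  string // "", "all", "before", or "distance"
	Block uint64 // cutoff block when Kind == "before"
}

type ethereumStorageProfile struct {
	ExecutionSize string
	ConsensusSize string
}

// Only Reth receives partial-prune args, so only Reth may get partial sizing.
var partialArchiveClients = map[string]bool{"reth": true}

// sketchArchiveProfile covers the archive-mode branches only.
func sketchArchiveProfile(executionClient string, scope archiveScope) ethereumStorageProfile {
	genesis := ethereumStorageProfile{ExecutionSize: "4500Gi", ConsensusSize: "500Gi"}
	if !partialArchiveClients[executionClient] || scope.Kind == "" || scope.Kind == "all" {
		return genesis
	}
	switch scope.Kind {
	case "distance":
		// Figure taken from the 365d smoke check; other durations may differ.
		return ethereumStorageProfile{ExecutionSize: "800Gi", ConsensusSize: "500Gi"}
	case "before":
		if scope.Block >= mergeBlock {
			// Merge sizing is the conservative choice for any later cutoff.
			return ethereumStorageProfile{ExecutionSize: "1900Gi", ConsensusSize: "500Gi"}
		}
	}
	// Pre-merge cutoffs and unknown kinds fall back to the largest profile.
	return genesis
}

func main() {
	fmt.Println(sketchArchiveProfile("reth", archiveScope{Kind: "distance"}))
	fmt.Println(sketchArchiveProfile("geth", archiveScope{Kind: "distance"}))
}
```

The point of the ordering is that the client check runs first, so a non-Reth client can never reach a smaller profile no matter what `--since` resolves to.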

Reth prune args

The Reth branch now renders all relevant prune segments:

# before
--prune.account-history.before=N
--prune.storage-history.before=N
--prune.receipts.before=N
--prune.bodies.before=N

# distance
--prune.account-history.distance=N
--prune.storage-history.distance=N
--prune.receipts.distance=N
--prune.bodies.distance=N
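
The flag set above is one pattern applied to four segments. The PR renders it in the chart template; the same logic is sketched below in Go for illustration only. `rethPruneArgs` and `pruneSegments` are hypothetical names, while the flag spellings are the ones listed above.

```go
package main

import "fmt"

// pruneSegments lists the Reth history segments this PR prunes.
var pruneSegments = []string{"account-history", "storage-history", "receipts", "bodies"}

// rethPruneArgs is a hypothetical helper. kind must be "before" (block
// cutoff) or "distance" (number of recent blocks to keep); any other kind
// yields no flags, which leaves Reth unpruned.
func rethPruneArgs(kind string, n uint64) []string {
	if kind != "before" && kind != "distance" {
		return nil
	}
	args := make([]string, 0, len(pruneSegments))
	for _, seg := range pruneSegments {
		args = append(args, fmt.Sprintf("--prune.%s.%s=%d", seg, kind, n))
	}
	return args
}

func main() {
	for _, arg := range rethPruneArgs("before", 15537394) {
		fmt.Println(arg)
	}
}
```

Driving all four flags from a single segment list makes it hard to repeat the earlier gap, where the distance branch covered account and storage history but not receipts or bodies.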

Helm compatibility

The Helmfile template uses index .Values "field" | default "" for optional storage/prune fields so older generated values that lack the new keys can still render. I verified an older-style values file without executionStorageSize, consensusStorageSize, or pruneKind still renders the previous full archive sizes.
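
The fallback behaviour can be illustrated in plain Go rather than Helm template syntax. This is a sketch of the idea only: `stringOr` is a hypothetical helper, the `network` key is invented for the example, and the real template may structure the fallback differently. The key names and fallback sizes are the ones mentioned in this PR.

```go
package main

import "fmt"

// stringOr returns values[key] when it holds a non-empty string and
// fallback otherwise, so a missing key behaves like an empty one.
func stringOr(values map[string]any, key, fallback string) string {
	if s, ok := values[key].(string); ok && s != "" {
		return s
	}
	return fallback
}

func main() {
	// Older generated values: none of the new storage keys are present.
	older := map[string]any{"network": "mainnet"}
	// Newer generated values for a 365d partial archive.
	newer := map[string]any{
		"executionStorageSize": "800Gi",
		"consensusStorageSize": "500Gi",
		"pruneKind":            "distance",
	}
	for _, values := range []map[string]any{older, newer} {
		fmt.Println(
			stringOr(values, "executionStorageSize", "4500Gi"),
			stringOr(values, "consensusStorageSize", "500Gi"),
		)
	}
}
```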

Validation

Automated checks:

go test ./internal/network ./internal/embed ./cmd/obol
git diff --check
jq empty renovate.json

Rendered smoke checks:

  • --since=365d now renders:
    • PVCs: 800Gi execution, 500Gi consensus
    • Reth args: account/storage/receipts/bodies .distance=2628000
  • --since=merge now renders:
    • PVCs: 1900Gi execution, 500Gi consensus
    • Reth args: account/storage/receipts/bodies .before=15537394
  • Older values without new storage keys still render:
    • PVCs: 4500Gi execution, 500Gi consensus
  • --execution-client=geth --since=365d still writes full archive sizing and emits the existing warning that --since is only wired for Reth.

Expected impact

This makes partial archive installs materially less surprising:

  • Disk warnings match the selected archive scope.
  • PVC and upstream chart persistence requests match the selected archive scope.
  • Reth receives complete partial-prune flags for the selected scope.
  • Non-Reth clients remain conservatively sized because they do not receive partial-prune flags.

@bussyjd bussyjd force-pushed the codex/reth-archive-scope-fixes branch from 1cc7073 to ed2e170 May 27, 2026 15:57
@bussyjd bussyjd marked this pull request as ready for review May 27, 2026 15:57

if !partialArchiveClients[executionClient] || scope.Kind == "" || scope.Kind == "all" {
	return ethereumStorageProfile{
		ExecutionSize: "4500Gi",

@apham0001 or @anadi2311, if you happen to spot this: do we have archive ELs for our data pipeline (I guess geth)? Do you know how much disk space they take up?

@OisinKyne OisinKyne merged commit 9fabd56 into oisin/networks May 28, 2026
@OisinKyne OisinKyne deleted the codex/reth-archive-scope-fixes branch May 28, 2026 12:12
@bussyjd bussyjd mentioned this pull request May 28, 2026