
refactor(archiver)!: simplify L2BlockSource block lookups #22809

Merged
PhilWindle merged 3 commits into merge-train/spartan from spl/internal-archiver-api-review
May 5, 2026


@spalladino spalladino commented Apr 28, 2026

⚠️ This PR includes #22870. Reviewers should review only the first commit. ⚠️

Motivation

Consolidates the block-related lookup surface on L2BlockSource from ~17 narrow methods returning ~9 different shapes down to 4 methods returning 2 shapes (L2Block and BlockData). Replaces the per-shape getters with discriminated query objects that carry both the lookup discriminant and a single onlyCheckpointed filter, removing the parallel Checkpointed* API and the throwaway wrapper types.

Additionally, this refactor centralizes block-zero handling in the archiver and threads the dynamic initial header through every component that previously hard-coded the constant, eliminating the divergence and removing the special-case branches in callers.

Approach

L2BlockSource exposes 4 methods that take query objects:

getBlock(query: BlockQuery): Promise<L2Block | undefined>
getBlocks(query: BlocksQuery): Promise<L2Block[]>
getBlockData(query: BlockQuery): Promise<BlockData | undefined>
getBlocksData(query: BlocksQuery): Promise<BlockData[]>

type BlockQuery  = ({number} | {hash} | {archive}) & { onlyCheckpointed?: boolean }
type BlocksQuery = ({from, limit} | {epoch})       & { onlyCheckpointed?: boolean }
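
As a rough illustration of how a discriminated query object can be resolved, the sketch below narrows on the discriminant with the `in` operator and then applies the `onlyCheckpointed` filter once. The `Block` shape, the `checkpointed` flag, and `findBlock` are hypothetical stand-ins for illustration, not the archiver's actual types or code.

```typescript
// Hypothetical stand-in: the real L2Block, hash, and archive types are richer.
type Block = { number: number; hash: string; archive: string; checkpointed: boolean };

type BlockQuery = ({ number: number } | { hash: string } | { archive: string }) & {
  onlyCheckpointed?: boolean;
};

// Resolve a single-block query against an in-memory list (assumed storage).
function findBlock(blocks: Block[], query: BlockQuery): Block | undefined {
  let match: Block | undefined;
  if ('number' in query) {
    match = blocks.find(b => b.number === query.number);
  } else if ('hash' in query) {
    match = blocks.find(b => b.hash === query.hash);
  } else {
    match = blocks.find(b => b.archive === query.archive);
  }
  // The filter applies uniformly, whichever discriminant located the block.
  if (match && query.onlyCheckpointed && !match.checkpointed) {
    return undefined;
  }
  return match;
}
```

Because the filter lives on the query rather than in the method name, one code path can serve both the plain and the checkpointed-only lookups.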

On-disk format is unchanged — the archiver already stored block metadata, tx bodies, and per-checkpoint L1/attestation data in separate LMDB maps; CheckpointedL2Block was only an in-memory join produced at read time.

Includes changes from #22870

API surface change

Methods removed from L2BlockSource

getL2Block, getL2BlockByHash, getL2BlockByArchive, getCheckpointedBlock, getCheckpointedBlockByHash, getCheckpointedBlockByArchive, getCheckpointedBlocks, getCheckpointedBlocksForEpoch, getCheckpointedBlockHeadersForEpoch, getBlock(number), getBlocks(from, limit), getBlockData(number), getBlockDataByArchive, getBlockDataWithCheckpointContext, getBlockHeader, getBlockHeaderByHash, getBlockHeaderByArchive.

Types deleted

CheckpointedL2Block, BlockDataWithCheckpointContext — both removed entirely (file + schema + re-exports). Callers that previously read .l1 / .attestations off these now do getBlockData(...) followed by getCheckpointData(blockData.checkpointNumber) and read those fields off CheckpointData.
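
The two-step read described above might look roughly like this for a caller that only needs attestations. The interface and field shapes here (`BlockSourceLike`, the contents of `l1`, string attestations) are simplified assumptions; only the method names and the `checkpointNumber`, `l1`, and `attestations` fields come from the description.

```typescript
// Simplified, assumed shapes; the real BlockData and CheckpointData carry more fields.
type BlockData = { blockNumber: number; checkpointNumber: number };
type CheckpointData = { l1: { blockNumber: number }; attestations: string[] };

// Hypothetical slice of the block source, limited to the two calls used here.
interface BlockSourceLike {
  getBlockData(query: { number: number }): Promise<BlockData | undefined>;
  getCheckpointData(checkpointNumber: number): Promise<CheckpointData | undefined>;
}

// Previously a single CheckpointedL2Block read; now block data first, checkpoint second.
async function getAttestationsForBlock(
  source: BlockSourceLike,
  blockNumber: number,
): Promise<string[] | undefined> {
  const blockData = await source.getBlockData({ number: blockNumber });
  if (!blockData) {
    return undefined;
  }
  const checkpoint = await source.getCheckpointData(blockData.checkpointNumber);
  return checkpoint?.attestations;
}
```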

Types added

BlockQuery, BlocksQuery (and matching Zod schemas) on L2BlockSource. No new domain types — L2Block, BlockData, BlockHeader are unchanged.

AztecNode public RPC

Method names preserved (getBlock, getBlockHeader, getCheckpointedBlocks, etc. — bodies delegate internally to the new L2BlockSource methods). One wire-level change: AztecNode.getCheckpointedBlocks element type goes from CheckpointedL2Block[] to BlockResponse[], forced by the type deletion. Older RPC clients that parse the old shape will need to update.

Changes

  • stdlib: BlockQuery / BlocksQuery types + Zod schemas next to L2BlockSource. CheckpointedL2Block file deleted; BlockDataWithCheckpointContext removed from block_data.ts. ArchiverApiSchema and MockArchiver shrunk; new it() blocks cover each query discriminant. L2BlockStream migrated.
  • archiver: BlockStore consolidates to four query-object reads plus iterators. data_source_base.ts adds resolveBlocksQuery that translates { epoch } into { from, limit } (returns null for empty epochs so callers short-circuit to []). Mocks honor onlyCheckpointed.
  • aztec-node: server.ts keeps the public RPC method names but delegates to the new query methods. getCheckpointedBlocks adds a per-call Map<CheckpointNumber, CheckpointData> cache to avoid an N+1.
  • consumer migrations: world-state, txe, p2p block-txs handler, validator-client (validator.ts, proposal_handler.ts), pxe block-stream source (honors onlyCheckpointed via node.getL2Tips), prover-node, sequencer-client, telemetry-client, aztec/testing, L2BlockStream in stdlib.
  • tests: per-package mocks updated for the new shapes; new test covers getBlocks({ epoch }) empty-epoch returning [].
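
The epoch-to-range translation in the archiver bullet can be pictured with the sketch below. It assumes an in-memory `EpochIndex` mapping each epoch to its first and last block number, which is a hypothetical stand-in; how the real `resolveBlocksQuery` finds epoch bounds is not stated here.

```typescript
type BlocksQuery = ({ from: number; limit: number } | { epoch: number }) & {
  onlyCheckpointed?: boolean;
};
type BlockRange = { from: number; limit: number };

// Hypothetical index: epoch number -> first and last block number in that epoch.
type EpochIndex = Map<number, { first: number; last: number }>;

// Normalize either query form to a range; null signals an epoch with no blocks.
function resolveBlocksQuery(query: BlocksQuery, epochs: EpochIndex): BlockRange | null {
  if ('epoch' in query) {
    const bounds = epochs.get(query.epoch);
    if (!bounds) {
      return null;
    }
    return { from: bounds.first, limit: bounds.last - bounds.first + 1 };
  }
  return { from: query.from, limit: query.limit };
}

// Callers short-circuit to an empty array when the epoch is empty.
function getBlockNumbers(query: BlocksQuery, epochs: EpochIndex): number[] {
  const range = resolveBlocksQuery(query, epochs);
  if (range === null) {
    return [];
  }
  return Array.from({ length: range.limit }, (_, i) => range.from + i);
}
```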

@spalladino spalladino requested a review from a team as a code owner April 28, 2026 00:57
@spalladino spalladino added the ci-no-fail-fast Sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure label Apr 28, 2026
@spalladino spalladino force-pushed the spl/internal-archiver-api-review branch from 45b231c to 6f8d6e8 Compare April 28, 2026 01:12
Base automatically changed from spl/new-rpc-api to merge-train/spartan April 28, 2026 11:18
@spalladino spalladino force-pushed the spl/internal-archiver-api-review branch 4 times, most recently from 40b605e to 1ddff08 Compare April 28, 2026 14:58
@spalladino spalladino changed the base branch from merge-train/spartan to spl/remove-kv-store April 28, 2026 14:58
@spalladino spalladino changed the title refactor(node-rpc)!: unify L2BlockSource block lookups via query objects refactor(node-rpc)!: simplify L2BlockSource block lookups Apr 28, 2026
@spalladino spalladino added S-do-not-merge Status: Do not merge this PR and removed S-do-not-merge Status: Do not merge this PR labels Apr 28, 2026
@spalladino spalladino changed the title refactor(node-rpc)!: simplify L2BlockSource block lookups refactor(archiver)!: simplify L2BlockSource block lookups Apr 30, 2026
PhilWindle pushed a commit that referenced this pull request May 1, 2026
> **Note:** This PR is stacked together with #22818, #22809, and #22870
into combined PR #22891 (targeting `merge-train/spartan`) for easier
merging. The combined PR has each of these as a separate commit, so
reviewers can either review here or on the combined PR.

## Motivation

`KVArchiverDataStore` was a thick pass-through wrapper that re-exported
the same methods as the substores it owned (`BlockStore`, `LogStore`,
`MessageStore`, `ContractClassStore`, `ContractInstanceStore`), forcing
every API change to be plumbed through three layers. Removing it brings
the archiver one step closer to a clean, query-object-based data source
API and makes substore boundaries explicit at every call site.

## Approach

Replace the wrapper with a plain `ArchiverDataStores` bundle that
exposes the substores directly, and move cross-store helpers to free
functions on the bundle. Substores absorb the array/iterator helpers
that previously lived on the wrapper. Function-name caching becomes its
own small class, and the `ContractDataSource` adapter is now a named
class instead of an inline object literal so that re-prover tools and
tests can reach for it without depending on `data_stores.ts` internals.

## Changes

- **archiver**: Delete `KVArchiverDataStore`; add `ArchiverDataStores`
bundle (`BlockStore`, `LogStore`, `MessageStore`, `ContractClassStore`,
`ContractInstanceStore`, db, function-name cache) and
`createArchiverDataStores`. Cross-store helpers
(`getArchiverSynchPoint`, `backupArchiverDataStores`) move to free
functions.
- **archiver (renames)**: `ArchiverDataStores` fields use plural form:
`blocks`, `logs`, `messages`, `contractClasses`, `contractInstances`.
Substore class names and individual store classes are unchanged.
- **archiver (function names)**: Extract `FunctionNamesCache` class
(replaces `Map<string, string>` plus the
`registerContractFunctionSignatures`/`getDebugFunctionName` free
functions).
- **archiver (contract data source)**: Extract
`ArchiverContractDataSourceAdapter` class implementing
`ContractDataSource`; `createContractDataSource` is now a thin factory.
- **archiver (substores)**:
`BlockStore.getBlocks`/`getCheckpointedBlocks`/`getBlockHeaders` return
arrays (iterator variants renamed `iterate*`); contract substores expose
batch `add*`/`delete*` helpers.
- **archiver (tests)**: Split the 4286-line `kv_archiver_store.test.ts`
into per-substore test files (`block_store.test.ts`,
`log_store.test.ts`, `message_store.test.ts`,
`contract_class_store.test.ts`, `contract_instance_store.test.ts`).
- **node-lib, prover-node, txe, validator-client, end-to-end**: Update
call sites to reach for the relevant substore on `ArchiverDataStores`
directly.
Base automatically changed from spl/remove-kv-store to merge-train/spartan May 1, 2026 09:21
Consolidate the ~17 block-related lookup methods on `L2BlockSource` into 4
methods that take discriminated query objects: `getBlock`, `getBlocks`,
`getBlockData`, `getBlocksData`. Lookups accept `{ number } | { hash } |
{ archive }` (single) or `{ from, limit } | { epoch }` (range), with an
`onlyCheckpointed` filter that subsumes the per-checkpoint variants.

Removes `CheckpointedL2Block` and `BlockDataWithCheckpointContext` types
entirely; callers that need L1 publish info or attestations now read
them via `getCheckpointData(blockData.checkpointNumber)`. Internal
`BlockStore` storage is unchanged — `CheckpointedL2Block` was only an
in-memory join produced at read time.

Public `AztecNode.getCheckpointedBlocks` element type changes from
`CheckpointedL2Block[]` to `BlockResponse[]` (forced by the type
deletion); method names preserved.
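
With the joined type gone, a caller that wants checkpoint data for many blocks has to look it up itself. One plausible way to avoid a fetch per block, in the spirit of the per-call `Map<CheckpointNumber, CheckpointData>` cache the description mentions, is sketched below; the shapes and the `attachCheckpointData` helper are assumptions, not the node's actual code.

```typescript
// Simplified, assumed shapes.
type BlockData = { blockNumber: number; checkpointNumber: number };
type CheckpointData = { number: number; attestations: string[] };
type Joined = { block: BlockData; checkpoint: CheckpointData | undefined };

// Fetch each distinct checkpoint at most once per call, however many blocks share it.
async function attachCheckpointData(
  blocks: BlockData[],
  fetchCheckpoint: (checkpointNumber: number) => Promise<CheckpointData | undefined>,
): Promise<Joined[]> {
  const cache = new Map<number, CheckpointData | undefined>();
  const result: Joined[] = [];
  for (const block of blocks) {
    if (!cache.has(block.checkpointNumber)) {
      cache.set(block.checkpointNumber, await fetchCheckpoint(block.checkpointNumber));
    }
    result.push({ block, checkpoint: cache.get(block.checkpointNumber) });
  }
  return result;
}
```

The cache is scoped to a single call, so it needs no invalidation when checkpoints are added or pruned between calls.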
@spalladino spalladino force-pushed the spl/internal-archiver-api-review branch from ef5c3b5 to f930fd3 Compare May 4, 2026 12:27
The Aztec rollup has an implicit "block zero" whose state is captured in
an initial block header computed by `NativeWorldStateService`. Today the
archiver returns `undefined` for any query that resolves to genesis,
forcing every consumer (aztec-node, prover-node, p2p, sequencer, PXE,
sentinel) to reach into
`worldStateSynchronizer.getCommitted().getInitialHeader()` and
synthesize a fake block 0 themselves. Worse, components disagreed on the
genesis hash whenever `genesisTimestamp` or prefilled state diverged
from the default, because some used
`worldState.getInitialHeader().hash()` (dynamic) and others used the
protocol constant `GENESIS_BLOCK_HEADER_HASH` (static).

This refactor centralizes block-zero handling in the archiver and
threads the dynamic initial header through every component that
previously hard-coded the constant, eliminating the divergence and
removing the special-case branches in callers.

Construct world-state first, capture `nativeWs.getInitialHeader()`, and
pass it into the archiver at construction. The archiver returns a
synthetic `L2Block.empty(initialHeader)` for single-block queries that
resolve to genesis — by number, hash, archive, or tag. Range queries
explicitly do **not** prepend. `L2TipsCache`, world-state synchronizer's
`getL2Tips`, stdlib's `L2TipsStoreBase`, P2PClient/L2TipsKVStore, PXE,
and sentinel all switched to the dynamic `initialHeader.hash()`. Genesis
special-casing was deleted from aztec-node, prover-node, p2p_client,
sequencer (the all-zeros escape hatch), and stdlib's
`areBlockHashesEqualAt` — reorg detection at genesis is now real instead
of an `if blockNumber === 0 return true` short-circuit.
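
The genesis semantics described above (single-block queries resolve block 0, range queries never prepend it) can be sketched with a toy in-memory source. `InMemoryBlockSource` and its `Block` shape are hypothetical; the real archiver builds `L2Block.empty(initialHeader)` and also handles archive and tag lookups, which are omitted here.

```typescript
// Hypothetical stand-in for L2Block; the real type carries a full header and body.
type Block = { number: number; hash: string };
type SingleQuery = { number: number } | { hash: string };

class InMemoryBlockSource {
  private readonly genesis: Block;
  private readonly stored: Block[];

  constructor(initialHeaderHash: string, stored: Block[]) {
    // Built from the dynamic initial header hash, not a protocol constant.
    this.genesis = { number: 0, hash: initialHeaderHash };
    this.stored = stored;
  }

  getBlock(query: SingleQuery): Block | undefined {
    const wantsGenesis = 'number' in query ? query.number === 0 : query.hash === this.genesis.hash;
    if (wantsGenesis) {
      return this.genesis;
    }
    return this.stored.find(b => ('number' in query ? b.number === query.number : b.hash === query.hash));
  }

  getBlocks(query: { from: number; limit: number }): Block[] {
    // Range reads only see stored blocks, so block 0 is never prepended.
    return this.stored.filter(b => b.number >= query.from && b.number < query.from + query.limit);
  }
}
```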

- **archiver**: synthetic genesis block in `data_source_base` (with
`archive` set to `new AppendOnlyTreeSnapshot(genesisArchiveRoot, 1)`),
`initialHeader` plumbing through `factory` / constructor, `L2TipsCache`
uses dynamic genesis hash, `MockL2BlockSource` synthesizes block 0 +
exposes `getInitialHeader`/`setInitialHeader`/`setGenesisArchiveRoot`,
`NoopL1Archiver` accepts `initialHeader`.
- **world-state**: `createWorldState` exported as a public factory, new
`createWorldStateSynchronizerOverNative` that wraps a pre-built native
instance, `getL2Tips` reports `initialHeader.hash()` and
`BlockNumber.ZERO` for genesis tips.
- **aztec-node**: `server.ts` reorders wiring so world-state is built
first; `getBlock`, `getBlockHeader`, `resolveBlockNumber`,
`getPrivateLogsByTags`, `getPublicLogsByTagsFromContract` no longer
special-case genesis. `buildGenesisBlockResponse` deleted. Sentinel
re-creates its `L2TipsMemoryStore` in `init()` with the archiver's
block-0 hash.
- **stdlib**: `L2TipsStoreBase` accepts `initialBlockHash` (default
`GENESIS_BLOCK_HEADER_HASH` for back-compat). `areBlockHashesEqualAt` no
longer short-circuits at block 0. `L2BlockStream`'s reorg-search loop
refuses to walk past block 0 — emits a clear "genesis hash mismatch"
error instead of cascading into "block hash not found for -1".
`L2TipsKVStore`/`L2TipsMemoryStore` thread the param through.
- **p2p**: `P2PClient` / `createP2PClient` accept and forward
`initialBlockHash`; aztec-node passes the archiver's genesis hash. Test
helpers and benches updated.
- **pxe**: fetches `node.getBlock(0)` at startup and seeds
`L2TipsKVStore` with that hash.
- **sequencer-client**: deleted the all-zeros escape hatch in
`getStatus`.
- **prover-node**: `gatherPreviousBlockHeader` calls
`l2BlockSource.getBlockData({number:0})` uniformly.
- **tests**: new `archiver/src/modules/data_source_base.test.ts`
covering genesis-query semantics; world-state and validator-client
integration tests thread `db.getInitialHeader()` and
`genesisArchiveRoot` to the archiver/mock; new `l2_block_stream.test.ts`
case asserting the genesis-hash-mismatch error path.
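
To make the stdlib bullet's reorg-search change concrete, here is a minimal sketch of a backwards walk that stops at block 0 and fails loudly when even the genesis hashes disagree. `findCommonAncestor` and its callback parameters are hypothetical, and the exact error text in `L2BlockStream` may differ.

```typescript
type HashLookup = (blockNumber: number) => string | undefined;

// Walk back from `start` (assumed >= 0) until local and source hashes agree.
function findCommonAncestor(start: number, localHashAt: HashLookup, sourceHashAt: HashLookup): number {
  for (let n = start; n >= 0; n--) {
    const localHash = localHashAt(n);
    if (localHash !== undefined && localHash === sourceHashAt(n)) {
      return n;
    }
  }
  // Reaching here means block 0 itself disagreed: never continue to block -1.
  throw new Error(`genesis hash mismatch: local=${localHashAt(0)} source=${sourceHashAt(0)}`);
}
```

Since block 0 is compared like any other block, two components configured with different genesis state fail at the comparison instead of silently agreeing.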
@spalladino spalladino force-pushed the spl/internal-archiver-api-review branch from 4b17843 to 119e271 Compare May 4, 2026 12:46
… identity

Construct the synthetic genesis L2Block and BlockData once in the
ArchiverDataSourceBase constructor instead of building a fresh instance
per call. Consumers that cache by reference (e.g. L2TipsCache) require
`getBlock({number:0})` to return the same instance across calls.
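
The difference between the two construction strategies shows up under reference equality. The toy classes below are hypothetical and exist only to contrast building the genesis block per call with building it once in the constructor.

```typescript
// Hypothetical stand-in for the synthetic genesis L2Block.
type Block = { number: number; hash: string };

// Builds a fresh object on every call: equal contents, different identity.
class PerCallGenesisSource {
  private readonly hash: string;
  constructor(hash: string) {
    this.hash = hash;
  }
  getGenesis(): Block {
    return { number: 0, hash: this.hash };
  }
}

// Builds the object once: every call returns the same instance.
class CachedGenesisSource {
  private readonly genesis: Block;
  constructor(hash: string) {
    this.genesis = { number: 0, hash };
  }
  getGenesis(): Block {
    return this.genesis;
  }
}
```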
spalladino added a commit that referenced this pull request May 4, 2026
… objects

Apply the same simplification PR #22809 made to the block-side API to the
checkpoint-side API: collapse 9 narrow methods over 4 return shapes into 4
query-shaped methods over 2 return shapes, plus a polymorphic proposed-checkpoint
lookup. Tightens the public RPC: removes the 'latest' / 'proposed' alias foot-gun,
adds a confirmed->proposed fallback for by-number/by-slot lookups, and rejects
incompatible include flags up-front.

BREAKING CHANGE: `getCheckpointsDataForEpoch` removed; `'latest'` removed from
`CheckpointParameter`; `'proposed'` semantics tightened (was alias for latest
confirmed, now strictly the proposed map / proposed-tip with confirmed fallback);
`getCheckpoint('proposed', { includeAttestations | includeL1PublishInfo })` and
proposed-fallback equivalents now throw `BadRequestError`.
@PhilWindle PhilWindle merged commit f655893 into merge-train/spartan May 5, 2026
12 checks passed
@PhilWindle PhilWindle deleted the spl/internal-archiver-api-review branch May 5, 2026 09:04
spalladino added a commit that referenced this pull request May 5, 2026
PhilWindle pushed a commit that referenced this pull request May 6, 2026
… objects (#22933)

## Motivation

Clean up the checkpoint side of `L2BlockSource`. PR #22809 already
collapsed the block-side API into 4 query-shaped methods over 2 return
types; the checkpoint surface was left with the pre-refactor sprawl (9
narrow methods over 4 return shapes, parallel by-number / by-range /
by-epoch entrypoints, and a wire-level alias that conflated proposed and
confirmed checkpoints). This change applies the same simplification.

Fixes A-979

## Approach

`L2BlockSource` checkpoint methods reduce to 4 query-shaped readers
(`getCheckpoint`, `getCheckpoints`, `getCheckpointData`,
`getCheckpointsData`) over 2 return shapes (`PublishedCheckpoint`,
`CheckpointData`), plus a polymorphic
`getProposedCheckpointData(query?)` for the proposed-only path. Three
new query types live next to `BlockQuery`/`BlocksQuery`. On-disk format
and `BlockStore` primitives are unchanged — the simplification is at the
API boundary. The public RPC's `getCheckpoint` keeps the same wire
signature but gains a confirmed→proposed fallback (for
`{number}`/`{slot}`/`'proposed'` lookups) and `BadRequestError` guards
for incompatible `include*` flags.

## API surface change

### Methods removed from `L2BlockSource`

`getCheckpoints(from, limit)`, `getCheckpointData(n)`,
`getCheckpointDataRange(from, limit)`, `getCheckpointsForEpoch(epoch)`,
`getCheckpointsDataForEpoch(epoch)`, `getCheckpointNumberBySlot(slot)`,
`getLastCheckpoint()`, `getLastProposedCheckpoint()`. Dead methods on
`data_source_base` also removed: `getCheckpointHeader`,
`getLastBlockNumberInCheckpoint`, `getSynchedCheckpointNumber`.

### Methods added to `L2BlockSource`

```ts
getCheckpoint(query: CheckpointQuery): Promise<PublishedCheckpoint | undefined>
getCheckpoints(query: CheckpointsQuery): Promise<PublishedCheckpoint[]>
getCheckpointData(query: CheckpointQuery): Promise<CheckpointData | undefined>
getCheckpointsData(query: CheckpointsQuery): Promise<CheckpointData[]>
getProposedCheckpointData(query?: ProposedCheckpointQuery): Promise<ProposedCheckpointData | undefined>

type CheckpointQuery         = { number } | { slot } | { tag: 'checkpointed' | 'proven' | 'finalized' }
type CheckpointsQuery        = { from, limit } | { epoch }
type ProposedCheckpointQuery = { number } | { slot } | { tag: 'proposed' }
```

### Public RPC (`AztecNode`) wire-level changes

- `getCheckpointsDataForEpoch(epoch)` removed;
`getCheckpointsData(query: CheckpointsQuery)` added (range or epoch).
- `'latest'` removed from `CheckpointParameter`.
- `'proposed'` semantics changed: previously aliased to "latest
L1-confirmed checkpoint" (a documented foot-gun); now
`getCheckpoint('proposed')` strictly targets the proposed-checkpoint
store, and `getCheckpointNumber('proposed')` returns the proposed-tip
number with confirmed fallback.
- `getCheckpoint({ number }) / ({ slot })` now check confirmed first
then fall back to proposed; tag-based lookups (`'checkpointed'` /
`'proven'` / `'finalized'`) do not fall back.
- `getCheckpoint('proposed', { includeL1PublishInfo: true |
includeAttestations: true })` and the same flags on a by-number/by-slot
lookup that resolves to a proposed entry now throw `BadRequestError`
(proposed checkpoints have no L1 publish info or attestations).
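
The by-number fallback and its guard might be structured roughly as follows. This is a sketch over two in-memory maps; `lookupCheckpoint`, the entry shapes, and the local `BadRequestError` class are hypothetical stand-ins for the server's real types.

```typescript
// Hypothetical stand-in for the server's BadRequestError.
class BadRequestError extends Error {}

type ConfirmedEntry = { kind: 'confirmed'; number: number; attestations: string[] };
type ProposedEntry = { kind: 'proposed'; number: number };
type IncludeFlags = { includeAttestations?: boolean; includeL1PublishInfo?: boolean };

// Confirmed store wins; the proposed store is consulted only as a fallback.
function lookupCheckpoint(
  checkpointNumber: number,
  confirmed: Map<number, ConfirmedEntry>,
  proposed: Map<number, ProposedEntry>,
  include: IncludeFlags = {},
): ConfirmedEntry | ProposedEntry | undefined {
  const confirmedHit = confirmed.get(checkpointNumber);
  if (confirmedHit) {
    return confirmedHit;
  }
  const proposedHit = proposed.get(checkpointNumber);
  // Proposed checkpoints have no attestations or L1 publish info to include.
  if (proposedHit && (include.includeAttestations || include.includeL1PublishInfo)) {
    throw new BadRequestError('include flags are not supported for proposed checkpoints');
  }
  return proposedHit;
}
```

Rejecting the flags only when the lookup actually lands on a proposed entry keeps the same request valid once that checkpoint is confirmed.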

### Types kept

`CheckpointData`, `CommonCheckpointData` (structural base of
`CheckpointData` / `ProposedCheckpointInput`), `ProposedCheckpointData`,
`ProposedCheckpointInput`, `PublishedCheckpoint`, `Checkpoint`. No
structural-type deletions.

Migration guidance for wallet/SDK consumers is in
`docs/docs-developers/docs/resources/migration_notes.md`.

## Changes

- **stdlib**: New query types (`CheckpointQuery`, `CheckpointsQuery`,
`ProposedCheckpointQuery`) + Zod schemas in `block/l2_block_source.ts`.
`'latest'` literal removed from `interfaces/checkpoint_parameter.ts`.
`NormalizedCheckpointDispatch` type for the server's parameter
normalizer. `ArchiverApiSchema` and `AztecNode` schema updated.
`computeL2ToL1MembershipWitness` switched to the new query shape.
- **archiver**: `data_source_base` adds `resolveCheckpointQuery` /
`resolveCheckpointsQuery` mirroring the block-side helpers, implements
the 4 confirmed methods plus the polymorphic proposed lookup.
`BlockStore` adds `getProposedCheckpointBySlot(slot)`. `MockArchiver`
and `mock_l2_block_source` updated to match the new interface.
- **aztec-node**: `server.ts` adds the confirmed→proposed fallback flow
with the two `BadRequestError` guards in `getCheckpoint`, sources all
tips from a single `getL2Tips()` call in `getCheckpointNumber`, and
routes the public RPC through the new internal methods. New
pure-projection helper `projectProposedToCheckpointResponse` in
`block_response_helpers.ts`.
- **consumer migrations**: prover-node (collapses two checkpoint fetches
into one `getCheckpoints({ epoch })`), world-state, slasher, sequencer
(`checkpoint_proposal_job`, `sequencer`), validator
(`proposal_handler`), `L2BlockStream`, pxe `block_stream_source`,
telemetry wrapper, and 10 e2e files updated to the new query shapes.
- **tests**: 48 new `it()` blocks covering each query discriminant, the
throw guards, the confirmed→proposed fallback, the polymorphic
`getProposedCheckpointData` dispatch, and
`BlockStore.getProposedCheckpointBySlot`.
- **docs**: `migration_notes.md` updated with the breaking changes for
downstream wallet/SDK consumers.
danielntmd pushed a commit to danielntmd/aztec-packages that referenced this pull request May 6, 2026
BEGIN_COMMIT_OVERRIDE
refactor(archiver)!: simplify L2BlockSource block lookups (AztecProtocol#22809)
chore(lint): allow branded primitive types as keys in collections
(AztecProtocol#22935)
test(e2e): test missed l1 publishing under pipelining (AztecProtocol#22926)
fix: dedup attestation pool by payload hash (AztecProtocol#22871)
chore: notify slack users directly (AztecProtocol#22944)
END_COMMIT_OVERRIDE
PhilWindle added a commit that referenced this pull request May 6, 2026
PR #22933 (and earlier #22809) reshaped L2BlockSource: getBlockHeader,
getCheckpointsForEpoch, and the positional getCheckpoints(from, limit)
were removed. L2TipsMemoryStore also gained a required initialBlockHash
constructor argument.

- getBlockHeader(n) -> (await getBlockData({ number: n }))?.header
- getCheckpointsForEpoch(epoch) -> getCheckpoints({ epoch }), with
  field access moving from .number/.blocks to .checkpoint.number/.blocks
- startProof folds the two-call pattern (checkpoints + separate
  attestations fetch) into one getCheckpoints({ epoch }) call since
  PublishedCheckpoint already carries attestations per entry
- L2TipsMemoryStore initialised in the constructor body with
  l2BlockSource.getGenesisBlockHash()

Test updates mirror the production migration; also restores the
beforeEach getBlockData mock to return a header for any block number
(the merge resolution had narrowed it to a single block, breaking the
checkpoint-driven flow tests).
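
For call sites that still want a header-only read, the first migration rule above can be wrapped in a small helper. The sketch below assumes a pared-down `BlockDataSource` interface and header shape; only `getBlockData({ number })` and the `.header` field are taken from the commit message.

```typescript
// Simplified, assumed shapes.
type BlockHeader = { blockNumber: number };
type BlockData = { header: BlockHeader };

// Hypothetical slice of L2BlockSource, limited to the call used here.
interface BlockDataSource {
  getBlockData(query: { number: number }): Promise<BlockData | undefined>;
}

// Drop-in replacement for the removed getBlockHeader(n).
async function getBlockHeader(source: BlockDataSource, n: number): Promise<BlockHeader | undefined> {
  return (await source.getBlockData({ number: n }))?.header;
}
```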