chore: backport merge-train/spartan PRs to v5-next#23846
Merged
Use `createHash`, which is used in other parts of the code. No awaits. This also fixes a potential race condition if you `getOrValidate` concurrently. Seems to even be slightly faster due to no microtask scheduling.

| Metric | Before (subtle) | After (createHash) | Δ (avg) |
|--------|-----------------|---------------------|---------|
| **PRV-SHA256** | 3.39 ms | 3.13 ms | ~8% faster |
| **PUB-SHA256** | 5.27 ms | 5.00 ms | ~5% faster |
| PRV getTxHash | 9.39 ms | 8.24 ms | (unchanged path; run noise) |
| PUB getTxHash | 18.53 ms | 17.55 ms | (unchanged path; run noise) |
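To illustrate why dropping the `await` removes the race, here is a minimal sketch using Node's synchronous `createHash`. The cache, the `getOrComputeHash` name, and the string keys are assumptions for illustration only; they are not the project's actual `getOrValidate` implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache keyed by an item id (not the project's real data structure).
const hashCache = new Map<string, string>();

/** Synchronously computes SHA-256 over the given bytes and returns it as hex. */
function sha256Hex(data: Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

/**
 * Returns the cached hash for `id`, computing it on first use.
 * Nothing is awaited between the cache check and the cache write,
 * so two callers cannot interleave and both miss the cache.
 */
function getOrComputeHash(id: string, data: Uint8Array): string {
  let hash = hashCache.get(id);
  if (hash === undefined) {
    hash = sha256Hex(data);
    hashCache.set(id, hash);
  }
  return hash;
}

const digest = getOrComputeHash("tx-1", new TextEncoder().encode("abc"));
console.log(digest);
```

With an async `subtle.digest` call, a second caller could run during the `await` and start a duplicate computation; the synchronous version has no such suspension point.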
Attempt at deflaking slashing tests by keeping some warmup slots. Codex analysis below.
## Cause
The duplicate proposal slash test was timing-fragile after proposer pipelining. In the failed CI run, the malicious shared-key validators were selected for the first slot of the target epoch immediately after the test started sequencers and warped time forward. Because pipelining builds a proposal one slot ahead, selecting the first slot meant the malicious nodes had to start building at the exact warp boundary. They started late in the build sub-slot, serialized the duplicate proposals after the receiver nodes had already advanced to the next slot, and the receivers rejected the gossip as late (`invalid slot number`) before duplicate-proposal detection could run.
Failed run: [`36837b7edc543e70`](https://ci.aztec-labs.com/36837b7edc543e70)
## Analysis
The failed run never produced a `DUPLICATE_PROPOSAL` offense. It only produced a `DUPLICATE_ATTESTATION` offense, then timed out waiting for the expected proposal offense.
Timeline from the failed run:
- malicious validators were selected for slot 10
- both started building late in sub-slot 3
- both broadcast checkpoint proposals for slot 10
- receivers had already advanced to slot 11 and rejected slot 10 proposals as stale
- the generic offense wait was satisfied by a duplicate attestation
- the final `DUPLICATE_PROPOSAL` assertion timed out
A passing run selected a later proposer slot instead. That gave the sequencers a full warmup slot after the warp, both malicious nodes built early enough, standalone duplicate proposals were broadcast in time, and `DUPLICATE_PROPOSAL` was detected.
## Fix Rationale
Update `advanceToEpochBeforeProposer` to skip the first `warmupSlots` slots of the target epoch and return the concrete `targetSlot`. The duplicate proposal and duplicate attestation tests still warp to one slot before the target epoch, but now the selected malicious proposer slot is at least one slot into that epoch. That gives freshly-started sequencers one full slot of wall-clock warmup before the pipelined build for the malicious slot begins, avoiding the startup/warp boundary race where duplicate proposals are emitted too late and rejected before slash detection.
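The slot-selection rule from the fix rationale can be sketched as follows. This is an illustrative model, not the real `advanceToEpochBeforeProposer`: the `pickTargetSlot` name, its parameters, and the `TargetSelection` shape are all assumptions.

```typescript
/** Hypothetical result shape; the real helper's return type may differ. */
interface TargetSelection {
  targetSlot: number; // slot in which the malicious proposer is expected to propose
  warpToSlot: number; // slot the test warps to before starting sequencers
}

/**
 * Picks the first slot of the epoch, after skipping `warmupSlots`, for which
 * the predicate reports a malicious proposer. Returns undefined if none match.
 */
function pickTargetSlot(
  epochStartSlot: number,
  epochLength: number,
  warmupSlots: number,
  isMaliciousProposer: (slot: number) => boolean,
): TargetSelection | undefined {
  const firstEligible = epochStartSlot + warmupSlots;
  const epochEnd = epochStartSlot + epochLength;
  for (let slot = firstEligible; slot < epochEnd; slot++) {
    if (isMaliciousProposer(slot)) {
      // The test still warps to one slot before the target epoch.
      return { targetSlot: slot, warpToSlot: epochStartSlot - 1 };
    }
  }
  return undefined;
}

// Slots 10 and 12 belong to the malicious proposer; slot 10 is skipped as warmup.
const malicious = new Set([10, 12]);
const selection = pickTargetSlot(10, 4, 1, slot => malicious.has(slot));
console.log(selection);
```

With `warmupSlots = 0` the same inputs would select slot 10, which is the boundary case that failed in CI.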
## Motivation
The node exposed two overlapping transaction-lookup methods —
`getTxReceipt` (lifecycle/status) and `getTxEffect` (mined side effects)
— forcing callers to stitch results together and duplicating fields
(txHash, fee, block hash/number, execution result) for mined txs. This
resolves the standing `REFACTOR` note on `TxReceipt` by making
`getTxReceipt` the single lookup API.
**Breaking change** to the public node RPC, with no wire back-compat.
## Approach
`TxReceipt` becomes a discriminated union over the lifecycle —
`PendingTxReceipt | DroppedTxReceipt | MinedTxReceipt` over an abstract
`TxReceiptBase` — with `isMined()`/`isPending()`/`isDropped()` guards
for narrowing while bare field reads keep compiling. `getTxReceipt`
gains `GetTxReceiptOptions` to opt into attaching the full `TxEffect`,
the pending `Tx`, and its proof. Receipt assembly moves from the block
store to the node (deriving mined status from the cached `getL2Tips`),
`getSettledTxReceipt` is removed from the archiver/`L2BlockSource`, and
the public `getTxEffect` is deprecated.
**Breaking change** adds the `slotNumber` to the indexed tx effect,
changing the db format.
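
The discriminated-union shape described in the approach can be sketched roughly as below. Only the class names and the `isMined()`/`isPending()`/`isDropped()` guards come from the description; the `status` strings, the constructor parameters, and the `summarize` helper are assumptions, and the real classes carry more fields.

```typescript
type TxStatus = "pending" | "dropped" | "mined";

abstract class TxReceiptBase {
  abstract readonly status: TxStatus;
  constructor(public readonly txHash: string) {}

  // Type guards that narrow the union to a single variant.
  isPending(): this is PendingTxReceipt {
    return this.status === "pending";
  }
  isDropped(): this is DroppedTxReceipt {
    return this.status === "dropped";
  }
  isMined(): this is MinedTxReceipt {
    return this.status === "mined";
  }
}

class PendingTxReceipt extends TxReceiptBase {
  readonly status = "pending" as const;
}

class DroppedTxReceipt extends TxReceiptBase {
  readonly status = "dropped" as const;
}

class MinedTxReceipt extends TxReceiptBase {
  readonly status = "mined" as const;
  // Hypothetical mined-only field, used to show narrowing.
  constructor(txHash: string, public readonly blockNumber: number) {
    super(txHash);
  }
}

type TxReceipt = PendingTxReceipt | DroppedTxReceipt | MinedTxReceipt;

function summarize(receipt: TxReceipt): string {
  if (receipt.isMined()) {
    // Narrowed to MinedTxReceipt, so blockNumber is accessible here.
    return `mined in block ${receipt.blockNumber}`;
  }
  return receipt.isPending() ? "pending" : "dropped";
}

console.log(summarize(new MinedTxReceipt("0xabc", 42)));
```

Fields declared on the base class, such as `txHash`, stay readable without narrowing, which matches the note that bare field reads keep compiling.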
## Changes
- **stdlib**: `TxReceipt` variant classes + union + `TxReceiptSchema`,
`GetTxReceiptOptions`, `Tx.withoutProof`; new `getTxReceipt`
interface/RPC signature; removed `getSettledTxReceipt` from
`L2BlockSource`/archiver; `l2_to_l1_membership` collapsed to a single
call.
- **aztec-node**: node assembles `MinedTxReceipt` from `getTxEffect` +
`getL2Tips` (+ epoch); deprecated the public `getTxEffect`.
- **archiver**: removed `getSettledTxReceipt` impl/wrapper/mock; the
block store returns raw `IndexedTxEffect`s.
- **pxe / cli / cli-wallet / wallet-sdk / bot**: migrated callers off
the deprecated `getTxEffect` to `getTxReceipt(h, { includeTxEffect: true
})`; the PXE oracle and message-context paths reconstruct an
`IndexedTxEffect` so downstream note/event services are unchanged.
- **p2p**: `tx_archive` reuses `Tx.withoutProof`.
- **tests**: per-variant construction + union round-trip coverage,
node-level status-derivation tests across the tip boundaries, and
updated RPC round-trip tests.
- **docs**: migration-notes entry for the breaking change +
how-to-send-transaction guide update.
Note: the auto-generated `typescript-api` and `node-api-reference` docs
are intentionally not regenerated here — the former is bulk-regenerated
by release tooling (regenerating now would sweep in unrelated drift
since the v4.3.0 snapshot) and the latter's generator is currently
incompatible with the repo's Zod 4 schemas (a pre-existing issue
affecting all methods).
## Motivation
Proposer pipelining (the proposer builds for `slot + 1` while in `slot`, defers checkpoint finalization to the next slot, and builds on the locally-gossiped proposed parent checkpoint) was gated behind `SEQ_ENABLE_PROPOSER_PIPELINING`. Production already runs with it on and every non-trivial e2e suite opts in, so the dual on/off code path was dead weight: a duplicated timing model and `if (isPipelining)` branches scattered across the sequencer, validator, and p2p stack. This makes pipelining the only behavior of the production sequencer and removes the toggle.
## Approach
Removed the `enableProposerPipelining` config / `SEQ_ENABLE_PROPOSER_PIPELINING` env var everywhere and dropped every `isProposerPipeliningEnabled()` / `pipeliningOffset()` check, collapsing each to the pipelining branch (the `EpochCache` now always applies `PROPOSER_PIPELINING_SLOT_OFFSET`). The checkpoint timing model was consolidated to a single pipelined model. The test-only `AutomineSequencer` is preserved: it is selected by the separate `useAutomineSequencer` flag, publishes synchronously in-slot, and is the one remaining caller of the non-pipelined `getPreviousCheckpointOutHashes` branch. Checks for whether a proposed checkpoint exists (`hasProposedCheckpoint` / `proposedCheckpointData`) are kept.
## Changes
- **stdlib**: deleted `pipelining-config.ts`; consolidated `timetable` to a single `CheckpointTimingModel` (removed `StandardCheckpointTimingModel` and the `pipelining` option from `createCheckpointTimingModel` / `calculateMaxBlocksPerSlot`); kept the `pipeliningEnabled` param on `getPreviousCheckpointOutHashes` for automine, with new dedicated unit coverage.
- **foundation / archiver / sequencer-client / epoch-cache (config)**: removed the env-var registry entry and the `PipelineConfig` merges.
- **epoch-cache**: removed `isProposerPipeliningEnabled()` / `pipeliningOffset()`; `getTargetSlot` / `getTargetEpochAndSlotInNextL1Slot` / `getTargetAndNextSlot` always apply the offset.
- **sequencer-client**: collapsed the sequencer / checkpoint-proposal-job / publisher branches to the pipelining path; `SequencerTimetable` no longer takes a `pipelining` flag; `AutomineSequencer` untouched (boundary comment added).
- **validator-client / p2p**: collapsed proposal-handler, p2p-client, and clock-tolerance branches; deleted the orphaned `waitForBlockSourceSync` and the now-dead `block_source_not_synced` reason.
- **p2p gossipsub**: `maxBlocksPerSlot` now uses the pipelined timing model (a higher value), consistent with the always-pipelining sequencer.
- **end-to-end (tests)**: dropped the flag from `PIPELINING_SETUP_OPTS` / `AUTOMINE_E2E_OPTS` and ~40 test sites; `setup.ts` allows empty checkpoints unconditionally.
- **spartan / docs / aztec-up / docker-compose**: removed the env var from infra, and added a migration note.
- Also includes the benchmark pipelining-setup migration cherry-picked from #23647.
Note: `SEQ_ENABLE_PROPOSER_PIPELINING` is a breaking change for node operators; see the migration note.
Labeled `ci-no-fail-fast` to survey the full suite.
## Motivation
Building an L2-to-L1 message membership witness is currently a
client-side responsibility: callers hand
`computeL2ToL1MembershipWitness` an `OutboxRootsReader` plus a node, and
the helper reads the L1 Outbox and rebuilds the full four-level message
tree on every call, which involves sending all messages for an entire
epoch to the client. This adds a node JSON-RPC method that does the work
centrally, behind a cache.
As of this PR, the cache only caches reads to the Outbox and for a
single L1 slot, but the intent is to eventually cache the individual
L2-to-L1 trees so we don't have to recompute them on every request, as
well as permanently cache data for an epoch once it's finalized.
Fixes A-653
## Approach
A new `OutboxTreesResolver` in the archiver resolves witness requests.
Roots are fetched lazily on request and re-fetched only when the node's
synced L1 block has advanced. The four tree levels are rebuilt per
request by delegating to the existing, unchanged
`computeL2ToL1MembershipWitness` helper with the cached roots array, so
there is no second cache to keep consistent. The resolver is exposed
through the archiver (the node's block source) and surfaced as
`AztecNode.getL2ToL1MembershipWitness`.
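
The lazy, refresh-on-advance cache with single-flight de-duplication described above can be sketched as follows. This is a simplified model under stated assumptions: the `LazyRootsCache` class, its constructor callbacks, and the use of a plain `number` for the L1 block are hypothetical, and it does not reproduce the real `OutboxTreesResolver` (for example, it does not handle the synced block advancing while a fetch is in flight).

```typescript
class LazyRootsCache<T> {
  private cached: T | undefined;
  private cachedAtBlock: number | undefined;
  private inflight: Promise<T> | undefined;

  constructor(
    private readonly getSyncedL1Block: () => number,
    private readonly fetchRoots: (blockNumber: number) => Promise<T>,
  ) {}

  /** Returns cached roots, refetching only when the synced L1 block has advanced. */
  get(): Promise<T> {
    const block = this.getSyncedL1Block();
    if (this.cached !== undefined && this.cachedAtBlock === block) {
      return Promise.resolve(this.cached);
    }
    // Single-flight: concurrent callers share the one pending fetch.
    if (this.inflight !== undefined) {
      return this.inflight;
    }
    const pending = this.fetchRoots(block).then(
      roots => {
        this.cached = roots;
        this.cachedAtBlock = block;
        this.inflight = undefined;
        return roots;
      },
      err => {
        this.inflight = undefined;
        throw err;
      },
    );
    this.inflight = pending;
    return pending;
  }
}

// Usage: two concurrent requests trigger a single fetch.
let fetchCalls = 0;
const rootsCache = new LazyRootsCache<string[]>(
  () => 100,
  async blockNumber => {
    fetchCalls++;
    return [`root@${blockNumber}`];
  },
);
const firstRequest = rootsCache.get();
const secondRequest = rootsCache.get();
```

Pinning the read to the synced block number is what lets the cache key on that block, which corresponds to the optional `{ blockNumber }` read option mentioned for `OutboxContract.getRoots`.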
## Changes
- **archiver**: New `OutboxTreesResolver` (lazy roots cache,
single-flight de-duplication, witness assembly).
- **stdlib**: Add `getL2ToL1MembershipWitness` to the `AztecNode` /
`L2BlockSource` / archiver interfaces and schemas; export
`L2ToL1MembershipWitnessSchema`. The `computeL2ToL1MembershipWitness`
helper is unchanged.
- **aztec-node**: `AztecNodeService.getL2ToL1MembershipWitness`
passthrough to the block source.
- **ethereum**: `OutboxContract.getRoots` gains an optional `{
blockNumber }` read option so reads can be pinned to the node's synced
L1 block.
- **end-to-end (tests)**: Migrate `computeL2ToL1MembershipWitness` call
sites to the new RPC, wrapped in `retryUntil` for the cache's eventual
consistency. The synthetic-roots case in
`epochs_partial_proof_multi_root` keeps using the helper directly.
- **archiver (tests)**: New `outbox_trees_resolver.test.ts` covering the
lazy cache (refresh-on-advance, seal/finalize permanence, not-synced
handling), single-flight, witness correctness across partial-proof
depths and block-level compression, and transient-vs-genuine root
mismatch handling.
## Summary
`BLOCK_TXS` request/response validation had a bug that caused us to **discard perfectly good transactions**.
When a peer doesn't have the block (proposal pruned, or never received) but the request carried the full tx hashes, the responder (`reqRespBlockTxsHandler`) still matches those hashes against its own tx pool and ships whatever it finds; it just can't produce an availability bitvector for a block it doesn't know about. This is a legitimate "I don't have the block, but here are the txs you asked for by hash" response, not misbehaviour.
Previously this case was signalled by setting `archiveRoot = Fr.ZERO` on the response, and `validateRequestedBlockTxsConsistency` treated any response that didn't echo the requested archive root (including the zero case) as a hard failure: it returned `false`, which routed the response through the `INTERNAL_ERROR` path and discarded the returned txs entirely. The intended behaviour is the opposite: we want to **use** the txs the peer returned and merely mark the peer as "dumb" (it can't serve index-based smart requests), without penalising it.
## Changes
**Drop `archiveRoot` from `BlockTxsResponse`.**
- The archive root on the response only ever served as an out-of-band "I have / don't have the block" flag (and a redundant echo of the request).
- Checking if the response matches the request doesn't make sense. A cheating peer can always return the same archive root as the request, but otherwise malform the rest of the response.
- It is replaced by a `peerHasBlock()` helper that derives the same signal from the availability bitvector: an empty bitvector (length 0) means the peer doesn't have the block. The responder no longer special-cases the archive root.
**Rework `validateRequestedBlockTxsConsistency`.** Validation now:
- rejects + penalises (mid) duplicate txs in the response;
- resolves the block tx hashes from the proposal or the archiver; if neither is available we can't verify membership, so we reject without penalising (local-state gap, not a peer fault);
- rejects + penalises (low) any returned tx that is neither part of the block nor one we explicitly requested by hash, i.e. the returned set must be a subset of `block tx hashes ∪ request.txHashes` (a tx requested by hash may legitimately not belong to the block being validated);
- accepts (returns `true`) when the peer signals it lacks the block (`!peerHasBlock()`); the returned txs are still valid and usable, which is the core of the fix;
- rejects + penalises (mid) a bitvector whose length disagrees with the block size;
- rejects + penalises (low) a peer that advertises a requested tx via its bitvector but withholds it from the response.
The previous order / strictly-increasing and `maxReturnable` checks are removed; membership plus the advertise-vs-deliver check cover the cases that matter.
**Move dumb-marking into the smart/dumb decision.** `BatchTxRequester` no longer inspects archive roots. `decideIfPeerIsSmart` marks a peer dumb (and clears its per-peer data, without penalty) whenever the response signals it lacks the block (`!peerHasBlock()`); penalisation for genuinely inconsistent responses is left to the validator. The old `handleArchiveRootMismatch` helper is removed.
## Tests
- Updated the serialization, handler, validation, requester and integration tests to the `peerHasBlock()` model.
- Added a regression test at the validator level: a peer that signals it lacks the block via an empty bitvector but returns valid txs is now accepted instead of discarded.
- Added a regression test at the requester level: those txs are delivered (used) and the peer is marked dumb without penalty.
- Added coverage for the partial-availability case (peer returns fewer txs than bits set: we request a,b,c, peer has c,d,e, so only c comes back with three bits set) and for the by-hash case (a tx requested via `request.txHashes` that is not part of the block is accepted).
The regression tests fail against the former behaviour.
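
The validation order listed above can be modelled with a small sketch. Everything here is a simplified assumption: tx hashes are plain strings, the bitvector is a `boolean[]`, the request is assumed to carry the indices it asked for, and `ResponseSketch`, `Verdict`, and `validateResponse` are hypothetical names rather than the real `validateRequestedBlockTxsConsistency` signature.

```typescript
type Verdict = { accepted: boolean; penalty?: "low" | "mid" };

interface ResponseSketch {
  txHashes: string[]; // hashes of the txs the peer returned
  availability: boolean[]; // one bit per block tx; empty means the peer lacks the block
}

function peerHasBlock(response: ResponseSketch): boolean {
  return response.availability.length > 0;
}

function validateResponse(
  response: ResponseSketch,
  requestedByHash: string[],
  requestedIndices: number[],
  blockTxHashes: string[] | undefined, // undefined: neither proposal nor archiver has the block
): Verdict {
  const returned = new Set(response.txHashes);
  // Duplicate txs in the response: reject and penalise (mid).
  if (returned.size !== response.txHashes.length) return { accepted: false, penalty: "mid" };
  // Local-state gap: membership cannot be verified, so reject without penalty.
  if (blockTxHashes === undefined) return { accepted: false };
  // Returned set must be a subset of block tx hashes ∪ hashes requested explicitly.
  const allowed = new Set([...blockTxHashes, ...requestedByHash]);
  if (response.txHashes.some(hash => !allowed.has(hash))) return { accepted: false, penalty: "low" };
  // Peer lacks the block: the returned txs are still usable.
  if (!peerHasBlock(response)) return { accepted: true };
  // Bitvector length must agree with the block size.
  if (response.availability.length !== blockTxHashes.length) return { accepted: false, penalty: "mid" };
  // Advertised via the bitvector but withheld from the response.
  const withheld = requestedIndices.some(index => {
    const hash = blockTxHashes[index];
    return hash !== undefined && response.availability[index] === true && !returned.has(hash);
  });
  return withheld ? { accepted: false, penalty: "low" } : { accepted: true };
}

// A peer without the block (empty bitvector) that still returns a valid tx is accepted.
const lacksBlock = validateResponse({ txHashes: ["c"], availability: [] }, ["c"], [], ["a", "b", "c"]);
console.log(lacksBlock);
```

The membership check runs before the `peerHasBlock()` check, so a peer cannot smuggle unrelated txs through by sending an empty bitvector.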
## Summary
Fixes **[A-1070](https://linear.app/aztec-labs/issue/A-1070/malicious-proposer-can-make-honest-nodes-to-fail-tx-validation)**: a malicious proposer who sends two different proposals with the **same archive root but different tx sets** could make two honest nodes fail the `BLOCK_TXS` exchange and penalize each other.
In the `BLOCK_TXS` protocol the requester asks for txs by their **index** within a block (proposal), identified only by its archive root. If an equivocating proposer gives node A and node B two proposals that share an archive root but differ in their tx list, then:
- Node A (requester) asks node B for txs at indices `[i, j, …]` of "the block with this archive root".
- Node B (responder) resolves those indices against *its* version of the proposal and returns txs that, from A's perspective, are not part of the block.
- A's `validateRequestedBlockTxsConsistency` rejects the response and penalizes B: an honest node punished for honest behavior.
## Fix
The request now carries a **commitment to the full set of block tx hashes** (`blockTxHashesCommitment`, a SHA-256 over the serialized tx hashes) alongside the archive root. The responder only serves txs *by index* (and advertises availability via the bitvector) when its own block's tx-hash commitment matches the request's. Otherwise it treats the request as "I don't have that block", returning an empty bitvector and only servicing any explicitly-requested tx hashes, so neither side is penalized for an equivocation it didn't cause.
This closes the gap that the archive root alone could not: identical archive roots no longer imply identical tx sets.
## Why not use proposal hash?
That would work when the BLOCK_TXS request is from a proposal, but it cannot be used when it's done from a block (e.g., in the prover node).
## Changes
- `BlockTxsRequest` gains a `blockTxHashesCommitment` field and a `computeBlockTxHashesCommitment` helper; serialization and `fromTxsSourceAndMissingTxs` updated accordingly.
- `reqRespBlockTxsHandler` verifies the commitment before serving txs by index; on mismatch it falls back to the "block not available" path instead of returning indexed txs.
- This builds on the preceding `BLOCK_TXS` validation revamp commit (consistency checks on the requester side, response no longer echoes the archive root).
- Tests adapted across `block_txs`, `block_txs_handler`, and `libp2p_service`, plus a new handler test covering the equivocation case (different proposal under the same archive root → responder refuses to serve by index).
Closes https://linear.app/aztec-labs/issue/A-1070/malicious-proposer-can-make-honest-nodes-to-fail-tx-validation .
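
A minimal sketch of the commitment check follows. It assumes the tx hashes are fixed-size byte arrays concatenated in block order before hashing; the real serialization used by `computeBlockTxHashesCommitment` may differ (for example, it may include a length prefix), and `sketchCommitment` and `canServeByIndex` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

/** SHA-256 over the tx hashes, concatenated in block order (assumed serialization). */
function sketchCommitment(txHashes: Uint8Array[]): Buffer {
  const hasher = createHash("sha256");
  for (const txHash of txHashes) {
    hasher.update(txHash);
  }
  return hasher.digest();
}

/**
 * Responder-side check: serve by index only when the local block's tx set
 * matches the set the requester committed to.
 */
function canServeByIndex(requestCommitment: Buffer, localTxHashes: Uint8Array[]): boolean {
  return sketchCommitment(localTxHashes).equals(requestCommitment);
}

// Two proposals under the same archive root but with different tx lists.
const txA = Buffer.alloc(32, 1);
const txB = Buffer.alloc(32, 2);
const txC = Buffer.alloc(32, 3);
const requesterView = [txA, txB];
const responderView = [txA, txC];

const requestCommitment = sketchCommitment(requesterView);
console.log(canServeByIndex(requestCommitment, responderView)); // false: fall back to "block not available"
```

Because the commitment covers the ordered list, a reordering of the same txs also produces a mismatch, which is the desired outcome when indices are the addressing scheme.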
This allows us to then make local changes to vs code settings in yarn-project, and then ignore them via `skip-worktree` so they persist across `git clean` operations triggered by bootstrap.
…due (#23807)
## Motivation
The orphan-block guard in `checkSync` (added in #23606) was logging at `warn` on every non-proposer validator, ~once per second for a full slot, every slot. Under pipelining a node receives and re-executes a block proposal for the next checkpoint up to one slot before the matching checkpoint proposal arrives, so the world-state tip legitimately sits in an as-yet-unproposed checkpoint for that whole window. That is the happy path, not the abnormal "proposer published blocks but never the checkpoint" case the guard is meant to flag. Observed on `next-net`: 118 warnings in ~59s on a healthy validator for a single slot.
## Approach
The condition that distinguishes "checkpoint hasn't arrived yet" from "checkpoint will never arrive" is purely temporal, which is exactly what the archiver already computes in `pruneOrphanProposedBlocks` to decide when to prune an orphan block. The guard now reuses that same deadline: it still refuses to build (`return undefined`) whenever the orphan-shaped state holds, but only escalates to `warn` once the enclosing checkpoint is overdue by that deadline; within the normal pipelining window it logs at `debug`. The warn therefore fires at the same instant the archiver would prune the orphan.
## Changes
- **sequencer-client**: Add `isProposedCheckpointOverdue`, mirroring the archiver's orphan-prune deadline (`start of slot after the block's build slot + grace`, grace derived from `blockDurationMs` as the node wiring does). Gate the existing guard's log level on it: `warn` when overdue, `debug` otherwise. Control flow is unchanged.
- **sequencer-client (tests)**: Thread a real `blockSlot` through the orphan-guard test setup and split the warning test into an overdue case (expects `warn`) and a within-window case (expects no `warn`).
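
The deadline formula above (`start of slot after the block's build slot + grace`) can be sketched as below. The `SlotTiming` shape, the `isOverdueSketch` and `guardLogLevel` names, the millisecond units, and the inclusive comparison are assumptions; how grace is derived from `blockDurationMs` is not reproduced here, so it is passed in directly.

```typescript
/** Hypothetical timing parameters; real values come from the node's configuration. */
interface SlotTiming {
  genesisMs: number; // timestamp of the start of slot 0
  slotDurationMs: number; // length of one slot
}

/** True once the checkpoint enclosing a block built in `blockBuildSlot` is overdue. */
function isOverdueSketch(nowMs: number, blockBuildSlot: number, timing: SlotTiming, graceMs: number): boolean {
  const nextSlotStartMs = timing.genesisMs + (blockBuildSlot + 1) * timing.slotDurationMs;
  return nowMs >= nextSlotStartMs + graceMs;
}

/** Log level for the orphan guard: the control flow is the same either way. */
function guardLogLevel(nowMs: number, blockBuildSlot: number, timing: SlotTiming, graceMs: number): "warn" | "debug" {
  return isOverdueSketch(nowMs, blockBuildSlot, timing, graceMs) ? "warn" : "debug";
}

const timing: SlotTiming = { genesisMs: 0, slotDurationMs: 72_000 };
// Block built in slot 5 with 3 s grace: deadline is 6 * 72_000 + 3_000 = 435_000 ms.
console.log(guardLogLevel(434_999, 5, timing, 3_000));
console.log(guardLogLevel(435_000, 5, timing, 3_000));
```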
…ublish windows (#23776)
## Summary
Fixes timing bugs in block building and validation, now that proposer pipelining is the only production mode. Found via an audit of the sequencer timetable, checkpoint proposal job, validator client, proposal handler, and p2p proposal/attestation validators.
### The frame bug (main fix)
Under pipelining the proposer job runs with `slotNow = N-1` (build slot) and `targetSlot = N`. The job passed `targetSlot` to `setState` for build-frame states, so `Sequencer.setState` measured the `assertTimeLeft` deadlines against `getSlotStartBuildTimestamp(targetSlot)`, one full Aztec slot (72s) later than the build frame. The build-frame deadlines (`INITIALIZING_CHECKPOINT`, `CREATING_BLOCK`, `ASSEMBLING_CHECKPOINT`, `COLLECTING_ATTESTATIONS`, `PUBLISHING_CHECKPOINT`, …) were therefore checked ~72s too late and never fired.
Now these states are measured against `slotNow`. `targetSlot` is still used for headers, signing, and `sendRequestsAt`.
### Aligning the attestation / publish windows around L1 geometry
- The checkpoint attestation/publish deadline and the p2p attestation acceptance window are now derived from `ethereumSlotDuration`: **one Ethereum slot (12s) before the last L1 block of the target slot**, the latest a checkpoint can be submitted and still land on L1 in its slot. Previously the deadline used the configurable `l1PublishingTime` and the p2p window was only `2 * p2pPropagationTime` (~4.5s into the target slot). This also unifies the deadline with the publisher's send lead (`sendRequestsAt` already targets one Ethereum slot before the target slot start).
- Validators (in `validateCheckpointProposal`) keep validating/attesting checkpoint proposals until that L1 publish deadline instead of the target-slot start, so attestations stay useful right up to the proposer's real publish cutoff. Block-proposal re-execution deadlines are intentionally left at the target-slot start.
### Why no test caught the frame bug
The job timing test built the job with `slotNow === targetSlot` (so the two frames coincided) and stubbed `setStateFn` with a no-op, mocking away the very `assertTimeLeft` enforcement where the frame matters. This PR adds:
- A contract test asserting every build-frame state is set against the build slot (`slotNow`), not the target slot.
- A behavioral test with a real enforcing `setStateFn`: a checkpoint whose assembly crosses the build-frame deadline is now correctly abandoned. Both fail on the pre-fix code and pass after the fix.
- Updated stdlib/timetable, clock-tolerance, attestation-validator, proposal-handler, and validator tests for the realigned windows (including an `l1PublishingTime != ethereumSlotDuration` case proving the deadline is now Ethereum-slot-based).
No constants were removed and no broader cleanup was done; that is deferred.
## Test plan
- `yarn build` green; touched packages lint/format clean.
- `@aztec/stdlib` timetable, `@aztec/sequencer-client` (incl. `timetable`, `checkpoint_proposal_job.timing`, `sequencer-publisher`), `@aztec/validator-client` (incl. `proposal_handler`, `validator`), and `@aztec/p2p` `msg_validators` suites pass.
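
The frame bug can be reproduced with a few lines of arithmetic. This sketch assumes a genesis at time 0, a 72-second Aztec slot as stated above, and a hypothetical 40-second budget for some build-frame state; `slotStartSeconds` and `hasTimeLeft` are illustrative stand-ins, not the real `getSlotStartBuildTimestamp` or `assertTimeLeft`.

```typescript
const AZTEC_SLOT_SECONDS = 72;

/** Start time of a slot, assuming genesis at t = 0. */
function slotStartSeconds(slot: number): number {
  return slot * AZTEC_SLOT_SECONDS;
}

/** True while `now` is within `maxSecondsIntoSlot` of the start of `frameSlot`. */
function hasTimeLeft(nowSeconds: number, frameSlot: number, maxSecondsIntoSlot: number): boolean {
  return nowSeconds - slotStartSeconds(frameSlot) <= maxSecondsIntoSlot;
}

const slotNow = 9; // build slot (N-1)
const targetSlot = 10; // slot being built for (N)
const budgetSeconds = 40; // hypothetical deadline for a build-frame state
const now = slotStartSeconds(slotNow) + 70; // 70 s into the build slot, well past the budget

// Pre-fix: measured against the target slot, the elapsed time is negative, so the check always passes.
const buggyCheck = hasTimeLeft(now, targetSlot, budgetSeconds);
// Post-fix: measured against the build slot, the deadline fires as intended.
const fixedCheck = hasTimeLeft(now, slotNow, budgetSeconds);
console.log({ buggyCheck, fixedCheck });
```

The two checks only agree when `slotNow === targetSlot`, which is exactly the configuration the old timing test used.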
fcarreiro approved these changes on Jun 4, 2026
spalladino pushed a commit that referenced this pull request on Jun 4, 2026
Backports the `merge-train/spartan-v5` wiring from #23831 (merged to `next`) onto `v5-next`.
## Why this is needed
The base→train sync (`merge-train-next-to-branches.yml`) is push-triggered, and for a `push` event GitHub Actions runs the workflow file **as it exists on the pushed branch**. `v5-next` still has the old workflow (triggers only on `next`, no `spartan-v5` routing), so commits landing on `v5-next` never fire the sync into `merge-train/spartan-v5`. Putting the wiring on `v5-next` makes that sync fire for every subsequent push to `v5-next`.
This is a cherry-pick of #23831's squashed commit onto `v5-next`: same 12-file changeset, no v5-specific divergence (the merge-train infra files merged cleanly; v5-next's newer `actions/checkout` pin is preserved).
## After merge
Future pushes to `v5-next` will sync into `merge-train/spartan-v5`. The commit that already landed on `v5-next` (`chore: backport merge-train/spartan PRs to v5-next` #23846) won't retroactively trigger; it needs either a fresh push to `v5-next` or a one-time manual `scripts/merge-train/merge-next.sh merge-train/spartan-v5 v5-next` to catch the train up.
Labeled `ci-skip`: workflow/script/docs config only.
---
*Created by [claudebox](https://claudebox.work/v2/sessions/a24733a6b8930662) · group: `slackbot`*
## Motivation
`v5-next` was cut from `next` at cbc99df (Jun 1), so PRs merged to `merge-train/spartan` after the cut never flowed into it. This backports all of them (authored by @spalladino and @fcarreiro) to keep `v5-next` current with the spartan train.
## Approach
Each PR is cherry-picked from its squashed merge commit on `merge-train/spartan`, in merge order, preserving the original commit message and PR number: one commit per backported PR. All 11 applied cleanly with no conflicts; patches are identical to the originals (verified via `git patch-id`), and `bootstrap.sh build yarn-project` passes on the result. Labeled `ci-no-squash` to preserve the per-PR commits.
## Backported PRs
Note: #23660, #23778, and #23786 are breaking changes (node RPC + tx-effect db format, and p2p wire format respectively), as they were on `next`.