fix(sandbox): tolerate unsealed inbox in simulation, drop pipelining from docs/playground by spalladino · Pull Request #23315 · AztecProtocol/aztec-packages

spalladino · 2026-05-15T14:10:19Z

Motivation

Two failure modes surfaced on the spartan merge train after PR #23277 enabled SEQ_ENABLE_PROPOSER_PIPELINING=true in the sandbox-based test composes. Both showed up in docs/examples/bootstrap.sh execute (e.g. http://ci.aztec-labs.com/7f325afea4f00b31): (1) aztecjs_advanced deterministically failed in AztecNodeService.simulatePublicCalls with L1ToL2MessagesNotReadyError, the simulator+inboxLag mismatch that's TODO'd in e2e_bot.test.ts:39, e2e_fees/*.test.ts, and e2e_avm_simulator.test.ts; (2) example_swap was SIGTERM'd at the docs-compose 600s mark while polling getBlockNumber('proven'), because the local sandbox's proven tip only advances via the slow-path wall-clock warp once the chain goes idle.

Approach

Two commits. The first is the real bug-fix: AztecNodeService.simulatePublicCalls catches L1ToL2MessagesNotReadyError thrown when querying the not-yet-sealed next-checkpoint's L1→L2 messages, and simulates without those messages. Simulation becomes best-effort across checkpoint boundaries under pipelining; block production continues to use sealed messages as before. The second commit narrows the blast radius for the demo sandboxes: removes SEQ_ENABLE_PROPOSER_PIPELINING=true from docs/examples/ts/docker-compose.yml and playground/docker-compose.yml, drops example_swap from the default docs runner (matching the existing aave_bridge precedent), and bumps docs/examples/bootstrap.sh test_cmds TIMEOUT to 20m to match the bumps from #23275.

Pipelining coverage is retained where it actually exercises sequencer/watcher behaviour: yarn-project/end-to-end/scripts/docker-compose.yml (compose-routed e2e + cli-wallet flows) and aztec-up/test/{amm_flow,basic_install,bridge_and_claim}.sh. The proven-tip stall and re-enabling of example_swap are deferred to a follow-up that gives the sandbox a way to advance the proven tip without a continuous tx stream.

Changes

yarn-project/aztec-node (AztecNodeService.simulatePublicCalls): narrow try/catch on L1ToL2MessagesNotReadyError (matched by err.name); rethrow anything else.
docs/examples (compose, runner, test_cmds): drop pipelining env, drop example_swap from defaults, bump compose TIMEOUT to 20m.
playground (compose): drop pipelining env.

Codex reviewed both rounds of the design; the unsuccessful buildCheckpointIfEmpty + watcher-gate variant was abandoned after a confirmed cascade race / deadlock and reverted before commit.

`AztecNodeService.simulatePublicCalls` opens a fork of world state at the latest proposed block and, when the next block would start a new checkpoint, appends that checkpoint's L1->L2 messages to the fork's message tree so the simulated tx sees them. Under proposer pipelining with non-trivial `inboxLag`, the next-checkpoint's messages are not yet sealed on L1 — the archiver's message store throws `L1ToL2MessagesNotReadyError` when queried for an in-progress checkpoint (see `message_store.ts:233`). This makes every public-call simulation at a checkpoint boundary deterministically fail, which is the issue tracked by the existing `TODO(palla/pipelining): re-opt-in once public-call simulation handles inboxLag` comments in `e2e_bot.test.ts`, `e2e_fees/*.test.ts`, and `e2e_avm_simulator.test.ts`, and which surfaced as the `aztecjs_advanced` failures on PR #23253's merge-queue run. Catch the error by name (`L1ToL2MessagesNotReadyError`) and proceed with no next-checkpoint messages. Simulation becomes best-effort across checkpoint boundaries under pipelining: a tx that depends on a not-yet-sealed message may simulate incorrectly, but block production will use the real (sealed) messages when they are available. All other errors continue to throw.
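The catch-by-name behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in `AztecNodeService`: the error class here is a local stand-in, and `MessageSource`, `isMessagesNotReadyError`, and `getNextCheckpointMessages` are hypothetical names introduced only for this example.

```typescript
// Local stand-in for the archiver's error (assumption); only its `name` matters here.
class L1ToL2MessagesNotReadyError extends Error {
  constructor(checkpoint: number) {
    super(`L1 to L2 messages for checkpoint ${checkpoint} are not ready`);
    this.name = 'L1ToL2MessagesNotReadyError';
  }
}

// True only for the "inbox not sealed yet" error, matched by name rather than instanceof.
function isMessagesNotReadyError(err: unknown): boolean {
  return err instanceof Error && err.name === 'L1ToL2MessagesNotReadyError';
}

// Hypothetical message source: rejects while the requested checkpoint is still in progress.
type MessageSource = (checkpoint: number) => Promise<string[]>;

async function getNextCheckpointMessages(source: MessageSource, checkpoint: number): Promise<string[]> {
  try {
    return await source(checkpoint);
  } catch (err) {
    if (isMessagesNotReadyError(err)) {
      // Best-effort: simulate without the not-yet-sealed messages.
      return [];
    }
    // Every other error is still a real failure and propagates to the caller.
    throw err;
  }
}
```

Matching on `err.name` presumably keeps the check independent of which copy of the error class threw; that motivation is an assumption, as the PR only states that the match is by name.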

Two unrelated failure modes surfaced when PR #23277 enabled `SEQ_ENABLE_PROPOSER_PIPELINING=true` on the docs-examples and playground compose sandboxes:

1. `example_swap` polls `getBlockNumber('proven')` after the swap's final tx lands and the sandbox goes idle. Under pipelining the proven tip only catches up via the watcher's slow-path wall-clock warp (~72s/slot), which can SIGTERM the example under merge-queue load. See http://ci.aztec-labs.com/b08ac48286302949 (block 86).
2. `aztecjs_advanced` deterministically failed in `AztecNodeService.simulatePublicCalls` with `L1ToL2MessagesNotReadyError`; that's the simulator+inboxLag mismatch fixed in the preceding commit.

The simulator commit lands the actual bug-fix. This commit ships the narrower workarounds for the docs/playground demo sandboxes:

- Remove `SEQ_ENABLE_PROPOSER_PIPELINING=true` from `docs/examples/ts/docker-compose.yml` and `playground/docker-compose.yml`. These are developer-facing demos, not pipelining test coverage; the real coverage lives in `yarn-project/end-to-end/scripts/docker-compose.yml` and the `aztec-up/test/*.sh` shell scripts, both untouched.
- Drop `example_swap` from the default docs runner list, matching the existing `aave_bridge` precedent, since the proven-tip stall is a sandbox-side limitation that needs a separate sequencer-team fix.
- Bump `docs/examples/bootstrap.sh` `test_cmds` TIMEOUT to 20m to match the compose/web3signer/ha bumps in #23275. This is defense-in-depth against cumulative runtime growth, no longer the primary fix.

Re-enable in a follow-up once the sandbox advances the proven tip without a continuous tx stream.
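To make the first failure mode concrete, here is a rough sketch of a polling loop with its own deadline. It is not the code in `example_swap`: `waitForProven` and its parameters are hypothetical, and `getProven` stands in for a call such as `getBlockNumber('proven')`. If the tip only advances slowly once the chain is idle, a loop like this fails with a descriptive error instead of running until the outer runner kills it.

```typescript
// Hypothetical helper (assumption): polls a proven-tip getter until it reaches `target`
// or until `timeoutMs` has elapsed, whichever comes first.
async function waitForProven(
  getProven: () => Promise<number>,
  target: number,
  timeoutMs: number,
  intervalMs = 1000,
): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const proven = await getProven();
    if (proven >= target) {
      return proven;
    }
    if (Date.now() >= deadline) {
      // Surface the stall explicitly rather than waiting for an external SIGTERM.
      throw new Error(`Proven tip stuck at ${proven}, wanted ${target} after ${timeoutMs}ms`);
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}
```

A caller would pass a timeout comfortably below the runner's own limit so that the failure is reported by the example itself.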

spalladino added 2 commits May 15, 2026 11:09

spalladino requested a review from a team as a code owner May 15, 2026 14:10

spalladino added the ci-no-fail-fast label (sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) May 15, 2026

spalladino closed this May 15, 2026

spalladino wants to merge 2 commits into merge-train/spartan from spl/sandbox-build-empty
