test(e2e_ha_full): parallel HA peer node teardown with per-node deadline by AztecBot · Pull Request #23539 · AztecProtocol/aztec-packages

AztecBot · 2026-05-24T12:09:25Z

Why

e2e_ha_full.test.ts dequeued PR #23344 from the merge train (log): all 8 tests passed, but the afterAll cleanup hook exceeded its 20-minute jest timeout. The hook stops 5 HA peer nodes serially, and HA-2's sequencer.stop() blocked for ~23 minutes waiting on an in-flight L1 publish. That publish's internal tx-timeout was computed on a test-warped dateProvider clock and never fired.

The deeper bug (publish doesn't honor stop()) is being fixed separately. This PR is the minimum change to keep one stuck node from killing the whole hook and the merge train.

What

Replace the serial for loop with Promise.allSettled(... Promise.race([stop, 30s timeout])), so:

All five node.stop() calls run concurrently.
A node that fails to stop within 30s is logged and abandoned rather than blocking siblings.
The hook always completes within ~30s instead of consuming the full 20-minute budget.

The 30s deadline is comfortably above the ~5ms each healthy node took in the failing log, so this is purely a safety net. If it ever fires, the explicit error in the log should point at what to investigate next.
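The teardown pattern described above can be sketched roughly as follows. This is an illustration, not the PR's actual diff: the `Stoppable` interface, the `stopWithDeadline` and `stopAll` helpers, and the `HA-<n>` naming are hypothetical, and the real hook may log and clean up differently.

```typescript
// Hypothetical minimal shape of a peer node; only stop() matters here.
interface Stoppable {
  stop(): Promise<void>;
}

// Race a node's stop() against a deadline. Rejects if the deadline wins.
// The timer is cleared afterwards so it cannot keep the process alive.
function stopWithDeadline(name: string, node: Stoppable, ms: number): Promise<void> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${name} did not stop within ${ms}ms`)), ms);
  });
  return Promise.race([node.stop(), deadline]).finally(() => {
    if (timer !== undefined) {
      clearTimeout(timer);
    }
  });
}

// Stop all nodes concurrently. A node that misses its deadline is logged
// and abandoned instead of blocking its siblings. Returns the number of
// abandoned nodes.
async function stopAll(
  nodes: Stoppable[],
  ms: number,
  log: (msg: string) => void = console.error,
): Promise<number> {
  const results = await Promise.allSettled(
    nodes.map((node, i) => stopWithDeadline(`HA-${i + 1}`, node, ms)),
  );
  let abandoned = 0;
  for (const result of results) {
    if (result.status === 'rejected') {
      abandoned++;
      log(`Abandoning node during teardown: ${String(result.reason)}`);
    }
  }
  return abandoned;
}
```

Because `Promise.allSettled` never rejects, one stuck or failing node cannot short-circuit the others, and the total wait is bounded by the per-node deadline rather than by the sum of all stop times. Note that the abandoned `stop()` promise is not cancelled; it is simply no longer awaited.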

Scope

Test-only change. No production code touched.

Created by claudebox · group: slackbot

…dline The afterAll hook stopped HA peer nodes serially with no per-node timeout, so a single sequencer.stop() that hangs (e.g. an L1 publish whose tx-timeout was computed on a test-warped clock) burns the entire 20-minute jest hook budget and dequeues the merge train.

@PaLLa

Dequeued from merge-train/spartan again: <http://ci.aztec-labs.com/136431da99834194>. The HA full suite keeps failing under proposer pipelining with shifting symptoms. In this run the dashboard log shows recurring `validator:proposal-handler Timed out waiting for block with archive matching checkpoint proposal` warnings (slot 98, 115, …) and an `Error building checkpoint at slot 127: already proposed block for slot 127 index 0` on HA-4, i.e. the 5 HA peers race on the same proposal.

The bundled #23539 (parallel peer teardown) and #23524 (afterAll hook timeout) entries did not catch this run because jest's per-test summary was not reached within the dashboard log capture.

This PR adds a broad regex-only entry under `.test_patterns.yml` to flag any failure of `yarn-project/end-to-end/scripts/run_test.sh ha src/composed/ha/e2e_ha_full.test.ts` as a flake. Owner: @PaLLa, matching the existing pipelining-flavoured entries for this suite. The intent is to unblock the merge queue while the HA pipelining stabilisation work continues; narrow the regex (or add a real fix) once the failure modes settle down.

*Created by [claudebox](https://claudebox.work/v2/sessions/d394ef6145e749ff) · group: `slackbot`*

AztecBot added the ci-no-fail-fast (sets NO_FAIL_FAST in the CI so the run is not aborted on the first failure) and claudebox (owned by claudebox; it can push to this PR) labels May 24, 2026

PhilWindle approved these changes May 24, 2026

PhilWindle marked this pull request as ready for review May 24, 2026 12:11

PhilWindle enabled auto-merge (squash) May 24, 2026 12:11

PhilWindle merged commit d38da91 into merge-train/spartan May 24, 2026
40 of 45 checks passed

PhilWindle deleted the cb/133ce6d845a4 branch May 24, 2026 12:37

This was referenced May 24, 2026

feat: merge-train/spartan #23344

Merged

fix(l1-tx-utils, e2e_ha_full): unblock node.stop() on interrupt + tryStop timeout #23540

Draft

test: flag e2e_ha_full as flake under HA pipelining #23541

Merged
