fix(p2p): reduce flakiness in proposal tx collector benchmark by AztecBot · Pull Request #22240 · AztecProtocol/aztec-packages

AztecBot · 2026-04-01T17:27:10Z

Summary

Fixes flakiness in p2p_client.proposal_tx_collector.bench.test.ts caused by three compounding issues:

chunkTxHashesRequest defaulted to chunkSize=1, creating 500 individual libp2p streams for the 500-tx send-batch-request case. The rapid stream churn overwhelms the connection, causing EPIPE cascades that kill the muxer. Bumped to chunkSize=8 as the existing TODO indicated.
Peer scores persisted between benchmark cases, so hundreds of HighToleranceError penalties from EPIPE failures in one case degraded peer selection in subsequent cases. Added PeerScoring.resetAllScores() and called it in the worker before each benchmark run.
No connectivity check between cases, so degraded connections from a previous case could silently affect the next. Added waitForConnectivity() to verify the aggregator has 80% of expected peers before each case starts.
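To make the first issue concrete, here is a minimal sketch of fixed-size chunking and how the chunk size changes the number of requests for a 500-hash batch. The `chunkItems` helper and the `tx-N` strings are hypothetical, for illustration only; the real `chunkTxHashesRequest` in protocols/tx.ts may have a different signature and operate on different types.

```typescript
// Hypothetical helper: splits a list into chunks of at most `chunkSize` items.
// It only illustrates the arithmetic; it is not the actual chunkTxHashesRequest.
function chunkItems<T>(items: readonly T[], chunkSize = 8): T[][] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error(`chunkSize must be a positive integer, got ${chunkSize}`);
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// Placeholder hashes standing in for the 500-tx send-batch-request case.
const hashes = Array.from({ length: 500 }, (_, i) => `tx-${i}`);

console.log(chunkItems(hashes, 1).length); // 500 chunks, one per hash
console.log(chunkItems(hashes).length); // 63 chunks with the default of 8
```

With one request per chunk, moving from a chunk size of 1 to 8 cuts the number of requests for this case from 500 to 63.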

Full analysis with CI log evidence: https://gist.github.com/AztecBot/e5af3238fbfefc29c51de2ee5deaa8ea

Changes

protocols/tx.ts: Change chunkTxHashesRequest default chunkSize from 1 to 8
peer_scoring.ts: Add resetAllScores() method
p2p_client_testbench_worker.ts: Reset peer scores before each bench case, add GET_PEER_COUNT IPC command
worker_client_manager.ts: Add waitForConnectivity() and getPeerCount() methods
p2p_client.proposal_tx_collector.bench.test.ts: Check connectivity in beforeEach
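A connectivity gate of the kind described above could look roughly like the following. This is a sketch under stated assumptions, not the code from worker_client_manager.ts: the parameter names, the timeout and poll interval defaults, and the use of `Math.ceil` for the 80% threshold are all hypothetical.

```typescript
// Hypothetical sketch of a connectivity gate; the real waitForConnectivity()
// may differ in signature, defaults, and error handling.
async function waitForConnectivity(
  getPeerCount: () => Promise<number>,
  expectedPeers: number,
  { threshold = 0.8, timeoutMs = 30_000, pollIntervalMs = 500 } = {},
): Promise<number> {
  // Assumption: round up, so 80% of 10 expected peers means at least 8.
  const required = Math.ceil(expectedPeers * threshold);
  const deadline = Date.now() + timeoutMs;

  for (;;) {
    const count = await getPeerCount();
    if (count >= required) {
      return count;
    }
    if (Date.now() >= deadline) {
      throw new Error(`Only ${count}/${expectedPeers} peers connected; need ${required}`);
    }
    // Wait before polling again so the check does not add load of its own.
    await new Promise(resolve => setTimeout(resolve, pollIntervalMs));
  }
}
```

Calling something like this from `beforeEach` would make a degraded network fail fast with a clear message instead of silently skewing the next benchmark case.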

ClaudeBox log: https://claudebox.work/s/38590d3cfe6a7000?run=2

- Bump chunkTxHashesRequest default chunkSize from 1 to 8 (was a known TODO)
- Add PeerScoring.resetAllScores() and call it between benchmark cases
- Add connectivity check before each benchmark case to detect degraded state
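The score reset can be illustrated with a small stand-in. `PeerScoreBook`, its in-memory map, the peer ID, and the penalty values below are hypothetical; the real `PeerScoring` class in peer_scoring.ts likely tracks more state, and only the `resetAllScores()` name comes from this PR.

```typescript
// Hypothetical stand-in for a peer scoring store, for illustration only.
class PeerScoreBook {
  private scores = new Map<string, number>();

  // Lower the peer's score by `penalty` (unknown peers start at 0).
  penalize(peerId: string, penalty: number): void {
    this.scores.set(peerId, this.getScore(peerId) - penalty);
  }

  getScore(peerId: string): number {
    return this.scores.get(peerId) ?? 0;
  }

  // Forget every accumulated score so the next case starts from a clean slate.
  resetAllScores(): void {
    this.scores.clear();
  }
}

const book = new PeerScoreBook();
book.penalize('peer-a', 2);
book.penalize('peer-a', 2);
console.log(book.getScore('peer-a')); // -4

book.resetAllScores();
console.log(book.getScore('peer-a')); // 0
```

Resetting before each case keeps penalties earned in one benchmark case from influencing peer selection in the next.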

AztecBot · 2026-04-07T06:36:20Z

Automatically closing this stale claudebox draft PR (no updates for 5+ days). Re-open if still needed.

BEGIN_COMMIT_OVERRIDE
chore: fix mempool limit test (#22332)
fix(bot): bot fee juice funding (#21949)
fix(foundation): flush current batch on BatchQueue.stop() (#22341)
chore: (A-750) read JSON body then parse to avoid double stream consumption on error message (#22247)
chore: bump log level in stg-public (#22354)
chore: fix main.tf syntax (#22356)
chore: wire up spartan checks to make (#22358)
fix(p2p): reduce flakiness in proposal tx collector benchmark (#22240)
fix: disable sponsored fpc and test accounts for devnet (#22331)
chore: add v4-devnet-3 to tf network ingress (#22327)
chore: remove unused env var (#22365)
chore: add pdb (#22364)
chore: dispatch CB on failed deployments (#22367)
chore: (A-749) single character url join (#22269)
feat: support different docker image for HA validator nodes (#22371)
chore: fix the daily healthchecks (#22373)
chore: remove v4-devnet-2 references (#22372)
fix: rename #team-alpha → #e-team-alpha slack channel (#22374)
chore(pipeline): timetable adjustments under pipelining (#21076)
feat(pipeline): handle pipeline prunes (#21250)
fix: handle error types serialization errors (#22379)
feat(spartan): configurable HA validator replica count (#22384)
fix(e2e): increase prune timeout in epochs_mbps_pipeline test (#22392)
fix(epoch-cache): use TTL-based caching with finalization tracking and correct lag (#22204)
chore: deflake e2e ha sync test (#22403)
chore(ci): skip prunes-uncheckpointed test in epochs_mbps_pipeline (#22401)
refactor(slasher): remove empire slasher model (#21830)
fix: use strict equality in world-state ops queue (#22398)
fix: remove unused BLOCK reqresp sub-protocol (#22407)
refactor(sequencer): sign last block before archiver sync (#22117)
feat(world-state): add genesis timestamp support and GenesisData type (#22359)
fix: use Int64Value instead of Uint32Value for 64-bit map sizes (#22400)
chore: Reduce logging verbosity (#22423)
fix(p2p): include values in tx validation error messages (#22422)
END_COMMIT_OVERRIDE

fix(p2p): reduce flakiness in proposal tx collector benchmark

1728140


AztecBot added the ci-draft (Run CI on draft PRs) and claudebox (Owned by claudebox; it can push to this PR) labels Apr 1, 2026

AztecBot closed this Apr 7, 2026

PhilWindle reopened this Apr 7, 2026

PhilWindle marked this pull request as ready for review April 7, 2026 12:14

Merge branch 'merge-train/spartan' into claudebox/38590d3cfe6a7000-2

40055ce

PhilWindle approved these changes Apr 7, 2026


PhilWindle enabled auto-merge (squash) April 7, 2026 12:16

PhilWindle merged commit d26761f into merge-train/spartan Apr 7, 2026
12 checks passed

PhilWindle deleted the claudebox/38590d3cfe6a7000-2 branch April 7, 2026 12:34

AztecBot mentioned this pull request Apr 7, 2026

feat: merge-train/spartan #22352

Merged


PhilWindle merged 2 commits into merge-train/spartan from claudebox/38590d3cfe6a7000-2

2 participants
