fix(tests): bump chunk-op timeout to 90s for macOS CI runners by grumbach · Pull Request #75 · WithAutonomi/ant-node

grumbach · 2026-04-21T06:18:25Z

Context

data_types::chunk::tests::test_chunk_store_on_remote_node has been flaking on Test (macos-latest) with:

thread 'data_types::chunk::tests::test_chunk_store_on_remote_node' panicked at tests/e2e/data_types/chunk.rs:260:14:
Failed to store max-size chunk on remote node: Storage("Timeout waiting for remote store response after 30s")

The test transfers a 4 MiB chunk via QUIC on loopback inside a 5-node testnet. The 30 s budget covers QUIC+PQC handshake, payload transfer, and storage confirmation — enough on Linux CI, not enough on macOS runners (nested-virt, roughly half the CPU throughput of the Linux pool). Under the concurrent handshake burst the PQC (ML-KEM-768 + ML-DSA-65) exchange + a 4 MiB transfer + disk write can spill past 30 s on a bad day.

This is the same root cause that ant-client#50 fixed for the client test suite — CPU-constrained macOS runners + a timing budget sized for Linux.

Fix

Bump DEFAULT_CHUNK_OPERATION_TIMEOUT_SECS from 30 s to 90 s. Test-only: no production code path reads this constant. The constant carries a comment explaining why it's larger than the happy-path needs, so future readers don't shrink it back.
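For readers unfamiliar with this pattern, here is a minimal sketch of how a single timeout constant can bound an operation. Only the constant name and the 30 s to 90 s change come from this PR. `run_with_timeout`, its signature, and the thread-plus-channel mechanism are hypothetical stand-ins, since the real harness code is not shown here and may well use an async runtime instead.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Default budget for chunk operations in the E2E harness.
/// Name and value (90, previously 30) are taken from the PR; the
/// surrounding code is an illustrative assumption.
const DEFAULT_CHUNK_OPERATION_TIMEOUT_SECS: u64 = 90;

/// Hypothetical helper: runs `op` on a worker thread and waits at most
/// `timeout` for its result. The worker is not cancelled on timeout;
/// it is simply left to finish in the background.
fn run_with_timeout<T, F>(timeout: Duration, op: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller timed out,
        // so a failed send is ignored.
        let _ = tx.send(op());
    });
    match rx.recv_timeout(timeout) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            Err(format!("timed out after {:?}", timeout))
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err("operation ended without producing a result".to_string())
        }
    }
}
```

Under these assumptions, a fast operation returns as soon as it completes, so raising the constant does not slow the happy path; the larger budget only matters when the operation would otherwise have been cut off.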

Test plan

cargo fmt --all --check: clean
cargo clippy --all-targets --all-features -- -D warnings -D clippy::unwrap_used -D clippy::expect_used: clean
cargo test --features test-utils --test e2e data_types::chunk::tests::test_chunk_store_on_remote_node: passes locally in 1.86 s (happy path unchanged, new budget only matters on the slow runner)
Full CI will run on this PR; expect macOS Test matrix to go green.

`data_types::chunk::tests::test_chunk_store_on_remote_node` has been flaking on `Test (macos-latest)` with:

Storage("Timeout waiting for remote store response after 30s")

The test transfers a 4 MiB chunk over QUIC on loopback inside a 5-node testnet, with the 30 s budget covering QUIC+PQC handshake, payload transfer, and storage confirmation. Linux runners fit comfortably; macOS runners (nested-virt, roughly half the CPU throughput of the Linux pool) saturate under the concurrent handshake burst and blow through 30 s on bad days. Mirrors the ant-client#50 root cause.

90 s is conservative - happy-path loopback transfers complete in under a second, so the larger budget only shows up on flakes. Test-only; no production code path reads DEFAULT_CHUNK_OPERATION_TIMEOUT_SECS.

Verified locally: test completes in 1.86 s with the new constant.

Copilot

Pull request overview

This PR addresses a flaky E2E test on macOS CI runners by increasing the default timeout budget used for chunk store/get operations in the E2E testnet harness.

Changes:

Increased DEFAULT_CHUNK_OPERATION_TIMEOUT_SECS from 30s to 90s.
Added an explanatory comment documenting why the larger timeout is needed on macOS CI.

Copilot · 2026-04-21T06:20:13Z

+/// Covers the full round-trip: QUIC handshake, up to a 4 MiB payload
+/// transfer, and storage confirmation. 30 s was enough on Linux CI but
+/// flaked on `macos-latest` runners (nested-virt, roughly half the CPU
+/// throughput of the Linux pool) when the 5-node testnet's concurrent
+/// QUIC+PQC handshake burst collided with the 4 MiB
+/// `test_chunk_store_on_remote_node` fixture. 90 s is deliberately
+/// conservative; the happy path completes in well under a second on
+/// loopback, so the larger budget only shows up on flakes. Test-only —


The doc comment mentions covering a full QUIC handshake + payload transfer, but this timeout is also used for purely local operations (store_chunk/get_chunk call protocol.try_handle_request without any network/handshake). Consider rewording the comment to reflect that it’s a shared default for both local protocol handling and remote P2P chunk ops, with the 90s budget primarily justified by the remote macOS CI case.

Suggested change

-/// Covers the full round-trip: QUIC handshake, up to a 4 MiB payload
-/// transfer, and storage confirmation. 30 s was enough on Linux CI but
-/// flaked on `macos-latest` runners (nested-virt, roughly half the CPU
-/// throughput of the Linux pool) when the 5-node testnet's concurrent
-/// QUIC+PQC handshake burst collided with the 4 MiB
-/// `test_chunk_store_on_remote_node` fixture. 90 s is deliberately
-/// conservative; the happy path completes in well under a second on
-/// loopback, so the larger budget only shows up on flakes. Test-only —
+/// Shared default for both local protocol handling and remote P2P chunk
+/// operations in the E2E harness. Some call sites only execute
+/// `protocol.try_handle_request` locally, so they do not involve any network
+/// transfer or QUIC handshake; others cover the full remote round-trip,
+/// including handshake, up to a 4 MiB payload transfer, and storage
+/// confirmation. 30 s was enough on Linux CI but flaked on `macos-latest`
+/// runners (nested-virt, roughly half the CPU throughput of the Linux pool)
+/// when the 5-node testnet's concurrent QUIC+PQC handshake burst collided
+/// with the 4 MiB `test_chunk_store_on_remote_node` fixture. 90 s is
+/// deliberately conservative; the happy path completes in well under a second
+/// on loopback, so the larger budget only shows up on flakes. Test-only —

Copilot AI review requested due to automatic review settings April 21, 2026 06:18

Copilot started reviewing on behalf of grumbach April 21, 2026 06:18

Copilot AI reviewed Apr 21, 2026

mickvandijke approved these changes Apr 21, 2026

mickvandijke merged commit 5a5d7d4 into main Apr 21, 2026
15 checks passed

mickvandijke deleted the fix/e2e-macos-chunk-timeout branch April 21, 2026 07:43
