
feat(mcp,agent): add retry jitter and cancellation smoke test #4609

Merged: bug-ops merged 2 commits into main from feat/4574-4587-resilience on May 29, 2026

@bug-ops commented on May 29, 2026

Summary

Changes

  • crates/zeph-mcp/Cargo.toml: add rand.workspace = true
  • crates/zeph-mcp/src/manager.rs: apply jitter in connect_retry_backoff(), update doc comment, update two tests to range assertions
  • src/agent_setup.rs: add cancellation smoke test with NOTE comment explaining coverage limitation
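
The jitter change can be illustrated with a minimal sketch. It is not the code from manager.rs: the function name `jittered_backoff` and the `unit` parameter are hypothetical, and the random sample is passed in by the caller (in the PR it would come from the rand crate) so the example stays dependency-free and deterministic. It assumes the range from the commit message, [nominal*3/4, nominal].

```rust
use std::time::Duration;

/// Hypothetical helper (not the real connect_retry_backoff): scales `nominal`
/// into [nominal * 3/4, nominal].
///
/// `unit` is a random sample expected in [0.0, 1.0]; out-of-range values are
/// clamped, and NaN falls back to 1.0 (no jitter) so the call cannot panic.
pub fn jittered_backoff(nominal: Duration, unit: f64) -> Duration {
    let unit = if unit.is_nan() { 1.0 } else { unit.clamp(0.0, 1.0) };
    // 0.75 at unit = 0.0 (maximum downward jitter), 1.0 at unit = 1.0 (none).
    let factor = 0.75 + 0.25 * unit;
    nominal.mul_f64(factor)
}

/// Range check in the spirit of the updated tests: a jittered delay is
/// compared against bounds instead of one exact value.
pub fn within_jitter_range(nominal: Duration, actual: Duration) -> bool {
    actual >= nominal.mul_f64(0.75) && actual <= nominal
}
```

Taking the sample as an argument is only a convenience for this sketch; it also shows why exact-value assertions stop working once the delay is randomized and range assertions are needed instead.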

Test coverage limitation (#4587)

apply_response_cache is a sync function that detaches an inner cleanup task via tokio::spawn. Without exposing the inner JoinHandle, a test cannot directly await the inner task's exit on cancellation. The added test verifies the function does not panic and returns promptly; deeper cancel-branch coverage requires a future API change to expose the handle.
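
To make the limitation concrete, here is a rough analogue using std threads rather than tokio. It is not the zeph code: `spawn_cleanup`, `cancel_and_join`, and the `passes` counter are hypothetical, and an AtomicBool stands in for the CancellationToken. The point is only that once the spawning function returns its handle, a test can cancel and then wait for the loop to exit.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the cleanup loop: counts one "cleanup pass" per
/// iteration until `cancel` is set. Unlike a detached spawn, the handle is
/// returned to the caller.
pub fn spawn_cleanup(cancel: Arc<AtomicBool>, passes: Arc<AtomicU32>) -> JoinHandle<()> {
    thread::spawn(move || {
        while !cancel.load(Ordering::Acquire) {
            passes.fetch_add(1, Ordering::Relaxed);
            thread::sleep(Duration::from_millis(1));
        }
    })
}

/// Signals cancellation and blocks until the loop has exited.
/// Returns false only if the cleanup thread panicked.
pub fn cancel_and_join(cancel: &AtomicBool, handle: JoinHandle<()>) -> bool {
    cancel.store(true, Ordering::Release);
    handle.join().is_ok()
}
```

With a detached task there is nothing to join, so a test can only observe that the call returned without panicking, which is what the added smoke test does.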

Test plan

  • cargo nextest run -p zeph-mcp -E 'test(connect_retry_backoff)' — 2/2 PASS
  • cargo nextest run -p zeph --features testing -E 'test(apply_response_cache_cleanup_spawns_without_panic)' — 1/1 PASS
  • Full workspace: 10214/10214 PASS
  • cargo +nightly fmt --check — clean
  • cargo clippy --workspace -- -D warnings — clean
  • RUSTFLAGS="-D warnings" cargo check --workspace --all-targets --features desktop,ide,server,chat,pdf,scheduler — clean

Closes #4574
Closes #4587

@github-actions (bot) added labels on May 29, 2026: enhancement (New feature or request), rust (Rust code changes), dependencies (Dependency updates), size/M (Medium PR, 51-200 lines)
@bug-ops enabled auto-merge (squash) on May 29, 2026 00:34
@bug-ops force-pushed the feat/4574-4587-resilience branch from cc84ffa to 0a80e52 on May 29, 2026 00:34
Add up to 25% downward jitter to connect_retry_backoff() in zeph-mcp so
concurrent MCP server restarts do not fire reconnect attempts in
lock-step (thundering herd). Jitter range is [nominal*3/4, nominal], a
bounded variant of the jitter schemes in the AWS backoff guidance (full
jitter would draw from [0, nominal]). Requires the rand workspace
dependency.

Add apply_response_cache_cleanup_spawns_without_panic test in the
binary crate to verify the function does not panic when a
CancellationToken is cancelled during setup. A NOTE comment documents
that the inner detached cleanup loop cannot be directly awaited without
exposing its JoinHandle; deeper coverage is deferred.

Closes #4574
Closes #4587
@bug-ops force-pushed the feat/4574-4587-resilience branch from 0a80e52 to 53f92dc on May 29, 2026 00:35
@bug-ops merged commit 647f50a into main on May 29, 2026 (32 checks passed)
@bug-ops deleted the feat/4574-4587-resilience branch on May 29, 2026 00:53

Issues linked to this pull request:

  • test(agent): add cancellation coverage for apply_response_cache cleanup loop
  • feat(mcp): add jitter to startup retry backoff to avoid thundering herd
