ci(mutants): expand mutation testing scope to dispatch + classify, enable on PRs #320

Merged
Destynova2 merged 3 commits into main from ci/mutation-scope-expansion
Apr 28, 2026

@Destynova2
Contributor

Summary

  • Adds mutation testing for src/server/dispatch/ (4 shards: mod.rs split 0/2 + 1/2, retry.rs, provider_loop.rs) and src/routing/classify/ (2 shards: mod.rs, classify.rs) to the existing main-only matrix.
  • Introduces a PR-triggered mutants-pr job that runs only on the Rust files the PR touches within the curated scope (router, dispatch, classify, dlp). The job is capped at 25 min of wall-clock time, writes a job summary plus a sticky PR comment, and stays informational; the full matrix on main remains the source of truth.
  • Adds scripts/mutation-pr.sh (diff-based filtering, scope filter, jq summary parsing, GITHUB_OUTPUT wiring).

Why

  • The audit flagged that mutation testing only ran on main and only covered router/ + features/dlp/ (5 files). Critical modules without mutation coverage: server/dispatch/, routing/classify/.
  • Running on PRs surfaces regressions before they land on main; the previous setup only produced post-merge noise.

Scope decisions

| Module | Shards | Rationale |
| --- | --- | --- |
| src/server/dispatch/mod.rs (575 LoC) | 5a, 5b (--shard 0/2 and 1/2) | Largest dispatch file; same split style as pii.rs |
| src/server/dispatch/retry.rs (478 LoC) | 5c | Single shard fits in 60 min budget |
| src/server/dispatch/provider_loop.rs (275 LoC) | 5d | Independent surface, kept separate from retry.rs |
| src/routing/classify/mod.rs (476 LoC) | 6a | Routing entry point, well-tested already |
| src/routing/classify/classify.rs (512 LoC) | 6b | Stateless complexity classifier |
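
For illustration, the shard layout above maps onto cargo-mutants invocations roughly as follows. This is a dry-run sketch, not the workflow's actual YAML: the helper names `shard_command` and `print_new_shards` are hypothetical, and any extra arguments the real matrix passes (for example test filters) are not shown because the PR description does not list them.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the cargo-mutants command for one matrix entry.
# $1 = file to mutate, $2 = optional shard spec such as "0/2".
# shard_command is a hypothetical helper, not part of the PR.
shard_command() {
  local file="$1" shard="${2:-}"
  local cmd=(cargo mutants --file "$file")
  if [[ -n "$shard" ]]; then
    cmd+=(--shard "$shard")
  fi
  printf '%s\n' "${cmd[*]}"
}

# Prints the six new entries in the order of the table (5a-5d, 6a-6b).
print_new_shards() {
  shard_command src/server/dispatch/mod.rs 0/2          # 5a
  shard_command src/server/dispatch/mod.rs 1/2          # 5b
  shard_command src/server/dispatch/retry.rs            # 5c
  shard_command src/server/dispatch/provider_loop.rs    # 5d
  shard_command src/routing/classify/mod.rs             # 6a
  shard_command src/routing/classify/classify.rs        # 6b
}
```

Calling `print_new_shards` lists one command per shard, which makes it easy to reproduce a single shard locally.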

resolver.rs, telemetry.rs, inference.rs, tier_match.rs, etc. were not added to the matrix yet to keep total per-PR cost bounded; they can land in a follow-up.

PR sampling strategy (mutants-pr)

  1. Diffs BASE...HEAD to find changed src/**/*.rs files.
  2. Drops anything outside the curated scope (router / dispatch / classify / dlp). Empty intersection → job skipped (status=skipped-out-of-scope).
  3. Runs cargo mutants --file ... -- --lib with timeout --foreground --preserve-status 1500 (25 min cap).
  4. Parses mutants.out/outcomes.json via jq for caught/missed/timeout/unviable counts.
  5. Writes GITHUB_STEP_SUMMARY + upserts a single PR comment (no spam on re-runs).
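
Steps 1 to 3 can be sketched in bash as below. This is a minimal sketch under stated assumptions, not the contents of scripts/mutation-pr.sh: the function names are hypothetical, and the prefixes `src/router/` and `src/features/dlp/` are guesses based on the module names in this description (only `src/server/dispatch/` and `src/routing/classify/` are spelled out in full). It also folds the "no Rust files changed" case into the out-of-scope status for brevity.

```shell
#!/usr/bin/env bash
# ASSUMPTION: scope prefixes; the first and last entries are guessed paths.
SCOPE_PREFIXES=(
  "src/router/"
  "src/server/dispatch/"
  "src/routing/classify/"
  "src/features/dlp/"
)

# Reads file paths on stdin and prints only the .rs files under a scope prefix.
filter_in_scope() {
  local path prefix
  while IFS= read -r path; do
    [[ "$path" == *.rs ]] || continue
    for prefix in "${SCOPE_PREFIXES[@]}"; do
      if [[ "$path" == "$prefix"* ]]; then
        printf '%s\n' "$path"
        break
      fi
    done
  done
}

# Steps 1-3: diff, scope filter, capped cargo-mutants run.
# $1 = base ref, $2 = head ref (hypothetical parameter names).
run_pr_mutants() {
  local base="$1" head="$2" files=() args=() f
  mapfile -t files < <(git diff --name-only "${base}...${head}" | filter_in_scope)
  if (( ${#files[@]} == 0 )); then
    echo "status=skipped-out-of-scope"
    return 0
  fi
  for f in "${files[@]}"; do
    args+=(--file "$f")
  done
  # 1500 s = 25 min wall-clock cap, as described in step 3.
  timeout --foreground --preserve-status 1500 cargo mutants "${args[@]}" -- --lib
}
```

Keeping the filter as a pure stdin-to-stdout function means it can be exercised without a git checkout or a Rust toolchain.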

Status legend exposed to the PR comment: clean, missed, timed-out, skipped-no-rust, skipped-out-of-scope.
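
One way the counts could be turned into the non-skipped statuses is sketched below. Several things here are assumptions rather than facts from this PR: the top-level field names read from outcomes.json should be checked against a real cargo-mutants output file, the precedence (missed before timed-out) is a guess, and the helper names are hypothetical. Only the `key=value` append to `$GITHUB_OUTPUT` is standard GitHub Actions behaviour.

```shell
#!/usr/bin/env bash
# Prints "caught missed timeout unviable" for an outcomes file.
# ASSUMPTION: outcomes.json exposes these top-level count fields.
read_counts() {
  jq -r '"\(.caught // 0) \(.missed // 0) \(.timeout // 0) \(.unviable // 0)"' "$1"
}

# Maps counts to a status string. $1 = missed, $2 = timeout.
# ASSUMPTION: surviving mutants take precedence over timeouts.
classify_status() {
  local missed="$1" timeout="$2"
  if (( missed > 0 )); then
    echo "missed"
  elif (( timeout > 0 )); then
    echo "timed-out"
  else
    echo "clean"
  fi
}

# Appends the status as a step output; falls back to stdout outside Actions.
emit_status() {
  printf 'status=%s\n' "$1" >> "${GITHUB_OUTPUT:-/dev/stdout}"
}
```

A caller might combine them as `read -r caught missed timeout unviable < <(read_counts mutants.out/outcomes.json)` followed by `emit_status "$(classify_status "$missed" "$timeout")"`.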

.cargo/mutants.toml

Left unchanged. The existing five exclusions already document each unreachable mutant with a file/line rationale; tightening the regex list is a separate, code-touching effort and was deferred to keep this PR diff scoped to CI infrastructure.

Caveat

If the new mutants-pr job (or the next main run with the expanded matrix) surfaces a wave of surviving mutations on dispatch/ — likely, since the audit found 0 unit tests there — those are tracked as test gaps for the parallel test/dispatch-unit-tests PR. This PR explicitly does not aim to fix them.

This PR could not run cargo-mutants locally to produce concrete survivor counts (the Claude Code sandbox blocks cargo execution). The first run of the workflow on this branch will produce the baseline; the comment it posts becomes the acceptance criteria for the unit-test PR.

Test plan

  • CI: mutants-pr job runs (this PR touches .github/workflows/ci.yml + scripts/, both outside the scope filter, so the script should exit skipped-out-of-scope → no mutation work performed).
  • CI: existing mutants shards 1-4 still green on the next main push.
  • CI: new shards 5a-d, 6a-b run on next main push (informational, continue-on-error: true).
  • CI: validate-yaml (actionlint) passes on the new YAML.
  • CI: required gate is unchanged — neither mutants nor mutants-pr blocks merge.

🤖 Generated with Claude Code

…able on PRs

Adds mutation testing for `src/server/dispatch/` (4 shards) and
`src/routing/classify/` (2 shards) to the existing main-only matrix and
introduces a PR-triggered `mutants-pr` job that runs only on the Rust
files the PR touches within the curated scope (router, dispatch, classify,
dlp). The PR job wall-clocks at 25 min via a wrapper script and writes a
sticky comment + job summary with caught/missed counts; it stays
informational and never blocks merge — the full matrix on `main` remains
the source of truth.

Files:
- `.github/workflows/ci.yml`: 6 new shards on `mutants` (5a/b/c/d for
  dispatch, 6a/b for classify), new `mutants-pr` job, no change to the
  `required` gate (mutation testing stays informational).
- `scripts/mutation-pr.sh`: diff-based filtering against the PR base ref,
  scope filter (router/dispatch/classify/dlp), 25 min wall-clock cap,
  jq summary parser, GITHUB_OUTPUT integration.

`.cargo/mutants.toml` is intentionally left unchanged: the existing five
exclusions already document each unreachable mutant with a file/line
rationale, and tightening the list is a separate, code-touching effort.
Destynova2 enabled auto-merge April 28, 2026 20:40
Clément LIARD added 2 commits April 28, 2026 22:44
…C2016

actionlint's shellcheck integration flagged the prior `printf '%s\n...'`
single-quoted format string with SC2016 (backtick characters in markdown
backticks misread as command substitution candidates). Replace the
printf with `{ echo ... } >file` and feed `gh pr comment --body-file`
plus `gh api --method PATCH` via `jq -Rs '{body: .}'`. Same external
behaviour, no more SC2016 noise.

Backticks in echos that wrap shell variables can be parsed by shellcheck
as command substitution candidates — even when escaped — and the
actionlint+shellcheck combo running in CI flagged them. Replace with
plain ASCII delimiters in the job-summary output and the comment-body
echos.
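
The pattern these two commits describe might look roughly like this. It is a sketch under assumptions, not the script's actual code: the function names, the comment wording, and the way the existing comment id is obtained are hypothetical. The `gh pr comment --body-file`, `gh api --method PATCH`, and `jq -Rs '{body: .}'` calls are the ones named in the commit message.

```shell
#!/usr/bin/env bash
# Writes the comment body with grouped echos instead of a single-quoted
# printf format, and without backticks, so shellcheck has nothing to flag.
# $1 = status, $2 = caught, $3 = missed, $4 = output file.
write_comment_body() {
  local status="$1" caught="$2" missed="$3" out="$4"
  {
    echo "Mutation testing (PR scope)"
    echo "status: ${status}"
    echo "caught: ${caught}, missed: ${missed}"
  } > "$out"
}

# Creates the comment, or updates it in place when an id is supplied.
# $1 = PR number, $2 = body file, $3 = existing comment id (optional).
upsert_comment() {
  local pr="$1" body_file="$2" comment_id="${3:-}"
  if [[ -n "$comment_id" ]]; then
    # jq -Rs reads the whole file as one raw string and wraps it as JSON.
    jq -Rs '{body: .}' "$body_file" |
      gh api --method PATCH "repos/{owner}/{repo}/issues/comments/${comment_id}" --input -
  else
    gh pr comment "$pr" --body-file "$body_file"
  fi
}
```

Passing the body through a file keeps quoting problems out of the command line, which is what lets the markdown content stay out of shellcheck's way.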
Destynova2 merged commit c6c5729 into main Apr 28, 2026
42 checks passed
Destynova2 deleted the ci/mutation-scope-expansion branch April 28, 2026 21:07