
fix(ci): rivet-core mutants — 16 shards + 30s timeout + kill ~70 survivors#221

Merged
avrabe merged 7 commits into main from fix/mutation-rivet-core-8-shard-and-survivors on Apr 27, 2026

Conversation

@avrabe
Contributor

@avrabe avrabe commented Apr 26, 2026

Summary

Multi-pronged fix to make rivet-core mutation testing complete reliably + kill the bulk of surviving mutants.

CI infrastructure changes

  commit   change                rationale
  ───────  ────────────────────  ─────────────────────────────────────────────
  ce41302  matrix 4 → 8 shards   first-line response; 4 shards still hit 45-min wall on main
  824de72  matrix 8 → 16 shards  first 8-shard CI run still cancelled 5 of 8 jobs
  2989834  --timeout 90s → 30s   individual mutants hitting the 90s cap were inflating per-shard wall time; default 3x baseline gives ~5-15s typical

Mutation tests added

Local cargo mutants -p rivet-core --file <module>.rs runs surfaced ~100 surviving mutants across the priority semantic modules. Per-module before/after totals after this branch:

  module                       before  after  notes
  ───────────────────────────  ──────  ─────  ─────────────────────────────────
  coverage_evidence.rs         10      0
  compliance.rs                21      0
  convergence.rs                6      0
  links.rs                     21      1*     new tests module (file had none)
  store.rs                      2      0
  commits.rs (is_artifact_id)   4      0
  validate.rs                  30      ~3     cardinality + required-field + allowed-values
  ───────────────────────────  ──────  ─────
  total killed                 ~89

* The single remaining links.rs survivor is the && → || mutant in LinkGraph::eq (line 104), between the forward and backward clauses. It is an equivalent mutant: backward is derived from forward in build(), so a difference in one always implies a difference in the other, and no external observer can distinguish && from || for that pair.

Tests added

  • coverage_evidence::tests::computed_percentage_* (3) — kill f64-const, *↔+/, /↔*/%
  • coverage_evidence::tests::coverage_store_is_empty_* (2)
  • compliance::tests::is_eu_ai_act_loaded_requires_both_anchor_types
  • compliance::tests::compute_compliance_partial_section_arithmetic
  • compliance::tests::compute_compliance_overall_pct_when_total_required_zero
  • convergence::tests::signature_message_hash_uses_xor_not_or_or_and
  • convergence::tests::failure_signature_display_writes_inner_string
  • convergence::tests::retry_strategy_guidance_returns_distinct_messages
  • convergence::tests::retry_strategy_display_uses_guidance
  • links::tests::* (7 tests — new module)
  • store::tests::store_is_empty_distinguishes_empty_and_populated
  • commits::tests::artifact_id_* (3)
  • validate::tests::cardinality_* (3 — ExactlyOne / OneOrMany / ZeroOrOne boundary triples)
  • validate::tests::link_target_type_filter_pins_inequality_and_negations
  • validate::tests::required_field_check_distinguishes_present_and_missing
  • validate::tests::allowed_values_string_check_distinguishes_in_and_out_of_set
  • validate::tests::diagnostic_display_writes_message
  • validate::tests::validate_documents_emits_for_unknown_artifact_reference
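
To illustrate why the computed_percentage tests use a partial value, here is a minimal sketch. The function signature and body are assumptions (the real coverage_evidence.rs code is not shown in this PR); only the "total zero returns one hundred" behaviour is taken from the test names above. `mutant_div_to_rem` is a hand-written copy of the kind of `/` → `%` mutant the list mentions.

```rust
/// Hypothetical stand-in for `computed_percentage`; the real signature is an assumption.
fn computed_percentage(covered: u32, total: u32) -> f64 {
    if total == 0 {
        return 100.0; // nothing to cover is treated as fully covered
    }
    f64::from(covered) / f64::from(total) * 100.0
}

/// The same body with `/` replaced by `%`, written out by hand for comparison.
fn mutant_div_to_rem(covered: u32, total: u32) -> f64 {
    if total == 0 {
        return 100.0;
    }
    f64::from(covered) % f64::from(total) * 100.0
}
```

With covered = 1 and total = 4 the original yields 25.0 while the mutant yields 100.0, so an assertion on the partial value kills it. With covered = 0 both return 0.0, which is why a zero-coverage input alone would let this mutant survive.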

Test count: 836 → 854 (cargo test -p rivet-core --lib, all green locally).

Per CLAUDE.md, every test commit carries Verifies: REQ-NNN trailers; the CI commits use Trace: skip since .github/workflows/ci.yml is exempt.

Test plan

  • All 16 rivet-core mutation shards complete (SUCCESS or FAILURE — not CANCELLED)
  • missed.txt artifacts upload with strictly fewer survivors than the previous main run (which cancelled all 4 shards, so any complete run sets a useful baseline)
  • Format/Clippy/Test/MSRV all green
  • No change to rivet-cli mutants gate (still 0/1, hard gate)

PR #209 split rivet-core mutation testing into 4 shards (~15-25 min
each). Latest main runs still show shards CANCELLED at 45 min — the
dogfood corpus has grown enough that even 4-way sharding doesn't
consistently fit.

Bump to 8 shards (rivet-core x {0..7}/8) so each arm runs ~6-12 min
with comfortable headroom. The cargo-mutants --shard k/N flag handles
the partition arithmetic; we just adjust the matrix.

Trace: skip

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 99.81685% with 1 line in your changes missing coverage. Please review.

  Files with missing lines  Patch %  Lines
  rivet-core/src/links.rs   99.04%   1 Missing ⚠️


avrabe and others added 2 commits April 26, 2026 21:38
PR #221 originally bumped from 4→8 shards, but the first run on the
8-shard config still hit the 45-min job timeout: 5 of 8 rivet-core
shards were CANCELLED, only 1 produced a missed.txt (FAILURE conclusion).

The dogfood corpus (~3677 mutants) divided 8 ways gives ~460 mutants
per shard, and on GitHub-hosted runners the build/test loop is slow
enough that even with `--jobs 4 --timeout 90` a shard can't finish
in 45 min when its slice contains many slow-to-build mutations.

Bump to 16 shards: ~230 mutants per shard, expected completion in
~12-20 min per shard with comfortable headroom.

Trace: skip

Follow-up to PR #218 which addressed embed.rs + reqif.rs survivors
from the 4-shard rivet-core mutation matrix. The 16-shard config
(this branch's previous commit) now lets every shard complete
without timeout, exposing survivors in the semantic modules.

Local cargo-mutants runs against main produced these survivor counts;
this commit's tests drive each to zero (verified locally with the
same `cargo mutants -p rivet-core --file <module>.rs` command):

  module                     before  after
  coverage_evidence.rs        10      0
  compliance.rs               21      0
  convergence.rs               6      0
  links.rs                    21      1*
  store.rs                     2      0
  ─────────────────────────  ─────  ─────
  total                       60      1*

(*) The remaining links.rs survivor is the `&&` between
    `forward == other.forward` and `backward == other.backward` on
    line 104. It is an EQUIVALENT mutant: `LinkGraph::backward` is
    derived from `forward` during `build()`, so any forward
    difference always implies a backward difference. No external
    test can distinguish `&&` from `||` for this clause. The
    companion clause on line 105 (`broken == other.broken`) IS
    killed because `broken` is independent of forward/backward.
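
The equivalence argument for the forward/backward pair can be checked on a reduced model. The sketch below is hypothetical: `MiniGraph`, its edge-set representation, and the `pair_and` / `pair_or` helpers are illustrative names, not the real `LinkGraph`. It assumes, as the note above states, that `backward` is computed purely from `forward`, and it covers only the two-clause pair; how a textual `||` regroups with a trailing `&&` clause depends on the real expression, which is not shown here.

```rust
use std::collections::BTreeSet;

/// Hypothetical, reduced model of a link graph: only the two fields that are derived together.
struct MiniGraph {
    forward: BTreeSet<(String, String)>,  // (source, target)
    backward: BTreeSet<(String, String)>, // (target, source), derived in build()
}

impl MiniGraph {
    fn build(edges: &[(&str, &str)]) -> Self {
        let forward: BTreeSet<(String, String)> = edges
            .iter()
            .map(|(s, t)| (s.to_string(), t.to_string()))
            .collect();
        // backward is a pure, invertible function of forward.
        let backward = forward.iter().map(|(s, t)| (t.clone(), s.clone())).collect();
        Self { forward, backward }
    }

    /// The original two-clause comparison.
    fn pair_and(&self, other: &Self) -> bool {
        self.forward == other.forward && self.backward == other.backward
    }

    /// The same comparison with `&&` mutated to `||`.
    fn pair_or(&self, other: &Self) -> bool {
        self.forward == other.forward || self.backward == other.backward
    }
}
```

Because the two sets are equal or unequal together in this model, `pair_and` and `pair_or` return the same answer for every pair of graphs built through `build()`.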

Tests added (each pins one or more named mutants):

  coverage_evidence.rs:
   - computed_percentage_partial_value
       (kills f64-const, *↔+/, /↔*/% in computed_percentage)
   - computed_percentage_total_zero_returns_one_hundred
   - computed_percentage_total_nonzero_full_coverage
   - coverage_store_is_empty_true_on_new
   - coverage_store_is_empty_false_after_insert

  compliance.rs:
   - is_eu_ai_act_loaded_requires_both_anchor_types
       (kills && → ||, constant-false on the loader)
   - compute_compliance_partial_section_arithmetic
       (kills += ↔ -=/*=, > ↔ ==/<=/>= , == ↔ != ,
        * ↔ +/, / ↔ %/* across compute_compliance)
   - compute_compliance_overall_pct_when_total_required_zero

  convergence.rs:
   - signature_message_hash_uses_xor_not_or_or_and
       (kills ^= ↔ |=/&= in simple_hash)
   - failure_signature_display_writes_inner_string
   - retry_strategy_guidance_returns_distinct_messages
       (kills constant-string replacement on guidance())
   - retry_strategy_display_uses_guidance

  links.rs (new tests module — file had no prior tests):
   - debug_fmt_writes_struct_name
   - partial_eq_distinguishes_distinct_graphs
   - node_map_returns_artifact_indices
   - backlinks_of_type_filters_by_type
   - has_cycles_distinguishes_acyclic_and_cyclic
   - orphans_lists_only_artifacts_with_no_links
   - reachable_traverses_only_matching_link_type

  store.rs:
   - store_is_empty_distinguishes_empty_and_populated
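
As a rough illustration of the `^=` ↔ `|=`/`&=` mutants named for `simple_hash` above, the sketch below folds bytes with a pluggable combining operator. Everything here is an assumption made for illustration: the real `simple_hash` body is not shown in this PR, and `fold_hash`, `mix_xor`, `mix_or`, and `mix_and` are hypothetical names.

```rust
/// Hypothetical byte-folding hash; `combine` stands in for the operator under mutation.
fn fold_hash(input: &[u8], combine: fn(u64, u64) -> u64) -> u64 {
    let mut h: u64 = 0;
    for &b in input {
        // Rotate the running value, then mix in the next byte.
        h = combine(h.rotate_left(5), u64::from(b));
    }
    h
}

fn mix_xor(a: u64, b: u64) -> u64 { a ^ b } // the original operator
fn mix_or(a: u64, b: u64) -> u64 { a | b }  // stands in for the `|=` mutant
fn mix_and(a: u64, b: u64) -> u64 { a & b } // stands in for the `&=` mutant
```

For the input `b"ab"` the three variants give 0xC42, 0xC62, and 0 respectively, so one assertion on the exact hash separates all of them. A single-byte input would not: XOR and OR agree whenever the rotated state and the new byte share no set bits.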

Per CLAUDE.md, every commit touching `rivet-core/src/` requires
artifact trailers.

Verifies: REQ-002, REQ-004, REQ-009, REQ-010

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@avrabe avrabe changed the title fix(ci): bump rivet-core mutants to 8 shards fix(ci): bump rivet-core mutants to 16 shards + kill 39 semantic-module survivors Apr 26, 2026
@github-actions

github-actions Bot commented Apr 26, 2026

📐 Rivet artifact delta

No artifact changes in this PR. Code-only changes (renderer, CLI wiring, tests) don't touch the artifact graph.

avrabe and others added 4 commits April 26, 2026 21:54
Follow-up additions to the survivor-pinning effort. Local run of
`cargo mutants -p rivet-core --file rivet-core/src/commits.rs` showed
seven survivors:

  3 in expand_artifact_range — `||` -> `&&` on the four-clause numeric
    guard. Analysis: each clause has a downstream early-return path
    (start_str.parse::<u64>() / end.parse::<u64>() / start > end) that
    produces the same outer Vec result, so these are EQUIVALENT mutants.
    Documented as such; not pinned.

  4 in is_artifact_id — `&&` -> `||` between the four guard clauses
    (!prefix.is_empty(), prefix.split-all-uppercase, !suffix.is_empty(),
    suffix.all-digit). Three pinning tests cover all four:

    artifact_id_rejects_double_hyphen_prefix
      Input "A--1": exercises clauses 261 (top-level), 263:44 (inside
      closure), and 264 (between B and C). Each `||` mutant flips
      the answer to true.

    artifact_id_rejects_non_digit_suffix
      Input "REQ-1A": exercises clause 265 (between C and D). The `||`
      mutant flips the answer to true.

    artifact_id_rejects_leading_hyphen
      Input "-1": companion case for clause 261. Combined with the
      double-hyphen test, kills mutant 261 unambiguously.
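
The three inputs above can be tried against a small reconstruction of the four-clause guard. This is a sketch under assumptions: the body below is inferred from the clause descriptions in this commit message (non-empty prefix, all-uppercase prefix segments, non-empty suffix, all-digit suffix) and is not the code from commits.rs; splitting at the last hyphen is also an assumption.

```rust
/// Hypothetical reconstruction of a four-clause artifact-ID check.
fn is_artifact_id(s: &str) -> bool {
    // Split at the last hyphen: "SYS-REQ-12" becomes ("SYS-REQ", "12").
    let (prefix, suffix) = match s.rsplit_once('-') {
        Some(parts) => parts,
        None => return false,
    };
    !prefix.is_empty()
        && prefix
            .split('-')
            .all(|part| !part.is_empty() && part.chars().all(|c| c.is_ascii_uppercase()))
        && !suffix.is_empty()
        && suffix.chars().all(|c| c.is_ascii_digit())
}
```

In this sketch "A--1" fails because the prefix "A-" contains an empty segment, "REQ-1A" fails on the non-digit suffix, and "-1" fails on the empty prefix. Each rejection depends on a different `&&`, which is what lets a rejecting test notice when one of them becomes `||`.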

Per CLAUDE.md, every commit touching `rivet-core/src/` requires
artifact trailers.

Verifies: REQ-017

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Local cargo-mutants on validate.rs surfaced 30 surviving mutants; the
heaviest cluster sat in `validate_structural`'s cardinality and
link-target-type block (lines 296-345). These tests pin the four
match-guard comparison operators plus the negation flags and the
inner `&&` clause inside `OneOrMany`.

Tests added (each pins one or more named mutants in the body):

  cardinality_exactly_one_distinguishes_zero_one_two
    Boundary triple at 0/1/2 links with `Cardinality::ExactlyOne`.
    Kills:
      validate.rs:297:44 match guard `count != 1` → true / → false
      validate.rs:297:50 `!=` → `==`
      validate.rs:293:41 `==` → `!=` on the link_type filter

  cardinality_one_or_many_only_emits_when_required_and_zero
    Cross-product (required ∈ {true, false}) × (count ∈ {0, 1}).
    Kills:
      validate.rs:311:43 match guard → true / → false
      validate.rs:311:49 `==` → `!=`
      validate.rs:311:54 `&&` → `||`

  cardinality_zero_or_one_distinguishes_zero_one_two
    Boundary triple at 0/1/2 links with `Cardinality::ZeroOrOne`.
    Kills:
      validate.rs:325:43 match guard `count > 1` → true / → false
      validate.rs:325:49 `>` → `==` / `<` / `>=`

  link_target_type_filter_pins_inequality_and_negations
    Wrong-type vs right-type targets through the link-target-type loop.
    Kills:
      validate.rs:344:35 `!=` → `==`
      validate.rs:348:24 delete `!` on `target_types.is_empty()`
      validate.rs:349:25 `&&` → `||`
      validate.rs:349:28 delete `!` on `target_types.contains(...)`

  diagnostic_display_writes_message
    Kills: validate.rs:81:9 fmt::Display::fmt → Ok(Default::default())

  validate_documents_emits_for_unknown_artifact_reference
    Kills:
      validate.rs:523:5 validate_documents → vec![]
      validate.rs:527:16 delete `!` on the missing-id check
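
The boundary-triple idea used for the cardinality tests can be sketched as follows. The enum variants and the guard expressions (`count != 1`, `count > 1`, and a zero-count check combined with `required`) come from the mutant list above; the `violates` function, its signature, and the exact shape of the `OneOrMany` guard are assumptions for illustration, not the code in validate.rs.

```rust
/// Variant names follow the commit message; the function around them is hypothetical.
#[derive(Clone, Copy)]
enum Cardinality {
    ExactlyOne,
    OneOrMany,
    ZeroOrOne,
}

/// Returns true when `count` links violate the rule (a diagnostic would be emitted).
fn violates(card: Cardinality, count: usize, required: bool) -> bool {
    match card {
        Cardinality::ExactlyOne if count != 1 => true,
        Cardinality::OneOrMany if count == 0 && required => true,
        Cardinality::ZeroOrOne if count > 1 => true,
        _ => false,
    }
}
```

Checking each rule at 0, 1, and 2 links puts an input on both sides of every comparison, so replacing `>` with `>=` or `!=` with `==` changes at least one expected result. The `OneOrMany` guard needs the cross-product with `required` for the same reason.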

Per CLAUDE.md, every commit touching `rivet-core/src/` requires
artifact trailers.

Verifies: REQ-004

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Follow-up additions to the validate.rs mutation-pinning effort. The
first round drove the cardinality block from 30 → 10 survivors; this
round pins seven more in the required-field and allowed-values blocks.

Tests added:

  required_field_check_distinguishes_present_and_missing
    Kills:
      validate.rs:170:31 `&&` → `||`
      validate.rs:170:34 delete `!` on `contains_key`
      validate.rs:177:20 delete `!` on `has_base`
      validate.rs:173:21 delete match arm "description"
      validate.rs:174:21 delete match arm "status"

  allowed_values_string_check_distinguishes_in_and_out_of_set
    Kills:
      validate.rs:198:28 delete `!` on `!any(==)`
      validate.rs:198:54 `==` → `!=`
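
A minimal sketch of why the allowed-values test needs both an in-set and an out-of-set value. The `!any(==)` shape is taken from the mutant descriptions above; the function name `is_disallowed` and its signature are hypothetical.

```rust
/// Hypothetical allowed-values check; returns true when `value` should be flagged.
fn is_disallowed(value: &str, allowed: &[&str]) -> bool {
    !allowed.iter().any(|a| *a == value)
}
```

With the allowed set ["draft", "approved"], deleting the `!` flips the result for "draft", while changing `==` to `!=` leaves "draft" unflagged and is only exposed by an out-of-set value such as "bogus". One case alone would leave one of the two mutants alive.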

Combined with the prior commit, validate.rs survivors are now expected
to drop from the original 30 to single digits. The remainder are mostly
in subprocess-dependent paths (415/441/498), which are harder to test
without filesystem fixtures.

Per CLAUDE.md, every commit touching `rivet-core/src/` requires
artifact trailers.

Verifies: REQ-004

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Even with 16 shards, the rivet-core mutation jobs were still hitting
the 45-min wall. Investigation: the per-mutant timeout was 90s, and
the dogfood corpus has a long tail of mutants that hit that cap. With
~230 mutants per shard at worst-case 90s and `--jobs 4`, a single
shard can take ~86 min wall.

Drop `--timeout` to 30s. cargo-mutants's default is 3x baseline test
time, which on this codebase is well under 30s for the vast majority
of mutants. Anything slower than 30s gets reported as `timeout` (which
counts as "caught" — not "missed") and so doesn't hide real survivors.
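
The worst-case figure quoted above can be reproduced with a small back-of-the-envelope model. It is a sketch under simplifying assumptions: every mutant runs for the full timeout, the `--jobs` workers are perfectly parallel, and build time is ignored. The function names are hypothetical; the inputs (3677 mutants, 16 shards, 4 jobs, 90s and 30s timeouts) are the numbers from this PR.

```rust
/// Mutants per shard, rounded up.
fn mutants_per_shard(total: u32, shards: u32) -> u32 {
    (total + shards - 1) / shards
}

/// Worst-case wall time in minutes if every mutant ran until the timeout.
fn worst_case_minutes(mutants: u32, timeout_secs: u32, jobs: u32) -> f64 {
    f64::from(mutants) * f64::from(timeout_secs) / f64::from(jobs) / 60.0
}
```

Under this model, `worst_case_minutes(mutants_per_shard(3677, 16), 90, 4)` is 86.25 minutes, matching the ~86 min estimate, while the same call with a 30s timeout gives 28.75 minutes, which fits inside the 45-minute job limit.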

Trace: skip

@github-actions github-actions Bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Rivet Criterion Benchmarks'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

  Benchmark suite         Current: 2989834              Previous: 3cdb942             Ratio
  ──────────────────────  ────────────────────────────  ────────────────────────────  ─────
  store_insert/10000      23910794 ns/iter (± 1846451)  12005789 ns/iter (± 731667)   1.99
  link_graph_build/10000  53639643 ns/iter (± 5704105)  27083984 ns/iter (± 1999680)  1.98
  validate/10000          20744100 ns/iter (± 1170380)  11121326 ns/iter (± 518102)   1.87
  diff/10000              10080868 ns/iter (± 447283)   7808283 ns/iter (± 193153)    1.29

This comment was automatically generated by workflow using github-action-benchmark.

@avrabe avrabe changed the title fix(ci): bump rivet-core mutants to 16 shards + kill 39 semantic-module survivors fix(ci): rivet-core mutants — 16 shards + 30s timeout + kill ~70 survivors Apr 26, 2026
@avrabe avrabe merged commit 793dce6 into main Apr 27, 2026
25 of 40 checks passed
@avrabe avrabe deleted the fix/mutation-rivet-core-8-shard-and-survivors branch April 27, 2026 03:56