Problem
All C-LEARN-scale integration tests are marked #[ignore] so they don't blow the 3-minute cargo test --workspace debug-mode wall-clock cap (see docs/dev/rust.md test-time budgets and the CLAUDE.md "Hard Rules" section). Affected tests include (in src/simlin-engine/tests/):
- simulates_clearn
- clearn_with_ltm_simulates_model_vars_identically
- clearn_ltm_discovery_compiles
- clearn_pinned_climate_loop_is_scored
- simulates_clearn_wasm
Because they're #[ignore]d, they only run when someone remembers to invoke them manually (cargo test -- --ignored). Regressions in C-LEARN-scale behavior are therefore not caught automatically by CI.
Why it's now worth fixing
Measured 2026-06-01: a single C-LEARN test takes 39-60s in debug mode (parse + LTM compile alone is ~39s), i.e. 22-33% of the entire workspace budget -- which is exactly why they're ignored. But in release mode the same test is only 6-15s, thanks to the recent LTM performance work (the LTM compile dropped from 47.6s to 4.0s in release; see commits 477af2a / 94f9c07 and the recent perf session).
The perf wins make these tests cheap enough in release mode that a dedicated CI lane running them automatically is now practical, where it would not have been before.
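The budget shares quoted above follow from the 3-minute (180s) cap. A minimal sketch of that arithmetic in POSIX shell; the helper name pct_of_budget is made up for illustration and is not part of the repository:

```shell
# Hypothetical helper: share of the 180s workspace cap that one test of the
# given duration (in whole seconds) would consume, rounded to the nearest percent.
BUDGET_S=180

pct_of_budget() {
  echo $(( ($1 * 100 + BUDGET_S / 2) / BUDGET_S ))
}

# Debug-mode range measured in the issue: 39-60s per test.
echo "debug: $(pct_of_budget 39)%-$(pct_of_budget 60)%"     # prints "debug: 22%-33%"
# Release-mode range for comparison: 6-15s per test.
echo "release: $(pct_of_budget 6)%-$(pct_of_budget 15)%"    # prints "release: 3%-8%"
```

The release figures are shown against the same 180s cap only for comparison; the proposed lane would run outside that budget.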
Why it matters
- Correctness/regression coverage: C-LEARN is the largest real model in the corpus and exercises code paths (large-model LTM compile, fragment assembly, pinned-loop scoring, WASM sim) that smaller fixtures don't. Right now nothing guards those paths in CI.
- Developer experience: relying on humans to remember --ignored is fragile; the recent perf regressions/fixes in this area show the paths are actively churning.
Proposed approach
Add a separate CI lane in .github/workflows/ci.yaml (or a scheduled/nightly job) that runs the ignored C-LEARN-scale tests in release mode, e.g.:
cargo test --release -p simlin-engine -- --ignored <clearn test filters>
This stays off the 3-minute debug cargo test --workspace path (which remains the canonical PR gate) while still exercising the C-LEARN tests automatically. Options to weigh:
- A dedicated release-mode job in the existing PR CI matrix (runs on every PR, but covers only the ignored C-LEARN-scale tests and only in release mode).
- A schedule:-triggered nightly job (lower PR cost, slightly delayed signal).
Either way, this is a CI-config change; it does not require un-marking the #[ignore] attributes for the default debug flow.
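The invocation proposed above leaves the filters as a placeholder. One way to fill it in is to pass each test name from the Problem section as its own filter, since the test harness accepts several filters and matches them as substrings. This is a sketch only: the CLEARN_TESTS, CLEARN_CMD and DRY_RUN names are invented here, and it assumes all five tests are reachable through -p simlin-engine with default features, which the issue does not confirm (simulates_clearn_wasm in particular may need extra setup).

```shell
#!/bin/sh
# Test names copied from the "Problem" section of this issue.
CLEARN_TESTS="simulates_clearn
clearn_with_ltm_simulates_model_vars_identically
clearn_ltm_discovery_compiles
clearn_pinned_climate_loop_is_scored
simulates_clearn_wasm"

# Join the names onto one line. Filters are substring matches, so any other
# ignored test whose name contains one of these strings would run as well.
CLEARN_CMD="cargo test --release -p simlin-engine -- --ignored $(echo $CLEARN_TESTS)"

# DRY_RUN is a hypothetical switch: by default the command is only printed;
# set DRY_RUN=0 to actually execute it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CLEARN_CMD"
else
  $CLEARN_CMD
fi
```

Listing the names explicitly keeps the lane's scope visible in the CI config, at the cost of having to update the list when a C-LEARN test is added or renamed.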
Components affected
- CI configuration: .github/workflows/ci.yaml
- The #[ignore]d tests in src/simlin-engine/tests/ (no source change needed beyond possibly a shared filter/feature to select them)
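If the lane selects tests by name, a renamed test would make its filter match nothing, and a harness run that matches zero tests still passes. A possible guard is sketched below, assuming the harness's --list output uses the usual "name: test" line format. missing_tests is a hypothetical helper, not something in the repository:

```shell
# Reads a test listing on stdin and prints every expected test name
# (given as arguments) that does not appear in it.
missing_tests() {
  listing=$(cat)
  for name in "$@"; do
    # Accept both top-level names and module-qualified ones (some_mod::name).
    printf '%s\n' "$listing" | grep -q -E "(^|::)${name}: test\$" \
      || printf '%s\n' "$name"
  done
}

# Canned listing standing in for real harness output.
sample='simulates_clearn: test
simulates_clearn_wasm: test'

printf '%s\n' "$sample" | missing_tests simulates_clearn clearn_ltm_discovery_compiles
# prints: clearn_ltm_discovery_compiles
```

In the lane, the listing would come from cargo test --release -p simlin-engine -- --list --ignored, and the job would fail whenever the helper prints anything.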
Related
- wrld3_element_level_enumeration_is_uncapped is gated behind release-only for the same debug-vs-release cost asymmetry -- precedent for release-gated heavy tests.
Discovery context
Identified during the GH #653 pinned-loop work (design plan docs/design-plans/2026-06-01-ltm-653-pinned-loop-dimensions.md, "Additional Considerations" section records the timing measurements). The question was whether the new C-LEARN pin test (clearn_pinned_climate_loop_is_scored) could be un-ignored given the recent perf improvements; the answer was no for debug mode, yes if a release-mode lane existed.