Skip to content

Commit 9515add

Browse files
committed
Codify the testing playbook + tools inventory
Centralizes the decisions we made during the Step 1-6 speedup and the Step 7 + mutation/state-machine/property/IPC investigations into two canonical docs: - docs/testing.md: the playbook. Test pyramid, decision table (when to use which layer), anti-patterns with rationale (no `await sleep`, no synthetic F-key for dialog tests, no direct atomic mutation in state-machine tests, no `retries: 1` masking), and a per-feature checklist ("when you add X, also add Y"). - docs/tooling/testing.md: the tools inventory. One paragraph per tool — proptest, cargo-mutants, stryker, vitest mockIPC, pollUntil, dispatchMenuCommand, virtual-mtp, the Docker SMB containers, the checker, and the custom ESLint rules. Plus per-module CLAUDE.md updates calling out the testing bar where it matters most: - write_operations/: state machine has 30+ tests, drive transitions via public interface, run cargo-mutants after substantial changes - indexing/: IndexPhase lifecycle tests need a dedicated mutex + tempdir IndexStore; `Initializing` carries non-Clone owned state - src/lib/ipc/: when to write an IPC contract test (destructive, cross-window, many-arg surfaces) and when to skip (thin getters) And process integration: - AGENTS.md: link to docs/testing.md from the Testing section - docs/maintenance.md: quarterly mutation sweep, per-release E2E health check, per-release scan for new fixed sleeps in E2E specs, periodic triage of the legacy `eslint-disable cmdr/no-arbitrary-sleep-in-e2e` annotations
1 parent 43e78a5 commit 9515add

8 files changed

Lines changed: 451 additions & 32 deletions

File tree

AGENTS.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -92,6 +92,10 @@ Core structure:
9292

9393
## Testing and checking
9494

95+
**Before adding or modifying tests**, read [docs/testing.md](docs/testing.md) — the testing playbook (decision table,
96+
anti-patterns, per-feature checklist). The companion file [docs/tooling/testing.md](docs/tooling/testing.md) is the
97+
tools inventory.
98+
9599
Always use the checker script for compilation, linting, formatting, and tests. Its output is concise and focused — no
96100
`2>&1`, `head`, or `tail` needed. Don't run raw `cargo check`, `cargo clippy`, `cargo fmt`, `cargo nextest run`, etc.
97101

apps/desktop/src-tauri/src/file_system/write_operations/CLAUDE.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -255,3 +255,20 @@ and `statfs.f_fstypename` for APFS. See `copy_strategy.rs` for the implementatio
255255
- `crate::file_system::volume``Volume` trait, `SpaceInfo`, `ScanConflict` (used by `volume_copy`)
256256
- `crate::ignore_poison``IgnorePoison` extension for `RwLock`/`Mutex` to not panic on poisoned locks
257257
- External: `tauri` (emit, AppHandle), `uuid` (operation IDs, temp names), `libc` (access, statvfs, sync), `xattr`, `exacl`, `filetime` (metadata preservation in `chunked_copy`)
258+
259+
## Testing bar
260+
261+
This module's state machine (`state.rs`) is the spine of the cancel UX. Past investigations found one real production
262+
bug here ([commit `1de4255d`](../../../../../../docs/notes/speed-up-e2e-tests.md) — lost-rollback on `Ok(())` arm) plus
263+
30+ mutation-testing gaps that have since been pinned. New transitions or new cancel paths must:
264+
265+
1. **Drive the state machine through the public interface in tests.** Direct `state.intent.store(...)` mutation bypasses
266+
the validation guard and effectively dead-tests it. Pattern to copy: `state.rs::tests::test_cancel_via_public_path`.
267+
2. **Cover both the happy path and the cancel-during-X race** for any new write-side operation. The Cancel-copy bug
268+
was specifically the `Ok(())` arm of the loop not re-checking intent.
269+
3. **Add at least one E2E test** for user-visible flows (transfer dialogs, conflict policies) — use
270+
`dispatchMenuCommand` for keyboard-shortcut triggers, see `docs/testing.md` § "❌ Synthesized F-key dispatches".
271+
4. **Run `cargo mutants --file src/file_system/write_operations/<file>.rs`** after substantial changes — this module has
272+
~85-90% mutation score per file and shouldn't regress. See `docs/testing.md` § "Process".
273+
274+
See also: [docs/testing.md](../../../../../../docs/testing.md) for the project-wide testing playbook.

apps/desktop/src-tauri/src/indexing/CLAUDE.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -124,6 +124,25 @@ Key test files are alongside each module (test functions within `#[cfg(test)]` b
124124
- stress_tests_lifecycle.rs: lifecycle stress tests — start/stop/restart under load, clean lifecycle, double-start guard, early shutdown, rapid cycles, mixed queued work shutdown
125125
- stress_test_helpers.rs: shared helpers for stress tests — `setup_writer`, `build_synthetic_tree`, `check_db_consistency`, `make_file_entry`
126126

127+
### Testing bar — state machine + IndexPhase lifecycle
128+
129+
The `IndexPhase` lifecycle (`Disabled → Initializing → Running`, plus the `Initializing → Disabled` race during cancel)
130+
is the trickiest backend state machine to test cleanly. Three rules for tests in this area:
131+
132+
1. **Tests must serialize on a dedicated mutex.** `INDEXING` is a global; concurrent tests will corrupt each other.
133+
Pattern in `mod.rs::tests`: dedicated mutex + `IndexStore` fixtured via `tempdir`.
134+
2. **`Initializing { store: IndexStore }` carries non-`Clone` owned state.** Building fixtures is verbose. Where you'd
135+
like to test a pure transition without the owned data, extract a pure classifier (e.g. `is_initializing_phase`) and
136+
test that in isolation. Don't pretend to mock `IndexStore`.
137+
3. **`start_indexing`'s `Disabled → Initializing → Running` happy path needs `tauri::AppHandle`** — currently not
138+
feasible in unit tests without enabling the `tauri/test` feature (meaningful compile cost). The classifier-extraction
139+
approach covers the race-decision logic; the rest stays under integration / E2E coverage.
140+
141+
For `platform_case_compare` in `store.rs`: proptests cover the comparator algebra (reflexive / antisymmetric /
142+
transitive) and NFC≡NFD equivalence on macOS. Don't regress those — see `store.rs::tests` for the property statements.
143+
144+
See also: [docs/testing.md](../../../../../../docs/testing.md) for the project-wide testing playbook.
145+
127146
## Key decisions
128147

129148
**Defer indexer auto-start until the user decides about Full Disk Access**: At first launch on macOS, recursively

apps/desktop/src/lib/ipc/CLAUDE.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -167,3 +167,38 @@ Why two parallel command lists? `specta::function::collect_functions![]` doesn't
167167
(it only takes path expressions). The `tauri::generate_handler![]` macro does. So we use `generate_handler![]` for
168168
runtime dispatch (handles platform gating cleanly) and `collect_functions![]` for type collection (split by platform
169169
into separate fns). They're kept in sync by convention.
170+
171+
## IPC contract testing
172+
173+
For destructive / cross-window / multi-positional-arg commands, write a vitest IPC contract test that pins the wire
174+
shape. The harness is `installIpcMock()` in `test-helpers.ts`:
175+
176+
```ts
177+
import { installIpcMock } from './test-helpers'
178+
import { commands } from './bindings'
179+
180+
const ipc = installIpcMock()
181+
ipc.mock('copy_files', () => null)
182+
await commands.copyFiles({ sources, destination, volumeId, itemSizes, config })
183+
const call = ipc.lastCall('copy_files')
184+
expect(call?.payload).toMatchObject({ sources, destination /* ... */ })
185+
```
186+
187+
What IPC tests catch:
188+
189+
- Serde-shape drift between Rust and the typed bindings (renamed field in the Rust struct)
190+
- Multi-positional-arg ordering bugs (`mount_network_share` has 6 positional args — easy to swap two by accident)
191+
- Typed-error discriminator shape (`{ type: 'auth_required' }` vs `{ code: 'auth_required' }` etc.)
192+
- Whether the FE actually calls the binding (TS compiles ≠ runtime invokes)
193+
194+
What IPC tests do **not** catch:
195+
196+
- The Tauri permission gate (mockIPC patches `__TAURI_INTERNALS__.invoke` upstream of the gate)
197+
- Business-logic bugs in `*_core` / `ops_*` helpers (those should have Rust unit tests)
198+
- UI integration (Playwright E2E covers those)
199+
200+
When to skip an IPC test: thin getters (`get_default_volume_id`, `has_font_metrics`) — their shape is trivial and
201+
already enforced by the TS bindings. **Don't write IPC tests for every command** — focus on destructive, cross-window,
202+
and many-arg surfaces.
203+
204+
See `docs/testing.md` § "Decision table" for the broader picture.

docs/maintenance.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -66,6 +66,23 @@ automate it, and survives `oxfmt` (which collapses whitespace in regular markdow
6666
- **Pin GitHub Actions to commit SHAs**: Re-pin Actions to fresh commit SHAs when their tagged versions move.
6767
Supply-chain safety. Frequency: every 6 months alongside the Actions version bump.
6868

69+
### Test-suite health
70+
71+
- **Quarterly: mutation-testing sweep on hot-spot modules**. Run `cargo mutants` on each module listed in
72+
`docs/testing.md` § "Hot spots" (write_operations, indexing, file_viewer, file_system/index/store). Triage survivors.
73+
If mutation score drops below 80% on any of them, add tests to bring it back. The pass is slow (~10–15 min per file on
74+
this hardware) — budget half a day. Document the run + score in the log below.
75+
- **Per release: E2E suite health check**. Three back-to-back `./scripts/check.sh --check desktop-e2e-playwright` runs.
76+
All three must be green (no flakes). Look at the slowest 5 tests — if any have crept back to `sleep()`-based waits
77+
(the lint catches new ones, but existing `eslint-disable` annotations may need re-triaging), replace with `pollUntil`.
78+
Document the run + flake rate + slowest tests in the log.
79+
- **Per release: scan for new fixed sleeps in E2E specs**. Run
80+
`grep -rE "await sleep\(" apps/desktop/test/e2e-playwright/*.spec.ts | grep -v "eslint-disable"` — should return
81+
empty. If not, the new sleeps got past the lint somehow and need attention.
82+
- **As needed: surviving `eslint-disable cmdr/no-arbitrary-sleep-in-e2e` annotations**. ~66 of these were added during
83+
the Step 6 speedup as a legacy migration tool. Each is a candidate for replacement with `pollUntil`. Picking off a few
84+
per quarter shrinks the technical-debt surface without forcing a single huge cleanup.
85+
6986
## Log
7087

7188
Newest-first. Tab-separated columns: `date`, `hash(es)`, `task`, `comment`.

docs/notes/extend-e2e-tests.md

Lines changed: 33 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -603,14 +603,14 @@ Branch `e2e-speedup`, 47 commits, ready for fast-forward to `main`.
603603

604604
### Tests added across the push
605605

606-
| Category | Tests | Bugs surfaced |
607-
| ------------------------------ | ----- | ------------------------------------------------------------------------ |
608-
| E2E coverage extension | 9 | 1 — Cancel-copy rollback (Rust `Ok(())` arm + Svelte settle-window race) |
609-
| Mutation testing (Rust+Svelte) | 55 | 0 |
610-
| State-machine transitions | 17 | 1 — `file_viewer` `SearchStatus::Cancelled` unobservable to FE |
611-
| Property-based (proptest) | 12 | 0 |
612-
| IPC contract (mockIPC) | 23 | 0 |
613-
| **Total new unit / IPC tests** | **107** | |
606+
| Category | Tests | Bugs surfaced |
607+
| ------------------------------ | ------- | ------------------------------------------------------------------------ |
608+
| E2E coverage extension | 9 | 1 — Cancel-copy rollback (Rust `Ok(())` arm + Svelte settle-window race) |
609+
| Mutation testing (Rust+Svelte) | 55 | 0 |
610+
| State-machine transitions | 17 | 1 — `file_viewer` `SearchStatus::Cancelled` unobservable to FE |
611+
| Property-based (proptest) | 12 | 0 |
612+
| IPC contract (mockIPC) | 23 | 0 |
613+
| **Total new unit / IPC tests** | **107** | |
614614

615615
Plus dead code removal: `SmbVolume::ConnectionState::OsMount` variant dropped (state machine collapsed to its real
616616
binary `Direct ⇄ Disconnected` shape).
@@ -625,52 +625,53 @@ binary `Direct ⇄ Disconnected` shape).
625625

626626
- **E2E checker total**: 13m 12s baseline → **4m 18s** in the final slow pass (−67%).
627627
- **E2E Playwright wall-clock**: 10m 12s baseline → ~1m 48s longest shard (−82%).
628-
- **Fast checker total**: ~2m 30s baseline → **3m 13s** (+43s) — the regression is the new IPC contract tests
629-
(+30s of Svelte vitest) and the +79 Rust tests (+1s). Net cost is well below what an equivalent E2E spec would add.
628+
- **Fast checker total**: ~2m 30s baseline → **3m 13s** (+43s) — the regression is the new IPC contract tests (+30s of
629+
Svelte vitest) and the +79 Rust tests (+1s). Net cost is well below what an equivalent E2E spec would add.
630630

631631
### Real bugs fixed
632632

633-
1. **file_viewer search-cancel was unobservable to the FE.** `search_cancel` nulled `session.search` immediately,
634-
so the spawned thread's `SearchStatus::Cancelled` write was clobbered before the FE could see it. The FE
635-
couldn't distinguish "search completed naturally with zero matches" from "search was cancelled mid-flight."
636-
Fix: stop nulling on cancel; let the thread write `Cancelled` and the next `search_start` replaces it.
633+
1. **file_viewer search-cancel was unobservable to the FE.** `search_cancel` nulled `session.search` immediately, so the
634+
spawned thread's `SearchStatus::Cancelled` write was clobbered before the FE could see it. The FE couldn't
635+
distinguish "search completed naturally with zero matches" from "search was cancelled mid-flight." Fix: stop nulling
636+
on cancel; let the thread write `Cancelled` and the next `search_start` replaces it.
637637
2. **Cancel-copy mid-operation rollback was lost on fast filesystems.** The Rust `Ok(())` arm in
638-
`copy_files_with_progress` didn't check `OperationIntent` before committing the transaction, so a click during
639-
the < 1 µs window between the last `is_cancelled` poll and loop exit landed as a no-op. Plus the Svelte
640-
`TransferProgressDialog` left the Rollback button enabled during the `MIN_DISPLAY_MS = 400 ms` settle window
641-
after `write-complete`, so clicks during settle were silent no-ops. Both fixed in Step 6d.
638+
`copy_files_with_progress` didn't check `OperationIntent` before committing the transaction, so a click during the <
639+
1 µs window between the last `is_cancelled` poll and loop exit landed as a no-op. Plus the Svelte
640+
`TransferProgressDialog` left the Rollback button enabled during the `MIN_DISPLAY_MS = 400 ms` settle window after
641+
`write-complete`, so clicks during settle were silent no-ops. Both fixed in Step 6d.
642642

643643
### Dead code removed
644644

645-
- `SmbVolume::ConnectionState::OsMount` variant — never written to the atomic, two unreachable `match` arms gone.
646-
The OS-mount fallback the UI renders lives at the outer `SmbConnectionState` enriched by `enrich_smb_connection_state`
647-
in `commands/volumes.rs`, not on this internal atomic.
645+
- `SmbVolume::ConnectionState::OsMount` variant — never written to the atomic, two unreachable `match` arms gone. The
646+
OS-mount fallback the UI renders lives at the outer `SmbConnectionState` enriched by `enrich_smb_connection_state` in
647+
`commands/volumes.rs`, not on this internal atomic.
648648

649649
### Honest verdict per technique
650650

651-
- **Mutation testing (cargo-mutants + stryker)**: zero bugs, 55 tests added. Tools work but are too slow / noisy for
652-
CI gating. Worth ad-hoc runs on numeric / state-machine modules. Don't wire into `check.sh`.
653-
- **State-machine coverage**: 17 tests, 1 real bug (file_viewer search-cancel). Highest signal-to-noise of the test-quality push — the bug was a real silent UX failure, surfaced by writing the transition test.
651+
- **Mutation testing (cargo-mutants + stryker)**: zero bugs, 55 tests added. Tools work but are too slow / noisy for CI
652+
gating. Worth ad-hoc runs on numeric / state-machine modules. Don't wire into `check.sh`.
653+
- **State-machine coverage**: 17 tests, 1 real bug (file_viewer search-cancel). Highest signal-to-noise of the
654+
test-quality push — the bug was a real silent UX failure, surfaced by writing the transition test.
654655
- **Property-based (proptest)**: 12 tests, 0 bugs. `proptest` added as a dev-dep, scoped to four targets
655-
(`topological_sort_bottom_up`, `glob_to_regex`, `split_scope_segments`, `platform_case_compare`). Worth keeping
656-
for those specific algorithmic spots; not worth a project-wide convention.
656+
(`topological_sort_bottom_up`, `glob_to_regex`, `split_scope_segments`, `platform_case_compare`). Worth keeping for
657+
those specific algorithmic spots; not worth a project-wide convention.
657658
- **IPC contract tests (vitest mockIPC)**: 23 tests, 0 bugs. Modest value — `bindings-fresh` + `no-raw-tauri-invoke`
658659
already cover most drift. Worth doing for destructive / cross-window surfaces (write ops, viewer, SMB); diminishing
659660
returns past that.
660-
- **E2E coverage extension**: 9 tests, 1 real bug (cancel-copy rollback). Surfaced by walking the slowest test
661-
(32.7 s) in the post-Step-1 report.
661+
- **E2E coverage extension**: 9 tests, 1 real bug (cancel-copy rollback). Surfaced by walking the slowest test (32.7 s)
662+
in the post-Step-1 report.
662663

663664
### Outstanding items
664665

665666
- **Linux SMB flakes**: `50-share host shows correct share count` and `unicode shares render correctly` — flake under
666667
GVFS race in Docker. Pre-existing, David is aware. Not addressed in this branch.
667668
- **Step 6a parallel-load keystroke-dispatch flakes**: rarely surface on warm runs; Step 6e converted the worst
668669
offenders to `dispatchMenuCommand`. Three remaining keyboard-pathway tests can flake under heavy parallel load
669-
(~1-in-N runs). The Step 6a fix-suggestions list (data-app-ready route-change reset, focus re-issue after click)
670-
is a candidate for a follow-up but wasn't load-bearing here.
670+
(~1-in-N runs). The Step 6a fix-suggestions list (data-app-ready route-change reset, focus re-issue after click) is a
671+
candidate for a follow-up but wasn't load-bearing here.
671672

672673
### Final validation
673674

674675
- `./scripts/check.sh` (fast pass): **green in 3m 13s**, 1728 Rust + 1812 Svelte tests pass.
675-
- `./scripts/check.sh --only-slow`: **green except the two pre-existing Linux SMB flakes**. E2E Playwright
676-
131/131 in 4m 18s across 3 shards; rust-tests-linux 1699/1699; eslint-typecheck 453/453.
676+
- `./scripts/check.sh --only-slow`: **green except the two pre-existing Linux SMB flakes**. E2E Playwright 131/131 in 4m
677+
18s across 3 shards; rust-tests-linux 1699/1699; eslint-typecheck 453/453.

0 commit comments

Comments
 (0)