Extract shared test infrastructure: deduplicate mock providers and setup boilerplate

## Summary

Test code is duplicated in three ways across the codebase: `test_artifact_dir()` exists as identical private helpers in 2 files plus 10 inline copies in a third, mock `AgentSession` implementations are near-identical in 2 files, and 30+ executor tests repeat the same 3-line setup block.

**Severity: S2 (Medium)** | **Confidence: 0.95** | **Blast radius: Low** (test-only)

## Technical Details

### 1. `test_artifact_dir()` — 12 copies across 3 files

**Identical private helpers:**
- `src/executor.rs:1036` — `fn test_artifact_dir() -> (TempDir, ArtifactDir)`
- `src/report.rs:315` — `fn test_artifact_dir() -> (TempDir, ArtifactDir)`

**Inline copies (no helper, pattern repeated directly):**
- `src/command.rs` — **10 instances** at lines 654-657, 676-679, 697-700, 720-723, 738-741, 756-759, 784-787, 817-820, 838-841, 858-861:
  ```rust
  let tmp = tempfile::tempdir().unwrap();
  let run_id = RunId("test-run".to_string());
  let artifact_dir = ArtifactDir::from_run_id(tmp.path(), &run_id);
  artifact_dir.create_all().unwrap();
  ```
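
For illustration, the four inline lines could collapse into a single helper along these lines. This is a minimal std-only sketch, not the crate's code: `TempDirGuard` is a hypothetical stand-in for `tempfile::TempDir`, and the `RunId` / `ArtifactDir` definitions (including the `root()` accessor and the directory layout) are assumptions made so the block compiles on its own.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for `tempfile::TempDir`: deletes the directory on drop.
pub struct TempDirGuard {
    path: PathBuf,
}

impl TempDirGuard {
    pub fn new() -> std::io::Result<Self> {
        // Process id plus a counter keeps parallel tests from colliding.
        static COUNTER: AtomicUsize = AtomicUsize::new(0);
        let n = COUNTER.fetch_add(1, Ordering::Relaxed);
        let path = std::env::temp_dir()
            .join(format!("artifact-test-{}-{}", std::process::id(), n));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempDirGuard {
    fn drop(&mut self) {
        // Best-effort cleanup; a failed removal is ignored.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Assumed shapes for the crate's `RunId` and `ArtifactDir`.
pub struct RunId(pub String);

pub struct ArtifactDir {
    root: PathBuf,
}

impl ArtifactDir {
    pub fn from_run_id(base: &Path, run_id: &RunId) -> Self {
        // Assumed layout: one directory per run id under `base`.
        Self { root: base.join(&run_id.0) }
    }

    pub fn create_all(&self) -> std::io::Result<()> {
        fs::create_dir_all(&self.root)
    }

    pub fn root(&self) -> &Path {
        &self.root
    }
}

/// The inline setup from `command.rs`, wrapped in one call.
pub fn test_artifact_dir() -> (TempDirGuard, ArtifactDir) {
    let tmp = TempDirGuard::new().unwrap();
    let run_id = RunId("test-run".to_string());
    let artifact_dir = ArtifactDir::from_run_id(tmp.path(), &run_id);
    artifact_dir.create_all().unwrap();
    (tmp, artifact_dir)
}
```

The guard is returned alongside the `ArtifactDir` because dropping it removes the directory. Callers should bind it to a named variable such as `_tmp`; binding it to a bare `_` drops it immediately.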

### 2. Mock AgentSession — 2 near-identical implementations

- `src/executor.rs:947-1001` — `MockSession` with `responses: Vec<...>`, `call_count: usize`
- `tests/pipeline_integration.rs:21-97` — `MockProvider` with identical struct layout and logic

Both implement `AgentSession` with the same stub methods for `initialize`, `start`, `send_bootstrap`, and `close`. Neither references the other.
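
A single shared mock might look roughly like the sketch below. The real `AgentSession` signatures are not shown in this issue, so the trait here is a hypothetical reconstruction: the `Result<_, String>` return types, the `send` method, and the `call_count()` accessor are all assumptions. Only the four stub method names and the `responses` / `call_count` fields come from the findings above.

```rust
/// Hypothetical shape of the crate's `AgentSession` trait.
pub trait AgentSession {
    fn initialize(&mut self) -> Result<(), String>;
    fn start(&mut self) -> Result<(), String>;
    fn send_bootstrap(&mut self) -> Result<(), String>;
    /// Assumed method: returns the next canned response.
    fn send(&mut self, prompt: &str) -> Result<String, String>;
    fn close(&mut self) -> Result<(), String>;
}

/// Mock that replays a fixed list of responses and counts calls.
pub struct MockSession {
    responses: Vec<Result<String, String>>,
    call_count: usize,
}

impl MockSession {
    pub fn new(responses: Vec<Result<String, String>>) -> Self {
        Self { responses, call_count: 0 }
    }

    pub fn call_count(&self) -> usize {
        self.call_count
    }
}

impl AgentSession for MockSession {
    // Lifecycle methods are stubs that always succeed.
    fn initialize(&mut self) -> Result<(), String> { Ok(()) }
    fn start(&mut self) -> Result<(), String> { Ok(()) }
    fn send_bootstrap(&mut self) -> Result<(), String> { Ok(()) }

    fn send(&mut self, _prompt: &str) -> Result<String, String> {
        // Replay responses in order; report an error once they run out.
        let response = self
            .responses
            .get(self.call_count)
            .cloned()
            .unwrap_or_else(|| Err("no more mock responses".to_string()));
        self.call_count += 1;
        response
    }

    fn close(&mut self) -> Result<(), String> { Ok(()) }
}
```

Accepting `Result` values in the response list lets one mock cover both success and failure paths, which is one way to avoid keeping a second variant around.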

### 3. Per-test setup boilerplate — 30+ copies

Every executor unit test repeats:
```rust
let steps = test_steps();
let (run_id, session_id) = test_run_ids();
let (_tmp, artifact_dir) = test_artifact_dir();
// Then: execute_steps(..., None, None, None, Path::new("."), &AtomicBool::new(false))
```

The 5 trailing `None, None, None, Path::new("."), &AtomicBool::new(false)` arguments are identical across all 30+ call sites.

## Proposed Fix

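1. Create `src/test_helpers.rs` and declare it in the crate root behind `#[cfg(test)]`:
   ```rust
   // In the crate root (lib.rs or main.rs):
   #[cfg(test)]
   pub mod test_helpers;

   // In src/test_helpers.rs (no nested `mod` block, otherwise the path
   // becomes crate::test_helpers::test_helpers):
   pub fn test_artifact_dir() -> (TempDir, ArtifactDir) { ... }
   pub struct MockSession { ... }
   impl AgentSession for MockSession { ... }
   ```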
2. Replace the 10 inline copies in `command.rs` and the 2 private helpers in `executor.rs` and `report.rs` with the shared helper, imported via `use crate::test_helpers::*`
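3. Replace `MockProvider` in `pipeline_integration.rs` with the shared mock. Integration tests under `tests/` link against the crate built without `cfg(test)`, so a `#[cfg(test)]` module is not visible to them; the mock would need to be exposed another way, for example behind a Cargo feature or through a shared `tests/common` module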
4. Introduce a `TestCtx` builder for executor tests:
   ```rust
   struct TestCtx { run_id, session_id, artifact_dir, ... }
   impl TestCtx {
       fn run_steps(&self, session: &mut MockSession, steps: Vec<Step>) -> Result<...> { ... }
   }
   ```
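
One possible shape for the `TestCtx` idea is sketched below, under heavy assumptions. The real `execute_steps` signature is elided in this issue, so the stand-in here only mirrors the trailing-argument pattern quoted above (`None, None, None, Path::new("."), &AtomicBool::new(false)`). The `Option<&str>` parameter types, the `Result<usize, String>` return type, the `with_cancelled` override, and the omission of the session argument are all hypothetical.

```rust
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for the crate's `Step` type.
#[derive(Clone, Debug, PartialEq)]
pub struct Step(pub String);

/// Stand-in for `execute_steps`: counts steps until cancellation is seen.
/// The three `Option` parameters are placeholders of unknown real type.
pub fn execute_steps(
    steps: &[Step],
    _opt_a: Option<&str>,
    _opt_b: Option<&str>,
    _opt_c: Option<&str>,
    _workdir: &Path,
    cancelled: &AtomicBool,
) -> Result<usize, String> {
    let mut done = 0;
    for _step in steps {
        if cancelled.load(Ordering::SeqCst) {
            return Err("cancelled".to_string());
        }
        done += 1;
    }
    Ok(done)
}

/// Owns the defaults that every call site currently repeats.
pub struct TestCtx {
    workdir: PathBuf,
    cancelled: AtomicBool,
}

impl TestCtx {
    pub fn new() -> Self {
        Self {
            workdir: PathBuf::from("."),
            cancelled: AtomicBool::new(false),
        }
    }

    /// Builder-style override for tests that exercise cancellation.
    pub fn with_cancelled(self, value: bool) -> Self {
        self.cancelled.store(value, Ordering::SeqCst);
        self
    }

    /// Supplies the five trailing arguments so tests pass only the steps.
    pub fn run_steps(&self, steps: &[Step]) -> Result<usize, String> {
        execute_steps(steps, None, None, None, &self.workdir, &self.cancelled)
    }
}
```

With a wrapper like this, a change to the trailing parameters of `execute_steps` touches one method instead of 30+ call sites, and the few tests that need non-default values can opt in through builder methods.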

**Estimated effort:** ~6 hours

## Related

- Related to F5: test code makes up 1,060 of executor.rs's 1,918 lines
- Part of refactor Bundle 3: Shared Test Infrastructure

---

> 🔍 *Found by [vibe-code-audit](https://github.com/codesoda/vibe-code-audit) — automated codebase audit skill for Claude Code.*
