Problem
Between full E2E evaluations, there is no quick sanity check that the nightshift CLI actually works. Unit tests can pass while the CLI itself is broken (import errors, missing config, bad shell commands). We need a fast smoke test after every merge.
What needs to happen
- Add a post-merge smoke test to evolve.md Step 9 (post-merge health check). After `make check` passes:
  - Run `python3 -m nightshift run --dry-run --agent claude` (already in Step 5 but often skipped)
  - Run `python3 -m nightshift run --dry-run --agent codex`
  - Both must exit 0
- Update evolve-auto.md: add SMOKE TEST RULE: "Dry-run is mandatory post-merge, not optional."
- Update scripts/smoke-test.sh if needed: ensure it can run headless without interactive prompts.
- Update docs as needed.
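
A minimal sketch of the check described above, for discussion only. It assumes the two dry-run commands are exactly as written in this issue. The helper names (`run_smoke_test`, `main`), the 300-second timeout, and the use of `sys.executable` in place of `python3` are hypothetical choices, not existing nightshift code. Stdin is redirected from the null device so a command that tries to prompt fails instead of hanging.

```python
"""Post-merge smoke test sketch: run each dry-run command headlessly."""
import subprocess
import sys

# Commands taken from the issue text. sys.executable stands in for python3
# (assumption: the smoke test runs under the same interpreter as nightshift).
DRY_RUN_COMMANDS = [
    [sys.executable, "-m", "nightshift", "run", "--dry-run", "--agent", "claude"],
    [sys.executable, "-m", "nightshift", "run", "--dry-run", "--agent", "codex"],
]


def run_smoke_test(commands, timeout=300):
    """Run each command with no interactive stdin; return the commands that failed.

    A command counts as failed if it exits non-zero, exceeds the timeout
    (an assumed 300 seconds by default), or cannot be started at all.
    """
    failures = []
    for cmd in commands:
        try:
            result = subprocess.run(
                cmd,
                stdin=subprocess.DEVNULL,  # headless: never wait for typed input
                timeout=timeout,
            )
            ok = result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            ok = False
        if not ok:
            failures.append(cmd)
    return failures


def main():
    """Return 0 if every dry-run exited 0, otherwise 1."""
    failed = run_smoke_test(DRY_RUN_COMMANDS)
    for cmd in failed:
        print("SMOKE TEST FAILED:", " ".join(cmd), file=sys.stderr)
    return 1 if failed else 0
```

A wrapper script could end with `sys.exit(main())` so that Step 9 fails whenever either dry-run does not exit 0.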
Acceptance Criteria