feat(studio): Stop run button + graceful CLI interrupt #1228
Track long-lived provider subprocesses (claude, codex, pi, copilot,
vscode) in a per-process registry and walk it from a top-level signal
handler in cli.ts. Without this, Studio's child.kill('SIGTERM') against
the CLI orphans grandchildren — the Node parent exits but the OS does
not propagate the signal.
Plan in docs/plans/1222-stop-run.md.
- Pre-existing format errors in two studio routes block any push; fixed with biome --write so CI passes.
- child-tracker uses a structural Killable type to avoid a tsup dts resolution failure on `node:child_process` re-export.
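A minimal sketch of what such a registry could look like, assuming a `Set`-backed module. The names `trackChild` and `killAll` are hypothetical and are not taken from the actual child-tracker source; only the structural `Killable` idea comes from the commit message.

```typescript
// Hypothetical sketch of a per-process child registry. Function names are
// illustrative assumptions, not the real child-tracker API.

// Structural type: anything with a kill() method qualifies, so this module
// never has to re-export ChildProcess from node:child_process.
interface Killable {
  kill(signal?: string): boolean;
}

const tracked = new Set<Killable>();

/** Register a child and return a function that unregisters it again. */
function trackChild(child: Killable): () => void {
  tracked.add(child);
  return () => {
    tracked.delete(child);
  };
}

/** Send `signal` to every tracked child; returns how many accepted it. */
function killAll(signal: string = 'SIGTERM'): number {
  let delivered = 0;
  for (const child of Array.from(tracked)) {
    try {
      if (child.kill(signal)) delivered += 1;
    } catch (_err) {
      // The child may already have exited; ignore it and keep walking.
    }
  }
  tracked.clear();
  return delivered;
}
```

Under these assumptions, a provider would call `trackChild(child)` right after spawning and invoke the returned function from its own `close` handler, while the top-level `SIGINT`/`SIGTERM` handler in cli.ts would call `killAll('SIGTERM')` before exiting.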
Deploying agentv with Cloudflare Pages

| Latest commit: | e1cfcc0 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2bc6ebb2.agentv.pages.dev |
| Branch Preview URL: | https://feat-1222-stop-run.agentv.pages.dev |
- DELETE /api/eval/run/:id (and benchmark-scoped variant) SIGTERMs the
spawned CLI. The CLI's own signal handler walks the child registry
added in the previous commit and kills grandchildren before exiting.
Existing child.on('close') flips status — no new 'stopping' state.
- StopRunButton on /jobs/:runId, hidden when terminal or read-only.
Optimistic 'Stopping…' label until the next status poll observes a
terminal state.
- planned_test_count persisted in benchmark.json.metadata via a stub
written at run start. Resume-action visibility now triggers when
results.length < planned_test_count even with no execution_error
rows — covers Stop-button / Ctrl+C cases.
- Narrow tests: shouldShowStopButton matrix, partial-run resume helper,
DELETE 404/403 routes for both base + benchmark-scoped paths. Happy
path of DELETE→SIGTERM is covered by manual UAT.
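The visibility rule and the optimistic label described above could be sketched roughly as follows. The status vocabulary and the `stopButtonLabel` helper are assumptions made for illustration; this commit does not list the real status values or the actual signature of `shouldShowStopButton`.

```typescript
// Assumed status vocabulary; the real run statuses are not listed here.
type RunStatus = 'starting' | 'running' | 'finished' | 'failed';

const TERMINAL_STATUSES: ReadonlyArray<RunStatus> = ['finished', 'failed'];

function isTerminal(status: RunStatus): boolean {
  return TERMINAL_STATUSES.indexOf(status) !== -1;
}

// Hidden when the run is already terminal or the Studio is read-only.
function shouldShowStopButton(status: RunStatus, isReadOnly: boolean): boolean {
  return !isReadOnly && !isTerminal(status);
}

// Hypothetical helper: keep showing 'Stopping…' after the click until a
// status poll reports a terminal state.
function stopButtonLabel(stopRequested: boolean, status: RunStatus): string {
  return stopRequested && !isTerminal(status) ? 'Stopping…' : 'Stop';
}
```

Keeping both decisions as pure functions is what makes a narrow test matrix over status and read-only mode cheap to write.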
Stop is part of the stop → resume → complete workflow, not a destructive cancel — DELETE semantics were wrong. Switched to POST /api/eval/run/:id/stop (and benchmark-scoped variant), kept the idempotent-on-terminal behavior so clients can fire-and-forget. UI: removed red destructive styling on the Stop button. Now neutral gray with a pause glyph to signal "this is a pause, not a kill."
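As a rough illustration of the idempotent stop semantics, the decision logic behind `POST /api/eval/run/:id/stop` might look like the sketch below. The `TrackedRun` shape, the error bodies, the `{stopped: true}` success body, and the assumption that 403 corresponds to read-only mode are all hypothetical; only the `already_terminal` response is taken from this PR.

```typescript
// Hypothetical shapes; the real route handler and run store are not shown
// in this PR text.
interface Killable {
  kill(signal?: string): boolean;
}

interface TrackedRun {
  child: Killable;
  terminal: boolean; // true once child.on('close') has flipped the status
}

interface StopResponse {
  status: number;
  body: { stopped?: boolean; reason?: string; error?: string };
}

function stopRun(
  runs: Map<string, TrackedRun>,
  runId: string,
  isReadOnly: boolean,
): StopResponse {
  // Assumption: 403 means the Studio is in read-only mode.
  if (isReadOnly) return { status: 403, body: { error: 'read_only' } };

  const run = runs.get(runId);
  if (!run) return { status: 404, body: { error: 'not_found' } };

  // Idempotent on terminal runs so clients can fire-and-forget.
  if (run.terminal) {
    return { status: 200, body: { stopped: false, reason: 'already_terminal' } };
  }

  // No new state is introduced: the existing 'close' handler flips status.
  run.child.kill('SIGTERM');
  return { status: 200, body: { stopped: true } };
}
```

Because the function only signals the child and never mutates the run's status itself, a repeated stop request before the child exits simply sends SIGTERM again, which is harmless under these assumptions.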
Manual red/green UAT: evidence
Red baseline (origin/main @ 0bab7a3, before these changes)
No
export function shouldShowResumeActions(results, isReadOnly): boolean {
if (isReadOnly) return false;
return results.some((r) => r.executionStatus === 'execution_error');
}
So a run interrupted after a few clean passes is invisible to Resume. This is the gap the issue called out.
Green (this branch)
G1: CLI Ctrl+C kills providers, partial JSONL preserved, stub benchmark.json carries
G2: POST /api/eval/run/:id/stop terminates the spawned CLI:
Idempotency / 404 / 403 on POST /stop:
Run-detail API surfaces
Pre-push hooks
Closes #1222.
Simplified scope
Per the user's revision: make Ctrl+C work, expose the same kill via HTTP, add a button that calls it. No AbortSignal threading, no "let in-flight tests finish", no staged shutdown, no new status words.
- `process.on('SIGINT'|'SIGTERM')` walks a `child-tracker` registry that providers populate, sends SIGTERM to each, and exits. Partial `index.jsonl` is already row-by-row durable.
- `POST /api/eval/run/:id/stop` calls `child.kill('SIGTERM')`. Existing `child.on('close')` flips status. Idempotent: terminal runs return `{stopped: false, reason: 'already_terminal'}`.
- Button (`⏸ Stop`) on `/jobs/:runId` with optimistic "Stopping…" label. Part of the stop→resume workflow, not a destructive cancel.
- `planned_test_count` in `benchmark.json.metadata` at run start; client-side comparison `results.length < planned_test_count`. No `is_resumable` / `resume_reason`.

Plan: `docs/plans/1222-stop-run.md`.

Drive-by: fixed two pre-existing biome format errors that were blocking pre-push hooks.
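The client-side resume comparison could be sketched as an extension of the baseline helper. The third parameter and the `ResultRow` type are assumptions for illustration; the actual signature on this branch is not shown in the PR text.

```typescript
// Hypothetical row shape; only executionStatus is known from the baseline.
interface ResultRow {
  executionStatus: string;
}

function shouldShowResumeActions(
  results: ResultRow[],
  isReadOnly: boolean,
  plannedTestCount?: number, // assumed to come from benchmark.json metadata
): boolean {
  if (isReadOnly) return false;

  // Baseline behaviour: any execution_error row makes the run resumable.
  if (results.some((r) => r.executionStatus === 'execution_error')) return true;

  // New behaviour: fewer rows than planned means the run was interrupted,
  // for example by the Stop button or Ctrl+C, even if every row passed.
  return plannedTestCount !== undefined && results.length < plannedTestCount;
}
```

Treating a missing `planned_test_count` as "not resumable" keeps older runs, written before the stub existed, on the baseline behaviour.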
Test plan
- `index.jsonl` preserved.
- `POST /api/eval/run/:id/stop` returns 200 / 403 / 404 paths; idempotent on terminal runs.
- `ok` rows shows Resume; complete run does not.

🤖 Generated with Claude Code