Summary
Long-running agents sometimes need to be interrupted. The earlier "Cancel" button only aborted the client-side fetch; the server-side task kept running until completion. The only way to actually stop a run today is to restart the api container, which kills every other in-flight run too.
Details
Problem:
- No surgical kill exists for a specific run.
- The server-side asyncio.Task keeps running after the browser disconnects.
- task.cancel() is too aggressive: it raises CancelledError mid-await inside LLM HTTP calls, leaving httpx connections in unclean states.
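To illustrate the task.cancel() concern, here is a minimal sketch of where the CancelledError lands when a task is cancelled while awaiting a response. fake_llm_call is a hypothetical stand-in that sleeps instead of issuing a real httpx request, so it shows only the interruption point, not the resulting connection state.

```python
import asyncio

events = []

async def fake_llm_call():
    # Hypothetical stand-in for an HTTP request to an LLM provider.
    events.append("request sent")
    try:
        await asyncio.sleep(10)  # waiting on the response
        events.append("response received")
    except asyncio.CancelledError:
        # task.cancel() surfaces here, in the middle of the await.
        events.append("cancelled mid-await")
        raise

async def main():
    task = asyncio.create_task(fake_llm_call())
    await asyncio.sleep(0.05)  # give the call time to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return events

result = asyncio.run(main())
print(result)  # ['request sent', 'cancelled mid-await']
```

The response is never received: the coroutine is torn down at whatever await it happened to be suspended on, which is the behavior the cooperative approach avoids.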
Proposed solution
Cooperative cancellation, layered:
- CancelToken via contextvar (same pattern as Trace). The agent runtime checks should_stop() between operations.
- Enforcement at structural boundaries. Wrap tools.call_llm, tools.data_store.*, and gofannon_client.call with an entry check: if stopping, raise AgentStopped immediately without executing. In-flight LLM calls finish naturally; only the next attempt to do anything observable raises.
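The two layers could be sketched roughly as follows. CancelToken, should_stop, and AgentStopped are the names used in this issue; the request_stop method, the stop_guarded decorator, and the simulated call_llm are assumptions made for illustration and do not reflect the existing tools module.

```python
import asyncio
import contextvars
import functools

class AgentStopped(Exception):
    """Raised when a guarded operation is attempted after a stop request."""

class CancelToken:
    def __init__(self):
        self._stopping = False

    def request_stop(self):  # hypothetical method name
        self._stopping = True

    def should_stop(self):
        return self._stopping

# Holds the token for the run executing in the current context.
_current_token = contextvars.ContextVar("cancel_token", default=None)

def should_stop():
    token = _current_token.get()
    return token is not None and token.should_stop()

def stop_guarded(fn):
    """Entry check: refuse to start the operation once a stop is requested."""
    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        if should_stop():
            raise AgentStopped(fn.__name__)
        return await fn(*args, **kwargs)
    return wrapper

@stop_guarded
async def call_llm(prompt):
    # Simulated LLM call; a real one would go through httpx.
    await asyncio.sleep(0.2)
    return f"reply to {prompt!r}"

async def run_agent(token):
    _current_token.set(token)
    first = await call_llm("step 1")  # the stop request arrives during this call
    try:
        await call_llm("step 2")      # the entry check raises here
    except AgentStopped:
        return first, "stopped"
    return first, "completed"

async def demo():
    token = CancelToken()
    task = asyncio.create_task(run_agent(token))
    await asyncio.sleep(0.05)
    token.request_stop()
    return await task

outcome = asyncio.run(demo())
print(outcome)  # ("reply to 'step 1'", 'stopped')
```

The first call completes and its result is kept; only the second call, attempted after the stop request, raises.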
UI:
- Stop button next to Run; disabled when no run is in flight.
- While stopping (after click, before halt): button is disabled and shows "Stopping… (after current LLM call completes)".
- Run's outcome becomes a third status, stopped, shown with a neutral chip color in the Progress Log (gray with a stop icon, not red).
Stop semantics for chained agents:
When agent X is stopping and X has called Y, Y stops too. Stop means the whole tree. Contextvar makes this trivial.
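A small sketch of why the contextvar covers the whole tree, assuming Y runs in the same process as X (either awaited directly or as a child asyncio task, which receives a copy of the parent's context). The agent_x and agent_y coroutines and the check_stop helper are hypothetical.

```python
import asyncio
import contextvars

class AgentStopped(Exception):
    pass

class CancelToken:
    def __init__(self):
        self.stopping = False

_token = contextvars.ContextVar("cancel_token")

def check_stop(where):
    # Hypothetical helper: raise if the run in this context is stopping.
    if _token.get().stopping:
        raise AgentStopped(where)

log = []

async def agent_y():
    for step in range(3):
        check_stop(f"Y step {step}")
        log.append(f"Y step {step}")
        await asyncio.sleep(0.1)

async def agent_x():
    log.append("X start")
    # The child task gets a copy of X's context, and the copy refers to
    # the same CancelToken object, so Y observes the stop request too.
    await asyncio.create_task(agent_y())
    log.append("X done")

async def demo():
    token = CancelToken()
    _token.set(token)
    run = asyncio.create_task(agent_x())
    await asyncio.sleep(0.15)
    token.stopping = True
    try:
        await run
        return "completed"
    except AgentStopped as exc:
        return f"stopped at {exc}"

status = asyncio.run(demo())
print(status)  # stopped at Y step 2
print(log)     # ['X start', 'Y step 0', 'Y step 1']
```

No token is passed explicitly from X to Y. If Y were invoked over the network instead, the token would not travel with the call and some other propagation mechanism would be needed.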
References
- File: webapp/packages/api/user-service/services/agent_trace.py (Trace contextvar pattern)
- File: webapp/packages/api/user-service/dependencies.py:_execute_agent_code
- File: webapp/packages/webui/src/pages/AgentCreationFlow/RunsScreen.jsx
- Tracker: FIXES.md item Q2 roadmap #6
Priority
Medium - Depends on ISSUE-003 (run registry) for the cancel token to live somewhere addressable by run_id.
Acceptance Criteria
- CancelToken contextvar threaded through agent execution
- tools.call_llm, tools.data_store.*, gofannon_client.call check should_stop() on entry
- POST /runs/{run_id}/stop sets cancel token, responds 202
- RunRecord.status = "stopped" distinguishable from error
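As a rough, framework-free sketch of the stop endpoint and a registry addressable by run_id: the RUNS dict, register_run, handle_stop, and finish_run are hypothetical, as are the 404 response and every status name other than "stopped". The real registry is the subject of ISSUE-003 and may look quite different.

```python
class CancelToken:
    def __init__(self):
        self.stopping = False

# Hypothetical in-memory run registry keyed by run_id.
RUNS = {}

def register_run(run_id):
    token = CancelToken()
    RUNS[run_id] = {"status": "running", "token": token}
    return token

def handle_stop(run_id):
    """Sketch of POST /runs/{run_id}/stop; returns (status_code, body)."""
    record = RUNS.get(run_id)
    if record is None:
        return 404, {"detail": "unknown run"}  # assumed behavior
    record["token"].stopping = True
    # 202 Accepted: the stop has been requested but the run has not halted yet.
    return 202, {"run_id": run_id, "status": record["status"]}

def finish_run(run_id, error=None, stopped=False):
    """Record the final outcome, keeping "stopped" distinct from "error"."""
    record = RUNS[run_id]
    if stopped:
        record["status"] = "stopped"
    elif error is not None:
        record["status"] = "error"
    else:
        record["status"] = "completed"

token = register_run("run-1")
code, body = handle_stop("run-1")
finish_run("run-1", stopped=token.stopping)
print(code, RUNS["run-1"]["status"])  # 202 stopped
```

Responding 202 rather than 200 reflects that the run only halts at its next guarded operation, which matches the "Stopping…" state in the UI.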