🐛 Add cancelling state to mission cancellation flow#4143
Conversation
) The mission cancel flow previously jumped straight from 'running' to 'failed' without an intermediate state or backend confirmation. This introduces a 'cancelling' state that provides visual feedback while waiting for backend acknowledgment before transitioning to the final 'failed' state. - Add 'cancelling' to the MissionStatus type union - Set status to 'cancelling' with a system message on cancel request - Handle cancel_ack/cancel_confirmed messages from backend - Handle terminal messages (result/error/stream-done) during cancelling - Add 10s safety-net timeout if backend never acknowledges - HTTP fallback uses response status to finalize cancellation - Finalize stuck 'cancelling' missions on page reload - Show orange spinner and "Cancelling..." label in chat header, list items - Disable cancel button during cancelling state (already in progress) - Update tests to verify cancelling -> failed transition via ack/timeout Signed-off-by: Andrew Anderson <andy@clubanderson.com>
✅ Deploy Preview for kubestellarconsole ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Introduces an intermediate cancelling mission status so the UI can reflect a pending cancellation while the frontend waits for backend acknowledgment (or a safety-net timeout) before finalizing to failed.
Changes:
- Added
cancellingtoMissionStatusand UI status config (orange spinner + “Cancelling…” label). - Updated cancellation flow to set
cancellingimmediately, wait forcancel_ack/cancel_confirmed(or terminal messages), and enforce a 10s timeout. - Added/updated tests to cover ack-based finalization, timeout finalization, and HTTP fallback behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| web/src/hooks/useMissions.tsx | Adds cancelling state, cancel-ack handling, timeout safety net, and unmount cleanup. |
| web/src/hooks/useMissions.test.tsx | Updates existing expectations and adds tests for ack + timeout cancellation finalization. |
| web/src/components/layout/mission-sidebar/types.ts | Adds cancelling to STATUS_CONFIG for consistent UI labels/colors. |
| web/src/components/layout/mission-sidebar/MissionListItem.tsx | Shows an orange spinner icon on list items while cancelling. |
| web/src/components/layout/mission-sidebar/MissionChat.tsx | Displays cancelling state in chat header/footer and animates status icon. |
| if (isTerminalMessage) { | ||
| pendingRequests.current.delete(message.id) | ||
| finalizeCancellation(missionId, 'Mission cancelled by user.') | ||
| return m // finalizeCancellation handles the state update via setMissions | ||
| } | ||
| // Ignore non-terminal messages (progress, partial stream) while cancelling | ||
| return m |
There was a problem hiding this comment.
This calls setMissions (via finalizeCancellation) from inside another setMissions functional updater. That introduces nested state updates and makes update ordering harder to reason about (and can lead to extra renders). Prefer handling this path outside the setMissions(prev => ...) call (e.g., check the current mission status via missionsRef.current and early-return after calling finalizeCancellation), or update the mission to failed directly within this updater and keep finalizeCancellation as a pure helper that doesn’t call setMissions.
| if (isTerminalMessage) { | |
| pendingRequests.current.delete(message.id) | |
| finalizeCancellation(missionId, 'Mission cancelled by user.') | |
| return m // finalizeCancellation handles the state update via setMissions | |
| } | |
| // Ignore non-terminal messages (progress, partial stream) while cancelling | |
| return m | |
| if (!isTerminalMessage) { | |
| // Ignore non-terminal messages (progress, partial stream) while cancelling | |
| return m | |
| } | |
| // Terminal message while cancelling: treat as cancellation confirmation and | |
| // finalize the mission state here to avoid nested setMissions calls. | |
| pendingRequests.current.delete(message.id) | |
| return { | |
| ...m, | |
| status: 'failed', | |
| updatedAt: new Date(), | |
| } |
| // Finalize a cancelling mission — transitions from 'cancelling' to 'failed' | ||
| // and clears any pending cancel timeout. | ||
| const finalizeCancellation = useCallback((missionId: string, message: string) => { | ||
| // Clear the timeout if one is pending | ||
| const timeout = cancelTimeouts.current.get(missionId) | ||
| if (timeout) { | ||
| clearTimeout(timeout) | ||
| cancelTimeouts.current.delete(missionId) | ||
| } | ||
|
|
||
| setMissions(prev => prev.map(m => | ||
| m.id === missionId && m.status === 'cancelling' ? { | ||
| ...m, | ||
| status: 'failed', | ||
| currentStep: undefined, | ||
| updatedAt: new Date(), | ||
| messages: [ | ||
| ...m.messages, | ||
| { | ||
| id: `msg-${Date.now()}`, | ||
| role: 'system', | ||
| content: message, | ||
| timestamp: new Date(), | ||
| } | ||
| ] | ||
| } : m | ||
| )) | ||
| }, []) |
There was a problem hiding this comment.
finalizeCancellation finalizes the mission state but does not clear pendingRequests entries for that mission. If the backend keeps sending messages for an in-flight request (or sends a late terminal message after the timeout), those messages can still be routed to this mission via pendingRequests even after it’s marked failed, and the map entry may linger indefinitely. Consider purging all pendingRequests.current entries whose value matches missionId as part of finalization (or when entering cancelling), so late/stray messages don’t update a cancelled mission and the map doesn’t leak entries.
| // Safety-net timeout: if the backend never acknowledges, finalize after CANCEL_ACK_TIMEOUT_MS | ||
| const timeoutHandle = setTimeout(() => { | ||
| cancelTimeouts.current.delete(missionId) | ||
| finalizeCancellation(missionId, 'Mission cancelled by user (backend did not confirm cancellation in time).') | ||
| }, CANCEL_ACK_TIMEOUT_MS) | ||
| cancelTimeouts.current.set(missionId, timeoutHandle) | ||
| }, [finalizeCancellation]) |
There was a problem hiding this comment.
If cancelMission(missionId) is triggered multiple times (double-click, repeated UI events, etc.), this will schedule multiple timeouts for the same mission without clearing the previous handle, which can lead to duplicate finalization messages and unnecessary work. Before creating a new timeout, clear any existing handle already stored in cancelTimeouts.current for the same missionId.
| const { result } = renderHook(() => useMissions(), { wrapper }) | ||
| const { missionId } = await startMissionWithConnection(result) | ||
|
|
||
| act(() => { | ||
| result.current.cancelMission(missionId) | ||
| }) | ||
| expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling') | ||
|
|
||
| // Advance past the cancel ack timeout (10s) | ||
| act(() => { | ||
| vi.advanceTimersByTime(10_000) | ||
| }) | ||
|
|
||
| const mission = result.current.missions.find(m => m.id === missionId) | ||
| expect(mission?.status).toBe('failed') | ||
| const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? [] | ||
| expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true) | ||
|
|
||
| vi.useRealTimers() |
There was a problem hiding this comment.
Timer mode cleanup is currently at the end of the test body; if an assertion throws before line 509, fake timers may leak into subsequent tests. Use a try/finally pattern so vi.useRealTimers() always runs even when the test fails.
| const { result } = renderHook(() => useMissions(), { wrapper }) | |
| const { missionId } = await startMissionWithConnection(result) | |
| act(() => { | |
| result.current.cancelMission(missionId) | |
| }) | |
| expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling') | |
| // Advance past the cancel ack timeout (10s) | |
| act(() => { | |
| vi.advanceTimersByTime(10_000) | |
| }) | |
| const mission = result.current.missions.find(m => m.id === missionId) | |
| expect(mission?.status).toBe('failed') | |
| const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? [] | |
| expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true) | |
| vi.useRealTimers() | |
| try { | |
| const { result } = renderHook(() => useMissions(), { wrapper }) | |
| const { missionId } = await startMissionWithConnection(result) | |
| act(() => { | |
| result.current.cancelMission(missionId) | |
| }) | |
| expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling') | |
| // Advance past the cancel ack timeout (10s) | |
| act(() => { | |
| vi.advanceTimersByTime(10_000) | |
| }) | |
| const mission = result.current.missions.find(m => m.id === missionId) | |
| expect(mission?.status).toBe('failed') | |
| const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? [] | |
| expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true) | |
| } finally { | |
| vi.useRealTimers() | |
| } |
|
👋 Hey @clubanderson — thanks for opening this PR!
This is an automated message. |
|
Thank you for your contribution! Your PR has been merged. Check out what's new:
Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey |
🔄 Auto-Applying Copilot Code ReviewCopilot code review found 2 code suggestion(s) and 2 general comment(s). @copilot Please apply all of the following code review suggestions:
Also address these general comments:
Push all fixes in a single commit. Run Auto-generated by copilot-review-apply workflow. |
- server.go: Restrict query param token to WebSocket upgrades only, update stale comments for validateToken and matchOrigin (#4099) - useMissions.tsx: Guard against double-cancel with timeout map check (#4143) - mcp.go: Nil guard on ListWorkloads result before accessing Items (#4145) - workload.go: Remove redundant len(nodes)>0 guard after early continue (#4146) - workload_scaling_test.go: Rename test to ZeroNodeCluster (not UnreachableCluster) (#4146) Signed-off-by: Andrew Anderson <andy@clubanderson.com>
…4158) - server.go: Restrict query param token to WebSocket upgrades only, update stale comments for validateToken and matchOrigin (#4099) - useMissions.tsx: Guard against double-cancel with timeout map check (#4143) - mcp.go: Nil guard on ListWorkloads result before accessing Items (#4145) - workload.go: Remove redundant len(nodes)>0 guard after early continue (#4146) - workload_scaling_test.go: Rename test to ZeroNodeCluster (not UnreachableCluster) (#4146) Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Summary
Fixes #4123, #4125, #4126.
The mission cancel flow previously jumped straight from
runningtofailedwithout an intermediate state or backend confirmation. This introduces acancellingstate in the mission status state machine that provides immediate visual feedback while waiting for backend acknowledgment before transitioning to the finalfailedstate.Changes:
cancellingto theMissionStatustype union andSTATUS_CONFIG(orange spinner, "Cancelling..." label)cancelMission()now sets status tocancellingimmediately, sends the cancel signal (WS or HTTP), and starts a 10s safety-net timeoutcancel_ack/cancel_confirmedmessage types, or by detecting terminal messages (result/error/stream.done) while incancellingstatecancellingafter a page reload are finalized tofailedduring localStorage hydrationcancelling(replaced with orange spinner), chat footer shows cancelling state, list items show animated spinnerTest plan
npx tsc --noEmit)vitest run useMissions.test.tsx)npm run build)Closes #4123
Closes #4125
Closes #4126