
🐛 Add cancelling state to mission cancellation flow #4143

Merged
clubanderson merged 1 commit into main from fix/mission-cancellation on Apr 1, 2026

Conversation

@clubanderson
Collaborator

Summary

Fixes #4123, #4125, #4126.

The mission cancel flow previously jumped straight from running to failed without an intermediate state or backend confirmation. This introduces a cancelling state in the mission status state machine that provides immediate visual feedback while waiting for backend acknowledgment before transitioning to the final failed state.

Changes:

  • Add cancelling to the MissionStatus type union and STATUS_CONFIG (orange spinner, "Cancelling..." label)
  • cancelMission() now sets status to cancelling immediately, sends the cancel signal (WS or HTTP), and starts a 10s safety-net timeout
  • Backend acknowledgment handled via cancel_ack/cancel_confirmed message types, or by detecting terminal messages (result/error/stream.done) while in cancelling state
  • HTTP fallback uses the response status to finalize cancellation (success vs error vs unreachable)
  • Missions stuck in cancelling after a page reload are finalized to failed during localStorage hydration
  • UI: cancel button hidden during cancelling (replaced with orange spinner), chat footer shows cancelling state, list items show animated spinner
  • Non-terminal messages (progress, partial stream chunks) are ignored while cancelling
  • All cancel timeout handles are cleaned up on provider unmount
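The transitions listed above can be read as a small state machine. The following is a minimal sketch of that reading, not the code in useMissions.tsx: the `MissionStatus` members other than `running`, `cancelling`, and `failed`, the `CancelEvent` type, the `nextStatus` function, and the way `stream.done` is modelled are all assumptions made for illustration. It covers only cancel-related transitions and only allows cancelling from `running`.

```typescript
// Hypothetical status union: only 'running', 'cancelling' and 'failed'
// are named in the PR description; the other members are assumptions.
type MissionStatus = 'pending' | 'running' | 'cancelling' | 'completed' | 'failed'

// Hypothetical event model for the cancel flow.
type CancelEvent =
  | { kind: 'cancel_requested' }
  | { kind: 'backend_message'; type: string; done?: boolean }
  | { kind: 'ack_timeout' }

// cancel_ack / cancel_confirmed are the acknowledgment types named in the PR.
function isAck(type: string): boolean {
  return type === 'cancel_ack' || type === 'cancel_confirmed'
}

// Assumption: "stream.done" is modelled as a 'stream' message with done === true.
function isTerminal(type: string, done?: boolean): boolean {
  return type === 'result' || type === 'error' || (type === 'stream' && done === true)
}

// Pure transition function: returns the next status for a cancel-related event.
function nextStatus(current: MissionStatus, event: CancelEvent): MissionStatus {
  if (event.kind === 'cancel_requested') {
    // Immediate visual feedback: running -> cancelling
    return current === 'running' ? 'cancelling' : current
  }
  if (current !== 'cancelling') {
    // Acks, terminal messages and timeouts only matter while cancelling
    return current
  }
  if (event.kind === 'ack_timeout') {
    // Safety net: the backend never confirmed
    return 'failed'
  }
  // Non-terminal messages (progress, partial chunks) are ignored while cancelling
  return isAck(event.type) || isTerminal(event.type, event.done) ? 'failed' : 'cancelling'
}

// Usage: cancel, ignore a progress message, then finalize on the ack.
const demoEvents: CancelEvent[] = [
  { kind: 'cancel_requested' },
  { kind: 'backend_message', type: 'progress' },
  { kind: 'backend_message', type: 'cancel_ack' },
]
const observed: MissionStatus[] = []
let missionStatus: MissionStatus = 'running'
for (let i = 0; i < demoEvents.length; i++) {
  missionStatus = nextStatus(missionStatus, demoEvents[i])
  observed.push(missionStatus)
}
console.log(observed.join(' -> '))
```

Keeping the transition logic pure makes it straightforward to test separately from React state, timers, and the WebSocket connection.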

Test plan

  • TypeScript compiles cleanly (npx tsc --noEmit)
  • All 66 existing + new tests pass (vitest run useMissions.test.tsx)
  • Build succeeds (npm run build)
  • No new lint errors introduced
  • Manual: Start a mission, click Cancel, verify "Cancelling..." spinner appears
  • Manual: Confirm status transitions to "Failed" after backend ack or timeout
  • Manual: Reload page during cancelling state, verify mission shows as failed

Closes #4123
Closes #4125
Closes #4126


The mission cancel flow previously jumped straight from 'running' to
'failed' without an intermediate state or backend confirmation. This
introduces a 'cancelling' state that provides visual feedback while
waiting for backend acknowledgment before transitioning to the final
'failed' state.

- Add 'cancelling' to the MissionStatus type union
- Set status to 'cancelling' with a system message on cancel request
- Handle cancel_ack/cancel_confirmed messages from backend
- Handle terminal messages (result/error/stream-done) during cancelling
- Add 10s safety-net timeout if backend never acknowledges
- HTTP fallback uses response status to finalize cancellation
- Finalize stuck 'cancelling' missions on page reload
- Show orange spinner and "Cancelling..." label in chat header, list items
- Disable cancel button during cancelling state (already in progress)
- Update tests to verify cancelling -> failed transition via ack/timeout
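Finalizing stuck missions on page reload might look like the following sketch. The `StoredMission` shape and the `finalizeStuckCancellations` helper are hypothetical, and a JSON string stands in for the localStorage entry. The idea is that a `cancelling` mission cannot survive a reload because its timeout handle and any pending acknowledgment are gone, so hydration moves it to `failed`.

```typescript
// Hypothetical persisted shape; the real Mission type has more fields.
interface StoredMission {
  id: string
  status: string
  updatedAt: string
}

// Any mission persisted as 'cancelling' is finalized to 'failed' on load.
function finalizeStuckCancellations(
  missions: StoredMission[],
  now: Date = new Date()
): StoredMission[] {
  return missions.map(m =>
    m.status === 'cancelling'
      ? { ...m, status: 'failed', updatedAt: now.toISOString() }
      : m
  )
}

// Usage: a JSON string stands in for the value read from localStorage.
const persisted = JSON.stringify([
  { id: 'm1', status: 'cancelling', updatedAt: '2026-04-01T13:00:00.000Z' },
  { id: 'm2', status: 'running', updatedAt: '2026-04-01T13:05:00.000Z' },
])
const hydrated = finalizeStuckCancellations(
  JSON.parse(persisted) as StoredMission[],
  new Date('2026-04-01T13:13:00.000Z')
)
console.log(hydrated.map(m => `${m.id}:${m.status}`).join(', '))
```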

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
Copilot AI review requested due to automatic review settings April 1, 2026 13:13
@kubestellar-prow kubestellar-prow bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label Apr 1, 2026
@netlify

netlify bot commented Apr 1, 2026

Deploy Preview for kubestellarconsole ready!

Name Link
🔨 Latest commit 1224e89
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarconsole/deploys/69cd19f32b90e70008af22bd
😎 Deploy Preview https://deploy-preview-4143.console-deploy-preview.kubestellar.io

To edit notification comments on pull requests, go to your Netlify project configuration.

@kubestellar-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mikespreitzer for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@clubanderson clubanderson merged commit 32032dc into main Apr 1, 2026
19 of 20 checks passed
@kubestellar-prow kubestellar-prow bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 1, 2026
@kubestellar-prow kubestellar-prow bot deleted the fix/mission-cancellation branch April 1, 2026 13:13
Contributor

Copilot AI left a comment


Pull request overview

Introduces an intermediate cancelling mission status so the UI can reflect a pending cancellation while the frontend waits for backend acknowledgment (or a safety-net timeout) before finalizing to failed.

Changes:

  • Added cancelling to MissionStatus and UI status config (orange spinner + “Cancelling…” label).
  • Updated cancellation flow to set cancelling immediately, wait for cancel_ack/cancel_confirmed (or terminal messages), and enforce a 10s timeout.
  • Added/updated tests to cover ack-based finalization, timeout finalization, and HTTP fallback behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| web/src/hooks/useMissions.tsx | Adds cancelling state, cancel-ack handling, timeout safety net, and unmount cleanup. |
| web/src/hooks/useMissions.test.tsx | Updates existing expectations and adds tests for ack + timeout cancellation finalization. |
| web/src/components/layout/mission-sidebar/types.ts | Adds cancelling to STATUS_CONFIG for consistent UI labels/colors. |
| web/src/components/layout/mission-sidebar/MissionListItem.tsx | Shows an orange spinner icon on list items while cancelling. |
| web/src/components/layout/mission-sidebar/MissionChat.tsx | Displays cancelling state in chat header/footer and animates status icon. |

Comment on lines +758 to +764
    if (isTerminalMessage) {
      pendingRequests.current.delete(message.id)
      finalizeCancellation(missionId, 'Mission cancelled by user.')
      return m // finalizeCancellation handles the state update via setMissions
    }
    // Ignore non-terminal messages (progress, partial stream) while cancelling
    return m

Copilot AI Apr 1, 2026


This calls setMissions (via finalizeCancellation) from inside another setMissions functional updater. That introduces nested state updates and makes update ordering harder to reason about (and can lead to extra renders). Prefer handling this path outside the setMissions(prev => ...) call (e.g., check the current mission status via missionsRef.current and early-return after calling finalizeCancellation), or update the mission to failed directly within this updater and keep finalizeCancellation as a pure helper that doesn’t call setMissions.

Suggested change
    - if (isTerminalMessage) {
    -   pendingRequests.current.delete(message.id)
    -   finalizeCancellation(missionId, 'Mission cancelled by user.')
    -   return m // finalizeCancellation handles the state update via setMissions
    - }
    - // Ignore non-terminal messages (progress, partial stream) while cancelling
    - return m
    + if (!isTerminalMessage) {
    +   // Ignore non-terminal messages (progress, partial stream) while cancelling
    +   return m
    + }
    + // Terminal message while cancelling: treat as cancellation confirmation and
    + // finalize the mission state here to avoid nested setMissions calls.
    + pendingRequests.current.delete(message.id)
    + return {
    +   ...m,
    +   status: 'failed',
    +   updatedAt: new Date(),
    + }

Copilot uses AI. Check for mistakes.
Comment on lines +654 to +681
    // Finalize a cancelling mission: transitions from 'cancelling' to 'failed'
    // and clears any pending cancel timeout.
    const finalizeCancellation = useCallback((missionId: string, message: string) => {
      // Clear the timeout if one is pending
      const timeout = cancelTimeouts.current.get(missionId)
      if (timeout) {
        clearTimeout(timeout)
        cancelTimeouts.current.delete(missionId)
      }

      setMissions(prev => prev.map(m =>
        m.id === missionId && m.status === 'cancelling' ? {
          ...m,
          status: 'failed',
          currentStep: undefined,
          updatedAt: new Date(),
          messages: [
            ...m.messages,
            {
              id: `msg-${Date.now()}`,
              role: 'system',
              content: message,
              timestamp: new Date(),
            }
          ]
        } : m
      ))
    }, [])

Copilot AI Apr 1, 2026


finalizeCancellation finalizes the mission state but does not clear pendingRequests entries for that mission. If the backend keeps sending messages for an in-flight request (or sends a late terminal message after the timeout), those messages can still be routed to this mission via pendingRequests even after it’s marked failed, and the map entry may linger indefinitely. Consider purging all pendingRequests.current entries whose value matches missionId as part of finalization (or when entering cancelling), so late/stray messages don’t update a cancelled mission and the map doesn’t leak entries.
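One way to implement the purge suggested here is sketched below. It assumes `pendingRequests.current` is a `Map` keyed by request id whose values are mission ids, which is what the comment implies but is not shown in the quoted code; `purgePendingRequests` is a hypothetical helper name.

```typescript
// Assumption: the map is keyed by request id and stores the owning mission id.
// Removes every entry owned by missionId and returns how many were removed.
function purgePendingRequests(pending: Map<string, string>, missionId: string): number {
  const stale: string[] = []
  pending.forEach((ownerMissionId, requestId) => {
    if (ownerMissionId === missionId) {
      stale.push(requestId)
    }
  })
  stale.forEach(requestId => pending.delete(requestId))
  return stale.length
}

// Usage: mission-a owns two in-flight requests, mission-b owns one.
const pendingDemo = new Map<string, string>([
  ['req-1', 'mission-a'],
  ['req-2', 'mission-b'],
  ['req-3', 'mission-a'],
])
const removedCount = purgePendingRequests(pendingDemo, 'mission-a')
console.log(removedCount, pendingDemo.size)
```

Calling such a helper from finalization (or when entering `cancelling`) would stop late messages from being routed to a mission that is already `failed`, at the cost of a linear scan over the map.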

Comment on lines +1515 to +1521
      // Safety-net timeout: if the backend never acknowledges, finalize after CANCEL_ACK_TIMEOUT_MS
      const timeoutHandle = setTimeout(() => {
        cancelTimeouts.current.delete(missionId)
        finalizeCancellation(missionId, 'Mission cancelled by user (backend did not confirm cancellation in time).')
      }, CANCEL_ACK_TIMEOUT_MS)
      cancelTimeouts.current.set(missionId, timeoutHandle)
    }, [finalizeCancellation])

Copilot AI Apr 1, 2026


If cancelMission(missionId) is triggered multiple times (double-click, repeated UI events, etc.), this will schedule multiple timeouts for the same mission without clearing the previous handle, which can lead to duplicate finalization messages and unnecessary work. Before creating a new timeout, clear any existing handle already stored in cancelTimeouts.current for the same missionId.
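A sketch of the clear-before-schedule pattern described here follows. `scheduleCancelTimeout` and its parameters are hypothetical; in the hook, the map would be `cancelTimeouts.current` and the callback would call `finalizeCancellation`.

```typescript
type TimeoutHandle = ReturnType<typeof setTimeout>

// Schedules the safety-net timeout for a mission, replacing any handle that
// is already pending so that at most one timer exists per mission.
function scheduleCancelTimeout(
  timeouts: Map<string, TimeoutHandle>,
  missionId: string,
  onTimeout: (missionId: string) => void,
  delayMs: number
): void {
  const existing = timeouts.get(missionId)
  if (existing !== undefined) {
    // A previous cancel request is still waiting: drop its timer first
    clearTimeout(existing)
  }
  const handle = setTimeout(() => {
    timeouts.delete(missionId)
    onTimeout(missionId)
  }, delayMs)
  timeouts.set(missionId, handle)
}
```

An alternative guard is to return early when the map already holds a handle for the mission, which keeps the original deadline instead of restarting it on every click.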

Comment on lines +491 to +509
    const { result } = renderHook(() => useMissions(), { wrapper })
    const { missionId } = await startMissionWithConnection(result)

    act(() => {
      result.current.cancelMission(missionId)
    })
    expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling')

    // Advance past the cancel ack timeout (10s)
    act(() => {
      vi.advanceTimersByTime(10_000)
    })

    const mission = result.current.missions.find(m => m.id === missionId)
    expect(mission?.status).toBe('failed')
    const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? []
    expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true)

    vi.useRealTimers()

Copilot AI Apr 1, 2026


Timer mode cleanup is currently at the end of the test body; if an assertion throws before line 509, fake timers may leak into subsequent tests. Use a try/finally pattern so vi.useRealTimers() always runs even when the test fails.

Suggested change
    - const { result } = renderHook(() => useMissions(), { wrapper })
    - const { missionId } = await startMissionWithConnection(result)
    - act(() => {
    -   result.current.cancelMission(missionId)
    - })
    - expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling')
    - // Advance past the cancel ack timeout (10s)
    - act(() => {
    -   vi.advanceTimersByTime(10_000)
    - })
    - const mission = result.current.missions.find(m => m.id === missionId)
    - expect(mission?.status).toBe('failed')
    - const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? []
    - expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true)
    - vi.useRealTimers()
    + try {
    +   const { result } = renderHook(() => useMissions(), { wrapper })
    +   const { missionId } = await startMissionWithConnection(result)
    +   act(() => {
    +     result.current.cancelMission(missionId)
    +   })
    +   expect(result.current.missions.find(m => m.id === missionId)?.status).toBe('cancelling')
    +   // Advance past the cancel ack timeout (10s)
    +   act(() => {
    +     vi.advanceTimersByTime(10_000)
    +   })
    +   const mission = result.current.missions.find(m => m.id === missionId)
    +   expect(mission?.status).toBe('failed')
    +   const systemMessages = mission?.messages.filter(m => m.role === 'system') ?? []
    +   expect(systemMessages.some(m => m.content.includes('backend did not confirm'))).toBe(true)
    + } finally {
    +   vi.useRealTimers()
    + }

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

👋 Hey @clubanderson — thanks for opening this PR!

🤖 This project is developed exclusively using AI coding assistants.

Please do not attempt to code anything for this project manually.
All contributions should be authored using an AI coding tool such as:

This ensures consistency in code style, architecture patterns, test coverage,
and commit quality across the entire codebase.


This is an automated message.

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Thank you for your contribution! Your PR has been merged.

Check out what's new:

Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey

@clubanderson
Collaborator Author

🔄 Auto-Applying Copilot Code Review

Copilot code review found 2 code suggestion(s) and 2 general comment(s).

@copilot Please apply all of the following code review suggestions:

  • web/src/hooks/useMissions.tsx (line 764): if (!isTerminalMessage) { // Ignore non-terminal messages (progress, p...
  • web/src/hooks/useMissions.test.tsx (line 509): try { const { result } = renderHook(() => useMissions(), { wrapper }) ...

Also address these general comments:

  • web/src/hooks/useMissions.tsx (line 681): finalizeCancellation finalizes the mission state but does not clear pendingRequests entries for that mission. If the
  • web/src/hooks/useMissions.tsx (line 1521): If cancelMission(missionId) is triggered multiple times (double-click, repeated UI events, etc.), this will schedule m

Push all fixes in a single commit. Run cd web && npm run build && npm run lint before committing.


Auto-generated by copilot-review-apply workflow.

clubanderson added a commit that referenced this pull request Apr 1, 2026
- server.go: Restrict query param token to WebSocket upgrades only,
  update stale comments for validateToken and matchOrigin (#4099)
- useMissions.tsx: Guard against double-cancel with timeout map
  check (#4143)
- mcp.go: Nil guard on ListWorkloads result before accessing Items
  (#4145)
- workload.go: Remove redundant len(nodes)>0 guard after early
  continue (#4146)
- workload_scaling_test.go: Rename test to ZeroNodeCluster (not
  UnreachableCluster) (#4146)

Signed-off-by: Andrew Anderson <andy@clubanderson.com>
clubanderson added a commit that referenced this pull request Apr 1, 2026
…4158)

- server.go: Restrict query param token to WebSocket upgrades only,
  update stale comments for validateToken and matchOrigin (#4099)
- useMissions.tsx: Guard against double-cancel with timeout map
  check (#4143)
- mcp.go: Nil guard on ListWorkloads result before accessing Items
  (#4145)
- workload.go: Remove redundant len(nodes)>0 guard after early
  continue (#4146)
- workload_scaling_test.go: Rename test to ZeroNodeCluster (not
  UnreachableCluster) (#4146)

Signed-off-by: Andrew Anderson <andy@clubanderson.com>

Labels

dco-signoff: yes Indicates the PR's author has signed the DCO. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

3 participants