[codex] Fix desktop terminal stop cancellation#464
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR implements end-to-end cancellation for streaming terminal commands: adds ChangesStreaming Command Cancellation
Sequence Diagram(s)sequenceDiagram
participant Caller
participant CentrifugoSandbox as CentrifugoSandbox.commands.run
participant AbortSignal
participant Centrifugo as Centrifugo.publish
Caller->>CentrifugoSandbox: run(command, {signal})
CentrifugoSandbox->>AbortSignal: register abort listener
CentrifugoSandbox->>Centrifugo: publish {type: command}
CentrifugoSandbox->>CentrifugoSandbox: set publishedCommand=true
AbortSignal->>CentrifugoSandbox: abort event
CentrifugoSandbox->>Centrifugo: publish {type: command_cancel}
CentrifugoSandbox->>Caller: resolve {exitCode: 130}
flowchart LR
ExecuteCmd["execute_stream_command"] --> SpawnChild["Spawn child & capture PID"]
SpawnChild --> StateMap["Insert PID into StreamCommandState"]
CancelCmd["cancel_stream_command"] --> LookupPID["Lock state & lookup PID"]
LookupPID --> TerminateTree["platform::cancel_process_tree(pid)"]
TerminateTree --> SigTerm["SIGTERM process group"]
SigTerm --> Wait["wait 500ms / check"]
Wait --> SigKill["SIGKILL on timeout"]
ExecuteCmd --> Cleanup["Remove PID from StreamCommandState"]
sequenceDiagram
participant Centrifugo
participant LocalClient as LocalSandboxClient
participant ProcessTree
Centrifugo->>LocalClient: publish {type: command_cancel, commandId}
LocalClient->>LocalClient: handleCommandCancel -> find ChildProcess
LocalClient->>ProcessTree: terminateProcessTree(pid)
alt Windows
ProcessTree->>ProcessTree: taskkill /T /F
else Unix
ProcessTree->>ProcessTree: kill(-pid, SIGTERM)
ProcessTree->>ProcessTree: wait 500ms
ProcessTree->>ProcessTree: kill(-pid, SIGKILL)
end
LocalClient->>LocalClient: remove from activeStreamCommands
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
app/services/desktop-sandbox-bridge.ts (1)
333-349:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftDon't drop a cancel that lands before Rust has registered the PID.
activeCommandsis marked before the Tauri side stores the spawned PID. Ifcommand_cancelarrives during that window,cancel_stream_commandreturnsfalse; this handler ignores that result, so the command still runs even though Stop was pressed. Please persist a pending-cancel state in Rust or retry when the invoke reportsfalse.Also applies to: 362-369
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/services/desktop-sandbox-bridge.ts` around lines 333 - 349, The activeCommands flag is set before the Tauri/Rust side has registered the spawned PID, so a cancel arriving in that race can be lost; update the logic around invoke("execute_stream_command") and invoke("cancel_stream_command") to persist and honor a pending-cancel state: when adding commandId to activeCommands (and in the same area that imports invoke/Channel and calls invoke("execute_stream_command") in execute_stream_command path), record commandId in a transient pendingCancel map (or mark it) and, if invoke("execute_stream_command") returns false (or the first cancel attempt returns false), retry or enqueue a cancel to invoke("cancel_stream_command") until it succeeds (or notify Rust to persist pending cancellation), and ensure the cancel handler consults and clears that pendingCancel entry so a Stop press before Rust registers PID reliably cancels the process; apply the same change to the other invoke/cancel pair mentioned (lines 362–369) so both streaming and non-streaming cancel flows are covered.packages/desktop/src-tauri/src/lib.rs (1)
1027-1052:⚠️ Potential issue | 🟠 Major | ⚡ Quick winCancel tracked stream commands during desktop shutdown.
StreamCommandStateis only used for explicit stop requests right now. The existingRunEvent::Exitpath still tears down PTYs only, so with the new process-group isolation a stream command that is running when the app closes can outlive the desktop process entirely. Please drain this state and kill each tracked command on shutdown as well.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@lib/ai/tools/utils/centrifugo-sandbox.ts`:
- Around line 244-249: The abort path in publishCancel can race with the publish
attempt because it checks publishedCommand too early and may resolveCanceled
without sending command_cancel; update publish logic so publishCancel knows when
a publish is in-flight and, if publish is pending, wait for the publish promise
to settle (or register a continuation) and then send command_cancel if
publishedCommand becomes true and subscription exists; specifically, introduce
or use an in-flight publish promise/flag (related to publishedCommand in the
publish path around where publishedCommand = true is set) and change
publishCancel to await/attach to that promise before resolving canceled so
remote cancellation is always attempted when publish was started but not yet
flipped to true.
In `@packages/desktop/src-tauri/src/platform.rs`:
- Around line 151-159: build_command() currently calls libc::setsid() which
creates a new session for the child; this leaks detached process groups when
non-streaming callers (execute_command and the /execute path) return on timeout
without killing children. Either remove the unsafe setsid() block from
build_command() and instead call setsid() only in the streaming callers that
invoke platform::graceful_kill(), or keep setsid() but add explicit group-kill
logic to the non-streaming timeout branches (the timeout handling in
execute_command and the /execute handler) so they call platform::graceful_kill()
or send SIGTERM to the child process group; pick one approach and implement it
consistently so every caller that can timeout either does not create a new
session or reliably kills the created session on timeout.
In `@packages/local/src/index.ts`:
- Around line 804-807: activeStreamCommands entries are left running because
cleanup() never kills the detached procs; update cleanup() to iterate
this.activeStreamCommands, call terminateProcessTree(proc) for each tracked
ChildProcess, await/handle results as needed, then clear the map before
proceeding to disconnect Centrifugo or other shutdown steps so spawned pipelines
are terminated during client shutdown.
---
Outside diff comments:
In `@app/services/desktop-sandbox-bridge.ts`:
- Around line 333-349: The activeCommands flag is set before the Tauri/Rust side
has registered the spawned PID, so a cancel arriving in that race can be lost;
update the logic around invoke("execute_stream_command") and
invoke("cancel_stream_command") to persist and honor a pending-cancel state:
when adding commandId to activeCommands (and in the same area that imports
invoke/Channel and calls invoke("execute_stream_command") in
execute_stream_command path), record commandId in a transient pendingCancel map
(or mark it) and, if invoke("execute_stream_command") returns false (or the
first cancel attempt returns false), retry or enqueue a cancel to
invoke("cancel_stream_command") until it succeeds (or notify Rust to persist
pending cancellation), and ensure the cancel handler consults and clears that
pendingCancel entry so a Stop press before Rust registers PID reliably cancels
the process; apply the same change to the other invoke/cancel pair mentioned
(lines 362–369) so both streaming and non-streaming cancel flows are covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6e4849cc-49f3-4dd5-a97e-7dfec4b62093
📒 Files selected for processing (9)
app/services/desktop-sandbox-bridge.tslib/ai/tools/run-terminal-cmd.tslib/ai/tools/utils/__tests__/centrifugo-sandbox.test.tslib/ai/tools/utils/centrifugo-sandbox.tslib/ai/tools/utils/sandbox-types.tslib/centrifugo/types.tspackages/desktop/src-tauri/src/lib.rspackages/desktop/src-tauri/src/platform.rspackages/local/src/index.ts
|
Addressed the CodeRabbit cancellation lifecycle findings in follow-up commit |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/desktop/src-tauri/src/lib.rs`:
- Around line 68-120: The read tasks spawned in
wait_with_output_or_kill_on_timeout (stdout_task and stderr_task) can block
indefinitely after child.wait() returns; wrap the collection of those tasks in a
secondary tokio::time::timeout (e.g., reuse the provided timeout or a short
fixed duration) so that awaiting stdout_task.await and stderr_task.await is
bounded; on timeout abort the remaining tasks (stdout_task.abort(),
stderr_task.abort()), call platform::graceful_kill(&mut child).await if needed,
and return an Err describing the read timeout instead of blocking forever.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 098e69fe-a9a2-446d-91d1-7fcd8a9266e6
📒 Files selected for processing (4)
lib/ai/tools/utils/__tests__/centrifugo-sandbox.test.tslib/ai/tools/utils/centrifugo-sandbox.tspackages/desktop/src-tauri/src/lib.rspackages/local/src/index.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/local/src/index.ts
- lib/ai/tools/utils/centrifugo-sandbox.ts
|
Addressed the pipe-drain hang review note in follow-up commit |
Summary
Fixes the Stop path for desktop/free-plan local terminal runs by relaying command cancellation through the local command bridge and killing the active streaming process group.
What changed
command_cancelrelay message for local Centrifugo commands.run_terminal_cmdinto the local sandbox command path.command_canceland call a new Tauricancel_stream_commandcommand.commandIdin Tauri and killed the process group so shell pipelines stop as a unit.command_cancelon Centrifugo abort.Root cause
Desktop/free-plan local runs were relying on fragile process discovery for cancellation. That could leave parts of a shell pipeline running after Stop was pressed.
Validation
pnpm test -- lib/ai/tools/utils/__tests__/centrifugo-sandbox.test.ts --runInBandpnpm test -- lib/ai/tools/__tests__/run-terminal-cmd.test.ts --runInBandpnpm exec tsc --noEmit --pretty falsecargo checkinpackages/desktop/src-tauriSummary by CodeRabbit
New Features
Tests