Fix race condition causing requested commands to be auto-cancelled.#10241
Conversation
86b1197 to
46a26eb
Compare
46a26eb to
e64a356
Compare
|
@zachbai this is a short-term fix. Real fix is consolidating these terminal view <> convo ID maps and their access patterns. It's a big mess right now and will require a good amount of work to avoid regressing stuff. So this is an in-between to get the halting problem fixed. |
We should be able to get rid of the |
There was a problem hiding this comment.
Overview
This PR separates explicit conversation transfer from automatic active-conversation marking and adds a fallback lookup for requested command actions.
Concerns
- The global fallback in
conversation_id_for_actiondrops the terminal-view ownership constraint and can associate a requested command with a conversation from another pane.
Verdict
Found: 0 critical, 1 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| // Fallback: the conversation may have been transferred to another view | ||
| // or removed from the live list while the action was executing. Search | ||
| // all in-memory conversations so the command is not silently dropped. | ||
| let fallback = self |
There was a problem hiding this comment.
TerminalView then executes the requested command in the current pane's active session. Keep the fallback scoped to conversations owned by terminal_view_id (or use an action→conversation mapping) rather than searching every conversation.
…arpdotdev#10241) ## Description When two agent conversations run in parallel in different terminal views, `set_active_conversation_id` was called during automatic follow-ups (tool call → result cycles). This function has **transfer semantics**: it removes the conversation from every *other* terminal view's live list. When two views both trigger follow-ups concurrently, each view's follow-up could steal the other view's conversation, causing requested commands to be silently dropped with: ``` No conversation ID found for requested command ``` ### Root cause `set_active_conversation_id` was designed for **explicit user navigation** (opening a conversation in a different pane from the command palette), but was also being called in two automatic code paths: 1. `send_follow_up_for_conversation` (controller.rs) — after tool call results are sent back 2. `send_request_input` (controller.rs) — when sending any non-passive request These paths only need to update the "most recently streamed" pointer within their own view. They should never touch other views. ### Fix **`history_model.rs`**: Added `mark_active_conversation_id` — a non-transferring variant that updates the active conversation pointer for a single terminal view without removing the conversation from other views' live lists. The existing `set_active_conversation_id` (with transfer semantics) is preserved for explicit navigation call sites. **`controller.rs`**: Switched both follow-up call sites to use `mark_active_conversation_id` instead of `set_active_conversation_id`. **`history_model.rs`** (`conversation_id_for_action`): Added a global fallback search across all in-memory conversations when the per-view lookup fails. This is a safety net — if a conversation is ever removed from a view's live list while actions are in flight, the command still executes instead of being silently dropped. ## Linked Issue N/A — discovered via internal debugging of intermittent command cancellation during parallel agent conversations. ## Testing Manual testing with parallel agent conversations in split panes. The race condition is timing-dependent and not consistently reproducible, but the fix is structurally sound: the follow-up paths no longer call the transferring variant, so the race is eliminated by design. ## Agent Mode - [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode <!-- CHANGELOG-BUG-FIX: Fixed a race condition where requested commands could be silently cancelled when running multiple agent conversations in parallel. --> Co-Authored-By: Oz <oz-agent@warp.dev> [Conversation](https://staging.warp.dev/conversation/ff1292cc-8e8e-4c53-9f5d-6fcfa23089aa) (cherry picked from commit 7467260)
…arpdotdev#10241) ## Description When two agent conversations run in parallel in different terminal views, `set_active_conversation_id` was called during automatic follow-ups (tool call → result cycles). This function has **transfer semantics**: it removes the conversation from every *other* terminal view's live list. When two views both trigger follow-ups concurrently, each view's follow-up could steal the other view's conversation, causing requested commands to be silently dropped with: ``` No conversation ID found for requested command ``` ### Root cause `set_active_conversation_id` was designed for **explicit user navigation** (opening a conversation in a different pane from the command palette), but was also being called in two automatic code paths: 1. `send_follow_up_for_conversation` (controller.rs) — after tool call results are sent back 2. `send_request_input` (controller.rs) — when sending any non-passive request These paths only need to update the "most recently streamed" pointer within their own view. They should never touch other views. ### Fix **`history_model.rs`**: Added `mark_active_conversation_id` — a non-transferring variant that updates the active conversation pointer for a single terminal view without removing the conversation from other views' live lists. The existing `set_active_conversation_id` (with transfer semantics) is preserved for explicit navigation call sites. **`controller.rs`**: Switched both follow-up call sites to use `mark_active_conversation_id` instead of `set_active_conversation_id`. **`history_model.rs`** (`conversation_id_for_action`): Added a global fallback search across all in-memory conversations when the per-view lookup fails. This is a safety net — if a conversation is ever removed from a view's live list while actions are in flight, the command still executes instead of being silently dropped. ## Linked Issue N/A — discovered via internal debugging of intermittent command cancellation during parallel agent conversations. ## Testing Manual testing with parallel agent conversations in split panes. The race condition is timing-dependent and not consistently reproducible, but the fix is structurally sound: the follow-up paths no longer call the transferring variant, so the race is eliminated by design. ## Agent Mode - [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode <!-- CHANGELOG-BUG-FIX: Fixed a race condition where requested commands could be silently cancelled when running multiple agent conversations in parallel. --> Co-Authored-By: Oz <oz-agent@warp.dev> [Conversation](https://staging.warp.dev/conversation/ff1292cc-8e8e-4c53-9f5d-6fcfa23089aa)
…arpdotdev#10241) When two agent conversations run in parallel in different terminal views, `set_active_conversation_id` was called during automatic follow-ups (tool call → result cycles). This function has **transfer semantics**: it removes the conversation from every *other* terminal view's live list. When two views both trigger follow-ups concurrently, each view's follow-up could steal the other view's conversation, causing requested commands to be silently dropped with: ``` No conversation ID found for requested command ``` `set_active_conversation_id` was designed for **explicit user navigation** (opening a conversation in a different pane from the command palette), but was also being called in two automatic code paths: 1. `send_follow_up_for_conversation` (controller.rs) — after tool call results are sent back 2. `send_request_input` (controller.rs) — when sending any non-passive request These paths only need to update the "most recently streamed" pointer within their own view. They should never touch other views. **`history_model.rs`**: Added `mark_active_conversation_id` — a non-transferring variant that updates the active conversation pointer for a single terminal view without removing the conversation from other views' live lists. The existing `set_active_conversation_id` (with transfer semantics) is preserved for explicit navigation call sites. **`controller.rs`**: Switched both follow-up call sites to use `mark_active_conversation_id` instead of `set_active_conversation_id`. **`history_model.rs`** (`conversation_id_for_action`): Added a global fallback search across all in-memory conversations when the per-view lookup fails. This is a safety net — if a conversation is ever removed from a view's live list while actions are in flight, the command still executes instead of being silently dropped. N/A — discovered via internal debugging of intermittent command cancellation during parallel agent conversations. Manual testing with parallel agent conversations in split panes. The race condition is timing-dependent and not consistently reproducible, but the fix is structurally sound: the follow-up paths no longer call the transferring variant, so the race is eliminated by design. - [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode <!-- CHANGELOG-BUG-FIX: Fixed a race condition where requested commands could be silently cancelled when running multiple agent conversations in parallel. --> Co-Authored-By: Oz <oz-agent@warp.dev> [Conversation](https://staging.warp.dev/conversation/ff1292cc-8e8e-4c53-9f5d-6fcfa23089aa) (cherry picked from commit 7467260)
Description
When two agent conversations run in parallel in different terminal views,
set_active_conversation_idwas called during automatic follow-ups (tool call → result cycles). This function has transfer semantics: it removes the conversation from every other terminal view's live list. When two views both trigger follow-ups concurrently, each view's follow-up could steal the other view's conversation, causing requested commands to be silently dropped with:Root cause
set_active_conversation_idwas designed for explicit user navigation (opening a conversation in a different pane from the command palette), but was also being called in two automatic code paths:send_follow_up_for_conversation(controller.rs) — after tool call results are sent backsend_request_input(controller.rs) — when sending any non-passive requestThese paths only need to update the "most recently streamed" pointer within their own view. They should never touch other views.
Fix
history_model.rs: Addedmark_active_conversation_id— a non-transferring variant that updates the active conversation pointer for a single terminal view without removing the conversation from other views' live lists. The existingset_active_conversation_id(with transfer semantics) is preserved for explicit navigation call sites.controller.rs: Switched both follow-up call sites to usemark_active_conversation_idinstead ofset_active_conversation_id.history_model.rs(conversation_id_for_action): Added a global fallback search across all in-memory conversations when the per-view lookup fails. This is a safety net — if a conversation is ever removed from a view's live list while actions are in flight, the command still executes instead of being silently dropped.Linked Issue
N/A — discovered via internal debugging of intermittent command cancellation during parallel agent conversations.
Testing
Manual testing with parallel agent conversations in split panes. The race condition is timing-dependent and not consistently reproducible, but the fix is structurally sound: the follow-up paths no longer call the transferring variant, so the race is eliminated by design.
Agent Mode
Co-Authored-By: Oz oz-agent@warp.dev
Conversation