Fix optimistic-root persistence breaking conversation restore (QUALITY-774)#11814
Conversation
…Y-774) PR #10794 introduced eager `persist_conversation_state` calls at several sites (start_new_child_conversation, initialize_output_for_response_stream, assign_run_id_for_conversation, mark_conversation_as_remote_child) that fire while the root task is still `Optimistic(Root)`. Combined with `Task::source_for_persistence` synthesizing a stub `api::Task` for optimistic roots, this wrote orphan rows to `agent_tasks` keyed by the client-generated UUID. After the server upgraded the root, a second row was written keyed by the server task id; on restore, HashMap iteration order non-deterministically picked between the two parentless tasks, sometimes leaving the conversation stuck with an empty stub root. Two-pronged fix: 1. Writer side: `Task::source_for_persistence` returns `None` for `Optimistic(Root)` (matching `Optimistic(CLIAgent(_))`). No new stub rows are written. `AgentConversationData.root_task_is_optimistic` is retained for backward-compatible deserialization but is no longer written and is ignored on restore. 2. Restore side: - Two-constructor pattern. `AIConversation::new_restored` is strict (`Err(NoRootTask)` on empty input) — used for cloud-restore and fork-insert. `AIConversation::new_restored_synthesizing_on_empty` synthesizes a fresh in-progress optimistic root when `tasks` is empty — used for local-DB restore, where the empty shape is now normal for child conversations persisted before their first server response. - Dedupe heuristic: when multiple parentless tasks exist (legacy DB case), prefer the candidate with non-empty `messages`. Heals pre-fix rows without a migration. Removes `Task::new_restored_optimistic_root`, `Task::is_optimistic_root_task`, and the prior `PersistedConversationFields` helper; adds a `#[cfg(test)]` helper for driving root upgrades in tests. Co-Authored-By: Oz <oz-agent@warp.dev>
6a540dc to
fc1135e
Compare
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR stops persisting optimistic root task stubs, adds a lenient local-DB restore path for empty task lists, and tries to prefer the real root when legacy rows contain both an empty optimistic stub and a server root.
Concerns
- Returning no persisted task rows for an optimistic root does not clear any previously persisted task rows because the SQLite writer only upserts tasks. Truncating/resetting a previously server-backed conversation can still restore stale rows after restart.
- The new legacy parentless-task dedupe is bypassed by the existing historical-conversation restorable prefilter, so affected legacy rows with two parentless tasks can still be dropped before
new_restoredgets a chance to heal them.
Verdict
Found: 0 critical, 2 important, 0 suggestions
Request changes
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| // real server root) over an empty stub. If multiple have messages | ||
| // or none have messages, fall back to the first-encountered | ||
| // candidate. | ||
| let root_task_pick = parentless_candidates |
There was a problem hiding this comment.
initialize_historical_conversations still calls AgentConversation::is_restorable() first, and that rejects legacy rows with the empty stub plus real root as two parentless tasks. Apply the legacy-root filtering before that prefilter or relax is_restorable for this shape so affected conversations are not dropped from history/metadata before restore.
There was a problem hiding this comment.
This one needs a careful relaxation of is_restorable to allow the stub/real conversation case to restore.
| // produces an orphan row in `agent_tasks` that survives the later | ||
| // server-side upgrade and breaks restore by competing with the | ||
| // real server root for parentless-task selection. See QUALITY-774. | ||
| TaskImpl::Optimistic(optimistic::Task::Root) => None, |
There was a problem hiding this comment.
upsert_agent_conversation only upserts provided tasks and never deletes omitted rows, so returning None here leaves previously persisted task rows behind when an existing conversation is reset back to an optimistic root (for example truncate-to-empty). After restart the local DB loader can restore the stale server row instead of the intended empty in-progress conversation; the persist path needs replace/delete semantics for rows not present in updated_tasks.
There was a problem hiding this comment.
This seems like something we can defer. Basically I think this only applies to conversations that get rewound to zero and then have no new exchanges before restart.
…Y-774) Pre-QUALITY-774, the optimistic-root writer bug could leave a conversation with two parentless rows in agent_tasks: an empty stub keyed by the optimistic UUID alongside the real server-rooted task. The fix for the writer (this PR) plus the restore-side dedupe in AIConversation::new_restored deterministically pick the real root in that shape. However, the history-list restore path goes through BlocklistAIHistoryModel::initialize_historical_conversations, which prefilters via AgentConversation::is_restorable before invoking the constructor. The old predicate rejected any conversation with more than one parentless task, so the dedupe never had a chance to run for legacy conversations on the history list — they were silently dropped. Relax the predicate to permit multi-root conversations iff exactly one parentless task has non-empty messages (the [stub + real] shape). Multi-root conversations with multiple real roots or no real roots remain rejected (genuinely ambiguous or malformed). Co-Authored-By: Oz <oz-agent@warp.dev>
seemeroland
left a comment
There was a problem hiding this comment.
Looks good to me! Thanks for the fix
Upstream PR #11814 ("Fix optimistic-root persistence breaking conversation restore", QUALITY-774) covers the optimistic-stub case via two coordinated changes: Task::source_for_persistence returns None for Optimistic(Root) so no new stub rows are written, and AIConversation::new_restored deduplicates multi-root payloads by preferring the parentless task with non-empty messages. AgentConversation::is_restorable was relaxed to accept the multi-root [stub + real] shape so legacy DB rows still load. This commit updates specs/QUALITY-768/TECH.md to mark change 3 (the read-time optimistic-stub filter) as superseded by upstream and to drop the corresponding test coverage from the doc. The code change itself happened during the rebase by dropping commit d5ca6ab and accepting upstream's version of model.rs and conversation_loader.rs for the three subsequent commits that touched them. Co-Authored-By: Oz <oz-agent@warp.dev>
Upstream PR #11814 ("Fix optimistic-root persistence breaking conversation restore", QUALITY-774) covers the optimistic-stub case via two coordinated changes: Task::source_for_persistence returns None for Optimistic(Root) so no new stub rows are written, and AIConversation::new_restored deduplicates multi-root payloads by preferring the parentless task with non-empty messages. AgentConversation::is_restorable was relaxed to accept the multi-root [stub + real] shape so legacy DB rows still load. This commit updates specs/QUALITY-768/TECH.md to mark change 3 (the read-time optimistic-stub filter) as superseded by upstream and to drop the corresponding test coverage from the doc. The code change itself happened during the rebase by dropping commit d5ca6ab and accepting upstream's version of model.rs and conversation_loader.rs for the three subsequent commits that touched them. Co-Authored-By: Oz <oz-agent@warp.dev>
Upstream PR #11814 ("Fix optimistic-root persistence breaking conversation restore", QUALITY-774) covers the optimistic-stub case via two coordinated changes: Task::source_for_persistence returns None for Optimistic(Root) so no new stub rows are written, and AIConversation::new_restored deduplicates multi-root payloads by preferring the parentless task with non-empty messages. AgentConversation::is_restorable was relaxed to accept the multi-root [stub + real] shape so legacy DB rows still load. This commit updates specs/QUALITY-768/TECH.md to mark change 3 (the read-time optimistic-stub filter) as superseded by upstream and to drop the corresponding test coverage from the doc. The code change itself happened during the rebase by dropping commit d5ca6ab and accepting upstream's version of model.rs and conversation_loader.rs for the three subsequent commits that touched them. Co-Authored-By: Oz <oz-agent@warp.dev>
#11722) ## Description Demo: https://www.loom.com/share/46634b5c2513469aa4e69cb0f09cd03a Restarting Warp could leave orchestration sessions in four broken states: child agents rendered as "Unknown agent" in the pill bar, restored remote-child panes showed a blank "New conversation" placeholder instead of the cloud transcript, a cloud-parent orchestration restored correctly on the first restart but lost its cloud-mode shape on the second, and the on-disk prune split orchestration trees across restarts. This PR addresses all four. **Eager orchestration-child hydration at startup.** `initialize_historical_conversations` now inserts orchestration children into `BlocklistAIHistoryModel::conversations_by_id` at load time, so the pill bar and transcript name resolver find them before any pane materializes. **Direct `AmbientAgentTask` inspection for remote-child hydration.** Restored remote-child placeholders went through `AgentConversationsModel::resolve_open_action`, which once the local placeholder existed collapsed "navigate to local" and "hydrate cloud transcript" into the same outcome — so the transcript was never loaded. Hidden-pane hydration now inspects the `AmbientAgentTask` directly via a pure `decide_remote_child_hydration_action` and routes all four arms through a single tombstone gate keyed on whether the task is in a terminal state. **Cloud-mode shared-session viewer snapshot/restore loop.** A cloud-parent orchestration's pane is a shared-session viewer locally. The restore path used the non-deferred `viewer::TerminalManager::new(..)` which hardcoded `is_cloud_mode = false`, so the restored `TerminalView` had no `ambient_agent_view_model`. Restart 1 still rendered correctly because the network's `JoinedSuccessfully` handler spins up an `OrchestrationViewerModel` regardless, but the snapshot path keys off `ambient_agent_view_model` to choose between `LeafContents::AmbientAgent { task_id }` and the fallback empty `LeafContents::Terminal` — so the next shutdown silently dropped the `task_id` and restart 2 re-materialized the parent as a plain local terminal with no orchestration UI. The fix threads `is_cloud_mode: bool` through `create_shared_session_viewer` and passes `true` from the two ambient-agent restoration call sites; the restored pane's shape now matches a freshly created cloud-mode pane. **Tree-aware persisted-conversation prune.** The on-disk prune treated rows individually, so parents and children of the same orchestration session could be evicted independently and the next boot would re-encounter a split tree. The prune is now tree-aware: orchestration trees are sorted freshest-first by `max(member.last_modified_at)` and kept atomically, with the freshest tree always retained intact. The retention cap moved from 100 to 200 to fit typical orchestration session sizes. A fifth issue originally in scope — local-no-harness Oz children persisting an optimistic stub task alongside the real upgraded root — was independently addressed by PR #11814 (QUALITY-774); this PR was rebased onto that fix and drops its earlier read-time stub filter. The post-facto technical spec at `specs/QUALITY-768/TECH.md` documents the implementation as it ships. ## Testing - [x] I have manually tested my changes locally with `./script/run` Manual validation: local-local restart shows correct agent names; local-remote restart shows the cloud transcript instead of the blank placeholder; cloud-parent restart-restart preserves the orchestration UI on both restarts; tree-aware prune keeps orchestration trees together on disk; baseline single-conversation restore unchanged. Automated coverage added across the dispatch decision, the cloud-transcript merge seam, the cloud-mode viewer snapshot contract, and the tree-aware eviction. `cargo fmt`, `cargo clippy --all-targets --features local_fs -- -D warnings`, and `cargo nextest run` across the affected crates all pass. ## Agent Mode - [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode CHANGELOG-BUG-FIX: Fixed orchestration session restoration so the pill bar, child transcript name resolution, restored remote-child cloud transcripts, and cloud-parent panes all work correctly after a Warp restart. --------- Co-authored-by: Oz <oz-agent@warp.dev>
Description
Fix-forward for the bug introduced by #10794 where Agent Mode conversations can fail to restore after a quit/restart, sometimes producing the server error
Response stream finished unexpectedly with internal error: Not found: Tried to add messages to non-existent task (id=).Root cause. PR #10794 added eager
persist_conversation_statecalls at several sites (start_new_child_conversation,initialize_output_for_response_stream,assign_run_id_for_conversation,mark_conversation_as_remote_child) that fire while the root task is stillOptimistic(Root).Task::source_for_persistencewas synthesizing a stubapi::Taskfor the optimistic root, which was written toagent_taskskeyed by the client-generated UUID. After the server upgraded the root, a second row was written keyed by the server task id; the stub row was never deleted. On restore,AIConversation::new_restoredsaw two parentless tasks and HashMap iteration order non-deterministically picked between them — when the stub won, restore left the conversation stuck with an empty root and subtaskparent_task_idreferences pointing at a server root id that no longer existed in the task store.Fix. Two coordinated changes:
Writer side (
Task::source_for_persistence): returnsNoneforOptimistic(Root), matching the existingOptimistic(CLIAgent(_))behavior. No new stub rows are written.AgentConversationData.root_task_is_optimisticis retained on the persistence model for backward-compatible deserialization of existing local DB rows, but is no longer written and is ignored on restore.Restore side (
AIConversation::new_restored):new_restoredis strict (returnsErr(NoRootTask)for an emptytaskslist) and is used by cloud-restore and fork-insert, where an empty payload is malformed input.new_restored_synthesizing_on_emptyis a small wrapper that synthesizes a fresh in-progressOptimistic(Root)conversation whentasksis empty (mirroringAIConversation::new()) and is used by the local-DB restore path, where an emptyagent_tasksrow is the normal shape for child conversations persisted before their first server response.messages. Silently heals pre-fix rows without a SQLite migration.Removes
Task::new_restored_optimistic_rootandTask::is_optimistic_root_task; replaces the previousPersistedConversationFieldshelper by inlining the overlay logic into the single constructor. Adds a#[cfg(test)] pub(crate)helper for driving root upgrades in tests.Linked Issue
QUALITY-774 — fix-forward of #10794.
ready-to-specorready-to-implement.Testing
test_persist_with_optimistic_root_emits_event_with_no_task_rows— proves the writer bug exists onmasterand disappears with the fix.app/src/ai/blocklist/history_model_tests.rs:test_optimistic_root_upgrade_then_persist_emits_event_with_single_server_task_rowtest_optimistic_root_restore_round_trip_yields_in_progress_optimistic_roottest_truncate_from_exchange_to_empty_persist_event_has_empty_updated_taskstest_two_restart_cycles_keep_exactly_one_server_root_task_rowapp/src/ai/agent/conversation_tests.rs:restored_conversation_with_empty_task_list_creates_in_progress_optimistic_rootrestored_conversation_ignores_legacy_root_task_is_optimistic_flag_with_non_empty_tasksrestored_conversation_ignores_legacy_root_task_is_optimistic_flag_with_empty_tasksnew_restored_with_empty_task_list_returns_no_root_task_errortest_new_restored_prefers_parentless_task_with_messages_over_empty_stub(50-iteration loop per Vec ordering to catch HashMap-order regressions)agent_conversation_data_roundtrips_optimistic_root_markerincrates/persistence/src/model.rsstill passes, pinning back-compat deserialization of the legacy field../script/presubmitpasses end-to-end.Note on legacy data: This PR does not physically remove pre-existing orphan stub rows in dev/preview local SQLite, but the restore-side dedupe silently ignores them. Affected internal users should not need to clear local AI history.
./script/runAgent Mode
Conversation: https://staging.warp.dev/conversation/3fc61b76-c75b-4e86-bc4a-6f4534e44abb
Plan: https://staging.warp.dev/drive/notebook/EAwKh7EVES70otTsIdkZ9u