fix(acp): replace timing-based replay guard with DB-matching state machine by matt2e · Pull Request #431 · block/builderbot

matt2e · 2026-03-20T06:38:14Z

Summary

Replaces the AtomicBool replay guard with a phase-based state machine (Replaying → WaitingForPrompt → Live) that matches replayed notifications against persisted DB messages, preventing out-of-order or duplicate messages in session chat on resumption.
Adds Store::get_session_messages() trait method so the ACP driver can read previously persisted messages for replay matching.
Applies the same phase-based pattern to the simple driver for consistency, and moves strip_code_fences to acp-client crate for reuse.

Test plan

Resume an existing agent session and verify chat messages appear in correct order without duplicates
Start a fresh session and verify normal message flow is unaffected
Verify tool call results display correctly after session resumption

🤖 Generated with Claude Code

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca9c7b3404

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-20T06:42:11Z

crates/acp-client/src/driver.rs

+    async fn is_replay_idle(&self, timeout: Duration) -> bool {
+        let phase = self.phase.lock().await;
+        if let HandlerPhase::Replaying(buf) = &*phase {
+            buf.received_any && buf.last_notification_at.elapsed() >= timeout


Treat zero-notification resumes as completed replay

If the previous run never persisted any non-user output—for example, the user cancelled before the agent answered—ReplayBuffer::new considers replay complete immediately (target_index = None). run_acp_protocol still waits for replay_done/is_replay_idle, though, and this helper never returns true until at least one notification arrives. When load_session emits nothing, the loop at driver.rs:957-971 never exits, so resume_session leaves the session stuck in running and never sends the new prompt.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-03-20T06:42:11Z

crates/acp-client/src/driver.rs

+    handler.transition_to_live().await;
+
    connection
        .prompt(prompt_request)
        .await


Keep replay suppression active until the prompt is actually sent

On a resumed session, late replay AgentMessageChunks can still arrive after the role/id matcher decides replay is done. This code switches the handler to Live before awaiting connection.prompt(...), and Live forwards assistant chunks without any replay filter. In that window, a slow load_session replay is appended to the DB as if it belonged to the new turn, duplicating old assistant text in the transcript.

Useful? React with 👍 / 👎.

Add [msg-order] and [replay-guard] tagged debug logs across the full message lifecycle to help diagnose the race condition where old messages appear after the user's reply when resuming a session: - ACP driver & simple driver: log replay guard transitions (set_live) and whether notifications are dropped or processed - Session runner: log the persisted user message ID on resume - Message writer: log DB row IDs for inserted assistant/tool messages - Store: log get_session_messages_since results when >1 message returned - Frontend: log poll results with message IDs and roles Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Log when tool_call and tool_result dedup fires (in-place update instead of insert) and when finalize() resets writer state. These silent paths were previously unlogged, making it hard to diagnose whether tool calls break the message ordering/deduplication logic. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

When resuming a session, load_session() returns before all replay notifications have been delivered through the async transport. This caused old tool calls and assistant text to slip past the replaying guard, get written to the DB with new IDs, and appear after the user's message in the chat. Track tool_call_ids seen during the replay phase. After set_live(), definitively drop any ToolCall/ToolCallUpdate referencing a replayed ID, and use temporal proximity (500ms window) to also drop interleaved old assistant text and AvailableCommandsUpdates that arrive in the same burst as stale tool-call notifications. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… gate The previous 500ms temporal window for filtering late-arriving replay notifications was unreliable — especially over high-latency connections to remote workstations where buffered notifications could arrive well after the window expired. Replace the timing-based approach with a deterministic `prompt_sent` gate. Between `set_live()` and `mark_prompt_sent()`, ALL notifications are definitively stale because the server cannot produce new-turn output before receiving the prompt. This eliminates the STALE_PROXIMITY constant and last_stale_drop tracking entirely. Keep `replayed_tool_call_ids` as a safety net: tool-call IDs observed during the replay and pre-prompt phases are used to catch any stale ToolCall/ToolCallUpdate that arrives after the prompt is sent. Applied the same pattern to the simple driver. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…chine The previous approaches (temporal window, prompt_sent gate) relied on timing assumptions that break on high-latency connections to remote workstations. Replace with a phase-based state machine that accumulates replay notifications into complete messages, matches them against existing DB messages, and signals completion deterministically: - Replaying: accumulate text chunks, match tool calls/results against DB by role and content prefix. Signal done when the last non-user DB message is matched, or after 1s of idle (fallback). - WaitingForPrompt: replay done, drop stragglers, record tool-call IDs. - Live: forward to writer with replayed_tool_call_ids safety net. The prompt is only sent after replay completion is detected, eliminating the race between stale replay notifications and new-turn output. Also adds Store::get_session_messages() to the acp-client Store trait (with a default empty-vec impl for NoOpStore) and implements it for the Tauri Store. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix finalize_current() return value being discarded in the Replaying phase, so replay completion is signalled via content matching instead of always falling through to the 1s idle timeout - Remove all [msg-order] and [replay-guard] diagnostic logs added during debugging across writer, session runner, store, driver, simple driver, and frontend - Extract duplicate strip_code_fences to acp-client and import in writer - Guard transition_to_live() with is_resuming check in both drivers to avoid misleading transitions on non-resuming sessions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Add 10s absolute timeout to replay-wait loop to prevent indefinite hang when server sends zero replay notifications (e.g. remote session was garbage-collected but local DB still has messages) - Remove unused _content parameter from ReplayBuffer::try_match — matching is role-based only, so callers no longer compute content for matching (strip_code_fences, sanitize_title) - Log warning on get_session_messages failure instead of silently returning empty vec via unwrap_or_default - Drop phase mutex before writer calls in Live phase to avoid holding it across await points, matching the pattern used by the simple driver Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

matt2e requested review from baxen and wesbillman as code owners March 20, 2026 06:38

chatgpt-codex-connector bot reviewed Mar 20, 2026

View reviewed changes

matt2e and others added 8 commits March 20, 2026 19:17

chore: promote debug logs to info level for easier diagnosis

10b21c7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

matt2e force-pushed the session-chat-out-of-order branch from 72a0511 to b371963 Compare March 20, 2026 08:18

matt2e merged commit f1369af into main Mar 20, 2026
6 checks passed

matt2e deleted the session-chat-out-of-order branch March 20, 2026 08:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(acp): replace timing-based replay guard with DB-matching state machine#431

fix(acp): replace timing-based replay guard with DB-matching state machine#431
matt2e merged 8 commits intomainfrom
session-chat-out-of-order

matt2e commented Mar 20, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

matt2e commented Mar 20, 2026

Summary

Test plan

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant