fix: drain pending replay notifications before emitting replay_complete by matt2e · Pull Request #163 · block/goose2

matt2e · 2026-04-13T07:19:14Z

Summary

Add a wait_for_replay_drain helper that yields to the async runtime and waits for the replay event counter to stabilise before finalizing replay, fixing a race where replay_complete fired before all replay notifications were dispatched
Track replay event counts on SessionRoute via a new replay_events field, with a get_replay_event_count accessor on the dispatcher
Remove debug/perf logging from AppShell.tsx and useAcpStream.ts that was added during investigation
Add unit tests for the drain stabilisation logic including edge cases (late arrivals, runaway counter cap)
Update file size limits in check-file-sizes.mjs to account for the new code

Test plan

Unit tests for wait_for_replay_drain cover: immediate zero, stable count, spawned notifications, late arrival reset, iteration cap
Manual verification that loading a session with history correctly shows all replay messages before the "replay complete" indicator
Verify no regression in new-session flow (no replay events)

🤖 Generated with Claude Code

… creation triggers Adds [session] prefixed console.log statements at key points: - createDraftSession: when a frontend draft is created - promoteDraft: when a draft is promoted for first message send - acpPrepareSession: before/after backend session creation with trigger reason - acp:session_bound: when backend confirms session binding - setSessionAcpId: when frontend links to backend session ID - loadSessions: summary of backend sessions including empty session count - sendMessage error/abort: when message send fails after session creation These logs help diagnose empty chat sessions by tracing what triggered backend session creation and whether the subsequent message send succeeded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replaces [session] lifecycle logs with [replay] logs that trace the full session history loading pipeline: Frontend (useAcpStream/AppShell): - Log when loadSessionMessages starts with goose session ID and messageCount - Count replay events (user messages, assistant messages, total) as they arrive - Log replay_complete with event counts and buffer size - Log buffer flush with message role breakdown - Log timeout with accumulated event counts to detect stalled replays Backend (Rust dispatcher/session_ops): - Log load_session start with session ID mapping and working dir - Count replayed user/assistant messages in SessionRoute - Log load_session completion with event counts and elapsed time These logs help diagnose sessions that load with missing messages by showing exactly how many events the backend emitted vs how many the frontend received and buffered. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

After load_session().await resolves, the ACP RPC layer may still have queued notification tasks (session/update with message chunks) that haven't been processed yet on the single-threaded runtime. This caused non-deterministic message loss when loading older sessions. Yield repeatedly after load_session returns, waiting for the replay event count to stabilize, before finalizing and emitting replay_complete. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Tests all 6 exported functions from replayBuffer.ts: - ensureReplayBuffer: creation, identity, session isolation - getBufferedMessage: lookup by id, missing cases - getReplayBufferSize: unknown session, accurate count - getAndDeleteReplayBuffer: returns and removes buffer - clearReplayBuffer: removes without returning - findLatestUnpairedToolRequest: edge cases for paired/unpaired tool requests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This reverts commit dd51a72.

Extract the yield-and-wait-for-stability loop from load_session_inner into a generic wait_for_replay_drain() function that accepts an async counter callback, making it testable in isolation. Add 4 tests: - replay_drain_returns_immediately_when_count_is_zero - replay_drain_returns_stable_count (already-stable counter) - replay_drain_waits_for_spawned_notifications (tokio::spawn simulating async ACP notifications arriving over multiple yields) - replay_drain_resets_stability_on_late_arrival (counter bumps just before stability threshold, verifying the reset logic) The last two tests fail when the drain logic is replaced with a single poll (the pre-fix behavior), confirming the fix is necessary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove [session] and [replay] info-level console.log/log::info statements added during empty-chats investigation. Revert the dotenv-load justfile addition. Simplify dispatcher replay counting to a single total (only field the drain fix actually reads). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add a MAX_DRAIN_ITERATIONS (100) safety cap to wait_for_replay_drain so a runaway counter cannot spin-yield forever and freeze the UI. Log a warning when the cap is hit, returning the partial count. Strengthen the late-arrival test to assert poll_count >= 7, confirming the stability window truly resets rather than coincidentally seeing the bumped value. Add a new test for the iteration cap behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…path Merge the two consecutive routes lock acquisitions in the UserMessageChunk replay handler into a single lock scope. The first lock finalized any in-progress assistant message, and the second incremented replay_events — both on the same route entry. Combining them reduces contention and simplifies the code. Tighten the file size limit for dispatcher.rs accordingly (545 → 540).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed559e808f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-04-13T07:22:20Z

                        let mut routes = self.routes.lock().await;
                        if let Some(route) = routes.get_mut(&goose_session_id) {
                            route.replay_message_id = Some(new_id.clone());
+                            route.replay_events += 1;


Count every replay event before finalizing replay

wait_for_replay_drain waits for replay_events to stabilize, but this counter is only incremented when a new replay assistant message is created (plus user chunks), not for subsequent AgentMessageChunk/tool replay updates on the same message. That lets load_session_inner call finalize_replay and emit acp:replay_complete while replay chunks are still queued, and those late chunks can be dropped once the frontend marks the message completed (shouldTrackStreamingEvent in useAcpStream). Please increment the drain counter for every replay notification that can still mutate buffered replay state, or use a completion signal tied to the notification queue instead of message creation.

Useful? React with 👍 / 👎.

The debug logging cleanup commit (5fdab4e) accidentally removed [perf:load] and [perf:stream] console.log instrumentation by Bradley Axen and a "Failed to prepare ACP session" console.error by Taylor Ho. These were pre-existing logs, not debug logs added during investigation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

matt2e and others added 11 commits April 13, 2026 13:17

chore: enable dotenv-load in justfile

1de0162

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Revert "test: add unit tests for replay buffer module"

047ab38

This reverts commit dd51a72.

style: apply cargo fmt to dispatcher and session_ops

ed559e8

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

matt2e requested a review from wesbillman as a code owner April 13, 2026 07:19

chatgpt-codex-connector Bot reviewed Apr 13, 2026

View reviewed changes

matt2e merged commit 268eca8 into main Apr 13, 2026
8 checks passed

matt2e deleted the empty-chats-saved branch April 13, 2026 07:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: drain pending replay notifications before emitting replay_complete#163

fix: drain pending replay notifications before emitting replay_complete#163
matt2e merged 12 commits intomainfrom
empty-chats-saved

matt2e commented Apr 13, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

matt2e commented Apr 13, 2026

Summary

Test plan

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant