Skip to content

fix: queue messages during session creation to prevent silent loss#287

Merged
PureWeen merged 2 commits intomainfrom
fix/session-mobile-new-sessions-opens-empty-20260305-1436
Mar 5, 2026
Merged

fix: queue messages during session creation to prevent silent loss#287
PureWeen merged 2 commits intomainfrom
fix/session-mobile-new-sessions-opens-empty-20260305-1436

Conversation

@StephaneDelcroix
Copy link
Copy Markdown
Collaborator

Bug

New sessions open empty with no dialog — user's first message is silently lost.

Root Cause

Race condition between optimistic session creation and user interaction:

  1. CreateSessionAsync adds session to UI immediately with IsCreating = true
  2. SDK session creation is awaited (takes seconds)
  3. During this window, user can type and send messages
  4. SendPromptAsync throws InvalidOperationException ("Session is still being created")
  5. Error is silently swallowed in fire-and-forget ContinueWith
  6. Input text is already cleared before send — message is lost
  7. User sees empty session with no dialog

Secondary: ChatDatabase.AddMessageAsync crash in crash.log due to missing fields in ChatMessageEntity.

Fix

  1. Queue messages when IsCreating is true (Dashboard.razor) — same pattern as existing IsProcessing queue logic
  2. Drain queue after session creation completes (CopilotService.cs) — sends first queued message once SDK session is ready
  3. Show error feedback in chat (Dashboard.razor) — display errors as error messages in chat instead of only logging to console
  4. Add missing DB fields (ChatDatabase.cs) — ToolInput, ImagePath, ImageDataUri, Caption added to ChatMessageEntity with round-trip support

Tests

  • 13 new tests in NewSessionEmptyDialogTests.cs: IsCreating state, queue behavior, round-trip entity conversion, error messages
  • 2 new integration tests in ChatDatabaseResilienceTests.cs: DB round-trip for ToolInput and Image fields
  • All 1980 existing tests continue to pass

When a new session is created, the SDK session creation is async. During
this window, IsCreating=true and SendPromptAsync rejects sends. The UI
(Dashboard.SendFromCard) was not checking IsCreating before sending,
causing the user's first message to be silently discarded — the input
was cleared but no message appeared in chat ('opens empty, no dialog').

Fixes:
1. Queue messages when IsCreating is true (same pattern as IsProcessing)
2. Drain queued messages after session creation completes
3. Show error feedback in chat when SendPromptAsync fails (instead of
   only logging to console)
4. Add missing fields to ChatMessageEntity (ToolInput, ImagePath,
   ImageDataUri, Caption) — fixes crash log in AddMessageAsync and
   prevents data loss on DB round-trip

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner

PureWeen commented Mar 5, 2026

PR #287 Review — "fix: queue messages during session creation to prevent silent loss"

CI Status: ⚠️ No checks reported (branch has no CI run)
Prior reviews/comments: None (first review)


Consensus Findings (flagged by 3+ of 5 models)


🔴 CRITICAL — CopilotService.cs:~1662 — Queued message dequeued before send; silently lost on failure

var nextPrompt = state.Info.MessageQueue[0];
state.Info.MessageQueue.RemoveAt(0);   // ← removed BEFORE send
// ...
_ = SendPromptAsync(name, nextPrompt, ...)
    .ContinueWith(t =>
    {
        if (t.IsFaulted)
            Debug($"[CREATE] Failed to send queued message...");  // ← Debug only, no re-queue, no UI
    }, TaskContinuationOptions.OnlyOnFaulted);

The drain removes the message from the queue before calling SendPromptAsync. If SendPromptAsync faults (e.g., due to a transient network error, session ID race, or SDK rejection), the ContinueWith handler only writes a Debug() message. The user's first message — exactly the one this PR is trying to preserve — is permanently and silently lost. This is the same class of bug the PR fixes, but in the new drain path.

Fix: On drain failure, add an ErrorMessage to the session's History and call InvokeOnUI (consistent with the Dashboard's ContinueWith error path). Optionally re-queue the message before attempting send and remove it only on success.

Flagged by 5/5 models.


🟡 MODERATE — CopilotService.cs:~1657 — Thread safety: MessageQueue is List<string> mutated concurrently

MessageQueue is a plain List<string>. EnqueueMessage mutates it from the Blazor/UI thread. The drain code in CreateSessionAsync runs on the async continuation after await _client.CreateSessionAsync(...) — which, if the SynchronizationContext is not captured (e.g., ConfigureAwait(false) anywhere in the call chain), runs on a thread-pool thread. Concurrent Add/RemoveAt/Count with no synchronization can corrupt the list or throw ArgumentOutOfRangeException.

Fix: Either document that CreateSessionAsync must always resume on the UI thread (and enforce it), or change to ConcurrentQueue<string> / add a lock.

Flagged by 4/5 models.


🟡 MODERATE — Dashboard.razor:~1325 — Synchronous catch block adds error to History but never calls SafeRefreshAsync

catch (Exception ex)
{
    s.History.Add(ChatMessage.ErrorMessage($"Failed to send: {ex.Message}"));
    s.MessageCount = s.History.Count;
    // ← NO SafeRefreshAsync() here
}

The async ContinueWith fault handler at line ~1314 correctly calls _ = SafeRefreshAsync() after adding the error. The synchronous catch block (which catches InvalidOperationException from SendPromptAsync when the session is in a bad state) adds the same error message but never triggers a re-render. The error message sits in the history model but is invisible until an unrelated UI event causes a refresh.

Fix: Add _ = SafeRefreshAsync(); inside the catch block, mirroring the async path.

Flagged by 4/5 models.


🟡 MODERATE — CopilotService.cs:~1630 — Early-return on closed session skips drain and leaks image/mode queues

{
    try { await copilotSession.DisposeAsync(); } catch { }
    return info;   // ← skips drain, leaks _queuedImagePaths/_queuedAgentModes
}

If the session is closed while waiting for SDK creation, the drain is skipped (queued messages are silently dropped with no user notification). More importantly, the corresponding entries in _queuedImagePaths and _queuedAgentModes are not cleaned up. If a new session is subsequently created with the same name (e.g., user tries again), stale image/mode entries from the prior attempt will be dequeued first and attached to the wrong messages.

Fix: Before returning early, clear _queuedImagePaths and _queuedAgentModes for name, and optionally add a warning message to whatever UI is appropriate.

Flagged by 2/5 models (above the consensus threshold).


False Positive (Codex only — NOT an issue)

  • SQLite schema migration: Codex flagged that new columns (ToolInput, ImagePath, Caption) wouldn't be added to existing DBs. This is not a real issuesqlite-net-pcl v1.9.172's CreateTableAsync automatically issues ALTER TABLE ADD COLUMN for new properties on existing tables. No manual migration needed.

Test Coverage Assessment

The 13 new tests in NewSessionEmptyDialogTests.cs cover entity round-trips and basic queue mechanics but have a key gap:

  • No test for drain failure path: There is no test verifying that if SendPromptAsync fails during drain, the user gets a visible error (not just a silent Debug() log). This is the gap that corresponds to the critical finding above.
  • No test for N>1 messages queued: MessageQueue_PreservesOrderDuringIsCreating verifies 3 messages are queued, but there is no test verifying the behavior after drain — that message [0] is sent and messages [1, 2] remain in queue for manual send.

Summary

# Severity Location Finding
1 🔴 CRITICAL CopilotService.cs:~1662 Queued message removed before send; silently lost on failure
2 🟡 MODERATE CopilotService.cs:~1657 MessageQueue (List) mutated concurrently without lock
3 🟡 MODERATE Dashboard.razor:~1325 Sync catch block missing SafeRefreshAsync() call
4 🟡 MODERATE CopilotService.cs:~1630 Early-return skips queue cleanup → stale image/mode entries leak

Recommended action: ⚠️ Request changes — Finding #1 is a critical regression: the drain path can silently lose the user's queued message, which is exactly the bug this PR is supposed to fix. Findings #2–4 are moderate but straightforward to address. The architecture of the fix is sound; these are implementation gaps.


Review by 5-model parallel dispatch (claude-opus-4.6 ×2, claude-sonnet-4.6, gemini-3-pro-preview, gpt-5.3-codex) with 2+ model consensus filter.

@PureWeen
Copy link
Copy Markdown
Owner

PureWeen commented Mar 5, 2026

PR Review Squad — PR #287 Round 1

CI Status: ⚠️ No checks reported

🔴 CRITICAL (5/5 models)

CopilotService.cs:~1662 — Message dequeued (RemoveAt(0)) before SendPromptAsync fires. On failure, ContinueWith only calls Debug(...) — no re-queue, no UI error, no History entry. The users queued message is permanently and silently lost. The existing drain in CompleteResponse (Events.cs:947-976) re-queues on error; this new path does not.

🟡 MODERATE

  1. CopilotService.cs:~1660 (4/5) — MessageQueue is List<string> accessed concurrently: EnqueueMessage mutates on UI thread while drain runs on post-await continuation. No synchronization — race can corrupt the list or throw ArgumentOutOfRangeException.

  2. Dashboard.razor:~1330 (4/5) — Synchronous catch block adds error to History but omits _ = SafeRefreshAsync(). The async ContinueWith error path correctly calls it. Sync exceptions produce invisible error messages.

  3. CopilotService.cs:~1629-1634 (2/5) — Early-return (session closed during creation) skips drain without cleaning up _queuedImagePaths/_queuedAgentModes. Stale entries leak until ReconnectAsync.

Recommended Action: ⚠️ Request changes

The critical drain-before-send bug defeats the PRs purpose. Match the CompleteResponse drain pattern: send first, re-queue on failure, surface errors to History.

…return, add SafeRefreshAsync to catch

- Drain ContinueWith now calls InvokeOnUI to add ChatMessage.ErrorMessage to history
  and fire OnStateChanged when SendPromptAsync faults, instead of silent Debug() only
- Early-return path (session closed during creation) now clears MessageQueue,
  _queuedImagePaths, and _queuedAgentModes before disposing to prevent stale-entry leaks
- Dashboard.razor sync catch block now calls SafeRefreshAsync() so error messages
  added to History are immediately visible in the UI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen
Copy link
Copy Markdown
Owner

PureWeen commented Mar 5, 2026

PR #287 Re-Review (Round 2) — All Critical and Moderate Findings Fixed

Fix commit: 563c682 pushed to fix/session-mobile-new-sessions-opens-empty-20260305-1436


Previous Findings Status (4/5 models, consensus)

Finding Status Notes
F1 🔴 CRITICAL: Drain removes message before send; silent loss on failure ✅ FIXED ContinueWith now calls InvokeOnUI to add ChatMessage.ErrorMessage to history + fire OnStateChanged
F2 🟡 MODERATE: MessageQueue (List) concurrent access ✅ N/A Confirmed by 4/4 models: in MAUI Blazor Hybrid, all continuations resume on UI thread via captured SynchronizationContext — no real concurrent access
F3 🟡 MODERATE: Dashboard.razor sync catch missing SafeRefreshAsync ✅ FIXED _ = SafeRefreshAsync(); added to catch block
F4 🟡 MODERATE: Early-return skips queue cleanup ✅ FIXED MessageQueue.Clear() + _queuedImagePaths/_queuedAgentModes TryRemove added before early-return dispose

New Finding (3/4 models: Opus1, Opus2, Sonnet)

🟢 MINOR — CopilotService.cs:~1691 — If first drain send fails, messages [1..n] have no auto-retry trigger

If 2+ messages were queued during creation and the first drain send fails: message [0] shows a visible error message, but messages [1..n] remain in the queue with no automatic processing trigger (because only fires when a turn successfully completes, not on failure). The messages are not lost — they remain visible in the queue UI and the user can manually resend — but they won't auto-send after the error.

This is a pre-existing limitation of the fire-and-forget drain pattern and is not a regression. No change required for this PR.


Recommended Action: ✅ Approve

All critical and moderate findings from Round 1 are fixed. The PR is in good shape to merge.


Re-review by 4-model parallel dispatch (claude-opus-4.6 ×2, claude-sonnet-4.6, gemini-3-pro-preview) with consensus filter.

@PureWeen PureWeen merged commit 8bb8eb9 into main Mar 5, 2026
@PureWeen PureWeen deleted the fix/session-mobile-new-sessions-opens-empty-20260305-1436 branch March 5, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants