fix: force sync bypasses streaming guard for active session#444
fix: force sync bypasses streaming guard for active session#444
Conversation
When the user clicks the sync button, ForceRefreshRemoteAsync requests full history from the server but SyncRemoteSessions skips applying it because the session is in _remoteStreamingSessions. This caused 'Already up to date' even when messages were missed during a disconnect. Now force sync directly applies the server's history for the active session, bypassing the streaming guard. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen
left a comment
There was a problem hiding this comment.
Review: PR #444 — fix: force sync bypasses streaming guard for active session
Small, focused fix that correctly addresses the root cause described in the PR: SyncRemoteSessions was blocking history replacement for sessions in _remoteStreamingSessions, causing the sync button to always show "Already up to date" even after a disconnect.
✅ What's correct
- The
lock (forceState.Info.HistoryLock)around Clear+AddRange is correct — consistent with howSyncRemoteSessionsdoes it. - Using
SessionHistories.TryGetValueas the authoritative copy source is the right approach (it holds what the server sent). - The diagnostic log is helpful for post-mortem analysis.
- Small diff, surgical change.
⚠️ Bug: force-apply can corrupt history during an active streaming session
The new block has no guard against sessions that are genuinely mid-stream (not just post-disconnect):
if (activeSessionName != null
&& _bridgeClient.SessionHistories.TryGetValue(activeSessionName, out var serverMessages)
&& _sessions.TryGetValue(activeSessionName, out var forceState))
{
lock (forceState.Info.HistoryLock)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages); // ← could be stale
}If the user hits sync while a session is actively streaming (in _remoteStreamingSessions), SessionHistories[name] holds the last pushed history snapshot from the server — which is behind the incremental content_delta events that have already been applied locally. Force-replacing history with this stale snapshot erases content the user can already see.
The streaming guard exists exactly to prevent this. Bypassing it unconditionally is unsafe.
Suggested fix: only bypass the guard when the session is NOT actively streaming, OR when the server snapshot is strictly larger than local history:
var isActivelyStreaming = _remoteStreamingSessions.ContainsKey(activeSessionName);
{
lock (forceState.Info.HistoryLock)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages);
}
forceState.Info.MessageCount = forceState.Info.History.Count;
}This mirrors the logic already in SyncRemoteSessions on line 556 — skip the guard only when the server has more messages than local.
Minor: MessageCount written outside the lock
lock (forceState.Info.HistoryLock)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages);
}
forceState.Info.MessageCount = forceState.Info.History.Count; // ← outside lockSyncRemoteSessions keeps MessageCount inside the lock. This should match for consistency (and to avoid a theoretical race where MessageCount is read between the lock release and this assignment).
Pre-existing: Task.Delay(500) still racy
Not introduced by this PR, but worth noting: if the history response from the server takes >500ms, SessionHistories.TryGetValue returns false and the force-apply block is silently skipped. The user still sees "Already up to date" and no error. This PR doesn't make it worse, but it remains the underlying fragility of this approach.
Summary
The fix is directionally correct but bypasses the streaming guard unconditionally. Add the "not streaming OR server has more" condition before merging to avoid corrupting live session history.
🔍 Squad Review — PR #444PR: fix: force sync bypasses streaming guard for active session Root Cause & Fix
The fix directly applies if (activeSessionName != null
&& _bridgeClient.SessionHistories.TryGetValue(activeSessionName, out var serverMessages)
&& _sessions.TryGetValue(activeSessionName, out var forceState))
{
lock (forceState.Info.HistoryLock)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages);
}
forceState.Info.MessageCount = forceState.Info.History.Count;
Debug($"[SYNC] Force-applied {serverMessages.Count} messages ...");
}Logic is correct. ✅ Thread Safety Analysis
Post-sync snapshot ordering: The force-apply happens before the post-sync snapshot ( Potential Edge Case (non-blocking)If the user clicks sync during active streaming (rare but possible), the force-apply clears and replaces History with the server's last-known snapshot. Content_delta events will continue arriving and appending correctly after the clear — they represent the in-progress message, not messages already in the snapshot. ✅ Verdict: ApproveSmall, well-reasoned fix. HistoryLock is correctly held (actually improves on the existing 🚢 Good to merge. |
🤖 PR #444 ReviewTitle: fix: force sync bypasses streaming guard for active session ✅ Problem Statement — Clear and Well-DefinedThe PR description clearly identifies the issue: Symptom: User clicks sync button on mobile, but sees "Already up to date" even when messages were missed during a disconnect window. Root cause: Line 556 in Why the guard exists: During active streaming (content deltas, tool events), the incrementally-built local history is more up-to-date than the server's cached history. The guard prevents overwriting real-time updates with stale server data. Why this is a problem for force sync: When a user explicitly clicks the sync button after a disconnect, they need the server's full history to recover missed messages. The automatic ✅ Solution — Surgical and CorrectThe fix adds 16 lines to // Force-apply server history for the active session, bypassing the streaming guard.
// SyncRemoteSessions skips sessions in _remoteStreamingSessions, but a user-initiated
// force sync should always replace local history with the server's authoritative copy.
if (activeSessionName != null
&& _bridgeClient.SessionHistories.TryGetValue(activeSessionName, out var serverMessages)
&& _sessions.TryGetValue(activeSessionName, out var forceState))
{
lock (forceState.Info.HistoryLock)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages);
}
forceState.Info.MessageCount = forceState.Info.History.Count;
Debug($"[SYNC] Force-applied {serverMessages.Count} messages for '{activeSessionName}' (bypassed streaming guard)");
}Why this is correct:
🟢 Minor Observations1. Interaction with streaming guard removalThe streaming guard is normally removed on Impact: None — the fix is specifically designed for this scenario (mid-turn force sync after disconnect). 2. Post-sync count may double-countAfter the force-apply block (line 672), the code snapshots Wait, let me trace this more carefully:
Are Yes!
Verdict: The count logic is correct. No issue here. 3. The 500ms delay race condition (from PR #438 Round 1)This PR doesn't address the hardcoded 500ms delay at line 656, which is still a race condition. However:
📊 Test CoveragePassing: All 2987 tests pass Is test coverage needed? The fix is a 16-line addition to an existing method. Ideally, there would be a test like: [Fact]
public async Task ForceRefreshRemoteAsync_WhenStreamingGuardActive_StillAppliesServerHistory()
{
var svc = CreateRemoteService();
await AddRemoteSession(svc, "test-session");
// Simulate streaming guard active (turn in progress)
svc._remoteStreamingSessions.TryAdd("test-session", 1);
// Local has 5 messages, server has 10 (some were missed)
AddLocalMessages(svc, "test-session", 5);
_bridgeClient.SessionHistories["test-session"] = CreateMessages(10);
var result = await svc.ForceRefreshRemoteAsync("test-session");
// Force sync should bypass guard and apply all 10 messages
var session = svc.GetSession("test-session");
Assert.Equal(10, session.History.Count);
Assert.Equal(5, result.MessageCountBefore);
Assert.Equal(10, result.MessageCountAfter);
}Impact of missing test: Low — the fix is straightforward and follows established patterns. The existing 2987 tests provide good coverage of the surrounding code. A targeted test would be nice-to-have for regression protection, but not blocking. 🎯 Final Verdict✅ APPROVE — Clean, focused fix that solves a real user-facing problem. Strengths:
Recommendations for follow-up:
Recommendation: Ready to merge as-is. Test coverage would be ideal but is not blocking given the simplicity and clarity of the fix. 📈 ContextThis PR builds on the sync button feature from PR #438 (which added PR #438 history:
PR #444: Fixes the streaming guard bypass for force sync (this review) The incremental fix approach shows good engineering discipline — each PR addresses a specific, well-scoped issue. |
🔍 Squad Review — PR #444 Round 1Models: claude-opus-4.6 ×2, claude-sonnet-4.6 ×2, gpt-5.3-codex 🟡 Moderate (consensus: 4/5 models)M1 — Force-apply unconditionally clears History, no count guardThe force-apply does Scenario: During active streaming, local history has 10 messages (partially built by content_delta). Server snapshot (from the last The intended scenario (force sync after disconnect) is correct — stuck sessions in Fix: Add a guard: only force-apply when M2 —
|
| # | Issue | Models |
|---|---|---|
| m1 | Task.Delay(500) still sole sync mechanism — if server responds >500ms, SessionHistories has stale data and force-apply does nothing new |
3/5 |
⚠️ Request Changes
One functional concern:
M1 — The force-apply needs a count guard or streaming-state gate to prevent truncating history during active streaming. Without it, a user clicking sync while the model is actively responding can lose in-progress content. The fix is a one-line if guard.
M2 is a one-line fix (move inside lock). M3 (tests) is strongly recommended given this code bypasses a safety guard that has required multiple fix cycles.
…side lock - Only bypass streaming guard when server has more messages than local (missed during disconnect) or session isn't actively streaming - Moved MessageCount assignment inside HistoryLock for consistency with SyncRemoteSessions Addresses review feedback on PR #444. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Replaced racy Task.Delay(500) with reference-equality polling (50ms
intervals, 3s timeout) that detects when SessionHistories is updated
by the server response
- Added SetRemoteStreamingGuardForTesting() helper for test access
- Added 3 regression tests:
- ForceSync applies server history when streaming guard active but
server has more messages (disconnect recovery)
- ForceSync skips when streaming guard active and server has fewer
messages (prevents stale overwrite)
- ForceSync always applies when not streaming
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔄 Re-Review Round 2 (4-Model Consensus)Tests: ✅ 2990/2990 (1 pre-existing flaky R1 Findings Status
New Findings🟡 Minor: Double-lock pattern is a maintenance hazard (3/4 models flagged)
lock (forceState.Info.HistoryLock)
localCount = forceState.Info.History.Count; // lock 1 released
// ← a content_delta could append here
if (!isActivelyStreaming || serverMessages.Count > localCount) // stale localCount
{
lock (forceState.Info.HistoryLock) // lock 2
{
forceState.Info.History.Clear(); // could wipe the just-appended deltaWhile the consequence is mild (streaming naturally recovers from the next delta), merging into a single lock region eliminates the race entirely and is cleaner: lock (forceState.Info.HistoryLock)
{
var localCount = forceState.Info.History.Count;
if (!isActivelyStreaming || serverMessages.Count > localCount)
{
forceState.Info.History.Clear();
forceState.Info.History.AddRange(serverMessages);
forceState.Info.MessageCount = forceState.Info.History.Count;
}
}🟢 Nit: Missing test for the original regression scenario (2/4 models)The new test // ForceSync_WhenStreamingGuardActive_SameCount_SkipsApply
// local=3, server=3 (stale snapshot, different content), streaming=true → should NOT replace
Assert.Equal(3, session.History.Count); // local preserved🟢 Nit: New tests add ~3s each to the test suite (1 model)The polling loop runs for the full 3s timeout in tests because ℹ️ Pre-existing:
|
- Merged TOCTOU double-lock into single lock region in force-apply (read localCount + decide + apply all under one HistoryLock acquisition) - Added ForceSync_WhenStreamingGuardActive_SameCount_SkipsApply test to cover the original regression scenario (same count, different content) - Made StubWsBridgeClient.RequestHistoryAsync replace the reference so polling loop breaks immediately (eliminates ~9s of idle wait in CI) - Fixed context menu: position:fixed with JS positioning, max-height with scroll for tall menus, viewport clamping on all edges - Added anti-scroll-snap on mobile (userScrolledUp flag) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔄 Re-Review Round 3 (3-Model Consensus)Tests: ✅ 2991/2991 (clean) Previous Findings Status
The core bridge fix is complete and correct. All tests pass cleanly. New Findings🔴 High:
|
- Reset __userScrolledUp and __wasAtBottom when forceScroll triggers, so streaming auto-scroll resumes after user sends a message - Add _menuPositioned guard to skip redundant positionSessionMenu JS calls on every render cycle while menu is open - Close context menu on sidebar scroll to prevent fixed-position drift Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
✅ Re-Review Round 4 (Orchestrator Direct)Tests: ✅ 2991/2991 Previous Findings Status
New FindingsNone. All R1–R3 findings fully resolved. Verdict: ✅ ApproveClean implementation. The bridge force-sync is correct (count guard + single lock), scroll tracking works (forceScroll resets user-scroll-up flag), menu positioning is efficient (one-shot + scroll-close). 4 regression tests cover the key scenarios. |
When the user clicks the sync button on mobile,
ForceRefreshRemoteAsyncrequests full history from the server butSyncRemoteSessionsskips applying it because the session is in_remoteStreamingSessions. This caused 'Already up to date' even when messages were missed during a disconnect window.Fix: Force sync now directly applies the server's authoritative history for the active session, bypassing the streaming guard.
Root cause: Line 556 in
CopilotService.Bridge.cs— the streaming guard preventsSyncRemoteSessionsfrom replacing incrementally-built history during active streaming. But when a user explicitly clicks sync after a disconnect, they need the server's full history to fill in missed messages.