[codex] Harden WebSocket reconnect recovery#1864
juliusmarminge merged 3 commits into pingdotgg:main
- restart stalled reconnect timers when the retry window expires
- retry replay and snapshot recovery through transient transport errors
- preserve default websocket lifecycle tracking when adding custom handlers
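The first item can be pictured with a small sketch. The name `shouldRestartStalledReconnect` and the `waiting` phase come from this PR's summaries; the state shape, the other phase names, the `nextRetryAt` field, and `onRetryWindowElapsed` are assumptions made for illustration, not the code in WebSocketConnectionSurface.tsx.

```typescript
// Hypothetical state shape: only the "waiting" phase name appears in the PR text.
type ReconnectPhase = "idle" | "waiting" | "connecting" | "connected" | "exhausted";

interface ReconnectState {
  phase: ReconnectPhase;
  // Epoch milliseconds at which the scheduled retry was due, or null if none is scheduled.
  nextRetryAt: number | null;
}

// True when a retry was scheduled, its window has elapsed, and the connection
// is still sitting in the waiting phase. The signature is an assumption.
function shouldRestartStalledReconnect(state: ReconnectState, now: number): boolean {
  return state.phase === "waiting" && state.nextRetryAt !== null && now >= state.nextRetryAt;
}

// Hypothetical coordinator step: kick off a real reconnect attempt rather than
// moving the connection to an exhausted state.
function onRetryWindowElapsed(
  state: ReconnectState,
  now: number,
  reconnect: () => void,
): void {
  if (shouldRestartStalledReconnect(state, now)) {
    reconnect();
  }
}
```

Passing `now` and `reconnect` in as arguments keeps the check free of timers and sockets, so it can be tested in isolation.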
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.
Reviewed by Cursor Bugbot for commit 6fa364b.
Co-authored-by: codex <codex@users.noreply.github.com>
Approvability
Verdict: Approved
This PR hardens WebSocket reconnection recovery by adding retry logic for transport errors during recovery operations and fixing lifecycle handler composition. The changes are well-tested bug fixes limited to reconnection edge cases, authored by the primary maintainer of these files.

What changed
Why
Reconnect recovery could stall or lose lifecycle bookkeeping after transient websocket transport failures. This hardens the client-side recovery path so reconnect state resumes predictably instead of getting stuck.
Impact
WebSocket reconnect recovery in the web client is more reliable under disconnects, retries, and partial recovery failures.
Validation
- `bun fmt`
- `bun lint`
- `bun typecheck`
- `cd apps/web && bun run test src/components/WebSocketConnectionSurface.logic.test.ts src/environments/runtime/connection.test.ts src/rpc/wsTransport.test.ts`

Note
Harden WebSocket reconnect recovery by retrying stalled reconnects and transport errors
- Replaces `exhaustWsReconnectIfStillWaiting` with `shouldRestartStalledReconnect` in WebSocketConnectionSurface.tsx: when a scheduled retry window elapses while still in the `waiting` phase, the coordinator now triggers an actual reconnect attempt instead of exhausting the retry window.
- Wraps `replayEvents` and `getSnapshot` calls in connection.ts with `retryTransportRecoveryOperation`, which retries up to 20 times with a 250ms delay on transport connection errors.
- Adds `composeLifecycleHandlers` in protocol.ts so default WebSocket lifecycle tracking always runs alongside any custom handlers provided to `createWsRpcProtocolLayer`.

Macroscope summarized db8e216.
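A minimal sketch of the bounded retry described for `retryTransportRecoveryOperation`, assuming 20 total attempts and a fixed 250ms delay. The options object, the injectable `sleep`, and the `isTransportConnectionError` check (which matches on a hypothetical error name) are illustrative assumptions; the helper in connection.ts may be shaped differently.

```typescript
// Assumption: transport failures are recognizable by an error name. The real
// code may use a dedicated error class or tag instead.
function isTransportConnectionError(error: unknown): boolean {
  return error instanceof Error && error.name === "TransportConnectionError";
}

interface RetryOptions {
  maxAttempts?: number; // the PR summaries mention up to 20
  delayMs?: number; // the PR summaries mention 250ms
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

// Runs `operation`, retrying only on transport connection errors and giving
// up once `maxAttempts` attempts have failed. Other errors propagate at once.
async function retryTransportRecoveryOperation<T>(
  operation: () => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const { maxAttempts = 20, delayMs = 250, sleep = defaultSleep } = options;
  for (let attempt = 1; ; attempt += 1) {
    try {
      return await operation();
    } catch (error) {
      if (!isTransportConnectionError(error) || attempt >= maxAttempts) {
        throw error;
      }
      await sleep(delayMs);
    }
  }
}
```

A caller would wrap each recovery call, for example `retryTransportRecoveryOperation(() => getSnapshot())`, where the call signature of `getSnapshot` is likewise assumed here.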
Note
Medium Risk
Modifies WebSocket reconnect coordination and orchestration recovery retry behavior; mistakes could cause reconnect loops, delayed recovery, or missed state updates under flaky networks.
Overview
Hardens client WebSocket reconnect/recovery paths to avoid stalled or brittle reconnect behavior.
The reconnect coordinator now detects a stalled scheduled retry window and proactively calls `reconnect()` (via `shouldRestartStalledReconnect`) instead of forcing the connection into an exhausted state, and the old `exhaustWsReconnectIfStillWaiting` path is removed.

Orchestration snapshot/replay recovery is wrapped in a transport-error retry loop (up to 20 attempts with a short delay) so transient disconnects during resubscribe/bootstrap don't immediately fail recovery. The protocol layer also now composes custom lifecycle handlers with default connection-state tracking so user-provided handlers can't accidentally bypass reconnection bookkeeping, with tests added/updated to cover these behaviors.
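The lifecycle composition can be sketched as follows. The `LifecycleHandlers` interface, its three callbacks, and the default-then-custom call order are assumptions for illustration; only the name `composeLifecycleHandlers` and the intent (defaults always run) come from this PR.

```typescript
// Hypothetical handler shape; the real interface in protocol.ts may differ.
interface LifecycleHandlers {
  onOpen?: () => void;
  onClose?: (code: number) => void;
  onError?: (error: unknown) => void;
}

// Returns handlers that always invoke the default bookkeeping and then any
// custom handler, so supplying a custom handler cannot replace the default.
function composeLifecycleHandlers(
  defaults: LifecycleHandlers,
  custom: LifecycleHandlers = {},
): LifecycleHandlers {
  return {
    onOpen: () => {
      defaults.onOpen?.();
      custom.onOpen?.();
    },
    onClose: (code: number) => {
      defaults.onClose?.(code);
      custom.onClose?.(code);
    },
    onError: (error: unknown) => {
      defaults.onError?.(error);
      custom.onError?.(error);
    },
  };
}
```

Contrast this with an object spread such as `{ ...defaults, ...custom }`, where a custom `onClose` would silently replace the default one. That is the kind of bypass the summary says the composition prevents.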
Reviewed by Cursor Bugbot for commit db8e216. Bugbot is set up for automated code reviews on this repo.