fix(transport/websocket): refuse reconnect on evicted session instead of silently spawning new mux (fixes #354)#360

Merged
Kiryuumaru merged 1 commit into master from fix/354-channelclosed-no-fallthrough on May 22, 2026
@Kiryuumaru
Owner

Summary

Fixes #354.

WebSocketMuxListener.HandleAsync's ChannelClosedException catch (added by #279 for the eviction race) fell through to the fresh-session creation branch. This silently bound the reconnecting client's pair to a brand-new mux with a brand-new SessionId, causing the client's reconnect handshake to fault terminally with SessionMismatch — the exact defect #236 fixed and #279 unintentionally re-surfaced.

Trace

  1. Mux X is active. _sessions[X] = entry.
  2. Client reconnect #1 → HandleAsync(sessionId=X) → WriteAsync(pair1) succeeds (channel cap 1).
  3. Mux X has not yet consumed pair1 when reconnect #2 arrives → WriteAsync(pair2) blocks (channel full).
  4. Mux X's transport faults → Disconnected → evictHandler → TryComplete() on the channel.
  5. Blocked WriteAsync throws ChannelClosedException.
  6. Pre-fix: catch falls through, creates a new mux Y with SessionId = Y ≠ X, wires pair2 into Y's StreamFactory. Client's handshake then claims session X against mux Y → MultiplexerException(SessionMismatch) → client's mux is poisoned for life.
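
Steps 2 to 5 of the trace rest on standard System.Threading.Channels behavior, which can be reproduced in isolation. The following is a minimal sketch, not the listener's code: the string channel and the names `pair1`/`pair2` are stand-ins for the per-session connection channel and the stream pairs, and a capacity of 1 is assumed from the trace.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Stand-in for the per-session channel (capacity 1, as in the trace).
var channel = Channel.CreateBounded<string>(new BoundedChannelOptions(1)
{
    FullMode = BoundedChannelFullMode.Wait
});

// Reconnect #1: the channel has room, so the write completes immediately.
await channel.Writer.WriteAsync("pair1");

// Reconnect #2: the channel is full and nothing reads, so the write stays pending.
Task blockedWrite = channel.Writer.WriteAsync("pair2").AsTask();
bool blockedBeforeEviction = !blockedWrite.IsCompleted;

// Eviction: completing the writer fails every writer that is still blocked.
channel.Writer.TryComplete();

bool sawChannelClosed = false;
try
{
    await blockedWrite;
}
catch (ChannelClosedException)
{
    sawChannelClosed = true;
}

Console.WriteLine($"blocked before eviction: {blockedBeforeEviction}");
Console.WriteLine($"ChannelClosedException after eviction: {sawChannelClosed}");
```

Both flags should come out true, which is the state in which the catch block in HandleAsync runs.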

Fix

Mirror the unknown-session branch in the ChannelClosedException catch:

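catch (ChannelClosedException)
{
    // The session was evicted while this write was blocked. Refuse the
    // reconnect instead of falling through to fresh-session creation.
    try
    {
        await webSocket.CloseOutputAsync(
            WebSocketCloseStatus.PolicyViolation,
            "Session has been evicted.",
            cancellationToken).ConfigureAwait(false);
    }
    catch { /* peer may be gone */ }
    // Best-effort cleanup of the pair that never reached a mux.
    try { await pair.DisposeAsync().ConfigureAwait(false); } catch { }
    return;
}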

The client's next reconnect attempt will observe the session is truly gone and can fall back to a fresh sessionId = null connection (or surface the eviction to the application).
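
One way a client could act on the refusal is sketched below. This is an assumption about client-side policy, not NetConduit's client API: `NextSessionId` is a hypothetical helper, and treating every PolicyViolation close as "session gone" is a simplification.

```csharp
#nullable enable
using System;
using System.Net.WebSockets;

// Hypothetical helper: picks the session id for the next connection attempt.
// A PolicyViolation close is taken to mean the server no longer holds the
// session, so the client drops the id and starts fresh (sessionId = null).
// Any other outcome keeps the id so a normal reconnect can resume it.
string? NextSessionId(WebSocketCloseStatus? closeStatus, string? currentSessionId)
{
    return closeStatus == WebSocketCloseStatus.PolicyViolation
        ? null
        : currentSessionId;
}

string? afterEviction = NextSessionId(WebSocketCloseStatus.PolicyViolation, "X");
string? afterNetworkDrop = NextSessionId(null, "X");

Console.WriteLine($"after eviction: {afterEviction ?? "(fresh session)"}");
Console.WriteLine($"after network drop: {afterNetworkDrop ?? "(fresh session)"}");
```

An application that prefers to surface the eviction rather than reconnect silently could raise an event at the same decision point.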

Tests

New: tests/NetConduit.Transport.WebSocket.IntegrationTests/Issue354ChannelClosedReconnectTests.cs

ReconnectBlockedOnFullChannel_SessionEvicted_RefusesInsteadOfNewMux:

  1. Spins up a Kestrel WebSocket endpoint.
  2. Opens a first WS, hands it into the listener as a new session; no consumer drains the mux so its bounded ConnectionChannel stays full.
  3. Opens a second WS, calls HandleAsync(sessionId=X); blocks on WriteAsync.
  4. Calls listener.RemoveSession(X) to evict the session, completing the channel writer.
  5. Asserts HandleAsync#2 returns cleanly within 5s, the client observes a WebSocketCloseStatus.PolicyViolation close frame, and the internal _sessions dictionary is empty (no orphan mux Y was registered).

Verification:

  • Without the fix: test fails with TimeoutException at the 5s WaitAsync; HandleAsync#2 is hanging inside the spawned mux Y's completion.Task.WaitAsync.
  • With the fix: passes (~3.6s).
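
The pass/fail distinction above depends on how Task.WaitAsync reports a timeout. A minimal sketch of that assertion pattern follows; `ReturnsWithinAsync` is a hypothetical helper and the 200 ms timeout is chosen only to keep the example fast, whereas the test uses 5s.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: true if the task finishes within the timeout,
// false if Task.WaitAsync gives up with a TimeoutException.
async Task<bool> ReturnsWithinAsync(Task work, TimeSpan timeout)
{
    try
    {
        await work.WaitAsync(timeout);
        return true;
    }
    catch (TimeoutException)
    {
        return false;
    }
}

// Pre-fix shape: the handler never returns because it waits on the orphan mux.
Task hangingHandler = new TaskCompletionSource().Task;
bool preFixReturned = await ReturnsWithinAsync(hangingHandler, TimeSpan.FromMilliseconds(200));

// Post-fix shape: the handler returns right after sending the close frame.
bool postFixReturned = await ReturnsWithinAsync(Task.CompletedTask, TimeSpan.FromMilliseconds(200));

Console.WriteLine($"pre-fix handler returned in time: {preFixReturned}");
Console.WriteLine($"post-fix handler returned in time: {postFixReturned}");
```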

Full NetConduit.Transport.WebSocket.IntegrationTests suite: 18/18 pass.
Full solution build (net8/9/10): clean.

Related

… of silently spawning new mux (fixes #354)

HandleAsync's ChannelClosedException catch (added by #279 for the eviction race) fell through to the fresh-session creation branch. When a client reconnect with a known sessionId X was blocked on a full ConnectionChannel and the session was evicted before the blocked WriteAsync drained, the catch ran and the pair was bound to a brand-new mux with SessionId Y. The client's reconnect handshake then faulted terminally with SessionMismatch — the exact defect #236 fixed and #279 unintentionally re-surfaced.

Mirror the unknown-session branch: send a PolicyViolation close, dispose the pair, return. The client can then decide to start fresh with sessionId=null.

Regression test creates a session, fills its bounded ConnectionChannel, starts a second HandleAsync(sessionId=X) that blocks on WriteAsync, evicts the session, and asserts (a) HandleAsync returns cleanly, (b) the client observes a PolicyViolation close, (c) no orphan mux is registered in _sessions. Without the fix the test times out waiting on the spawned mux's completion; with the fix it passes.
@Kiryuumaru Kiryuumaru merged commit f5e1647 into master May 22, 2026
14 checks passed
@Kiryuumaru Kiryuumaru deleted the fix/354-channelclosed-no-fallthrough branch May 22, 2026 11:02