
[Upstream PR #256] fix(server): eliminate multi-session protocol corruption #173

@quangdang46


Mirrored from upstream 1jehuang/jcode Pull Request #256 by @Zephyr709
Original state: open
Created: 2026-05-21T14:13:20Z · Updated: 2026-05-21T14:13:20Z
Diff: https://github.com/1jehuang/jcode/pull/256.diff
This issue is an auto-mirrored copy. Comments and edits here are local to quangdang46/jcode — do not expect them to propagate upstream.


Summary

Fixes mass session crashes caused by multi-session protocol corruption on the shared-server when multiple clients (swarm + multi-window) are attached. The symptom is RemoteConnection::next_event: protocol error=expected value at line 1 column 1, followed by Remote protocol error is not retryable; stopping reconnect loop. Together these take down every attached TUI within ~100ms.

Root cause

Several server code paths sent through the singular fallback member.event_tx directly instead of fanning out to all member.event_txs. In addition, register_session_event_sender, unregister_session_event_sender, and fanout_session_event silently overwrote member.event_tx to point at whichever connection's writer happened to be touched last. With multiple attached clients, a send intended for one client's writer could land on another client's writer mid-line, splicing event tails into unrelated frames and crashing every session on the shared-server.

Changes

Commit 1 (266e8759) — server fix (root cause)

  • register_session_event_sender: only adopt new sender as singular fallback when existing fallback is closed.
  • unregister_session_event_sender: do not silently re-point member.event_tx to a surviving connection.
  • fanout_session_event: snapshot all live attachments without mutating the singular fallback.
  • comm_plan / comm_control / debug_swarm_write / swarm / client_session: route via super::fanout_session_event instead of direct member.event_tx.send, dropping read locks before fanout acquires the write lock.
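The registration and fanout rules in Commit 1 can be sketched roughly as follows. This is a simplified model, not the code from the PR: EventTx is a hypothetical in-memory stand-in for the real per-connection sender, connection ids are assumed to be u64, and locking is omitted, so only the function names and the event_tx / event_txs fields come from the description above.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a per-connection event sender.
#[derive(Clone, Default)]
pub struct EventTx {
    closed: Arc<AtomicBool>,
    sent: Arc<Mutex<Vec<String>>>,
}

impl EventTx {
    pub fn is_closed(&self) -> bool {
        self.closed.load(Ordering::SeqCst)
    }
    pub fn close(&self) {
        self.closed.store(true, Ordering::SeqCst)
    }
    /// Returns false instead of delivering when the connection is closed.
    pub fn send(&self, event: &str) -> bool {
        if self.is_closed() {
            return false;
        }
        self.sent.lock().unwrap().push(event.to_string());
        true
    }
    pub fn received(&self) -> Vec<String> {
        self.sent.lock().unwrap().clone()
    }
    /// True when both handles refer to the same underlying connection.
    pub fn same_channel(&self, other: &EventTx) -> bool {
        Arc::ptr_eq(&self.closed, &other.closed)
    }
}

#[derive(Default)]
pub struct Member {
    pub event_tx: Option<EventTx>,        // singular fallback
    pub event_txs: HashMap<u64, EventTx>, // every attachment, keyed by connection id
}

/// Adopt the new sender as fallback only if the current fallback is missing or closed.
pub fn register_session_event_sender(member: &mut Member, conn_id: u64, tx: EventTx) {
    let fallback_dead = member.event_tx.as_ref().map_or(true, |t| t.is_closed());
    if fallback_dead {
        member.event_tx = Some(tx.clone());
    }
    member.event_txs.insert(conn_id, tx);
}

/// Drop the attachment; the fallback is deliberately not re-pointed.
pub fn unregister_session_event_sender(member: &mut Member, conn_id: u64) {
    member.event_txs.remove(&conn_id);
}

/// Snapshot live attachments and send to each one; the fallback is never mutated.
/// Returns the number of attachments that accepted the event.
pub fn fanout_session_event(member: &Member, event: &str) -> usize {
    let live: Vec<EventTx> = member
        .event_txs
        .values()
        .filter(|t| !t.is_closed())
        .cloned()
        .collect();
    live.iter().filter(|t| t.send(event)).count()
}
```

In this model, registering a second connection leaves event_tx pointing at the first, and each attached connection receives its own copy of every event through its own sender.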

Commit 2 (d3e3b753) — protocol resilience (defense in depth)

  • encode_event (in jcode-protocol) refuses to emit a JSON frame containing raw newlines. If serialization ever produces one (custom Display impls, hand-built JSON), log the kind and strip the byte instead of shipping a frame that would split into two on the receiver and crash every attached client.
  • RemoteConnection::next_event no longer treats a single malformed frame as fatal. It logs a truncated preview and resyncs at the next newline, giving up only after 16 consecutive corrupt frames to avoid busy looping.
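The two resilience behaviors in Commit 2 can be illustrated with a minimal sketch. Everything here is hypothetical except the behavior described above: encode_frame and read_frames are illustrative names, not the jcode-protocol API. looks_like_json_object is a placeholder for real JSON parsing, the 40-character preview length is arbitrary, and treating the 16th consecutive bad frame as the give-up point is one reading of "after 16 consecutive corrupt frames".

```rust
/// Assumed limit, mirroring the 16 consecutive corrupt frames mentioned above.
pub const MAX_CONSECUTIVE_MALFORMED: usize = 16;

/// Sender side: strip raw newlines from an already-serialized payload,
/// then append the single newline that terminates the frame.
pub fn encode_frame(json: &str) -> String {
    if json.contains('\n') {
        eprintln!("serialized event contained a raw newline; stripping it");
    }
    let mut frame: String = json.chars().filter(|c| *c != '\n').collect();
    frame.push('\n');
    frame
}

/// Placeholder validity check standing in for a real JSON parser.
fn looks_like_json_object(line: &str) -> bool {
    let t = line.trim();
    t.starts_with('{') && t.ends_with('}')
}

/// Receiver side: skip malformed lines and resync at the next newline,
/// failing only after too many consecutive malformed frames.
pub fn read_frames(stream: &str) -> Result<Vec<String>, String> {
    let mut frames = Vec::new();
    let mut consecutive_malformed = 0usize;
    for line in stream.split('\n').filter(|l| !l.is_empty()) {
        if looks_like_json_object(line) {
            consecutive_malformed = 0; // a good frame resets the counter
            frames.push(line.to_string());
            continue;
        }
        consecutive_malformed += 1;
        let preview: String = line.chars().take(40).collect();
        eprintln!(
            "malformed frame (consecutive_malformed={}): {}",
            consecutive_malformed, preview
        );
        if consecutive_malformed >= MAX_CONSECUTIVE_MALFORMED {
            return Err(format!(
                "{} consecutive malformed frames; giving up",
                consecutive_malformed
            ));
        }
    }
    Ok(frames)
}
```

With this shape, a single spliced line between two valid frames costs one log entry and is skipped, while a stream that is corrupt throughout still terminates instead of looping forever.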

Tests

  • 5 new regression tests in src/server/state.rs::multi_connection_protocol_tests covering register/unregister/fanout semantics that previously caused cross-connection writer corruption.
  • Full suite: 188 passed, 0 failed (cargo test --lib -p jcode server::).

Validation in production

Deployed binary at ~/.jcode/builds/versions/d3e3b753-protocol-resilience/jcode (running as shared-server pid 62439 since 11:33:39 today).

  • Pre-fix log (old binary): mass session crashes at 10:41, 11:14:51, 11:14:54 with Remote protocol error is not retryable; stopping reconnect loop.
  • Post-fix log (new binary, ~9 min uptime so far): 0 not retryable errors, 0 session teardowns. One transient bad frame logged at 11:40:41 with consecutive_malformed=1 — exactly what the resilience commit is supposed to do (log + resync, no crash). The session that hit the bad frame (session_bird) is still attached and alive.

Risk / rollback

  • The server fix changes registration semantics: the singular event_tx is no longer overwritten on each new connection. Any code that depended on the overwrite behavior would now see stale senders; the audit and 5 regression tests cover all known call sites.
  • Rollback: revert the two commits. The prior binary at ~/.jcode/builds/versions/266e8759-fix-multi-session-protocol/jcode is still on disk.
