Stop keystrokes stalling on PTY input-stream open after re-attach #40
Typing into an agent could appear completely dead after a window reload, app restart, or broker churn. Every keystroke went through the WS PTY input stream via getOrOpenInputStream + `await stream.send(data)`. The SDK's PtyInputStream.send() queues until the broker acks `pty_input_ready`, with a 10s open timeout, so when the stream couldn't open promptly (common right after a re-attach while the broker is still coming up), each keystroke blocked up to 10s before falling back to HTTP. sendInputFireAndForget swallowed the failures with only a console.warn, so the UI showed nothing. And because non-404 open failures never set the permanent fallback, a doomed WS was re-opened over and over.

Make HTTP the reliable default and the WS stream a latency optimization that only carries input once it is confirmed ready:

- ensureInputStream (replaces getOrOpenInputStream) opens the WS in the background and never awaits the handshake; it reports whether the stream is ready. sendInput only awaits stream.send when ready, otherwise sends over HTTP immediately, so no keystroke ever waits on the open timeout.
- Track readiness (inputStreamReady) and consecutive open failures (inputStreamOpenFailures). After MAX_INPUT_STREAM_OPEN_FAILURES, stop retrying the WS for that agent and ride HTTP until it re-attaches.
- attachTerminal clears the HTTP-only fallback so a re-attach gives the WS a fresh chance instead of being stuck on HTTP for the agent's lifetime.
- Clean up all three per-agent maps/sets on close and per project.

Net effect: typing works instantly on re-attach (over HTTP), then transparently upgrades to the low-latency WS stream once it opens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 Walkthrough
Changes: PTY Input Stream Non-Blocking Handling
Sequence Diagram
sequenceDiagram
participant Client
participant sendInput
participant ensureInputStream
participant WS as WS Stream
participant HTTP as HTTP Fallback
Client->>sendInput: input to send
sendInput->>ensureInputStream: get stream and ready status
ensureInputStream-->>sendInput: stream, ready=false or true
alt ready is true
sendInput->>WS: stream.send(input)
WS-->>sendInput: ack or error
alt send error
sendInput->>sendInput: close stream, mark fallback for unsupported
end
else ready is false
sendInput->>HTTP: POST input (fallback path)
HTTP-->>sendInput: ack
end
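The "send error" branch in the diagram can be sketched as a small state transition. This is only an illustration: `AgentInputState`, `StreamError`, and `onStreamSendError` are hypothetical names, and treating a 404 status as the "unsupported" signal is an assumption based on the commit message's mention of 404 open failures.

```typescript
// Hypothetical error shape: assumes a failed send carries an HTTP-like status.
interface StreamError {
  status?: number;
}

// Hypothetical per-agent state; the PR tracks this in separate maps/sets.
interface AgentInputState {
  streamOpen: boolean; // a WS input stream object currently exists
  ready: boolean; // the open handshake has completed
  httpOnly: boolean; // permanent HTTP fallback for this agent
}

// On a send error the stream is closed either way; only an "unsupported"
// error (assumed here to be a 404) marks the permanent HTTP fallback.
function onStreamSendError(
  state: AgentInputState,
  err: StreamError,
): AgentInputState {
  const unsupported = err.status === 404;
  return {
    streamOpen: false,
    ready: false,
    httpOnly: state.httpOnly || unsupported,
  };
}
```

Under these assumptions, a transient error leaves the agent eligible for a new WS open on the next keystroke, while an unsupported endpoint keeps it on HTTP.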
🎯 4 (Complex) | ⏱️ ~45 minutes
Addresses review feedback: when an agent's WS input stream repeatedly fails to open and we permanently fall back to HTTP, log it once. Transient not-ready (the normal startup/re-attach window) stays silent and is handled by HTTP; only a genuinely stuck stream, where the latency fast path is off for the agent's lifetime, is surfaced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Problem
Typing into an agent (e.g. `claude-1`) could appear completely dead after a window reload, app restart, or broker churn, with no error in the UI. Reproduced live: after a dev-server restart, keystrokes to a healthy agent (PTY process confirmed alive) went nowhere; a `Cmd+R` "fixed" it only because by then the broker was fully up.
Root cause
Every keystroke went `use-terminal → sendInputFast → broker.sendInput → getOrOpenInputStream → await stream.send(data)`. The SDK's `PtyInputStream.send()` queues and `flush()` awaits the open handshake (`pty_input_ready`) with a 10s open timeout. So when the WS stream couldn't open promptly (common right after a re-attach while the broker is still coming up), each keystroke blocked up to 10s before falling back to HTTP, and `sendInputFireAndForget` swallowed the failure with a `console.warn`.
Fix
The WS input stream stays the primary, fast path whenever it is open — i.e. the entire steady-state typing case is unchanged, and there is no latency regression. The only change is to the brief window where the stream is not yet open: instead of blocking the keystroke on the up-to-10s handshake, send it over HTTP (the broker's other, always-available input endpoint) and switch back to the WS stream the instant it's ready.
- `ensureInputStream` (replaces `getOrOpenInputStream`) opens the WS in the background, never awaits the handshake, and reports a `ready` flag.
- `sendInput` uses the WS stream when `ready` (the normal case); only the not-yet-open window falls through to HTTP, so a keystroke never waits on the open timeout.
- After `MAX_INPUT_STREAM_OPEN_FAILURES` (3) consecutive open failures, it rides HTTP for that agent and logs it (a persistently dead fast path is surfaced, not hidden) until the terminal re-attaches.
- `attachTerminal` clears the fallback so a re-attach gives the WS a fresh chance.

Net: WS remains primary; the sub-second "stream still opening" window no longer freezes typing.
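A minimal sketch of the flow described above, assuming the stream reports open success or failure through callbacks that fire after `open` returns. `InputRouter`, `InputStream`, `OpenStream`, `HttpSend`, and the `httpOnly` set are hypothetical names for illustration; the actual `broker.ts` code and SDK types may be structured differently.

```typescript
// Hypothetical shapes for illustration only; not the real SDK types.
interface InputStream {
  send(data: string): Promise<void>;
  close(): void;
}
type OpenStream = (
  agent: string,
  onReady: () => void,
  onFail: () => void,
) => InputStream;
type HttpSend = (agent: string, data: string) => Promise<void>;

const MAX_INPUT_STREAM_OPEN_FAILURES = 3;

class InputRouter {
  private streams = new Map<string, InputStream>();
  private inputStreamReady = new Set<string>();
  private inputStreamOpenFailures = new Map<string, number>();
  private httpOnly = new Set<string>(); // assumed name for the fallback set

  constructor(private open: OpenStream, private http: HttpSend) {}

  // Starts the WS open in the background if needed and returns at once.
  // It never waits for the handshake; it only reports current readiness.
  ensureInputStream(agent: string): { stream?: InputStream; ready: boolean } {
    if (this.httpOnly.has(agent)) return { ready: false };
    let stream = this.streams.get(agent);
    if (!stream) {
      stream = this.open(
        agent,
        () => {
          this.inputStreamReady.add(agent);
          this.inputStreamOpenFailures.delete(agent);
        },
        () => this.onOpenFailed(agent),
      );
      this.streams.set(agent, stream);
    }
    return { stream, ready: this.inputStreamReady.has(agent) };
  }

  private onOpenFailed(agent: string): void {
    this.streams.get(agent)?.close();
    this.streams.delete(agent);
    this.inputStreamReady.delete(agent);
    const failures = (this.inputStreamOpenFailures.get(agent) ?? 0) + 1;
    this.inputStreamOpenFailures.set(agent, failures);
    if (failures >= MAX_INPUT_STREAM_OPEN_FAILURES) {
      // Stop retrying the WS for this agent until it re-attaches.
      // No further opens happen after this, so the warning is logged once.
      this.httpOnly.add(agent);
      console.warn(`input stream for ${agent} failed to open; using HTTP`);
    }
  }

  // A re-attach clears the fallback so the WS gets a fresh chance.
  attachTerminal(agent: string): void {
    this.httpOnly.delete(agent);
    this.inputStreamOpenFailures.delete(agent);
  }

  // WS when the stream is confirmed ready, HTTP otherwise.
  async sendInput(agent: string, data: string): Promise<"ws" | "http"> {
    const { stream, ready } = this.ensureInputStream(agent);
    if (ready && stream) {
      await stream.send(data);
      return "ws";
    }
    await this.http(agent, data);
    return "http";
  }
}
```

In this sketch the transport is chosen synchronously from the `ready` flag, so a keystroke sent while the stream is still opening goes straight to HTTP instead of queueing behind the handshake.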
Testing
- `npm test` → 10/10 pass
- `tsc -p tsconfig.node.json` → no new errors (`broker.ts` clean; 36 pre-existing elsewhere, identical on `main`)
- `claude-1` input died after a restart while its PTY was confirmed alive.
Note
Independent of #38/#39 (renderer memory/CPU); this is the main-process input path.
🤖 Generated with Claude Code