fix(dev): swallow socket errors so peer disconnects don't crash dev server#911
Conversation
…erver Node throws on unhandled 'error' events on sockets. When the browser drops a connection mid-response (HMR triggers a reload while an RSC stream is still flushing), the next res.write surfaces ECONNRESET on res.socket with no listener attached and takes down `bun dev`. Attach a no-op error listener on every connection — same guard Vite's HMR server and Next.js install for the same reason. Closes #905. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit: |
|
/bigbonk review |
|
There was a problem hiding this comment.
Verified the fix and the prior art. Approving with one observation worth considering for follow-up.
Verification
Prior art is accurate. I checked the linked references against current upstream:
- Next.js
router-server.tsinstalls no-op error listeners on bothreqandresper request, plus another pair on the WebSocket upgrade socket. Same intent as this PR. - Vite's
ws.tsinstalls a logging error handler on the HMR WebSocket socket. Same pattern. - Vite's
createServerCloseFnalready attaches a'connection'listener onhttpServerto track sockets for graceful shutdown but deliberately does not install an'error'listener on them, so there's no overlap or double-handling.
Implementation is correct:
server.httpServer?.on(...)correctly handles middleware mode (wherehttpServerisnull).- The
connectionevent fires on bothhttp.ServerandHttp2SecureServer(the two members of Vite'sHttpServerunion), so HTTP/HTTPS/HTTP2 are all covered. - The error is still surfaced to writers via the normal
write()callback /'error'onreq/res— Node only throws when there are zero listeners, so adding one no-op listener neutralises the throw without hiding anything from other listeners.
Subtle difference from Next.js
Next.js attaches per-request on req and res; this PR attaches per-connection on the underlying socket. They cover the reported scenario equivalently (the socket error propagates up to req/res streams, and Node only throws when nobody is listening on any of them), but they're not strictly identical:
- Per-connection (this PR): one listener covers every keep-alive request on that socket. Cleaner for the streaming-RSC case where the response surface isn't owned by vinext (e.g., responses written by
@vitejs/plugin-rsc). - Per-request (Next.js): catches errors that propagate to req/res specifically, including any future scenario where req/res emits
'error'from something other than the socket.
I think the per-connection choice is the right call here for exactly the reason called out in the PR description — vinext doesn't own all streaming surfaces in dev. It's the simpler invariant.
Suggestions (non-blocking)
-
Consider also installing on the upgrade path. Next.js attaches the same guard in its
upgradeHandlerfor WS upgrade requests (router-server.ts L823-830). Vite's HMR ws server already logs socket errors, so this is probably already covered for HMR specifically — but if any other plugin attaches its own upgrade listener (e.g., custom WS endpoints) we'd be relying on them to handle their own socket errors. Worth a quick check that the issue in #905 doesn't reproduce on the/_next/hmrupgrade path; if it can, attaching a listener onupgradesockets too would close the gap. -
No test added. This is hard to test deterministically (you'd need to abort a TCP connection mid-write and assert the process didn't crash), so I don't think a unit test is required — but a Playwright E2E that aborts a fetch mid-stream and then makes another request would catch regressions cheaply if anyone wants to add it.
-
Minor: the no-op listener swallows everything. If a real socket-level bug ever shows up, it'll be silent in dev. Matches Next.js's
// TODO: log socket errors?exactly, so this is just inheriting their tradeoff. Could be worth aif (process.env.DEBUG)guardedconsole.warnlater if it becomes a debugging pain point, but not necessary now.
LGTM as-is.
|
Review posted on PR #911. Summary of findings: The fix is correct, minimal, and well-justified. I verified:
Non-blocking suggestions in the review: consider also covering the WS upgrade path (matching Next.js), and noted the absence of a test (acknowledged hard to write deterministically). |
|
Thanks for the careful review! On suggestion 1 (upgrade path): I think this is already covered by the connection-level approach. Agreed on 2 and 3 — leaving as-is. |
…913) * fix(dev): backstop uncaughtException for socket disconnects pipe() misses PR #911 attached an error listener to every connection socket, which fixes the direct res.socket case. But Node's pipe() machinery has a separate re-emission path: when a Readable source errors and the destination has no error listener, pipe() emits the error on the destination — and that destination is not always the connection socket. Two real call sites trigger this: - fromWeb(fetch().body).pipe(res) in proxyExternalRewriteNode, where the source is the upstream fetch body, not the inbound socket - streaming surfaces inside @vitejs/plugin-rsc that own their own pipe topology Outbound sockets created by fetch() also never fire 'connection' on server.httpServer, so they sit completely outside the existing guard. Add a process-level uncaughtException handler scoped to the dev server that drops only peer-disconnect codes (ECONNRESET, EPIPE, ECONNABORTED) and re-throws everything else on nextTick to preserve Node's default crash behavior for real errors. Removed on httpServer 'close' so it doesn't leak across server restarts. Refs #905. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address bonk review on uncaughtException backstop Three issues from PR #913 review: 1. (Blocker) The process.nextTick(() => throw err) re-throw created an infinite loop, not a crash: the listener is still installed when the nextTick callback fires, so it catches the re-thrown error and schedules another, forever — silently swallowing genuine bugs. Replace with a synchronous throw inside the handler, which Node treats as fatal and produces the same observable as no listener (stack to stderr, non-zero exit). 2. Skip installation entirely in middleware mode (httpServer is null). The embedding host owns process-level handlers and there's no reliable teardown hook to remove ours, so installation would leak. 3. Mirror the same filter onto unhandledRejection — peer-disconnect errors from outbound fetch() and pipe re-emissions can surface through either channel depending on the async path. Verified locally: sync throw inside the handler crashes Node with the original stack and a non-zero exit code; the previous nextTick pattern loops indefinitely on the same input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #911 (connection socket guard) and PR #913 (process-level uncaughtException backstop) both left a hole. The dev server still crashes on ECONNRESET in this exact shape: node:events:487 throw er; // Unhandled 'error' event Error: read ECONNRESET at TCP.onStreamRead Emitted 'error' event on Socket instance at: at Socket.onerror (node:internal/streams/readable:1035:14) // pipe() at Socket.emit (node:domain:489:12) // domain wrap at emitErrorNT (node:internal/streams/destroy:170:8) The Socket.onerror frame is Node's pipe() machinery re-emitting the source's error onto a destination Stream that has no error listener. The Socket.emit frame in node:domain wraps the emit and routes the throw through the deprecated-but-still-loaded domain module, which bypasses uncaughtException entirely. Neither the per-connection listener nor a process-level handler can intercept this path. Three real call sites in vinext dev hit it: - fromWeb(fetch().body).pipe(res) in proxyExternalRewriteNode - streaming surfaces inside @vitejs/plugin-rsc with their own pipe topology (destination is not the inbound connection socket) - outbound sockets created by middleware fetch() The only place to catch every variant is at the source: EventEmitter.prototype.emit. Patch it to short-circuit when *all three* conditions hold: type === 'error', the error has a peer-disconnect code (ECONNRESET / EPIPE / ECONNABORTED), and the emitter has zero 'error' listeners (the exact condition under which Node would throw). All other errors — and any error on an emitter that already has a listener — pass through untouched. Genuine bugs still surface. Installed once per process at plugin construction, guarded by a Symbol.for so dep re-optimization, full reloads, or repeated plugin invocations can't double-install or tear it down. Belt-and-braces uncaughtException / unhandledRejection handlers stay in place for surfaces that bypass EventEmitter.emit (raw promise rejections, native callbacks) — also synchronous-throw on non-peer-disconnect errors so Node's default crash semantics are preserved. Refs #905. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Closes #905.
bun devcrashes intermittently with an unhandledECONNRESETon aSocketwhen files are saved mid-request — typically when HMR triggers a reload while an RSC stream is still flushing. The trace has no vinext or app frames; it comes straight from Node internals. Node's default behavior on an unhandled'error'event is to throw, and the dev server doesn't currently attach a listener on the request/response sockets, so a peer disconnect duringres.writetakes the process down.This patch attaches a no-op
'error'listener on every connection inconfigureServer. The error is still surfaced to the writing stream as a normal write failure — the listener only neutralises Node's default throw, it doesn't hide real bugs.Connection-level (rather than per-request) was chosen because vinext doesn't own all the dev middlewares: App Router routing goes through
@vitejs/plugin-rsc, which streams its own response. One guard at the connection level covers both routers and any future streaming surface.Dev-only — production runs as a Cloudflare Worker where the runtime owns socket lifecycle, and
prod-server.tsalready attaches an error handler on the request stream where it needs one.Prior art
Both Vite and Next.js attach the same kind of guard for the same reason:
req/resper request: router-server.ts#L289-L294Test plan
vp checkpasses (format, lint, typecheck)bun dev, navigate around the example app, save source files rapidly while the browser is mid-RSC-stream — server no longer crashes withUnhandled 'error' event/ECONNRESET🤖 Generated with Claude Code