fix(vtz): node:http hang — graceful shutdown + sync .then() bug (resolves #2718, #2720)#2749
Merged
Merged
Conversation
#2720) The synthetic `node:http` module's `createServer().listen()` treated `globalThis.__vtz_http.serve()` as async, but `serve()` is a synchronous op that returns the server object directly. The resulting `.then(...)` on a non-thenable threw a TypeError that was swallowed by the `new Promise((resolve) => server.listen(0, resolve))` idiom, so the listen callback never fired and tests hung at the 120s watchdog. The same bug existed in the CJS `require('http')` shim. Fixing the shim surfaced three secondary defects in the op layer: - `close()` aborted the axum task immediately, cancelling in-flight response futures mid-reply. Replaced the abort-handle teardown with axum's `with_graceful_shutdown(...)` signal so existing connections drain before the task exits. - `op_http_serve_respond` keyed the pending oneshot map per server, so replying after close (which removed the ServerInstance) silently dropped the response. Moved pending-responses onto HttpServeState and keyed it globally by request_id. - `op_http_serve_accept` / `op_http_serve_respond` returned "Unknown server id" errors when the accept loop re-polled after close(), poisoning the event loop. Both ops now treat a missing server as a soft null / no-op. The JS createServer() shim also gained proper Node-compatible semantics: listen(cb) fires the callback via queueMicrotask, close(cb) defers until all in-flight requests finish, and new connections received after close() get a 503 without entering the user handler. Restored `node-handler.local.ts` and `docs-cli-actions.local.ts` to `.test.ts` — both now pass under `vtz test`. Removed the bun-fallback `test:integration` npm scripts that existed to keep these files running. `image-processor.local.ts` stays quarantined (NAPI, separate issue). Regression coverage: - packages/vertz/__tests__/node-http.test.ts — 5 end-to-end scenarios covering listen/fetch/close semantics and graceful shutdown. - native/vtz/src/runtime/ops/http_serve.rs — two new Rust unit tests covering graceful shutdown of in-flight responses and the no-unknown-server-id contract. Closes #2718. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Root cause
The synthetic
node:httpmodule'screateServer().listen()treatedglobalThis.__vtz_http.serve()as asynchronous — butserve()is a synchronous op that returns the server object directly. The resulting.then(...)on a non-thenable threw aTypeErrorthat was swallowed by the idiomaticnew Promise((resolve) => server.listen(0, resolve)), so the listen callback never fired and tests hung at the 120 s watchdog. The CJSrequire('http')shim had the same bug.Fixing the shim exposed three more defects in the underlying op layer:
op_http_serve_closeaborted the tokio task immediately, cancelling in-flight axum handler futures mid-response sofetch()never got its reply.op_http_serve_respondkeyed the pending-responses map per server — replying after close (which removed theServerInstancefrom state) silently dropped the response.op_http_serve_accept/op_http_serve_respondthrew "Unknown server id" when the accept loop re-polled after close, poisoning the event loop and corrupting unrelated tests.What changed
Rust (
native/vtz/src/runtime/ops/http_serve.rs)with_graceful_shutdown(...)signal so existing connections drain cleanly.pending_responsesontoHttpServeStateand keyed it globally byrequest_idso in-flight replies still work across close.op_http_serve_acceptandop_http_serve_respondtolerate a missing server id (null / no-op) instead of throwing.JS (
native/vtz/src/runtime/module_loader.rs, bothNODE_HTTP_MODULEand the CJShttpshim)__vtz_http.serve()as synchronous.listen(cb)fires the callback viaqueueMicrotask(proper Node-async semantics).close(cb)defers the callback until all in-flight requests finish; connections received afterclose()return 503 without entering the user handler.on,once,off,emit,unref,ref) that callers expect.address()returnsnullbeforelisten()completes, matching Node.Tests
packages/ui-server/src/__tests__/node-handler.test.tsandpackages/docs/src/__tests__/docs-cli-actions.test.tsfrom.local.ts— both now pass undervtz test(26 + 11 tests, all green).packages/vertz/__tests__/node-http.test.tswith 5 scenarios coveringlisten(0)callback firing, buffered serve, empty-pending close, deferred close with an in-flight request, and bind failure reporting.http_serve.rs(test_http_serve_graceful_shutdown_preserves_in_flight_response,test_http_serve_close_does_not_throw_unknown_server_id).test:integrationnpm scripts that existed purely to run these files under bun;image-processor.local.tsstays (NAPI, out of scope).Quality gates
cargo test -p vtz --lib— 3452 passing, 0 failingcargo clippy -p vtz --all-targets -- -D warnings— cleancargo fmt --all -- --check— cleanvtz testinpackages/ui-server(node-handler, 26 pass)vtz testinpackages/docs(263 pass)vtz testinpackages/vertz(21 pass incl. 5 new)tsgo --noEmitin ui-server, docs, vertz — cleanNote: the pre-existing
test_inspect_brk_unblocks_after_debugger_connectsflake under concurrentcargo test --allis unrelated; it passes in isolation.Public API changes
None — internal runtime behavior only. Fixes bugs, no new surface.
Test plan
node:httpAPI.test.tsfiles exercise node:http in anger (SSR tests with dozens of buffered + streaming requests per test run)