fix(client): keep request timeout armed across body-read phase #1826
quanru wants to merge 1 commit into openai:master
Conversation
`fetchWithTimeout` previously cleared the request timer in a `finally` block immediately after `await fetch()` resolved. Native `fetch` resolves when response headers arrive, not when the body is fully read, so the subsequent body consumer (`await response.json()` in `internal/parse.ts`) ran without any timer attached. Servers that send a fast `200 OK` and then stall while streaming the body would cause the SDK to hang indefinitely (observed: a 37-minute hang in production for one downstream user; see openai#1825).

This patch wraps the response body in a `ReadableStream` whose completion / cancellation / error callback runs the original `clearTimeout`. Responses without a body have the timer cleared synchronously, preserving the previous behaviour for `HEAD` / 204 / etc. The user-visible `Response` object is otherwise unchanged.

Also adds a regression test that fakes a server returning 200 headers plus a stalled body, configures the client with a short timeout, and asserts the request rejects within the timeout window. Verified locally: passes on patched code (~60 ms), fails on unpatched code (elapsed >= `BODY_STALL_MS`).

Refs: openai#1825
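The mechanism described above can be sketched as follows. This is an illustrative shape, not the PR's exact code: `armAcrossBodyRead` and `clearOnce` are assumed names, and this sketch proxies via `pull()` rather than a read loop in `start()`.

```typescript
// Illustrative sketch (assumed names, not the PR's actual code): keep the
// timeout armed until the body is fully read, cancelled, or errored, and
// clear it exactly once.
function armAcrossBodyRead(
  response: Response,
  timer: ReturnType<typeof setTimeout>,
): Response {
  let cleared = false;
  const clearOnce = () => {
    if (!cleared) {
      cleared = true;
      clearTimeout(timer);
    }
  };

  if (!response.body) {
    clearOnce(); // HEAD / 204 etc.: nothing left to wait for
    return response;
  }

  const reader = response.body.getReader();
  const wrapped = new ReadableStream<Uint8Array>({
    async pull(controller) {
      try {
        const { done, value } = await reader.read();
        if (done) {
          clearOnce();
          controller.close();
        } else {
          controller.enqueue(value!);
        }
      } catch (err) {
        clearOnce();
        controller.error(err);
      }
    },
    cancel(reason) {
      clearOnce();
      return reader.cancel(reason);
    },
  });

  // Rebuild the outward-facing Response around the wrapped body.
  return new Response(wrapped, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}
```

Proxying one chunk per `pull()` also preserves backpressure, and cancelling through the active reader avoids cancelling a locked stream.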
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8b36834a1e
```ts
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  controller.enqueue(value);
```
Respect backpressure in wrapped response stream
The wrapper reads the upstream response.body to completion inside start(), which runs eagerly when new ReadableStream(...) is constructed, not when the caller pulls. Because this loop never checks downstream demand, slow consumers (especially stream: true SSE users that process events incrementally) can accumulate an unbounded in-memory queue and lose the original backpressure behavior of fetch bodies, leading to large memory growth or OOM on long streams.
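A minimal, self-contained way to see this difference (demo names assumed, not SDK code) is to count how often a source's `pull()` fires when nobody ever reads from the wrapper:

```typescript
// Assumed demo names, not SDK code: a counting source plus two wrappers.
function countingSource(limit: number) {
  const state = { pulls: 0 };
  const stream = new ReadableStream<number>({
    pull(c) {
      state.pulls += 1;
      if (state.pulls > limit) c.close();
      else c.enqueue(state.pulls);
    },
  });
  return { stream, state };
}

// Eager: drains the upstream inside start(), ignoring downstream demand.
function eagerWrap(upstream: ReadableStream<number>) {
  const reader = upstream.getReader();
  return new ReadableStream<number>({
    async start(controller) {
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        controller.enqueue(value!); // queued even if nobody ever reads
      }
      controller.close();
    },
  });
}

// Lazy: one upstream read per downstream pull, preserving backpressure.
function lazyWrap(upstream: ReadableStream<number>) {
  const reader = upstream.getReader();
  return new ReadableStream<number>({
    async pull(controller) {
      const { done, value } = await reader.read();
      if (done) controller.close();
      else controller.enqueue(value!);
    },
  });
}

async function compareDemand() {
  const eager = countingSource(100);
  eagerWrap(eager.stream); // constructed, never read from
  const lazy = countingSource(100);
  lazyWrap(lazy.stream); // constructed, never read from
  await new Promise((r) => setTimeout(r, 20)); // let microtasks settle
  return { eagerPulls: eager.state.pulls, lazyPulls: lazy.state.pulls };
}
```

With the eager wrapper the source is drained completely at construction time; with the lazy wrapper only enough is read to satisfy the default high-water mark.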
```ts
cancel(reason) {
  clearOnce();
  return originalBody.cancel(reason);
```
Cancel the active reader instead of locked stream
In cancel(), calling originalBody.cancel(reason) is invalid while start() holds a reader lock (originalBody.getReader()), so cancellation can reject with TypeError: ReadableStream is locked and fail to stop the upstream body read. This affects callers that cancel early (e.g., response.body.cancel() / early termination paths), leaving the network read running despite cancellation.
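This pitfall can be checked standalone (demo names assumed, not SDK code): per the Streams spec, `cancel()` on a locked stream rejects with a `TypeError`, while cancelling through the reader that holds the lock succeeds.

```typescript
// Assumed demo, not SDK code: cancel behaviour on a locked ReadableStream.
async function lockedCancelDemo() {
  const stream = new ReadableStream<Uint8Array>({
    start() {
      /* never enqueues, never closes */
    },
  });
  const reader = stream.getReader(); // locks the stream

  let lockedCancelRejected = false;
  try {
    await stream.cancel('early'); // rejects with TypeError while locked
  } catch {
    lockedCancelRejected = true;
  }

  await reader.cancel('early'); // succeeds: the lock holder may cancel
  return lockedCancelRejected;
}
```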
Closes #1825.
Problem
`fetchWithTimeout` clears the request timer in a `finally` block immediately after `await fetch()` resolves. But native `fetch` resolves when response headers arrive, not when the body has been fully read. The body is consumed later via `await response.json()` (etc.) in `internal/parse.ts`, with no timer attached.

So if a server sends a fast `200 OK` and then stalls while streaming the body, the SDK hangs indefinitely. We observed a 37-minute hang against an OpenAI-compatible provider before the caller had to be killed; the user-configured `timeout: 10 minutes` never fired and no retry kicked in.

Verified the bug is still present on `master` (and on every published version going back to v4.x); the `try/finally` in `fetchWithTimeout` has not changed in any meaningful way.

Fix
Wrap the response body in a `ReadableStream` whose `start`/`cancel` paths run the original `clearTimeout` callback exactly once: when the body is fully read, cancelled, or errored. Responses without a body (e.g. `HEAD`, 204) have the timer cleared synchronously, preserving the previous behaviour. The wrapped `Response` keeps the original `status`, `statusText`, and `headers`.

The minimal contract change: `fetchWithTimeout` no longer guarantees the timer is cleared by the time it returns. It is cleared once the consumer is done with the response (or errors out, or the timer fires and aborts the request).

Bug-case test
Added `'timeout covers body read phase'` in `tests/index.test.ts`. The test fakes a server that returns 200 headers immediately and never enqueues body chunks, configures the client with a 50 ms timeout and `maxRetries: 0`, and asserts the request rejects well before the body stall would naturally complete.

Verified locally on Node 24: the test passes on patched code (~60 ms); on unpatched code it fails only once the elapsed time reaches `BODY_STALL_MS` (~2000 ms) before the assertion runs.

Other timeout-related tests still pass (`retry on timeout`, for example), and `tests/streaming.test.ts` is unaffected.

Why this matters
Many OpenAI-compatible providers (proxies, reasoning-heavy models, gateway tiers under load) routinely flush response headers fast and then stream the body slowly. Any stall in that body window currently translates into an unbounded hang in the SDK, with no error and no retry; the calling application has no way to detect the failure short of layering its own `AbortSignal` on every request. After this fix the SDK's documented `timeout` option means what users expect: a hard upper bound on the entire request, including the body read.

Notes for maintainers

`src/client.ts` is generated, so this patch will need to be reflected upstream in the Stainless spec / generator config to survive future regenerations. Happy to follow whatever process you'd like; I wanted to send a self-contained PR with a regression test so the failure mode is captured no matter how the fix lands.
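For reference, the shape of the bug-case test can be reproduced standalone. `stalledBody` and `readWithTimeout` below are illustrative names, not the test suite's actual helpers:

```typescript
// Illustrative re-creation of the regression test's shape: a body that never
// produces a chunk, raced against a short timeout.
function stalledBody(): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start() {
      /* 200 headers already "sent"; body chunks never arrive */
    },
  });
}

async function readWithTimeout(
  body: ReadableStream<Uint8Array>,
  ms: number,
): Promise<Uint8Array[]> {
  const reader = body.getReader();
  const timeout = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error('Request timed out')), ms),
  );
  const chunks: Uint8Array[] = [];
  try {
    for (;;) {
      const { done, value } = await Promise.race([reader.read(), timeout]);
      if (done) break;
      if (value !== undefined) chunks.push(value);
    }
  } finally {
    // stop the (still pending) upstream read on exit
    reader.cancel().catch(() => {});
  }
  return chunks;
}
```

A 50 ms timeout should reject long before a multi-second body stall would naturally resolve.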