
fix(client): keep request timeout armed across body-read phase #1826

Open

quanru wants to merge 1 commit into openai:master from quanru:fix/timeout-cover-body-read

Conversation


@quanru quanru commented Apr 15, 2026

Closes #1825.

Problem

fetchWithTimeout clears the request timer in a finally block immediately after await fetch() resolves. But native fetch resolves when response headers arrive — not when the body has been fully read. The body is consumed later via await response.json() (etc.) in internal/parse.ts, with no timer attached.

So if a server sends a fast 200 OK and then stalls while streaming the body, the SDK hangs indefinitely. We observed a 37-minute hang against an OpenAI-compatible provider before the caller had to be killed; the user-configured timeout of 10 minutes never fired and no retry kicked in.

Verified the bug is still present on master (and on every published version going back to v4.x) — the try/finally in fetchWithTimeout has not changed in any meaningful way.

Fix

Wrap the response body in a ReadableStream whose start / cancel paths run the original clearTimeout callback exactly once: when the body is fully read, cancelled, or errored. Responses without a body (e.g. HEAD, 204) have the timer cleared synchronously — preserving the previous behaviour. The wrapped Response keeps the original status, statusText, and headers.

The minimal contract change is: fetchWithTimeout no longer guarantees the timer is cleared by the time it returns. It is cleared once the consumer is done with the response (or errors out, or the timer fires and aborts the request).
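A minimal sketch of the wrapper described above (the names `wrapResponseWithTimeout` and `clearOnce` are illustrative, not the PR's exact code; cancellation goes through the active reader, since a stream locked by `getReader()` cannot be cancelled directly):

```typescript
// Illustrative sketch, not the PR's exact code: wrap the body so the
// request timer is cleared exactly once, on full read, error, or cancel.
function wrapResponseWithTimeout(response: Response, clearTimer: () => void): Response {
  let cleared = false;
  const clearOnce = () => {
    if (!cleared) {
      cleared = true;
      clearTimer();
    }
  };
  const originalBody = response.body;
  if (!originalBody) {
    clearOnce(); // HEAD / 204: no body, clear synchronously as before
    return response;
  }
  const reader = originalBody.getReader();
  const wrapped = new ReadableStream<Uint8Array>({
    async start(controller) {
      try {
        for (;;) {
          const { done, value } = await reader.read();
          if (done) break;
          controller.enqueue(value);
        }
        controller.close();
      } catch (err) {
        controller.error(err);
      } finally {
        clearOnce();
      }
    },
    cancel(reason) {
      clearOnce();
      // cancel via the reader: the upstream stream is locked by getReader()
      return reader.cancel(reason);
    },
  });
  // preserve status, statusText, and headers on the user-visible Response
  return new Response(wrapped, {
    status: response.status,
    statusText: response.statusText,
    headers: response.headers,
  });
}
```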

Bug-case test

Added 'timeout covers body read phase' in tests/index.test.ts. The test fakes a server that returns 200 headers immediately and never enqueues body chunks, configures the client with a 50ms timeout and maxRetries: 0, and asserts the request rejects well before the body stall would naturally complete.
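The shape of the failure can be reproduced without the SDK at all. This standalone sketch (names are illustrative, not the PR's actual test code) builds a Response whose headers are ready immediately but whose body never produces a chunk, then races the body read against a timer, which is the guarantee the fix restores:

```typescript
// Standalone reproduction of the failure mode (illustrative):
// headers arrive immediately; the body never enqueues a chunk.
function stalledResponse(): Response {
  const body = new ReadableStream<Uint8Array>({
    start() {
      // never enqueue, never close: simulates a server stalling mid-body
    },
  });
  return new Response(body, { status: 200 });
}

// Race the body read against a timer. Without such a timer, awaiting
// res.text() on a stalled body blocks forever.
function readWithTimeout(res: Response, ms: number): Promise<string> {
  const timer = new Promise<never>((_, reject) =>
    setTimeout(() => reject(new Error("Request timed out")), ms),
  );
  return Promise.race([res.text(), timer]);
}
```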

Verified locally on Node 24:

  • Without the fix (master): the test fails — elapsed reaches BODY_STALL_MS (~2000 ms) before the assertion runs.
  • With the fix: the test passes — request rejects in ~60 ms.

Other timeout-related tests still pass (retry on timeout for example), and tests/streaming.test.ts is unaffected.

Why this matters

Many OpenAI-compatible providers (proxies, reasoning-heavy models, gateway tiers under load) routinely flush response headers fast and then stream the body slowly. Any stall in that body window currently translates into an unbounded hang in the SDK, with no error and no retry — the calling application has no way to detect the failure short of layering its own AbortSignal on every request. After this fix the SDK's documented timeout option means what users expect: a hard upper bound on the entire request including body read.
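For reference, the caller-side workaround mentioned above can be sketched with plain `fetch` (the function name is hypothetical): an AbortSignal bounds the entire exchange, including the body read, because aborting tears down the body stream as well.

```typescript
// Hedged sketch of the per-request AbortSignal workaround, using plain
// fetch rather than the SDK. AbortSignal.timeout() covers the body-read
// phase, unlike the SDK's pre-fix timeout.
async function fetchJsonWithHardTimeout(url: string, ms: number): Promise<unknown> {
  const res = await fetch(url, { signal: AbortSignal.timeout(ms) });
  return res.json(); // a stalled body causes rejection once the signal fires
}
```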

Notes for maintainers

src/client.ts is generated, so this patch will need to be reflected upstream in the Stainless spec / generator config to survive future regenerations. Happy to follow whatever process you'd like — wanted to send a self-contained PR with a regression test so the failure mode is captured no matter how the fix lands.

`fetchWithTimeout` previously cleared the request timer in a `finally`
block immediately after `await fetch()` resolved. Native `fetch` resolves
when response headers arrive — not when the body is fully read — so the
subsequent body consumer (`await response.json()` in `internal/parse.ts`)
ran without any timer attached. Servers that send a fast `200 OK` and
then stall while streaming the body would cause the SDK to hang
indefinitely (observed: 37-minute hang in production for one downstream
user, see openai#1825).

This patch wraps the response body in a `ReadableStream` whose
completion / cancellation / error callback runs the original
`clearTimeout`. Responses without a body have the timer cleared
synchronously, preserving the previous behaviour for HEAD / 204 / etc.
The user-visible `Response` object is otherwise unchanged.

Also adds a regression test that fakes a server returning 200 headers
plus a stalled body, configures the client with a short timeout, and
asserts the request rejects within the timeout window. Verified
locally: passes on patched code (~60ms), fails on unpatched code
(elapsed >= BODY_STALL_MS).

Refs: openai#1825
@quanru quanru requested a review from a team as a code owner April 15, 2026 09:31

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b36834a1e


Comment thread src/client.ts
Comment on lines +272 to +275
for (;;) {
const { done, value } = await reader.read();
if (done) break;
controller.enqueue(value);

P1: Respect backpressure in wrapped response stream

The wrapper reads the upstream response.body to completion inside start(), which runs eagerly when new ReadableStream(...) is constructed, not when the caller pulls. Because this loop never checks downstream demand, slow consumers (especially stream: true SSE users that process events incrementally) can accumulate an unbounded in-memory queue and lose the original backpressure behavior of fetch bodies, leading to large memory growth or OOM on long streams.

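One way to address this review point, sketched under the assumption that a pull-based wrapper is acceptable here (the function name is illustrative): read from upstream only when the consumer pulls, so downstream demand propagates and no unbounded queue can build up.

```typescript
// Backpressure-friendly variant (illustrative sketch): pull() runs only
// when the consumer wants a chunk, so upstream is read at the consumer's
// pace instead of being drained eagerly in start().
function wrapBodyWithPull(
  body: ReadableStream<Uint8Array>,
  clearOnce: () => void,
): ReadableStream<Uint8Array> {
  const reader = body.getReader();
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      try {
        const { done, value } = await reader.read();
        if (done) {
          clearOnce();
          controller.close();
          return;
        }
        controller.enqueue(value); // exactly one chunk per downstream pull
      } catch (err) {
        clearOnce();
        throw err; // rejecting pull() errors the wrapped stream
      }
    },
    cancel(reason) {
      clearOnce();
      return reader.cancel(reason);
    },
  });
}
```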

Comment thread src/client.ts
Comment on lines +289 to +291
cancel(reason) {
clearOnce();
return originalBody.cancel(reason);

P2: Cancel the active reader instead of the locked stream

In cancel(), calling originalBody.cancel(reason) is invalid while start() holds a reader lock (originalBody.getReader()), so cancellation can reject with TypeError: ReadableStream is locked and fail to stop the upstream body read. This affects callers that cancel early (e.g., response.body.cancel() / early termination paths), leaving the network read running despite cancellation.

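A small self-contained check of this claim (the function name is illustrative): once `getReader()` has locked a stream, `cancel()` on the stream itself rejects, while the reader's own `cancel()` succeeds, which is why the wrapper's `cancel()` handler must go through the active reader.

```typescript
// Demonstrates the review point: calling cancel() on a stream that is
// locked by getReader() rejects with a TypeError; reader.cancel() is the
// path that actually stops the upstream read.
async function lockedCancelDemo(): Promise<boolean> {
  const stream = new ReadableStream<Uint8Array>({ start() {} });
  const reader = stream.getReader();
  let lockedCancelFailed = false;
  try {
    await stream.cancel("early termination"); // rejects: stream is locked
  } catch {
    lockedCancelFailed = true;
  }
  await reader.cancel("early termination"); // succeeds
  return lockedCancelFailed;
}
```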



Successfully merging this pull request may close these issues.

SDK timeout option does not cover response-body read phase (can hang forever)
