Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
Summary
Non-streaming OpenAI API calls (Responses API with medium/high reasoning, long completions.create() calls) hang indefinitely when run behind a NAT gateway. The OpenAI server successfully generates the response — visible in the dashboard — but the client never receives it and blocks forever.
Root Cause
The default httpx transport used by the SDK has no TCP keepalive (SO_KEEPALIVE is off). During a long non-streaming call, the TCP connection sits idle while the server generates. NAT gateways silently drop idle TCP connections after their idle timeout:
NAT type | Typical idle timeout
-- | --
AWS NAT Gateway | ~350 s
GCP Cloud NAT | ~120 s
Home routers / ISP NAT | 60–300 s
With o-series and GPT-5.x models under high reasoning, server-side generation routinely takes 300–700 s. The NAT drops the connection mid-generation, the client receives no error (TCP is stateless — neither side knows the connection is dead without keepalive probes), and the call hangs indefinitely.
This affects any deployment behind NAT — EKS, ECS, Cloud Run, GKE, and even local development behind a home router.
Evidence
Running the reproducer below from a home network with gpt-5.4 + high reasoning:
- OpenAI dashboard: call completed successfully, generation time ~606 s
- Client (no keepalive): hung indefinitely — TCP connection was silently dropped by home router NAT at ~350 s idle; response never delivered despite being ready on the server
- Client (with keepalive): completed successfully in ~606 s
Fix
Enable TCP keepalive on the default httpx transport in _DefaultHttpxClient and _DefaultAsyncHttpxClient in _base_client.py using kwargs.setdefault — so any caller-supplied custom transport is completely unaffected:
kwargs.setdefault("transport", httpx.AsyncHTTPTransport(socket_options=_build_keepalive_socket_options()))
This is identical to what the Anthropic Python SDK already does.
Workaround(until patched)
Pass a keepalive-enabled http_client explicitly as shown in the reproducer below.
PR
Fix submitted in: https://github.com/
/pull/3270
To Reproduce
"""
Run from behind any NAT (home router, cloud VM, k8s pod).
Both calls fire concurrently — [keepalive] succeeds, [no-keepalive] hangs forever.
Usage: OPENAI_API_KEY=sk-... python -u reproducer_nat_timeout.py
"""
Code snippets
import asyncio, socket, time, httpx
from openai import AsyncOpenAI
MODEL = "gpt-5.4"
PROMPT = (
"You are a senior financial analyst. Write an exhaustive, deeply researched "
"investment memo (minimum 8,000 words) covering market sizing, competitive "
"landscape, unit economics, risks, and a DCF valuation for a hypothetical "
"B2B SaaS company targeting the enterprise data-observability market. "
"Cite real industry benchmarks throughout."
)
async def call_without_keepalive() -> None:
client = AsyncOpenAI() # default transport — no TCP keepalive
t0 = time.monotonic()
print("[no-keepalive] START")
try:
await client.responses.create(
model=MODEL, reasoning={"effort": "high"}, input=PROMPT, stream=False
)
print(f"[no-keepalive] OK in {time.monotonic()-t0:.0f}s")
except Exception as e:
print(f"[no-keepalive] FAILED after {time.monotonic()-t0:.0f}s: {type(e).__name__}: {e}")
def _make_keepalive_client() -> httpx.AsyncClient:
opts: list[tuple[int, int, int]] = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
if hasattr(socket, "TCP_KEEPIDLE"):
opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60))
elif hasattr(socket, "TCP_KEEPALIVE"):
opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPALIVE, 60))
if hasattr(socket, "TCP_KEEPINTVL"):
opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60))
if hasattr(socket, "TCP_KEEPCNT"):
opts.append((socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5))
return httpx.AsyncClient(transport=httpx.AsyncHTTPTransport(socket_options=opts))
async def call_with_keepalive() -> None:
client = AsyncOpenAI(http_client=_make_keepalive_client())
t0 = time.monotonic()
print("[keepalive] START")
try:
await client.responses.create(
model=MODEL, reasoning={"effort": "high"}, input=PROMPT, stream=False
)
print(f"[keepalive] OK in {time.monotonic()-t0:.0f}s")
except Exception as e:
print(f"[keepalive] FAILED after {time.monotonic()-t0:.0f}s: {type(e).__name__}: {e}")
async def main() -> None:
print(f"Model: {MODEL} reasoning: high")
print("Both calls fired concurrently — watch [keepalive] succeed while [no-keepalive] hangs.\n")
await asyncio.gather(call_without_keepalive(), call_with_keepalive())
asyncio.run(main())
Actual output:
Model: gpt-5.4 reasoning: high
Both calls fired concurrently — watch [keepalive] succeed while [no-keepalive] hangs.
[no-keepalive] START
[keepalive] START
[keepalive] OK in 604s ← completes
[no-keepalive] START ← still hanging...
OS
macOS
Python version
3.11
Library version
2.37.0
Confirm this is an issue with the Python library and not an underlying OpenAI API
Describe the bug
Summary
Non-streaming OpenAI API calls (Responses API with medium/high reasoning, long
completions.create()calls) hang indefinitely when run behind a NAT gateway. The OpenAI server successfully generates the response — visible in the dashboard — but the client never receives it and blocks forever.Root Cause
The default httpx transport used by the SDK has no TCP keepalive (
NAT type | Typical idle timeout -- | -- AWS NAT Gateway | ~350 s GCP Cloud NAT | ~120 s Home routers / ISP NAT | 60–300 sSO_KEEPALIVEis off). During a long non-streaming call, the TCP connection sits idle while the server generates. NAT gateways silently drop idle TCP connections after their idle timeout:With o-series and GPT-5.x models under high reasoning, server-side generation routinely takes 300–700 s. The NAT drops the connection mid-generation, the client receives no error (TCP is stateless — neither side knows the connection is dead without keepalive probes), and the call hangs indefinitely.
This affects any deployment behind NAT — EKS, ECS, Cloud Run, GKE, and even local development behind a home router.
Evidence
Running the reproducer below from a home network with
gpt-5.4+ high reasoning:Fix
Enable TCP keepalive on the default httpx transport in
_DefaultHttpxClientand_DefaultAsyncHttpxClientin_base_client.pyusingkwargs.setdefault— so any caller-supplied custom transport is completely unaffected:This is identical to what the Anthropic Python SDK already does.
Workaround(until patched)
Pass a keepalive-enabled http_client explicitly as shown in the reproducer below.PR
Fix submitted in: https://github.com//pull/3270To Reproduce
"""
Run from behind any NAT (home router, cloud VM, k8s pod).
Both calls fire concurrently — [keepalive] succeeds, [no-keepalive] hangs forever.
Usage: OPENAI_API_KEY=sk-... python -u reproducer_nat_timeout.py
"""
Code snippets
Actual output:
Model: gpt-5.4 reasoning: high
Both calls fired concurrently — watch [keepalive] succeed while [no-keepalive] hangs.
[no-keepalive] START
[keepalive] START
[keepalive] OK in 604s ← completes
[no-keepalive] START ← still hanging...
OS
macOS
Python version
3.11
Library version
2.37.0