HTTP connection pool timeout mismatch causes request failures with some third-party providers #7272

@k-l-lambda

What version of Codex is running?

codex-tui 0.0.0

What subscription do you have?

Pro

Which model were you using?

gpt-5.1

What platform is your computer?

Windows

What issue are you seeing?

When using Codex with certain third-party OpenAI-compatible API providers, requests frequently fail with errors like 400 Bad Request, 404 Not Found, or
connection reset errors. These failures trigger retry logic, leading to rate limiting (429 Too Many Requests) and repeated "Reconnecting... X/5" messages.

The requests themselves are valid (the same payload succeeds with curl), but Codex's HTTP connection pooling is incompatible with providers that close idle
connections sooner than reqwest's default 90-second pool idle timeout.

Logs showing the issue:
[DEBUG] reuse idle connection for ("https", api.provider.com)
[DEBUG] Request completed status=400 Bad Request
[DEBUG] Turn error: unexpected status 400 Bad Request

What steps can reproduce the bug?

  1. Configure Codex to use a third-party OpenAI-compatible API provider that closes idle connections after ~30 seconds
  2. Use Codex for an extended session with periodic pauses (e.g., 40+ seconds between requests)
  3. Observe that after idle periods, requests fail with 400/404 errors
  4. Notice "Reconnecting..." messages appearing frequently

Configuration example:

[model_providers.custom]
name = "Third Party Provider"
base_url = "https://api.example.com/v1"

Note: The same request payload tested with curl succeeds, proving the API and request format are correct.

What is the expected behavior?

  • Requests should succeed reliably without connection-related errors
  • No frequent reconnection cycles
  • Stable streaming responses without interruptions

Additional information

Root cause analysis:

Codex uses reqwest's default connection pool with a 90-second idle timeout. The failure pattern occurs when:

  1. A connection sits idle in the pool for 40+ seconds
  2. The provider closes the connection server-side (e.g., after 30 seconds)
  3. Codex attempts to reuse the stale connection from the pool
  4. The request fails with 400/404 because the connection is already closed
  5. Retry logic triggers rapid retries, causing 429 rate limit errors
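
The failure window in the steps above can be sketched as a small model. This is an illustration only, not Codex's code: `Checkout`, `checkout`, and `safe_client_idle_timeout` are hypothetical names, and the 30-second server timeout is the example figure from this report. It assumes the client evicts a pooled connection once its idle time reaches the client-side timeout, and that the server closes it once idle time reaches the server-side timeout.

```rust
use std::time::Duration;

/// What happens when a request needs a connection that has sat idle for a while.
/// (Hypothetical model type, not part of reqwest or Codex.)
#[derive(Debug, PartialEq, Eq)]
pub enum Checkout {
    /// Neither side has timed the connection out; reuse is safe.
    Reused,
    /// The client pool already evicted it, so a new connection is opened.
    Fresh,
    /// The server has closed it, but the client pool still hands it out.
    Stale,
}

/// Classify a checkout given how long the connection has been idle and the
/// idle timeouts on each side.
pub fn checkout(
    idle: Duration,
    client_idle_timeout: Duration,
    server_idle_timeout: Duration,
) -> Checkout {
    if idle >= client_idle_timeout {
        Checkout::Fresh
    } else if idle >= server_idle_timeout {
        Checkout::Stale
    } else {
        Checkout::Reused
    }
}

/// Pick a client-side idle timeout that stays below the server's by `margin`,
/// so the client always evicts first and the stale window disappears.
pub fn safe_client_idle_timeout(server_idle_timeout: Duration, margin: Duration) -> Duration {
    server_idle_timeout.saturating_sub(margin)
}
```

With a 90-second client timeout and a 30-second server timeout, any pause between 30 and 90 seconds lands in the `Stale` case. In reqwest the client-side value is set with `ClientBuilder::pool_idle_timeout`; whether Codex should hard-code a lower value or make it configurable per provider is left open here.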

Why curl works but Codex doesn't:

Each curl invocation runs as a separate process and opens a fresh connection, so it never reuses a connection that has gone idle. The issue only manifests with connection pooling.

Impact:

  • Degraded user experience with frequent reconnections
  • Increased API costs from unnecessary retry requests
  • Makes some third-party providers effectively unusable
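
To illustrate how the cost of unnecessary retries could be bounded, here is a minimal sketch of capped exponential backoff. It is an assumption-laden example rather than a description of Codex's current retry logic, which this report does not show: `backoff_delay` is a hypothetical helper, and the base and cap values are placeholders.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): `base * 2^attempt`, never
/// exceeding `cap`. (Hypothetical helper, not taken from Codex.)
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // 2^attempt, falling back to u32::MAX when the shift amount is too large.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // If the multiplication overflows Duration, fall back to the cap.
    base.checked_mul(factor).map_or(cap, |delay| delay.min(cap))
}

/// Delays for a fixed number of retries, e.g. the five shown in the
/// "Reconnecting... X/5" messages.
pub fn retry_schedule(max_retries: u32, base: Duration, cap: Duration) -> Vec<Duration> {
    (0..max_retries)
        .map(|attempt| backoff_delay(attempt, base, cap))
        .collect()
}
```

Spacing retries this way does not fix the stale-connection problem itself, but it would make a burst of failures less likely to escalate into 429 responses.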

Affected component: codex-rs/core/src/default_client.rs - HTTP client with reqwest connection pooling

Environment:

  • Codex CLI version: Latest main branch
  • Platform: Windows/Linux/macOS (all affected)
  • Multiple third-party OpenAI-compatible providers affected

Labels: CLI, bug, chat-endpoint
