Add `model_providers.<name>.discovery_url` for cluster-aware base-URL refresh

## Summary

When codex is pointed at a **distributed inference cluster** (exo, leader-follower llama.cpp, vLLM with Ray, etc.), the active inference endpoint can rotate as nodes join or leave. The current `model_providers.<name>.base_url` field is a single static URL, which forces operators to either:

- Edit `~/.codex/config.toml` every time the leader changes; or
- Front the cluster with a sticky proxy (extra hop, extra failure mode).

Neither is great. A thin sidecar that reports the current leader's URL on demand is a better fit, but codex has no way to consume one.

## Proposed solution

Two new provider-scoped fields plus a `DiscoveryResponse` wire format:

- **`discovery_url: Option<String>`** — URL codex `GET`s at session start to retrieve the current effective `base_url`.
- **`discovery_request_timeout_ms: Option<u64>`** — per-attempt timeout (default 5_000 ms, capped at 60_000 ms so a misconfigured endpoint can never wedge session start).
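
To make the default-and-cap rule concrete, here is a minimal std-only sketch of how the configured value could be resolved into the timeout actually used. The constant names match the ones listed under "Reference implementation" and the values come from the bullet above; the function name `effective_discovery_timeout` is hypothetical and not necessarily what the fork uses.

```rust
use std::time::Duration;

// Constant names as in the reference implementation; values from the proposal text.
const DEFAULT_DISCOVERY_REQUEST_TIMEOUT_MS: u64 = 5_000;
const MAX_DISCOVERY_REQUEST_TIMEOUT_MS: u64 = 60_000;

/// Hypothetical helper: turn the optional config value into the per-attempt timeout.
/// An unset value takes the default; anything above the cap is clamped down to it.
fn effective_discovery_timeout(configured_ms: Option<u64>) -> Duration {
    let ms = configured_ms
        .unwrap_or(DEFAULT_DISCOVERY_REQUEST_TIMEOUT_MS)
        .min(MAX_DISCOVERY_REQUEST_TIMEOUT_MS);
    Duration::from_millis(ms)
}

fn main() {
    // Unset, within range, and above the cap.
    println!("{:?}", effective_discovery_timeout(None));
    println!("{:?}", effective_discovery_timeout(Some(2_500)));
    println!("{:?}", effective_discovery_timeout(Some(600_000)));
}
```

This sketch passes a configured value of `0` through unchanged; whether zero should be rejected or treated as unset is not specified in this proposal.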

The endpoint MUST respond with `application/json` matching:

```json
{ "base_url": "http://leader.example.com:8080/v1" }
```

Extra fields are ignored so future TTL / version / alternate-URL metadata can land without breaking v1 clients.

Example `config.toml`:

```toml
[model_providers.my-cluster]
name = "my exo cluster"
base_url = "http://fallback.example.com/v1"   # used if discovery fails
wire_api = "responses"
discovery_url = "http://cluster-manager.local/codex/discover"
discovery_request_timeout_ms = 2500
```

Behavior:

1. At session start, codex `GET`s `discovery_url`.
2. On success, the response's `base_url` replaces the static one for the lifetime of the session.
3. On failure (timeout, non-2xx status, invalid JSON, a relative `base_url`, or a body over the 64 KiB cap), codex emits a startup warning and falls back to the static `base_url`.
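
The success and fallback paths above can be sketched without any HTTP machinery. This is an illustration only: the `DiscoveryError` variants below are assumptions that mirror the failure modes listed in step 3, not necessarily the variants in the fork, and `effective_base_url` is a hypothetical name.

```rust
use std::fmt;

/// Hypothetical error type; variants mirror the failure modes listed above.
#[allow(dead_code)]
#[derive(Debug)]
enum DiscoveryError {
    Timeout,
    HttpStatus(u16),
    InvalidJson(String),
    RelativeUrl(String),
    BodyTooLarge { limit_bytes: usize },
}

impl fmt::Display for DiscoveryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DiscoveryError::Timeout => write!(f, "request timed out"),
            DiscoveryError::HttpStatus(code) => write!(f, "unexpected HTTP status {code}"),
            DiscoveryError::InvalidJson(msg) => write!(f, "invalid JSON: {msg}"),
            DiscoveryError::RelativeUrl(url) => write!(f, "base_url is not absolute: {url}"),
            DiscoveryError::BodyTooLarge { limit_bytes } => {
                write!(f, "body exceeds {limit_bytes} bytes")
            }
        }
    }
}

/// Returns the base URL to use for the session plus an optional startup warning.
/// On success the discovered URL wins; on any failure the static URL is kept.
fn effective_base_url(
    static_base_url: &str,
    discovered: Result<String, DiscoveryError>,
) -> (String, Option<String>) {
    match discovered {
        Ok(url) => (url, None),
        Err(err) => (
            static_base_url.to_string(),
            Some(format!(
                "provider discovery failed ({err}); falling back to {static_base_url}"
            )),
        ),
    }
}

fn main() {
    let fallback = "http://fallback.example.com/v1";

    let (url, warning) =
        effective_base_url(fallback, Ok("http://leader.example.com:8080/v1".to_string()));
    println!("{url} (warning: {warning:?})");

    let (url, warning) = effective_base_url(fallback, Err(DiscoveryError::Timeout));
    println!("{url} (warning: {warning:?})");
}
```

Returning the warning as data rather than logging it directly is a choice made for this sketch so the caller decides how to surface it at startup.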

Periodic refresh / TTL is intentionally out of scope for v1; once-per-session resolution covers the common case. Codex sessions are short relative to typical leader-rotation intervals, and future work can add a background refresh task without changing the wire format.

## Safety

- 64 KiB cap on the discovery response body, which bounds memory use if a misconfigured endpoint streams an unbounded response.
- Discovered `base_url` is parsed via `reqwest::Url::parse` and rejected if not absolute, so an attacker controlling the discovery endpoint can't slip in a relative URL.
- 60-second hard cap on per-attempt timeout.
- `DiscoveryError` is exhaustive and each variant carries enough context to surface a useful operator-facing warning.
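
As an illustration of the body cap, here is a synchronous, std-only sketch of a bounded read. The reference implementation reads the body through its HTTP client, so treat this as a model of the technique rather than the actual code; `MAX_DISCOVERY_BODY_BYTES` and `read_capped_body` are hypothetical names.

```rust
use std::io::{self, Read};

// Hypothetical constant name; the 64 KiB value comes from the list above.
const MAX_DISCOVERY_BODY_BYTES: usize = 64 * 1024;

/// Reads at most `limit` bytes from `reader`, returning an error if more are available.
fn read_capped_body<R: Read>(reader: R, limit: usize) -> io::Result<Vec<u8>> {
    let mut body = Vec::new();
    // Allow one byte past the limit so an oversized body can be told apart
    // from one that is exactly `limit` bytes long.
    reader.take(limit as u64 + 1).read_to_end(&mut body)?;
    if body.len() > limit {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("discovery response exceeds {limit} bytes"),
        ));
    }
    Ok(body)
}

fn main() {
    // A small, well-formed body is returned as-is.
    let small = &br#"{ "base_url": "http://leader.example.com:8080/v1" }"#[..];
    let body = read_capped_body(small, MAX_DISCOVERY_BODY_BYTES).expect("within cap");
    println!("read {} bytes", body.len());

    // An endless stream is cut off after limit + 1 bytes and rejected.
    let result = read_capped_body(io::repeat(b'x'), MAX_DISCOVERY_BODY_BYTES);
    println!("oversized body rejected: {}", result.is_err());
}
```

Because the read stops one byte past the limit, memory use stays bounded even when the endpoint never closes the stream.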

## What I considered and rejected

- **Reading `base_url` from an env var that the cluster updates**: requires every codex caller to also have a sidecar polling the cluster, which is just relocating the problem.
- **Periodic background refresh in v1**: meaningfully larger surface (refresh task lifecycle, race against in-flight requests, TTL semantics) for a marginal UX win. v1 covers the 90% case.
- **Single shared discovery endpoint for all providers**: doesn't scale to clusters under different cluster managers.

## Reference implementation

A working implementation with tests lives on the team-wcv fork:

- Branch: https://github.com/team-wcv/codex/tree/feat/provider-discovery-url
- Diff: https://github.com/team-wcv/codex/compare/main...feat/provider-discovery-url

Touches:

- `codex-rs/model-provider-info/src/lib.rs` — new `discovery_url`, `discovery_request_timeout_ms` fields on `ModelProviderInfo`; new `DiscoveryResponse` struct; `DEFAULT_DISCOVERY_REQUEST_TIMEOUT_MS` / `MAX_DISCOVERY_REQUEST_TIMEOUT_MS` constants.
- `codex-rs/model-provider/src/discovery.rs` — new module with two helpers:
  - `resolve_provider_discovery(client, info) -> Result<ModelProviderInfo, DiscoveryError>` — fail-fast variant.
  - `resolve_provider_discovery_or_warn(client, info) -> ModelProviderInfo` — best-effort variant that warns and falls back on failure.
- `codex-rs/model-provider/src/discovery_tests.rs` — 9 tests using `wiremock` covering success, extra-fields-ignored, non-2xx, non-JSON, relative-URL rejection, invalid-discovery-URL, best-effort fallback, and no-op-when-unset.
- `codex-rs/core/config.schema.json` — regenerated via `just write-config-schema`.

## Scope of the reference implementation

The fields and helper are wired into `ModelProviderInfo`, but the helper is **not yet invoked from the 30+ `create_model_provider` call sites** across `codex-core`, `codex-tui`, `codex-app-server`, etc. Where to plumb session-start discovery is a design choice that materially affects async ergonomics, and I wanted to collect maintainer guidance on shape before touching every session-construction path. The helper is self-contained and ready to drop into whichever site(s) you'd prefer:

- A central session-init helper that wraps `create_model_provider`?
- An async builder on `Config` that pre-resolves discovery before the synchronous `create_model_provider` is called?
- Per-site, with operators opting in by feature flag?

Happy to do that integration as a follow-up PR once the wire format and helper shape are approved (or to rework if you'd prefer different ergonomics).

## Open to alternatives

If `discovery_url` as a name conflicts with existing terminology, or if you'd prefer the wire format to carry more metadata up front (e.g., `{ "base_url": "...", "ttl_seconds": 30, "expires_at": "..." }`), happy to rework. The patch shape is small enough that iteration is cheap.

