Codex Desktop: gpt-5.5 xhigh turn stalled 30m before first output, then resumed normally

## Summary

A Codex Desktop turn using `gpt-5.5` with `xhigh` reasoning remained user-visible as `Thinking` for more than 30 minutes before the first persisted assistant/reasoning item appeared. Once the first item appeared, the turn continued normally with assistant text and tool calls within seconds.

This looks different from ordinary slow generation: the local rollout has no assistant/reasoning/tool event during the gap, and normal persisted logs did not retain a useful stream/retry diagnostic for the missing interval.

## Environment

- Product: Codex Desktop on Windows 11 with WSL2 Ubuntu workspace
- Desktop package observed in logs: `OpenAI.Codex_26.519.5221.0`
- Desktop release string observed in logs: `26.519.41501`
- WSL app-server binary: `codex-cli 0.130.0-alpha.5`
- Model: `gpt-5.5`
- Reasoning effort: `xhigh`
- Workspace type: WSL project

## Primary observed case

All timestamps below are UTC.

- User submitted turn: `2026-05-23T16:09:18.281Z`
- First persisted assistant/reasoning item: `2026-05-23T16:39:56.682Z`
- Pre-first-output gap: about `30m38s`

First persisted output sequence after the stall:

```text
2026-05-23T16:39:56.682Z response_item reasoning
2026-05-23T16:39:57.612Z event_msg agent_message
2026-05-23T16:39:57.612Z response_item message assistant
2026-05-23T16:40:08.580Z response_item function_call
2026-05-23T16:40:08.726Z response_item function_call_output
2026-05-23T16:40:08.727Z event_msg token_count
```

User-visible behavior: the thread sat on `Thinking` for the entire gap. When it finally resumed, it did not appear to replay a backlog; it just began producing the first reasoning/message/tool items at normal cadence.

## Secondary same-day signal

In another thread on the same Desktop session, the UI visibly showed `Reconnecting... 2/5` while still in `Thinking` on `gpt-5.5`/`xhigh`. That shorter case had about a `41.6s` gap before first reasoning output, but the local Desktop app-server transport logs did not show a corresponding app-server reconnect/restart.

This may be related to existing reconnect/stream issues, but the important gap here is observability: the visible reconnect state and the long pre-first-output stall are not represented clearly enough in the durable rollout/log artifacts.

## Historical local scan

I scanned local rollout JSONL files for the same host, using time from user submission to first assistant/reasoning/tool item. Private transcripts and paths were not included in this report.

High-level results:

```text
rollout files scanned: 305
completed turns with usable timing: 7618

gpt-5.5 / xhigh: n=168, p95=1m11s, max=30m38s, >=120s=3, >=300s=1, >=600s=1, >=1800s=1
gpt-5.5 / high:  n=496, max=3m13s, >=120s=2
gpt-5.5 / medium: n=347, max=1m55s, >=120s=0
gpt-5.5 / low: n=120, max=1m07s, >=120s=0
codex-auto-review / low: n=6452, max=0m48s, >=120s=0
```

This suggests the 30m+ outlier is strongly associated with `gpt-5.5` + `xhigh` in this local sample, though it does not prove the issue is exclusive to `xhigh`.

## Expected behavior

- A turn should either start streaming within the normal startup range, or surface a durable diagnostic/error state if the response stream is idle/retrying for many minutes before first output.
- If the UI shows reconnecting/retry state, the durable logs/rollout should retain enough information to distinguish model queueing, websocket retry, backend stall, app-server transport reconnect, and local UI state races.

## Actual behavior

- UI remained on `Thinking` for `30m38s` before the first persisted output item.
- The turn eventually resumed normally, making it look like the request was alive but silent for the entire interval.
- A shorter same-day case showed visible `Reconnecting... 2/5`, but there was no matching Desktop app-server reconnect/restart in local logs.

## Related issues

Possibly related, but not exact duplicates:

- #18471: `Reconnecting...` visible while app-server transport appears connected
- #20739: Responses WebSocket closes before `response.completed`
- #21360: Desktop sessions stuck in thinking/generating lifecycle stalls


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Codex Desktop: gpt-5.5 xhigh turn stalled 30m before first output, then resumed normally #24260

Summary

Environment

Primary observed case

Secondary same-day signal

Historical local scan

Expected behavior

Actual behavior

Related issues

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Codex Desktop: gpt-5.5 xhigh turn stalled 30m before first output, then resumed normally #24260

Description

Summary

Environment

Primary observed case

Secondary same-day signal

Historical local scan

Expected behavior

Actual behavior

Related issues

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions