Summary
The Codex CLI app-server worker process exits with code 0 mid-turn on Linux/WSL2 immediately after a specific tool-call sequence pattern, leaving the broker stuck and the wrapper layer hanging on state.completion. No error event is propagated. Reproducible 4 times across 24 hours on codex-cli 0.130.0 and 0.129.0.
Environment
- Codex CLI: 0.129.0 → 0.130.0 (both reproduce)
- Companion / wrapper:
@openai/codex plugin runtime via codex-companion.mjs
- Host: Windows 11 + WSL2 (Ubuntu 22.04, kernel 6.6.114.1-microsoft-standard-WSL2)
- Model: gpt-5.5 with xhigh reasoning effort (default for codex)
- Sandbox:
danger-full-access
- Trigger: dispatched as background job from a Claude Code orchestrator
Reproducer (high reliability — 4/4 in 24h)
The worker dies after this exact sequence in a single turn:
- Multiple
psql / sed exploration commands (parallel batch, all complete with exit 0)
- Model emits an "announcement" assistant message ("I'm going to write
<file>...")
mkdir -p <output_dir> — succeeds
- ONE more
psql or shell probe (e.g., psql -Atc "SELECT exchange, contract_type FROM symbols") — succeeds
- Worker silently exits. No
Turn completed event, no error, no SIGTERM in logs. Companion state.completion Promise hangs indefinitely.
Observed across 4 distinct Codex jobs today:
task-moxeya7t-248j70 (v4 analysis executor, 21:20 UTC)
task-moxpj6m0-zr9uvr (PR creation task, 02:13 UTC next day)
task-moxq4ixm-aln8jh (REST probe investigation, 02:29 UTC)
bt0dkgkg6 / b77sx7ne9 (analytical re-run, 16:00 UTC)
What I see in logs
/home/user/.claude/plugins/data/codex-openai-codex/state/<workspace>/jobs/<task-id>.log ends with:
[ts] Command completed: /bin/bash -lc 'psql ...' (exit 0)
and zero lines after that. No turn/completed notification, no client.exit, no error event.
broker.log is empty (0 bytes) on the workspace's /tmp/cxc-*/ directory.
Expected behavior
Either:
- Turn completes normally with subsequent file-write tool calls
- OR worker emits an error event before exiting
Neither happens — exit is silent.
Workaround applied locally (wrapper-layer)
Patched lib/codex.mjs captureTurn() to use Promise.race([state.completion, exitRace]) where exitRace watches client.exitPromise and throws if turn isn't yet complete. This converts indefinite hang into clean exception, but doesn't fix the underlying app-server crash.
Related (non-duplicate)
Reproduction setup (if upstream wants to bisect)
I can provide:
- All 4 task log files (~10-90 lines each, JSON-ish single-line structure)
- Companion broker.log captures
- Process tree at moment of failure (broker alive, app-server dead)
- The wrapper-layer Promise.race patch as before/after diff
Request: enable structured logging in app-server's exit path, OR add a "worker-died-mid-turn" event to surface this state instead of silent exit.
Summary
The Codex CLI
app-serverworker process exits with code 0 mid-turn on Linux/WSL2 immediately after a specific tool-call sequence pattern, leaving the broker stuck and the wrapper layer hanging onstate.completion. No error event is propagated. Reproducible 4 times across 24 hours oncodex-cli 0.130.0and0.129.0.Environment
@openai/codexplugin runtime viacodex-companion.mjsdanger-full-accessReproducer (high reliability — 4/4 in 24h)
The worker dies after this exact sequence in a single turn:
psql/sedexploration commands (parallel batch, all complete with exit 0)<file>...")mkdir -p <output_dir>— succeedspsqlor shell probe (e.g.,psql -Atc "SELECT exchange, contract_type FROM symbols") — succeedsTurn completedevent, no error, no SIGTERM in logs. Companionstate.completionPromise hangs indefinitely.Observed across 4 distinct Codex jobs today:
task-moxeya7t-248j70(v4 analysis executor, 21:20 UTC)task-moxpj6m0-zr9uvr(PR creation task, 02:13 UTC next day)task-moxq4ixm-aln8jh(REST probe investigation, 02:29 UTC)bt0dkgkg6/b77sx7ne9(analytical re-run, 16:00 UTC)What I see in logs
/home/user/.claude/plugins/data/codex-openai-codex/state/<workspace>/jobs/<task-id>.logends with:and zero lines after that. No
turn/completednotification, noclient.exit, no error event.broker.logis empty (0 bytes) on the workspace's/tmp/cxc-*/directory.Expected behavior
Either:
Neither happens — exit is silent.
Workaround applied locally (wrapper-layer)
Patched
lib/codex.mjscaptureTurn()to usePromise.race([state.completion, exitRace])whereexitRacewatchesclient.exitPromiseand throws if turn isn't yet complete. This converts indefinite hang into clean exception, but doesn't fix the underlying app-server crash.Related (non-duplicate)
Reproduction setup (if upstream wants to bisect)
I can provide:
Request: enable structured logging in app-server's exit path, OR add a "worker-died-mid-turn" event to surface this state instead of silent exit.