Detached task-worker exits without writing failed status on broker socket disconnect #21813

@vstandos

Issue

When a detached task-worker process (started via codex-companion.mjs task --background) exits because the broker JSON-RPC socket is closed or reset, runTrackedJob's catch block is never reached. The job stays permanently at status: "running" in state.json, with a PID that no longer exists.

Reproduce

  1. Start a long-running codex task in background mode.
  2. Kill the broker mid-task, or cause a broker socket reset (e.g., a transient network blip on the app-server connection).
  3. The task-worker process dies silently.
  4. codex status reports "running" indefinitely.

Evidence (from local instance)

  • task-worker spawn: codex-companion.mjs lines 641-679 (spawnDetachedTaskWorker with detached: true, stdio: "ignore")
  • State write on completion: lib/tracked-jobs.mjs lines 153-203 (only the catch path writes the failed status)
  • Two ghost-running tasks observed today: last activity timestamps are minutes before the broker reset, followed by no log entries, no error, and dead PIDs

Proposed fix

Add process.on handlers for 'exit', 'uncaughtException', and 'unhandledRejection' in handleTaskWorker that synchronously write status: "failed" to the job file before the process exits. Without this, operators cannot distinguish actively running tasks from dead ones.
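A minimal sketch of what such handlers could look like, not the actual codex-companion code. The names markJobFailedSync and installExitGuards, the per-job JSON file, and the error/finishedAt fields are assumptions made for illustration; only the "running" and "failed" status values come from this report.

```javascript
import fs from "node:fs";

// Hypothetical helper: flip a job from "running" to "failed" using only
// synchronous fs calls, so it is safe to call from an 'exit' listener.
// ASSUMPTION: the job file is a JSON object with a `status` field.
function markJobFailedSync(jobFile, reason) {
  let job;
  try {
    job = JSON.parse(fs.readFileSync(jobFile, "utf8"));
  } catch {
    return false; // missing or unreadable job file: nothing to update
  }
  // Never overwrite a status that the normal code path already finalized.
  if (job.status !== "running") return false;
  const updated = {
    ...job,
    status: "failed",
    error: String(reason),
    finishedAt: new Date().toISOString(),
  };
  // Write to a temp file, then rename, so readers never see a partial file.
  const tmp = `${jobFile}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(updated, null, 2));
  fs.renameSync(tmp, jobFile);
  return true;
}

// Hypothetical wiring, intended to run once at the start of the worker.
// `proc` is injectable only so the sketch can be exercised without exiting.
function installExitGuards(jobFile, proc = process) {
  proc.on("uncaughtException", (err) => {
    markJobFailedSync(jobFile, err?.stack ?? err);
    proc.exit(1);
  });
  proc.on("unhandledRejection", (reason) => {
    markJobFailedSync(jobFile, reason?.stack ?? reason);
    proc.exit(1);
  });
  // 'exit' listeners may only do synchronous work; this is the last chance
  // to record a failure if the worker exits while the job is still "running".
  proc.on("exit", (code) => {
    markJobFailedSync(jobFile, `worker exited with code ${code} before recording a result`);
  });
}
```

Because the helper only acts on jobs still marked "running", a worker that finished normally and already wrote its final status is left untouched. Note that Node.js does not emit 'exit' when the process is terminated by a signal with no listener installed, and SIGKILL cannot be intercepted at all, so a liveness check on the reading side would still be needed for those cases.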

Workaround

Operators must run ps -p <PID> to check whether the worker is alive. If it is dead, they have to manually mark the task as failed in state.json.
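Until a fix lands, the manual check could be scripted. The sketch below is illustrative only: the state.json layout ({ jobs: [{ id, status, pid }] }), the error field, and the function names are assumptions, since the real schema is not shown in this report.

```javascript
import fs from "node:fs";

// Returns true if a process with this PID exists. Signal 0 only performs
// the existence/permission check; no signal is delivered.
function isPidAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but is owned by another user.
    return err.code === "EPERM";
  }
}

// ASSUMED layout: { jobs: [{ id, status, pid }] }. Adjust the field names
// to whatever the real state.json uses.
function reapDeadJobs(stateFile, isAlive = isPidAlive) {
  const state = JSON.parse(fs.readFileSync(stateFile, "utf8"));
  const reaped = [];
  for (const job of state.jobs ?? []) {
    const hasPid = Number.isInteger(job.pid) && job.pid > 0;
    if (job.status === "running" && hasPid && !isAlive(job.pid)) {
      job.status = "failed";
      job.error = "worker process not found";
      reaped.push(job.id);
    }
  }
  // Only rewrite the file when something actually changed.
  if (reaped.length > 0) {
    fs.writeFileSync(stateFile, JSON.stringify(state, null, 2));
  }
  return reaped;
}
```

A PID check can produce false positives if the operating system has reused the PID for an unrelated process, so this is a stopgap rather than a substitute for the worker recording its own failure.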

Environment: codex-cli 0.129.0, Linux x86_64 (WSL2 Ubuntu 24.04), npm 11.x.

Labels: CLI, app-server, bug, connectivity