Codex CLI native process can remain orphaned/headless and spin at 100% CPU after session completes #28794

@mychaelconnolly

Summary

Codex sometimes leaves a stale native codex process running after the session appears complete. In my latest case, the process had no TTY, revoked stdio, an orphaned/reparented process shape, and consumed one full core overnight.

This looks related to #13196 / #12491 / #19753, but the latest incident was not primarily an MCP-helper leak: the hot process was the native Codex CLI child itself.

Environment

  • Codex CLI: 0.140.0
  • macOS: 26.5.1 build 25F80
  • Observed across iTerm2 and Terminal.app usage
  • Frequent use of plan mode / background-terminal style workflows

Observed

On 2026-06-17, my Mac was running hot. Process inspection found:

  • parent node process: PID 17724
  • child native codex process: PID 17725
  • MCP/lazy-gate children: PIDs 17767 and 17769
  • no TTY
  • fd 0 and fd 1 revoked; fd 2 pointing at /dev/null
  • process start time around 2026-06-16 16:00 local
  • native codex at around 100% CPU
  • about 831 CPU minutes accumulated
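
For anyone trying to check whether they are hitting the same shape, here is a minimal sketch of the inspection in Python. It is not a Codex tool. It shells out to the standard `ps` utility and filters for processes with no TTY and high CPU. The function names, the `name="codex"` match on the executable's basename, and the 50% threshold are assumptions chosen for illustration; the binary name on a given install may differ.

```python
import os
import subprocess

# TTY column values that ps prints for a process without a controlling
# terminal ("??" on macOS, "?" on Linux, "-" on some other systems).
NO_TTY = {"?", "??", "-"}


def parse_ps(text):
    """Parse `ps -A -o pid,ppid,tty,pcpu,time,comm` output into dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 5)  # comm is last and may contain spaces
        if len(parts) < 6:
            continue
        pid, ppid, tty, pcpu, cputime, comm = parts
        rows.append({
            "pid": int(pid),
            "ppid": int(ppid),
            "tty": tty,
            "pcpu": float(pcpu),
            "cputime": cputime,
            "comm": comm,
        })
    return rows


def headless_hot(rows, name="codex", min_cpu=50.0):
    """Return rows named `name` that have no TTY and are using CPU.

    `name` and `min_cpu` are illustrative defaults, not Codex constants.
    """
    return [
        r for r in rows
        if os.path.basename(r["comm"]) == name
        and r["tty"] in NO_TTY
        and r["pcpu"] >= min_cpu
    ]


def snapshot():
    """Capture a process listing using POSIX ps output keywords."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,ppid,tty,pcpu,time,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `headless_hot(parse_ps(snapshot()))` lists candidate processes. The `ppid` column then shows whether each one still hangs off a live parent or has been reparented.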

The likely matching session log started at 2026-06-16T20:01:41Z and last recorded task_complete at 2026-06-16T22:32:24Z, but the native process was still running hot the next morning.
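
A quick sanity check on those numbers, as a small Python calculation. It uses only the timestamps and CPU figure quoted above and assumes the process never used much more than one core, which matches the roughly 100% CPU reading but is still an assumption.

```python
from datetime import datetime


def parse_utc(stamp):
    # Swap the trailing "Z" so this also works on Python older than 3.11.
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


session_start = parse_utc("2026-06-16T20:01:41Z")
task_complete = parse_utc("2026-06-16T22:32:24Z")
session_seconds = int((task_complete - session_start).total_seconds())
session_minutes = session_seconds / 60

cpu_minutes = 831  # "about 831 CPU minutes" from the process listing

# Assumption: at most about one core in use, so the CPU time burned inside
# the logged window cannot exceed the window's wall-clock length.
min_cpu_minutes_outside = cpu_minutes - session_minutes

print(f"logged session window: {session_minutes:.1f} min")
print(f"CPU time at one full core: {cpu_minutes / 60:.2f} h")
print(f"CPU minutes outside the window (lower bound): "
      f"{min_cpu_minutes_outside:.1f}")
```

Under that assumption, the logged window is far too short to account for the accumulated CPU time, so most of it must have been spent outside the window, which is consistent with the process spinning after task_complete.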

After terminating only that stale tree, CPU returned to mostly idle.

Expected

When a Codex session/task/terminal lifecycle ends, Codex should terminate or reap its native worker process tree, including MCP/helper descendants, and should not leave a headless codex process spinning under launchd or with revoked stdio.
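
To make the expectation concrete, here is a minimal POSIX sketch in Python of one common way to get this behavior: start the worker as the leader of its own process group, then signal the whole group on shutdown. This is an illustration of the general technique, not Codex's actual launcher code; `spawn_in_own_group`, `terminate_group`, and the grace period are hypothetical names and values.

```python
import os
import signal
import subprocess


def spawn_in_own_group(argv):
    """Start argv as the leader of a new session and process group."""
    # start_new_session=True makes the child call setsid() before exec,
    # so its process group ID equals its PID.
    return subprocess.Popen(argv, start_new_session=True)


def terminate_group(proc, grace_seconds=2.0):
    """SIGTERM the child's whole group, escalating to SIGKILL if needed."""
    pgid = proc.pid  # valid because the child leads its own group
    try:
        os.killpg(pgid, signal.SIGTERM)
    except (ProcessLookupError, PermissionError):
        # Nothing left to signal (for example, the leader already exited);
        # just reap the leader.
        return proc.wait()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # The leader ignored SIGTERM; kill everything still in the group.
        os.killpg(pgid, signal.SIGKILL)
        return proc.wait()
```

Signaling the group rather than a single PID is what reaches helper descendants that never changed their own group. A descendant that ignores SIGTERM while the leader exits promptly would still survive this sketch, so a real shutdown path would likely need a final sweep as well.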

Actual

A completed/stale Codex process tree remained alive without a usable terminal/stdio path and continued consuming CPU for many hours.

Related

Hypothesis

There is still a session shutdown path where the parent Node/TUI/app process exits or disconnects without terminating the native codex child process group. #19753 appears to address stdio MCP server shutdown, but this case suggests the native Codex worker itself can survive one layer above that fix.
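
If that hypothesis holds, a complementary defense is for the worker itself to notice that its parent has gone away. The Python sketch below shows the general idea with a polling watchdog that compares the current parent PID against the one seen at startup. It is not taken from the Codex source, and `start_parent_watchdog`, its parameters, and the one-second poll interval are hypothetical.

```python
import os
import threading


def start_parent_watchdog(on_orphan, poll_seconds=1.0, get_ppid=os.getppid):
    """Call on_orphan() once the parent PID differs from the startup value.

    Comparing against the original parent PID, rather than checking for
    PID 1, also covers systems where orphans are adopted by another reaper.
    `get_ppid` is injectable so the logic can be exercised without actually
    killing a parent process.
    """
    original_ppid = get_ppid()
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout, True once stop is set.
        while not stop.wait(poll_seconds):
            if get_ppid() != original_ppid:
                on_orphan()
                return

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop, thread
```

A worker could pass a callback that shuts down its own helpers and exits. Setting the returned `stop` event cancels the watchdog during a normal shutdown.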

Prior similar local incident

On 2026-05-18, I hit a related but different runaway shape: Codex app-server spawned stale LAN/MCP ping children, each consuming about 96% CPU for over an hour. That issue was mitigated locally by lazy-gating MCP startup, but the 2026-06-17 incident happened at the native Codex worker layer rather than as hot LAN probe children.

Labels: CLI, bug, performance, session
