Codex Desktop: shutdown/nonexistent subagents appear to count toward spawn_agent thread limit #23219

@zhiyuxinxi

Summary

Codex Desktop appears to hit the spawn_agent thread limit even when the visible subagent list contains only the main thread plus historical subagents marked as shutdown. Attempts to close those historical subagents fail with "thread ... not found", yet subsequent spawn_agent calls still report that the session/thread limit has been reached.

This makes multi-agent review workflows unreliable: the user sees old subagent entries that are no longer live or closable, while the model cannot spawn the intended new reviewers.

Environment

  • Product: Codex Desktop app
  • Platform: Windows / PowerShell session
  • Workspace path shape: H:\...
  • App version: not available from this session; I can provide it later from the About dialog if needed.
  • Mode: local desktop session using the multi-agent/collab spawn_agent, list_agents, and close_agent tools.

What happened

During a multi-AI review workflow, the model attempted to spawn several read-only reviewer agents. Only one new reviewer could be spawned. The other spawn_agent calls failed with:

collab spawn failed: agent thread limit reached

The model then inspected live agents with list_agents. The returned list showed:

  • /root as running
  • several older subagents as shutdown
  • the new reviewer as running initially

The historical shutdown entries included older unrelated review tasks from previous work in the same root session. They were not part of the current task.

The model attempted to explicitly close those old shutdown entries with close_agent, but each close attempt failed with errors like:

collab tool failed: thread <id> not found

After this, the system still behaved as if the thread/session limit had been reached, even though those old agents were already shutdown and not closable.

Expected behavior

One of these should happen:

  1. Subagents that are in the shutdown state, or that no longer exist, should not count against the active max_concurrent_threads_per_session / spawn limit.
  2. If they do count, close_agent should be able to remove/release them.
  3. If the limit is not about active agents but about total historical agents in a root tree, the error message should say so explicitly and provide a recovery path.
  4. list_agents should clearly distinguish active, closed-but-retained, and non-counting historical entries.

Actual behavior

  • spawn_agent reports the thread limit is reached.
  • list_agents shows old unrelated agents as shutdown.
  • close_agent cannot close them because the thread is reported as not found.
  • The user and model cannot tell which entries are still consuming capacity or how to release capacity.

Why this matters

This blocks legitimate multi-agent workflows. In this case, the user explicitly asked for multiple AI reviewers to concurrently critique planning artifacts. The model could only spawn one reviewer and had to fall back to a less reliable main-thread-only review.

It also creates a confusing UX: the model reports old shutdown threads, cannot close them, and cannot spawn new agents, which looks like stale session state or leaked accounting.

Reproduction outline

  1. In Codex Desktop, create a root session.
  2. Spawn several subagents for review tasks.
  3. Let them complete and shut down, so that they appear as historical shutdown entries in list_agents.
  4. Later in the same root session, try to spawn several new subagents.
  5. Observe spawn_agent failing with "agent thread limit reached".
  6. Call list_agents; observe old entries marked shutdown.
  7. Try close_agent on those old entries; observe "thread ... not found".
  8. Try spawning again; observe the effective capacity is still unavailable.
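
The steps above can be mirrored against a small toy model. This is only a sketch of one accounting bug that would produce the same symptoms, not Codex's actual implementation: the class LeakyAgentRegistry, its finish method, the limit of 4, and the assumption that the root thread occupies a slot are all hypothetical. Only the tool names and the two error strings come from the observed behavior.

```python
class ThreadLimitReached(Exception):
    pass


class ThreadNotFound(Exception):
    pass


class LeakyAgentRegistry:
    """Toy model: a slot counter that is not decremented when an agent shuts down."""

    def __init__(self, max_threads):
        self.max_threads = max_threads
        self.used_slots = 1                  # assumption: the root thread uses a slot
        self.live = {"/root"}                # threads that close_agent can still find
        self.status = {"/root": "running"}   # what list_agents reports
        self._next_id = 0

    def spawn_agent(self):
        if self.used_slots >= self.max_threads:
            raise ThreadLimitReached("agent thread limit reached")
        self._next_id += 1
        agent_id = f"agent-{self._next_id}"
        self.used_slots += 1
        self.live.add(agent_id)
        self.status[agent_id] = "running"
        return agent_id

    def finish(self, agent_id):
        # Hypothesised leak: the thread is dropped, but its slot is never released.
        self.live.discard(agent_id)
        self.status[agent_id] = "shutdown"

    def list_agents(self):
        return dict(self.status)

    def close_agent(self, agent_id):
        if agent_id not in self.live:
            raise ThreadNotFound(f"thread {agent_id} not found")
        self.live.remove(agent_id)
        self.status[agent_id] = "shutdown"
        self.used_slots -= 1


def reproduce(max_threads=4):
    """Walk through steps 2-8 and record what each step observes."""
    registry = LeakyAgentRegistry(max_threads)
    old = [registry.spawn_agent() for _ in range(max_threads - 2)]  # step 2
    for agent_id in old:                                            # step 3
        registry.finish(agent_id)

    observed = {"close_errors": []}
    registry.spawn_agent()            # step 4: the one new reviewer that fits
    try:                              # step 5: the next spawn hits the limit
        registry.spawn_agent()
        observed["first_spawn_error"] = None
    except ThreadLimitReached as exc:
        observed["first_spawn_error"] = str(exc)

    observed["statuses"] = registry.list_agents()                   # step 6
    for agent_id in old:                                            # step 7
        try:
            registry.close_agent(agent_id)
        except ThreadNotFound as exc:
            observed["close_errors"].append(str(exc))

    try:                              # step 8: capacity is still unavailable
        registry.spawn_agent()
        observed["second_spawn_error"] = None
    except ThreadLimitReached as exc:
        observed["second_spawn_error"] = str(exc)
    return observed
```

In this model the shutdown entries stay visible in list_agents and keep consuming slots, while close_agent cannot find them. Whether the real cause is a leaked counter or a limit on total historical agents is not known from this session.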

Related-looking issues found before filing

I found some similar-but-not-identical issues around subagent/session state:

This report is specifically about shutdown / nonexistent agents being visible but not closable, while spawn_agent still reports a thread-limit condition.

Suggested fix direction

  • Make the spawn limit count only currently running or pending agents.
  • Ensure completed/shutdown child agents release capacity deterministically.
  • Make close_agent idempotent: closing an already-shutdown or nonexistent child should return success or a clear "already closed; not counted" result.
  • Add a diagnostic field to list_agents, such as counts_toward_spawn_limit: true/false.
  • Improve the "agent thread limit reached" error to include the active count, the maximum count, and a list of the agents that are counted.
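
A minimal sketch of how these suggestions could fit together, assuming a simple in-memory registry. The class AgentRegistry, the AgentStatus enum, mark_shutdown, and the exact wording of the return values and error text are hypothetical illustrations, not the Codex implementation. Only child agents are modelled here.

```python
from enum import Enum


class AgentStatus(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SHUTDOWN = "shutdown"


# Only these states consume spawn capacity.
COUNTED = {AgentStatus.PENDING, AgentStatus.RUNNING}


class ThreadLimitReached(Exception):
    pass


class AgentRegistry:
    def __init__(self, max_threads):
        self.max_threads = max_threads
        self._agents = {}   # agent id -> AgentStatus
        self._next_id = 0

    def _counted_ids(self):
        # Capacity is derived from status, so there is no counter to leak.
        return sorted(a for a, s in self._agents.items() if s in COUNTED)

    def spawn_agent(self):
        counted = self._counted_ids()
        if len(counted) >= self.max_threads:
            raise ThreadLimitReached(
                f"agent thread limit reached: {len(counted)}/{self.max_threads} "
                f"active ({', '.join(counted)})"
            )
        self._next_id += 1
        agent_id = f"agent-{self._next_id}"
        self._agents[agent_id] = AgentStatus.RUNNING
        return agent_id

    def mark_shutdown(self, agent_id):
        # Called when a child completes; releases capacity immediately.
        self._agents[agent_id] = AgentStatus.SHUTDOWN

    def close_agent(self, agent_id):
        # Idempotent: closing a shutdown or unknown agent is not an error.
        status = self._agents.get(agent_id)
        if status is None or status is AgentStatus.SHUTDOWN:
            return "already closed; not counted"
        self._agents[agent_id] = AgentStatus.SHUTDOWN
        return "closed"

    def list_agents(self):
        return [
            {
                "id": agent_id,
                "status": status.value,
                "counts_toward_spawn_limit": status in COUNTED,
            }
            for agent_id, status in self._agents.items()
        ]
```

Deriving the count from agent status, rather than keeping a separate counter, means a completed agent cannot keep holding capacity, and the error message can name exactly which agents are counted.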

Labels

app, bug, session, subagent, windows-os
