What version of Codex CLI is running?
codex-cli 0.132.0
What subscription do you have?
ChatGPT Pro
Which model were you using?
gpt-5.5 as supervising Codex model; external worker commands used opencode with deepseek/deepseek-v4-pro and deepseek-chat
What platform is your computer?
Darwin 25.5.0 arm64 arm
What terminal emulator and version are you using (if applicable)?
Ghostty with tmux 3.6b, zsh. Codex env reports TERM_PROGRAM=tmux, TERM=tmux-256color. Manual opencode commands were run from a normal terminal session.
Codex doctor report
codex doctor --json was available. Relevant excerpts from this Codex session:
- codexVersion: 0.132.0
- config.load: ok; model: gpt-5.5; feature flags include shell_tool, unified_exec, multi_agent, plugins, in_app_browser, browser_use, etc.
- sandbox.helpers: ok; approval policy OnRequest; filesystem sandbox restricted; network sandbox restricted
- state.paths: ok; state/log DBs inspectable under ~/.codex
- terminal.env: ok; tmux 3.6b; TERM=tmux-256color; shell=/bin/zsh
- network.provider_reachability: fail inside the Codex environment due to sandboxed network reachability, while the user terminal could run the external CLI successfully
- updates.status: warning due to api.github.com DNS resolution from the Codex environment
I can provide the full JSON if useful, but omitted it here to keep the issue focused and avoid unnecessary local path/account detail.
What issue are you seeing?
When using Codex as a supervising agent and delegating implementation/testing to an external CLI worker (OpenCode with DeepSeek), commands launched through the Codex exec/harness were unreliable in ways that did not reproduce in a normal terminal.
The concrete pattern:
- The user could run this manually without issue:
  opencode run -m deepseek/deepseek-v4-pro 'Read tests/test_telegram_notify.py, then reply with the number of test functions only. Do not edit files.'
  It completed and returned the expected answer.
- Similar OpenCode/DeepSeek runs launched through Codex/tmux/exec were unreliable: broad prompts stalled, behavior was hard to diagnose from the harness, and sandboxed/executed variants appeared to have trouble with OpenCode's normal home-directory state/lock/database files.
- The likely failure surface was not DeepSeek or OpenCode itself. The same model/tool worked outside the Codex harness. The problem was the mismatch between Codex's sandbox/exec/harness environment and a normal host terminal when supervising an external agent CLI that maintains its own state.
- The result was misleading for a supervisor workflow: Codex could not confidently tell whether the worker was actually blocked, sandboxed, waiting, or simply behaving differently under the harness.
Expected supervisor workflow:
- Codex reads and plans.
- Codex starts a tmux pane running an external worker CLI such as OpenCode/DeepSeek.
- Worker edits/tests.
- Codex reviews diffs and runs final verification.
That workflow works manually on the same machine, but was unreliable when Codex launched/managed the external CLI.
What steps can reproduce the bug?
Uploaded thread: 019e49fc-70cc-7632-b042-68686cbb5289
- On macOS, from a repo workspace, ask Codex to supervise implementation while delegating coding/testing to an external CLI worker in tmux, e.g. OpenCode with DeepSeek.
- Have Codex launch a worker command such as:
  opencode run -m deepseek/deepseek-v4-pro ''
  or run it inside a tmux pane from Codex.
- Observe that the worker can stall or behave unreliably from the Codex-managed execution path. The same tool/model can work normally from Terminal.
- In a normal Terminal, run a minimal smoke test:
  opencode run -m deepseek/deepseek-v4-pro 'Read tests/test_telegram_notify.py, then reply with the number of test functions only. Do not edit files.'
  In my case this completed successfully and returned the expected result.
- Compare that with the Codex-supervised/harness-launched run. Codex had difficulty determining whether the worker was blocked due to sandbox permissions, external CLI state files, model behavior, or harness management.
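One way to make the comparison in the last step more concrete is to dump `env` from both contexts and diff the results. The following is only a sketch under my own assumptions: `parse_env` and `diff_env` are hypothetical helper names, and environment differences are just one candidate cause, not a confirmed one.

```python
from typing import Dict, Optional, Tuple


def parse_env(text: str) -> Dict[str, str]:
    """Parse the output of `env` into a dict.

    Assumption: one KEY=VALUE pair per line; multi-line values are not handled.
    """
    result: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key:
            result[key] = value
    return result


def diff_env(
    host: Dict[str, str], codex: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {key: (host_value, codex_value)} for every key that differs.

    A value of None means the variable is absent in that environment.
    """
    differences = {}
    for key in sorted(set(host) | set(codex)):
        host_value = host.get(key)
        codex_value = codex.get(key)
        if host_value != codex_value:
            differences[key] = (host_value, codex_value)
    return differences
```

To use it, run `env > host.env` in the normal terminal and `env > codex.env` from a Codex-launched command, then pass the contents of both files through `parse_env` and `diff_env`. Variables such as `HOME`, `PATH`, `TERM`, and `TERM_PROGRAM` are the ones I would look at first.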
Searches I checked before filing:
- opencode sandbox
- deepseek opencode
- SQLITE_READONLY OR EPERM opencode
- external agent sandbox CLI
I did not find an issue specifically covering OpenCode/DeepSeek or external agent CLI supervision failing under Codex while the same command works in Terminal.
What is the expected behavior?
Codex should either support this external-supervisor workflow reliably or make the limitation explicit.
Expected behavior:
- If a command is approved/escalated, Codex should make clear whether it is host-equivalent or still subject to sandbox/harness differences.
- External CLI tools with normal home-directory state, lock files, and local databases should either work when approved, or Codex should surface the exact sandbox denial/state-path problem rather than leaving the supervising agent to infer it from a stalled worker.
- A tmux-launched external CLI worker should be observable enough for Codex to distinguish: running, waiting for input, blocked by sandbox, failed due to permission, failed due to model/provider, or completed.
- The same command should not behave materially differently under Codex without a clear warning or diagnostic.
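To illustrate the kind of status distinction I mean, here is a rough sketch of a classifier over data a supervisor could already collect from tmux. It assumes the pane was started with `remain-on-exit` enabled, that `pane_dead` and `pane_dead_status` come from something like `tmux display-message -p -t <pane> '#{pane_dead} #{pane_dead_status}'`, and that `tail` is recent output from `tmux capture-pane -p -t <pane>`. The marker strings and the extra `FAILED_UNKNOWN` state are my own guesses, not anything Codex or OpenCode documents.

```python
from enum import Enum
from typing import Optional


class WorkerState(Enum):
    RUNNING = "running"
    WAITING_FOR_INPUT = "waiting for input"
    BLOCKED_BY_SANDBOX = "blocked by sandbox"
    FAILED_PERMISSION = "failed due to permission"
    FAILED_PROVIDER = "failed due to model/provider"
    FAILED_UNKNOWN = "failed, cause unknown"  # added for this sketch
    COMPLETED = "completed"


# Illustrative substrings only; real tools may print different messages.
PERMISSION_MARKERS = ("EPERM", "Operation not permitted", "SQLITE_READONLY", "Permission denied")
PROVIDER_MARKERS = ("rate limit", "Unauthorized", "ENOTFOUND")
INPUT_MARKERS = ("[y/N]", "(y/n)", "?")


def classify(pane_dead: bool, exit_status: Optional[int], tail: str) -> WorkerState:
    """Guess the worker state from pane liveness, exit status, and recent output."""
    has_permission_error = any(marker in tail for marker in PERMISSION_MARKERS)
    has_provider_error = any(marker in tail for marker in PROVIDER_MARKERS)

    if pane_dead:
        if exit_status == 0:
            return WorkerState.COMPLETED
        if has_permission_error:
            return WorkerState.FAILED_PERMISSION
        if has_provider_error:
            return WorkerState.FAILED_PROVIDER
        return WorkerState.FAILED_UNKNOWN

    # Process still alive: a permission error in the output suggests it is stuck.
    if has_permission_error:
        return WorkerState.BLOCKED_BY_SANDBOX
    if tail.rstrip().endswith(INPUT_MARKERS):
        return WorkerState.WAITING_FOR_INPUT
    return WorkerState.RUNNING
```

This is heuristic string matching and will misclassify some cases. A first-class signal from the sandbox itself, such as the denied path and operation, would be far more reliable than scraping pane output.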
This matters because external worker CLIs are a practical way to implement supervised multi-agent workflows where Codex remains responsible for planning, diff review, final verification, and merge decisions.
Additional information
This is not about Codex's built-in subagents. It is about Codex supervising an external CLI agent process, specifically OpenCode with DeepSeek, from a tmux/executed shell workflow.
Workarounds used:
- The user manually ran OpenCode/DeepSeek in Terminal to confirm the tool/model worked outside Codex.
- Codex narrowed prompts and switched from deepseek/deepseek-v4-pro to deepseek-chat for the delegated implementation after troubleshooting.
- Codex treated worker output as untrusted, reviewed every diff manually, and ran final verification itself.
What would make this better:
- Clear diagnostics when sandboxed commands cannot access external CLI state under ~/.local, ~/.cache, ~/.config, or similar paths.
- A documented/reliable way for Codex to launch and monitor external supervisor-worker CLIs in tmux.
- Explicit UI/tool output stating whether an escalated command is truly host-equivalent or still differs from a normal terminal.
- Better status reporting for worker processes launched through the Codex harness: running, blocked on permission, waiting for input, exited, or unreachable.
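As a stopgap for the first point, a supervisor could probe the state directories itself before launching a worker. This is a minimal sketch, not a proposal for how Codex should implement it: `probe_writable` is a hypothetical helper, the default paths are the ones listed above, and I have not confirmed which of them OpenCode actually uses.

```python
import errno
import os
import tempfile
from pathlib import Path
from typing import Dict, Iterable

# Directories named in this issue; which ones a given CLI uses is an assumption.
DEFAULT_STATE_DIRS = ("~/.local", "~/.cache", "~/.config")


def probe_writable(paths: Iterable[str] = DEFAULT_STATE_DIRS) -> Dict[str, str]:
    """Try to create (and immediately delete) a temp file in each directory.

    Returns {path: "ok"} or {path: "<ERRNO_NAME>: <message>"} so the caller
    can tell a sandbox denial (e.g. EPERM/EACCES) from a missing directory (ENOENT).
    """
    report: Dict[str, str] = {}
    for raw in paths:
        directory = Path(raw).expanduser()
        try:
            # The file is removed automatically when the context exits.
            with tempfile.NamedTemporaryFile(dir=directory):
                pass
            report[raw] = "ok"
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, "UNKNOWN")
            report[raw] = f"{name}: {exc.strerror}"
    return report


if __name__ == "__main__":
    for path, status in probe_writable().items():
        print(f"{path}: {status}")
```

Running the same script from a normal terminal and from a Codex-launched command would show directly whether the two contexts have different write access to these paths.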