fix(codex): approvalPolicy=never + workspaceWrite sandbox #488

Closed
dcellison wants to merge 3 commits into main from fix/codex-approval-policy

@dcellison
Owner

Summary

Codex's default approvalPolicy: "on-request" makes the server emit approval-request notifications and wait for the client to respond before any tool call. Kai is an unattended Telegram bot with no in-loop approver, so codex sits silent waiting for approval until the bot's stdout-readline ceiling fires and the operator sees Error: Codex timed out. The default readOnly sandbox also prevents codex from writing review/artifact files under /tmp.

Surfaced today during the live spec-review smoke test: PING (no tool calls) worked; every task that required reading a file or writing an artifact timed out.
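For readers unfamiliar with the failure mode, here is a minimal sketch of how a readline ceiling turns a silent child process into a timeout error. The helper name `read_line_with_ceiling` and its `ceiling_s` parameter are hypothetical, and Kai's actual implementation may differ; only the "Codex timed out" message comes from this PR.

```python
import asyncio


async def read_line_with_ceiling(reader: asyncio.StreamReader, ceiling_s: float) -> bytes:
    """Read one line, or fail if nothing arrives within ceiling_s seconds.

    Hypothetical helper: illustrates the timeout described above, not Kai's code.
    """
    try:
        # readline() blocks until a newline or EOF; wait_for bounds that wait.
        return await asyncio.wait_for(reader.readline(), timeout=ceiling_s)
    except asyncio.TimeoutError:
        # A server that is waiting for an approval response writes nothing,
        # so the ceiling fires and the operator only sees this error.
        raise RuntimeError("Codex timed out") from None
```

With `approvalPolicy: "on-request"` and nobody answering the approval request, a read like this never sees a line, which matches the symptom in the smoke test.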

Change

thread/start params now pin:

"approvalPolicy": "never",
"sandbox": "workspaceWrite",

These match the BashTool / Edit / Write posture the claude backend ships with, so codex now has the same authority profile in the workspace.

Test plan

  • pytest tests/test_codex.py — 63 passed
  • make check clean
  • Smoke test on Mac mini: a spec-review task that reads /var/lib/kai/home/.../specs/... and writes /tmp/spec-N-review.md completes without timing out.

Refs #480 epic; same smoke session as #485.

zigguratt added 3 commits May 15, 2026 14:14

Codex's default `approvalPolicy: "on-request"` emits approval-request
notifications and waits for the client to respond before any tool
call. Kai is an unattended Telegram bot with no in-loop approver;
the bot's stdout-readline ceiling fires before codex unblocks, and
the operator sees "Codex timed out". The default readOnly sandbox
also prevents codex from writing review files to /tmp.

thread/start params now pin `approvalPolicy: "never"` and
`sandbox: "workspaceWrite"`, matching the BashTool / Edit / Write
posture the claude backend ships with.

Codex's sandbox enum variants are kebab-case ('read-only',
'workspace-write', 'danger-full-access'). The previous commit pinned
'workspaceWrite' (camelCase) which the server rejected at
thread/start with 'unknown variant workspaceWrite'.
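One way to catch a casing mistake like this before the request reaches the server is to validate the sandbox value client-side. The sketch below is hypothetical: `SANDBOX_MODES` and `thread_start_params` are illustrative names, not Kai's code. Only the three kebab-case variants and the `approvalPolicy` / `sandbox` keys come from these commit messages.

```python
from typing import Any, Dict

# Kebab-case variants as listed in the commit message above.
SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")


def thread_start_params(sandbox: str = "workspace-write") -> Dict[str, Any]:
    """Build the pinned thread/start params, rejecting unknown sandbox variants.

    Hypothetical helper: fails locally instead of waiting for the server's
    'unknown variant' rejection.
    """
    if sandbox not in SANDBOX_MODES:
        raise ValueError(
            f"unknown sandbox variant {sandbox!r}; expected one of {SANDBOX_MODES}"
        )
    return {"approvalPolicy": "never", "sandbox": sandbox}
```

Passing `"workspaceWrite"` raises `ValueError` here, so a unit test would catch the camelCase slip without a live server.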

Two smoke-test findings from the same session:

1. The `workspace-write` sandbox in the previous commit disabled
   network access by default, breaking `gh`, `curl`, and anything
   else codex needs to reach outside the local filesystem. Switch
   to `danger-full-access` to match the claude backend's posture -
   claude --print runs unsandboxed with whatever permissions the
   bot's os_user has, and codex now has the same authority profile.

2. A single turn can emit MULTIPLE agentMessage items (e.g. a
   preamble agentMessage, a tool call, then a post-tool summary
   agentMessage). The previous parser concatenated deltas across
   items without a separator AND let item N's item/completed
   `text` field override the entire `accumulated` string -
   erasing prior committed items. The Telegram message displayed
   only the last item's text, overwriting earlier content.

   Rewrite the streaming state to track per-item: `committed_text`
   holds completed items joined with blank-line separators;
   `current_item_text` holds the in-flight item's delta sum;
   `current_item_id` correlates which item the deltas belong to.
   item/completed authoritatively sets the CURRENT item's text
   only, commits it to the prefix, and resets. The visible text
   is `committed + ("\n\n" + current if current)`. Defensive
   schema-drift handling: a delta arriving without a prior
   item/started treats the new itemId as opening a new item;
   turn/completed flushes any uncommitted current_item_text into
   the prefix before returning.

Two new tests in TestStreamParsing lock the multi-item join with
a blank-line separator and the per-item override scope.
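
The per-item streaming state described in finding 2 can be sketched roughly as follows. This is a simplified illustration, not the parser from this PR: the class and method names are hypothetical, events are reduced to direct method calls, and only agentMessage items are modelled. The field names `committed_text`, `current_item_text`, and `current_item_id` follow the commit message. One small deviation: the sketch omits the leading separator when nothing has been committed yet.

```python
from dataclasses import dataclass
from typing import Optional

SEPARATOR = "\n\n"  # blank line between completed items


@dataclass
class StreamState:
    committed_text: str = ""               # completed items, joined by SEPARATOR
    current_item_text: str = ""            # delta sum of the in-flight item
    current_item_id: Optional[str] = None  # which item the deltas belong to

    def visible_text(self) -> str:
        """Text to display: the committed prefix plus the in-flight item, if any."""
        if not self.current_item_text:
            return self.committed_text
        if not self.committed_text:
            return self.current_item_text
        return self.committed_text + SEPARATOR + self.current_item_text

    def _commit_current(self) -> None:
        # Fold the in-flight item into the prefix and reset per-item state.
        self.committed_text = self.visible_text()
        self.current_item_text = ""
        self.current_item_id = None

    def on_item_started(self, item_id: str) -> None:
        self._commit_current()
        self.current_item_id = item_id

    def on_delta(self, item_id: str, delta: str) -> None:
        if item_id != self.current_item_id:
            # Schema drift: a delta with no prior item/started opens a new item.
            self._commit_current()
            self.current_item_id = item_id
        self.current_item_text += delta

    def on_item_completed(self, item_id: str, text: Optional[str]) -> None:
        if item_id != self.current_item_id:
            self._commit_current()
            self.current_item_id = item_id
        if text is not None:
            # Authoritative text replaces the CURRENT item only, never the prefix.
            self.current_item_text = text
        self._commit_current()

    def on_turn_completed(self) -> str:
        # Flush any uncommitted in-flight text before returning.
        self._commit_current()
        return self.committed_text
```

Because `on_item_completed` only touches `current_item_text`, a later item's `text` can no longer erase earlier items, which is the per-item override scope the new tests lock in.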