From e2a3d9751de9161f04e06b907648dcc4bbd5795d Mon Sep 17 00:00:00 2001
From: chaodu-agent
Date: Sun, 3 May 2026 18:40:47 +0000
Subject: [PATCH] docs(codex): add Approval Policy & Auto-review section

---
 docs/codex.md | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)

diff --git a/docs/codex.md b/docs/codex.md
index 68f48c78..50d1b693 100644
--- a/docs/codex.md
+++ b/docs/codex.md
@@ -187,6 +187,55 @@ Example user prompt after creating such a skill:
 Use $discord-imagegen-deliver to generate a warm hand-painted sky with birds and send it back to this Discord thread.
 ```
 
+## Approval Policy & Auto-review
+
+Codex offers three approval modes that control what happens when the agent
+tries to act outside the sandbox (network calls, running scripts, etc.):
+
+| Mode | Behaviour | Best for |
+|------|-----------|----------|
+| **Manual** (`approval_policy = "on-request"`) | Every out-of-sandbox action waits for a human to approve | Interactive, attended sessions |
+| **Auto-review** (`approval_policy = "auto-review"`) | A separate reviewer agent (GPT-5.4 Thinking) approves or denies automatically | **OpenAB / unattended agents** |
+| **Full Access** (`approval_policy = "full-access"`) | No sandbox enforcement at all | Trusted, isolated environments only |
+
+For OpenAB deployments, **Auto-review is the recommended mode**. OpenAB agents
+run as long-lived background processes with no human watching the terminal, so
+manual approval is impractical and Full Access removes all guardrails.
+
+Enable Auto-review in `/home/node/.codex/config.toml`:
+
+```toml
+[sandbox]
+approval_policy = "auto-review"
+```
+
+Or pass it at install time via Helm:
+
+```bash
+helm install openab openab/openab \
+  --set agents.codex.discord.enabled=true \
+  --set-json 'agents.codex.extraConfig={"sandbox":{"approval_policy":"auto-review"}}'
+  # ... other flags ...
+```
+
+### What Auto-review does
+
+- Approves ~99% of legitimate out-of-sandbox actions automatically.
+- Blocks actions that could exfiltrate data, expose secrets, delete data, or
+  weaken security settings.
+- When it rejects an action, it gives the agent a rationale so Codex can find a
+  safer alternative (succeeds >50% of the time without human input).
+- Stops the trajectory after repeated denials to prevent gaming.
+
+### Limitations
+
+Auto-review is **not** a security guarantee. It can be misled by adversarial
+inputs and cannot detect a model that hides malicious intent within the sandbox.
+Treat it as a strong default, not a replacement for network-level controls and
+secret management.
+
+For more details, see the [OpenAI Alignment blog post on Auto-review](https://alignment.openai.com/auto-review).
+
 ## Troubleshooting
 
 ### `bwrap: No permissions to create a new namespace`