feat(ce-code-review): add cross-model adversarial review pass by tmchow · Pull Request #1007 · EveryInc/compound-engineering-plugin

tmchow · 2026-06-26T01:05:02Z

Summary

ce-code-review's adversarial pass ran only as an in-process subagent — same model, same process as the orchestrator — so its "independent" findings shared the host model's blind spots. This adds a cross-model adversarial pass: when the adversarial reviewer is selected, the review also runs the same brief through a different model family in a separate read-only process and folds its findings into Stage 5, where agreement between the two becomes the strongest signal in the set. The PR also reworks how the review is presented and hardens host self-identification so the pass works across Claude Code, Codex, and Cursor.

The cross-model pass

Self-identifies the host at runtime (no build-time wiring) and picks a peer from a different model family:

Running in	Peer	Read-only via
Claude Code / Cursor	Codex (`codex exec`)	`-s read-only` hard sandbox
Codex	Claude (`claude -p`)	`dontAsk` + denied `Bash`/edit tools + embedded diff

The peer returns the same findings-schema.json shape every reviewer uses, enters Stage 5 as adversarial-<peer>, and a shared dedup fingerprint with the in-process adversarial persona promotes the finding (different model families, separate processes — the strongest corroboration available).

The invocation lives in a bundled script, scripts/cross-model-adversarial-review.sh, invoked via a SKILL_DIR anchor: the agent sets SKILL_DIR to the directory it read the SKILL.md from, then runs "$SKILL_DIR/scripts/…". The Bash tool's working directory is the user's project on Claude Code, Codex, and Cursor alike, and no harness exposes a skill-dir env var, so the model supplies the path — deterministic on every host. The script self-locates the persona + schema via BASH_SOURCE, composes the prompt from the canonical references/personas/adversarial-reviewer.md (single source — no duplication, no prompt passed by the orchestrator), and writes adversarial-<peer>.json.

Design decisions

Runtime self-identification, not build/convert-time — host markers (CLAUDECODE, CODEX_SANDBOX/CODEX_SESSION_ID/CODEX_THREAD_ID/CODEX_CI, CURSOR_AGENT) plus a process-ancestry fallback when env markers are stripped. An unrecognized host degrades to a silent skip — never a wrong peer.
Non-blocking — any gap (no peer, unauthed, timeout, unparseable output) skips the pass; the review never fails on it.
Bounded, orphan-safe peer — the peer runs under gtimeout/timeout (kills the whole process tree). Codex streams its reasoning, so it gets an idle-timeout (killed only when output stalls, letting a productive long run continue) with a hard backstop; Claude's single-shot output gets the hard cap.
Genuinely read-only — Codex is sandboxed; Claude (no sandbox) denies Bash outright so it can't mutate even under a broad user Bash(*) allow-rule, and reviews an embedded diff.
Robust I/O — the diff goes into the peer via stdin (no ARG_MAX ceiling on large diffs); codex's findings come back via -o (written by the CLI outside the sandbox, so it works under -s read-only) with a stdout-recovery fallback if that file ever doesn't materialize. Every peer return is validated against the full Stage 5 contract (reviewer + findings/residual_risks/testing_gaps arrays) and its reviewer name normalized to adversarial-<peer> before fold-in.
Launched in parallel with the persona reviewers (a background shell-out, not a subagent), so its runtime overlaps theirs.

Presentation rework

Replaced Stage 6's rigid "use this exact pipe table or it's an instant fail" with action-shaped direction: make four things clear per finding (what/why/response/confidence), group by unit-of-work or decision (decisions a human must make vs mechanical work), spend words only on what the diff can't show, and keep a self-sufficient closing because in long terminal output the bottom is the most-read screen. The team announce drops internal [session model]/[mid-tier] plumbing (model-tier selection is preserved as an internal correctness invariant) and surfaces the cross-model peer prominently. Hard constraints kept: ASCII-only / no box-drawing, stable numbering, verdict + actionable list last.

Authoring guidance (AGENTS.md)

Recast the bundled-script path-resolution guidance into a three-tier model — read-time references (relative, no anchor), prose pointers the agent acts on (relative + a "from this skill's directory" cue), and executed shell (the SKILL_DIR anchor as the conservative default; bare relative is agent-resolved and often works but isn't deterministic). Reframed ${CLAUDE_SKILL_DIR} as a footgun in this cross-agent plugin (it silently misses on every non-Claude install).

Validation

Cross-host, end-to-end: all three directions exercised on a live diff, read-only, zero edits to the target — Claude→Codex, Codex→Claude, Cursor→Codex.
Path resolution settled empirically across Claude Code, Codex, and Cursor (project CWD on all three; the SKILL_DIR anchor runs the bundled script on all three).
skill-creator eval: a fresh agent ran the full skill from disk on a planted-bug repo — set SKILL_DIR, launched the script in the background, codex ran read-only, findings folded into Stage 5, and the cross-user export IDOR was promoted to P0 via in-process adversarial + adversarial-codex agreement.
Codex PR-review feedback resolved across rounds: peer-process bounding + idle-timeout, reviewer-name normalization, Claude read-only hardening, full Stage 5 contract validation, claude-via-stdin (ARG_MAX), codex stdout-recovery, schema/auth.
bun test 1613 / 0; release:validate in sync.

Behavior is skill prose plus a bundled script, not a runnable UI — validation is the cross-host runs, the eval, and the test suite above.

Run the adversarial review through a different model family than the host, in a separate process, for genuine independence the in-process subagent cannot provide. Self-identifies the host at runtime (Claude, Codex, or Cursor) and shells out to the peer CLI read-only -- Codex when host is Claude/Cursor, Claude when host is Codex. The peer returns the same findings-schema.json shape and folds into Stage 5 as reviewer `adversarial-<peer>`, so agreement with the in-process adversarial persona promotes the finding (different model families, separate processes -- the strongest signal in the set). Gated on adversarial-reviewer selection plus local-aligned/standalone scope, and non-blocking: any host-detect, auth, timeout, or parse gap degrades to a silent skip and never fails the review.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d59e24c9e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Codex exposes different session markers per surface: the CLI sets CODEX_SANDBOX/CODEX_SESSION_ID, but web/API/CI runs set only CODEX_THREAD_ID/CODEX_CI. Host detection checked only the former, so on those surfaces XHOST fell through to `unknown` and the cross-model pass silently skipped even with claude installed and authed. Check the union. Addresses Codex review feedback on #1007.

No env var Codex sets is guaranteed across every surface, and shell_environment_policy / cloud / IDE inheritance can strip markers from subprocesses -- so the env-marker tier (now Tier A) will always be incomplete and a miss left the cross-model pass silently skipping. Add Tier B: when Tier A yields `unknown`, walk the process ancestry (basename only, via `ps -o comm=`) for codex/claude/cursor-agent before giving up. The launching CLI sits in the process tree even when its env is stripped. Verified the walk resolves the host on macOS. A miss still degrades to a silent skip, never a wrong peer.

The team announce leaked internal plumbing to the user (model-tier labels, scope-mode codenames, disk-staging and dispatch narration) and buried the cross-model pass in a generic unnamed footer. Add a "report outcomes, not machinery" operating principle, derive each reviewer's model tier from the fixed persona rule at dispatch (so the tier no longer has to be printed in the announce), and require the cross-model pass to surface as its own prominent line that names the peer model. Guidance is principle-based with falsifiable requirements, not scripted wording.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ebc73660c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…l pass Presentation: replace Stage 6's rigid "use this exact pipe table or it's an instant fail" with principle-based direction that shapes output to the reader's next action — per-finding clarity (what/why/response/confidence), shape serves the finding type, group by unit of work/decision (apply-queue vs decision-gate), economy of expression not coverage (don't paste files), and a self-sufficient closing because in long output the bottom is the most-read screen. Keep only the hard constraints (ASCII / no box-drawing / stable numbering / verdict last). Template reframed to a canonical section skeleton with one example rendering; contract test retargeted from the table shape to the direction + hard constraints. Cross-model pass: launch the peer in parallel (background) with the persona reviewers instead of after them, and drop --output-schema on the Codex path (its strict mode rejects the permissive findings-schema, wasting a call) — rely on -o capture + the JSON-demanding prompt + Stage 5 malformed-drop.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bef32fdfa6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ness invariant The earlier presentation cleanup removed the [session model]/[mid-tier] labels from the user-facing announce, but those labels did two jobs: visibility AND serving as the external-memory record that the right model gets applied at dispatch (on a top-tier session, a missed override silently runs a reviewer at the expensive tier). Removing the display is correct; losing the discipline is not. Reinstate the tier decision as an explicit internal working list applied on every dispatch call, framed as a correctness guarantee with its rationale — display removed, discipline kept.

…SKILL_DIR anchor Empirical 3-host test (Claude Code, Codex, Cursor) showed all three run a skill's bundled-script command with pwd = the project, not the skill dir, so bare-relative fails everywhere (not just Claude), and no *_SKILL_DIR env var exists on any host. The portable, verified pattern is the model-filled SKILL_DIR anchor (set SKILL_DIR to the dir the agent read SKILL.md from, run "$SKILL_DIR/scripts/..."), as used in production by last30days. Reframe ${CLAUDE_SKILL_DIR} as a Claude-only alternative and replace the "avoid scripts for portability" guidance with "anchor them instead."

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4fcffb66e4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ored script The cross-model adversarial pass's invocation (prompt composition, read-only flags, output capture, timeouts) moves from inline reference prose into scripts/cross-model-adversarial-review.sh, invoked via the SKILL_DIR-from-read- path anchor (portable across Claude/Codex/Cursor, empirically verified; bare relative and ${CLAUDE_SKILL_DIR} both fail or are non-portable). The script reads the canonical adversarial-reviewer persona + findings-schema itself (single source, no duplication, no prompt passed by the orchestrator), adds a per-peer timeout (idle for streaming codex, hard cap for single-shot claude), self-locates via BASH_SOURCE, and is non-blocking. The reference shrinks to: detect host -> gate -> announce -> run script -> fold in. e2e-verified: the script ran codex read-only on a real diff and produced 3 schema-shaped findings.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc4bfe4ee8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…sallowed-tools Addresses Codex review feedback on #1007: - Bound the peer process itself, not just the wrapper shell: run codex/claude in the foreground under gtimeout/timeout (kills the whole process tree on expiry), replacing the hand-rolled background+kill that leaked orphans on macOS (no setsid) and could write adversarial-<peer>.json after Stage 5 had already skipped it. Hard cap 300s via CROSS_MODEL_HARD_SECS. - Pass --disallowedTools as separate variadic args (Edit Write NotebookEdit) rather than one quoted string, which is ambiguous since tool names can contain spaces. (Two other threads — Claude API-key auth and the Codex findings schema — were already resolved by the earlier script refactor: there is no claude-auth-status preflight anymore, and the full schema is embedded in the codex prompt, verified to produce complete schema-shaped findings.)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1a58591463

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

The previous fix for the orphan-process bug over-corrected to a plain wall-clock cap, dropping the idle-timeout the cross-model "second opinion" pattern calls for. Restore it correctly: codex exec streams its reasoning, so watch that stream and kill only when it stalls for CROSS_MODEL_IDLE_SECS (default 120s) — a productive long run continues — with a hard backstop CROSS_MODEL_HARD_SECS (default 600s). Orphan-safety is kept because the peer runs under gtimeout/timeout, which forwards an external kill to the whole process tree (verified on this host), so the idle watchdog signals gtimeout rather than an un-propagated subshell. Claude stays hard-cap only (its --output-format json output is single-shot, nothing to watch).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cdc3ceda56

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…de peer to read-only Addresses two Codex review threads on #1007: - Force reviewer = adversarial-<peer> after capture. The persona's example JSON uses reviewer:"adversarial"; if the peer echoed that, Stage 5 would fold it as the in-process adversarial reviewer and lose the cross-model agreement signal. - Make the Claude peer genuinely read-only: deny Bash entirely (deny overrides a user's broad Bash(*) allow-rule; dontAsk + edit-tool denial alone did not) and embed the diff in the prompt so Claude needs no shell. Codex is unchanged -- it fetches its own diff inside the -s read-only hard sandbox.

- Bare-relative fails LOUDLY (exit 127) with nondeterministic model recovery, not a silent skip; frame the SKILL_DIR anchor's value as determinism. - The genuine silent skip is the guarded ${CLAUDE_SKILL_DIR} form off-Claude (unset var -> then-branch never fires); fold the native-Codex/converter rationale into that narrower point and make the anchor the stated default. - Note shell state doesn't persist across Bash-tool calls, so SKILL_DIR must be set inline per invocation, and a script needing its own dir uses BASH_SOURCE (ref impl: cross-model-adversarial-review.sh). - Permission caveat: the anchor's dynamic absolute path won't match a static Bash pin, so expect a one-time prompt per command; the pin trick is for the fixed ${CLAUDE_SKILL_DIR} form.

…agent plugin Every skill here is authored once and installed across Claude/Codex/Cursor/ Gemini, so ${CLAUDE_SKILL_DIR} (a Claude-only content substitution, empty elsewhere) silently misses on every non-Claude install. Reframe it from a neutral alternative to "avoid": use the model-filled SKILL_DIR anchor for all bundled-script invocations; reach for ${CLAUDE_SKILL_DIR} only for genuinely Claude-only behavior that never runs elsewhere (essentially never here), and treat any new use as a smell. Existing guarded uses (ce-compound) should migrate.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77dda0247c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…pt via stdin Addresses two Codex review threads on #1007: - Validate the peer return against the full Stage 5 top-level contract before fold-in (reviewer string + findings/residual_risks/testing_gaps arrays), and backfill the two soft arrays / drop non-array findings. Previously only `.findings` was checked, so a return Stage 5 would later drop could pass here. - Pass the claude peer prompt via stdin instead of a `$(cat …)` argv. Now that the diff is embedded in the prompt, a large diff could exceed ARG_MAX and make claude fail silently on exactly the big reviews that need it. (claude -p reads the prompt from stdin; verified.)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f5e6a57385

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…ction skeleton Addresses Codex review on #1007: the hard constraint "Verdict and Actionable list are present, last" appeared to contradict the Stage 6 skeleton, which places the Actionable Findings section at position 5 (before Coverage/Verdict). Clarify that the constraint is satisfied by the closing, not the skeleton: the Verdict is the final report section, immediately followed by the post-report Actionable recap ("Emit actionable findings summary"), which is the self- sufficient last word; the in-report Actionable Findings section keeps position 5 as the detailed table.

… model Independent research surfaced that "bare relative does not work / fails loudly" overstated it: relative paths in executed bash are agent-resolved (the agent translates them against the skill dir it loaded), so they often work — they're just non-deterministic. Recast the section into three tiers: - Tier 1 read-time refs: relative, no anchor (unchanged). - Tier 2 prose pointers the agent acts on: relative + a "from this skill's directory" cue. - Tier 3 executed shell: the SKILL_DIR anchor as the conservative house default — bare relative can work via agent resolution but isn't deterministic; the failure mode is a fenced block copied verbatim into a Bash call. Scope the CLAUDE_SKILL_DIR replacement to tier-3 executed-shell (was "all invocations"); keep the footgun reasoning, the two SKILL_DIR constraints, and the permission caveat. Simplify the example to the minimal inline form.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b25034e62

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

… unavailable codex exec -o writes the artifact at the CLI level, outside the model's sandbox, so it succeeds under -s read-only (verified). But to be robust to any sandbox/ config where it doesn't materialize, fall back to recovering the same JSON from codex's captured stdout (which we already collect for the idle-monitor) via a balanced-brace scan for the last object with a findings key. No data lost on the read-only path. Also capture stdout in the no-(g)timeout branch so the fallback works there too.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc5efa4a46

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…liveness Addresses Codex review thread "Kill the Codex peer, not only the timeout wrapper": killing the (g)timeout wrapper on idle wasn't guaranteed to reap codex (-k only escalates on gtimeout's own expiry, not an external kill, so a peer deferring SIGTERM could survive wait and write $OUT after a skip). Codex now runs in its own process group (set -m); the watchdog owns both the idle-timeout and the hard cap and reaps the whole group (TERM then KILL after a grace). A second-opinion codex review then found that the grace loop checked only the leader pid — a leader that exits while a child defers TERM would skip the group KILL — so the loop now checks group liveness (kill -0 on the negative pgid). Verified against that exact repro (leader exits on TERM, child ignores it -> group reaped, 0 survivors). claude keeps the gtimeout wrapper (single-shot; its own -k escalates correctly).

…g can't misfire Addresses Codex review thread "Avoid treating hidden Codex reasoning as idle": under a user config with hide_agent_reasoning = true, codex exec can stop writing to the stream during a long reasoning phase, so the idle watchdog could reap a still-working peer on big diffs. Force reasoning on for our subprocess (-c hide_agent_reasoning=false) so the stream stays a reliable liveness signal regardless of the user's global config, and raise the idle default 120s -> 180s for extra headroom. The hard cap (600s) remains the backstop.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef3a0a1b99

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…de peer Addresses Codex review thread "Deny MCP tools for the Claude peer": under --permission-mode dontAsk, a user's pre-approved MCP tools (Linear/Slack/deploy writers) would run without prompting, so the peer wasn't actually read-only despite the deny list covering only built-in edit/shell tools. Add `mcp__*` (the documented way to remove MCP tools) and `Task` (a spawned subagent would bypass the deny list). Also drop `MultiEdit`, which `claude` reports as "matches no known tool" (a no-op that only emitted a warning). Verified the new list parses with no warnings.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dcc6933185

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-26T18:05:24Z

+    # read-only even with MCP servers configured. claude can't write a file under those
+    # perms, so it emits the JSON envelope on stdout (captured to PEERLOG); we extract it.
+    if [ -n "$TO_BIN" ]; then
+      "$TO_BIN" -k 10 "$HARD_SECS" claude -p --model opus --permission-mode dontAsk \


Disable Claude hooks for read-only peer

When this runs on a Codex host with user or project Claude hooks configured, this claude -p invocation still loads those hooks; --disallowedTools only removes model-visible tools. I checked the Claude CLI reference, which says --bare skips hooks/MCP/CLAUDE.md, and the hooks reference, which says hooks execute automatically at lifecycle points, so SessionStart/UserPromptSubmit/Stop hook shell or HTTP handlers can still mutate the checkout or trigger side effects despite this pass being advertised as read-only. Add --bare/safe-mode or otherwise disable hook loading for the peer invocation.

Useful? React with 👍 / 👎.

…anded #1007 merged the three-tier AGENTS.md rewrite to main, so the doc no longer supersedes AGENTS.md — it is the rationale behind it. Replace the 'Repo migration status' note (which assumed the old guard guidance) with 'Where this is codified', and restore the AGENTS.md reference to state the two agree. Makes the Codex P2 on #1012 moot.

jarodtaylor · 2026-06-27T17:03:37Z

🔥