Skip to content

feat(review): integrate codex-reviewer into stable review pipeline#356

Open
mvanhorn wants to merge 2 commits intoEveryInc:mainfrom
mvanhorn:osc/348-codex-reviewer-review-beta
Open

feat(review): integrate codex-reviewer into stable review pipeline#356
mvanhorn wants to merge 2 commits intoEveryInc:mainfrom
mvanhorn:osc/348-codex-reviewer-review-beta

Conversation

@mvanhorn
Copy link
Copy Markdown
Contributor

@mvanhorn mvanhorn commented Mar 24, 2026

Summary

Add codex-reviewer as a cross-cutting conditional persona in the stable ce:review pipeline. This agent delegates code review to OpenAI's Codex CLI for cross-model validation, catching blind spots that same-model reviewers share.

Retargeted from ce-review-beta to stable ce:review per @tmchow's feedback - beta was promoted to stable in #371.

Changes

  • plugins/compound-engineering/agents/review/codex-reviewer.md - New agent with environment guards (Codex sandbox detection at codex-reviewer.md:17, CLI availability check at codex-reviewer.md:32), structured findings output matching the findings schema, and conservative confidence calibration (0.65 baseline)
  • plugins/compound-engineering/skills/ce-review/SKILL.md - Register codex-reviewer in the cross-cutting conditional persona dispatch table (line 97)
  • plugins/compound-engineering/skills/ce-review/references/persona-catalog.md - Add codex entry to catalog (line 35)
  • tests/review-skill-contract.test.ts - 3 new tests: agent references stable skill, registered in catalog + SKILL.md, follows structured findings contract
  • plugins/compound-engineering/README.md - Add codex-reviewer to review agent table

Context

Testing

All 389 tests pass (bun test), including 3 new codex-reviewer contract tests verifying:

  1. Agent file references stable ce:review skill (no ce:review-beta remnants)
  2. Codex-reviewer registered in both persona-catalog.md and SKILL.md dispatch table
  3. Agent follows structured findings contract (confidence calibration, JSON output, suppression rules)

This contribution was developed with AI assistance (Claude Code).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3f96cf6279

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +51 to +55
Extract the base branch from the review context passed by ce:review-beta.

Fallback resolution order:
1. Base branch from PR metadata (if reviewing a PR)
2. Detect from remote HEAD:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Pass PR base branch into codex reviewer context

This step assumes ce:review-beta provides the base branch in review context, but the current subagent template only passes intent, file list, and diff (skills/ce-review-beta/references/subagent-template.md, lines 36-43). In PRs that target a non-default branch (for example release/backport PRs), this reviewer will fall back to origin/HEAD or main and run codex review against the wrong baseline, which can both miss real regressions and introduce unrelated findings.

Useful? React with 👍 / 👎.

1. Base branch from PR metadata (if reviewing a PR)
2. Detect from remote HEAD:
```bash
git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Replace chained shell recipe with single-step commands

This command uses a pipe and redirection (... 2>/dev/null | sed ...), which violates the repository's required agent/skill rule to keep shell-only examples as single simple commands with no chaining, pipes, or redirects (plugins/compound-engineering/AGENTS.md, lines 116-118). Leaving this pattern in a published reviewer prompt creates cross-platform execution drift and avoidable approval friction in sub-agent runs.

Useful? React with 👍 / 👎.

@tmchow tmchow force-pushed the feat/compare-review-skills branch from 76bf361 to 70e8729 Compare March 24, 2026 03:02
@tmchow
Copy link
Copy Markdown
Collaborator

tmchow commented Mar 24, 2026

#348 was merged to main, so rebase off main please

@mvanhorn mvanhorn force-pushed the osc/348-codex-reviewer-review-beta branch from 3f96cf6 to d7a5270 Compare March 24, 2026 17:39
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d7a5270386

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +2 to +3
".": "2.50.0",
"plugins/compound-engineering": "2.50.0",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Revert release-managed version bumps from feature commit

This feat(review) commit manually bumps release-owned versions to 2.50.0 (and adds release entries) even though release automation is the canonical source for versioning/changelogs in this repo. Pre-bumping these values here can desynchronize Release Please state and cause the next automated release PR to compute the wrong delta or skip expected notes, which is exactly what AGENTS.md warns against for routine feature PRs.

Useful? React with 👍 / 👎.

```

```
if [ -n "$BASE" ]; then echo "BASE:$BASE" && echo "FILES:" && git diff --name-only $BASE && echo "DIFF:" && git diff -U10 $BASE && echo "UNTRACKED:" && git ls-files --others --exclude-standard; else echo "ERROR: Unable to resolve PR base branch <base> locally. Fetch the base branch and rerun so the review scope stays aligned with the PR."; fi
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Split chained scope command into simple shell steps

This scope command packs many operations into one shell line using &&, which conflicts with the plugin rule in plugins/compound-engineering/AGENTS.md (tool-selection section) that shell examples must be single simple commands without chaining/pipes/redirection. Because this snippet is part of the published ce:review-beta workflow prompt, it will propagate avoidable execution drift and permission friction across agent platforms instead of the intended portable behavior.

Useful? React with 👍 / 👎.

@mvanhorn mvanhorn force-pushed the osc/348-codex-reviewer-review-beta branch from d7a5270 to 93ae222 Compare March 24, 2026 18:07
@mvanhorn
Copy link
Copy Markdown
Contributor Author

Rebased onto main. Squashed the branch to a single commit on top of current main since #348's merge created too many conflicts for a clean commit-by-commit rebase. Tests match main (same 42 pre-existing failures, no new ones).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 93ae22240d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +49 to +50
if (options.mappedTools && options.mappedTools.length > 0) {
return options.mappedTools
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fall back when tool mapping is only partially successful

In resolveStructuredAgentTools, any non-empty mappedTools list is accepted as final, even if some source tools were unmapped and silently dropped earlier. For example, a Claude agent with tools: ["Read", "Question"] becomes tools: ["read"] in Copilot/Kiro output with no warning, which can remove capabilities the agent depends on and cause runtime failures after conversion. This should treat partial mappings as lossy (warn + safe fallback) rather than narrowing permissions silently.

Useful? React with 👍 / 👎.

@tmchow
Copy link
Copy Markdown
Collaborator

tmchow commented Mar 25, 2026

ce:review-beta is promoted to stable now.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Noted - since review-beta is now promoted to stable, should I retarget this to integrate codex-reviewer into the stable review pipeline instead? Or close this if the integration approach has changed?

@tmchow
Copy link
Copy Markdown
Collaborator

tmchow commented Mar 25, 2026

Noted - since review-beta is now promoted to stable, should I retarget this to integrate codex-reviewer into the stable review pipeline instead? Or close this if the integration approach has changed?

Target the stable skill.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Got it - will retarget the codex-reviewer integration to the stable ce:review pipeline. I'll rebase onto main, move the agent and references from ce-review-beta/ into ce-review/, and update the orchestration to match the promoted pipeline structure.

mvanhorn added a commit to mvanhorn/compound-engineering-plugin that referenced this pull request Mar 25, 2026
Add codex-reviewer as a cross-cutting conditional persona in ce:review.
Retargeted from ce-review-beta (now promoted to stable) per maintainer
feedback on PR EveryInc#356.

- Add codex-reviewer.md agent with Codex CLI delegation and environment
  guards (sandbox detection, CLI availability check)
- Register codex in persona-catalog.md and SKILL.md dispatch table
- Add codex-reviewer contract tests to review-skill-contract.test.ts
- Update README.md agent table with codex-reviewer entry
@mvanhorn mvanhorn force-pushed the osc/348-codex-reviewer-review-beta branch from 93ae222 to ec8ef55 Compare March 25, 2026 23:50
@mvanhorn mvanhorn changed the title feat(review): integrate codex-reviewer into review-beta pipeline feat(review): integrate codex-reviewer into stable review pipeline Mar 25, 2026
@mvanhorn mvanhorn changed the base branch from feat/compare-review-skills to main March 25, 2026 23:50
@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

PR_BASE_REF=$(git rev-parse --verify "$PR_BASE_REMOTE_REF" 2>/dev/null || git rev-parse --verify <base> 2>/dev/null || true)

P1 Badge Avoid resolving PR base from unrelated local branches

The PR-mode scope command still falls back to git rev-parse --verify <base> when the actual PR base repository ref cannot be resolved. In fork/backport workflows, a local <base> branch often points at a different remote (for example origin/main), so this path can silently compute BASE against the wrong lineage and produce misleading review diffs instead of failing closed. That contradicts the later requirement to stop when the PR base cannot be resolved and can hide real regressions.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

mvanhorn and others added 2 commits March 26, 2026 18:53
Add codex-reviewer as a cross-cutting conditional persona in ce:review.
Retargeted from ce-review-beta (now promoted to stable) per maintainer
feedback on PR EveryInc#356.

- Add codex-reviewer.md agent with Codex CLI delegation and environment
  guards (sandbox detection, CLI availability check)
- Register codex in persona-catalog.md and SKILL.md dispatch table
- Add codex-reviewer contract tests to review-skill-contract.test.ts
- Update README.md agent table with codex-reviewer entry
- Remove error suppression (2>/dev/null) from `which codex` check
- Split chained `git symbolic-ref | sed` into separate steps per
  AGENTS.md shell rule (no chaining, no error suppression)
- Add fail-closed guard: do not fall back to git rev-parse against
  local branch names that may track a different remote
- Clarify that ce:review orchestrator provides base branch context

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@mvanhorn mvanhorn force-pushed the osc/348-codex-reviewer-review-beta branch from ec8ef55 to f524f46 Compare March 27, 2026 01:54
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f524f46d7f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Extract the base branch from the review context passed by ce:review. The orchestrator passes the base branch as part of the subagent dispatch context.

Resolution order (stop at the first success):
1. Base branch from the review context (ce:review always provides this for PR reviews)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Stop assuming ce:review passes base branch context

This instruction is currently incorrect and causes Codex to review against the wrong baseline for PRs targeting non-default branches. Fresh evidence in this commit: the stable subagent template still only passes Intent, Changed files, and Diff (plugins/compound-engineering/skills/ce-review/references/subagent-template.md, lines 37-44), so the "always provides" claim is false and the agent falls back to origin/HEAD/main, which can miss or invent regressions on release/backport PRs.

Useful? React with 👍 / 👎.

## What you don't flag

- **Style preferences** -- codex often has opinions on naming and formatting. Suppress these entirely.
- **Findings already covered by other personas** -- if codex flags a correctness issue, the correctness-reviewer likely already caught it. Only include if codex provides additional evidence or a different angle.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Remove impossible cross-persona suppression rule

At this point the codex reviewer is running as an independent parallel subagent and does not receive other reviewers' outputs, so telling it to suppress findings "already covered by other personas" forces guesswork and can silently drop valid issues. Since deduplication already happens centrally in Stage 5, this early suppression introduces false negatives without any reliable signal that a finding is truly redundant.

Useful? React with 👍 / 👎.

@mvanhorn
Copy link
Copy Markdown
Contributor Author

Rebased onto main and addressed the Codex P1 findings in f524f46:

  1. Removed 2>/dev/null error suppression from which codex check
  2. Split the chained git symbolic-ref | sed into separate steps per AGENTS.md shell rules
  3. Added a fail-closed guard for base branch resolution - the agent no longer falls back to git rev-parse --verify against local branch names that may track a different remote

The P2 about partial tool mapping in agent-content.ts is a general utility issue outside this PR's scope - happy to address that separately if you'd like.

@tmchow this is now targeting stable ce:review as requested. Ready for another look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants