From 92ac8d8bae80dd3decf7ea7dfcbb0eb7ddbf7d94 Mon Sep 17 00:00:00 2001
From: Lien Chen <screen.leon@gmail.com>
Date: Sat, 2 May 2026 10:02:21 +0900
Subject: [PATCH 1/4] chore: claude-config tier-1 cleanup + agent lint
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Address deferred items from PR #1 critic/architecture review and
add a regression guard:

- commands/pr-gate.md: detect integration branch via
  `git symbolic-ref refs/remotes/origin/HEAD` instead of hardcoding
  `main`; mark Agent(...) block as illustrative pseudocode.
- commands/pm.md: drop "or spawn reviewers X/Y/Z" half — that's
  /pr-gate territory. Move full Bash(codex exec) fallback recipe
  here so PM prompt doesn't carry main-thread implementation
  details.
- agents/project-pm.md: drop the Bash(codex exec) fallback line
  from PM prompt — main-thread fallback paths don't belong here.
- README.md: refresh project-pm description; add "Subagents
  cannot spawn subagents" as the lead Design note with link to
  Agent SDK docs.
- scripts/lint-agents.sh: regression guard — fail if any subagent
  declares Agent in frontmatter tools. Wired into install.sh
  pre-flight.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md              |  3 ++-
 agents/project-pm.md   |  2 +-
 commands/pm.md         |  4 +++-
 commands/pr-gate.md    | 22 +++++++++++++++-------
 install.sh             |  7 +++++++
 scripts/lint-agents.sh | 37 +++++++++++++++++++++++++++++++++++++
 6 files changed, 65 insertions(+), 10 deletions(-)
 create mode 100755 scripts/lint-agents.sh
diff --git a/README.md b/README.md
index 3b75aff..60c4f0a 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 ### Agents
 
 **Orchestration**
-- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, dispatches to `codex-executor`, runs the PR gate, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_<repo>.md`.
+- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, writes briefs (main thread dispatches), synthesizes PR-gate reviews, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_<repo>.md`.
 - **codex-executor** — Thin wrapper subagent. Accepts a complete brief, dispatches to the Codex CLI via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back.
 
 **Reviewers (advisors — PM may override with reasoning)**
@@ -53,6 +53,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 
 ## Design notes
 
+- **Subagents cannot spawn subagents.** Claude Code intentionally restricts nested `Agent` tool calls regardless of frontmatter declaration ([Agent SDK docs](https://code.claude.com/docs/en/agent-sdk/subagents.md)). The **main thread** orchestrates: it spawns subagents (PM, reviewers, codex-executor) and relays outputs between them. PM produces briefs and synthesizes verdicts; it does not dispatch. Reviewers run in parallel from the main thread, not from PM. Never include `Agent` in any subagent's `tools:` frontmatter — `scripts/lint-agents.sh` enforces this.
 - **PM thinks, Codex implements.** `project-pm` writes the brief; `codex-executor` is a dispatcher, not a designer. Architecture, scope, and acceptance criteria stay with the PM.
 - **Definitions in repo, state on disk.** Agent and command definitions are version-controlled here. Per-project state (memory, traces) lives in `~/.claude/` and stays out of this repo.
 - **Decoupled from agent-playbook-template.** The playbook is a methodology framework; this repo is a personal config. They evolve independently.
diff --git a/agents/project-pm.md b/agents/project-pm.md
index 33a072d..7d57dbc 100644
--- a/agents/project-pm.md
+++ b/agents/project-pm.md
@@ -76,7 +76,7 @@ Required: **working dir** (abs path), **goal** (one sentence), **files to touch*
 Example:
 > In `~/github/foo/`, `src/auth/Login.tsx` drops the redirect param after OAuth callback — `/auth/callback?next=/dashboard` lands on `/`. Fix redirect handling. Existing tests in `src/auth/__tests__/` must still pass; add a test for the redirect case. Sandbox: workspace-write.
 
-Return the brief to the main thread; main thread dispatches via `Agent(subagent_type: "codex-executor", ...)` (or directly via `Bash(codex exec ...)` if `codex-executor` is unavailable). Verify the resulting report against `git diff` before claiming success.
+Return the brief to the main thread; main thread dispatches it. Verify the resulting report against `git diff` before claiming success.
 
 # Per-project memory shape
 
diff --git a/commands/pm.md b/commands/pm.md
index 5a1cc18..eedd0c5 100644
--- a/commands/pm.md
+++ b/commands/pm.md
@@ -7,4 +7,6 @@ Invoke `project-pm` via Agent. Brief with: request ($ARGUMENTS), current working
 
 Relay the PM's user-facing summary. Do not do the PM's job yourself.
 
-**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor" or "spawn reviewers X / Y / Z", the **main thread** must do the dispatching. Treat the PM's brief as the input to your own `Agent(subagent_type: "codex-executor", ...)` call (or `Bash(codex exec ...)` if `codex-executor` itself is unavailable in your environment).
+**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to `Bash(codex exec --sandbox workspace-write --skip-git-repo-check -C <cwd> -)` with the brief on stdin.
+
+For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here.
diff --git a/commands/pr-gate.md b/commands/pr-gate.md
index 15dfba6..ea550b8 100644
--- a/commands/pr-gate.md
+++ b/commands/pr-gate.md
@@ -7,7 +7,14 @@ Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main
 
 ## Step 1 — classify the diff
 
-In the main thread, run `git diff main...HEAD --stat` (substitute the actual integration branch). Decide:
+In the main thread, detect the integration branch and stat the diff:
+
+```bash
+BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@' || echo main)
+git diff "$BASE"...HEAD --stat
+```
+
+Decide:
 
 - **docs / content-only** (no runtime code change) → spawn `critic` + `architecture-reviewer` only. Security / risk / qa are `pass-not-applicable`.
 - **implementation change** (any runtime code diff) → spawn all four advisors + qa-tester.
@@ -16,14 +23,15 @@ If unsure, invoke `project-pm` first with the diff stat to get the classificatio
 
 ## Step 2 — spawn reviewers in parallel from main thread
 
-In a single message, make multiple Agent calls (this is the parallel pattern):
+In a single message, make N parallel Agent tool calls — one per applicable reviewer. Pseudocode (illustrative, not literal call syntax):
 
 ```
-Agent(subagent_type: "critic", prompt: <branch + diff context + $ARGUMENTS>)
-Agent(subagent_type: "architecture-reviewer", prompt: <same>)
-Agent(subagent_type: "security-reviewer", prompt: <same>)   # implementation only
-Agent(subagent_type: "risk-reviewer", prompt: <same>)       # implementation only
-Agent(subagent_type: "qa-tester", prompt: <same>)           # implementation only
+# pseudocode — emit each as a real Agent tool call in one message
+Agent(subagent_type: "critic", ...)
+Agent(subagent_type: "architecture-reviewer", ...)
+Agent(subagent_type: "security-reviewer", ...)   # implementation only
+Agent(subagent_type: "risk-reviewer", ...)       # implementation only
+Agent(subagent_type: "qa-tester", ...)           # implementation only
 ```
 
 Each reviewer brief should include: working dir, branch name vs integration branch, diff summary, scope hints from $ARGUMENTS.
diff --git a/install.sh b/install.sh
index b259ace..7c8e12b 100755
--- a/install.sh
+++ b/install.sh
@@ -78,6 +78,13 @@ echo "  claude home: $CLAUDE_HOME"
 if [[ "$DRY_RUN" -eq 1 ]]; then echo "  mode:        DRY RUN"; fi
 echo
 
+# Pre-flight: agent frontmatter must not declare Agent (subagents can't spawn subagents)
+if [[ -x "$REPO_ROOT/scripts/lint-agents.sh" ]]; then
+  echo "==> lint agents"
+  "$REPO_ROOT/scripts/lint-agents.sh"
+  echo
+fi
+
 install_dir agents
 install_dir skills
 install_dir commands
diff --git a/scripts/lint-agents.sh b/scripts/lint-agents.sh
new file mode 100755
index 0000000..0962777
--- /dev/null
+++ b/scripts/lint-agents.sh
@@ -0,0 +1,37 @@
+#!/usr/bin/env bash
+# Validate that no subagent declares the `Agent` tool in its frontmatter.
+# Claude Code strips Agent from subagent runtime schemas regardless of
+# declaration, so leaving it in frontmatter is misleading drift.
+# See README.md "Design notes" for the rule.
+
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "$0")/.." && pwd)"
+agents_dir="$repo_root/agents"
+
+if [ ! -d "$agents_dir" ]; then
+  echo "lint-agents: $agents_dir not found" >&2
+  exit 2
+fi
+
+violations=0
+for f in "$agents_dir"/*.md; do
+  [ -e "$f" ] || continue
+  # Extract content between the first two `---` lines (YAML frontmatter)
+  fm=$(awk '/^---$/{c++; next} c==1' "$f")
+  tools_line=$(printf '%s\n' "$fm" | grep -E '^tools:' || true)
+  if [ -z "$tools_line" ]; then
+    continue
+  fi
+  if printf '%s' "$tools_line" | grep -qE '(^|[, ])Agent([, ]|$)'; then
+    echo "FAIL: $(basename "$f") declares Agent in tools: $tools_line" >&2
+    violations=$((violations + 1))
+  fi
+done
+
+if [ "$violations" -gt 0 ]; then
+  echo "lint-agents: $violations violation(s). Subagents cannot spawn subagents — remove Agent from tools." >&2
+  exit 1
+fi
+
+echo "lint-agents: OK ($(ls "$agents_dir"/*.md 2>/dev/null | wc -l) agent files checked)"

From f636164596d36658dfce71be53d25844fe70d372 Mon Sep 17 00:00:00 2001
From: Lien Chen <screen.leon@gmail.com>
Date: Sat, 2 May 2026 10:10:36 +0900
Subject: [PATCH 2/4] fix(lint): handle YAML block-list/flow forms; CI
 workflow; reduce duplication
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Address review findings on prior tier-1 cleanup commit:

- scripts/lint-agents.sh: previously only matched the inline scalar
  form `tools: Read, Bash, Agent`. The YAML block-list form
      tools:
        - Read
        - Agent
  was silently passed — exact escape hatch the script is meant
  to prevent. Now extracts the full tools: block (handles inline,
  flow `[a, b, c]`, and block-list) and matches Agent as a whole
  token. Also warn on files lacking YAML frontmatter rather than
  treating them as compliant. Verified against fixtures covering
  inline / flow / block-list / solo Agent / "MyAgent" suffix /
  "AgentX" prefix / no-frontmatter cases.

- .github/workflows/lint.yml: run scripts/lint-agents.sh on push
  and PRs. install.sh pre-flight only catches drift on machines
  that re-run install; CI catches it on every contribution
  regardless of whether the contributor symlinks. Defense in depth.

- commands/pm.md: codex fallback now points at scripts/codex-dispatch.sh
  (single source of truth for sandbox / approval / trace flags)
  instead of duplicating the raw codex exec invocation. Avoids
  drift the next time codex-dispatch.sh changes.

- README.md: codex-executor description now explicitly says
  "dispatched by the main thread" — matches the new project-pm
  description and the Design notes lead bullet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 .github/workflows/lint.yml | 14 ++++++++++++++
 README.md                  |  2 +-
 commands/pm.md             |  2 +-
 scripts/lint-agents.sh     | 32 +++++++++++++++++++++++++++-----
 4 files changed, 43 insertions(+), 7 deletions(-)
 create mode 100644 .github/workflows/lint.yml

diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
new file mode 100644
index 0000000..6a97106
--- /dev/null
+++ b/.github/workflows/lint.yml
@@ -0,0 +1,14 @@
+name: lint
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+
+jobs:
+  lint-agents:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Validate agent frontmatter
+        run: ./scripts/lint-agents.sh
diff --git a/README.md b/README.md
index 60c4f0a..5830681 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 
 **Orchestration**
 - **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, writes briefs (main thread dispatches), synthesizes PR-gate reviews, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_<repo>.md`.
-- **codex-executor** — Thin wrapper subagent. Accepts a complete brief, dispatches to the Codex CLI via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back.
+- **codex-executor** — Thin wrapper subagent, dispatched by the main thread. Accepts a complete brief, calls Codex via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back.
 
 **Reviewers (advisors — PM may override with reasoning)**
 - **critic** — Adversarial review of plan / diff. Scope creep, incompleteness, convention drift.
diff --git a/commands/pm.md b/commands/pm.md
index eedd0c5..d814aca 100644
--- a/commands/pm.md
+++ b/commands/pm.md
@@ -7,6 +7,6 @@ Invoke `project-pm` via Agent. Brief with: request ($ARGUMENTS), current working
 
 Relay the PM's user-facing summary. Do not do the PM's job yourself.
 
-**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to `Bash(codex exec --sandbox workspace-write --skip-git-repo-check -C <cwd> -)` with the brief on stdin.
+**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to invoking `scripts/codex-dispatch.sh` directly via Bash — the script encodes the canonical sandbox / approval / trace-capture flags so this command stays in sync with the rest of the dispatch path.
 
 For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here.
diff --git a/scripts/lint-agents.sh b/scripts/lint-agents.sh
index 0962777..3b4d8fe 100755
--- a/scripts/lint-agents.sh
+++ b/scripts/lint-agents.sh
@@ -17,14 +17,36 @@ fi
 violations=0
 for f in "$agents_dir"/*.md; do
   [ -e "$f" ] || continue
-  # Extract content between the first two `---` lines (YAML frontmatter)
+
+  # Require well-formed frontmatter (>=2 `---` markers)
+  fence_count=$(grep -c '^---$' "$f" || true)
+  if [ "$fence_count" -lt 2 ]; then
+    echo "WARN: $(basename "$f") has no YAML frontmatter; skipping" >&2
+    continue
+  fi
+
+  # Extract content between the first two `---` lines
   fm=$(awk '/^---$/{c++; next} c==1' "$f")
-  tools_line=$(printf '%s\n' "$fm" | grep -E '^tools:' || true)
-  if [ -z "$tools_line" ]; then
+
+  # Capture the tools: block — the inline scalar form OR a block-list
+  # spanning indented `- ` lines until the next non-indented key.
+  tools_block=$(printf '%s\n' "$fm" | awk '
+    /^tools:/ { print; in_block=1; next }
+    in_block {
+      if ($0 ~ /^[[:space:]]/) { print; next }
+      else { in_block=0 }
+    }
+  ')
+
+  if [ -z "$tools_block" ]; then
     continue
   fi
-  if printf '%s' "$tools_line" | grep -qE '(^|[, ])Agent([, ]|$)'; then
-    echo "FAIL: $(basename "$f") declares Agent in tools: $tools_line" >&2
+
+  # Match Agent as a whole token. Separators in YAML scalar/flow:
+  # `,` ` ` `[` `]` `:` and start/end of line. In block-list, `-` precedes.
+  if printf '%s' "$tools_block" | grep -qE '(^|[][, :-])Agent([][, :]|$)'; then
+    echo "FAIL: $(basename "$f") declares Agent in tools" >&2
+    printf '%s\n' "$tools_block" | sed 's/^/  /' >&2
     violations=$((violations + 1))
   fi
 done

From 3a5a07343244dd39f69b345c3f9b9fd4bee00913 Mon Sep 17 00:00:00 2001
From: Lien Chen <screen.leon@gmail.com>
Date: Sat, 2 May 2026 10:18:46 +0900
Subject: [PATCH 3/4] docs(codex): canonical brief schema + executor
 enforcement
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Briefs dispatched to codex-executor have been written ad-hoc each
time — required fields varied, self-verify patterns were copied
from memory rather than referenced, and a vague brief would still
get dispatched (executor's only validation was "working dir is
missing"). This drives composition cost up and quality variance.

Establish a canonical schema and codify the reusable patterns:

- docs/codex-brief.md (new): four required fields (working_dir,
  goal, files, acceptance) and five reusable self-verify macros
  extracted from real briefs run this session: cross-source,
  sample-N OK re-check, git-status no-collateral-damage,
  dedup-across-N, schema-match. Includes one good example brief
  and one bad example for contrast.

- agents/codex-executor.md: enforce the schema. Reject briefs
  missing any required field rather than improvising. Link to
  the schema doc.

- agents/project-pm.md: PM's brief-writing section now points at
  the schema doc and macros, instead of reciting an inline
  example. -3 lines.

- commands/pm.md: note that briefs must follow the schema.

- README.md: new "Codex briefs" section linking to the schema;
  "Adding new pieces" reminds contributors not to put Agent in
  subagent tools (lint-agents.sh enforces).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 README.md                |  6 ++-
 agents/codex-executor.md |  7 ++-
 agents/project-pm.md     |  5 +--
 commands/pm.md           |  2 +
 docs/codex-brief.md      | 97 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 111 insertions(+), 6 deletions(-)
 create mode 100644 docs/codex-brief.md

diff --git a/README.md b/README.md
index 5830681..32cf940 100644
--- a/README.md
+++ b/README.md
@@ -60,6 +60,10 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 
 ## Adding new pieces
 
-- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`.
+- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`. **Don't include `Agent` in `tools:`** — `scripts/lint-agents.sh` will reject the install.
 - New command: drop a `name.md` into `commands/`, re-run `install.sh`.
 - Settings allowlist additions: edit `~/.claude/settings.json` directly (or use the `update-config` skill); don't try to symlink settings.
+
+## Codex briefs
+
+Schema and reusable self-verify macros: [`docs/codex-brief.md`](docs/codex-brief.md). All briefs dispatched to `codex-executor` must include `working_dir`, `goal`, `files`, and `acceptance`; the executor rejects briefs missing those fields.
diff --git a/agents/codex-executor.md b/agents/codex-executor.md
index 6838cbe..6a7dc13 100644
--- a/agents/codex-executor.md
+++ b/agents/codex-executor.md
@@ -8,7 +8,12 @@ Thin dispatcher. You write nothing yourself; you invoke Codex.
 
 # Job
 
-1. Receive brief. If working dir is missing or the change is ambiguous, stop and ask — do not guess.
+1. **Validate brief against schema** at `~/github/claude-config/docs/codex-brief.md`. REJECT (stop and ask the caller) if missing any of:
+   - `working_dir` (absolute path that exists)
+   - `goal` (one sentence — what changes after this runs)
+   - `files` (concrete paths or search hint; create-new and edit-existing both enumerated)
+   - `acceptance` (testable post-conditions Codex can verify before declaring done)
+   Do not improvise missing fields.
 2. Dispatch via `~/github/claude-config/scripts/codex-dispatch.sh`. Never call `codex exec` directly.
 3. Verify the result against `git diff` — Codex's self-report may not match reality.
 4. Report back in the shape below.
diff --git a/agents/project-pm.md b/agents/project-pm.md
index 7d57dbc..831697c 100644
--- a/agents/project-pm.md
+++ b/agents/project-pm.md
@@ -71,10 +71,7 @@ The main thread runs reviewers in parallel (single message, multiple Agent calls
 
 # Writing a brief for codex-executor
 
-Required: **working dir** (abs path), **goal** (one sentence), **files to touch** (paths or search hint), **constraints** (don't-change, conventions, tests that must still pass), **acceptance criteria** (test, build, concrete check).
-
-Example:
-> In `~/github/foo/`, `src/auth/Login.tsx` drops the redirect param after OAuth callback — `/auth/callback?next=/dashboard` lands on `/`. Fix redirect handling. Existing tests in `src/auth/__tests__/` must still pass; add a test for the redirect case. Sandbox: workspace-write.
+The canonical schema lives in `~/github/claude-config/docs/codex-brief.md`. Briefs must declare `working_dir`, `goal`, `files`, and `acceptance`. Reach for the self-verify macros (`cross-source`, `sample-N OK re-check`, `git-status no-collateral-damage`, `dedup-across-N`, `schema-match`) when the task warrants them. `codex-executor` rejects briefs missing the four required fields — write the full set up front rather than getting bounced.
 
 Return the brief to the main thread; main thread dispatches it. Verify the resulting report against `git diff` before claiming success.
 
diff --git a/commands/pm.md b/commands/pm.md
index d814aca..92ec447 100644
--- a/commands/pm.md
+++ b/commands/pm.md
@@ -9,4 +9,6 @@ Relay the PM's user-facing summary. Do not do the PM's job yourself.
 
 **Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to invoking `scripts/codex-dispatch.sh` directly via Bash — the script encodes the canonical sandbox / approval / trace-capture flags so this command stays in sync with the rest of the dispatch path.
 
+Briefs must follow the schema at `docs/codex-brief.md` (working_dir / goal / files / acceptance, plus optional self-verify macros). codex-executor rejects briefs missing the required fields.
+
 For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here.
diff --git a/docs/codex-brief.md b/docs/codex-brief.md
new file mode 100644
index 0000000..e0f6834
--- /dev/null
+++ b/docs/codex-brief.md
@@ -0,0 +1,97 @@
+# Codex brief schema
+
+The canonical structure for any brief dispatched to `codex-executor` (directly via Agent, or indirectly via `scripts/codex-dispatch.sh`).
+
+`codex-executor` rejects briefs missing the four required fields. PMs and main-thread dispatchers should always write briefs against this schema; reach for the optional macros below when the task warrants them.
+
+## Required fields
+
+| Field | What | Example |
+|---|---|---|
+| `working_dir` | Absolute path. Must exist. | `/home/screenleon/github/japanese-site/` |
+| `goal` | One sentence. What changes after this runs. | "Backfill 40 N4 / 40 N3 / 40 N2 kanji entries to fill the empty middle-tier overlay." |
+| `files` | Concrete paths or a search hint. Both create-new and edit-existing must be enumerated. | `server/data/corpus/kanji/{N4,N3,N2}.jsonl` (new); read `N1.jsonl` and `N5.jsonl` for schema |
+| `acceptance` | Testable post-conditions Codex itself can verify before declaring done. | `wc -l shows 40/40/40`; every line parses as JSON; no character collisions across N1-N5 |
+
+A brief missing any of these is a request for guesswork. Reject and ask the caller.
+
+## Optional sections
+
+Use as needed; not all briefs require all of them.
+
+- **`constraints`** — what NOT to do. File paths off-limits, conventions to preserve, tests that must still pass after the change.
+- **`self_verify`** — see macros below. Use whenever the work has external authority (sources, schema, level tag) Codex must not invent.
+- **`output_format`** — when the deliverable is a report (audit, plan), specify the file path and required sections.
+- **`sandbox`** / **`approval`** — only set when overriding the defaults (`workspace-write` / `never`). Caller must authorize.
+
+## Self-verify macros
+
+Reusable phrases. Drop into `self_verify` block of any brief.
+
+### `cross-source` — N independent sources per item
+
+> For each `<item type>`, cross-check against ≥`<N>` independent authoritative sources from `<allowed source list>`. Cite the source in the report column. Do not rely on internal knowledge alone.
+
+Used for: JLPT level audits, fact-checked content batches.
+
+### `sample-N OK re-check` — false-positive guard
+
+> After producing the report, sample `<N>` random items you marked `OK` and re-verify them against the same source list. Note the re-check at the report footer (which items, sources re-checked, conclusion held).
+
+Used when most items will be `OK`; protects against rubber-stamping.
+
+### `git-status no-collateral-damage` — scope discipline
+
+> When done, run `git status --short` and confirm only the brief's allowlisted files changed. If anything else is modified or created, flag it in the report — do not claim success.
+
+Used for any brief with a strict allowlist (audit reports that must NOT touch source data, etc.).
+
+### `dedup-across-N` — collision check across multiple files
+
+> After writing, concatenate `<files>` and confirm <key> is unique across all of them (`sort -u | wc -l` matches input line count).
+
+Used when the same key (e.g. kanji character) must not appear in multiple level files.
+
+### `schema-match` — preserve existing JSONL/JSON shape
+
+> Read `<reference file>` first to learn the schema. Every new entry must include exactly the same keys, same types, same canonical values for non-content fields (e.g. `source`, `license`).
+
+Used when extending an existing data file family.
+
+## Example brief (good)
+
+```
+working_dir: /home/screenleon/github/japanese-site/
+goal: Audit PR #8 N1 corpus additions for JLPT level appropriateness — flag mis-classification.
+files:
+  - read: server/data/corpus/grammar/N1/{ga-hayai-ka,...}.{json,examples.jsonl}
+  - read: server/data/corpus/vocab/N1.jsonl (60 net-new rows)
+  - read: server/data/corpus/kanji/N1.jsonl (40 net-new rows)
+  - write: audits/pr-8-jlpt-level-audit-2026-05-02.md
+constraints:
+  - READ-ONLY on all files under server/data/corpus/
+  - Only write the audit report
+self_verify:
+  - cross-source: vocab+kanji ≥1 source from Jisho/Tanos; grammar ≥2 sources from Bunpro/JLPT-Sensei/Maggie/Tanos
+  - sample-N OK re-check: 3 random OK items at the footer
+  - git-status no-collateral-damage: only the audit file as new
+acceptance:
+  - report file exists at the specified path
+  - every flagged entry has citations
+  - footer contains the 3-item OK re-check
+output_format: markdown with three grouped sections (## OK / ## needs review / ## re-classify); table columns per row
+```
+
+## Example brief (bad — would be rejected)
+
+```
+"Audit the N1 stuff and flag anything wrong."
+```
+
+No working_dir, no files, no acceptance criteria. Codex would have to guess what corpus, what sources, how to verify, where to write the report. Reject and ask.
+
+## Style notes
+
+- Prose is fine; YAML-like keys above are conventions, not strict syntax. Use a heredoc when piping the brief on stdin.
+- Keep briefs in the active voice ("Audit X", not "X should be audited"). Codex parses imperatives more reliably than declaratives.
+- Don't write implementation steps. The brief tells Codex *what* and *what counts as done*; *how* is Codex's job.

From 14b665c24e806704f07a3ec62e4dceff1326744e Mon Sep 17 00:00:00 2001
From: Lien Chen <screen.leon@gmail.com>
Date: Sat, 2 May 2026 10:19:11 +0900
Subject: [PATCH 4/4] =?UTF-8?q?fix(pr-gate):=20single=20PM=20hop=20?=
 =?UTF-8?q?=E2=80=94=20main=20thread=20classifies=20via=20heuristic?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Architecture-reviewer noted that the prior pr-gate flow could call
PM twice on ambiguous classification (once to classify, once to
synthesize). Each PM invocation reloads the full PM system prompt
(~3KB).

Move classification to a deterministic heuristic in the skill text:

  non_docs=$(git diff $BASE...HEAD --name-only \
    | grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/')
  [ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation

Bias toward `implementation` when ambiguous: an unnecessary
security/risk spawn returns `pass-not-applicable` cheaply, but a
missed implementation review ships a real bug.

PM's role narrows to synthesis only — invoked once at Step 3 with
classification + reviewer outputs. Net: one PM round-trip per gate
instead of one or two.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 commands/pr-gate.md | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/commands/pr-gate.md b/commands/pr-gate.md
index ea550b8..ff2fdbd 100644
--- a/commands/pr-gate.md
+++ b/commands/pr-gate.md
@@ -3,23 +3,28 @@ description: Run the pre-PR review pipeline (critic + architecture + security +
 argument-hint: [optional context, e.g. "skip qa, already audited"]
 ---
 
-Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` synthesizes.
+Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` is invoked once at the end to synthesize.
 
-## Step 1 — classify the diff
+## Step 1 — classify the diff (main thread, no PM hop)
 
-In the main thread, detect the integration branch and stat the diff:
+Detect the integration branch and check the diff:
 
 ```bash
 BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@' || echo main)
 git diff "$BASE"...HEAD --stat
 ```
 
-Decide:
+Apply the heuristic — bias toward `implementation` when ambiguous (an unnecessary security/risk spawn returns `pass-not-applicable` cheaply; a missed implementation review can ship a real bug):
 
-- **docs / content-only** (no runtime code change) → spawn `critic` + `architecture-reviewer` only. Security / risk / qa are `pass-not-applicable`.
-- **implementation change** (any runtime code diff) → spawn all four advisors + qa-tester.
+```bash
+non_docs=$(git diff "$BASE"...HEAD --name-only | grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/' || true)
+[ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation
+```
 
-If unsure, invoke `project-pm` first with the diff stat to get the classification, then proceed to step 2.
+- `docs-only` → spawn `critic` + `architecture-reviewer`. Security / risk / qa are implicitly `pass-not-applicable`; do not spawn.
+- `implementation` → spawn all four advisors + `qa-tester`.
+
+Do not invoke PM at this step. PM's role is synthesis only.
 
 ## Step 2 — spawn reviewers in parallel from main thread
 
@@ -36,9 +41,10 @@ Agent(subagent_type: "qa-tester", ...)           # implementation only
 
 Each reviewer brief should include: working dir, branch name vs integration branch, diff summary, scope hints from $ARGUMENTS.
 
-## Step 3 — synthesize via project-pm
+## Step 3 — synthesize via project-pm (single hop)
+
+After all reviewers return, invoke `project-pm` once with the classification + their verbatim outputs and ask it to:
 
-After all reviewers return, invoke `project-pm` with their verbatim outputs and ask it to:
 - Compose the final gate summary (each reviewer's verdict, any blocks with override paths, final go/no-go).
 - Record `block-soft` overrides or trade-off advisories into project memory.