screenleon · screenleon · May 2, 2026 · May 2, 2026 · May 2, 2026 · May 2, 2026
diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml
@@ -0,0 +1,14 @@
+name: lint
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+
+jobs:
+  lint-agents:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Validate agent frontmatter
+        run: ./scripts/lint-agents.sh
diff --git a/README.md b/README.md
@@ -26,8 +26,8 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 ### Agents
 
 **Orchestration**
-- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, dispatches to `codex-executor`, runs the PR gate, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_<repo>.md`.
-- **codex-executor** — Thin wrapper subagent. Accepts a complete brief, dispatches to the Codex CLI via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back.
+- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, writes briefs (main thread dispatches), synthesizes PR-gate reviews, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_<repo>.md`.
+- **codex-executor** — Thin wrapper subagent, dispatched by the main thread. Accepts a complete brief, calls Codex via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back.
 
 **Reviewers (advisors — PM may override with reasoning)**
 - **critic** — Adversarial review of plan / diff. Scope creep, incompleteness, convention drift.
@@ -53,12 +53,17 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool
 
 ## Design notes
 
+- **Subagents cannot spawn subagents.** Claude Code intentionally restricts nested `Agent` tool calls regardless of frontmatter declaration ([Agent SDK docs](https://code.claude.com/docs/en/agent-sdk/subagents.md)). The **main thread** orchestrates: it spawns subagents (PM, reviewers, codex-executor) and relays outputs between them. PM produces briefs and synthesizes verdicts; it does not dispatch. Reviewers run in parallel from the main thread, not from PM. Never include `Agent` in any subagent's `tools:` frontmatter — `scripts/lint-agents.sh` enforces this.
 - **PM thinks, Codex implements.** `project-pm` writes the brief; `codex-executor` is a dispatcher, not a designer. Architecture, scope, and acceptance criteria stay with the PM.
 - **Definitions in repo, state on disk.** Agent and command definitions are version-controlled here. Per-project state (memory, traces) lives in `~/.claude/` and stays out of this repo.
 - **Decoupled from agent-playbook-template.** The playbook is a methodology framework; this repo is a personal config. They evolve independently.
 
 ## Adding new pieces
 
-- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`.
+- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`. **Don't include `Agent` in `tools:`** — `scripts/lint-agents.sh` will reject the install.
 - New command: drop a `name.md` into `commands/`, re-run `install.sh`.
 - Settings allowlist additions: edit `~/.claude/settings.json` directly (or use the `update-config` skill); don't try to symlink settings.
+
+## Codex briefs
+
+Schema and reusable self-verify macros: [`docs/codex-brief.md`](docs/codex-brief.md). All briefs dispatched to `codex-executor` must include `working_dir`, `goal`, `files`, and `acceptance`; the executor rejects briefs missing those fields.
diff --git a/agents/codex-executor.md b/agents/codex-executor.md
@@ -8,7 +8,12 @@ Thin dispatcher. You write nothing yourself; you invoke Codex.
 
 # Job
 
-1. Receive brief. If working dir is missing or the change is ambiguous, stop and ask — do not guess.
+1. **Validate brief against schema** at `~/github/claude-config/docs/codex-brief.md`. REJECT (stop and ask the caller) if missing any of:
+   - `working_dir` (absolute path that exists)
+   - `goal` (one sentence — what changes after this runs)
+   - `files` (concrete paths or search hint; create-new and edit-existing both enumerated)
+   - `acceptance` (testable post-conditions Codex can verify before declaring done)
+   Do not improvise missing fields.
 2. Dispatch via `~/github/claude-config/scripts/codex-dispatch.sh`. Never call `codex exec` directly.
 3. Verify the result against `git diff` — Codex's self-report may not match reality.
 4. Report back in the shape below.

diff --git a/agents/project-pm.md b/agents/project-pm.md
@@ -71,12 +71,9 @@ The main thread runs reviewers in parallel (single message, multiple Agent calls
 
 # Writing a brief for codex-executor
 
-Required: **working dir** (abs path), **goal** (one sentence), **files to touch** (paths or search hint), **constraints** (don't-change, conventions, tests that must still pass), **acceptance criteria** (test, build, concrete check).
+The canonical schema lives in `~/github/claude-config/docs/codex-brief.md`. Briefs must declare `working_dir`, `goal`, `files`, and `acceptance`. Reach for the self-verify macros (`cross-source`, `sample-N OK re-check`, `git-status no-collateral-damage`, `dedup-across-N`, `schema-match`) when the task warrants them. `codex-executor` rejects briefs missing the four required fields — write the full set up front rather than getting bounced.
 
-Example:
-> In `~/github/foo/`, `src/auth/Login.tsx` drops the redirect param after OAuth callback — `/auth/callback?next=/dashboard` lands on `/`. Fix redirect handling. Existing tests in `src/auth/__tests__/` must still pass; add a test for the redirect case. Sandbox: workspace-write.
-
-Return the brief to the main thread; main thread dispatches via `Agent(subagent_type: "codex-executor", ...)` (or directly via `Bash(codex exec ...)` if `codex-executor` is unavailable). Verify the resulting report against `git diff` before claiming success.
+Return the brief to the main thread; main thread dispatches it. Verify the resulting report against `git diff` before claiming success.
 
 # Per-project memory shape
 

diff --git a/commands/pm.md b/commands/pm.md
@@ -7,4 +7,8 @@ Invoke `project-pm` via Agent. Brief with: request ($ARGUMENTS), current working
 
 Relay the PM's user-facing summary. Do not do the PM's job yourself.
 
-**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor" or "spawn reviewers X / Y / Z", the **main thread** must do the dispatching. Treat the PM's brief as the input to your own `Agent(subagent_type: "codex-executor", ...)` call (or `Bash(codex exec ...)` if `codex-executor` itself is unavailable in your environment).
+**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to invoking `scripts/codex-dispatch.sh` directly via Bash — the script encodes the canonical sandbox / approval / trace-capture flags so this command stays in sync with the rest of the dispatch path.
+
+Briefs must follow the schema at `docs/codex-brief.md` (working_dir / goal / files / acceptance, plus optional self-verify macros). codex-executor rejects briefs missing the required fields.
+
+For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here.
diff --git a/commands/pr-gate.md b/commands/pr-gate.md
@@ -3,34 +3,48 @@ description: Run the pre-PR review pipeline (critic + architecture + security +
 argument-hint: [optional context, e.g. "skip qa, already audited"]
 ---
 
-Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` synthesizes.
+Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` is invoked once at the end to synthesize.
 
-## Step 1 — classify the diff
+## Step 1 — classify the diff (main thread, no PM hop)
 
-In the main thread, run `git diff main...HEAD --stat` (substitute the actual integration branch). Decide:
+Detect the integration branch and check the diff:
 
-- **docs / content-only** (no runtime code change) → spawn `critic` + `architecture-reviewer` only. Security / risk / qa are `pass-not-applicable`.
-- **implementation change** (any runtime code diff) → spawn all four advisors + qa-tester.
+```bash
+BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@' || echo main)
+git diff "$BASE"...HEAD --stat
+```
+
+Apply the heuristic — bias toward `implementation` when ambiguous (an unnecessary security/risk spawn returns `pass-not-applicable` cheaply; a missed implementation review can ship a real bug):
+
+```bash
+non_docs=$(git diff "$BASE"...HEAD --name-only | grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/' || true)
+[ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation
+```
+
+- `docs-only` → spawn `critic` + `architecture-reviewer`. Security / risk / qa are implicitly `pass-not-applicable`; do not spawn.
+- `implementation` → spawn all four advisors + `qa-tester`.
 
-If unsure, invoke `project-pm` first with the diff stat to get the classification, then proceed to step 2.
+Do not invoke PM at this step. PM's role is synthesis only.
 
 ## Step 2 — spawn reviewers in parallel from main thread
 
-In a single message, make multiple Agent calls (this is the parallel pattern):
+In a single message, make N parallel Agent tool calls — one per applicable reviewer. Pseudocode (illustrative, not literal call syntax):
 
 ```
-Agent(subagent_type: "critic", prompt: <branch + diff context + $ARGUMENTS>)
-Agent(subagent_type: "architecture-reviewer", prompt: <same>)
-Agent(subagent_type: "security-reviewer", prompt: <same>)   # implementation only
-Agent(subagent_type: "risk-reviewer", prompt: <same>)       # implementation only
-Agent(subagent_type: "qa-tester", prompt: <same>)           # implementation only
+# pseudocode — emit each as a real Agent tool call in one message
+Agent(subagent_type: "critic", ...)
+Agent(subagent_type: "architecture-reviewer", ...)
+Agent(subagent_type: "security-reviewer", ...)   # implementation only
+Agent(subagent_type: "risk-reviewer", ...)       # implementation only
+Agent(subagent_type: "qa-tester", ...)           # implementation only
 ```
 
 Each reviewer brief should include: working dir, branch name vs integration branch, diff summary, scope hints from $ARGUMENTS.
 
-## Step 3 — synthesize via project-pm
+## Step 3 — synthesize via project-pm (single hop)
+
+After all reviewers return, invoke `project-pm` once with the classification + their verbatim outputs and ask it to:
 
-After all reviewers return, invoke `project-pm` with their verbatim outputs and ask it to:
 - Compose the final gate summary (each reviewer's verdict, any blocks with override paths, final go/no-go).
 - Record `block-soft` overrides or trade-off advisories into project memory.
 

diff --git a/docs/codex-brief.md b/docs/codex-brief.md
@@ -0,0 +1,97 @@
+# Codex brief schema
+
+The canonical structure for any brief dispatched to `codex-executor` (directly via Agent, or indirectly via `scripts/codex-dispatch.sh`).
+
+`codex-executor` rejects briefs missing the four required fields. PMs and main-thread dispatchers should always write briefs against this schema; reach for the optional macros below when the task warrants them.
+
+## Required fields
+
+| Field | What | Example |
+|---|---|---|
+| `working_dir` | Absolute path. Must exist. | `/home/screenleon/github/japanese-site/` |
+| `goal` | One sentence. What changes after this runs. | "Backfill 40 N4 / 40 N3 / 40 N2 kanji entries to fill the empty middle-tier overlay." |
+| `files` | Concrete paths or a search hint. Both create-new and edit-existing must be enumerated. | `server/data/corpus/kanji/{N4,N3,N2}.jsonl` (new); read `N1.jsonl` and `N5.jsonl` for schema |
+| `acceptance` | Testable post-conditions Codex itself can verify before declaring done. | `wc -l shows 40/40/40`; every line parses as JSON; no character collisions across N1-N5 |
+
+A brief missing any of these is a request for guesswork. Reject and ask the caller.
+
+## Optional sections
+
+Use as needed; not all briefs require all of them.
+
+- **`constraints`** — what NOT to do. File paths off-limits, conventions to preserve, tests that must still pass after the change.
+- **`self_verify`** — see macros below. Use whenever the work has external authority (sources, schema, level tag) Codex must not invent.
+- **`output_format`** — when the deliverable is a report (audit, plan), specify the file path and required sections.
+- **`sandbox`** / **`approval`** — only set when overriding the defaults (`workspace-write` / `never`). Caller must authorize.
+
+## Self-verify macros
+
+Reusable phrases. Drop into `self_verify` block of any brief.
+
+### `cross-source` — N independent sources per item
+
+> For each `<item type>`, cross-check against ≥`<N>` independent authoritative sources from `<allowed source list>`. Cite the source in the report column. Do not rely on internal knowledge alone.
+
+Used for: JLPT level audits, fact-checked content batches.
+
+### `sample-N OK re-check` — false-positive guard
+
+> After producing the report, sample `<N>` random items you marked `OK` and re-verify them against the same source list. Note the re-check at the report footer (which items, sources re-checked, conclusion held).
+
+Used when most items will be `OK`; protects against rubber-stamping.
+
+### `git-status no-collateral-damage` — scope discipline
+
+> When done, run `git status --short` and confirm only the brief's allowlisted files changed. If anything else is modified or created, flag it in the report — do not claim success.
+
+Used for any brief with a strict allowlist (audit reports that must NOT touch source data, etc.).
+
+### `dedup-across-N` — collision check across multiple files
+
+> After writing, concatenate `<files>` and confirm <key> is unique across all of them (`sort -u | wc -l` matches input line count).
+
+Used when the same key (e.g. kanji character) must not appear in multiple level files.
+
+### `schema-match` — preserve existing JSONL/JSON shape
+
+> Read `<reference file>` first to learn the schema. Every new entry must include exactly the same keys, same types, same canonical values for non-content fields (e.g. `source`, `license`).
+
+Used when extending an existing data file family.
+
+## Example brief (good)
+
+```
+working_dir: /home/screenleon/github/japanese-site/
+goal: Audit PR #8 N1 corpus additions for JLPT level appropriateness — flag mis-classification.
+files:
+  - read: server/data/corpus/grammar/N1/{ga-hayai-ka,...}.{json,examples.jsonl}
+  - read: server/data/corpus/vocab/N1.jsonl (60 net-new rows)
+  - read: server/data/corpus/kanji/N1.jsonl (40 net-new rows)
+  - write: audits/pr-8-jlpt-level-audit-2026-05-02.md
+constraints:
+  - READ-ONLY on all files under server/data/corpus/
+  - Only write the audit report
+self_verify:
+  - cross-source: vocab+kanji ≥1 source from Jisho/Tanos; grammar ≥2 sources from Bunpro/JLPT-Sensei/Maggie/Tanos
+  - sample-N OK re-check: 3 random OK items at the footer
+  - git-status no-collateral-damage: only the audit file as new
+acceptance:
+  - report file exists at the specified path
+  - every flagged entry has citations
+  - footer contains the 3-item OK re-check
+output_format: markdown with three grouped sections (## OK / ## needs review / ## re-classify); table columns per row
+```
+
+## Example brief (bad — would be rejected)
+
+```
+"Audit the N1 stuff and flag anything wrong."
+```
+
+No working_dir, no files, no acceptance criteria. Codex would have to guess what corpus, what sources, how to verify, where to write the report. Reject and ask.
+
+## Style notes
+
+- Prose is fine; YAML-like keys above are conventions, not strict syntax. Use a heredoc when piping the brief on stdin.
+- Keep briefs in the active voice ("Audit X", not "X should be audited"). Codex parses imperatives more reliably than declaratives.
+- Don't write implementation steps. The brief tells Codex *what* and *what counts as done*; *how* is Codex's job.
diff --git a/install.sh b/install.sh
@@ -78,6 +78,13 @@ echo "  claude home: $CLAUDE_HOME"
 if [[ "$DRY_RUN" -eq 1 ]]; then echo "  mode:        DRY RUN"; fi
 echo
 
+# Pre-flight: agent frontmatter must not declare Agent (subagents can't spawn subagents)
+if [[ -x "$REPO_ROOT/scripts/lint-agents.sh" ]]; then
+  echo "==> lint agents"
+  "$REPO_ROOT/scripts/lint-agents.sh"
+  echo
+fi
+
 install_dir agents
 install_dir skills
 install_dir commands

diff --git a/scripts/lint-agents.sh b/scripts/lint-agents.sh
@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+# Validate that no subagent declares the `Agent` tool in its frontmatter.
+# Claude Code strips Agent from subagent runtime schemas regardless of
+# declaration, so leaving it in frontmatter is misleading drift.
+# See README.md "Design notes" for the rule.
+
+set -euo pipefail
+
+repo_root="$(cd "$(dirname "$0")/.." && pwd)"
+agents_dir="$repo_root/agents"
+
+if [ ! -d "$agents_dir" ]; then
+  echo "lint-agents: $agents_dir not found" >&2
+  exit 2
+fi
+
+violations=0
+for f in "$agents_dir"/*.md; do
+  [ -e "$f" ] || continue
+
+  # Require well-formed frontmatter (>=2 `---` markers)
+  fence_count=$(grep -c '^---$' "$f" || true)
+  if [ "$fence_count" -lt 2 ]; then
+    echo "WARN: $(basename "$f") has no YAML frontmatter; skipping" >&2
+    continue
+  fi
+
+  # Extract content between the first two `---` lines
+  fm=$(awk '/^---$/{c++; next} c==1' "$f")
+
+  # Capture the tools: block — the inline scalar form OR a block-list
+  # spanning indented `- ` lines until the next non-indented key.
+  tools_block=$(printf '%s\n' "$fm" | awk '
+    /^tools:/ { print; in_block=1; next }
+    in_block {
+      if ($0 ~ /^[[:space:]]/) { print; next }
+      else { in_block=0 }
+    }
+  ')
+
+  if [ -z "$tools_block" ]; then
+    continue
+  fi
+
+  # Match Agent as a whole token. Separators in YAML scalar/flow:
+  # `,` ` ` `[` `]` `:` and start/end of line. In block-list, `-` precedes.
+  if printf '%s' "$tools_block" | grep -qE '(^|[][, :-])Agent([][, :]|$)'; then
+    echo "FAIL: $(basename "$f") declares Agent in tools" >&2
+    printf '%s\n' "$tools_block" | sed 's/^/  /' >&2
+    violations=$((violations + 1))
+  fi
+done
+
+if [ "$violations" -gt 0 ]; then
+  echo "lint-agents: $violations violation(s). Subagents cannot spawn subagents — remove Agent from tools." >&2
+  exit 1
+fi
+
+echo "lint-agents: OK ($(ls "$agents_dir"/*.md 2>/dev/null | wc -l) agent files checked)"