From 92ac8d8bae80dd3decf7ea7dfcbb0eb7ddbf7d94 Mon Sep 17 00:00:00 2001 From: Lien Chen Date: Sat, 2 May 2026 10:02:21 +0900 Subject: [PATCH 1/4] chore: claude-config tier-1 cleanup + agent lint MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address deferred items from PR #1 critic/architecture review and add a regression guard: - commands/pr-gate.md: detect integration branch via `git symbolic-ref refs/remotes/origin/HEAD` instead of hardcoding `main`; mark Agent(...) block as illustrative pseudocode. - commands/pm.md: drop "or spawn reviewers X/Y/Z" half — that's /pr-gate territory. Move full Bash(codex exec) fallback recipe here so PM prompt doesn't carry main-thread implementation details. - agents/project-pm.md: drop the Bash(codex exec) fallback line from PM prompt — main-thread fallback paths don't belong here. - README.md: refresh project-pm description; add "Subagents cannot spawn subagents" as the lead Design note with link to Agent SDK docs. - scripts/lint-agents.sh: regression guard — fail if any subagent declares Agent in frontmatter tools. Wired into install.sh pre-flight. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 3 ++- agents/project-pm.md | 2 +- commands/pm.md | 4 +++- commands/pr-gate.md | 22 +++++++++++++++------- install.sh | 7 +++++++ scripts/lint-agents.sh | 37 +++++++++++++++++++++++++++++++++++++ 6 files changed, 65 insertions(+), 10 deletions(-) create mode 100755 scripts/lint-agents.sh diff --git a/README.md b/README.md index 3b75aff..60c4f0a 100644 --- a/README.md +++ b/README.md @@ -26,7 +26,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool ### Agents **Orchestration** -- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, dispatches to `codex-executor`, runs the PR gate, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_.md`. +- **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, writes briefs (main thread dispatches), synthesizes PR-gate reviews, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_.md`. - **codex-executor** — Thin wrapper subagent. Accepts a complete brief, dispatches to the Codex CLI via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back. **Reviewers (advisors — PM may override with reasoning)** @@ -53,6 +53,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool ## Design notes +- **Subagents cannot spawn subagents.** Claude Code intentionally restricts nested `Agent` tool calls regardless of frontmatter declaration ([Agent SDK docs](https://code.claude.com/docs/en/agent-sdk/subagents.md)). The **main thread** orchestrates: it spawns subagents (PM, reviewers, codex-executor) and relays outputs between them. PM produces briefs and synthesizes verdicts; it does not dispatch. Reviewers run in parallel from the main thread, not from PM. Never include `Agent` in any subagent's `tools:` frontmatter — `scripts/lint-agents.sh` enforces this. - **PM thinks, Codex implements.** `project-pm` writes the brief; `codex-executor` is a dispatcher, not a designer. Architecture, scope, and acceptance criteria stay with the PM. - **Definitions in repo, state on disk.** Agent and command definitions are version-controlled here. Per-project state (memory, traces) lives in `~/.claude/` and stays out of this repo. - **Decoupled from agent-playbook-template.** The playbook is a methodology framework; this repo is a personal config. They evolve independently. diff --git a/agents/project-pm.md b/agents/project-pm.md index 33a072d..7d57dbc 100644 --- a/agents/project-pm.md +++ b/agents/project-pm.md @@ -76,7 +76,7 @@ Required: **working dir** (abs path), **goal** (one sentence), **files to touch* Example: > In `~/github/foo/`, `src/auth/Login.tsx` drops the redirect param after OAuth callback — `/auth/callback?next=/dashboard` lands on `/`. Fix redirect handling. Existing tests in `src/auth/__tests__/` must still pass; add a test for the redirect case. Sandbox: workspace-write. -Return the brief to the main thread; main thread dispatches via `Agent(subagent_type: "codex-executor", ...)` (or directly via `Bash(codex exec ...)` if `codex-executor` is unavailable). Verify the resulting report against `git diff` before claiming success. +Return the brief to the main thread; main thread dispatches it. Verify the resulting report against `git diff` before claiming success. # Per-project memory shape diff --git a/commands/pm.md b/commands/pm.md index 5a1cc18..eedd0c5 100644 --- a/commands/pm.md +++ b/commands/pm.md @@ -7,4 +7,6 @@ Invoke `project-pm` via Agent. Brief with: request ($ARGUMENTS), current working Relay the PM's user-facing summary. Do not do the PM's job yourself. -**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor" or "spawn reviewers X / Y / Z", the **main thread** must do the dispatching. Treat the PM's brief as the input to your own `Agent(subagent_type: "codex-executor", ...)` call (or `Bash(codex exec ...)` if `codex-executor` itself is unavailable in your environment). +**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to `Bash(codex exec --sandbox workspace-write --skip-git-repo-check -C -)` with the brief on stdin. + +For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here. diff --git a/commands/pr-gate.md b/commands/pr-gate.md index 15dfba6..ea550b8 100644 --- a/commands/pr-gate.md +++ b/commands/pr-gate.md @@ -7,7 +7,14 @@ Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main ## Step 1 — classify the diff -In the main thread, run `git diff main...HEAD --stat` (substitute the actual integration branch). Decide: +In the main thread, detect the integration branch and stat the diff: + +```bash +BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@' || echo main) +git diff "$BASE"...HEAD --stat +``` + +Decide: - **docs / content-only** (no runtime code change) → spawn `critic` + `architecture-reviewer` only. Security / risk / qa are `pass-not-applicable`. - **implementation change** (any runtime code diff) → spawn all four advisors + qa-tester. @@ -16,14 +23,15 @@ If unsure, invoke `project-pm` first with the diff stat to get the classificatio ## Step 2 — spawn reviewers in parallel from main thread -In a single message, make multiple Agent calls (this is the parallel pattern): +In a single message, make N parallel Agent tool calls — one per applicable reviewer. Pseudocode (illustrative, not literal call syntax): ``` -Agent(subagent_type: "critic", prompt: ) -Agent(subagent_type: "architecture-reviewer", prompt: ) -Agent(subagent_type: "security-reviewer", prompt: ) # implementation only -Agent(subagent_type: "risk-reviewer", prompt: ) # implementation only -Agent(subagent_type: "qa-tester", prompt: ) # implementation only +# pseudocode — emit each as a real Agent tool call in one message +Agent(subagent_type: "critic", ...) +Agent(subagent_type: "architecture-reviewer", ...) +Agent(subagent_type: "security-reviewer", ...) # implementation only +Agent(subagent_type: "risk-reviewer", ...) # implementation only +Agent(subagent_type: "qa-tester", ...) # implementation only ``` Each reviewer brief should include: working dir, branch name vs integration branch, diff summary, scope hints from $ARGUMENTS. diff --git a/install.sh b/install.sh index b259ace..7c8e12b 100755 --- a/install.sh +++ b/install.sh @@ -78,6 +78,13 @@ echo " claude home: $CLAUDE_HOME" if [[ "$DRY_RUN" -eq 1 ]]; then echo " mode: DRY RUN"; fi echo +# Pre-flight: agent frontmatter must not declare Agent (subagents can't spawn subagents) +if [[ -x "$REPO_ROOT/scripts/lint-agents.sh" ]]; then + echo "==> lint agents" + "$REPO_ROOT/scripts/lint-agents.sh" + echo +fi + install_dir agents install_dir skills install_dir commands diff --git a/scripts/lint-agents.sh b/scripts/lint-agents.sh new file mode 100755 index 0000000..0962777 --- /dev/null +++ b/scripts/lint-agents.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash +# Validate that no subagent declares the `Agent` tool in its frontmatter. +# Claude Code strips Agent from subagent runtime schemas regardless of +# declaration, so leaving it in frontmatter is misleading drift. +# See README.md "Design notes" for the rule. + +set -euo pipefail + +repo_root="$(cd "$(dirname "$0")/.." && pwd)" +agents_dir="$repo_root/agents" + +if [ ! -d "$agents_dir" ]; then + echo "lint-agents: $agents_dir not found" >&2 + exit 2 +fi + +violations=0 +for f in "$agents_dir"/*.md; do + [ -e "$f" ] || continue + # Extract content between the first two `---` lines (YAML frontmatter) + fm=$(awk '/^---$/{c++; next} c==1' "$f") + tools_line=$(printf '%s\n' "$fm" | grep -E '^tools:' || true) + if [ -z "$tools_line" ]; then + continue + fi + if printf '%s' "$tools_line" | grep -qE '(^|[, ])Agent([, ]|$)'; then + echo "FAIL: $(basename "$f") declares Agent in tools: $tools_line" >&2 + violations=$((violations + 1)) + fi +done + +if [ "$violations" -gt 0 ]; then + echo "lint-agents: $violations violation(s). Subagents cannot spawn subagents — remove Agent from tools." >&2 + exit 1 +fi + +echo "lint-agents: OK ($(ls "$agents_dir"/*.md 2>/dev/null | wc -l) agent files checked)" From f636164596d36658dfce71be53d25844fe70d372 Mon Sep 17 00:00:00 2001 From: Lien Chen Date: Sat, 2 May 2026 10:10:36 +0900 Subject: [PATCH 2/4] fix(lint): handle YAML block-list/flow forms; CI workflow; reduce duplication MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address review findings on prior tier-1 cleanup commit: - scripts/lint-agents.sh: previously only matched the inline scalar form `tools: Read, Bash, Agent`. The YAML block-list form tools: - Read - Agent was silently passed — exact escape hatch the script is meant to prevent. Now extracts the full tools: block (handles inline, flow `[a, b, c]`, and block-list) and matches Agent as a whole token. Also warn on files lacking YAML frontmatter rather than treating them as compliant. Verified against fixtures covering inline / flow / block-list / solo Agent / "MyAgent" suffix / "AgentX" prefix / no-frontmatter cases. - .github/workflows/lint.yml: run scripts/lint-agents.sh on push and PRs. install.sh pre-flight only catches drift on machines that re-run install; CI catches it on every contribution regardless of whether the contributor symlinks. Defense in depth. - commands/pm.md: codex fallback now points at scripts/codex-dispatch.sh (single source of truth for sandbox / approval / trace flags) instead of duplicating the raw codex exec invocation. Avoids drift the next time codex-dispatch.sh changes. - README.md: codex-executor description now explicitly says "dispatched by the main thread" — matches the new project-pm description and the Design notes lead bullet. Co-Authored-By: Claude Opus 4.7 (1M context) --- .github/workflows/lint.yml | 14 ++++++++++++++ README.md | 2 +- commands/pm.md | 2 +- scripts/lint-agents.sh | 32 +++++++++++++++++++++++++++----- 4 files changed, 43 insertions(+), 7 deletions(-) create mode 100644 .github/workflows/lint.yml diff --git a/.github/workflows/lint.yml b/.github/workflows/lint.yml new file mode 100644 index 0000000..6a97106 --- /dev/null +++ b/.github/workflows/lint.yml @@ -0,0 +1,14 @@ +name: lint + +on: + push: + branches: [main] + pull_request: + +jobs: + lint-agents: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + - name: Validate agent frontmatter + run: ./scripts/lint-agents.sh diff --git a/README.md b/README.md index 60c4f0a..5830681 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool **Orchestration** - **project-pm** — PM across `~/github/` repos. Triages requests, decomposes work, writes briefs (main thread dispatches), synthesizes PR-gate reviews, maintains per-project memory at `~/.claude/projects/-home-screenleon-github/memory/project_.md`. -- **codex-executor** — Thin wrapper subagent. Accepts a complete brief, dispatches to the Codex CLI via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back. +- **codex-executor** — Thin wrapper subagent, dispatched by the main thread. Accepts a complete brief, calls Codex via `scripts/codex-dispatch.sh`, verifies via `git diff`, reports back. **Reviewers (advisors — PM may override with reasoning)** - **critic** — Adversarial review of plan / diff. Scope creep, incompleteness, convention drift. diff --git a/commands/pm.md b/commands/pm.md index eedd0c5..d814aca 100644 --- a/commands/pm.md +++ b/commands/pm.md @@ -7,6 +7,6 @@ Invoke `project-pm` via Agent. Brief with: request ($ARGUMENTS), current working Relay the PM's user-facing summary. Do not do the PM's job yourself. -**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to `Bash(codex exec --sandbox workspace-write --skip-git-repo-check -C -)` with the brief on stdin. +**Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to invoking `scripts/codex-dispatch.sh` directly via Bash — the script encodes the canonical sandbox / approval / trace-capture flags so this command stays in sync with the rest of the dispatch path. For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here. diff --git a/scripts/lint-agents.sh b/scripts/lint-agents.sh index 0962777..3b4d8fe 100755 --- a/scripts/lint-agents.sh +++ b/scripts/lint-agents.sh @@ -17,14 +17,36 @@ fi violations=0 for f in "$agents_dir"/*.md; do [ -e "$f" ] || continue - # Extract content between the first two `---` lines (YAML frontmatter) + + # Require well-formed frontmatter (>=2 `---` markers) + fence_count=$(grep -c '^---$' "$f" || true) + if [ "$fence_count" -lt 2 ]; then + echo "WARN: $(basename "$f") has no YAML frontmatter; skipping" >&2 + continue + fi + + # Extract content between the first two `---` lines fm=$(awk '/^---$/{c++; next} c==1' "$f") - tools_line=$(printf '%s\n' "$fm" | grep -E '^tools:' || true) - if [ -z "$tools_line" ]; then + + # Capture the tools: block — the inline scalar form OR a block-list + # spanning indented `- ` lines until the next non-indented key. + tools_block=$(printf '%s\n' "$fm" | awk ' + /^tools:/ { print; in_block=1; next } + in_block { + if ($0 ~ /^[[:space:]]/) { print; next } + else { in_block=0 } + } + ') + + if [ -z "$tools_block" ]; then continue fi - if printf '%s' "$tools_line" | grep -qE '(^|[, ])Agent([, ]|$)'; then - echo "FAIL: $(basename "$f") declares Agent in tools: $tools_line" >&2 + + # Match Agent as a whole token. Separators in YAML scalar/flow: + # `,` ` ` `[` `]` `:` and start/end of line. In block-list, `-` precedes. + if printf '%s' "$tools_block" | grep -qE '(^|[][, :-])Agent([][, :]|$)'; then + echo "FAIL: $(basename "$f") declares Agent in tools" >&2 + printf '%s\n' "$tools_block" | sed 's/^/ /' >&2 violations=$((violations + 1)) fi done From 3a5a07343244dd39f69b345c3f9b9fd4bee00913 Mon Sep 17 00:00:00 2001 From: Lien Chen Date: Sat, 2 May 2026 10:18:46 +0900 Subject: [PATCH 3/4] docs(codex): canonical brief schema + executor enforcement MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Briefs dispatched to codex-executor have been written ad-hoc each time — required fields varied, self-verify patterns were copied from memory rather than referenced, and a vague brief would still get dispatched (executor's only validation was "working dir is missing"). This drives composition cost up and quality variance. Establish a canonical schema and codify the reusable patterns: - docs/codex-brief.md (new): four required fields (working_dir, goal, files, acceptance) and five reusable self-verify macros extracted from real briefs run this session: cross-source, sample-N OK re-check, git-status no-collateral-damage, dedup-across-N, schema-match. Includes one good example brief and one bad example for contrast. - agents/codex-executor.md: enforce the schema. Reject briefs missing any required field rather than improvising. Link to the schema doc. - agents/project-pm.md: PM's brief-writing section now points at the schema doc and macros, instead of reciting an inline example. -3 lines. - commands/pm.md: note that briefs must follow the schema. - README.md: new "Codex briefs" section linking to the schema; "Adding new pieces" reminds contributors not to put Agent in subagent tools (lint-agents.sh enforces). Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 6 ++- agents/codex-executor.md | 7 ++- agents/project-pm.md | 5 +-- commands/pm.md | 2 + docs/codex-brief.md | 97 ++++++++++++++++++++++++++++++++++++++++ 5 files changed, 111 insertions(+), 6 deletions(-) create mode 100644 docs/codex-brief.md diff --git a/README.md b/README.md index 5830681..32cf940 100644 --- a/README.md +++ b/README.md @@ -60,6 +60,10 @@ Idempotent — re-run safely after adding files. Per-file symlinks so other tool ## Adding new pieces -- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`. +- New agent: drop a `name.md` (with frontmatter) into `agents/`, re-run `install.sh`. **Don't include `Agent` in `tools:`** — `scripts/lint-agents.sh` will reject the install. - New command: drop a `name.md` into `commands/`, re-run `install.sh`. - Settings allowlist additions: edit `~/.claude/settings.json` directly (or use the `update-config` skill); don't try to symlink settings. + +## Codex briefs + +Schema and reusable self-verify macros: [`docs/codex-brief.md`](docs/codex-brief.md). All briefs dispatched to `codex-executor` must include `working_dir`, `goal`, `files`, and `acceptance`; the executor rejects briefs missing those fields. diff --git a/agents/codex-executor.md b/agents/codex-executor.md index 6838cbe..6a7dc13 100644 --- a/agents/codex-executor.md +++ b/agents/codex-executor.md @@ -8,7 +8,12 @@ Thin dispatcher. You write nothing yourself; you invoke Codex. # Job -1. Receive brief. If working dir is missing or the change is ambiguous, stop and ask — do not guess. +1. **Validate brief against schema** at `~/github/claude-config/docs/codex-brief.md`. REJECT (stop and ask the caller) if missing any of: + - `working_dir` (absolute path that exists) + - `goal` (one sentence — what changes after this runs) + - `files` (concrete paths or search hint; create-new and edit-existing both enumerated) + - `acceptance` (testable post-conditions Codex can verify before declaring done) + Do not improvise missing fields. 2. Dispatch via `~/github/claude-config/scripts/codex-dispatch.sh`. Never call `codex exec` directly. 3. Verify the result against `git diff` — Codex's self-report may not match reality. 4. Report back in the shape below. diff --git a/agents/project-pm.md b/agents/project-pm.md index 7d57dbc..831697c 100644 --- a/agents/project-pm.md +++ b/agents/project-pm.md @@ -71,10 +71,7 @@ The main thread runs reviewers in parallel (single message, multiple Agent calls # Writing a brief for codex-executor -Required: **working dir** (abs path), **goal** (one sentence), **files to touch** (paths or search hint), **constraints** (don't-change, conventions, tests that must still pass), **acceptance criteria** (test, build, concrete check). - -Example: -> In `~/github/foo/`, `src/auth/Login.tsx` drops the redirect param after OAuth callback — `/auth/callback?next=/dashboard` lands on `/`. Fix redirect handling. Existing tests in `src/auth/__tests__/` must still pass; add a test for the redirect case. Sandbox: workspace-write. +The canonical schema lives in `~/github/claude-config/docs/codex-brief.md`. Briefs must declare `working_dir`, `goal`, `files`, and `acceptance`. Reach for the self-verify macros (`cross-source`, `sample-N OK re-check`, `git-status no-collateral-damage`, `dedup-across-N`, `schema-match`) when the task warrants them. `codex-executor` rejects briefs missing the four required fields — write the full set up front rather than getting bounced. Return the brief to the main thread; main thread dispatches it. Verify the resulting report against `git diff` before claiming success. diff --git a/commands/pm.md b/commands/pm.md index d814aca..92ec447 100644 --- a/commands/pm.md +++ b/commands/pm.md @@ -9,4 +9,6 @@ Relay the PM's user-facing summary. Do not do the PM's job yourself. **Note**: Subagents cannot spawn subagents. If the PM's reply is "dispatch this brief to codex-executor", the **main thread** must do the dispatching — treat the PM's brief as input to your own `Agent(subagent_type: "codex-executor", ...)` call (illustrative — emit it as a real Agent tool call). If `codex-executor` is unavailable, fallback to invoking `scripts/codex-dispatch.sh` directly via Bash — the script encodes the canonical sandbox / approval / trace-capture flags so this command stays in sync with the rest of the dispatch path. +Briefs must follow the schema at `docs/codex-brief.md` (working_dir / goal / files / acceptance, plus optional self-verify macros). codex-executor rejects briefs missing the required fields. + For PR-gate flows, use `/pr-gate` instead — that skill handles reviewer orchestration; do not re-implement it inline here. diff --git a/docs/codex-brief.md b/docs/codex-brief.md new file mode 100644 index 0000000..e0f6834 --- /dev/null +++ b/docs/codex-brief.md @@ -0,0 +1,97 @@ +# Codex brief schema + +The canonical structure for any brief dispatched to `codex-executor` (directly via Agent, or indirectly via `scripts/codex-dispatch.sh`). + +`codex-executor` rejects briefs missing the four required fields. PMs and main-thread dispatchers should always write briefs against this schema; reach for the optional macros below when the task warrants them. + +## Required fields + +| Field | What | Example | +|---|---|---| +| `working_dir` | Absolute path. Must exist. | `/home/screenleon/github/japanese-site/` | +| `goal` | One sentence. What changes after this runs. | "Backfill 40 N4 / 40 N3 / 40 N2 kanji entries to fill the empty middle-tier overlay." | +| `files` | Concrete paths or a search hint. Both create-new and edit-existing must be enumerated. | `server/data/corpus/kanji/{N4,N3,N2}.jsonl` (new); read `N1.jsonl` and `N5.jsonl` for schema | +| `acceptance` | Testable post-conditions Codex itself can verify before declaring done. | `wc -l shows 40/40/40`; every line parses as JSON; no character collisions across N1-N5 | + +A brief missing any of these is a request for guesswork. Reject and ask the caller. + +## Optional sections + +Use as needed; not all briefs require all of them. + +- **`constraints`** — what NOT to do. File paths off-limits, conventions to preserve, tests that must still pass after the change. +- **`self_verify`** — see macros below. Use whenever the work has external authority (sources, schema, level tag) Codex must not invent. +- **`output_format`** — when the deliverable is a report (audit, plan), specify the file path and required sections. +- **`sandbox`** / **`approval`** — only set when overriding the defaults (`workspace-write` / `never`). Caller must authorize. + +## Self-verify macros + +Reusable phrases. Drop into `self_verify` block of any brief. + +### `cross-source` — N independent sources per item + +> For each ``, cross-check against ≥`` independent authoritative sources from ``. Cite the source in the report column. Do not rely on internal knowledge alone. + +Used for: JLPT level audits, fact-checked content batches. + +### `sample-N OK re-check` — false-positive guard + +> After producing the report, sample `` random items you marked `OK` and re-verify them against the same source list. Note the re-check at the report footer (which items, sources re-checked, conclusion held). + +Used when most items will be `OK`; protects against rubber-stamping. + +### `git-status no-collateral-damage` — scope discipline + +> When done, run `git status --short` and confirm only the brief's allowlisted files changed. If anything else is modified or created, flag it in the report — do not claim success. + +Used for any brief with a strict allowlist (audit reports that must NOT touch source data, etc.). + +### `dedup-across-N` — collision check across multiple files + +> After writing, concatenate `` and confirm is unique across all of them (`sort -u | wc -l` matches input line count). + +Used when the same key (e.g. kanji character) must not appear in multiple level files. + +### `schema-match` — preserve existing JSONL/JSON shape + +> Read `` first to learn the schema. Every new entry must include exactly the same keys, same types, same canonical values for non-content fields (e.g. `source`, `license`). + +Used when extending an existing data file family. + +## Example brief (good) + +``` +working_dir: /home/screenleon/github/japanese-site/ +goal: Audit PR #8 N1 corpus additions for JLPT level appropriateness — flag mis-classification. +files: + - read: server/data/corpus/grammar/N1/{ga-hayai-ka,...}.{json,examples.jsonl} + - read: server/data/corpus/vocab/N1.jsonl (60 net-new rows) + - read: server/data/corpus/kanji/N1.jsonl (40 net-new rows) + - write: audits/pr-8-jlpt-level-audit-2026-05-02.md +constraints: + - READ-ONLY on all files under server/data/corpus/ + - Only write the audit report +self_verify: + - cross-source: vocab+kanji ≥1 source from Jisho/Tanos; grammar ≥2 sources from Bunpro/JLPT-Sensei/Maggie/Tanos + - sample-N OK re-check: 3 random OK items at the footer + - git-status no-collateral-damage: only the audit file as new +acceptance: + - report file exists at the specified path + - every flagged entry has citations + - footer contains the 3-item OK re-check +output_format: markdown with three grouped sections (## OK / ## needs review / ## re-classify); table columns per row +``` + +## Example brief (bad — would be rejected) + +``` +"Audit the N1 stuff and flag anything wrong." +``` + +No working_dir, no files, no acceptance criteria. Codex would have to guess what corpus, what sources, how to verify, where to write the report. Reject and ask. + +## Style notes + +- Prose is fine; YAML-like keys above are conventions, not strict syntax. Use a heredoc when piping the brief on stdin. +- Keep briefs in the active voice ("Audit X", not "X should be audited"). Codex parses imperatives more reliably than declaratives. +- Don't write implementation steps. The brief tells Codex *what* and *what counts as done*; *how* is Codex's job. From 14b665c24e806704f07a3ec62e4dceff1326744e Mon Sep 17 00:00:00 2001 From: Lien Chen Date: Sat, 2 May 2026 10:19:11 +0900 Subject: [PATCH 4/4] =?UTF-8?q?fix(pr-gate):=20single=20PM=20hop=20?= =?UTF-8?q?=E2=80=94=20main=20thread=20classifies=20via=20heuristic?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Architecture-reviewer noted that the prior pr-gate flow could call PM twice on ambiguous classification (once to classify, once to synthesize). Each PM invocation reloads the full PM system prompt (~3KB). Move classification to a deterministic heuristic in the skill text: non_docs=$(git diff $BASE...HEAD --name-only \ | grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/') [ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation Bias toward `implementation` when ambiguous: an unnecessary security/risk spawn returns `pass-not-applicable` cheaply, but a missed implementation review ships a real bug. PM's role narrows to synthesis only — invoked once at Step 3 with classification + reviewer outputs. Net: one PM round-trip per gate instead of one or two. Co-Authored-By: Claude Opus 4.7 (1M context) --- commands/pr-gate.md | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/commands/pr-gate.md b/commands/pr-gate.md index ea550b8..ff2fdbd 100644 --- a/commands/pr-gate.md +++ b/commands/pr-gate.md @@ -3,23 +3,28 @@ description: Run the pre-PR review pipeline (critic + architecture + security + argument-hint: [optional context, e.g. "skip qa, already audited"] --- -Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` synthesizes. +Run the PR gate. Subagents cannot spawn subagents in Claude Code, so the **main thread** orchestrates reviewers; `project-pm` is invoked once at the end to synthesize. -## Step 1 — classify the diff +## Step 1 — classify the diff (main thread, no PM hop) -In the main thread, detect the integration branch and stat the diff: +Detect the integration branch and check the diff: ```bash BASE=$(git symbolic-ref refs/remotes/origin/HEAD 2>/dev/null | sed 's@^refs/remotes/origin/@@' || echo main) git diff "$BASE"...HEAD --stat ``` -Decide: +Apply the heuristic — bias toward `implementation` when ambiguous (an unnecessary security/risk spawn returns `pass-not-applicable` cheaply; a missed implementation review can ship a real bug): -- **docs / content-only** (no runtime code change) → spawn `critic` + `architecture-reviewer` only. Security / risk / qa are `pass-not-applicable`. -- **implementation change** (any runtime code diff) → spawn all four advisors + qa-tester. +```bash +non_docs=$(git diff "$BASE"...HEAD --name-only | grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/' || true) +[ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation +``` -If unsure, invoke `project-pm` first with the diff stat to get the classification, then proceed to step 2. +- `docs-only` → spawn `critic` + `architecture-reviewer`. Security / risk / qa are implicitly `pass-not-applicable`; do not spawn. +- `implementation` → spawn all four advisors + `qa-tester`. + +Do not invoke PM at this step. PM's role is synthesis only. ## Step 2 — spawn reviewers in parallel from main thread @@ -36,9 +41,10 @@ Agent(subagent_type: "qa-tester", ...) # implementation only Each reviewer brief should include: working dir, branch name vs integration branch, diff summary, scope hints from $ARGUMENTS. -## Step 3 — synthesize via project-pm +## Step 3 — synthesize via project-pm (single hop) + +After all reviewers return, invoke `project-pm` once with the classification + their verbatim outputs and ask it to: -After all reviewers return, invoke `project-pm` with their verbatim outputs and ask it to: - Compose the final gate summary (each reviewer's verdict, any blocks with override paths, final go/no-go). - Record `block-soft` overrides or trade-off advisories into project memory.