chore: claude-config tier-1 cleanup + agent lint + CI#2
Merged
Conversation
Address deferred items from PR #1 critic/architecture review and add a regression guard: - commands/pr-gate.md: detect integration branch via `git symbolic-ref refs/remotes/origin/HEAD` instead of hardcoding `main`; mark Agent(...) block as illustrative pseudocode. - commands/pm.md: drop "or spawn reviewers X/Y/Z" half — that's /pr-gate territory. Move full Bash(codex exec) fallback recipe here so PM prompt doesn't carry main-thread implementation details. - agents/project-pm.md: drop the Bash(codex exec) fallback line from PM prompt — main-thread fallback paths don't belong here. - README.md: refresh project-pm description; add "Subagents cannot spawn subagents" as the lead Design note with link to Agent SDK docs. - scripts/lint-agents.sh: regression guard — fail if any subagent declares Agent in frontmatter tools. Wired into install.sh pre-flight. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lication
Address review findings on prior tier-1 cleanup commit:
- scripts/lint-agents.sh: previously only matched the inline scalar
form `tools: Read, Bash, Agent`. The YAML block-list form
tools:
- Read
- Agent
was silently passed — exact escape hatch the script is meant
to prevent. Now extracts the full tools: block (handles inline,
flow `[a, b, c]`, and block-list) and matches Agent as a whole
token. Also warn on files lacking YAML frontmatter rather than
treating them as compliant. Verified against fixtures covering
inline / flow / block-list / solo Agent / "MyAgent" suffix /
"AgentX" prefix / no-frontmatter cases.
- .github/workflows/lint.yml: run scripts/lint-agents.sh on push
and PRs. install.sh pre-flight only catches drift on machines
that re-run install; CI catches it on every contribution
regardless of whether the contributor symlinks. Defense in depth.
- commands/pm.md: codex fallback now points at scripts/codex-dispatch.sh
(single source of truth for sandbox / approval / trace flags)
instead of duplicating the raw codex exec invocation. Avoids
drift the next time codex-dispatch.sh changes.
- README.md: codex-executor description now explicitly says
"dispatched by the main thread" — matches the new project-pm
description and the Design notes lead bullet.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Briefs dispatched to codex-executor have been written ad-hoc each time — required fields varied, self-verify patterns were copied from memory rather than referenced, and a vague brief would still get dispatched (executor's only validation was "working dir is missing"). This drives composition cost up and quality variance. Establish a canonical schema and codify the reusable patterns: - docs/codex-brief.md (new): four required fields (working_dir, goal, files, acceptance) and five reusable self-verify macros extracted from real briefs run this session: cross-source, sample-N OK re-check, git-status no-collateral-damage, dedup-across-N, schema-match. Includes one good example brief and one bad example for contrast. - agents/codex-executor.md: enforce the schema. Reject briefs missing any required field rather than improvising. Link to the schema doc. - agents/project-pm.md: PM's brief-writing section now points at the schema doc and macros, instead of reciting an inline example. -3 lines. - commands/pm.md: note that briefs must follow the schema. - README.md: new "Codex briefs" section linking to the schema; "Adding new pieces" reminds contributors not to put Agent in subagent tools (lint-agents.sh enforces). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture-reviewer noted that the prior pr-gate flow could call
PM twice on ambiguous classification (once to classify, once to
synthesize). Each PM invocation reloads the full PM system prompt
(~3KB).
Move classification to a deterministic heuristic in the skill text:
non_docs=$(git diff $BASE...HEAD --name-only \
| grep -vE '\.(md|jsonl|txt)$|^\.gitignore$|^audits/|^docs/|^\.github/')
[ -z "$non_docs" ] && CLASS=docs-only || CLASS=implementation
Bias toward `implementation` when ambiguous: an unnecessary
security/risk spawn returns `pass-not-applicable` cheaply, but a
missed implementation review ships a real bug.
PM's role narrows to synthesis only — invoked once at Step 3 with
classification + reviewer outputs. Net: one PM round-trip per gate
instead of one or two.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 15, 2026
screenleon
added a commit
that referenced
this pull request
May 23, 2026
Gate findings on PR (CC-243 stacked on cc-060): - [medium, 3-reviewer overlap] Default snapshot output path used only a second-resolution timestamp (`/tmp/pm-snapshot-<ts>.md`); concurrent /pm spawns could collide and clobber each other's snapshots, feeding the wrong ground truth to PM. Switched the default to include PID + $RANDOM suffix (`/tmp/pm-snapshot-<ts>-$$.$RANDOM.md`), which is collision-resistant enough for solo + concurrent dispatches without introducing a mktemp portability concern. - [medium] Snapshot script hard-failed when `origin/main` could not be resolved (detached HEAD, missing remote, fresh local clone). Added a graceful fallback: try `origin/main` -> `main` (local) -> emit `branch_base: unresolved` with a `# warn:` line. The same fallback also degrades `ahead_by` to 0 + warns instead of hard-failing. - [low] BACKLOG.md CC-243 description claimed `project_tooling` was derived from a shell probe `ls Makefile project/backlog.yml 2>/dev/null`, but the script actually emits three fixed boolean keys (`makefile`, `backlog_render_target`, `has_validate_sh`). Updated the row text to match the impl contract and noted the map is extensible. Also updated the default OUT_PATH shape in the description and added pr:TBD ref (PR number unknown until open). - [low] CLI `--help` / unknown-flag paths had no test coverage. Added cli-help (exits 0, usage on stderr), cli-unknown-flag (non-zero, "unknown argument" message), and branch-base-warn-on-missing-origin-main (the medium #2 fix's positive case). Local verify: 31 tests passed (was 29), lint-scripts OK, validator emits 31 E-* lines (= CC-228 baseline unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
screenleon
added a commit
that referenced
this pull request
May 23, 2026
Gate findings on PR (CC-243 stacked on cc-060): - [medium, 3-reviewer overlap] Default snapshot output path used only a second-resolution timestamp (`/tmp/pm-snapshot-<ts>.md`); concurrent /pm spawns could collide and clobber each other's snapshots, feeding the wrong ground truth to PM. Switched the default to include PID + $RANDOM suffix (`/tmp/pm-snapshot-<ts>-$$.$RANDOM.md`), which is collision-resistant enough for solo + concurrent dispatches without introducing a mktemp portability concern. - [medium] Snapshot script hard-failed when `origin/main` could not be resolved (detached HEAD, missing remote, fresh local clone). Added a graceful fallback: try `origin/main` -> `main` (local) -> emit `branch_base: unresolved` with a `# warn:` line. The same fallback also degrades `ahead_by` to 0 + warns instead of hard-failing. - [low] BACKLOG.md CC-243 description claimed `project_tooling` was derived from a shell probe `ls Makefile project/backlog.yml 2>/dev/null`, but the script actually emits three fixed boolean keys (`makefile`, `backlog_render_target`, `has_validate_sh`). Updated the row text to match the impl contract and noted the map is extensible. Also updated the default OUT_PATH shape in the description and added pr:TBD ref (PR number unknown until open). - [low] CLI `--help` / unknown-flag paths had no test coverage. Added cli-help (exits 0, usage on stderr), cli-unknown-flag (non-zero, "unknown argument" message), and branch-base-warn-on-missing-origin-main (the medium #2 fix's positive case). Local verify: 31 tests passed (was 29), lint-scripts OK, validator emits 31 E-* lines (= CC-228 baseline unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
screenleon
added a commit
that referenced
this pull request
May 23, 2026
* backlog: add CC-243 (pm-prep-snapshot) + CC-244 (typed pipeline, someday) Captures the agent-routing improvements proposed in the CC-060 retro: - CC-243 (🔵 active, P2): scripts/pm-prep-snapshot.sh writes a typed state snapshot to /tmp/pm-snapshot-<ts>.md before any PM agent spawn. Solves the two HALT classes hit during CC-242: caller brief carrying a stale main commit (#129+#130 had merged after the brief was written) and a ticket-ID collision (CC-241 was already consumed). Schema-like keys (branch_base/ticket_ids_consumed/project_tooling) chosen now to align with the future spike_v1 schema in CC-244. - CC-244 (🟢 someday): typed spike → brief → handover pipeline. Define spike_v1 schema + validator + gen-brief-from-spike helper. Deferred because only one spike doc (CC-060) exists today; schema value scales with N. Promote when 3+ spikes accumulate or an automation use case demands structured spike state. Validator: BACKLOG.md emits 31 E-* lines (unchanged from pre-existing CC-228 baseline; the 4 area/epic enum errors I initially hit on ops/agent-prep + agent-routing were resolved by picking valid tokens ops/arch + design). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cc-243): add pm pre-spawn snapshot capture * fix(cc-243): three pm-prep-snapshot bugs caught by local test run Codex's initial commit (d964e56) had three issues that the test suite caught on first local run: 1. printf '---\n' triggers `printf: --: invalid option` on bash 5.1.16 builtin printf — leading `---` is parsed as a flag prefix. Switched both occurrences to `printf '%s\n' '---'` (lines 246, 271). 2. `${#RECENTLY_MERGED[@]:-0}` is invalid bash syntax — the :- default operator cannot combine with the # length operator on an array. RECENTLY_MERGED is declared as an empty array at line 160, so `${#RECENTLY_MERGED[@]}` returns 0 safely. Dropped the :-0. 3. `collect_backlog_row()` used `awk -F'|'` and ran `gsub` on `$2` to trim whitespace for matching. Modifying `$2` causes awk to rebuild `$0` with the default OFS (space), which stripped all `|` from the emitted row — making the test for `| CC-244 |` literal fail. Fixed by copying `$2` into a local `candidate` variable, so `$0` stays untouched. The CC-243 test passing earlier was a false-positive: the literal `| CC-243 |` happens to appear inside the CC-243 body section's example YAML, so it matched there even though the actual row was malformed. Local verify: bash scripts/test-pm-prep-snapshot.sh -> 29 passed, 0 failed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cc-243): address pr-gate findings (2 medium + 2 low) Gate findings on PR (CC-243 stacked on cc-060): - [medium, 3-reviewer overlap] Default snapshot output path used only a second-resolution timestamp (`/tmp/pm-snapshot-<ts>.md`); concurrent /pm spawns could collide and clobber each other's snapshots, feeding the wrong ground truth to PM. Switched the default to include PID + $RANDOM suffix (`/tmp/pm-snapshot-<ts>-$$.$RANDOM.md`), which is collision-resistant enough for solo + concurrent dispatches without introducing a mktemp portability concern. - [medium] Snapshot script hard-failed when `origin/main` could not be resolved (detached HEAD, missing remote, fresh local clone). Added a graceful fallback: try `origin/main` -> `main` (local) -> emit `branch_base: unresolved` with a `# warn:` line. The same fallback also degrades `ahead_by` to 0 + warns instead of hard-failing. - [low] BACKLOG.md CC-243 description claimed `project_tooling` was derived from a shell probe `ls Makefile project/backlog.yml 2>/dev/null`, but the script actually emits three fixed boolean keys (`makefile`, `backlog_render_target`, `has_validate_sh`). Updated the row text to match the impl contract and noted the map is extensible. Also updated the default OUT_PATH shape in the description and added pr:TBD ref (PR number unknown until open). - [low] CLI `--help` / unknown-flag paths had no test coverage. Added cli-help (exits 0, usage on stderr), cli-unknown-flag (non-zero, "unknown argument" message), and branch-base-warn-on-missing-origin-main (the medium #2 fix's positive case). Local verify: 31 tests passed (was 29), lint-scripts OK, validator emits 31 E-* lines (= CC-228 baseline unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci: add test-pm-prep-snapshot.sh to skill-refine shellcheck exclude Same pattern as the other test-*.sh harness consumers in this exclude list (test-codex-dispatch.sh, test-dispatch-handover.sh, ...): they source scripts/lib/test-harness.sh which exposes shared vars like tmp_root, and that pattern triggers a slew of style-tier shellcheck warnings (SC1090 non-constant source, SC2154 referenced-but-unassigned, SC2120 function arg passthrough). Excluding the file matches existing project policy rather than scattering inline disable directives. CI test-skill-refine job was red on PR #132 because the new file wasn't on the list. test-pm-prep-snapshot.sh itself was clean per the local `bash scripts/lint-scripts.sh` run (which uses a different shellcheck severity profile). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the deferred items from PR #1 review plus a deeper Tier-A pass on token usage and the codex pipeline.
Tier 1 cleanup (initial commits)
git symbolic-ref refs/remotes/origin/HEAD; mark Agent(...) block as illustrative pseudocode.scripts/codex-dispatch.sh, not rawcodex exec.Bash(codex exec)fallback line; main-thread implementation details don't belong in subagent prompts.Agentintools:. Handles inline scalar, YAML flow[a,b,c], and YAML block-list. Verified against fixtures.Tier A — token + codex pipeline (3 added commits)
docs(codex): canonical brief schema + executor enforcementcross-source,sample-N OK re-check,git-status no-collateral-damage,dedup-across-N,schema-match. Good and bad example briefs.fix(pr-gate): single PM hop — main thread classifies via heuristicgrep -vE '\.(md|jsonl|txt)$|^audits/|...') instead of asking PM. PM is invoked once at Step 3 (synthesis only). Bias towardimplementationwhen ambiguous.Pre-merge review
advise→ medium finding (lint missed YAML block-list form) fixed; codex flag duplication fixed; codex-executor README description fixed.advise→ install-time-only enforcement fixed via CI workflow; double PM hop concern fixed via single-hop pr-gate rewrite (Tier A).Test plan
docs/codex-brief.mdschema; codex-executor accepts (or correctly rejects malformed briefs)