feat(cc-250): /pr-gate v2 — machine-readable result + escalation hint#144
Merged
Bundle of 4 gate-infra improvements per the M1 prerequisite ticket.

**A. YAML frontmatter** on every result file (sequential + parallel): `gate_result_version: pr_gate_result_v1`, `final`, `tier`, `mode`, `most_severe`, `reviewers:` map (all 5 with `skipped` for out-of-tier), `escalation:` block. Parallel mode adds `SYNTHESIS_FINAL ↔ frontmatter final:` parity check.

**B. `## Escalation` body section** mirrors frontmatter `escalation:`. Trigger: sensitive-path keyword in diff AND ≥1 reviewer returned advise/block-soft. Both empty + populated cases are valid emissions. Hint-only: auto-escalation execution is out of scope.

**C. `--base` fallback chain** prepends `gh pr view --json baseRefName` when no `--base` given and `gh` is on PATH. Silent fall-through to existing `origin/HEAD → main` chain if absent.

**D. `## Override policy` section** added to each of 5 reviewer agent .md files, consolidating override discipline already prose-scattered across `agents/project-pm.md`. Cross-references the canonical §"User override discipline" section. No code-path change.

Backward-compat: `^Final: (GO|NO-GO)$` line preserved verbatim in `## Gate Conclusion`; validate.sh + downstream parsers unaffected.

Test additions in test-pr-gate.sh:
- frontmatter shape assertions (sequential)
- escalation section presence + parity with frontmatter
- gh pr view fallback (success + absent gh + non-zero gh)
- Final-line uniqueness regression (back-compat)
- 5-reviewer override-policy presence
- existing `synthesis-artifact-tamper-detected` stub updated to emit frontmatter so the new parity check passes before the tamper assertion

`bash pm/scripts/validate.sh BACKLOG.md` parity preserved at 30 baseline (CC-228). `bash scripts/test-pr-gate.sh` 46/46. `bash scripts/test-run-all-tests.sh` 13/13.

BACKLOG.md adds CC-250 index row + body. MILESTONES.md adds `### M1 prerequisite — gate-infra typed surface` sub-table positioning CC-250 as the typed-output contract feeding CC-231 (reviewer-policy extraction) and CC-215 (pmctl) downstream consumers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
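The `--base` fallback chain in item C can be illustrated with a small sketch. This is not the code in `scripts/pr-gate.sh`: the function name `resolve_base` is hypothetical, and stripping the `origin/` prefix from the `origin/HEAD` result is an assumption about what the existing chain returns.

```bash
# Hypothetical helper (not from the PR): resolve the base branch for a gate run.
# Order: explicit --base value, then `gh pr view`, then origin/HEAD, then "main".
resolve_base() {
  local explicit="${1:-}" base=""

  # 1. An explicit --base value always wins.
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
    return 0
  fi

  # 2. Ask gh for the PR's base branch, but only when gh is available.
  #    Absent gh, a non-zero exit, or empty output all fall through silently.
  if command -v gh >/dev/null 2>&1; then
    base="$(gh pr view --json baseRefName --jq .baseRefName 2>/dev/null)" || base=""
    if [ -n "$base" ]; then
      printf '%s\n' "$base"
      return 0
    fi
  fi

  # 3. Existing chain: origin/HEAD first, then a hard-coded "main".
  #    Assumption: the short ref "origin/<name>" is reduced to "<name>".
  base="$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null)" || base=""
  if [ -n "$base" ]; then
    printf '%s\n' "${base#origin/}"
    return 0
  fi
  printf 'main\n'
}
```

Calling `resolve_base "$BASE_ARG"` with an empty argument walks the whole chain. Each step only prints a value when it succeeds, which keeps the fall-through silent as the commit message describes.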
screenleon added a commit that referenced this pull request on May 23, 2026
…close cc-250 (#145)

* docs(cc-251): brief-authoring discipline for multi-file dispatches + close cc-250

CC-251: 3 patterns added to prevent codex apply_patch debug-loop hang on > 4 files OR > 50 lines verbatim briefs:

1. **apply_patch retry-cap** (constraint text): HALT after 2nd consecutive verification failure on the same file; no 3rd retry. Codex has no internal retry-cap, so without this the debug loop runs until dispatch timeout (1800s).
2. **Verbatim-as-attached-file** (pattern): write embedded content (override-policy paragraphs, BACKLOG rows, brief-template fragments) to /tmp/<task>-content/*.md BEFORE dispatch; brief references path + says "copy verbatim, do NOT paraphrase". Eliminates the hallucinate-when-retyping failure mode (CC-250 stderr observed `pass/fail print-format` → `print-format`, with "pass/fail" dropped).
3. **`expected_head_sha` state pin** (schema field): 40-char sha in brief metadata + `git rev-parse HEAD == <sha>` self_verify check. Catches "wrong branch / branch advanced / file changed by another process" before any patch is attempted.

Documented in:
- `agents/project-pm.md`: "Multi-file brief discipline" prose added to the "Writing a brief for codex-executor" section
- `docs/dispatch-brief.md`: `expected_head_sha` as Optional section with usage example
- Memory `[[feedback_codex_brief_discipline]]` (separate repo): retro evidence from CC-247/248 #142 + CC-250 #144 dispatch hangs

For briefs touching > 8 files OR > 200 lines verbatim, ALSO split the dispatch into 2–3 smaller ones (each 2–4 files); split alone without the 3 patterns above doesn't fully prevent the hang.

Also closes **CC-250** as a tail-cleanup (status flip + `pr:TBD` → `pr:#144` + body Outcome / See blocks; mirrors the pattern PR #144 brief explicitly deferred to a follow-up commit). MILESTONES.md M1 prerequisite sub-table flips CC-250 ⏳ → ✅ (#144) and adds CC-251 ⏳.

Long-term resolution: CC-235 (tiered-lifecycle-gate) enforces split mechanically; CC-244 (typed pipeline) turns verbatim into schema fields; CC-215 (pmctl) may add `--expect-head <sha>` wrapper flag.

Validator parity preserved at 30 (CC-228 baseline). No code change; discipline is brief-authoring time, not runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(cc-251): normalize verbatim-length threshold to 50 lines

PR-gate critic [low]: intro states `> 50 lines` trigger but bullet 2 said `> 30 lines`. Standardize to 50 across both. Matches the memory copy at feedback_codex_brief_discipline.md (also updated 30→50).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
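As an illustration of the `expected_head_sha` state pin (pattern 3 above), here is a minimal sketch of a self_verify check. The function name `verify_head_sha`, its optional second argument, and the exit codes are assumptions made for the example; the source only specifies a 40-char sha compared against `git rev-parse HEAD`.

```bash
# Hypothetical self_verify helper for the `expected_head_sha` pin.
# Usage: verify_head_sha <expected-40-char-sha> [actual-sha]
# Returns 0 on match, 1 on mismatch, 2 on a malformed pin or git failure.
verify_head_sha() {
  local expected="$1" actual="${2:-}"

  # Reject anything that is empty or contains a non-hex character.
  case "$expected" in
    *[!0-9a-f]*|"") echo "expected_head_sha is not a hex sha" >&2; return 2 ;;
  esac
  if [ "${#expected}" -ne 40 ]; then
    echo "expected_head_sha must be 40 chars, got ${#expected}" >&2
    return 2
  fi

  # Default to the current checkout when no actual sha is passed in.
  if [ -z "$actual" ]; then
    actual="$(git rev-parse HEAD)" || return 2
  fi

  # A mismatch means wrong branch, advanced branch, or a moved HEAD.
  if [ "$actual" != "$expected" ]; then
    echo "HEAD mismatch: expected $expected, got $actual" >&2
    return 1
  fi
}
```

Running the check before the first patch is attempted means a stale checkout stops the dispatch immediately instead of surfacing later as repeated patch-verification failures.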
screenleon added a commit that referenced this pull request on May 23, 2026
…cc-252 (#146)

* docs(cc-249): spike result — assert_* consolidation decisions

CC-249 spike completes investigation: decides API shape + migration strategy for the divergent assert_* helpers Explore surveyed across 14 test-*.sh files. Impl follows in PR-B (2-dispatch plan documented in spike Cost Estimate).

Decisions (transcribed verbatim into docs/spikes/CC-249.md):
- **Q1** assert_exit signature = `(name, actual, expected)`: majority (4 of 5 files); 1 outlier (test-check-docs-freshness) rewrites in PR-B
- **Q2** assert_contains splits into 3 separate named helpers: `assert_file_contains` (literal), `assert_file_matches` (regex), `assert_string_contains` (string-form). Per breaking-change-for-maintainability: separate names self-document intent
- **Q3** migration = break-and-rewrite. No multi-arity shim, no deprecation period. Internal test infra, bounded blast radius
- **Q4** assert_output_contains merged into assert_string_contains with explicit-arg form (no hidden $LAST_OUTPUT global)
- **Q5** 5 Uncertainties resolved (U1→PR-B audit, U2/U3/U5 scope-out, U4 resolved by Q4)

Includes per-consumer migration matrix + Cost Estimate (~+80 to +200 LoC across ~16 files; net repo line count likely negative due to local-helper deletions).

BACKLOG row flipped ⏸ deferred → 🟢 someday: impl is now fully scoped, only sequencing gates PR-B. Per pm-schema, 🟢 someday is the bare-token enum for "scoped, ready, no urgent driver".

Validator parity preserved at 30 (CC-228 baseline). No code change; docs-only PR.

Brief applied CC-251 patterns: expected_head_sha verified before first edit; apply_patch retry-cap constraint included; verbatim-as-attached-file pattern N/A (brief touched only 2 files).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(backlog): file CC-252 — /pr-gate Final: line emission needs hardening

Discovered while running /pr-gate on this branch (CC-249 spike): codex applied prose markdown emphasis to the Final line as `**Final: GO**`, but pr-gate.sh's back-compat parity check uses `^Final: (GO|NO-GO)$` (no leading `**`), so the gate script exits 1 even though the verdict is GO. CC-252 captures the brief-template hardening needed in the CC-250 (#144) brief inside scripts/pr-gate.sh. Out of scope for this PR (CC-249 spike doc); filed for separate impl.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
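To make the CC-249 helper decisions (Q1, Q2, Q4) concrete, the sketch below shows one possible shape for the consolidated helpers. Only the `assert_exit` signature `(name, actual, expected)` and the three helper names come from the spike summary; the argument order of the other helpers, the `_record` function, and the `PASS`/`FAIL` counters are assumptions for illustration, and the real implementation is deferred to PR-B.

```bash
# Sketch only: helper bodies and the PASS/FAIL counters are hypothetical.
PASS=0
FAIL=0

# Hypothetical internal recorder: _record <name> <ok|fail> [detail]
_record() {
  if [ "$2" = ok ]; then
    PASS=$((PASS + 1)); echo "PASS: $1"
  else
    FAIL=$((FAIL + 1)); echo "FAIL: $1 ${3:-}"
  fi
}

# Q1: signature is (name, actual, expected).
assert_exit() {
  if [ "$2" -eq "$3" ]; then _record "$1" ok
  else _record "$1" fail "(exit $2, expected $3)"; fi
}

# Q2: literal substring in a file. Assumed order: (name, file, literal).
assert_file_contains() {
  if grep -qF -- "$3" "$2"; then _record "$1" ok
  else _record "$1" fail "(literal not found in $2)"; fi
}

# Q2: extended regex in a file. Assumed order: (name, file, regex).
assert_file_matches() {
  if grep -qE -- "$3" "$2"; then _record "$1" ok
  else _record "$1" fail "(regex not matched in $2)"; fi
}

# Q2 + Q4: the string to search is passed explicitly, with no hidden
# $LAST_OUTPUT global. Assumed order: (name, haystack, needle).
assert_string_contains() {
  case "$2" in
    *"$3"*) _record "$1" ok ;;
    *)      _record "$1" fail "(substring not found)" ;;
  esac
}
```

A call such as `assert_file_matches "final line" result.md '^Final: (GO|NO-GO)$'` would also flag the CC-252 case, since `**Final: GO**` does not match the anchored pattern.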
Summary
v0.3.0 M1 prerequisite. Bundle of 4 gate-infra improvements that make the `/pr-gate` result file a typed contract, preparing the surface that CC-231 (reviewer-policy extraction) and CC-215 (pmctl) will consume.

- YAML frontmatter (`gate_result_version: pr_gate_result_v1` + final/tier/mode/most_severe/reviewers map + escalation block): `scripts/pr-gate.sh` sequential + parallel brief templates
- `## Escalation` body section. Trigger: sensitive-path keyword in diff AND ≥1 reviewer returned advise/block-soft. Hint-only (no auto-execution)
- `--base` fallback chain prepends `gh pr view --json baseRefName` when no `--base` given and `gh` is on PATH; silent fall-through to existing `origin/HEAD → main` chain
- `## Override policy` section in each of 5 reviewer agent .md files consolidating override discipline already prose-scattered in `agents/project-pm.md`: `agents/{critic,qa-tester,architecture-reviewer,security-reviewer,risk-reviewer}.md`

Backward compatibility
- `^Final: (GO|NO-GO)$` line preserved verbatim in `## Gate Conclusion`; validate.sh + downstream parsers unaffected.
- `SYNTHESIS_FINAL` MUST match frontmatter `final:` field.

Meta-test
Ran `/pr-gate standard` on this branch; the gate produced its OWN result file in the new format. Final: GO (critic + qa-tester + architecture-reviewer all approve, zero findings).
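The relationship between the frontmatter `final:` field and the back-compat `Final:` line can be sketched as a small check. This is an assumption-laden illustration, not the parity check in `scripts/pr-gate.sh`: the function name `check_final_parity` is hypothetical, and the result-file layout (frontmatter delimited by `---` on the first line, one `Final:` line in the body) is inferred from the description above.

```bash
# Hypothetical check: frontmatter `final:` must agree with the single
# `Final: (GO|NO-GO)` line in the body. The file layout is an assumption.
check_final_parity() {
  local file="$1" fm_final body_final count

  # Value of `final:` inside the leading `---` ... `---` frontmatter block.
  fm_final="$(awk '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---"   { exit }
    in_fm && /^final:/     { sub(/^final:[ \t]*/, ""); print; exit }
  ' "$file")"

  # The back-compat line must appear exactly once, unadorned.
  # `|| true` keeps grep's exit status 1 (zero matches) from aborting callers.
  count="$(grep -cE '^Final: (GO|NO-GO)$' "$file" || true)"
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one Final line, found $count" >&2
    return 1
  fi
  body_final="$(grep -E '^Final: (GO|NO-GO)$' "$file")"
  body_final="${body_final#Final: }"

  if [ "$fm_final" != "$body_final" ]; then
    echo "parity failure: frontmatter=$fm_final body=$body_final" >&2
    return 1
  fi
}
```

Because the pattern is anchored at both ends, a line with markdown emphasis or trailing text counts as zero matches and the check fails, which is the strictness downstream parsers rely on.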
Test plan
- `bash scripts/test-pr-gate.sh`: 46/46 (existing 41 + new 5 covering frontmatter / escalation / gh-pr-view / Final-line uniqueness / override-policy)
- `bash scripts/test-run-all-tests.sh`: 13/13 integration
- `bash pm/scripts/validate.sh BACKLOG.md`: parity 30 (CC-228 baseline)
- `/pr-gate standard` self-gate: Final: GO, frontmatter present, escalation false (no sensitive paths)
- `shellcheck --severity=style scripts/pr-gate.sh`: not installed locally; CI will run

CC-250 hang recovery note
Initial codex dispatch hit a debug loop on `apply_patch verification failed` (same pattern as #142): codex's internal model of the file drifted from disk state after several edits, then it retried patches against stale context until self-recovering. The dispatch did finish (exit 0) without main-thread intervention this time. One test stub (`synthesis-artifact-tamper-detected`) needed a one-line frontmatter fix afterwards because the new parity check rejected its frontmatter-less synthesis stub output before the tamper assertion could run.

Going forward, brief-authoring discipline improvements are planned (retry-cap constraint, verbatim-content-as-attached-file pattern, `expected_head_sha` state pin); see CC-244 / CC-235 typed-pipeline + tiered-lifecycle-gate roadmap for the structural fix.

Out of scope
- `verdict: {level, override_policy}` enum data model → CC-231 (M1 core/policy)
- `backlog_candidates` output → CC-215 (pmctl)

🤖 Generated with Claude Code