plan: Phase 0 design-lock for quality-gate structured-marker contract by cbeaulieu-gt · Pull Request #226 · glitchwerks/github-actions

cbeaulieu-gt · 2026-05-07T01:18:24Z

Summary

Phase 0 design-lock for issue #185 (Adopt structured marker contract for quality-gate). Ships the implementation plan only; no code changes. Phases 1-4 are separate PRs.

Refs #185, #183, #223. Closures land in Phase 4 (#185, #183, #223).

Background

Issue #223 is the concrete reproducer: PR #222's review used ### Critical Issues Found 🚨 heading + **two critical inconsistencies** lowercase phrasing; the gate's prose-regex didn't match either pattern; gate posted success despite blocker findings. The strategic fix has been long-tracked as #185 (Option A: HTML-comment JSON), but no schema or rollout had been locked. This plan locks 8 design decisions and phases the implementation.

Locked design decisions (full table in plan file)

#	Decision	Lock
1	Schema	JSON block, sentinel `claude-pr-review-summary-v1`, required `schemaVersion` + `findings.{critical,high,medium,low}`
2	Marker placement	Last in review body; persona reserves ~2 final turns (honor system)
3	Verdict source	Derived by gate (`blocking` if `critical>0 OR high>0`)
4	Shadow mechanism	Second commit status `claude-pr-review/quality-gate-shadow`, advisory
5	Cutover	≥20 reviews / ≥5 PRs / zero disagreement / zero marker-missing
6	Phase 4 marker-missing	Fail-closed; recovery via `gh run rerun`
7	#183 disposition	Open through Phase 3, close at Phase 4
8	Plan location	`docs/superpowers/plans/2026-05-06-quality-gate-structured-marker.md`

Phasing

Phase	PR shape	Closes
0 (this PR)	Plan file only	—
1	Persona emits marker; rebuild review overlay; repoint digest pin	—
2	Gate runs both regex (auth) + marker (advisory shadow status); fixture corpus seeded with PR #222 review body	—
3	Cutover — marker authoritative; regex retained as fallback for missing markers	—
4	Cleanup — remove regex; marker-missing fails closed	#185, #183, #223

Why Option A (HTML-comment JSON), not B or C

#185 originally proposed three options:

A (HTML-comment JSON) — the locked choice. Fully implementable in this repo with no upstream cooperation; jq-parseable; survives Markdown rendering invisibly.
B (action step outputs) — would require upstream anthropics/claude-code-action cooperation; out of our control.
C (separate artifact) — decouples comment from signal; overkill for current needs.

Files

docs/superpowers/plans/2026-05-06-quality-gate-structured-marker.md (+406 lines, new file)

Test plan

Plan file committed to issue-185-quality-gate-structured-marker (not main)
All 8 locked decisions preserved verbatim in the plan
Phase 4 acceptance criteria includes closes #185, closes #183, closes #223
Out-of-scope section explicitly names Quality-gate fails open on transient gh api failure #182 and multi-bot collision
Reviewer eye on whether the JSON schema matches what the persona will reasonably emit (no surprise required-fields gotchas)
Reviewer eye on whether the cutover criterion (≥20/≥5/zero) is achievable given current PR cadence
Reviewer eye on Phase 1's acceptance — manual test PR is the validation, not CI

Out of scope (carried forward in plan)

Quality-gate fails open on transient gh api failure #182 (gate fails open on transient gh api failure) — orthogonal failure mode
Multi-bot comment-selection ambiguity — separate concern
Upstream cooperation with anthropics/claude-code-action — Option A doesn't need it

🤖 Generated by Claude Code on behalf of @cbeaulieu-gt

Phase 0 design-lock for issue #185 (Adopt structured marker contract for quality-gate). Implements Option A (HTML-comment JSON block). Locks 8 design decisions made during planning conversation: schema format, marker placement, verdict source, shadow-mode mechanism, cutover criterion, Phase 4 marker-missing handling, #183 disposition, plan-file location. Phases 1-4 are separate PRs; this one ships the plan only. References #222/#223 reproducer; PR #222's review body becomes Phase 2's regression fixture. Refs #185, #183, #223. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

github-actions · 2026-05-07T01:18:53Z

Claude finished @cbeaulieu-gt's task in 2m 44s —— View job

PR Review Complete

I've reviewed the Phase 0 design-lock plan for the quality-gate structured-marker contract. This is a well-structured design document with clear decisions and phasing. Below are my findings organized by severity.

✅ Strengths

Excellent phasing strategy: Shadow-mode (Phase 2) before cutover (Phase 3) is the right approach for de-risking the transition
Clear locked decisions: All 8 decisions are explicitly documented and authoritative
Strong rationale for Option A: HTML-comment JSON is the right choice given the constraints
Fail-closed posture: Phase 4's strict marker-missing handling is appropriate after the shadow period
Test fixtures: Using PR docs: refresh consumer-facing README for Phase 5+ container-pinned workflows #222 as the regression case directly addresses issue quality-gate reports success on review with explicit "two critical" language and "### Critical Issues Found" heading #223
Explicit scope boundaries: "Out of scope" section prevents feature creep

🟡 Medium-Priority Concerns

1. Schema validation specification missing (Phase 2)

Location: Lines 226-234 (Phase 2 deliverable)

The plan specifies what the marker parser should do but not how it validates the schema. Should explicitly specify:

Field presence validation (all 4 required fields present?)
Type validation (all values are integers?)
Range validation (reject negative numbers? maximum value?)
Malformed JSON handling (syntax errors vs missing fields)

Recommendation: Add a "Schema validation rules" subsection under "Marker schema (v1)" that specifies:

- All fields (critical, high, medium, low) MUST be present
- All values MUST be non-negative integers
- Values outside [0, 2^31-1] are rejected as invalid
- Missing fields → marker_invalid
- Syntax errors → marker_invalid

2. Cutover criterion tracking mechanism unclear

Location: Line 291-295 (Phase 3 cutover criterion)

Phase 3 requires "≥20 reviews across ≥5 PRs" to be documented, but there's no aggregator script mentioned (unlike Phase 2-3 which has marker-emission-aggregate.yml). The PR body will need to link to workflow runs that prove this criterion is met.

Recommendation: Add to Phase 2's deliverables: extend marker-emission-aggregate.yml to also track successful reviews (not just marker_missing events) so the Phase 3 cutover data is automatically available.

3. Phase 1 manual validation is fragile

Location: Lines 208-209 (Phase 1 acceptance)

"After merge, the next 5 PR reviews on this repo each contain the marker block (manual visual check)" assumes the persona will work correctly. If there's a turn-budget regression, reviews might silently omit the marker.

Recommendation: Consider adding a Phase 1.5 validation script that greps recent bot comments for the marker sentinel and alerts if missing. This could run weekly and open an issue, similar to overlay-smoke.yml.

4. Backward compatibility for future schema versions unspecified

Location: Lines 136-138 (Backward compatibility)

The plan says "The gate MUST accept schemaVersion: 1 indefinitely" but doesn't specify what happens when the persona emits schemaVersion: 2. Will the gate:

Reject it as invalid?
Fall back to prose-regex?
Attempt to parse known fields and ignore unknown ones?

Recommendation: Add a forward-compatibility rule to the schema section:

The gate MUST reject unknown schemaVersion values (>1) as marker_invalid until
the gate is updated to handle that version. This prevents silent misparsing.

5. Phase 4 K=10 tracking has no automation

Location: Line 353 (Phase 4 acceptance)

"K=10 consecutive marker-authoritative runs across ≥3 distinct PRs since Phase 3 cutover documented in PR body" has no tracking script mentioned.

Recommendation: Reuse the marker-emission-aggregate.yml from Phase 2 or add a step to count successful marker-derived verdicts. Otherwise this is purely manual and error-prone.

🔵 Low-Priority / Clarifications

6. Turn reservation discipline is vague

Location: Line 58 (locked decision #2)

"Reserve final ~~2 turns for marker emission" — the tilde (~~) makes this ambiguous. Is it 2 turns or 3?

Recommendation: Change to "Reserve final 2-3 turns" or "Reserve at least 2 final turns" for clarity.

7. Verdict rule doesn't explicitly state the zero-findings case

Location: Lines 125-130 (Verdict rule)

The rule says if critical > 0 OR high > 0: blocking, else: clean, but doesn't explicitly say "all zeros → clean". This is implied but could be clearer.

Recommendation: Add an explicit example:

Example: findings: {critical: 0, high: 0, medium: 0, low: 0} → verdict "clean" → state=success

8. Multi-bot collision acknowledged but not mitigated

Location: Lines 371-374 (Out of scope)

The plan explicitly defers multi-bot scenarios. This is fine for Phase 0, but the gate currently fetches "most recent comment authored by github-actions[bot]" (line 238-249 of pr-review/action.yml). If multiple bots start posting, this could break.

Recommendation: Add a note in "Out of scope" that if multi-bot support is needed in the future, the marker sentinel should include a bot-specific prefix (e.g., claude-pr-review-summary-v1-glitchwerks-bot).

9. PR #222 comment ID should be verified

Location: Line 142-143 (Regression fixture)

The plan references "issue-comment id 4392784955" as PR #222's review body. Should verify this is the correct comment before using it as the regression fixture in Phase 2.

Recommendation: In Phase 2 PR, include a verification step that fetches this comment ID and confirms it contains the expected text ("Critical Issues Found 🚨").

📋 Questions for Author (from PR body)

The PR body asks for reviewer input on two questions:

Q1: "Reviewer eye on whether the JSON schema matches what the persona will reasonably emit"

✅ Answer: The schema is minimal and reasonable. The persona should be able to emit this reliably. However:

Add explicit validation rules for negative numbers (see concern feat: scope PR reviews to incremental diff on synchronize events #1)
Consider whether schemaVersion should allow the persona to skip it (defaulting to 1) or if it's strictly required

Q2: "Reviewer eye on whether the cutover criterion (≥20/≥5/zero) is achievable given current PR cadence"

⚠️ Cannot assess without data: I don't have access to historical PR metrics for this repo. The cutover criterion requires:

20 reviews across 5 PRs
Zero disagreements between prose-regex and marker
Zero marker-missing events

If the repo averages 5+ PRs per month with automated reviews, this is achievable in 1-2 months. If cadence is lower, it could take 3-6 months. Recommendation: Run a historical query to check how many PRs with automated reviews occurred in the last 3 months.

✅ Phase 0 Acceptance Criteria Status

Checking against the plan's own Phase 0 acceptance criteria (lines 161-164):

Plan file is committed to main via a merged PR — pending (this PR)
All eight locked design decisions are present in the plan verbatim — ✅ confirmed (lines 55-64)
No code, persona, or workflow changes in this PR — ✅ confirmed (only the plan file added)
PR body references Adopt structured marker contract for quality-gate (replace grep) #185, Validate quality-gate severity-marker regex against anthropics/claude-code-action actual output #183, quality-gate reports success on review with explicit "two critical" language and "### Critical Issues Found" heading #223 with Refs (not Closes) keywords — ✅ confirmed in PR body

🎯 Recommendation

APPROVE with requested changes. The plan is sound and well-structured. Address the medium-priority concerns (especially #1 schema validation and #2 cutover tracking) before Phase 1 implementation. The low-priority items can be refined during Phase 1-2 implementation.

📝 Summary

This is a high-quality design document that locks a concrete solution to the prose-regex fragility problem. The phasing is well-thought-out, and the fail-closed approach in Phase 4 is the right call. The main gaps are around schema validation specification and automated tracking for the cutover criteria. Once those are addressed in the plan or in Phase 1/2 implementation, this should be a smooth rollout.

cbeaulieu-gt mentioned this pull request May 7, 2026

Reviewer persona must emit binary APPROVE/BLOCK verdict — no "approve with requested changes" or softened forms #227

Closed

7 tasks

cbeaulieu-gt merged commit f396c27 into main May 7, 2026
4 checks passed

cbeaulieu-gt deleted the issue-185-quality-gate-structured-marker branch May 7, 2026 01:51

cbeaulieu-gt mentioned this pull request May 7, 2026

persona: emit binary verdict + structured marker block (#185 Phase 1, #227) #228

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

plan: Phase 0 design-lock for quality-gate structured-marker contract#226

plan: Phase 0 design-lock for quality-gate structured-marker contract#226
cbeaulieu-gt merged 1 commit intomainfrom
issue-185-quality-gate-structured-marker

cbeaulieu-gt commented May 7, 2026

Uh oh!

github-actions Bot commented May 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

cbeaulieu-gt commented May 7, 2026

Summary

Background

Locked design decisions (full table in plan file)

Phasing

Why Option A (HTML-comment JSON), not B or C

Files

Test plan

Out of scope (carried forward in plan)

Uh oh!

github-actions Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review Complete

✅ Strengths

🟡 Medium-Priority Concerns

1. Schema validation specification missing (Phase 2)

2. Cutover criterion tracking mechanism unclear

3. Phase 1 manual validation is fragile

4. Backward compatibility for future schema versions unspecified

5. Phase 4 K=10 tracking has no automation

🔵 Low-Priority / Clarifications

6. Turn reservation discipline is vague

7. Verdict rule doesn't explicitly state the zero-findings case

8. Multi-bot collision acknowledged but not mitigated

9. PR #222 comment ID should be verified

📋 Questions for Author (from PR body)

✅ Phase 0 Acceptance Criteria Status

🎯 Recommendation

📝 Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 7, 2026 •

edited

Loading