feat(plan-rollout): /plan-rollout MVP — decomposition-as-artifact #1424
Open
mastermanas805 wants to merge 12 commits into
The semantic-contract-graph schema. Optional input to /plan-rollout: declares role-level contracts (for example, auth mints the session tokens that middleware enforces; the contract breaks if the token format changes without a coordinated deploy). Distinct from the import graph (discovered at runtime). Repo-wide, long-lived, hand-authored.
This commit lands the spec only. The consuming skill ships in the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
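A minimal sketch of what a role-level contract and an informational reconciliation pass could look like. The actual SYSTEM.md schema is not shown in this conversation, so the `Contract` fields, the `reconcile` function, and the role-to-slice mapping are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Contract:
    """Hypothetical shape of one role-level contract (not the real schema)."""
    producer: str   # role that creates the artifact, e.g. "auth"
    consumer: str   # role that depends on it, e.g. "middleware"
    breaks_if: str  # condition under which the contract breaks


def reconcile(contracts: List[Contract], role_to_slice: Dict[str, int]) -> List[str]:
    """Return informational flags for contracts split across slices.

    `role_to_slice` maps each role touched by the diff to the slice it landed in.
    Nothing is blocked; the caller only receives human-readable warnings.
    """
    flags = []
    for contract in contracts:
        producer_slice = role_to_slice.get(contract.producer)
        consumer_slice = role_to_slice.get(contract.consumer)
        # Only flag when both roles are touched and end up in different slices.
        if producer_slice is None or consumer_slice is None:
            continue
        if producer_slice != consumer_slice:
            flags.append(
                f"{contract.producer} -> {contract.consumer}: split across slices "
                f"{producer_slice} and {consumer_slice}; breaks if {contract.breaks_if}"
            )
    return flags


if __name__ == "__main__":
    session_tokens = Contract(
        producer="auth",
        consumer="middleware",
        breaks_if="token format changes without a coordinated deploy",
    )
    for flag in reconcile([session_tokens], {"auth": 1, "middleware": 2}):
        print(flag)
```

Returning plain strings rather than raising keeps the pass advisory, which matches the "spec only, optional input" framing of the commit.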
Decomposition-as-artifact. Reads the working diff (committed + staged + unstaged + untracked) plus SYSTEM.md if present, writes decomposition.md with per-slice file lists, reader-time estimates, dependency edges, and contract-graph reconciliation flags.
Positioned as the post-decision consumer to /plan-pull-request:
- /plan-pull-request decides shape in conversation (pre-code).
- /plan-rollout analyzes a real diff and writes the artifact (post-code).
Triggers narrowed to "decompose the diff" / "write a decomposition" / "plan-rollout" to avoid collision with /plan-pull-request's pre-decision triggers.
MVP boundaries (explicit):
- No rollout.md, no /spill-check, no /ship-/review integrations.
- No SYSTEM.md scaffolder; humans write the schema by hand or copy the example.
- Reconciliation is informational, never blocking.
- Step 2 explicitly handles uncommitted working-tree state via `git diff <base>` (not `<base>...HEAD`) plus `git ls-files --others --exclude-standard` for untracked.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
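The working-diff collection described in this commit can be sketched roughly as follows. The two git invocations are the ones named in the commit message (with `--name-only` added to get paths instead of patch text); the function names and the injectable `git` runner are assumptions made so the logic can be exercised without a repository.

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def working_diff_paths(
    base: str, git: Callable[[List[str]], str] = run_git
) -> List[str]:
    """Paths changed relative to `base`, including uncommitted and untracked files."""
    # Comparing the working tree against <base> covers committed, staged, and
    # unstaged changes to tracked files. `<base>...HEAD` would only see commits.
    tracked = git(["diff", "--name-only", base]).splitlines()
    # Untracked files never show up in `git diff`, so list them separately,
    # honouring .gitignore via --exclude-standard.
    untracked = git(["ls-files", "--others", "--exclude-standard"]).splitlines()
    # De-duplicate, drop blank lines, and sort for stable output.
    return sorted({path for path in tracked + untracked if path})


if __name__ == "__main__":
    # "main" is an assumed base branch name.
    for path in working_diff_paths("main"):
        print(path)
```

Passing the runner as a parameter is only a testing convenience; a real skill step would most likely shell out directly.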
…s.md
Add the one-line registry entries the skill-validation tests
("every skill is documented") expect. Positions /plan-rollout in the
plan-mode review group (alongside /plan-eng-review, /plan-tune) with
its specialist label "Decomposition Analyst" in docs/skills.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Worked example: manually walked /plan-rollout against garrytan#1241 (fix(ask-user): keep question payloads compact): 41 files, +661 / -282, regeneration-heavy.
Verdict the skill should emit: "one PR." 39 of 41 files are deterministic regenerations of one source change in scripts/resolvers/preamble/generate-ask-user-format.ts; slicing them off would leave dependent fragments. Reader time: ~21 min (under cap).
Findings (v1.1 todos) surfaced by the dogfood:
1. Deterministic-regeneration detector: merge build-output slices into their source slice automatically.
2. Regen-multiplier on reader-time so skim-only output isn't over-counted.
3. --explain mode: when verdict is "one PR," print the rejected slicing alternatives and the signals that rejected them.
4. Calibration loop: predicted vs actual reader-time on first ~10 real invocations to ground v2 heuristics in data.
SYSTEM.md is explicitly called out as the WRONG primitive for catching build-output coupling; that needs a Makefile-style dependency, not a contract graph.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
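Finding 1 (merging build-output slices into their source slice) might look something like the sketch below. It is not the skill's implementation: the glob patterns, the slice names, and every file path except `scripts/resolvers/preamble/generate-ask-user-format.ts` are hypothetical, and the caller is assumed to know which slice holds the source change.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Assumed patterns for files that are regenerated from a source change.
GENERATED_PATTERNS = ["*/SKILL.md", "test/fixtures/golden/*"]


def is_generated(path: str) -> bool:
    """True if the path matches one of the assumed build-output patterns."""
    return any(fnmatchcase(path, pattern) for pattern in GENERATED_PATTERNS)


def merge_regenerations(
    slices: Dict[str, List[str]], source_slice: str
) -> Dict[str, List[str]]:
    """Fold every generated file into `source_slice` and drop emptied slices."""
    merged: Dict[str, List[str]] = {name: [] for name in slices}
    merged.setdefault(source_slice, [])
    for name, files in slices.items():
        for path in files:
            # Generated files travel with their source; everything else stays put.
            target = source_slice if is_generated(path) else name
            merged[target].append(path)
    return {name: files for name, files in merged.items() if files}


if __name__ == "__main__":
    naive = {
        "skills": ["ship/SKILL.md", "review/SKILL.md"],  # hypothetical paths
        "preamble": ["scripts/resolvers/preamble/generate-ask-user-format.ts"],
        "golden": ["test/fixtures/golden/ship.md"],      # hypothetical path
        "tests": ["test/ask-user.test.ts"],              # hypothetical path
    }
    for name, files in merge_regenerations(naive, "preamble").items():
        print(name, len(files))
```

With the regenerations folded in, the naive four-bucket split collapses to the source slice plus its tests, which is the direction the "one PR" verdict points in.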
/plan-pull-request is not a gstack-shipped skill; it's a separately installed global skill. Referencing it as a sibling in our SKILL.md and docs/skills.md created a dangling dependency from the maintainer's perspective.
The skill stands on its own: it reads a real diff and writes decomposition.md. Whatever pre-decision workflow the user runs beforehand is the user's setup, not this skill's documented contract.
- Stripped "Relationship to /plan-pull-request" section from plan-rollout/SKILL.md.tmpl
- Removed the "If invoked before code exists, point at /plan-pull-request" redirect; now a simple "nothing to decompose, write a slice first" exit
- Reworded the docs/skills.md table row to describe what the skill does on its own, no external pairing claims
- Tightened the description frontmatter accordingly
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the SCREAMING_SNAKE_CASE convention used by other topic docs in docs/ (ADDING_A_HOST.md, OPENCLAW.md, REMOTE_BROWSER_ACCESS.md, ON_THE_LOC_CONTROVERSY.md). The hyphenated form was carried over from closed PR garrytan#1417 and matched no existing convention in this directory. No content changes — pure rename. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the versioned design-doc convention in docs/designs/ (PLAN_TUNING_V0.md, PLAN_TUNING_V1.md, SELF_LEARNING_V0.md, PACING_UPDATES_V0.md). The original PLAN_ROLLOUT_DOGFOOD.md filename introduced a new "_DOGFOOD" suffix that didn't match any existing pattern and read like an evidence appendix rather than a design doc.
Restructure:
- New "Design" section at the top describing what /plan-rollout is, what v0 ships, and what's deferred to v1.1+
- "Dogfood: PR garrytan#1241" section retains the worked example (file breakdown, reader-time estimate, verdict, findings)
- New "v1.1 roadmap" section consolidates the four follow-up todos
All original dogfood content preserved verbatim under its new section heading.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 11, 2026
Trim ceremonial prose. Keep load-bearing content: intro, schema, field reference (now a table), example, "how /plan-rollout uses it", out-of-scope. Drop redundant "what it is / what it isn't" expansion, "relationship to other declarative files" table, separate scaffolding section. No semantic change to the schema or the skill's contract with it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop "What v0 got right" (self-congratulation), "What v0 proves" (redundant), and the long prose framing on each finding. Collapse findings into one-sentence-each v1.1 backlog. All concrete content preserved: problem, design, what ships, dogfood table, verdict, four findings, limit-surfaced. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tighten step prose. All 8 steps + self-check + limits preserved semantically. Behavior unchanged — same bash commands, same priority order in slice ranking, same verdict-first design. Combined with the docs/ compressions, total substantive diff drops 701 → 414 lines (-41%). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… tree
Per OSS convention, design rationale and dogfood evidence live in the PR description, not as checked-in cruft. The V0.md content has been folded into PR garrytan#1424's description, where reviewers actually look. 81 lines off the reviewable diff.
If long-form design rationale is wanted in-repo later, it can land as a follow-up, but only if a real consumer (a feature that depends on the rationale) ships at the same time.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply Anthropic skill-authoring guidance from their public docs:
"every line is a recurring token cost; if a competent reader wouldn't
miss it, remove it." State what to do, drop the why/narration. Trust
that the reader is Claude.
Cuts:
- "What this skill does/doesn't do" prose framing (replaced by terse
bullets)
- Per-step rationale paragraphs ("These edges are how you order slices
because...") → kept the rule, dropped the explanation
- Repeated "no slicing" hedging across multiple sections → one source
of truth in the When-to-invoke section
Behavior unchanged. Generated SKILL.md drops 1011 → 897 lines (~10%
fewer tokens at every skill invocation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author
@garrytan I have recreated the PR with the changes requested in #1192 (comment).
Third pass at #1192. Foundation-only doc-PR (#1417) was correctly closed as "literature without a consumer." This PR lands the consumer.
Summary
- `/plan-rollout` (`plan-rollout/SKILL.md.tmpl`, 179 lines). Reads the working diff (committed + staged + unstaged + untracked) plus `SYSTEM.md` if present, writes `decomposition.md` with per-slice file lists, reader-time estimates, dependency edges, and reconciliation flags.
- `docs/SYSTEM_MD.md` (129 lines). Optional input: the skill falls back to path heuristics + import-graph discovery when absent.
Size discipline
After the compression pass (latest 2 commits):
The 896-line gap between substantive and raw is the generated `plan-rollout/SKILL.md`: deterministic output of the template, a reviewer skim.
Per OSS PR-size research (SmartBear/Cisco, Google internal): review effectiveness drops sharply beyond 400 changed lines. Substantive content here (310 lines) is well inside the healthy band.
Dogfood: PR #1241 — "is this one PR or three?"
Manually walked `/plan-rollout`'s logic against garrytan/gstack#1241 (fix(ask-user): keep question payloads compact, 41 files, +661 / −282).
Verdict the skill should emit: one PR. 39 of 41 files are deterministic regenerations of one source change in `scripts/resolvers/preamble/generate-ask-user-format.ts`. Not independently shippable: splitting them off leaves dependent fragments. Reader time: ~21 min, under the 30-min cap.
Bucketing (no SYSTEM.md): `*/SKILL.md` regenerations (36 files, +576/−252) · `scripts/resolvers/preamble/` (1 file, the fix, +16/−7) · `test/fixtures/golden/` regenerations (3, +54/−27) · `test/` tests (2, +31/−3).
v1.1 backlog surfaced by this dogfood (none required for v0):
- `*/SKILL.md` is mechanical output of `*/SKILL.md.tmpl`. Without the detector a naive operator could still ship "source + regenerations" as two PRs.
- `--explain` mode. When the verdict is "one PR," print the rejected slicing alternatives and the signal that rejected each.
Honest limit surfaced: `SYSTEM.md` is not the right primitive for build-output coupling; that's a Makefile-style dependency, not a contract graph. The schema doc says so.
Self-dogfood
Ran `/plan-rollout` against this branch. Output: a 50-line "one PR" verdict written to `~/.gstack/projects/`. The skill correctly did not manufacture a multi-slice stack for a tightly-coupled set of files.
Explicit MVP boundaries
Out of scope (deferred; will only land with consumers):
- `rollout.md` (rollout/rollback strategy + inverse-rollback auto-gen)
- `/ship` and `/review` integrations
- `SYSTEM.md` scaffolder
Reverting v0 is `git rm -r plan-rollout/ docs/SYSTEM_MD.md`. Schema and registry entries revert independently if either piece doesn't fit.
Relationship to other plan-* skills
- `/plan-ceo-review` / `/plan-eng-review`
- `/plan-rollout` (this PR): `decomposition.md` artifact
`/plan-rollout` complements the plan-* family without duplicating them: they review the plan, this analyzes the diff.
Test plan
- `bun test test/skill-validation.test.ts test/gen-skill-docs.test.ts`: 704/704 pass
- `bun run gen:skill-docs --host all` generates clean for claude + 7 external hosts
- `/plan-rollout` against this branch, verdict correctly "one PR"
Pre-existing test failures in `test/*gbrain-sync*` are present on `main` (verified by stashing the change and re-running on `main`).
Commits (bisectable)
10 commits in three logical waves:
Refs #1192. Supersedes #1417.
🤖 Generated with Claude Code