From 6fedaaf8180e79478dd0b7b22abe8c695cdb5644 Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 17:54:41 -0400 Subject: [PATCH 1/9] docs(adversarial-design-review): design for Existence/runtime-validity bug-class (#55) --- ...stence-runtime-validity-bugclass-design.md | 141 ++++++++++++++++++ 1 file changed, 141 insertions(+) create mode 100644 docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md new file mode 100644 index 0000000..6a008f6 --- /dev/null +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md @@ -0,0 +1,141 @@ +# Existence / Runtime-Validity Bug-Class Design + +**Status:** Approved (autonomous — user pre-authorized full-pipeline execution) +**Date:** 2026-05-31 +**Issue:** https://github.com/GoCodeAlone/autonomous-dev-kit/issues/55 + +## Problem + +`adversarial-design-review` scans a fixed checklist of bug-classes. Two recent +retros independently hit the same gap the checklist does not cover: **the review +verifies the intended content/shape of an artifact but never verifies the +artifact actually EXISTS or RUNS as the design assumes.** + +Evidence (2-retro trend, both *gate-asserted-shape, reality-differed*): + +1. **wfctl smart CI generation** (`docs/retros/2026-05-30-wfctl-secrets-wizard-and-smart-ci-retro.md`): + generated CI steps were shape-valid but **non-functional at runtime** — + emitted `wfctl ci run --phase migrate` (no such phase) and a `... || true` + plan-guard that gated nothing. Caught late (real-repo regen + code review), + not at design. +2. 
**required_secrets sweep** (`docs/retros/2026-05-31-plugin-required-secrets-sweep-retro.md`): + design assumed all 18 target plugins had a `workflow-registry` manifest to + edit; **3 (entra/scalekit/auth0) had none** — discovered at execution (jq + failed on a missing file), forcing a mid-execution scope-lock amendment. One + `ls plugins//manifest.json` at design time would have caught it. + +The two adjacent existing classes don't cover it: +- `Plugin-loader runtime layout` → about the plugin *binary* layout, not "does + the target artifact exist". +- `Config-validation schema rules` → about new config files satisfying a schema, + not "does the generated artifact execute" or "does the thing I'm mutating + exist". + +## Goals + +- Add one bug-class that flags any design/plan asserting artifact *content* + correctness without an *existence* (for mutated artifacts) or *behavior* + (for generated artifacts) check. +- Scan it in **both** review phases (design + plan), per the issue title. +- Match the existing rows' voice: declarative definition grounded in concrete, + repo-real examples + an explicit "flag X" instruction. + +## Non-Goals + +- No new report-format field, no new `## section`, no worked-example subsection. +- No change to the FAIL/PASS convergence loop or severity model. +- No second row — one combined class, not split existence/runtime rows. +- No change to any other skill (`demonstration-fidelity` already covers the + *faked-demo* failure downstream; this is the upstream design/plan complement). + +## Global Design Guidance + +`Guidance: none found as docs/design-guidance.md; canon from README §Cross-LLM, +docs/plans/2026-04-25-cross-llm-portability-design.md, ADR 0001.` + +| guidance | design response | +|---|---| +| Host-neutral / cross-LLM first | Pure SKILL.md prose; no host-specific tool or hook. | +| Checklist is the floor, not the ceiling | New row joins the mandatory-scan set; reviewer must report Finding/Clean for it like every other row. 
| +| Ground classes in concrete invariants | Row cites the two real retro misses verbatim (`wfctl ci run --phase migrate`; missing `workflow-registry` manifest). | +| Minimal surface | One table row + version bump + release notes. No structural change. | + +## Approach Options + +| option | summary | trade-off | +|---|---|---| +| **Recommended: one row in the design-phase checklist** | Design-phase classes are inherited by the plan phase (`SKILL.md:101` — "plan-phase reviewer scans the design-phase classes above"). One row → scanned in both phases. | Smallest surface; exactly matches "design + plan phases". Existence half is design-altitude; runtime half complements plan-phase `Verification-class mismatch` from upstream. | +| Two rows (existence in design, runtime-validity in plan) | Split the concern by altitude. | Duplicates the idea; issue asks for *a* (singular) combined class; more text, no extra coverage. | +| One row in the plan-phase checklist only | Sit next to its cousins (`Plugin-loader runtime layout`). | Misses the design phase — but the required_secrets miss was a *design*-time existence gap. Rejected. | + +## Design + +Single edit to `skills/adversarial-design-review/SKILL.md`: append one row to the +**design-phase** bug-class table (lines 85–97). The plan-phase table already +declares (line 101) that it scans the design-phase classes, so the new class is +covered in both invocations with no second edit. + +Row wording (matches existing declarative + concrete-example voice): + +> **Existence / runtime-validity** — For any design/plan that mutates or +> generates an external artifact (registry manifest, plugin release, CI workflow +> step, API endpoint, config the tool consumes): does it verify (a) each +> artifact it *mutates* actually **exists** — an `ls`/`gh` at design time (e.g. 
+> confirm a target plugin has a `workflow-registry` manifest before the plan +> edits it; a missing one forced a mid-execution amendment in the +> required_secrets sweep) — and (b) each artifact it *generates* actually +> **executes / contract-checks** against the real consumer — run it / dry-run it +> (e.g. the emitted CI step is a real command, not `wfctl ci run --phase +> migrate`, which no subcommand accepts), not merely that the intended content +> parses? Flag any design that asserts content correctness without an existence +> or behavior check. Cheap to satisfy (usually one `ls`/`gh`/dry-run); +> complements `demonstration-fidelity` by pushing the check upstream into +> design/plan. + +Version: bump `6.2.1 → 6.2.2` (patch — additive skill content, no behavior +break) across the three manifests via `scripts/bump-version.sh 6.2.2`. Add a +`RELEASE-NOTES.md` entry. Merge to main auto-triggers `release-tag.yml`. + +## Security Review + +None. Documentation-only change to a skill markdown file. No secrets, no network, +no new permissions, no executable surface. + +## Infrastructure Impact + +None at runtime. Release path is the existing `release-tag.yml` (push to main +touching `.claude-plugin/plugin.json` → version-check → tag → marketplace +dispatch). No new infra. + +## Multi-Component Validation + +The "components" are the two reviewer invocations. Validation is structural: +- `tests/version-check.sh` confirms all three manifests agree post-bump. +- `tests/skill-content-check.yml` (skill-content lint) passes on the edited + SKILL.md. +- Inheritance is asserted by the existing `SKILL.md:101` line — verified present + before relying on it (no second edit needed). The plan-phase reviewer will now + enumerate the new class because it scans the design-phase table. 
+ +## Assumptions + +| id | assumption | challenge | fallback | +|---|---|---|---| +| A1 | Design-phase classes are scanned in the plan phase | `SKILL.md:101` could change | Verified the line is present this session; if it were removed the row would need duplicating into the plan table. | +| A2 | Patch bump is correct (additive, non-breaking) | Could be seen as minor feature | Additive checklist row changes no existing behavior or contract → patch per semver. | +| A3 | `release-tag.yml` fires on plugin.json change at merge | Workflow could be disabled | Workflow file present + path-filtered on `.claude-plugin/plugin.json`; verified this session. | + +## Rollback + +Revert the PR. The row is purely additive prose; removing it restores the prior +checklist with no migration. Version bump reverts with the same commit. + +## Self-Challenge + +- **Simplest alternative:** raw one-line edit, no design doc. Rejected — repo + convention runs the pipeline even for one-row skill changes (precedent: + `2026-05-31-session-owned-lock-claims-design.md`). +- **Fragile assumption:** A1 (inheritance). Mitigated by verifying `SKILL.md:101` + this session. +- **YAGNI sweep:** no second row, no new report field, no worked example — all + rejected as surface the issue didn't ask for. 
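The two halves of the row proposed in the design above can be sketched as a pair of small shell helpers. This is a minimal illustration only, under stated assumptions: `missing_artifacts` and `is_accepted_value` are hypothetical names, not part of autonomous-dev-kit or `wfctl`, and the list of accepted values has to be taken from the real consumer (its help output or schema), not assumed.

```bash
#!/usr/bin/env bash
# Sketch only: both helper names are hypothetical and exist in no repo tool.

# Part (a), existence: print every argument path that does not exist.
# Returns 0 when all paths exist, 1 when at least one is missing.
missing_artifacts() {
  local rc=0 artifact
  for artifact in "$@"; do
    if [ ! -e "$artifact" ]; then
      printf 'MISSING: %s\n' "$artifact"
      rc=1
    fi
  done
  return "$rc"
}

# Part (b), runtime-validity: succeed only if the emitted value ($1) is one of
# the values the consumer accepts (all remaining arguments).
is_accepted_value() {
  local emitted="$1" candidate
  shift
  for candidate in "$@"; do
    if [ "$candidate" = "$emitted" ]; then
      return 0
    fi
  done
  return 1
}
```

At design time, part (a) would be called with the manifest path of every plugin the plan intends to edit, and part (b) with the emitted phase name followed by whatever phases the consumer actually reports. Either helper returning non-zero is the signal that the design asserted content correctness without the matching existence or behavior check.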
From 32f8124e2bd11c34095f67fd9cc59dab8af3c7a6 Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 17:59:45 -0400 Subject: [PATCH 2/9] docs: revise existence/runtime-validity design per adversarial cycle 1 (2I+3m) --- ...stence-runtime-validity-bugclass-design.md | 54 +++++++++++-------- 1 file changed, 31 insertions(+), 23 deletions(-) diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md index 6a008f6..7f371af 100644 --- a/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md @@ -51,7 +51,9 @@ The two adjacent existing classes don't cover it: ## Global Design Guidance `Guidance: none found as docs/design-guidance.md; canon from README §Cross-LLM, -docs/plans/2026-04-25-cross-llm-portability-design.md, ADR 0001.` +docs/plans/2026-04-25-cross-llm-portability-design.md. No applicable ADR (the +existing ADRs 0001/0002 govern the scope-lock lifecycle, not the review-skill +checklist).` | guidance | design response | |---|---| @@ -75,22 +77,16 @@ Single edit to `skills/adversarial-design-review/SKILL.md`: append one row to th declares (line 101) that it scans the design-phase classes, so the new class is covered in both invocations with no second edit. -Row wording (matches existing declarative + concrete-example voice): - -> **Existence / runtime-validity** — For any design/plan that mutates or -> generates an external artifact (registry manifest, plugin release, CI workflow -> step, API endpoint, config the tool consumes): does it verify (a) each -> artifact it *mutates* actually **exists** — an `ls`/`gh` at design time (e.g. 
-> confirm a target plugin has a `workflow-registry` manifest before the plan -> edits it; a missing one forced a mid-execution amendment in the -> required_secrets sweep) — and (b) each artifact it *generates* actually -> **executes / contract-checks** against the real consumer — run it / dry-run it -> (e.g. the emitted CI step is a real command, not `wfctl ci run --phase -> migrate`, which no subcommand accepts), not merely that the intended content -> parses? Flag any design that asserts content correctness without an existence -> or behavior check. Cheap to satisfy (usually one `ls`/`gh`/dry-run); -> complements `demonstration-fidelity` by pushing the check upstream into -> design/plan. +Row wording, shown in **actual table-row form** (pipe-delimited, matching all 19 +existing rows — not a blockquote — so the implementer copies it verbatim into the +table). The trigger is split by artifact *type* to avoid false positives on +create-not-mutate designs (an artifact a design *creates* has nothing to `ls` or +dry-run at design time), and a closing sentence gives an explicit `Clean` escape +hatch (like the `Rollback story` row's built-in scoping): + +``` +| **Existence / runtime-validity** | For a design/plan that touches an artifact another tool/contract consumes (registry manifest, plugin release, CI workflow step, API endpoint, config a tool reads): (a) for any artifact it *edits but did not create*, does it verify the artifact **exists** before the plan mutates it — an `ls`/`gh` at design time (e.g. confirm a target plugin has a `workflow-registry` manifest before editing it; a missing one forced a mid-execution amendment in the required_secrets sweep)? 
(b) for any artifact it *emits*, does it verify the **real consumer** accepts the emitted call — that it is a real command/schema, not `wfctl ci run --phase migrate` (no such subcommand) — by running/dry-running it or contract-checking the consumer's actual surface (you confirm the consumer exists, not that you pre-run output that may not exist yet)? Flag any design that asserts content correctness without the matching existence/behavior check. If the design neither edits an existing consumed artifact nor emits one a consumer must accept, mark **Clean**. Cheap to satisfy (usually one `ls`/`gh`/dry-run); complements `demonstration-fidelity` by pushing the check upstream into design/plan. | +``` Version: bump `6.2.1 → 6.2.2` (patch — additive skill content, no behavior break) across the three manifests via `scripts/bump-version.sh 6.2.2`. Add a @@ -111,17 +107,29 @@ dispatch). No new infra. The "components" are the two reviewer invocations. Validation is structural: - `tests/version-check.sh` confirms all three manifests agree post-bump. -- `tests/skill-content-check.yml` (skill-content lint) passes on the edited - SKILL.md. -- Inheritance is asserted by the existing `SKILL.md:101` line — verified present - before relying on it (no second edit needed). The plan-phase reviewer will now - enumerate the new class because it scans the design-phase table. +- `skill-content-check.yml` runs `tests/skill-content-grep.sh`, which lints only + for forbidden host-specific tokens (`TaskCreate`/`TodoWrite`/`Sonnet`/`Opus`/…). + It does **not** validate table structure or row placement — so "passes" here + means "no forbidden host token introduced", and correct table placement + + pipe-formatting is author responsibility checked at PR review (the row is shown + in exact table form above to make that mechanical). 
+- Inheritance is asserted by the existing `SKILL.md:101` line ("The plan-phase + reviewer scans the design-phase classes above") — verified present before + relying on it (no second edit needed). The plan-phase reviewer enumerates the + new class **only if** the `--phase=plan` dispatch prompt embeds the + design-phase table inline (per the skill's own dispatch instruction at + `SKILL.md:~261`, "paste the checklist for the chosen phase verbatim"). This is + a **pre-existing property shared by all 11 design-phase classes** — the new row + inherits the same coverage path, introducing no new gap. PR description will + note that plan-phase dispatches must embed the full (design + plan) checklist, + which they already must for every design-phase class. ## Assumptions | id | assumption | challenge | fallback | |---|---|---|---| -| A1 | Design-phase classes are scanned in the plan phase | `SKILL.md:101` could change | Verified the line is present this session; if it were removed the row would need duplicating into the plan table. | +| A1 | Design-phase classes are scanned in the plan phase | `SKILL.md:101` could change; plan-phase dispatch must embed the design-phase table | Verified `SKILL.md:101` present this session; coverage path is identical to all 11 existing design-phase classes (no new gap). If the line were removed the row would need duplicating into the plan table. | +| A4 | Existence check must not mis-fire on create-not-mutate designs | A literal reviewer could flag a design that *creates* an artifact for "not verifying it exists" | Row scopes part (a) to artifacts "edited but did not create" and adds an explicit `Clean` escape hatch for designs that neither edit nor emit a consumed artifact. | | A2 | Patch bump is correct (additive, non-breaking) | Could be seen as minor feature | Additive checklist row changes no existing behavior or contract → patch per semver. 
| | A3 | `release-tag.yml` fires on plugin.json change at merge | Workflow could be disabled | Workflow file present + path-filtered on `.claude-plugin/plugin.json`; verified this session. | From 662399661e8700f9700b62dba5e15b087df3207f Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:01:48 -0400 Subject: [PATCH 3/9] docs: existence/runtime-validity design PASS adversarial cycle-2 (part-b wording fix) --- .../2026-05-31-existence-runtime-validity-bugclass-design.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md index 7f371af..1212861 100644 --- a/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass-design.md @@ -3,6 +3,8 @@ **Status:** Approved (autonomous — user pre-authorized full-pipeline execution) **Date:** 2026-05-31 **Issue:** https://github.com/GoCodeAlone/autonomous-dev-kit/issues/55 +**Adversarial review:** design-phase PASS at cycle 2 (cycle 1 = 0C/2I/3m, all +resolved; cycle 2 = 0C/0I/1m, the lone Minor — part (b) wording — applied above). ## Problem @@ -85,7 +87,7 @@ dry-run at design time), and a closing sentence gives an explicit `Clean` escape hatch (like the `Rollback story` row's built-in scoping): ``` -| **Existence / runtime-validity** | For a design/plan that touches an artifact another tool/contract consumes (registry manifest, plugin release, CI workflow step, API endpoint, config a tool reads): (a) for any artifact it *edits but did not create*, does it verify the artifact **exists** before the plan mutates it — an `ls`/`gh` at design time (e.g. confirm a target plugin has a `workflow-registry` manifest before editing it; a missing one forced a mid-execution amendment in the required_secrets sweep)? 
(b) for any artifact it *emits*, does it verify the **real consumer** accepts the emitted call — that it is a real command/schema, not `wfctl ci run --phase migrate` (no such subcommand) — by running/dry-running it or contract-checking the consumer's actual surface (you confirm the consumer exists, not that you pre-run output that may not exist yet)? Flag any design that asserts content correctness without the matching existence/behavior check. If the design neither edits an existing consumed artifact nor emits one a consumer must accept, mark **Clean**. Cheap to satisfy (usually one `ls`/`gh`/dry-run); complements `demonstration-fidelity` by pushing the check upstream into design/plan. | +| **Existence / runtime-validity** | For a design/plan that touches an artifact another tool/contract consumes (registry manifest, plugin release, CI workflow step, API endpoint, config a tool reads): (a) for any artifact it *edits but did not create*, does it verify the artifact **exists** before the plan mutates it — an `ls`/`gh` at design time (e.g. confirm a target plugin has a `workflow-registry` manifest before editing it; a missing one forced a mid-execution amendment in the required_secrets sweep)? (b) for any artifact it *emits*, does it verify the emitted call targets a **real** consumer surface — e.g. confirm `wfctl ci run --phase migrate` is an actual subcommand/phase (it is not) by checking `wfctl help`/the consumer schema/a dry-run, rather than assuming the generated content merely parses (you confirm the consumer command/schema exists, not that you pre-run output that may not exist yet)? Flag any design that asserts content correctness without the matching existence/behavior check. If the design neither edits an existing consumed artifact nor emits one a consumer must accept, mark **Clean**. Cheap to satisfy (usually one `ls`/`gh`/dry-run); complements `demonstration-fidelity` by pushing the check upstream into design/plan. 
| ``` Version: bump `6.2.1 → 6.2.2` (patch — additive skill content, no behavior From f5e6bb4789cd4091cb2ad1fce40e918ea67cde13 Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:02:56 -0400 Subject: [PATCH 4/9] docs: implementation plan for existence/runtime-validity bug-class (#55) --- ...-31-existence-runtime-validity-bugclass.md | 174 ++++++++++++++++++ 1 file changed, 174 insertions(+) create mode 100644 docs/plans/2026-05-31-existence-runtime-validity-bugclass.md diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md new file mode 100644 index 0000000..31030aa --- /dev/null +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md @@ -0,0 +1,174 @@ +# Existence / Runtime-Validity Bug-Class Implementation Plan + +> **For the implementing agent:** REQUIRED SUB-SKILL: Use autodev:executing-plans to implement this plan task-by-task. + +**Goal:** Add one "Existence / runtime-validity" bug-class row to the design-phase checklist of `adversarial-design-review`, then release as v6.2.2. + +**Architecture:** Single additive table row in `skills/adversarial-design-review/SKILL.md` (design-phase table, inherited by the plan phase via `SKILL.md:101`). Version bump across the three manifests + RELEASE-NOTES entry. Merge to main auto-triggers `release-tag.yml`. + +**Tech Stack:** Markdown skill files; bash test scripts (`tests/version-check.sh`, `tests/skill-content-grep.sh`); GitHub Actions release-tag workflow. + +**Base branch:** main + +--- + +## Scope Manifest + +**PR Count:** 1 +**Tasks:** 3 +**Estimated Lines of Change:** ~20 (informational; not enforced) + +**Out of scope:** +- A second row in the plan-phase table (design rejected this — one row, inherited). +- Any new report-format field, `## section`, or worked-example subsection in the skill. +- Changes to any other skill (`demonstration-fidelity` already covers the downstream faked-demo case). 
+- Any code/behavior change — this is documentation + a version bump only. + +**PR Grouping:** + +| PR # | Title | Tasks | Branch | +|------|-------|-------|--------| +| 1 | feat: add Existence/runtime-validity bug-class to adversarial-design-review (#55) | Task 1, Task 2, Task 3 | feat/existence-runtime-validity-bugclass-55 | + +**Status:** Draft + +--- + +## Project Design Guidance + +`Guidance: none at docs/design-guidance.md; canon = README §Cross-LLM + docs/plans/2026-04-25-cross-llm-portability-design.md.` Guidance → work mapping: +- Host-neutral / cross-LLM → the row is pure SKILL.md prose, no host-specific token (verified by `tests/skill-content-grep.sh` in Task 1). +- Checklist is the floor → the row joins the mandatory-scan set; no opt-out. +- Ground classes in concrete invariants → row cites the two real retro misses verbatim. + +--- + +### Task 1: Add the bug-class row to the design-phase checklist + +**Files:** +- Modify: `skills/adversarial-design-review/SKILL.md` (insert after the `User-intent drift` row, line ~97, immediately before `## Bug-class checklist — plan phase` at line ~99) + +**Step 1: Verify the inheritance line the approach depends on still exists** + +Run: `grep -n "plan-phase reviewer scans the design-phase classes above" skills/adversarial-design-review/SKILL.md` +Expected: one match at ~line 101 (confirms the new row is scanned in both phases without a second edit). If absent → STOP, the design's A1 assumption is broken. + +**Step 2: Insert the row** + +Insert this exact line as the last row of the **design-phase** table (after `| **User-intent drift** | ... 
|`, before the blank line preceding `## Bug-class checklist — plan phase`): + +``` +| **Existence / runtime-validity** | For a design/plan that touches an artifact another tool/contract consumes (registry manifest, plugin release, CI workflow step, API endpoint, config a tool reads): (a) for any artifact it *edits but did not create*, does it verify the artifact **exists** before the plan mutates it — an `ls`/`gh` at design time (e.g. confirm a target plugin has a `workflow-registry` manifest before editing it; a missing one forced a mid-execution amendment in the required_secrets sweep)? (b) for any artifact it *emits*, does it verify the emitted call targets a **real** consumer surface — e.g. confirm `wfctl ci run --phase migrate` is an actual subcommand/phase (it is not) by checking `wfctl help`/the consumer schema/a dry-run, rather than assuming the generated content merely parses (you confirm the consumer command/schema exists, not that you pre-run output that may not exist yet)? Flag any design that asserts content correctness without the matching existence/behavior check. If the design neither edits an existing consumed artifact nor emits one a consumer must accept, mark **Clean**. Cheap to satisfy (usually one `ls`/`gh`/dry-run); complements `demonstration-fidelity` by pushing the check upstream into design/plan. | +``` + +**Step 3: Verify the table still parses + the row is in the design-phase table** + +Run: `awk '/## Bug-class checklist — design phase/{d=1} /## Bug-class checklist — plan phase/{d=0} d && /Existence . runtime-validity/{print NR": "$0}' skills/adversarial-design-review/SKILL.md` +Expected: one line printed (the new row), proving it sits inside the design-phase section, not the plan-phase one. + +Run: `grep -c "^| \*\*" skills/adversarial-design-review/SKILL.md` +Expected: count increased by exactly 1 vs main (was 19 class rows + 2 header-ish; assert the new total = previous + 1 — confirm with `git diff --stat`). 
+ +**Step 4: Skill-content lint (no forbidden host tokens)** + +Run: `bash tests/skill-content-grep.sh 2>&1 | tail -5` +Expected: exit 0 / no forbidden-token failure for `adversarial-design-review/SKILL.md`. (Documentation change class — lint is the verification; this script is the repo's skill gate.) + +**Step 5: Commit** + +```bash +git add skills/adversarial-design-review/SKILL.md +git commit -m "feat(adversarial-design-review): add Existence/runtime-validity bug-class (#55)" +``` + +Rollback: revert this commit — the row is purely additive prose, no migration. + +--- + +### Task 2: Add the RELEASE-NOTES.md entry + +**Files:** +- Modify: `RELEASE-NOTES.md` (insert a new `## v6.2.2 — 2026-05-31` section directly below the `# Autonomous Dev Kit Release Notes` title, above the existing `## v6.2.1` section) + +**Step 1: Insert the entry** + +Insert directly after the title line (`# Autonomous Dev Kit Release Notes`) and its blank line, before `## v6.2.1 — 2026-05-31`: + +```markdown +## v6.2.2 — 2026-05-31 + +New **Existence / runtime-validity** bug-class in `adversarial-design-review` +(design-phase checklist, inherited by the plan phase), closing a 2-retro gap +where a review verified an artifact's intended content but never that the +artifact **exists** or **runs** as the design assumed (issue #55). + +- `skills/adversarial-design-review/SKILL.md`: one new design-phase row. (a) For + any artifact a design *edits but did not create*, require an `ls`/`gh` + existence check before mutation (the required_secrets sweep hit a missing + `workflow-registry` manifest at execution). (b) For any artifact a design + *emits*, require verifying the consumer surface is real (the smart-CI gen + emitted `wfctl ci run --phase migrate`, no such phase). Explicit `Clean` + escape hatch for designs that neither edit nor emit a consumed artifact. + Complements `demonstration-fidelity` by pushing the check upstream. 
+``` + +**Step 2: Verify no broken markdown anchors / structure** + +Run: `grep -n "^## v6.2.2" RELEASE-NOTES.md && grep -n "^## v6.2.1" RELEASE-NOTES.md` +Expected: v6.2.2 line number < v6.2.1 line number (new entry is on top). + +**Step 3: Commit** + +```bash +git add RELEASE-NOTES.md +git commit -m "docs: release notes for v6.2.2 (#55)" +``` + +Rollback: revert this commit. + +--- + +### Task 3: Version bump to 6.2.2 + consistency check + +**Files:** +- Modify: `.claude-plugin/plugin.json`, `.claude-plugin/marketplace.json`, `.cursor-plugin/plugin.json` (all via the script) + +**Step 1: Confirm starting version + that the target tag does not already exist** + +Run: `grep '"version"' .claude-plugin/plugin.json | head -1; git ls-remote --tags origin refs/tags/v6.2.2` +Expected: version is `6.2.1`; the `git ls-remote` prints nothing (no `v6.2.2` tag yet). If `v6.2.2` exists → STOP and pick the next patch. + +**Step 2: Run the bump script** + +Run: `scripts/bump-version.sh 6.2.2` +Expected: `Bumping version: 6.2.1 → 6.2.2` and success across all three manifests. + +**Step 3: Verify all manifests agree (this is the exact gate `release-tag.yml` runs)** + +Run: `bash tests/version-check.sh` +Expected: `OK: All version files agree on version 6.2.2` + +**Step 4: Commit** + +```bash +git add .claude-plugin/plugin.json .claude-plugin/marketplace.json .cursor-plugin/plugin.json +git commit -m "chore: bump version to 6.2.2 (#55)" +``` + +Rollback: `scripts/bump-version.sh 6.2.2 6.2.1` + revert commit; since the release tag is created only on merge-to-main, a pre-merge revert prevents the tag entirely. 
+ +--- + +## Verification summary (change-class mapping) + +| Task | Change class | Verification | Expected | +|---|---|---|---| +| 1 | Documentation (skill prose) | `tests/skill-content-grep.sh` + awk section-scope check | exit 0; row inside design-phase table | +| 2 | Documentation | markdown anchor/order grep | v6.2.2 above v6.2.1 | +| 3 | Version pin (manifests) | `tests/version-check.sh` | all three agree on 6.2.2 | + +No runtime/build/deploy/migration/plugin-loading change → no `runtime-launch-validation` task required (none of the `finishing-a-development-branch` Step 1b triggers are met by a markdown + manifest-string change). The version bump is a manifest *string* change consumed only by `release-tag.yml`, whose own gate (`version-check.sh`) is run in Task 3. + +## Multi-Component / Integration proof + +The two "components" are the two reviewer invocations. The inheritance path (design-phase row → scanned in plan phase) is asserted by Task 1 Step 1 (grep for `SKILL.md:101`) + Step 3 (awk proves the row is in the design-phase section). The release "boundary" (manifests → `release-tag.yml`) is proven by Task 3 Step 3 running the identical `version-check.sh` the workflow runs. 
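The two structural checks the plan above relies on for Task 1 (a class-row count and a section-scoped placement check) can be wrapped as reusable functions. This is a hedged sketch, not a script that exists in the repo: `count_class_rows` and `row_in_section` are hypothetical names. The `grep` pattern is the one the plan already uses; the `awk` logic is a simplified stand-in for the plan's one-liner that matches headings and row text literally instead of by regex.

```bash
#!/usr/bin/env bash
# Sketch only: both function names are hypothetical, not repo scripts.

# count_class_rows FILE
# Count table rows whose first cell opens in bold, using the same pattern as
# the plan's `grep -c "^| \*\*"` step. Header and separator rows do not match.
count_class_rows() {
  grep -c '^| \*\*' "$1"
}

# row_in_section FILE START_HEADING STOP_HEADING ROW_TEXT
# Succeed only if a line containing ROW_TEXT appears after a line beginning
# with START_HEADING and before a line beginning with STOP_HEADING.
# All three strings are matched literally; avoid backslashes in them, since
# awk -v interprets escape sequences.
row_in_section() {
  awk -v start="$2" -v stop="$3" -v row="$4" '
    index($0, start) == 1 { inside = 1; next }
    index($0, stop) == 1  { inside = 0 }
    inside && index($0, row) { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

Recording `count_class_rows` before and after the edit and asserting a difference of exactly 1 reproduces the plan's count check. Calling `row_in_section` with the design-phase and plan-phase checklist headings from `SKILL.md` as START and STOP reproduces the placement check.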
From 9edf49280f61ecc8628877a9f176c52146f611cf Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:06:48 -0400 Subject: [PATCH 5/9] docs: plan PASS adversarial plan-phase (3 Minor wording/verification fixes) --- ...026-05-31-existence-runtime-validity-bugclass.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md index 31030aa..0e2854e 100644 --- a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md @@ -46,13 +46,16 @@ ### Task 1: Add the bug-class row to the design-phase checklist **Files:** -- Modify: `skills/adversarial-design-review/SKILL.md` (insert after the `User-intent drift` row, line ~97, immediately before `## Bug-class checklist — plan phase` at line ~99) +- Modify: `skills/adversarial-design-review/SKILL.md` (insert after the `User-intent drift` row at line ~97 — the LAST design-phase row — and before the blank line at ~line 98 that precedes the `## Bug-class checklist — plan phase` heading; do not collapse that blank separator) -**Step 1: Verify the inheritance line the approach depends on still exists** +**Step 1: Verify the inheritance line the approach depends on still exists + record the baseline class-row count** Run: `grep -n "plan-phase reviewer scans the design-phase classes above" skills/adversarial-design-review/SKILL.md` Expected: one match at ~line 101 (confirms the new row is scanned in both phases without a second edit). If absent → STOP, the design's A1 assumption is broken. +Run: `grep -c "^| \*\*" skills/adversarial-design-review/SKILL.md` +Expected: `19` (11 design-phase + 8 plan-phase class rows; table headers use `| Class | Definition |` and do NOT match `^| \*\*`). Record this baseline for Step 3. 
+ **Step 2: Insert the row** Insert this exact line as the last row of the **design-phase** table (after `| **User-intent drift** | ... |`, before the blank line preceding `## Bug-class checklist — plan phase`): @@ -67,7 +70,7 @@ Run: `awk '/## Bug-class checklist — design phase/{d=1} /## Bug-class checklis Expected: one line printed (the new row), proving it sits inside the design-phase section, not the plan-phase one. Run: `grep -c "^| \*\*" skills/adversarial-design-review/SKILL.md` -Expected: count increased by exactly 1 vs main (was 19 class rows + 2 header-ish; assert the new total = previous + 1 — confirm with `git diff --stat`). +Expected: `20` (the Step 1 baseline of 19 + exactly 1; zero header rows match this pattern). Confirm the single added row with `git diff --stat`. **Step 4: Skill-content lint (no forbidden host tokens)** @@ -163,9 +166,9 @@ Rollback: `scripts/bump-version.sh 6.2.2 6.2.1` + revert commit; since the relea | Task | Change class | Verification | Expected | |---|---|---|---| -| 1 | Documentation (skill prose) | `tests/skill-content-grep.sh` + awk section-scope check | exit 0; row inside design-phase table | +| 1 | Documentation (skill prose) | `tests/skill-content-grep.sh` + awk section-scope check + `grep -c` count = 20 | exit 0; row inside design-phase table; count 19→20 | | 2 | Documentation | markdown anchor/order grep | v6.2.2 above v6.2.1 | -| 3 | Version pin (manifests) | `tests/version-check.sh` | all three agree on 6.2.2 | +| 3 | Version pin (manifests) | tag-uniqueness pre-check (`git ls-remote ... v6.2.2` empty, Task 3 Step 1) + `tests/version-check.sh` | no existing tag; all three agree on 6.2.2 | No runtime/build/deploy/migration/plugin-loading change → no `runtime-launch-validation` task required (none of the `finishing-a-development-branch` Step 1b triggers are met by a markdown + manifest-string change). 
The version bump is a manifest *string* change consumed only by `release-tag.yml`, whose own gate (`version-check.sh`) is run in Task 3. From 13c3606af185c53b408c4651b2253063e80e9f55 Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:07:55 -0400 Subject: [PATCH 6/9] chore: lock scope for existence/runtime-validity bug-class (alignment passed) --- docs/plans/2026-05-31-existence-runtime-validity-bugclass.md | 2 +- ...2026-05-31-existence-runtime-validity-bugclass.md.scope-lock | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 docs/plans/2026-05-31-existence-runtime-validity-bugclass.md.scope-lock diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md index 0e2854e..36546ba 100644 --- a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md @@ -30,7 +30,7 @@ |------|-------|-------|--------| | 1 | feat: add Existence/runtime-validity bug-class to adversarial-design-review (#55) | Task 1, Task 2, Task 3 | feat/existence-runtime-validity-bugclass-55 | -**Status:** Draft +**Status:** Locked 2026-05-31T22:07:47Z --- diff --git a/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md.scope-lock b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md.scope-lock new file mode 100644 index 0000000..61cc233 --- /dev/null +++ b/docs/plans/2026-05-31-existence-runtime-validity-bugclass.md.scope-lock @@ -0,0 +1 @@ +f765fa18cbd6b071f8bf5249006d8c8a6b8f1bcf3d561f103e818130f03ccc7c From 30d5ad21a76a2cc166c02a50fb42ee11a269a04e Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:08:57 -0400 Subject: [PATCH 7/9] feat(adversarial-design-review): add Existence/runtime-validity bug-class (#55) --- skills/adversarial-design-review/SKILL.md | 1 + 1 file changed, 1 insertion(+) diff --git a/skills/adversarial-design-review/SKILL.md 
b/skills/adversarial-design-review/SKILL.md index 6d70686..cdbbfc8 100644 --- a/skills/adversarial-design-review/SKILL.md +++ b/skills/adversarial-design-review/SKILL.md @@ -95,6 +95,7 @@ checklist is the floor, not the ceiling. Additional findings are welcome. | **Rollback story** | How do we undo this if it goes wrong in production? For any change class that runtime-launch-validation already triggers on (build/deployment/version pins/startup config/migrations/plugin loading), the design MUST specify a rollback path. If absent → finding. | | **Simpler alternative not considered** | Name the laziest plausible solution. Did the design consider it and reject it for stated reasons? If not → finding. "Couldn't this be a flat file?" "Couldn't this be a cron job?" "Couldn't this be a single SQL view?" | | **User-intent drift** | Re-read the original ask. Does the design solve what the user asked for, or does it solve a different problem that was easier to design for? Compare the design's stated goals against the user's stated goals; flag drift. | +| **Existence / runtime-validity** | For a design/plan that touches an artifact another tool/contract consumes (registry manifest, plugin release, CI workflow step, API endpoint, config a tool reads): (a) for any artifact it *edits but did not create*, does it verify the artifact **exists** before the plan mutates it — an `ls`/`gh` at design time (e.g. confirm a target plugin has a `workflow-registry` manifest before editing it; a missing one forced a mid-execution amendment in the required_secrets sweep)? (b) for any artifact it *emits*, does it verify the emitted call targets a **real** consumer surface — e.g. confirm `wfctl ci run --phase migrate` is an actual subcommand/phase (it is not) by checking `wfctl help`/the consumer schema/a dry-run, rather than assuming the generated content merely parses (you confirm the consumer command/schema exists, not that you pre-run output that may not exist yet)? 
Flag any design that asserts content correctness without the matching existence/behavior check. If the design neither edits an existing consumed artifact nor emits one a consumer must accept, mark **Clean**. Cheap to satisfy (usually one `ls`/`gh`/dry-run); complements `demonstration-fidelity` by pushing the check upstream into design/plan. | ## Bug-class checklist — plan phase (must scan) From f1e5493661438094687b5ebb7d68cccb52619c05 Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:09:20 -0400 Subject: [PATCH 8/9] docs: release notes for v6.2.2 (#55) --- RELEASE-NOTES.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md index b73428a..d6a71d7 100644 --- a/RELEASE-NOTES.md +++ b/RELEASE-NOTES.md @@ -1,5 +1,21 @@ # Autonomous Dev Kit Release Notes +## v6.2.2 — 2026-05-31 + +New **Existence / runtime-validity** bug-class in `adversarial-design-review` +(design-phase checklist, inherited by the plan phase), closing a 2-retro gap +where a review verified an artifact's intended content but never that the +artifact **exists** or **runs** as the design assumed (issue #55). + +- `skills/adversarial-design-review/SKILL.md`: one new design-phase row. (a) For + any artifact a design *edits but did not create*, require an `ls`/`gh` + existence check before mutation (the required_secrets sweep hit a missing + `workflow-registry` manifest at execution). (b) For any artifact a design + *emits*, require verifying the consumer surface is real (the smart-CI gen + emitted `wfctl ci run --phase migrate`, no such phase). Explicit `Clean` + escape hatch for designs that neither edit nor emit a consumed artifact. + Complements `demonstration-fidelity` by pushing the check upstream. 
+ ## v6.2.1 — 2026-05-31 Scope-lock claim ownership hardening for issue #52: a resumed/fresh session can From b14632d1757f41b922b7ebc85a636eb69939589c Mon Sep 17 00:00:00 2001 From: Jon Langevin Date: Sun, 31 May 2026 18:09:30 -0400 Subject: [PATCH 9/9] chore: bump version to 6.2.2 (#55) --- .claude-plugin/marketplace.json | 2 +- .claude-plugin/plugin.json | 2 +- .cursor-plugin/plugin.json | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 75620ef..f4dafd1 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -9,7 +9,7 @@ { "name": "autodev", "description": "Autonomous development workflow skills for coding agents", - "version": "6.2.1", + "version": "6.2.2", "source": "./", "author": { "name": "Jon Langevin", diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json index 9986483..9728fa1 100644 --- a/.claude-plugin/plugin.json +++ b/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "autodev", "description": "Autonomous development workflow skills for coding agents: design, review, planning, execution, monitoring, and retrospectives", - "version": "6.2.1", + "version": "6.2.2", "author": { "name": "Jon Langevin", "email": "jon@gocodealone.com" diff --git a/.cursor-plugin/plugin.json b/.cursor-plugin/plugin.json index c7b9ea1..e77ff1f 100644 --- a/.cursor-plugin/plugin.json +++ b/.cursor-plugin/plugin.json @@ -2,7 +2,7 @@ "name": "autodev", "displayName": "Autonomous Dev Kit", "description": "Autonomous development workflow skills for coding agents", - "version": "6.2.1", + "version": "6.2.2", "author": { "name": "Jon Langevin", "email": "jon@gocodealone.com"
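Reviewer note, not part of the patch series: the two checks the new **Existence / runtime-validity** row asks for, plus the plan's class-row count baseline, can be sketched in shell roughly as follows. This is a minimal sketch under stated assumptions. The function names (`missing_manifests`, `surface_mentions`, `count_class_rows`) and the `<root>/<target>/manifest.json` layout are hypothetical illustrations and do not exist in the kit. Only the `grep -c '^| \*\*'` pattern is taken from the plan itself.

```shell
#!/usr/bin/env bash
# Sketch only: every helper name and path layout here is hypothetical.

# (a) Existence: print each target that lacks the file the plan intends to edit.
# Usage: missing_manifests <root-dir> <target>...
missing_manifests() {
  local root="$1"
  shift
  local target
  for target in "$@"; do
    # Surface the gap at design time instead of mid-execution.
    [ -f "$root/$target/manifest.json" ] || printf '%s\n' "$target"
  done
}

# (b) Runtime validity: succeed only if the consumer's own help/schema text,
# supplied on stdin, mentions the word the generated artifact relies on.
# Usage: <consumer help command> | surface_mentions <word>
surface_mentions() {
  grep -qw -- "$1"
}

# Baseline check from the plan: count checklist class rows (lines that start
# with "| **"); table header rows do not match this pattern.
count_class_rows() {
  grep -c '^| \*\*' "$1"
}
```

An empty result from `missing_manifests` means every target has the file the design wants to mutate; any printed name is a finding. `surface_mentions` is deliberately crude (a whole-word match against whatever text the consumer prints), so it only shows that a name is mentioned, not that the invocation would succeed; a dry-run, where the consumer offers one, is the stronger check.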