GoCodeAlone · intel352 · May 29, 2026 · May 29, 2026 · May 29, 2026 · May 29, 2026
diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
@@ -9,7 +9,7 @@
     {
       "name": "autodev",
       "description": "Autonomous development workflow skills for coding agents",
-      "version": "6.1.5",
+      "version": "6.2.0",
       "source": "./",
       "author": {
         "name": "Jon Langevin",

diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
@@ -1,7 +1,7 @@
 {
   "name": "autodev",
   "description": "Autonomous development workflow skills for coding agents: design, review, planning, execution, monitoring, and retrospectives",
-  "version": "6.1.5",
+  "version": "6.2.0",
   "author": {
     "name": "Jon Langevin",
     "email": "jon@gocodealone.com"

diff --git a/.cursor-plugin/plugin.json b/.cursor-plugin/plugin.json
@@ -2,7 +2,7 @@
   "name": "autodev",
   "displayName": "Autonomous Dev Kit",
   "description": "Autonomous development workflow skills for coding agents",
-  "version": "6.1.5",
+  "version": "6.2.0",
   "author": {
     "name": "Jon Langevin",
     "email": "jon@gocodealone.com"

diff --git a/README.md b/README.md
@@ -196,6 +196,7 @@ adversarial review challenges it explicitly.
 
 **Testing**
 - **test-driven-development** - RED-GREEN-REFACTOR cycle (includes testing anti-patterns reference)
+- **demonstration-fidelity** - A demo/example/showcase must execute the real artifact — no reimplementation, hard-coded output, or different-language fake
 
 **Debugging**
 - **systematic-debugging** - 4-phase root cause process (includes root-cause-tracing, defense-in-depth, condition-based-waiting techniques)

diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md
@@ -1,5 +1,14 @@
 # Autonomous Dev Kit Release Notes
 
+## v6.2.0 — 2026-05-29
+
+New skill **demonstration-fidelity** + an advisory write-time hook, closing a verification-theater gap: an agent writes real code, then "demonstrates" it with a demo that never executes the real artifact — reimplementing the logic, hard-coding the output, or rewriting it in another language. The demo proves nothing yet is presented as proof.
+
+- **`skills/demonstration-fidelity/SKILL.md`** (host-neutral, load-bearing on every harness): a demonstration MUST execute the real artifact and show output produced by that run. Forbids reimplementation, hard-coded output, stubbing the artifact-under-demonstration, and detached prototypes — regardless of language. Allows substituting a *dependency* at a real interface seam **with disclosure**. Establishes "fidelity, not language sameness" (a real cross-language client crossing a real interface is valid), a 3-question fidelity test, a fake-vs-faithful example, and a rationalization table seeded from RED-baseline transcripts.
+- **`hooks/pretool-demo-fidelity-guard`** (advisory, NEVER blocks; Claude + Codex + Cursor via `hooks.json`): on a Write/Edit to a demo-like path, injects a fidelity reminder pointing at the skill. Heuristic is anchored to path *segments* (`demos`/`examples`) + basename prefixes (`demo*`/`example*`/`showcase*`/`quickstart*`) with segment/suffix exclusions (`test`/`spec`/`testdata`/`fixtures`/`vendor` segments, `*_test.*`/`*.spec.*` basenames) — so `example_test.go`/`testdata/` are skipped while `examples/latest-feature-demo.py` still fires. Session dedup keyed on `basename(transcript_path)`; fails **open** (fires) on state I/O failure; honors `SUPERPOWERS_HOOKS_DISABLE=1`.
+- **Pipeline wiring:** new `runtime-launch-validation` "Demonstration / example / showcase" change-class row (carving out artifact-stub-forbidden vs. disclosed-dependency-seam-allowed so it does not contradict RLV's "no stub on either end"); a `verification-before-completion` `demo/example works` claim-matrix row; a `finishing-a-development-branch` Step 1b demo note; `using-autodev` cross-cutting listing; README + `tests/cross-llm-coverage.md` rows.
+- **Tests:** 22 `tests/hook-contracts.sh` assertions for the new guard (fires/silent/excluded/dedup/fail-open/disable-env/malformed-stdin/never-blocks). Skill is host-neutral (`skill-content-grep.sh`) and cross-refs resolve (`skill-cross-refs.sh`).
+
 ## v6.1.5 — 2026-05-28
 
 SessionStart time-based dedup as defense in depth.

diff --git a/docs/plans/2026-05-29-demonstration-fidelity-design.md b/docs/plans/2026-05-29-demonstration-fidelity-design.md
diff --git a/docs/plans/2026-05-29-demonstration-fidelity.md b/docs/plans/2026-05-29-demonstration-fidelity.md
diff --git a/docs/plans/2026-05-29-demonstration-fidelity.md.scope-lock b/docs/plans/2026-05-29-demonstration-fidelity.md.scope-lock
@@ -0,0 +1 @@
+661d9faf234fdc5e4c8e2de72c6e4db95a0af91c49c47334dd5580ee079cc00f
diff --git a/hooks/hooks.json b/hooks/hooks.json
@@ -32,6 +32,16 @@
             "timeout": 10
           }
         ]
+      },
+      {
+        "matcher": "Write|Edit|MultiEdit",
+        "hooks": [
+          {
+            "type": "command",
+            "command": "\"${CLAUDE_PLUGIN_ROOT}/hooks/run-hook.cmd\" pretool-demo-fidelity-guard",
+            "timeout": 10
+          }
+        ]
       }
     ],
     "PostToolUse": [

diff --git a/hooks/pretool-demo-fidelity-guard b/hooks/pretool-demo-fidelity-guard
@@ -0,0 +1,129 @@
+#!/usr/bin/env bash
+# hooks/pretool-demo-fidelity-guard
+# PreToolUse hook (advisory, NEVER blocks): when an agent is about to write a
+# demonstration/example artifact, remind it that a demo MUST execute the real
+# artifact — no reimplementation, no hard-coded output, no stubbing the thing
+# being demonstrated. Substituting a *dependency* at a real interface seam is
+# allowed only if disclosed. See skills/demonstration-fidelity/SKILL.md.
+#
+# This is a best-effort, filename-detectable nudge only. The load-bearing
+# defense is the demonstration-fidelity skill (which covers inline / README /
+# normally-named / cross-language demos) plus the verification-before-completion
+# "demo/example works" claim-matrix row. This hook never blocks and never reads
+# file contents.
+#
+# Detection is anchored to path SEGMENTS + basename suffix globs — never bare
+# substrings — so `latest`/`contest`/`attestation`/`inspector`/`spectrum`
+# demos are not wrongly excluded while `example_test.go`/`testdata/`/`vendor/`
+# are.
+#
+# Global opt-out: set SUPERPOWERS_HOOKS_DISABLE=1
+
+set -euo pipefail
+
+[ "${SUPERPOWERS_HOOKS_DISABLE:-}" = "1" ] && exit 0
+
+# Require stdin (PreToolUse always sends a JSON payload).
+[ -t 0 ] && exit 0
+command -v jq >/dev/null 2>&1 || exit 0
+
+hook_input=$(cat || true)
+[ -z "$hook_input" ] && exit 0
+
+tool_name=$(printf '%s' "$hook_input" | jq -r '.tool_name // empty' 2>/dev/null || true)
+case "$tool_name" in
+  Write|Edit|MultiEdit) ;;
+  *) exit 0 ;;
+esac
+
+file_path=$(printf '%s' "$hook_input" | jq -r '.tool_input.file_path // empty' 2>/dev/null || true)
+[ -z "$file_path" ] && exit 0
+
+# Lowercase so Examples/, Demo*, Showcase match case-insensitively.
+lc_path=$(printf '%s' "$file_path" | tr '[:upper:]' '[:lower:]')
+base=${lc_path##*/}
+
+# Split into path segments.
+IFS='/' read -r -a segs <<< "$lc_path" || true
+
+# ── Exclusion (segment-exact + basename suffix globs; never bare substrings) ─
+# "${segs[@]:-}" form so an (unreachable) empty array can't trip `set -u` under
+# bash 3.2 (macOS system bash, which run-hook.cmd may exec).
+for seg in "${segs[@]:-}"; do
+  case "$seg" in
+    test|tests|spec|specs|testdata|fixtures|vendor|node_modules|.git) exit 0 ;;
+  esac
+done
+case "$base" in
+  *_test.*|*.test.*|*.spec.*|*_spec.*) exit 0 ;;
+esac
+
+# ── Trigger (segment-exact demos/examples, or basename prefix) ───────────────
+fire=0
+for seg in "${segs[@]:-}"; do
+  case "$seg" in
+    demos|examples) fire=1; break ;;
+  esac
+done
+if [ "$fire" -eq 0 ]; then
+  case "$base" in
+    demo*|example*|showcase*|quickstart*) fire=1 ;;
+  esac
+fi
+[ "$fire" -eq 0 ] && exit 0
+
+# ── Session-scoped dedup ─────────────────────────────────────────────────────
+# PreToolUse payloads carry transcript_path, NOT session_id (cf.
+# hooks/pre-tool-scope-guard); derive the session key the same way.
+transcript_path=$(printf '%s' "$hook_input" | jq -r '.transcript_path // empty' 2>/dev/null || true)
+session_key=""
+[ -n "$transcript_path" ] && session_key=$(basename "$transcript_path" 2>/dev/null || echo "")
+
+cwd_dir=$(printf '%s' "$hook_input" | jq -r '.cwd // empty' 2>/dev/null || true)
+[ -z "$cwd_dir" ] && cwd_dir="${PWD}"
+
+dedup_key=""
+if command -v sha256sum >/dev/null 2>&1; then
+  dedup_key=$(printf '%s' "${session_key}:${file_path}" | sha256sum 2>/dev/null | cut -d' ' -f1 || true)
+elif command -v shasum >/dev/null 2>&1; then
+  dedup_key=$(printf '%s' "${session_key}:${file_path}" | shasum -a 256 2>/dev/null | cut -d' ' -f1 || true)
+fi
+
+state_dir="${cwd_dir}/.claude/autodev-state"
+state_file="${state_dir}/demo-fidelity-seen"
+
+if [ -n "$dedup_key" ]; then
+  # Already nudged this session for this path → stay silent.
+  # (grep used as an `if` condition; errexit does not trip on conditions, so it
+  # is NOT wrapped in `|| true` — wrapping it would force the condition true.)
+  if [ -f "$state_file" ] && grep -qxF "$dedup_key" "$state_file" 2>/dev/null; then
+    exit 0
+  fi
+  # Record. Fail-OPEN: any state I/O failure must fall through to EMIT, never
+  # suppress. Guarded so `set -euo pipefail` cannot fail-CLOSED on an unwritable
+  # state dir, and so a failed `>>` redirection cannot leak to stderr (the
+  # group redirect applies before the inner append is attempted).
+  if mkdir -p "$state_dir" 2>/dev/null; then
+    { printf '%s\n' "$dedup_key" >> "$state_file"; } 2>/dev/null || true
+  fi
+fi
+
+reminder=$(cat <<'REMINDER'
+<IMPORTANT>
+You appear to be writing a demonstration/example artifact. A demo MUST execute the
+real artifact and show its actual output. Do NOT reimplement the logic, hard-code
+the output, or stub the thing being demonstrated. Substituting a *dependency* at a
+real interface seam is allowed only if disclosed. See autodev:demonstration-fidelity.
+</IMPORTANT>
+REMINDER
+)
+
+emit_additional_context() {
+    local event_name="$1"
+    local context="$2"
+    jq -n --arg event "$event_name" --arg context "$context" \
+        '{hookSpecificOutput:{hookEventName:$event,additionalContext:$context}}'
+}
+
+emit_additional_context "PreToolUse" "$reminder"
+exit 0
diff --git a/skills/demonstration-fidelity/SKILL.md b/skills/demonstration-fidelity/SKILL.md
@@ -0,0 +1,92 @@
+---
+name: demonstration-fidelity
+description: Use when creating a demo, example, quickstart, showcase, sample, or any artifact meant to prove an implementation works — before writing it. Triggers when about to "show it working", build a proof-of-concept, or generate sample output, especially under time pressure or when the real code is awkward to run. Catches fake demos that reimplement the logic, hard-code the output, or rewrite it in another language instead of executing the real artifact.
+---
+
+> Condensed format: load `autodev:condensed-pipeline-writing` to expand shorthand.
+
+# Demonstration Fidelity
+
+## Iron Law
+
+**A demonstration must execute the real artifact, and the output it shows must be produced by that execution.**
+
+A demo, example, quickstart, showcase, screenshot, or "here's it working" proof is a *claim that the code works*. If it doesn't run the real code, the claim is fabricated — however convincing the output looks. This operationalizes `autodev:verification-before-completion` for demo artifacts and is a sibling of `autodev:runtime-launch-validation`.
+
+## Forbidden — regardless of language
+
+- **Reimplementation** — re-coding the artifact's logic in the demo instead of calling it.
+- **Hard-coded output** — hand-authoring "expected" output and presenting it as produced output.
+- **Stubbing the artifact-under-demonstration** — wiring the demo to a fake *in place of the thing you are demonstrating*.
+- **Detached prototype** — a parallel throwaway instead of the shipped entry point.
+
+These prove nothing. They are fake code.
+
+## Allowed — with disclosure
+
+Substituting a **dependency** at a **real interface seam** (data store, external service, clock) so the demo runs locally — **provided** the artifact's own code path runs unchanged (you stubbed a *dependency*, not the artifact) **and** you state it plainly ("data source is an in-memory fixture; the handler is the real one"). This is the `autodev:runtime-launch-validation` posture (ephemeral DB row; Fall-back section). Disclosed seam-substitution is honest; faking the artifact is not.
+
+## Fidelity, not language sameness
+
+Cross-language is **not** the crime. A real client in another language crossing a **real interface** into the running artifact — e.g. a Python client making real HTTP calls to a running Go service — is valid, as long as that crossing is exercised (no stub on either end of *that* boundary). The question is always **"did the real code run to produce this output?"** — never "same language?".
+
+## The 3-question fidelity test
+
+1. **Execution:** does the demo call/import/invoke the real artifact — not a copy of it?
+2. **Provenance:** was every value shown produced by that run and captured — not typed by you?
+3. **Seams:** if you substituted anything, was it a *dependency* (not the artifact), and did you disclose it?
+
+Any "no" → the demo is fake. Fix it before presenting.
+
+## Example — fake vs. faithful
+
+Artifact: Go `text.Dedupe(s string) string`.
+
+**Fake** (different language, hard-coded — proves nothing):
+
+```python
+# demo.py — DO NOT DO THIS
+print("BEFORE:\n a\n a\n b")
+print("AFTER:\n a\n b")   # hand-typed; Dedupe never ran
+```
+
+**Faithful** (runs the real function, prints its real return value):
+
+```go
+// demo/main.go
+package main
+
+import ("fmt"; "example.com/app/text")
+
+func main() {
+    in := "a\n a\n b"
+    fmt.Printf("AFTER:\n%s\n", text.Dedupe(in)) // real output, captured by running it
+}
+```
+
+If the module tooling is awkward, sidestep the *tooling* (throwaway module, ephemeral dependency) — never sidestep *execution*.
+
+## Rationalizations — STOP
+
+| Excuse | Reality |
+|---|---|
+| "Build/DB tooling is finicky — I'll just print the expected output." | Sidestep the tooling, not the execution. A throwaway module / in-memory dependency runs the real code; printed literals run nothing. |
+| "A hard-coded demo looks identical on screen." | Looking identical is the trap. The value of a demo is that the real code produced it. |
+| "Quicker to rewrite it in Python/bash for the demo." | Fine only if that script actually calls/crosses into the real artifact. A script printing literals is fake in any language. |
+| "The real thing needs a DB/service I can't stand up." | Substitute the *dependency* at a real seam and disclose it; run the real artifact. Never fake the artifact. |
+| "It's just for the meeting / illustrative." | A demo presented as proof is a claim — `autodev:verification-before-completion` applies. |
+| "I simplified the logic for clarity." | A simplified reimplementation is a different program. Demo the real one. |
+
+## Red flags
+
+- The demo imports nothing from the module under demonstration.
+- You typed or pasted the "output" instead of capturing a run.
+- The demo is in another language and never crosses a real interface into the artifact.
+- "Simulated" / "for demonstration purposes" / "pretend" appears in the demo.
+- You have not actually run it and watched the output.
+
+## See also
+
+- `autodev:verification-before-completion` — evidence before any "works/done" claim (its claim matrix has a `demo/example works` row).
+- `autodev:runtime-launch-validation` — launch the built artifact; its "Demonstration / example / showcase" change-class row points here.
+- `autodev:scope-lock` — "there is no demo mode" for *partial scope* (distinct from fidelity).
diff --git a/skills/finishing-a-development-branch/SKILL.md b/skills/finishing-a-development-branch/SKILL.md
@@ -112,6 +112,8 @@ If NOT triggered (pure logic refactor, doc-only, test-only): skip this step.
 
 **The launch transcript is required in the PR body when this step triggers.** Without it, the PR is not ready for merge — even if all unit tests pass.
 
+**Demonstration artifacts:** if the change ships any demo/example/showcase/quickstart artifact (in this diff or the PR body), `autodev:demonstration-fidelity` applies — confirm the demo executes the real artifact (no reimplementation, hard-coded output, or different-language fake) before merge.
+
 ### Step 1c: Version-Skew Audit (conditional)
 
 **Trigger:** the diff updates a non-dev-only version pin (any "version: vX.Y.Z", "image: foo:vX.Y.Z", or `<package>@vX.Y.Z`) — excludes dev-only tooling pins (linters, formatters) where skew is generally benign.

diff --git a/skills/runtime-launch-validation/SKILL.md b/skills/runtime-launch-validation/SKILL.md
@@ -45,6 +45,9 @@ Triggered NOT by:
 | Library / SDK | Import into a tiny consumer program, exercise the new public surface | Output, behavior matches docs |
 | Plugin / extension | Load it into the host application, exercise a representative call | Host doesn't crash on load; representative call returns |
 | Interface boundary change (new method, field, event type, or hook — see `agents/boundary-classes.md` for the canonical boundary-class list) | Launch both sides/participants as applicable; exercise a real interaction across the boundary — not a mock or stub on either end | The receiving side correctly processes the new data/method/event/hook; no fallback silently swallows the new path; failure-signature scrape clean on all participating sides |
+| Demonstration / example / showcase artifact (anything built to show a change working) | The real artifact, invoked through its real entry point; capture output from that run | Output is produced by the real code path, not literals; the artifact-under-demonstration is NOT stubbed; any substituted *dependency* sits behind a real interface seam and is disclosed. See `autodev:demonstration-fidelity`. |
+
+When a demonstration *also* exercises a new boundary, both this row and the "Interface boundary change" row apply: stub neither the artifact nor the boundary under test — only a disclosed *dependency* behind the artifact may be substituted.
 
 ## Failure-signature scrape
 
@@ -95,6 +98,7 @@ The constraint is not an excuse to skip; it's a request for help.
 ## See also
 
 - `skills/verification-before-completion/SKILL.md` — general evidence-before-assertion principle
+- `autodev:demonstration-fidelity` — demo/example/showcase artifacts must execute the real artifact (the "Demonstration" change-class row above)
 - `skills/finishing-a-development-branch/SKILL.md` — Step 1b invokes this skill
 - `skills/writing-plans/SKILL.md` — related planning guidance for per-change-class verification
 - `agents/boundary-classes.md` — canonical definition of interface boundary classes (producer→consumer, caller→callee, sender→handler, plugin→host)
diff --git a/skills/using-autodev/SKILL.md b/skills/using-autodev/SKILL.md
@@ -83,7 +83,7 @@ When multiple skills could apply, use this order:
 3. **Pipeline skills auto-chain** — these invoke each other automatically in the autonomous pipeline:
    brainstorming → adversarial-design-review (design phase) → writing-plans → adversarial-design-review (plan phase) → alignment-check → **scope-lock** → subagent-driven-development → finishing-a-development-branch → pr-monitoring → post-merge-retrospective
 
-   Cross-cutting skills invoked from within the pipeline when conditions trigger: `project-design-guidance` (before designs/plans and during retros when durable guidance changes); `recording-decisions` (when designs/plans make non-trivial trade-offs, including user-approved manifest amendments); `scope-lock` (re-checked at every per-task checkpoint and before PR creation); `condensed-pipeline-writing` (for dense internal design/review/plan artifacts).
+   Cross-cutting skills invoked from within the pipeline when conditions trigger: `project-design-guidance` (before designs/plans and during retros when durable guidance changes); `recording-decisions` (when designs/plans make non-trivial trade-offs, including user-approved manifest amendments); `scope-lock` (re-checked at every per-task checkpoint and before PR creation); `condensed-pipeline-writing` (for dense internal design/review/plan artifacts); `demonstration-fidelity` (before writing any demo/example/showcase/proof artifact — it must execute the real code, not fake it).
 
 "Let's build X" → brainstorming first, then the pipeline runs autonomously after design approval.
 "Fix this bug" → debugging first, then domain-specific skills.

diff --git a/skills/verification-before-completion/SKILL.md b/skills/verification-before-completion/SKILL.md
@@ -37,6 +37,7 @@ Skip step = unverified claim.
 | agent completed | inspect diff + verify | agent report |
 | requirements met | checklist vs plan/design | tests alone |
 | lint clean (Go-repo PR) | `golangci-lint run` exit 0 | tests green alone |
+| demo/example works | the real artifact executed via the demo produced the shown output (see `autodev:demonstration-fidelity`) | hand-written/hard-coded output, a reimplementation, a different-language fake |
 
 ## Red Flags
Original file line number	Diff line number	Diff line change
		@@ -0,0 +1 @@
		661d9faf234fdc5e4c8e2de72c6e4db95a0af91c49c47334dd5580ee079cc00f