feat: demonstration-fidelity skill + advisory hook (v6.2.0) #50
Merged
added 12 commits
May 29, 2026 07:31
…hten heuristic per adversarial review
…le RLV no-stub carve-out
…gate, exact hooks.json JSON per plan review
Advisory PreToolUse guard (never blocks) that nudges agents writing a demo/example artifact toward demonstration-fidelity: execute the real artifact, no reimplementation/hard-coded-output/artifact-stub.
- Anchored heuristic: segment-exact demos/examples or basename prefix demo*/example*/showcase*/quickstart*; excludes test/spec/testdata/fixtures/vendor segments + *_test.*/*.test.*/*.spec.* basenames; path lowercased for case-insensitive match.
- Session dedup keyed on basename(transcript_path) (PreToolUse payloads carry no session_id); fail-open = fire on state I/O failure.
- 22 hook-contract assertions; manual launch transcript clean.
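A minimal sketch of the session dedup and fail-open behaviour this commit describes, not the hook's actual code. The function name `should_nudge`, the `STATE_DIR` location, and the marker-file layout are assumptions made for illustration; only the dedup key (the basename of `transcript_path`) and the fail-open rule come from the commit message.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: nudge at most once per session, fail open on state errors.
set -u

# STATE_DIR is an assumed location for dedup markers, not the hook's real path.
STATE_DIR="${STATE_DIR:-${TMPDIR:-/tmp}/demo-fidelity-state}"

# should_nudge TRANSCRIPT_PATH
# Prints "fire" the first time a session is seen (or when state I/O fails),
# and "silent" on later calls for the same session.
should_nudge() {
  local transcript_path="$1" key marker
  # PreToolUse payloads carry no session_id, so key on the transcript basename.
  key="$(basename "$transcript_path")"
  marker="$STATE_DIR/$key.nudged"

  # Fail open: if the state directory cannot be created, fire anyway.
  if ! mkdir -p "$STATE_DIR" 2>/dev/null; then
    echo "fire"
    return 0
  fi

  # Already nudged in this session: stay quiet.
  if [ -e "$marker" ]; then
    echo "silent"
    return 0
  fi

  # Fail open: a failed marker write still fires (it just cannot dedup later).
  { : > "$marker"; } 2>/dev/null || true
  echo "fire"
}
```

With a writable state directory, two calls that share a transcript basename print `fire` and then `silent`; pointing `STATE_DIR` at an uncreatable path makes every call print `fire`, which is the fail-open direction an advisory hook wants.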
Load-bearing, host-neutral skill: a demo MUST execute the real artifact; output shown MUST be produced by that run. Forbids reimplementation, hard-coded output, artifact-stubbing, detached prototypes, regardless of language. Allows disclosed dependency-seam substitution. Fidelity-not-language-sameness nuance + 3-question test + fake/faithful example + rationalization table seeded from RED baselines.
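To make the fake/faithful distinction concrete, here is a small sketch. The `slugify.sh` artifact and both demo functions are hypothetical stand-ins, not taken from the skill file: `fake_demo` prints a hand-typed result without running anything, while `faithful_demo` executes the real artifact and shows only what that run produced.

```bash
#!/usr/bin/env bash
set -u

# Hypothetical "real artifact": a tiny script written to a temp dir so the
# sketch is self-contained. In practice this is the code being demonstrated.
ARTIFACT_DIR="$(mktemp -d)"
ARTIFACT="$ARTIFACT_DIR/slugify.sh"
cat > "$ARTIFACT" <<'EOF'
# Lowercase the input and replace runs of non-alphanumerics with one dash.
printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs 'a-z0-9' '-'
echo
EOF

# Fake demo: never runs the artifact; the "output" is typed in by hand.
fake_demo() {
  echo "slugify 'Hello World' => hello-world"
}

# Faithful demo: executes the real artifact and prints what that run returned.
faithful_demo() {
  local out
  out="$(bash "$ARTIFACT" 'Hello World')" || {
    echo "artifact failed" >&2
    return 1
  }
  echo "slugify 'Hello World' => $out"
}
```

Both functions print the same line while the artifact works. Only `faithful_demo` changes (or fails) when the artifact is broken, which is why the hard-coded variant proves nothing about the code it claims to demonstrate.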
…/using-autodev/README/coverage
- RLV: new 'Demonstration / example / showcase' change-class row (carves out artifact-stub forbidden vs disclosed dependency-seam allowed) + See also.
- verification-before-completion: 'demo/example works' claim-matrix row.
- finishing Step 1b: demo-artifact note.
- using-autodev: demonstration-fidelity in cross-cutting skills list.
- README Skills Library + cross-llm-coverage host-neutral row.
…eview I-1/M-1/M-2)
- pretool-demo-fidelity-guard: add *_spec.* to exclusion suffixes so RSpec
spec files (examples/widget_spec.rb) don't spuriously fire; guard seg loops
with ${segs[@]:-} for bash 3.2 set -u safety.
- hook-contracts: add widget_spec.rb + .spec.ts silent cases.
- runtime-launch-validation: one-clause note on demo+boundary overlap.
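The path heuristic, with the `*_spec.*` exclusion and the `${segs[@]:-}` guard from this commit, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the hook's source: the function name `classify_path`, the `fire`/`silent` output strings, and the choice to check exclusions before positive matches are all hypothetical.

```bash
#!/usr/bin/env bash
set -u

# classify_path PATH -> prints "fire" (nudge) or "silent".
classify_path() {
  local path base seg
  local -a segs
  # Lowercase with tr; bash 3.2 has no ${var,,} expansion.
  path="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  base="${path##*/}"

  # Split on "/" into segments; the last element is the basename.
  IFS='/' read -r -a segs <<< "$path"

  # Exclusion 1: test-like basenames never fire.
  case "$base" in
    *_test.*|*.test.*|*.spec.*|*_spec.*) echo "silent"; return 0 ;;
  esac
  # Exclusion 2: any excluded segment silences the whole path.
  # ${segs[@]:-} keeps an empty array safe under `set -u` on bash 3.2.
  for seg in "${segs[@]:-}"; do
    case "$seg" in
      test|spec|testdata|fixtures|vendor) echo "silent"; return 0 ;;
    esac
  done

  # Positive match 1: basename prefix.
  case "$base" in
    demo*|example*|showcase*|quickstart*) echo "fire"; return 0 ;;
  esac
  # Positive match 2: a segment named exactly demos or examples.
  for seg in "${segs[@]:-}"; do
    case "$seg" in
      demos|examples) echo "fire"; return 0 ;;
    esac
  done

  echo "silent"
}
```

For example, `classify_path examples/widget.py` prints `fire`, while `classify_path examples/widget_spec.rb` prints `silent` because the basename exclusion is checked first.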
intel352 added a commit that referenced this pull request May 29, 2026
* docs(retro): post-merge retro for demonstration-fidelity (v6.2.0, PR #50)
* chore(scope-lock): mark demonstration-fidelity Complete; remove lock sidecar

---------

Co-authored-by: Jon Langevin <jon@gocodealone.com>
Summary
Closes a verification-theater gap reported in production: an agent writes real code, then "demonstrates" it with a demo that never executes the real artifact, instead reimplementing the logic, hard-coding the output, or rewriting it in another language. The demo proves nothing yet is presented as proof. This adds a harness-agnostic `demonstration-fidelity` skill (the load-bearing, all-harness layer), an advisory write-time hook backstop (Claude/Codex/Cursor), and pipeline wiring.

Built by dogfooding the full autodev pipeline: brainstorming → 3× design adversarial review → writing-plans → 2× plan adversarial review → alignment-check → scope-lock → TDD execution → code review.
Design / Plan
- `docs/plans/2026-05-29-demonstration-fidelity-design.md` (adversarial-review PASS rev3)
- `docs/plans/2026-05-29-demonstration-fidelity.md` (adversarial-review PASS rev2; alignment PASS; Locked)

Scope Manifest
Changes
- `skills/demonstration-fidelity/SKILL.md` (host-neutral): a demo MUST execute the real artifact; output MUST be produced by that run. Forbids reimplementation / hard-coded output / artifact-stubbing / detached prototype, regardless of language. Allows disclosed dependency-seam substitution. "Fidelity, not language sameness" nuance + 3-question test + fake-vs-faithful example + rationalization table seeded from RED baselines.
- `hooks/pretool-demo-fidelity-guard` (advisory, NEVER blocks): demo-path nudge. Anchored heuristic (segment `demos`/`examples` or basename prefix `demo*`/`example*`/`showcase*`/`quickstart*`; excludes `test`/`spec`/`testdata`/`fixtures`/`vendor` segments + `*_test.*`/`*.spec.*`/`*_spec.*` basenames; lowercased). Session dedup on `basename(transcript_path)` (PreToolUse has no `session_id`); fails open (fires) on state I/O error; honors `SUPERPOWERS_HOOKS_DISABLE=1`.
- `verification-before-completion` `demo/example works` claim-matrix row; `finishing` Step 1b note; `using-autodev` cross-cutting listing; README + `cross-llm-coverage` rows.

Runtime launch transcript (Step 1b: `hooks.json` = plugin-loading-path)
Test evidence
- `tests/hook-contracts.sh`: PASS (22 new demo-fidelity assertions: fires/silent/excluded incl. `example_test.go`/`testdata/`/`widget_spec.rb`/`.spec.ts`/dedup/fail-open/disable-env/malformed-stdin/never-blocks)
- `tests/skill-content-grep.sh`: PASS (host-neutral)
- `tests/skill-cross-refs.sh`: PASS
- `tests/version-check.sh`: PASS (all manifests 6.2.0)
- `tests/plan-scope-check.sh --plan --verify-lock`: PASS (lock intact)
- … `Store` dependency, and printed the disclosure; no hard-coding.

Review
- … `session_id`→`transcript_path`, + 5 Importants resolved).
- … `*_spec.rb`) + Minors M-1/M-2 fixed in-PR.

Generated by the implementing agent (autodev pipeline dogfood).