Skip to content

feat: demonstration-fidelity skill + advisory hook (v6.2.0)#50

Merged
intel352 merged 12 commits into
mainfrom
feat/demonstration-fidelity-2026-05-29T1128
May 29, 2026
Merged

feat: demonstration-fidelity skill + advisory hook (v6.2.0)#50
intel352 merged 12 commits into
mainfrom
feat/demonstration-fidelity-2026-05-29T1128

Conversation

@intel352
Copy link
Copy Markdown
Contributor

@intel352 intel352 commented May 29, 2026

Summary

Closes a verification-theater gap reported in production: an agent writes real code, then "demonstrates" it with a demo that never executes the real artifact — reimplementing the logic, hard-coding the output, or rewriting it in another language. The demo proves nothing yet is presented as proof. This adds a harness-agnostic demonstration-fidelity skill (the load-bearing, all-harness layer) + an advisory write-time hook backstop (Claude/Codex/Cursor) + pipeline wiring.

Built by dogfooding the full autodev pipeline: brainstorming → 3× design adversarial review → writing-plans → 2× plan adversarial review → alignment-check → scope-lock → TDD execution → code review.

Design / Plan

  • Design: docs/plans/2026-05-29-demonstration-fidelity-design.md (adversarial-review PASS rev3)
  • Plan: docs/plans/2026-05-29-demonstration-fidelity.md (adversarial-review PASS rev2; alignment PASS; Locked)

Scope Manifest

  • PR Count: 1 · Tasks: 8 · Status: Locked 2026-05-29T11:52:19Z
  • This PR = the sole grouping row (Tasks 1–8).
  • Out of scope: general anti-fabrication skill; blocking Stop interceptor; OpenCode per-tool hook port; user CLAUDE.md/AGENTS.md edits.

Changes

  • skills/demonstration-fidelity/SKILL.md (host-neutral): a demo MUST execute the real artifact; output MUST be produced by that run. Forbids reimplementation / hard-coded output / artifact-stubbing / detached prototype — regardless of language. Allows disclosed dependency-seam substitution. "Fidelity, not language sameness" nuance + 3-question test + fake-vs-faithful example + rationalization table seeded from RED baselines.
  • hooks/pretool-demo-fidelity-guard (advisory, NEVER blocks): demo-path nudge. Anchored heuristic (segment demos/examples or basename prefix demo*/example*/showcase*/quickstart*; excludes test/spec/testdata/fixtures/vendor segments + *_test.*/*.spec.*/*_spec.* basenames; lowercased). Session dedup on basename(transcript_path) (PreToolUse has no session_id); fails open (fires) on state I/O error; honors SUPERPOWERS_HOOKS_DISABLE=1.
  • Wiring: RLV "Demonstration" change-class row (carves artifact-stub-forbidden vs disclosed-dependency-seam-allowed — no contradiction with "no stub on either end"); verification-before-completion demo/example works claim-matrix row; finishing Step 1b note; using-autodev cross-cutting listing; README + cross-llm-coverage rows.
  • Version: 6.1.5 → 6.2.0 (all three manifests).

Runtime launch transcript (Step 1b — hooks.json = plugin-loading-path)

Launch: fires on demo path
$ printf '{"tool_name":"Write","tool_input":{"file_path":"examples/demo_main.go"},"cwd":"<repo>","transcript_path":"/tmp/t.jsonl"}' | bash hooks/pretool-demo-fidelity-guard
{"hookSpecificOutput":{"hookEventName":"PreToolUse","additionalContext":"<IMPORTANT>... See autodev:demonstration-fidelity ...</IMPORTANT>"}}

Launch: silent on normal path
$ printf '{"tool_name":"Write","tool_input":{"file_path":"internal/server.go"},...}' | bash hooks/pretool-demo-fidelity-guard
(no output — silent)

Failure-signature scrape: clean
Verdict: PASS

Test evidence

  • tests/hook-contracts.sh: PASS (22 new demo-fidelity assertions — fires/silent/excluded incl. example_test.go/testdata//widget_spec.rb/.spec.ts/dedup/fail-open/disable-env/malformed-stdin/never-blocks)
  • tests/skill-content-grep.sh: PASS (host-neutral)
  • tests/skill-cross-refs.sh: PASS
  • tests/version-check.sh: PASS (all manifests 6.2.0)
  • tests/plan-scope-check.sh --plan --verify-lock: PASS (lock intact)
  • GREEN behavioral test: subagent under fake-demo pressure WITH the skill ran the real handler, substituted only the Store dependency, and printed the disclosure — no hard-coding.

Review

  • Design adversarial review: PASS (2 Criticals + 5 Importants resolved across cycles).
  • Plan adversarial review: PASS (2 Criticals — incl. session_idtranscript_path — + 5 Importants resolved).
  • Code review: PASS, zero Critical; Important I-1 (RSpec *_spec.rb) + Minors M-1/M-2 fixed in-PR.

Generated by the implementing agent (autodev pipeline dogfood).

Jon Langevin added 12 commits May 29, 2026 07:31
…gate, exact hooks.json JSON per plan review
Advisory PreToolUse guard (never blocks) that nudges agents writing a
demo/example artifact toward demonstration-fidelity: execute the real
artifact, no reimplementation/hard-coded-output/artifact-stub.

- Anchored heuristic: segment-exact demos/examples or basename prefix
  demo*/example*/showcase*/quickstart*; excludes test/spec/testdata/
  fixtures/vendor segments + *_test.*/*.test.*/*.spec.* basenames; path
  lowercased for case-insensitive match.
- Session dedup keyed on basename(transcript_path) (PreToolUse payloads
  carry no session_id); fail-open = fire on state I/O failure.
- 22 hook-contract assertions; manual launch transcript clean.
Load-bearing, host-neutral skill: a demo MUST execute the real artifact;
output shown MUST be produced by that run. Forbids reimplementation,
hard-coded output, artifact-stubbing, detached prototypes — regardless of
language. Allows disclosed dependency-seam substitution. Fidelity-not-
language-sameness nuance + 3-question test + fake/faithful example +
rationalization table seeded from RED baselines.
…/using-autodev/README/coverage

- RLV: new 'Demonstration / example / showcase' change-class row (carves out
  artifact-stub forbidden vs disclosed dependency-seam allowed) + See also.
- verification-before-completion: 'demo/example works' claim-matrix row.
- finishing Step 1b: demo-artifact note.
- using-autodev: demonstration-fidelity in cross-cutting skills list.
- README Skills Library + cross-llm-coverage host-neutral row.
…eview I-1/M-1/M-2)

- pretool-demo-fidelity-guard: add *_spec.* to exclusion suffixes so RSpec
  spec files (examples/widget_spec.rb) don't spuriously fire; guard seg loops
  with ${segs[@]:-} for bash 3.2 set -u safety.
- hook-contracts: add widget_spec.rb + .spec.ts silent cases.
- runtime-launch-validation: one-clause note on demo+boundary overlap.
@intel352 intel352 merged commit d3142d7 into main May 29, 2026
7 checks passed
@intel352 intel352 deleted the feat/demonstration-fidelity-2026-05-29T1128 branch May 29, 2026 12:47
intel352 added a commit that referenced this pull request May 29, 2026
* docs(retro): post-merge retro for demonstration-fidelity (v6.2.0, PR #50)

* chore(scope-lock): mark demonstration-fidelity Complete; remove lock sidecar

---------

Co-authored-by: Jon Langevin <jon@gocodealone.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant