Estate audit — Wave 4: DYADT, post-action agent-claim verification (rein in LLMs) by hyperpolymath · Pull Request #457 · hyperpolymath/standards

hyperpolymath · 2026-07-03T01:48:26Z

Context

Follow-up to #453 (Waves 0–1) and #454 (Wave 3), both merged. This wave builds the piece you asked for directly — a system to rein in LLMs: a standard that checks, mechanically, that an agent's claimed outcomes actually happened.

The estate already gates what an agent may do before it acts (gatekeeper → AGENTIC → contractiles). It had nothing that takes an agent's asserted outcome and confirms it after. Every false-green hole this program has fixed is a special case of one disease: a claim was trusted instead of verified. DYADT is the missing Tier 4.

The four-tier accountability pipeline

| Tier | Governs | Home |
|---|---|---|
| 1. Admission | must read the manifest before acting | `0-ai-gatekeeper-protocol/` |
| 2. Pre-action | entropy budgets, intent, confirmation | `agentic-a2ml/` |
| 3. In-session gates | contractile MUST/TRUST/… at close/push/merge | `contractiles/` |
| 4. Post-action (this PR) | claimed X → mechanically confirm/refute X | `did-you-actually-do-that/` |

What's here

Spec set (did-you-actually-do-that/): README (pipeline binding) · spec/CLAIM-FORMAT.adoc (typed claims) · spec/VERIFICATION-PROTOCOL.adoc (the confirmed/refuted/unverifiable contract — unverifiable is loud, never green; a verifier must re-derive evidence, never read back the agent's own evidence field) · spec/CONSEQUENCE-LEDGER.adoc (append-only, dual-signed, per-actor confirmation rate that Tier-3 MAY gate on) · spec/conformance/ (6 executable vectors + runner) · docs/NAMING-RESOLUTION.adoc (resolves the PLASMA collision).
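To make the "unverifiable is loud, never green" rule concrete, here is a minimal Python sketch of how a runner might fold per-claim verdicts into a CI exit status. The three verdict names come from the spec; the `exit_code` helper, the specific exit codes, and the treatment of an empty claim set are assumptions of this sketch, not something the reference verifier is confirmed to do.

```python
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    REFUTED = "refuted"
    UNVERIFIABLE = "unverifiable"


def exit_code(verdicts):
    """Return 0 only when every claim was confirmed.

    Assumptions of this sketch: 1 means at least one claim was refuted,
    2 means nothing was refuted but something could not be verified.
    An empty claim set also returns 2, so "nothing checked" is never green.
    """
    verdicts = list(verdicts)
    if not verdicts:
        return 2
    if any(v is Verdict.REFUTED for v in verdicts):
        return 1
    if any(v is Verdict.UNVERIFIABLE for v in verdicts):
        return 2
    return 0


print(exit_code([Verdict.CONFIRMED, Verdict.UNVERIFIABLE]))  # 2
```

The point of the separate code for `unverifiable` is that a caller cannot mistake "could not check" for "checked and fine".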

Executable + dogfooded:

- scripts/verify-claims.sh — reference verifier (local verifiers real; network/manual return unverifiable).
- Root CLAIMS.a2ml — 7 claims about this very change, re-derived from primary evidence; dyadt-verify.yml runs the verifier + conformance suite in CI. If a claim here were false, CI refutes it and fails. The spec's first conformance run is on itself.
- scripts/tests/wave4-dyadt-test.sh (7/7) — proves a false claim is REFUTED despite an honest-sounding statement, and the incompatible-verifier + manual-only guards fire.
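The "re-derive, never read back" rule can be illustrated with a small Python sketch. It is not the reference verifier (that is a shell script); `verify_contains` and `demo` are hypothetical names, the claim is modelled as a plain dict, and treating a missing file as `unverifiable` is an assumption. The verdict depends only on what is actually on disk, so an honest-sounding `statement` or a self-reported `evidence` field cannot rescue a false claim.

```python
import re
import tempfile
from pathlib import Path


def verify_contains(claim, root):
    """Check a 'file contains pattern' claim by reading the file itself.

    The claim's own 'statement' and 'evidence' fields are deliberately ignored.
    """
    path = Path(root) / claim["target"]
    if not path.is_file():
        # Assumption: if the evidence cannot be collected, do not guess.
        return "unverifiable"
    text = path.read_text(encoding="utf-8", errors="replace")
    return "confirmed" if re.search(claim["expect"], text) else "refuted"


def demo():
    with tempfile.TemporaryDirectory() as root:
        Path(root, "README.adoc").write_text(
            "Tier 4: post-action verification\n", encoding="utf-8"
        )
        true_claim = {"target": "README.adoc", "expect": "post-action"}
        false_claim = {
            "target": "README.adoc",
            "expect": "rate limiting",
            "statement": "Documented rate limiting; double-checked.",
            "evidence": "grep matched on line 12",  # never consulted
        }
        return verify_contains(true_claim, root), verify_contains(false_claim, root)


print(demo())  # ('confirmed', 'refuted')
```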

Registered + graded: added to build-registry.sh (32 specs); honest scorecard (5/5 MUST met, 90% systems coverage — the network verifier is an honest fail, since only the production impl does forge/CI APIs).

Boundary

This repo is the declaration layer: it ships the normative spec + a reference verifier + the dogfood. The production actuator (continuous, in-session, wired to hypatia/gitbot-fleet with real ledger enforcement) is chartered for hyperpolymath/did-you-actually-do-that, built against these conformance vectors — it MUST NOT diverge from this contract. That's the parallel session you flagged.

Licence/SPDX is manual-only end-to-end (flag-only policy — a licence claim is always unverifiable: manual-only).

Coming next (same track)

Wave 5 — AffineScript testing standard + template; Wave 6 — campaign issues (cross-linking #426/#451/#437/#446) + release hygiene.

🤖 Generated with Claude Code

The estate gated what an agent may do BEFORE it acts (gatekeeper, AGENTIC, contractiles) but had no stage that checks, mechanically, that an agent's CLAIMED outcomes actually happened. DYADT is that missing Tier 4: it takes an agent's asserted outcomes and confirms/refutes each against primary evidence, never trusting the agent's own narration.

New registered spec `did-you-actually-do-that/` (governance stream):
- README.adoc: the four-tier accountability pipeline (admission → pre-action → in-session gates → post-action verification).
- spec/CLAIM-FORMAT.adoc: typed claims (claim_class, target, expect, verifier) + example CLAIMS.a2ml. Licence claims are always manual-only.
- spec/VERIFICATION-PROTOCOL.adoc: the verifier taxonomy and the confirmed|refuted|unverifiable verdicts. unverifiable is loud, never green. A verifier must RE-DERIVE evidence, never read back the agent's evidence field.
- spec/CONSEQUENCE-LEDGER.adoc: append-only, dual-signed ledger + per-actor confirmation rate that Tier-3 contractiles MAY gate on.
- spec/conformance/: 6 executable vectors (+ runner) covering every verdict class: the shared ground truth the production verifier is TDD'd against.
- docs/NAMING-RESOLUTION.adoc + CANONICAL-NAMES.adoc: resolve the PLASMA collision (PLASMA = licence/exactness only; claim verification = DYADT).

Executable + dogfooded:
- scripts/verify-claims.sh: reference verifier (git-diff / command-transcript / claims-compose local verifiers; network + manual return unverifiable).
- Root CLAIMS.a2ml: 7 claims about THIS change, all re-derived from primary evidence; .github/workflows/dyadt-verify.yml runs the verifier + conformance on push/PR. If a claim here were false, CI refutes it and fails loudly.
- scripts/tests/wave4-dyadt-test.sh (7/7): proves a false claim is REFUTED and the incompatible-verifier + manual-only guards fire.
- Registered in build-registry.sh (32 specs); honest scorecard added (5/5 MUST met, 90% systems coverage; the network verifier is an honest fail).

Production actuator (continuous, wired to hypatia/gitbot-fleet, real ledger enforcement) is chartered for hyperpolymath/did-you-actually-do-that against these conformance vectors. Licence rows manual-only throughout (flag-only).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0114ps6mY5jAH4Sz
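As an illustration of the per-actor confirmation rate the ledger exposes, the following Python sketch computes it from a list of ledger entries and applies a hypothetical Tier-3 gate. The entry shape (`actor`, `verdict`), the 0.9 threshold, the choice to count `unverifiable` entries in the denominator, and the rule that unknown actors do not pass are all assumptions; signing and append-only storage are out of scope here.

```python
from collections import defaultdict


def confirmation_rates(ledger):
    """Map each actor to confirmed / total over the recorded verdicts."""
    totals = defaultdict(int)
    confirmed = defaultdict(int)
    for entry in ledger:
        totals[entry["actor"]] += 1
        if entry["verdict"] == "confirmed":
            confirmed[entry["actor"]] += 1
    return {actor: confirmed[actor] / totals[actor] for actor in totals}


def may_proceed(rates, actor, threshold=0.9):
    # Hypothetical gate: an actor with no track record is not waved through.
    return rates.get(actor, 0.0) >= threshold


ledger = [
    {"actor": "agent-a", "verdict": "confirmed"},
    {"actor": "agent-a", "verdict": "refuted"},
    {"actor": "agent-b", "verdict": "confirmed"},
]
rates = confirmation_rates(ledger)
print(rates)                          # {'agent-a': 0.5, 'agent-b': 1.0}
print(may_proceed(rates, "agent-a"))  # False
```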

sonarqubecloud · 2026-07-03T01:54:03Z

Quality Gate passed

Issues
47 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

…458)

## Context

Follow-up to #457 (Wave 4, merged), which introduced DYADT, the post-action verifier that checks an agent's *claimed* outcomes actually happened.

Before considering it done, I ran an **adversarial review** of the reference verifier (3 red-team lenses: bypass, verdict-logic, spec-gaps; each finding independently re-verified). It found **real ways to make a false claim pass**, or to make the verifier drop or mis-judge a claim. A claim-checker that can be fooled is worse than none.

This PR closes every confirmed hole **fail-safe** (return `unverifiable`, never a confident wrong verdict) and locks each with a regression assertion **and** a conformance vector.

## Holes closed (`scripts/verify-claims.sh`)

| Bypass found | Now |
|---|---|
| Unresolvable/empty base ref → confident-wrong `created`/`modified`/`deleted` | `unverifiable no-base-ref` |
| `created` confirmed any existing file (untracked build output, etc.) | requires a git-**tracked** file new to the change |
| Missing required field → false-confirm / silent drop | `unverifiable missing-field`; a block with no `id` still appears (no silent drop) |
| Empty / `.*` `expect` → unconditional confirm | `unverifiable empty-pattern`; malformed regex → `bad-regex` |
| `target` = absolute / `..` / symlink → evidence redirection | `unverifiable unsafe-path` |
| `stdout-contains:` matched **stderr** too | matches stdout only (stderr captured separately) |
| `contains:`/`sha256:` on a directory/unreadable file → `refuted` | `unverifiable` |
| Licence claim phrased only in `statement` → auto-confirmed | licence detected in class/target/expect/**statement** → `manual-only` |
| `claims-compose` infinite recursion (cycle / fork bomb) | depth-capped at 8 |
| `not_before` (stale-evidence) unimplemented | present → `unverifiable` (reference collects no timestamps) |
| Parser only accepted `key = "v"` | whitespace-tolerant (`key="v"` too) |

## Spec

`VERIFICATION-PROTOCOL.adoc` gains two normative sections: **Fail-safe requirements** (the exhaustive list of "cannot collect trustworthy evidence → unverifiable") and **Command execution & sandboxing** (`command-transcript` executes `target`; untrusted claims MUST be sandboxed; the reference impl is trusted-input-only and says so).

## Verification

- `scripts/tests/wave4-dyadt-test.sh`: **14/14** (7 new hardening assertions; each proves a specific bypass is now closed).
- `spec/conformance/`: **9/9** vectors (added `missing-field`, `unsafe-path`, `licence-in-statement` so the production verifier must handle them too).
- Dogfood `CLAIMS.a2ml` still all-confirmed; Waves 0/1/3 tests unaffected; registry + dashboard in sync.

Method: 3-lens red-team fan-out → independent re-verification of each finding → fix only the confirmed ones fail-safe. Licence handling stays `manual-only` end-to-end.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

_Generated by [Claude Code](https://claude.ai/code/session_0114ps6mY5jAH4SzbGxeuYjc)_

Co-authored-by: Claude <noreply@anthropic.com>
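Several of the fail-safe rows above share one shape: inspect the claim before collecting any evidence, and return `unverifiable <reason>` instead of guessing. A rough Python sketch of that pre-check follows. The reason strings are taken from the table; the `precheck` function, the `REQUIRED` field list, the assumption that `expect` is a bare regex and `target` a file path, and the use of Python's `re` as the regex engine are all illustrative, not the behaviour of `scripts/verify-claims.sh`.

```python
import re
from pathlib import Path

# Assumed set of required fields; the spec's actual list may differ.
REQUIRED = ("id", "claim_class", "target", "expect", "verifier")


def precheck(claim, root):
    """Return an 'unverifiable <reason>' string, or None if checks pass."""
    if any(field not in claim for field in REQUIRED):
        return "unverifiable missing-field"

    pattern = claim["expect"]
    if pattern.strip() in ("", ".*"):
        return "unverifiable empty-pattern"
    try:
        re.compile(pattern)
    except re.error:
        return "unverifiable bad-regex"

    target = Path(claim["target"])
    if target.is_absolute() or ".." in target.parts:
        return "unverifiable unsafe-path"
    # Reject a symlink at any level, so evidence cannot be redirected.
    probe = Path(root)
    for part in target.parts:
        probe = probe / part
        if probe.is_symlink():
            return "unverifiable unsafe-path"
    return None


claim = {
    "id": "c1",
    "claim_class": "example",   # illustrative value
    "target": "../outside.txt",
    "expect": "Tier 4",
    "verifier": "example",      # illustrative value
}
print(precheck(claim, "."))  # unverifiable unsafe-path
```

Running the pre-check first means a malformed or hostile claim never reaches the code path that could produce `confirmed`.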

…names guard, DYADT residual fix (#459)

Two cohesive commits closing out the estate audit-and-optimization program (umbrella #460). Both fully tested; generated artifacts in sync.

## Commit 1 (Wave 5): per-language testing depth

You flagged this directly: the estate's only per-language testing *depth* was a single Julia guide from **2024** (no MUST/SHOULD, Rust+Julia only) next to a **byte-identical duplicate**.

- `language-testing-standards.md` → **v2.0.0**: RFC-2119 requirements **R1–R9** mapped to the CRG test taxonomy; an anti-theatre rule (no `continue-on-error` on a MUST check; coverage reported-with-artifact, not asserted).
- `templates/language-testing-guide-TEMPLATE.md`: the skeleton every guide follows: requirement-mapping table (tool or **visible** `none`), tools, SHA-pinned CI, and a **mandatory honest "Known gaps"** section.
- `affinescript-testing-guide.md`: your primary language, previously with **zero** testing standard, authored honestly (most SHOULD rows are tracked gaps; R3 notes `affinescript-verify.yml` is advisory). SSOT migrates to `hyperpolymath/affinescript` prospectively.
- `scripts/check-language-guide.sh` (wired into `just validate`) + `wave5-language-guides-test.sh` (7/7). Deleted the duplicate snapshot.

## Commit 2 (Wave 6): guard, DYADT residual fix, licence record

- **DYADT residual (#461):** an adversarial review confirmed 16 bypasses in the Wave-4 verifier; 15 were fixed in #458, and this closes the last: an always-matching `contains:` regex (`.*`, `^`, `$`, …) no longer confirms vacuously (`unverifiable trivial-pattern`). Spec pins the `contains:` dialect to POSIX ERE; conformance vector + assertion added (10 vectors, 15 assertions).
- **Canonical-names guard** (`check-canonical-names.sh`): blocks *reintroduction* of the deprecated names (`6a2`→descriptiles, `agent_instructions`→bot_directives) in **added** diff lines only (chartered bulk migration untouched). Wired into `just validate` + the pre-commit hook; `wave6-canonical-names-test.sh` (4/4).
- **`audits/licence-flags-2026-07.adoc`**: flag-only record. The whole program made no SPDX edits and no auto licence PRs; DYADT treats licence claims as `manual-only` end to end.

## Verification

All six wave suites pass; DYADT conformance 10/10 + dogfood all-confirmed; registry + scorecard dashboard in sync.

## Program status (umbrella #460)

Waves 0/1/3/4 + hardening **merged** (#453, #454, #457, #458). This lands Waves 5 + 6. Remaining estate-wide work is chartered: #461 (verifier residual, **fixed here**), #462 (DYADT production verifier), #463 (per-language guides completion). Licence rows `manual-only` throughout (flag-only policy).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
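The "added diff lines only" behaviour of the canonical-names guard is what lets it block reintroduction without tripping over existing uses. Below is a minimal Python sketch of that idea; the real guard is the shell script `check-canonical-names.sh`, and both the function name `reintroduced_names` and the whole-word matching rule used here are assumptions.

```python
import re

# Deprecated name -> canonical replacement, as listed in the description above.
DEPRECATED = {"6a2": "descriptiles", "agent_instructions": "bot_directives"}


def reintroduced_names(unified_diff):
    """Return (diff_line_number, deprecated_name) pairs for added lines only."""
    hits = []
    for number, line in enumerate(unified_diff.splitlines(), start=1):
        # Lines starting with '+++' are file headers, not added content.
        # (Limitation: an added line whose text begins with '++' is skipped.)
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name in DEPRECATED:
            # Assumed rule: match the name only as a whole identifier.
            whole = rf"(?<![A-Za-z0-9_]){re.escape(name)}(?![A-Za-z0-9_])"
            if re.search(whole, line[1:]):
                hits.append((number, name))
    return hits


diff = "\n".join([
    "--- a/README.adoc",
    "+++ b/README.adoc",
    "@@ -1,3 +1,3 @@",
    "-see agent_instructions",
    "+see bot_directives",
    " context line mentioning 6a2 is left alone",
    "+new reference to 6a2 here",
])
print(reintroduced_names(diff))  # [(7, '6a2')]
```

Removed and context lines are ignored, so the chartered bulk migration can proceed at its own pace while new uses are stopped at commit time.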

hyperpolymath marked this pull request as ready for review July 3, 2026 01:52

Merge branch 'main' into claude/estate-audit-optimization-h19z12

2765980

hyperpolymath merged commit 9083f25 into main Jul 3, 2026
17 checks passed

hyperpolymath deleted the claude/estate-audit-optimization-h19z12 branch July 3, 2026 01:53

hyperpolymath mentioned this pull request Jul 3, 2026

Harden the DYADT claim verifier against adversarial-review bypasses #458

Merged

hyperpolymath merged 2 commits into main from claude/estate-audit-optimization-h19z12

2 participants
