Harden the DYADT claim verifier against adversarial-review bypasses by hyperpolymath · Pull Request #458 · hyperpolymath/standards

hyperpolymath · 2026-07-03T02:16:31Z

Context

Follow-up to #457 (Wave 4, merged), which introduced DYADT — the post-action verifier that checks an agent's claimed outcomes actually happened. Before considering it done, I ran an adversarial review of the reference verifier (3 red-team lenses — bypass, verdict-logic, spec-gaps — each finding independently re-verified). It found real ways to make a false claim pass, or to make the verifier drop or mis-judge a claim.

A claim-checker that can be fooled is worse than none. This PR closes every confirmed hole fail-safe (return unverifiable, never a confident wrong verdict) and locks each with a regression assertion and a conformance vector.

Holes closed (`scripts/verify-claims.sh`)

| Bypass found | Now |
|---|---|
| Unresolvable/empty base ref → confident-wrong `created`/`modified`/`deleted` | `unverifiable no-base-ref` |
| `created` confirmed any existing file (untracked build output, etc.) | requires a git-tracked file new to the change |
| Missing required field → false-confirm / silent drop | `unverifiable missing-field`; a block with no `id` still appears (no silent drop) |
| Empty / `.*` `expect` → unconditional confirm | `unverifiable empty-pattern`; malformed regex → `bad-regex` |
| `target` = absolute / `..` / symlink → evidence redirection | `unverifiable unsafe-path` |
| `stdout-contains:` matched stderr too | matches stdout only (stderr captured separately) |
| `contains:`/`sha256:` on a directory/unreadable file → `refuted` | `unverifiable` |
| Licence claim phrased only in `statement` → auto-confirmed | licence detected in class/target/expect/statement → `manual-only` |
| `claims-compose` infinite recursion (cycle / fork bomb) | depth-capped at 8 |
| `not_before` (stale-evidence) unimplemented | present → `unverifiable` (reference collects no timestamps) |
| Parser only accepted `key = "v"` | whitespace-tolerant (`key="v"` too) |
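
To make two of the rows above concrete, here is a minimal sketch of an `unsafe-path` guard and a stdout-only match. It is not the code in `scripts/verify-claims.sh`: the function names `check_target` and `stdout_contains` are hypothetical, the symlink test covers only the final path component, and the needle is matched as a literal substring, which is an assumption about how `stdout-contains:` compares.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the names are hypothetical, not from verify-claims.sh.

# Reject targets that could redirect evidence: absolute paths, `..` traversal, symlinks.
# Only the final path component is tested for being a symlink here.
check_target() {
  local target=$1
  case "$target" in
    /*)                   echo "unverifiable unsafe-path"; return ;;
    ..|../*|*/..|*/../*)  echo "unverifiable unsafe-path"; return ;;
  esac
  if [ -L "$target" ]; then
    echo "unverifiable unsafe-path"; return
  fi
  echo "ok"
}

# Run a command and search its stdout only; stderr goes to a separate file.
# Assumption: the needle is compared as a literal substring.
stdout_contains() {
  local needle=$1 out errfile
  shift
  if [ -z "$needle" ]; then
    echo "unverifiable empty-pattern"; return
  fi
  errfile=$(mktemp)
  out=$("$@" 2>"$errfile") || true
  rm -f "$errfile"
  case "$out" in
    *"$needle"*) echo "confirmed" ;;
    *)           echo "refuted" ;;
  esac
}

check_target "../outside/known-good.txt"
stdout_contains "BUILD OK" sh -c 'echo "BUILD OK" >&2'
```

The first call prints `unverifiable unsafe-path`. The second prints `refuted`, because the marker appears only on stderr.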

Spec

VERIFICATION-PROTOCOL.adoc gains two normative sections: Fail-safe requirements (the exhaustive list of "cannot collect trustworthy evidence → unverifiable") and Command execution & sandboxing (command-transcript executes target; untrusted claims MUST be sandboxed; the reference impl is trusted-input-only and says so).
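
The fail-safe rule can be sketched for a `contains:` check as below. This is an illustration under stated assumptions, not the reference implementation: `check_contains` is a hypothetical name, the pattern is treated as a `grep -E` expression, a missing file is treated like an unreadable one, and the `unverifiable bad-regex` wording combines the verdict names from the table.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a fail-safe `contains:` check; not the reference implementation.
# Whenever trustworthy evidence cannot be collected, the verdict is unverifiable.
check_contains() {
  local file=$1 pattern=$2 rc=0
  if [ -z "$pattern" ]; then
    echo "unverifiable empty-pattern"; return
  fi
  # Directory, unreadable file, or (assumption) missing file: no evidence to judge.
  if [ ! -f "$file" ] || [ ! -r "$file" ]; then
    echo "unverifiable"; return
  fi
  # grep exits 0 on a match, 1 on no match, and >1 on an error such as a bad regex.
  grep -Eq -- "$pattern" "$file" 2>/dev/null || rc=$?
  case $rc in
    0) echo "confirmed" ;;
    1) echo "refuted" ;;
    *) echo "unverifiable bad-regex" ;;
  esac
}
```

The key design choice is that `refuted` is reserved for the single case where the file was read and the pattern was valid but absent; every other failure stays `unverifiable`.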

Verification

- `scripts/tests/wave4-dyadt-test.sh`: 14/14 (7 new hardening assertions; each proves a specific bypass is now closed).
- `spec/conformance/`: 9/9 vectors (added missing-field, unsafe-path, licence-in-statement so the production verifier must handle them too).
- Dogfood `CLAIMS.a2ml` still all-confirmed; Waves 0/1/3 tests unaffected; registry + dashboard in sync.

Method: 3-lens red-team fan-out → independent re-verification of each finding → fix only the confirmed ones fail-safe. Licence handling stays manual-only end-to-end.

🤖 Generated with Claude Code


…sses

An adversarial review of the Wave-4 verifier (3 red-team lenses + independent verification) found real ways to make a FALSE claim pass, or to make the verifier drop/mis-judge a claim. A claim-checker that can be fooled is worse than none, so every confirmed hole is closed fail-safe (unverifiable, never a confident wrong verdict) and locked with a regression test + conformance vector.

Closed holes (scripts/verify-claims.sh):

- Unresolvable/empty base ref no longer false-confirms `created`/`modified`/`deleted`; returns `unverifiable no-base-ref`.
- `created` now requires a git-TRACKED file new to the change; stray untracked build output no longer confirms "I created X".
- Missing required field (claim_class/target/expect/verifier) -> unverifiable; a `[[claim]]` with no id is no longer silently dropped (appears as a block, never confirmed).
- Empty / always-matching `expect` (`contains:` / `stdout-contains:` with empty arg) -> unverifiable; malformed regex -> `bad-regex` (not a false refute).
- Unsafe `target` (absolute, `..` traversal, or symlink) -> `unsafe-path` (evidence can't be redirected to a known-good file).
- `stdout-contains:` matches STDOUT only; a marker on stderr no longer false-confirms.
- `contains:`/`sha256:` on a directory or unreadable file -> unverifiable.
- Licence/SPDX detected in ANY of class/target/expect/STATEMENT -> manual-only (previously the statement field was not scanned).
- `claims-compose` recursion depth-capped at 8 (cycle / fork-bomb guard).
- not_before present -> unverifiable (reference impl collects no timestamps).
- Parser is now whitespace-tolerant (`key="v"` and `key = "v"`).

Spec: VERIFICATION-PROTOCOL.adoc gains normative "Fail-safe requirements" and "Command execution & sandboxing" sections (command-transcript executes target; untrusted claims MUST be sandboxed; the reference impl is trusted-input only).

Tests: wave4-dyadt-test.sh +7 hardening assertions (14 total); 3 new conformance vectors (missing-field, unsafe-path, licence-in-statement; 9 total).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0114ps6mY5jAH4Sz
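
As an illustration of the whitespace-tolerant parsing mentioned in the commit message, a single field line can be normalised with one `sed` expression. This is a sketch only: `parse_kv` is a hypothetical name, the accepted key characters are an assumption, and the actual parser in `scripts/verify-claims.sh` may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: normalise one `key = "value"` line to `key=value`.
# Whitespace around the key, the `=`, and the quoted value is optional.
# Prints nothing when the line is not a quoted key/value pair.
parse_kv() {
  printf '%s\n' "$1" |
    sed -n 's/^[[:space:]]*\([A-Za-z_][A-Za-z0-9_-]*\)[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1=\2/p'
}

parse_kv 'target="README.md"'
parse_kv 'target = "README.md"'
```

Both calls print `target=README.md`, so the two spellings reach the rest of the verifier in the same form.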

sonarqubecloud · 2026-07-03T02:17:19Z

Quality Gate passed

Issues
43 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

…names guard, DYADT residual fix (#459)

Two cohesive commits closing out the estate audit-and-optimization program (umbrella #460). Both fully tested; generated artifacts in sync.

## Commit 1, Wave 5: per-language testing depth

You flagged this directly: the estate's only per-language testing *depth* was a single Julia guide from **2024** (no MUST/SHOULD, Rust+Julia only) next to a **byte-identical duplicate**.

- `language-testing-standards.md` → **v2.0.0**: RFC-2119 requirements **R1–R9** mapped to the CRG test taxonomy; an anti-theatre rule (no `continue-on-error` on a MUST check; coverage reported-with-artifact, not asserted).
- `templates/language-testing-guide-TEMPLATE.md`: the skeleton every guide follows: requirement-mapping table (tool or **visible** `none`), tools, SHA-pinned CI, and a **mandatory honest "Known gaps"** section.
- `affinescript-testing-guide.md`: your primary language, previously with **zero** testing standard, authored honestly (most SHOULD rows are tracked gaps; R3 notes `affinescript-verify.yml` is advisory). SSOT migrates to `hyperpolymath/affinescript` prospectively.
- `scripts/check-language-guide.sh` (wired into `just validate`) + `wave5-language-guides-test.sh` (7/7). Deleted the duplicate snapshot.

## Commit 2, Wave 6: guard, DYADT residual fix, licence record

- **DYADT residual (#461):** an adversarial review confirmed 16 bypasses in the Wave-4 verifier; 15 were fixed in #458, and this closes the last: an always-matching `contains:` regex (`.*`, `^`, `$`, …) no longer confirms vacuously (`unverifiable trivial-pattern`). Spec pins the `contains:` dialect to POSIX ERE; conformance vector + assertion added (10 vectors, 15 assertions).
- **Canonical-names guard** (`check-canonical-names.sh`): blocks *reintroduction* of the deprecated names (`6a2`→descriptiles, `agent_instructions`→bot_directives) in **added** diff lines only (chartered bulk migration untouched). Wired into `just validate` + the pre-commit hook; `wave6-canonical-names-test.sh` (4/4).
- **`audits/licence-flags-2026-07.adoc`**: flag-only record. The whole program made no SPDX edits and no auto licence PRs; DYADT treats licence claims as `manual-only` end to end.

## Verification

All six wave suites pass; DYADT conformance 10/10 + dogfood all-confirmed; registry + scorecard dashboard in sync.

## Program status (umbrella #460)

Waves 0/1/3/4 + hardening **merged** (#453, #454, #457, #458). This lands Waves 5 + 6. Remaining estate-wide work is chartered: #461 (verifier residual, **fixed here**), #462 (DYADT production verifier), #463 (per-language guides completion). Licence rows `manual-only` throughout (flag-only policy).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
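
One way to detect an always-matching `contains:` pattern, sketched here as an assumption rather than as the method used in #459, is to test the pattern against a single empty line with `grep -E` (POSIX ERE). Any pattern that matches every line must match an empty one, so this catches `.*`, `^`, and `$`. It also flags patterns such as `^$` that are not always-matching, which errs on the fail-safe side. The name `classify_pattern` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flag ERE patterns that match the empty string.
# A pattern that matches an empty line cannot serve as evidence on its own.
classify_pattern() {
  local pattern=$1 rc=0
  if [ -z "$pattern" ]; then
    echo "unverifiable empty-pattern"; return
  fi
  # Feed grep exactly one empty line; exit 0 = match, 1 = no match, >1 = error.
  printf '\n' | grep -Eq -- "$pattern" 2>/dev/null || rc=$?
  case $rc in
    0) echo "unverifiable trivial-pattern" ;;
    1) echo "ok" ;;
    *) echo "unverifiable bad-regex" ;;
  esac
}

classify_pattern '.*'
classify_pattern 'SPDX-License-Identifier'
```

The first call prints `unverifiable trivial-pattern` and the second prints `ok`.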

hyperpolymath marked this pull request as ready for review July 3, 2026 02:16

hyperpolymath enabled auto-merge (squash) July 3, 2026 02:16

hyperpolymath disabled auto-merge July 3, 2026 02:18

hyperpolymath enabled auto-merge (squash) July 3, 2026 02:18

hyperpolymath disabled auto-merge July 3, 2026 02:19

hyperpolymath merged commit 832b157 into main Jul 3, 2026
20 checks passed

hyperpolymath deleted the claude/estate-audit-optimization-h19z12 branch July 3, 2026 02:19

hyperpolymath merged 1 commit into main from claude/estate-audit-optimization-h19z12
