feat(training-agent): truth-of-claim verifier (PROVENANCE_CLAIM_CONTRADICTED)#3849

Merged
bokelley merged 4 commits into main from bokelley/training-truth-of-claim
May 2, 2026

bokelley (Contributor) commented May 2, 2026

Closes #3802 at the spec/contract level. Adds the seller-side truth-of-claim verifier on top of the structural enforcement landed in #3792.

What this ships

`server/src/training-agent/task-handlers.ts`

  • `handleGetCreativeFeatures`: governance-agent-shaped handler that returns deterministic AI-detection results. The result is encoded in the creative manifest's asset URL: the substring `ai-generated-true` or `ai_gen_true` yields `ai_generated: true`; `ai-generated-false` or `ai_gen_false` yields `ai_generated: false`. When neither pattern matches, the result is derived from the buyer-claimed `digital_source_type` via the canonical AI_TRUE_DST set (trained_algorithmic_media, composite_with_trained_algorithmic_media, composite_synthetic, algorithmic_media). This lets storyboards drive contradiction outcomes from the fixture without per-test stateful bookkeeping.

  • `runProvenanceVerifier`: in-process verifier-call helper invoked by `enforceProvenancePolicy` after the structural-rejection cascade. It selects the buyer-nominated verifier when that verifier is on-list and otherwise falls back to the first on-list entry (with a `substituted_for` audit trail). Threshold: `ai_generated=true` with confidence ≥ 0.9 against a non-AI `digital_source_type` claim is a contradiction. The reverse (the buyer claims AI but the verifier sees non-AI) is NOT a contradiction; buyers may conservatively over-disclose.

  • `enforceProvenancePolicy` is now async — calls the verifier after the five structural checks. Emits `PROVENANCE_CLAIM_CONTRADICTED` with audit-safe `error.details` per the `error-code.json` description's allowlist: `{ agent_url, feature_id, claimed_value, observed_value, confidence, substituted_for? }`. No `detail_url`, no verifier extension fields — that's the trust boundary preventing verifier responses from leaking cross-tenant data through the seller. Buyer-controlled strings are sanitized before interpolation.
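
The detection and threshold rules above can be sketched as follows. This is an illustrative reconstruction, not the code in `task-handlers.ts`: the function names `detectAiGenerated` and `isContradiction`, the `Detection` type, and the `CONTRADICTION_THRESHOLD` constant are assumptions; only the URL substrings, the AI_TRUE_DST members, and the 0.9 threshold come from the description.

```typescript
// Hypothetical sketch; names are assumptions, values come from the PR text.
const AI_TRUE_DST = new Set<string>([
  "trained_algorithmic_media",
  "composite_with_trained_algorithmic_media",
  "composite_synthetic",
  "algorithmic_media",
]);

const CONTRADICTION_THRESHOLD = 0.9;

interface Detection {
  ai_generated: boolean;
  confidence: number;
}

// The asset URL pattern wins; otherwise fall back to the buyer's claim.
function detectAiGenerated(assetUrl: string, claimedDst: string): boolean {
  if (assetUrl.includes("ai-generated-true") || assetUrl.includes("ai_gen_true")) {
    return true;
  }
  if (assetUrl.includes("ai-generated-false") || assetUrl.includes("ai_gen_false")) {
    return false;
  }
  return AI_TRUE_DST.has(claimedDst);
}

// Only "verifier sees AI, buyer claims non-AI" is a contradiction.
// Over-disclosure (claims AI, verifier sees non-AI) is allowed.
function isContradiction(claimedDst: string, observed: Detection): boolean {
  const claimsAi = AI_TRUE_DST.has(claimedDst);
  return (
    !claimsAi &&
    observed.ai_generated &&
    observed.confidence >= CONTRADICTION_THRESHOLD
  );
}
```

Because the rule is one-directional, a claim of `composite_synthetic` never produces a contradiction, whatever the verifier reports.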

`static/compliance/source/protocols/media-buy/scenarios/provenance_truth_of_claim.yaml`

Fleshed out from skeleton to a 3-phase scenario:

  1. discover_verifier — `get_products` surfaces `accepted_verifiers`
  2. reject_contradicted_claim — buyer claims `digital_capture` (non-AI) but attaches an asset URL with `ai-generated-true` → verifier returns `ai_generated: true` (confidence 0.95) → seller rejects with `PROVENANCE_CLAIM_CONTRADICTED` carrying audit-safe `error.details`
  3. accept_consistent_claim — buyer claims `digital_capture` and asset URL has `ai-generated-false` → verifier confirms, accept

Validations check the full audit-safe `error.details` shape: `agent_url`, `feature_id`, `claimed_value: "digital_capture"`, `observed_value: true`.
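
A minimal sketch of how verifier selection and the audit-safe `error.details` allowlist could fit together. The names `selectVerifier`, `buildAuditSafeDetails`, and `VerifierResult` are hypothetical, as is the assumption that `substituted_for` is recorded only when the buyer nominated an off-list verifier. The field names are the ones listed in this PR.

```typescript
// Hypothetical sketch; helper names and types are assumptions.
interface ContradictionDetails {
  agent_url: string;
  feature_id: string;
  claimed_value: string;
  observed_value: boolean;
  confidence: number;
  substituted_for?: string;
}

interface VerifierSelection {
  agent_url: string;
  substituted_for?: string;
}

// A verifier response may carry extension fields; none of them are copied.
interface VerifierResult {
  value: boolean;
  confidence: number;
  [extension: string]: unknown;
}

// Prefer the buyer-nominated verifier when it is on-list; otherwise use
// the first on-list entry and record which nomination was replaced.
function selectVerifier(accepted: string[], nominated?: string): VerifierSelection | undefined {
  const [first] = accepted;
  if (first === undefined) return undefined;
  if (nominated !== undefined && accepted.includes(nominated)) {
    return { agent_url: nominated };
  }
  return nominated !== undefined
    ? { agent_url: first, substituted_for: nominated }
    : { agent_url: first };
}

// Copy only allowlisted fields so nothing else from the verifier
// response (for example a detail_url) crosses the trust boundary.
function buildAuditSafeDetails(
  selection: VerifierSelection,
  featureId: string,
  claimedValue: string,
  result: VerifierResult,
): ContradictionDetails {
  const details: ContradictionDetails = {
    agent_url: selection.agent_url,
    feature_id: featureId,
    claimed_value: claimedValue,
    observed_value: result.value,
    confidence: result.confidence,
  };
  if (selection.substituted_for !== undefined) {
    details.substituted_for = selection.substituted_for;
  }
  return details;
}
```

Building the object field by field, rather than spreading the verifier response, keeps the allowlist explicit: a new verifier extension field cannot leak unless someone adds it here.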

Known issue — gated on a #3713 regression

The storyboard remains on `KNOWN_FAILING_STORYBOARDS` because of a pre-existing regression introduced by #3713 (training-agent v6 platform split): the v6 `SalesPlatform.syncCreatives` shim at `server/src/training-agent/v6-sales-platform.ts:173-179` invokes `handleSyncCreatives` with just `{ creatives }` and loses the brand/account context that session-keying depends on. The seeded `creative_policy` lives on one session key, while the v6 shim's `handleSyncCreatives` call resolves to a different session key that has no seeded products. Result: `aggregateCreativePolicy` returns null, no enforcement runs, and every `sync_creatives` is accepted regardless of policy.

This breaks BOTH the existing `media_buy_seller/provenance_enforcement` (was passing pre-#3713) and the new `media_buy_seller/provenance_truth_of_claim`.

Local run on `creative` tenant:
```
media_buy_seller/provenance_enforcement ✗ 1P / 1F / 5S / 0N/A
× sync_creatives_no_provenance: Expected "failed", got "created"
media_buy_seller/provenance_truth_of_claim ✗ 1P / 1F / 2S / 0N/A
× sync_creatives_contradicted: Expected "failed", got "created"
```

The truth-of-claim contract code in this PR is correct — same root cause, same fix needed in the v6 shim. The minimum fix is to thread `brand` / `account` through the v6 platform call so `sessionKeyFromArgs` produces the same key the test-controller seeded against.
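
The session-key mismatch can be illustrated with a small sketch. The `sessionKeyFromArgs` body here is a stand-in rather than the real implementation. The `open:default` and `open:<brand>` key shapes are taken from the follow-up commit message, while the `SyncArgs` and `CreativePolicy` types, the `provenance_required` field, and the `acme.example` domain are assumptions for illustration.

```typescript
// Hypothetical stand-ins; the real keying and policy types may differ.
interface SyncArgs {
  creatives: unknown[];
  brand?: { domain: string };
}

interface CreativePolicy {
  provenance_required: boolean; // illustrative field only
}

function sessionKeyFromArgs(args: SyncArgs): string {
  return args.brand ? `open:${args.brand.domain}` : "open:default";
}

const policiesBySession = new Map<string, CreativePolicy>();

function lookupPolicy(args: SyncArgs): CreativePolicy | null {
  return policiesBySession.get(sessionKeyFromArgs(args)) ?? null;
}

// The test-controller seeds the policy under the brand-scoped key.
policiesBySession.set("open:acme.example", { provenance_required: true });

// Shim without brand context: resolves to open:default, finds nothing,
// so enforcement never runs.
const withoutBrand = lookupPolicy({ creatives: [] });

// Shim that threads brand through: resolves to the seeded key.
const withBrand = lookupPolicy({ creatives: [], brand: { domain: "acme.example" } });
```

In this sketch `withoutBrand` is `null` and `withBrand` is the seeded policy, which mirrors why every `sync_creatives` call was accepted before the shim fix.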

Test plan

Sequencing

This PR ships the wire contract and the test scaffolding; merging it doesn't claim conformance until the v6 shim is fixed. Merge order:

  1. Land this PR (truth-of-claim handler + verifier + storyboard, KNOWN_FAILING entry kept)
  2. Fix the v6 SalesPlatform.syncCreatives shim to thread brand/account
  3. Remove the KNOWN_FAILING entry, bump floors, second PR with conformance proof

Filed/refs: #3802, #3713, #3468, #3777, #3792.

🤖 Generated with Claude Code

bokelley added 4 commits May 2, 2026 13:00
…NTRADICTED (refs #3802)

Adds the seller-side truth-of-claim verifier on top of the structural
provenance enforcement landed in #3792.

handleGetCreativeFeatures: governance-agent-shaped handler returning
deterministic AI-detection results. Detection encoded in the creative
manifest's asset URL pattern (substring `ai-generated-true` / `ai_gen_true`
→ ai_generated:true; `ai-generated-false` / `ai_gen_false` → false;
otherwise derived from buyer-claimed digital_source_type via the
canonical AI_TRUE_DST set). Storyboards drive contradiction outcomes
from the fixture without per-test stateful bookkeeping.

runProvenanceVerifier: in-process verifier-call helper invoked by
enforceProvenancePolicy after the structural-rejection cascade.
Selects the buyer-nominated verifier when on-list, falls back to the
first on-list entry (with substituted_for audit trail). Threshold:
ai_generated=true with confidence >= 0.9 against a non-AI claim →
contradiction. The reverse (claims AI but verifier sees non-AI) is NOT
a contradiction; buyers may conservatively over-disclose.

enforceProvenancePolicy is now async; calls the verifier after the
five structural checks and emits PROVENANCE_CLAIM_CONTRADICTED with
audit-safe error.details (agent_url, feature_id, claimed_value,
observed_value, confidence, optional substituted_for) per the
error-code.json description's allowlist. Buyer-controlled strings are
sanitized before interpolation.

handleSyncCreatives's call site updated to await
enforceProvenancePolicy.

Storyboard wiring + KNOWN_FAILING removal + floor bumps come in the
follow-on commit on this branch.

Refs: #3468, #3777, #3802.
…3802)

The companion to #3792's structural-rejection storyboard. Three phases:

  1. Discover — get_products surfaces accepted_verifiers
  2. Reject contradicted — buyer claims digital_capture but asset URL
     contains "ai-generated-true"; seller's verifier returns
     ai_generated:true (confidence 0.95), seller emits
     PROVENANCE_CLAIM_CONTRADICTED with audit-safe error.details
     (agent_url, feature_id, claimed_value, observed_value, confidence)
  3. Accept consistent — buyer claims digital_capture and asset URL
     contains "ai-generated-false"; verifier confirms, accept

The verifier's behavior is encoded in the asset URL pattern (handled
by handleGetCreativeFeatures in the previous commit), so storyboards
drive both outcomes from the fixture without per-test stateful
bookkeeping.

Storyboard remains in KNOWN_FAILING_STORYBOARDS for now — blocked on a
pre-existing #3713 regression where the v6 SalesPlatform.syncCreatives
shim invokes handleSyncCreatives with `{ creatives }` only, losing the
brand/account context that session-keying depends on. The seeded
creative_policy lives on a different session key than the one the v6
shim uses, so policy enforcement never fires. Same regression breaks
media_buy_seller/provenance_enforcement under the v6 path, which
worked pre-#3713. Removing the KNOWN_FAILING entry unblocks once the
v6 shim threads brand/account through to the v5 handler.

The truth-of-claim contract code itself (handleGetCreativeFeatures,
runProvenanceVerifier, the async enforceProvenancePolicy with the
verifier-call after the cascade, PROVENANCE_CLAIM_CONTRADICTED with
audit-safe error.details) is complete in task-handlers.ts — the
storyboard and its grading logic match the wire contract. Just gated
on the v6 shim fix.

Refs: #3468, #3777, #3802, #3713.
…im; wire truth-of-claim

Two changes that together unblock end-to-end conformance grading on
both media_buy_seller/provenance_enforcement and the new
media_buy_seller/provenance_truth_of_claim.

v6 shim regression fix:
- v6-sales-platform.ts and v6-creative-platform.ts both invoked
  handleSyncCreatives({ creatives }) without threading brand domain
  through. sessionKeyFromArgs in the v5 handler then routed to
  open:default while the test-controller seeded creative_policy on
  open:<brand> — aggregateCreativePolicy returned null and the entire
  enforcement cascade silently no-opped. Same root cause hit
  provenance_enforcement (was passing pre-#3713) and the new
  provenance_truth_of_claim. Fix: pull
  ctx.account.ctx_metadata.brand_domain from the v6 RequestContext and
  add { brand: { domain } } to the shim args before delegating.

Truth-of-claim manifest synthesis:
- The CreativeForEnforcement type now reflects that sync_creatives
  carries assets directly on the creative entry (not nested under
  creative_manifest). runProvenanceVerifier synthesizes the manifest
  the verifier expects from whichever shape the creative carries —
  sync_creatives top-level assets or build_creative / preview_creative
  nested creative_manifest.assets. Either path resolves to the same
  detection input.
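
A rough sketch of the manifest synthesis described above, under stated assumptions: the `Assets` shape, the `synthesizeManifest` name, and the precedence of top-level assets over nested ones are all hypothetical; only the two input shapes come from this commit message.

```typescript
// Hypothetical sketch; field shapes beyond `assets` and
// `creative_manifest.assets` are assumptions.
type Assets = Record<string, { url?: string }>;

interface CreativeForEnforcement {
  creative_id: string;
  assets?: Assets; // sync_creatives: assets directly on the creative entry
  creative_manifest?: { assets?: Assets }; // build_creative / preview_creative
}

// Resolve either shape to the single manifest the verifier expects.
function synthesizeManifest(creative: CreativeForEnforcement): { assets: Assets } {
  return { assets: creative.assets ?? creative.creative_manifest?.assets ?? {} };
}
```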

Removes media_buy_seller/provenance_truth_of_claim from
KNOWN_FAILING_STORYBOARDS — it now passes 3/3 step validations:
contradicted submission emits PROVENANCE_CLAIM_CONTRADICTED with
audit-safe error.details (agent_url, feature_id, claimed_value,
observed_value); consistent submission accepts.

Local conformance on /creative tenant:
  media_buy_seller/provenance_enforcement   ✓ 6P / 1S / 0N/A
  media_buy_seller/provenance_truth_of_claim ✓ 3P / 1S / 0N/A

Closes #3802.
@bokelley bokelley merged commit ff4fd0c into main May 2, 2026
20 checks passed
@bokelley bokelley deleted the bokelley/training-truth-of-claim branch May 2, 2026 17:19

Development

Successfully merging this pull request may close these issues.

Training agent: implement PROVENANCE_CLAIM_CONTRADICTED truth-of-claim verification
