feat(provenance): add embedded_provenance and watermarks to provenance schema#3468

Merged
bokelley merged 8 commits into adcontextprotocol:main from
erik-sv:feat/provenance-embedded-watermark-fields
May 1, 2026

@erik-sv
Contributor

@erik-sv erik-sv commented Apr 28, 2026

Summary

Adds two new optional arrays to provenance.json and a provenance_requirements object to creative-policy.json, addressing the must-carry baseline expansion proposed in #2854 (Option A) plus the field shape for embedded content provenance (Track 1).

Problem

The current provenance schema carries sidecar manifest references (c2pa.manifest_url) and third-party detection results (verification), but has no way to declare that provenance metadata is embedded within the content stream itself. Ad-server transcoding, CMS ingestion, and CDN re-encoding can each break file-level bindings, so content that passes through multiple intermediaries can arrive at the publisher stripped of its sidecar provenance.

Separately, there is no structured way for a seller to require specific provenance fields beyond the existing provenance_required boolean.

Changes

provenance.json - two new optional arrays

embedded_provenance declares that provenance metadata is carried within the content stream. Each entry identifies one embedding layer via:

  • method (new enum: manifest_wrapper or provenance_markers)
  • provider (organization that performed the embedding)
  • verify_url (verification endpoint for governance agents)
  • standard (optional, e.g., c2pa for Section A.7)
  • embedded_at (optional, ISO 8601 timestamp)

watermarks declares that content watermarks have been applied. Each entry identifies one watermarking layer via:

  • media_type (new enum: audio, image, video, text)
  • provider (organization that applied the watermark)
  • verify_url (optional, verification endpoint)
  • c2pa_action (optional: c2pa.watermarked.bound or c2pa.watermarked.unbound)
  • embedded_at (optional, ISO 8601 timestamp)
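
As a rough illustration of the two field lists above, the shapes could be sketched as TypeScript types. This reflects the initial revision described here (later comments in this thread change verify_url). The type names, the `summarizeLayers` helper, and all sample values are hypothetical and are not taken from the schema files.

```typescript
// Illustrative types only: field names come from the PR description above;
// type names and sample values are hypothetical.
type EmbeddedProvenanceMethod = "manifest_wrapper" | "provenance_markers";
type WatermarkMediaType = "audio" | "image" | "video" | "text";
type C2paWatermarkAction = "c2pa.watermarked.bound" | "c2pa.watermarked.unbound";

interface EmbeddedProvenanceEntry {
  method: EmbeddedProvenanceMethod;
  provider: string;      // organization that performed the embedding
  verify_url: string;    // verification endpoint for governance agents
  standard?: string;     // e.g. "c2pa"
  embedded_at?: string;  // ISO 8601 timestamp
}

interface WatermarkEntry {
  media_type: WatermarkMediaType;
  provider: string;      // organization that applied the watermark
  verify_url?: string;
  c2pa_action?: C2paWatermarkAction;
  embedded_at?: string;  // ISO 8601 timestamp
}

interface ProvenanceSketch {
  embedded_provenance?: EmbeddedProvenanceEntry[];
  watermarks?: WatermarkEntry[];
}

// Hypothetical asset that carries both kinds of layer.
const sample: ProvenanceSketch = {
  embedded_provenance: [
    {
      method: "provenance_markers",
      provider: "example-embedder",
      verify_url: "https://verifier.example/verify",
      embedded_at: "2026-04-28T18:29:00Z",
    },
  ],
  watermarks: [
    { media_type: "audio", provider: "example-watermarker", c2pa_action: "c2pa.watermarked.bound" },
  ],
};

// Lists the embedding methods and watermark media types an asset declares.
function summarizeLayers(p: ProvenanceSketch): string[] {
  const embedded = (p.embedded_provenance ?? []).map((e) => `embedded:${e.method}`);
  const marks = (p.watermarks ?? []).map((w) => `watermark:${w.media_type}`);
  return embedded.concat(marks);
}
```

Keeping the two arrays separate in the type mirrors the split argued for in this PR: one array answers the chain-of-custody question, the other the "who generated this" question.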

Why two fields, not one

The distinction follows C2PA's normative taxonomy:

  • Embedded provenance maps to C2PA binding assertions and manifest embedding (Section A.7). It carries or references a structured provenance record: the full chain of custody.
  • Watermarks map to the C2PA c2pa.watermarked.* action family. A watermark encodes an identifier (who generated it, who owns it) but does not carry a provenance record.

A single asset can carry both. An article might have provenance markers in its text (embedded provenance) and a spread-spectrum watermark in an accompanying audio clip (watermark). The fields serve different governance questions: "Can I verify the provenance chain?" vs. "Can I detect who generated this?"

creative-policy.json - provenance_requirements object

Expands the existing provenance_required boolean with structured, field-level requirements:

  • require_digital_source_type - seller requires digital_source_type declaration
  • require_disclosure_metadata - seller requires disclosure metadata
  • require_embedded_provenance - seller requires at least one embedded_provenance entry

All three are optional booleans. Existing seller agents that do not read the object are unaffected.
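
A minimal sketch of what a seller's policy fragment might look like with the three flags, assuming the object sits next to the existing boolean on the creative policy. The type names and the `declaredRequirements` helper are hypothetical.

```typescript
// Hypothetical type names; the three flag names are from the list above.
interface ProvenanceRequirements {
  require_digital_source_type?: boolean;
  require_disclosure_metadata?: boolean;
  require_embedded_provenance?: boolean;
}

interface CreativePolicySketch {
  provenance_required?: boolean;
  provenance_requirements?: ProvenanceRequirements;
}

// Sample policy: two of the three flags are switched on.
const policy: CreativePolicySketch = {
  provenance_required: true,
  provenance_requirements: {
    require_digital_source_type: true,
    require_embedded_provenance: true,
  },
};

// Lists the flags a seller has switched on; absent flags count as not required.
function declaredRequirements(p: CreativePolicySketch): string[] {
  const reqs: ProvenanceRequirements = p.provenance_requirements ?? {};
  const names: (keyof ProvenanceRequirements)[] = [
    "require_digital_source_type",
    "require_disclosure_metadata",
    "require_embedded_provenance",
  ];
  return names.filter((name) => reqs[name] === true);
}
```

An agent that never reads `provenance_requirements` sees only `provenance_required`, which is why the change is described as additive.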

Enforcement gap disclosure: provenance_requirements in this version is a declaration framework, not an enforcement guarantee. A non-enforcing seller silently accepts non-compliant creative. The should-verify receiver-obligation RFC (deferred Track 2 from #2854) is intended to close this gap.

New enum files

  • enums/embedded-provenance-method.json (manifest_wrapper, provenance_markers)
  • enums/watermark-media-type.json (audio, image, video, text)

Existing c2pa field

The existing c2pa.manifest_url field is unchanged. Its description now notes that file-level bindings break during transcoding and points to embedded_provenance for pipelines with intermediaries.

How this addresses #2854

| Issue component | Status |
| --- | --- |
| Option A: must-carry baseline expansion | Shipped in this PR via provenance_requirements |
| Track 1: Encypher field shape | Shipped in this PR via embedded_provenance with verify_url |
| Track 2: Should-verify RFC | Deferred (noted in enforcement gap disclosure) |
| Track 3: SI seam task ownership | Deferred |
| Track 4: C2PA-required rescope | Partially addressed: embedded_provenance provides the transcoding-resilient alternative |

Channel coverage

| Channel | Relevant field | Governance path |
| --- | --- | --- |
| Native | embedded_provenance (text provenance markers on editorial artifact) | Governance agent sends native ad text to verify_url, checks for publisher's markers |
| SI/conversational | embedded_provenance (text provenance markers on AI response) | Buyer governance agent verifies |
| Video (CTV/OTT) | watermarks (video/audio watermark on creative) | Seller governance agent detects |
| Audio | watermarks (audio watermark on creative) | Seller governance agent detects |
| Display | c2pa sidecar or embedded_provenance (manifest wrapper) | Seller governance agent verifies |

Standards alignment

  • C2PA 2.4: embedded_provenance aligns with binding assertions (c2pa.hash.data, c2pa.soft-binding) and format-specific manifest embedding (Section A.7). watermarks aligns with the c2pa.watermarked.bound/c2pa.watermarked.unbound action taxonomy.
  • IPTC: No change to digital_source_type alignment. IPTC's vocabulary describes how content was produced, not what provenance it carries. The new fields address a gap that IPTC does not cover.

Wire compatibility

All new fields are optional. No existing required fields are changed. Existing agents ignore unknown fields per additionalProperties: true. This is a purely additive, non-breaking change.

@erik-sv erik-sv requested a review from bokelley as a code owner April 28, 2026 18:29
@erik-sv
Contributor Author

erik-sv commented Apr 28, 2026

I have read the IPR Policy

@erik-sv
Contributor Author

erik-sv commented Apr 28, 2026

The IPR check CI is failing with a 422 from GitHub's API when the check-and-record.mjs script tries to post its request comment (links/0/schema validation error). The "I have read the IPR Policy" comment is already posted above. The script appears to crash before it gets to signature detection.

@bokelley
Contributor

Thanks for tracking this down, @erik-sv.

Root cause: createIssueComment and updateIssueComment in scripts/ipr/github.mjs pass the comment body text directly as options.body to request(). Because request() does JSON.stringify(body), a string gets serialized as a bare JSON string literal ("\"text\"") rather than the object GitHub expects ({"body": "text"}). GitHub rejects the malformed payload with a 422 links/0/schema validation error.

Fix: Both methods now wrap the string in the expected object — { body: { body } } — matching how createStatus (the working method) already passes a nested object. Draft PR: #3478.
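
The serialization difference behind the 422 can be reproduced in isolation. In this sketch, `serializePayload` is a hypothetical stand-in for the script's `request()` helper, assumed only to call `JSON.stringify` on whatever it receives, as the root-cause note says; it is not the actual code in scripts/ipr/github.mjs.

```typescript
// Hypothetical stand-in for request(): serializes whatever body it is handed.
function serializePayload(body: unknown): string {
  return JSON.stringify(body);
}

const commentText = "I have read the IPR Policy";

// Buggy call shape: a bare string becomes a JSON string literal,
// i.e. the text wrapped in double quotes, with no "body" key.
const brokenPayload = serializePayload(commentText);

// Fixed call shape: the text is wrapped in an object with a `body` key,
// which is the shape the root-cause note says GitHub expects.
const fixedPayload = serializePayload({ body: commentText });

// brokenPayload is the 28-character string "I have read the IPR Policy" (quotes included)
// fixedPayload is {"body":"I have read the IPR Policy"}
```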

Important: both IPR check workflows (ipr-agreement.yml and ipr-check-callable.yml) check out main for the script, so the fix must merge to main before it takes effect for any PR. Merging #3478 first will unblock this PR's CI — a maintainer can re-trigger the IPR check once #3478 lands.


Generated by Claude Code

@bokelley
Contributor

Thanks for the diagnosis, @bokelley. Understood — the IPR check failure is a bug in the shared scripts/ipr/github.mjs script (string passed where { body } object is required), not an issue with this PR's changes. No changes needed here; this PR's CI will clear once #3478 merges to main and a maintainer re-triggers the IPR check.


Generated by Claude Code

@pkras
Collaborator

pkras commented Apr 29, 2026

Oh great, I think the split indeed makes sense: embedded_provenance vs. watermarks, since the first answers the chain-verification question and the latter answers the generation question. It is also in line with how you already do it at C2PA.

Some notes from my end:

  • provenance_requirements is currently a declaration, not enforcement, so maybe we need to frame this per AdCP positioning: this is a visibility surface, not a prevention surface. Track 2 below will enforce it. The PR description handles this, but maybe the doc update should too.
  • On the placement of provenance_requirements: the booleans are on creative-policy.json, but digital_source_type and disclosure already exist on provenance.json. So does requiring them mean "the field must be present" or "the field must be present and non-null with specific values"? Maybe this should be specified so that implementations do not deviate.

@erik-sv
Contributor Author

erik-sv commented Apr 29, 2026

@pkras Good catches, both addressed in 988a0a0:

Visibility surface framing: The provenance_requirements schema description now reads "visibility surface, not a prevention surface" directly, and spells out that a non-enforcing seller silently accepts non-compliant creative. The should-verify RFC closes the gap by adding task-level error signals. This framing should carry into any doc update that ships alongside the schema.

Boolean semantics: Each boolean now specifies what "present" means:

  • require_digital_source_type: field must be present and set to a valid value from the digital-source-type enum (not null or absent)
  • require_disclosure_metadata: disclosure.required must be a boolean (true or false); when true, at least one disclosure.jurisdictions entry is expected
  • require_embedded_provenance: at least one embedded_provenance array entry (unchanged, was already unambiguous)

This prevents the "field present but null" loophole without overspecifying the validation contract before the should-verify RFC lands.
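
The three semantics above could be checked roughly as follows. This is a sketch under stated assumptions: the type and function names are hypothetical, the set of valid digital-source-type values is passed in rather than guessed, and a missing jurisdictions entry when `disclosure.required` is true is treated as unmet even though the comment only says an entry "is expected".

```typescript
// Hypothetical shapes; only the field and flag names come from the comment above.
interface DisclosureSketch {
  required?: boolean | null;
  jurisdictions?: unknown[];
}

interface ProvenanceSketch {
  digital_source_type?: string | null;
  disclosure?: DisclosureSketch;
  embedded_provenance?: unknown[];
}

interface ProvenanceRequirements {
  require_digital_source_type?: boolean;
  require_disclosure_metadata?: boolean;
  require_embedded_provenance?: boolean;
}

// Returns the names of the requirement flags the provenance object fails to meet.
// validSourceTypes stands in for the digital-source-type enum, which is not reproduced here.
function unmetRequirements(
  reqs: ProvenanceRequirements,
  prov: ProvenanceSketch,
  validSourceTypes: string[],
): string[] {
  const unmet: string[] = [];

  if (reqs.require_digital_source_type) {
    const dst = prov.digital_source_type;
    // Present, non-null, and a valid enum value: null or absent does not count.
    if (typeof dst !== "string" || validSourceTypes.indexOf(dst) === -1) {
      unmet.push("require_digital_source_type");
    }
  }

  if (reqs.require_disclosure_metadata) {
    const d = prov.disclosure;
    const flag = d ? d.required : undefined;
    const jurisdictions = d && d.jurisdictions ? d.jurisdictions : [];
    // disclosure.required must be a boolean; when true, expect at least one jurisdiction.
    if (typeof flag !== "boolean" || (flag && jurisdictions.length === 0)) {
      unmet.push("require_disclosure_metadata");
    }
  }

  if (reqs.require_embedded_provenance) {
    if ((prov.embedded_provenance ?? []).length === 0) {
      unmet.push("require_embedded_provenance");
    }
  }

  return unmet;
}
```

Passing `{ digital_source_type: null }` with all three flags set returns all three names, which is the "field present but null" loophole the comment describes closing.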

@github-actions
Contributor

IPR Policy Agreement Required

@erik-sv — thanks for the contribution. Before this PR can be merged, the AgenticAdvertising.Org IPR Policy requires your agreement.

To agree, post a new comment on this PR with the exact phrase:

I have read the IPR Policy

Your signature is recorded once and covers all contributions to AAO repositories. See signatures/README.md for what gets recorded and why.

@erik-sv
Contributor Author

erik-sv commented Apr 29, 2026

I have read the IPR Policy

@bokelley
Contributor

Thanks @erik-sv — solid PR, and the C2PA taxonomy split (binding assertions vs. c2pa.watermarked.* actions) is the right anchor for separating embedded_provenance from watermarks. Pia's earlier notes look cleanly addressed in 988a0a0.

Two things I'd want pinned before merge, then a few smaller nits.

1. Relationship between provenance_required (existing boolean) and provenance_requirements (new object). Today the semantics are undefined. What does provenance_required: false + provenance_requirements.require_digital_source_type: true mean? Is the new object a no-op without the boolean, or does any flag inside imply the boolean? Pick one and write it into the provenance_requirements description — otherwise sellers will diverge in interpretation and the visibility surface gets noisier, not less.

2. verify_url is wire-undefined in v1. It's format: uri but the schema doesn't say what receivers POST, what response shape is expected, or how errors surface. That's intentional (should-verify RFC closes it) — but without an explicit forward-compat note in the field description, early adopters will bake in incompatible call patterns and we'll have to break them later. One line — "verification protocol is opaque in this version; the wire contract will be defined by the should-verify RFC" — is enough.

Smaller, optional:

  • embedded_provenance.verify_url is required. Self-verifiable embeddings (C2PA text manifest with a known public key, or receiver-local detection) wouldn't need a vendor-hosted verifier. Consider relaxing to oneOf: [verify_url, public_key_url, ...], or document that self-verifying embeddings should still expose a verify_url that returns the signing key. As-is, this soft-mandates that every embedder operate a verification service.
  • c2pa_action is an inline enum while embedded-provenance-method and watermark-media-type got factored out. Minor inconsistency; factoring it out keeps future c2pa.watermarked.* additions to one place.
  • The c2pa field description rewrite ("consider embedded_provenance as primary") is a real semantic repositioning, not just docs polish. Worth calling out in the changeset so anyone built around c2pa-as-primary sees it.

On Track 4: I read your provenance_markers line as "this gives Track 4 a path forward," not "Track 4 is now solved." A required tier still has to define what counts as satisfying it, and that's the design rethink we deferred. Worth being explicit in the issue thread that the rescope is unblocked, not closed.

My vote: approve once #1 is clarified in the schema. #2 is a one-line description add. Everything else is nice-to-have.


Generated by Claude Code

@erik-sv erik-sv force-pushed the feat/provenance-embedded-watermark-fields branch from 988a0a0 to 75e909c Compare April 30, 2026 19:11
@erik-sv
Contributor Author

erik-sv commented Apr 30, 2026

Thanks @bokelley, thorough review. Addressed all five items in the latest commit:

Issue 1 (blocking): gate semantics. provenance_requirements now explicitly refines provenance_required. When the boolean is false or absent, the object SHOULD be absent; if present, receivers SHOULD ignore it. This gives existing agents correct behavior (they only read the boolean) while new agents get structured detail.
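
One way a receiver might apply this gate is sketched below. The type names and `effectiveRequirements` are hypothetical; the only behavior taken from the comment is that the object is ignored when provenance_required is false or absent.

```typescript
// Hypothetical type names; flag and field names are from this PR.
interface ProvenanceRequirements {
  require_digital_source_type?: boolean;
  require_disclosure_metadata?: boolean;
  require_embedded_provenance?: boolean;
}

interface CreativePolicySketch {
  provenance_required?: boolean;
  provenance_requirements?: ProvenanceRequirements;
}

// Returns the requirements a receiver should act on, or null when the gate is closed.
function effectiveRequirements(policy: CreativePolicySketch): ProvenanceRequirements | null {
  // The boolean is the authoritative gate: false or absent means the object is ignored.
  if (policy.provenance_required !== true) {
    return null;
  }
  // Gate open with no object: provenance is required, with no field-level refinement.
  const reqs = policy.provenance_requirements;
  return reqs !== undefined ? reqs : {};
}
```

Under this reading, `provenance_required: false` combined with `require_digital_source_type: true` resolves to "nothing required", which answers the question raised in the review.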

Issue 2 (blocking): verify_url forward-compat. Both verify_url fields now carry: "The verification protocol is opaque in this version; the wire contract will be defined by the should-verify RFC." We operate a verification endpoint and are happy to contribute to the should-verify RFC design when it opens, but defining the wire contract is out of scope for this PR.

Issue 3 (optional): verify_url relaxed to optional. embedded_provenance.items.required is now ["method", "provider"]. The verify_url description notes it SHOULD be present for methods where the receiver cannot self-verify (e.g., provenance_markers). Self-verifiable embeddings like C2PA text manifests with known public keys do not need a vendor endpoint.

Issue 4 (optional): c2pa_action factored out. New enum file c2pa-watermark-action.json replaces the inline enum, consistent with embedded-provenance-method.json and watermark-media-type.json.

Issue 5 (optional): changeset updated. Notes the c2pa field description repositioning and the new enum file.

bokelley added a commit that referenced this pull request May 1, 2026
…s, seller confirms

Reframe the verifier contract per expert review (security, protocol, product).
A buyer-controlled `verify_agent.agent_url` was an SSRF + exfil + phishing
surface, an inversion of the trust model (seller is verifier-of-record), and
a vendor-adoption ask (verifier vendors won't ship governance agents
unilaterally). The corrected contract:

- Seller publishes `creative_policy.accepted_verifiers[]` — the governance
  agents it operates or has allowlisted (`agent_url`, optional `feature_id`,
  optional `providers[]`). Returned on `get_products`.
- Buyer represents on `embedded_provenance[]`/`watermarks[]` by attaching
  `verify_agent: { agent_url, feature_id? }` whose `agent_url` matches a
  published `accepted_verifiers[]` entry (canonicalized).
- Seller confirms by cross-checking the URL against its own allowlist before
  any outbound call, then invoking `get_creative_features` against the
  matching on-list agent. Sellers MUST NOT call buyer-asserted endpoints
  outside the allowlist.

Schema changes:

- `creative-policy.json`: add `accepted_verifiers[]` sibling field.
- `provenance.json`: rewrite `verify_agent` description as buyer's
  representation; tighten `additionalProperties: false`; require `https://`
  pattern on `agent_url`; drop "MAY substitute" language.
- `error-code.json`: add `PROVENANCE_VERIFIER_NOT_ACCEPTED` for the
  cross-check rejection. Constrain `PROVENANCE_CLAIM_CONTRADICTED.details`
  to the audit-safe allowlist `{ agent_url, feature_id, claimed_value,
  observed_value, confidence, substituted_for }`.

Doc changes:

- `provenance-verification.mdx`: new "The verifier contract" section laying
  out seller-publishes / buyer-represents / seller-confirms, with worked
  example covering all three steps. Updated rejection-code table and
  buyer/seller checklists.
- `provenance.mdx`: rewrite `verify_agent` shape section as buyer's
  representation; expand creative-policy enforcement example with
  `accepted_verifiers`; fix the `PROVENANCE_CLAIM_CONTRADICTED` description
  (verifier source is the seller's allowlist, not the buyer's nominated
  endpoint); add `details` allowlist constraint.

Storyboard:

- `provenance_enforcement.yaml`: seed `accepted_verifiers` on the fixture;
  new `reject_off_list_verifier` phase exercises
  `PROVENANCE_VERIFIER_NOT_ACCEPTED`; positive-path phase now demonstrates
  on-list `verify_agent` representation; added top-level `brand` to the
  `get_products` step for parity with peer storyboards.

Validation: build:schemas + build:compliance pass; targeted tests
(test:schemas, test:examples, test:json-schema, test:storyboard-*,
test:composed, test:docs-nav, test:build-schemas-hoist-enums) all green.

Refs: #3468, #2854.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
erik-sv pushed a commit to erik-sv/adcp that referenced this pull request May 1, 2026
…-to-end storyboard

Stacked on top of adcontextprotocol#3468 to ship the embedded_provenance / watermarks /
provenance_requirements work as a complete, end-to-end-executable bundle.

verify_agent (replaces verify_url): each embedded_provenance and watermarks
entry now carries a `verify_agent: { agent_url, feature_id? }` pointer at
an AdCP governance agent that can verify the embedding via
`get_creative_features`. Verification routes through the existing AdCP
governance surface (governance.creative_features in get_adcp_capabilities)
rather than per-vendor opaque webhooks. Multiple verifiers can implement
checks for the same provider; receivers may substitute any agent that
declares an equivalent feature.

PROVENANCE_* error codes: new entries on error-code.json give sellers a
machine-readable rejection vocabulary for sync_creatives:
PROVENANCE_REQUIRED, PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING,
PROVENANCE_DISCLOSURE_MISSING, PROVENANCE_EMBEDDED_MISSING, and
PROVENANCE_CLAIM_CONTRADICTED (active refutation by a verifier, distinct
from absence). Each per-creative result carries error.field at the
resolved provenance path; PROVENANCE_CLAIM_CONTRADICTED carries verifier
identity in error.details. Buyers' orchestrators self-correct without
negotiating with the seller.

creative-policy.json: drops the "visibility surface, not prevention
surface" framing. Sellers that publish a provenance_requirements entry
MUST enforce it on sync_creatives with the matching error code. The
should-verify-RFC deferral language is removed from descriptions and the
changeset.

Docs: docs/creative/provenance.mdx adds sections on embedded_provenance,
watermarks, and the verify_agent shape, plus the rejection-code table.
docs/governance/creative/provenance-verification.mdx adds a "What buyers
can declare" / "What sellers can require" pair, the rejection-code
reference, and updates the buyer/seller checklists to reflect the new
fields and codes.

Storyboard: protocols/media-buy/scenarios/provenance_enforcement.yaml
exercises the structural-rejection contract end to end — discover the
requirement on get_products, submit without disclosure metadata and
expect PROVENANCE_DISCLOSURE_MISSING, then resubmit with disclosure and
expect acceptance.

Refs: adcontextprotocol#3468, adcontextprotocol#2854.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@erik-sv
Contributor Author

erik-sv commented May 1, 2026

Merged @bokelley's stack (erik-sv#1) and pushed a follow-up commit. The branch now carries:

From the stack (Brian's work):

  • verify_url replaced with verify_agent - structured object constrained to the seller's accepted_verifiers[] allowlist, closing the buyer-controlled-URL SSRF surface
  • Seller-publishes / buyer-represents / seller-confirms trust model via new creative_policy.accepted_verifiers[]
  • Six PROVENANCE_* error codes on error-code.json with machine-readable recovery guidance
  • provenance_requirements upgraded from visibility surface to mandatory enforcement
  • Docs on both provenance.mdx and provenance-verification.mdx
  • Four-phase compliance storyboard (provenance_enforcement.yaml)
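
The seller-side cross-check in the trust model above might look roughly like this. `canonicalizeSketch` is a deliberately simplified stand-in for the rules at /docs/reference/url-canonicalization, which are not reproduced here; it, the type names, and `crossCheck` are all hypothetical. Only `agent_url` is compared, since that is the match the commit message describes.

```typescript
// Hypothetical shapes based on the fields named in the commit message.
interface AcceptedVerifier {
  agent_url: string;
  feature_id?: string;
  providers?: string[];
}

interface VerifyAgent {
  agent_url: string;
  feature_id?: string;
}

type CrossCheck =
  | { ok: true; verifier: AcceptedVerifier }
  | { ok: false; code: "PROVENANCE_VERIFIER_NOT_ACCEPTED" };

// Simplified stand-in for real URL canonicalization (an assumption, not the AdCP rules):
// requires https, lowercases scheme and host, and drops trailing slashes.
function canonicalizeSketch(url: string): string | null {
  const m = /^(https:\/\/)([^\/?#]+)(.*)$/i.exec(url.trim());
  if (m === null) {
    return null;
  }
  const rest = m[3].replace(/\/+$/, "");
  return m[1].toLowerCase() + m[2].toLowerCase() + rest;
}

// Seller confirms: compare the buyer's representation against the seller's own allowlist
// before any outbound call. Off-list or non-https URLs are rejected, never called.
function crossCheck(claim: VerifyAgent, allowlist: AcceptedVerifier[]): CrossCheck {
  const claimed = canonicalizeSketch(claim.agent_url);
  if (claimed !== null) {
    for (const verifier of allowlist) {
      if (canonicalizeSketch(verifier.agent_url) === claimed) {
        return { ok: true, verifier };
      }
    }
  }
  return { ok: false, code: "PROVENANCE_VERIFIER_NOT_ACCEPTED" };
}
```

Returning the matching allowlist entry, rather than the buyer's object, keeps the eventual outbound call pointed at a URL the seller published itself.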

Follow-up fixes (9ec5bec):

  • Canonicalization references in schema descriptions now inline the key normalization steps and link /docs/reference/url-canonicalization - implementers reading the schema get immediate guidance instead of an opaque forward reference
  • PROVENANCE_VERIFIER_NOT_ACCEPTED error.details tightened to reference the product's creative_policy rather than replaying the full accepted_verifiers[] snapshot (buyers already hold this from get_products)
  • Added provenance verifier allowlist to the canonicalization doc's "Where it applies" table

All targeted tests pass. Blocked on #3478 (IPR check script fix) landing on main so CI can re-run.

@bokelley
Contributor

bokelley commented May 1, 2026

Quick status check, since the picture shifted: #3478 has already merged (it lands as state: MERGED in the API), and CI on this PR is green (IPR Agreement, GitGuardian, IPR Policy/Signature all SUCCESS). The actual gate now is a merge conflict against main.

The conflict is on a single file — static/schemas/source/enums/error-code.json — but it's not just adjacent-line edits to the enum array. PR #3738 (merged 2026-04-30) added a structured enumMetadata block alongside the existing enumDescriptions, and the $comment on it makes this load-bearing:

"SDKs MUST consume this block instead of parsing 'Recovery: X' from enumDescriptions prose."

So a clean rebase needs three things:

  1. Re-add the six PROVENANCE_* codes to the enum array (will largely auto-merge)
  2. Re-add their entries to enumDescriptions (same)
  3. Add six new entries to enumMetadata — this is the actually-new work. Each entry is { recovery, suggestion } and the recovery classification has to match what's already in the description prose

Without #3, the new codes ship without their structured recovery classification and SDKs won't surface them. Suggested mapping based on the existing prose:

"PROVENANCE_REQUIRED": { "recovery": "correctable", "suggestion": "attach a provenance object — at minimum digital_source_type — and resubmit" },
"PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING": { "recovery": "correctable", "suggestion": "set provenance.digital_source_type to a value from the digital-source-type enum and resubmit" },
"PROVENANCE_DISCLOSURE_MISSING": { "recovery": "correctable", "suggestion": "set provenance.disclosure.required and, when true, populate disclosure.jurisdictions" },
"PROVENANCE_EMBEDDED_MISSING": { "recovery": "correctable", "suggestion": "attach at least one embedded_provenance entry from a supported provider and resubmit" },
"PROVENANCE_VERIFIER_NOT_ACCEPTED": { "recovery": "correctable", "suggestion": "replace verify_agent.agent_url with one from the seller's published accepted_verifiers, drop verify_agent if the embedding is self-verifiable, or re-embed with a verifier the seller accepts" },
"PROVENANCE_CLAIM_CONTRADICTED": { "recovery": "correctable", "suggestion": "revise the provenance claim to match the verifier's observation or replace the creative; auto-retry without correction will not pass" }
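
A minimal sketch of how an SDK might read the structured block instead of parsing "Recovery: X" out of the description prose. The two entries are copied from the suggested mapping above; the `ErrorMeta` type, `recoveryFor`, and `shouldResubmitAfterFix` are hypothetical and do not describe any existing SDK.

```typescript
// Shape of one enumMetadata entry as described above: { recovery, suggestion }.
interface ErrorMeta {
  recovery: string;
  suggestion: string;
}

// Excerpt of the suggested mapping (two of the six entries, copied verbatim).
const enumMetadata: Record<string, ErrorMeta> = {
  PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING: {
    recovery: "correctable",
    suggestion: "set provenance.digital_source_type to a value from the digital-source-type enum and resubmit",
  },
  PROVENANCE_EMBEDDED_MISSING: {
    recovery: "correctable",
    suggestion: "attach at least one embedded_provenance entry from a supported provider and resubmit",
  },
};

// Looks up the structured entry for a code; undefined when the code has no metadata.
function recoveryFor(code: string): ErrorMeta | undefined {
  return Object.prototype.hasOwnProperty.call(enumMetadata, code) ? enumMetadata[code] : undefined;
}

// Hypothetical orchestrator rule: only attempt a fix-and-resubmit cycle when the
// structured block classifies the error as correctable.
function shouldResubmitAfterFix(code: string): boolean {
  const meta = recoveryFor(code);
  return meta !== undefined && meta.recovery === "correctable";
}
```

A code that is present in the enum array but missing from the metadata block falls through to `undefined` here, which illustrates why step 3 of the rebase matters.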

Happy to push the rebase + enumMetadata entries as a small follow-up to your branch if it's faster — otherwise all yours.

erik-sv and others added 6 commits May 1, 2026 18:04
…ovenance schema

Add two new optional arrays to provenance.json that distinguish between
provenance metadata carried within the content stream (embedded_provenance)
and content watermarks that encode an identifier or fingerprint (watermarks).

The separation aligns with C2PA's normative taxonomy: embedded provenance
maps to binding assertions and manifest embedding (Section A.7), while
watermarks map to the c2pa.watermarked.* action family.

Also expand creative-policy.json with a provenance_requirements object
that gives sellers structured, field-level provenance requirements beyond
the existing provenance_required boolean. Supports EU AI Act Art. 50 and
CA SB 942 compliance workflows.

New enums: embedded-provenance-method.json, watermark-media-type.json.

All fields are optional and additive. Existing agents are unaffected.

Refs: adcontextprotocol#2854
Address review feedback from @pkras:

- Reframe provenance_requirements as a visibility surface, not a
  prevention surface. The enforcement gap is explicitly called out
  in the schema description, not just the PR body.

- Specify that require_digital_source_type means the field must be
  present and set to a valid enum value (not null or absent).

- Specify that require_disclosure_metadata means disclosure.required
  must be a boolean, and when true, at least one jurisdictions entry
  is expected.
…compat, c2pa_action enum

Define provenance_required as the authoritative gate for
provenance_requirements: the object refines the boolean and receivers
ignore it when provenance_required is false or absent.

Add forward-compat note to verify_url on both embedded_provenance and
watermarks: verification protocol is opaque in v1, wire contract
deferred to should-verify RFC.

Relax verify_url from required to optional on embedded_provenance.
Self-verifiable embeddings (e.g., C2PA text manifest with known public
key) do not need a vendor endpoint. verify_url SHOULD be present for
methods where the receiver cannot self-verify.

Factor c2pa_action inline enum to c2pa-watermark-action.json for
consistency with other enum files.

Note c2pa field description repositioning in changeset.
…-to-end storyboard

…s, seller confirms

…ER_NOT_ACCEPTED details

Two follow-ups from review of the verify_agent trust model:

1. Replace opaque "canonicalized per the AdCP URL canonicalization rules"
   with an inline summary plus doc reference
   (/docs/reference/url-canonicalization) so implementers reading the
   schema get immediate guidance without hunting for the normalization
   spec. Add the provenance verifier allowlist surface to the
   canonicalization doc's "Where it applies" table.

2. Tighten PROVENANCE_VERIFIER_NOT_ACCEPTED error.details: instead of
   replaying the full accepted_verifiers[] snapshot on every rejection
   (the buyer already has this from get_products), reference the
   product's creative_policy. Reduces response payload size and avoids
   leaking the seller's full agent infrastructure on error paths.
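
The second item can be sketched roughly as follows. This is an illustration under assumptions, not the schema's actual shape: the `policy_ref` field, the `product_id` key, and the object form of `accepted_verifiers` entries are hypothetical, and the comparison uses plain string equality where the spec compares canonicalized URLs.

```typescript
interface CreativePolicy {
  accepted_verifiers: { agent_url: string }[]; // assumed entry shape
}

interface Product {
  product_id: string; // assumed identifier field
  creative_policy: CreativePolicy;
}

interface Rejection {
  code: "PROVENANCE_VERIFIER_NOT_ACCEPTED";
  details: { agent_url: string; policy_ref: string };
}

// Returns null when the buyer's verifier is on the seller's allowlist.
// On rejection, the details point at the product's creative_policy instead of
// replaying the whole accepted_verifiers[] list.
function checkVerifier(product: Product, buyerAgentUrl: string): Rejection | null {
  const accepted = product.creative_policy.accepted_verifiers.some(
    (v) => v.agent_url === buyerAgentUrl,
  );
  if (accepted) return null;
  return {
    code: "PROVENANCE_VERIFIER_NOT_ACCEPTED",
    details: {
      agent_url: buyerAgentUrl,
      policy_ref: `products/${product.product_id}/creative_policy`, // hypothetical reference format
    },
  };
}

const sampleProduct: Product = {
  product_id: "prod_1",
  creative_policy: {
    accepted_verifiers: [{ agent_url: "https://verifier-a.example/agent" }],
  },
};
const onList = checkVerifier(sampleProduct, "https://verifier-a.example/agent");
const offList = checkVerifier(sampleProduct, "https://unknown.example/agent");
```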

All targeted tests pass (schemas, examples, json-schema, composed,
storyboard-*, docs-nav, build-schemas-hoist-enums).
@erik-sv erik-sv force-pushed the feat/provenance-embedded-watermark-fields branch from 9ec5bec to 233d692 May 1, 2026 18:10
@erik-sv
Contributor Author

erik-sv commented May 1, 2026

@bokelley Good catch on #3738's enumMetadata - rebased on current main and added the six PROVENANCE_* entries to the metadata block. Used exactly your suggested mapping (all correctable, suggestions extracted from the existing description prose).

Merge conflict resolved, enumMetadata populated, all targeted tests green. Branch should be clean against main now.

bokelley previously approved these changes May 1, 2026
The new creative_sales_agent/provenance_enforcement storyboard grades the
PROVENANCE_*_MISSING / PROVENANCE_VERIFIER_NOT_ACCEPTED rejection contract
end to end. The training agent has no provenance enforcement yet — the
spec lands in this PR ahead of the reference implementation, so 3 of the
storyboard's 5 step validations fail against the training agent
(get_products doesn't surface the seeded creative_policy fields;
sync_creatives accepts every submission).

Add the storyboard to KNOWN_FAILING_STORYBOARDS in
server/tests/manual/run-storyboards.ts, referencing adcp#3777 which tracks
the training-agent implementation work. Once that lands, the entry can be
removed and min_clean_storyboards / min_passing_steps in the workflow file
should be bumped accordingly.

Pattern matches the existing v3_envelope_integrity entry: the storyboard
defines the spec contract, the runner has a tracked deferral until the
reference agent catches up.
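
A rough sketch of the tracked-deferral pattern described above. The real data structure in `server/tests/manual/run-storyboards.ts` is not shown in this PR text, so the `Map` shape and the two helper functions here are assumptions made for illustration.

```typescript
// Hypothetical shape: storyboard id -> tracking reference for the deferral.
const KNOWN_FAILING_STORYBOARDS = new Map<string, string>([
  ["creative_sales_agent/provenance_enforcement", "adcp#3777"],
]);

interface StoryboardResult {
  id: string;
  passed: boolean;
}

// A failure only breaks the run when the storyboard is not on the deferral list.
function unexpectedFailures(results: StoryboardResult[]): StoryboardResult[] {
  return results.filter((r) => !r.passed && !KNOWN_FAILING_STORYBOARDS.has(r.id));
}

// A deferred storyboard that starts passing signals that its entry can be removed.
function staleDeferrals(results: StoryboardResult[]): string[] {
  return results
    .filter((r) => r.passed && KNOWN_FAILING_STORYBOARDS.has(r.id))
    .map((r) => r.id);
}
```

Reporting stale deferrals separately is one way to make sure the entry is removed once the training agent implements enforcement, instead of lingering after it is no longer needed.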

Refs: adcontextprotocol#3468, adcontextprotocol#3777.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…iling

Stack: mark provenance_enforcement storyboard as known-failing
auto-merge was automatically disabled May 1, 2026 18:59

Head branch was pushed to by a user without write access

@bokelley bokelley enabled auto-merge (squash) May 1, 2026 19:09
@bokelley bokelley disabled auto-merge May 1, 2026 19:41
@bokelley bokelley merged commit 75793d5 into adcontextprotocol:main May 1, 2026
28 of 33 checks passed
@erik-sv erik-sv deleted the feat/provenance-embedded-watermark-fields branch May 1, 2026 19:50
bokelley added a commit that referenced this pull request May 2, 2026
…ion, storyboard rigor)

Addresses all expert-review items from #3792's three-pass read
(code-reviewer, security-reviewer, nodejs-testing-expert).

Catalog safety + symmetric backfill (code-review):
- Restrict backfillProductDefaults to seeded-product IDs only by folding
  it into overlaySeededProducts. The previous integration mutated nested
  objects (format_ids[], reporting_capabilities) on every product
  including catalog ones; getCatalog() returns a shallow copy whose
  nested fields alias the cached singleton, so mutations would leak
  across requests once a partial catalog product appeared.
- Both handleGetProducts and handleCreateMediaBuy now go through the
  same overlay path, so backfill is applied symmetrically. Closes the
  asymmetric-backfill divergence the reviewer flagged.
- Rename backfillProductDefaults -> backfillTrainingProductDefaults to
  make the training-only intent obvious at every call site.
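
The aliasing hazard in the first item is easy to reproduce in isolation. The sketch below uses a stand-in `getCatalogShallow` and a simplified `Product` with string `format_ids`; both are assumptions for illustration and not the training agent's actual types.

```typescript
interface Product {
  product_id: string;
  format_ids: string[]; // simplified; the real entries may be objects
}

// Stand-in for the cached singleton catalog.
const cachedCatalog: Product[] = [
  { product_id: "catalog_1", format_ids: ["display_300x250"] },
];

// Shallow copy: new array and new top-level objects, but nested arrays are shared.
function getCatalogShallow(): Product[] {
  return cachedCatalog.map((p) => ({ ...p }));
}

const firstRequest = getCatalogShallow();
firstRequest[0].format_ids.push("video_15s"); // also mutates the cached array

const secondRequest = getCatalogShallow();
const leaked = secondRequest[0].format_ids.includes("video_15s"); // true

// Copy-on-write alternative: clone the nested field before changing it.
function withFormat(p: Product, formatId: string): Product {
  return { ...p, format_ids: [...p.format_ids, formatId] };
}
```

Restricting backfill to seeded products avoids the problem at the source; cloning nested fields before mutation, as in `withFormat`, is a complementary guard.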

Sanitization (security-review):
- Add sanitizeForError helper that strips C0/C1 controls and caps
  length; apply to buyer-controlled strings (verify_agent.agent_url,
  creative_id) before interpolating into TaskError.message and
  error.field. Defense against log/transcript poisoning by attacker-
  shaped values.
- Confirmed via review (and grep): no outbound HTTP touches the buyer-
  supplied verify_agent.agent_url anywhere in the enforcement path. The
  trust invariant from #3468 holds.
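
For readers unfamiliar with the technique, a helper of this kind might look like the sketch below. It is not the PR's implementation: the 200-character cap is a hypothetical value (the commit message does not state the limit), and stripping DEL (U+007F) alongside the C0 and C1 ranges is an added assumption.

```typescript
// Hypothetical cap; the actual limit is not stated in the commit message.
const MAX_ERROR_VALUE_LENGTH = 200;

// Strip C0 (U+0000 to U+001F), DEL (U+007F), and C1 (U+0080 to U+009F)
// control characters, then cap the length.
function sanitizeForError(
  value: string,
  maxLength: number = MAX_ERROR_VALUE_LENGTH,
): string {
  const stripped = value.replace(/[\u0000-\u001f\u007f-\u009f]/g, "");
  return stripped.length > maxLength ? stripped.slice(0, maxLength) : stripped;
}

// A buyer-controlled value that tries to forge a second log line.
const hostile = "https://evil.example/\r\n[INFO] forged log line\u0085";
const message = `verify_agent.agent_url ${sanitizeForError(hostile)} is not accepted`;
```

With the control characters removed, the forged text stays on the same line as the real message, so it can no longer pose as a separate log entry.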

Helper documentation:
- enforceProvenancePolicy gains a cascade-order docstring (storyboard
  assertions on errors[0] rely on stable ordering).
- aggregateCreativePolicy gains a comment explaining the deliberate
  asymmetry: requirement booleans are intersected (most-restrictive
  wins), accepted_verifiers are unioned (allowlist semantics).
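
One way to picture the asymmetry, as a sketch only. It reads "most-restrictive wins" as: a requirement applies when any contributing policy sets it. That reading, the flat `string[]` form of `accepted_verifiers`, and the function body are assumptions; only the names `aggregateCreativePolicy`, `provenance_required`, and `accepted_verifiers` come from the PR.

```typescript
interface CreativePolicy {
  provenance_required?: boolean;
  accepted_verifiers?: string[]; // simplified to bare agent URLs
}

function aggregateCreativePolicy(policies: CreativePolicy[]): Required<CreativePolicy> {
  const verifiers = new Set<string>();
  let required = false;
  for (const p of policies) {
    // Most-restrictive wins (assumed reading): one policy requiring
    // provenance makes it required for the aggregate.
    required = required || p.provenance_required === true;
    // Allowlist semantics: a verifier accepted by any policy is accepted overall.
    for (const url of p.accepted_verifiers ?? []) {
      verifiers.add(url);
    }
  }
  return { provenance_required: required, accepted_verifiers: Array.from(verifiers) };
}

const aggregated = aggregateCreativePolicy([
  { provenance_required: false, accepted_verifiers: ["https://a.example/agent"] },
  {
    provenance_required: true,
    accepted_verifiers: ["https://a.example/agent", "https://b.example/agent"],
  },
]);
```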

Storyboard rigor (testing-review):
- Add reject_no_provenance phase (PROVENANCE_REQUIRED) — the cheapest,
  most likely buyer mistake had zero coverage.
- Add reject_missing_digital_source_type phase
  (PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING). Fixture now requires
  digital_source_type so the policy gate fires.
- Tighten accept-phase assertion from field_present to
  field_value allowed_values: [created, updated]. field_present would
  have silently passed on action: failed.
- Comment the errors[0] indexing as a stable contract tied to
  enforceProvenancePolicy's cascade order, with a TODO to switch to a
  field_contains predicate when the SDK ships one (adcp#3803).
- Comment the brief-mode coupling for products[0] selection.
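
The difference between the two assertion styles in the accept phase can be shown with two small predicates. These are hypothetical reimplementations for illustration, not the storyboard validator's code.

```typescript
// Passes whenever the field exists, whatever its value.
function fieldPresent(obj: Record<string, unknown>, field: string): boolean {
  return obj[field] !== undefined;
}

// Passes only when the field holds one of the allowed values.
function fieldValueAllowed(
  obj: Record<string, unknown>,
  field: string,
  allowed: unknown[],
): boolean {
  return allowed.includes(obj[field]);
}

const failedSync = { action: "failed" };
const createdSync = { action: "created" };

// The weak assertion passes on a failed sync; the tightened one does not.
const weakOnFailure = fieldPresent(failedSync, "action");
const strictOnFailure = fieldValueAllowed(failedSync, "action", ["created", "updated"]);
const strictOnSuccess = fieldValueAllowed(createdSync, "action", ["created", "updated"]);
```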

Workflow comment correction:
- True origin/main baseline was 64 storyboards / 439 legacy steps / 457
  framework steps — the previous 53/388/401 floors had drifted. New
  floors: 65 / 446 / 464, lifting only the new
  creative_sales_agent/provenance_enforcement scenario (six phases).
  Comment corrected so the next reader doesn't reverse-engineer the
  baseline.

Truth-of-claim follow-up:
- Add static/compliance/source/protocols/media-buy/scenarios/
  provenance_truth_of_claim.yaml as a single-phase skeleton.
- Register it in KNOWN_FAILING_STORYBOARDS pointing at adcp#3802 (the
  follow-up issue tracking PROVENANCE_CLAIM_CONTRADICTED implementation
  in the training agent).

Conformance rigor follow-up:
- adcp#3803 tracks: (1) required-clean storyboard allowlist alongside
  the floors, (2) errors[*] field_contains predicate in the storyboard
  validator, (3) storyboard run in pre-push hook.

Local conformance both modes: 65/65 clean; 446 passing steps legacy /
464 framework — exactly matching the new floors.
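
As a sketch of what the floor check amounts to: the names `min_clean_storyboards` and `min_passing_steps` appear earlier in this PR, while the `RunSummary` shape and the `meetsFloors` function are assumptions and not the workflow's actual code.

```typescript
interface Floors {
  min_clean_storyboards: number;
  min_passing_steps: number;
}

interface RunSummary {
  clean_storyboards: number;
  passing_steps: number;
}

// A run passes when it is at or above both floors.
function meetsFloors(run: RunSummary, floors: Floors): boolean {
  return (
    run.clean_storyboards >= floors.min_clean_storyboards &&
    run.passing_steps >= floors.min_passing_steps
  );
}

// Figures quoted in the commit message for the two runner modes.
const legacyOk = meetsFloors(
  { clean_storyboards: 65, passing_steps: 446 },
  { min_clean_storyboards: 65, min_passing_steps: 446 },
);
const frameworkOk = meetsFloors(
  { clean_storyboards: 65, passing_steps: 464 },
  { min_clean_storyboards: 65, min_passing_steps: 464 },
);
```

Because the local run matches the floors exactly, losing a single passing step would fail the check.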

Refs: #3468, #3777, #3792.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 participants