feat(provenance): add embedded_provenance and watermarks to provenance schema #3468
Conversation
|
I have read the IPR Policy |
|
The IPR check CI is failing with a 422 from GitHub's API when the |
|
Thanks for tracking this down, @erik-sv. Root cause: Fix: Both methods now wrap the string in the expected object — Important: both IPR check workflows ( Generated by Claude Code |
|
Thanks for the diagnosis, @bokelley. Understood — the IPR check failure is a bug in the shared Generated by Claude Code |
|
Oh great, I think the split indeed makes sense: embedded_provenance vs. watermarks, as the former answers the chain-verification question and the latter answers the generation question. It is also in line with how you already do it at C2PA. Some notes from my end:
|
|
@pkras Good catches, both addressed in 988a0a0: Visibility surface framing: The Boolean semantics: Each boolean now specifies what "present" means:
This prevents the "field present but null" loophole without overspecifying the validation contract before the should-verify RFC lands. |
IPR Policy Agreement Required
@erik-sv, thanks for the contribution. Before this PR can be merged, the AgenticAdvertising.Org IPR Policy requires your agreement. To agree, post a new comment on this PR with the exact phrase: Your signature is recorded once and covers all contributions to AAO repositories. See |
|
I have read the IPR Policy |
|
Thanks @erik-sv — solid PR, and the C2PA taxonomy split (binding assertions vs. Two things I'd want pinned before merge, then a few smaller nits. 1. Relationship between 2. Smaller, optional:
On Track 4: I read your My vote: approve once #1 is clarified in the schema. #2 is a one-line description add. Everything else is nice-to-have. Generated by Claude Code |
988a0a0 to
75e909c
Compare
|
Thanks @bokelley, thorough review. Addressed all five items in the latest commit: Issue 1 (blocking): gate semantics. Issue 2 (blocking): verify_url forward-compat. Both Issue 3 (optional): verify_url relaxed to optional. Issue 4 (optional): c2pa_action factored out. New enum file Issue 5 (optional): changeset updated. Notes the |
…s, seller confirms
Reframe the verifier contract per expert review (security, protocol, product).
A buyer-controlled `verify_agent.agent_url` was an SSRF + exfil + phishing
surface, an inversion of the trust model (seller is verifier-of-record), and
a vendor-adoption ask (verifier vendors won't ship governance agents
unilaterally). The corrected contract:
- Seller publishes `creative_policy.accepted_verifiers[]` — the governance
agents it operates or has allowlisted (`agent_url`, optional `feature_id`,
optional `providers[]`). Returned on `get_products`.
- Buyer represents on `embedded_provenance[]`/`watermarks[]` by attaching
`verify_agent: { agent_url, feature_id? }` whose `agent_url` matches a
published `accepted_verifiers[]` entry (canonicalized).
- Seller confirms by cross-checking the URL against its own allowlist before
any outbound call, then invoking `get_creative_features` against the
matching on-list agent. Sellers MUST NOT call buyer-asserted endpoints
outside the allowlist.
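The seller-confirms step above can be sketched as a pure allowlist lookup that runs before any outbound call. This is a minimal sketch, not the reference implementation: `canonicalizeAgentUrl` is a simplified stand-in for the AdCP URL canonicalization rules, `matchAcceptedVerifier` is a hypothetical helper name, and treating `feature_id` as a match constraint only when both sides supply it is an assumption.

```typescript
// Hypothetical shapes; field names follow the commit message, the rest is assumed.
interface AcceptedVerifier {
  agent_url: string;
  feature_id?: string;
  providers?: string[];
}

interface VerifyAgent {
  agent_url: string;
  feature_id?: string;
}

// Simplified canonicalization (an assumption, NOT the AdCP rules):
// require https, lowercase the host, drop the fragment and trailing slashes.
function canonicalizeAgentUrl(raw: string): string | null {
  const m = /^https:\/\/([^\/?#]+)([^?#]*)(\?[^#]*)?/i.exec(raw.trim());
  if (m === null) return null;
  const host = (m[1] || "").toLowerCase();
  const path = (m[2] || "").replace(/\/+$/, "");
  const query = m[3] || "";
  return "https://" + host + path + query;
}

// Returns the matching on-list entry, or null when the buyer's representation
// is off-list (the PROVENANCE_VERIFIER_NOT_ACCEPTED case). No network I/O here.
function matchAcceptedVerifier(
  claim: VerifyAgent,
  accepted: AcceptedVerifier[],
): AcceptedVerifier | null {
  const wanted = canonicalizeAgentUrl(claim.agent_url);
  if (wanted === null) return null;
  for (const entry of accepted) {
    if (canonicalizeAgentUrl(entry.agent_url) !== wanted) continue;
    // Assumption: feature_id only constrains the match when both sides set it.
    if (
      entry.feature_id !== undefined &&
      claim.feature_id !== undefined &&
      entry.feature_id !== claim.feature_id
    ) {
      continue;
    }
    return entry;
  }
  return null;
}
```

Only the entry returned from the seller's own list would then be used as the target of `get_creative_features`; the buyer-supplied string is never dereferenced.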
Schema changes:
- `creative-policy.json`: add `accepted_verifiers[]` sibling field.
- `provenance.json`: rewrite `verify_agent` description as buyer's
representation; tighten `additionalProperties: false`; require `https://`
pattern on `agent_url`; drop "MAY substitute" language.
- `error-code.json`: add `PROVENANCE_VERIFIER_NOT_ACCEPTED` for the
cross-check rejection. Constrain `PROVENANCE_CLAIM_CONTRADICTED.details`
to the audit-safe allowlist `{ agent_url, feature_id, claimed_value,
observed_value, confidence, substituted_for }`.
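One way to honor the `details` allowlist is to copy only the named keys out of whatever the verifier returned. The sketch below assumes a hypothetical helper name (`auditSafeDetails`) and an untyped verifier payload; only the six key names come from the commit message.

```typescript
// The six audit-safe keys named in the commit message.
const AUDIT_SAFE_KEYS = [
  "agent_url",
  "feature_id",
  "claimed_value",
  "observed_value",
  "confidence",
  "substituted_for",
] as const;

// Hypothetical helper: keep only allowlisted keys, drop everything else.
function auditSafeDetails(raw: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of AUDIT_SAFE_KEYS) {
    if (Object.prototype.hasOwnProperty.call(raw, key)) {
      out[key] = raw[key];
    }
  }
  return out;
}
```

Building the output from the allowlist, rather than deleting known-bad keys from the input, means a new field on the verifier response stays out of error payloads until someone adds it deliberately.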
Doc changes:
- `provenance-verification.mdx`: new "The verifier contract" section laying
out seller-publishes / buyer-represents / seller-confirms, with worked
example covering all three steps. Updated rejection-code table and
buyer/seller checklists.
- `provenance.mdx`: rewrite `verify_agent` shape section as buyer's
representation; expand creative-policy enforcement example with
`accepted_verifiers`; fix the `PROVENANCE_CLAIM_CONTRADICTED` description
(verifier source is the seller's allowlist, not the buyer's nominated
endpoint); add `details` allowlist constraint.
Storyboard:
- `provenance_enforcement.yaml`: seed `accepted_verifiers` on the fixture;
new `reject_off_list_verifier` phase exercises
`PROVENANCE_VERIFIER_NOT_ACCEPTED`; positive-path phase now demonstrates
on-list `verify_agent` representation; added top-level `brand` to the
`get_products` step for parity with peer storyboards.
Validation: build:schemas + build:compliance pass; targeted tests
(test:schemas, test:examples, test:json-schema, test:storyboard-*,
test:composed, test:docs-nav, test:build-schemas-hoist-enums) all green.
Refs: #3468, #2854.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-to-end storyboard
Stacked on top of adcontextprotocol#3468 to ship the embedded_provenance / watermarks / provenance_requirements work as a complete, end-to-end-executable bundle.
verify_agent (replaces verify_url): each embedded_provenance and watermarks entry now carries a `verify_agent: { agent_url, feature_id? }` pointer at an AdCP governance agent that can verify the embedding via `get_creative_features`. Verification routes through the existing AdCP governance surface (governance.creative_features in get_adcp_capabilities) rather than per-vendor opaque webhooks. Multiple verifiers can implement checks for the same provider; receivers may substitute any agent that declares an equivalent feature.
PROVENANCE_* error codes: new entries on error-code.json give sellers a machine-readable rejection vocabulary for sync_creatives: PROVENANCE_REQUIRED, PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING, PROVENANCE_DISCLOSURE_MISSING, PROVENANCE_EMBEDDED_MISSING, and PROVENANCE_CLAIM_CONTRADICTED (active refutation by a verifier, distinct from absence). Each per-creative result carries error.field at the resolved provenance path; PROVENANCE_CLAIM_CONTRADICTED carries verifier identity in error.details. Buyers' orchestrators self-correct without negotiating with the seller.
creative-policy.json: drops the "visibility surface, not prevention surface" framing. Sellers that publish a provenance_requirements entry MUST enforce it on sync_creatives with the matching error code. The should-verify-RFC deferral language is removed from descriptions and the changeset.
Docs: docs/creative/provenance.mdx adds sections on embedded_provenance, watermarks, and the verify_agent shape, plus the rejection-code table. docs/governance/creative/provenance-verification.mdx adds a "What buyers can declare" / "What sellers can require" pair, the rejection-code reference, and updates the buyer/seller checklists to reflect the new fields and codes.
Storyboard: protocols/media-buy/scenarios/provenance_enforcement.yaml exercises the structural-rejection contract end to end: discover the requirement on get_products, submit without disclosure metadata and expect PROVENANCE_DISCLOSURE_MISSING, then resubmit with disclosure and expect acceptance.
Refs: adcontextprotocol#3468, adcontextprotocol#2854.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Merged @bokelley's stack (erik-sv#1) and pushed a follow-up commit. The branch now carries: From the stack (Brian's work):
Follow-up fixes (9ec5bec):
All targeted tests pass. Blocked on #3478 (IPR check script fix) landing on main so CI can re-run. |
|
Quick status check, since the picture shifted: #3478 has already merged (it lands as The conflict is on a single file —
So a clean rebase needs three things:
Without #3, the new codes ship without their structured recovery classification and SDKs won't surface them. Suggested mapping based on the existing prose:
"PROVENANCE_REQUIRED": { "recovery": "correctable", "suggestion": "attach a provenance object — at minimum digital_source_type — and resubmit" },
"PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING": { "recovery": "correctable", "suggestion": "set provenance.digital_source_type to a value from the digital-source-type enum and resubmit" },
"PROVENANCE_DISCLOSURE_MISSING": { "recovery": "correctable", "suggestion": "set provenance.disclosure.required and, when true, populate disclosure.jurisdictions" },
"PROVENANCE_EMBEDDED_MISSING": { "recovery": "correctable", "suggestion": "attach at least one embedded_provenance entry from a supported provider and resubmit" },
"PROVENANCE_VERIFIER_NOT_ACCEPTED": { "recovery": "correctable", "suggestion": "replace verify_agent.agent_url with one from the seller's published accepted_verifiers, drop verify_agent if the embedding is self-verifiable, or re-embed with a verifier the seller accepts" },
"PROVENANCE_CLAIM_CONTRADICTED": { "recovery": "correctable", "suggestion": "revise the provenance claim to match the verifier's observation or replace the creative; auto-retry without correction will not pass" }
Happy to push the rebase + |
…ovenance schema
Add two new optional arrays to provenance.json that distinguish between provenance metadata carried within the content stream (embedded_provenance) and content watermarks that encode an identifier or fingerprint (watermarks). The separation aligns with C2PA's normative taxonomy: embedded provenance maps to binding assertions and manifest embedding (Section A.7), while watermarks map to the c2pa.watermarked.* action family.
Also expand creative-policy.json with a provenance_requirements object that gives sellers structured, field-level provenance requirements beyond the existing provenance_required boolean. Supports EU AI Act Art. 50 and CA SB 942 compliance workflows.
New enums: embedded-provenance-method.json, watermark-media-type.json.
All fields are optional and additive. Existing agents are unaffected.
Refs: adcontextprotocol#2854
Address review feedback from @pkras:
- Reframe provenance_requirements as a visibility surface, not a prevention surface. The enforcement gap is explicitly called out in the schema description, not just the PR body.
- Specify that require_digital_source_type means the field must be present and set to a valid enum value (not null or absent).
- Specify that require_disclosure_metadata means disclosure.required must be a boolean, and when true, at least one jurisdictions entry is expected.
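The two "present" definitions in this commit can be read as small predicates. The sketch below is an illustration under assumptions: the helper names and the `ProvenanceLike` shape are hypothetical, and the set of valid `digital_source_type` values is passed in by the caller rather than hard-coded, since the enum file is the authority.

```typescript
// Hypothetical, loosely typed view of a submitted provenance object.
interface DisclosureLike {
  required?: unknown;
  jurisdictions?: unknown;
}

interface ProvenanceLike {
  digital_source_type?: unknown;
  disclosure?: DisclosureLike | null;
}

// require_digital_source_type: present AND a valid enum value (null or absent fails).
function hasDigitalSourceType(
  p: ProvenanceLike,
  validValues: readonly string[],
): boolean {
  const v = p.digital_source_type;
  return typeof v === "string" && validValues.indexOf(v) !== -1;
}

// require_disclosure_metadata: disclosure.required must be a boolean;
// when it is true, at least one jurisdictions entry is expected.
function hasDisclosureMetadata(p: ProvenanceLike): boolean {
  const d = p.disclosure;
  if (d === null || d === undefined) return false;
  if (typeof d.required !== "boolean") return false;
  if (d.required === false) return true;
  return Array.isArray(d.jurisdictions) && d.jurisdictions.length > 0;
}
```

Checking `typeof` rather than truthiness is what closes the "field present but null" loophole: a `null` or missing value fails both predicates.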
…compat, c2pa_action enum
Define provenance_required as the authoritative gate for provenance_requirements: the object refines the boolean and receivers ignore it when provenance_required is false or absent.
Add forward-compat note to verify_url on both embedded_provenance and watermarks: verification protocol is opaque in v1, wire contract deferred to should-verify RFC.
Relax verify_url from required to optional on embedded_provenance. Self-verifiable embeddings (e.g., C2PA text manifest with known public key) do not need a vendor endpoint. verify_url SHOULD be present for methods where the receiver cannot self-verify.
Factor c2pa_action inline enum to c2pa-watermark-action.json for consistency with other enum files.
Note c2pa field description repositioning in changeset.
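The gate semantics described in this commit reduce to a single guard on the receiver side. A minimal sketch, assuming a hypothetical helper name (`effectiveRequirements`) and a trimmed-down `CreativePolicyLike` shape that carries only the two fields under discussion:

```typescript
interface ProvenanceRequirements {
  require_digital_source_type?: boolean;
  require_disclosure_metadata?: boolean;
  require_embedded_provenance?: boolean;
}

// Hypothetical subset of creative-policy.json relevant to the gate.
interface CreativePolicyLike {
  provenance_required?: boolean;
  provenance_requirements?: ProvenanceRequirements;
}

// provenance_required is the authoritative gate: when it is false or absent,
// provenance_requirements is ignored entirely (signalled here by null).
function effectiveRequirements(
  policy: CreativePolicyLike,
): ProvenanceRequirements | null {
  if (policy.provenance_required !== true) return null;
  return policy.provenance_requirements || {};
}
```

Returning `null` for the closed gate and an empty object for "provenance required, no refinements" keeps the two cases distinguishable for callers.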
…ER_NOT_ACCEPTED details
Two follow-ups from review of the verify_agent trust model:
1. Replace opaque "canonicalized per the AdCP URL canonicalization rules" with an inline summary plus doc reference (/docs/reference/url-canonicalization) so implementers reading the schema get immediate guidance without hunting for the normalization spec. Add the provenance verifier allowlist surface to the canonicalization doc's "Where it applies" table.
2. Tighten PROVENANCE_VERIFIER_NOT_ACCEPTED error.details: instead of replaying the full accepted_verifiers[] snapshot on every rejection (the buyer already has this from get_products), reference the product's creative_policy. Reduces response payload size and avoids leaking the seller's full agent infrastructure on error paths.
All targeted tests pass (schemas, examples, json-schema, composed, storyboard-*, docs-nav, build-schemas-hoist-enums).
9ec5bec to
233d692
Compare
|
@bokelley Good catch on #3738's Merge conflict resolved, |
The new creative_sales_agent/provenance_enforcement storyboard grades the PROVENANCE_*_MISSING / PROVENANCE_VERIFIER_NOT_ACCEPTED rejection contract end to end. The training agent has no provenance enforcement yet: the spec lands in this PR ahead of the reference implementation, so 3 of the storyboard's 5 step validations fail against the training agent (get_products doesn't surface the seeded creative_policy fields; sync_creatives accepts every submission).
Add the storyboard to KNOWN_FAILING_STORYBOARDS in server/tests/manual/run-storyboards.ts, referencing adcp#3777 which tracks the training-agent implementation work. Once that lands, the entry can be removed and min_clean_storyboards / min_passing_steps in the workflow file should be bumped accordingly.
Pattern matches the existing v3_envelope_integrity entry: the storyboard defines the spec contract, the runner has a tracked deferral until the reference agent catches up.
Refs: adcontextprotocol#3468, adcontextprotocol#3777.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…iling Stack: mark provenance_enforcement storyboard as known-failing
Head branch was pushed to by a user without write access
…ion, storyboard rigor)
Addresses all expert-review items from #3792's three-pass read (code-reviewer, security-reviewer, nodejs-testing-expert).
Catalog safety + symmetric backfill (code-review):
- Restrict backfillProductDefaults to seeded-product IDs only by folding it into overlaySeededProducts. The previous integration mutated nested objects (format_ids[], reporting_capabilities) on every product including catalog ones; getCatalog() returns a shallow copy whose nested fields alias the cached singleton, so mutations would leak across requests once a partial catalog product appeared.
- Both handleGetProducts and handleCreateMediaBuy now go through the same overlay path, so backfill is applied symmetrically. Closes the asymmetric-backfill divergence the reviewer flagged.
- Rename backfillProductDefaults -> backfillTrainingProductDefaults to make the training-only intent obvious at every call site.
Sanitization (security-review):
- Add sanitizeForError helper that strips C0/C1 controls and caps length; apply to buyer-controlled strings (verify_agent.agent_url, creative_id) before interpolating into TaskError.message and error.field. Defense against log/transcript poisoning by attacker-shaped values.
- Confirmed via review (and grep): no outbound HTTP touches the buyer-supplied verify_agent.agent_url anywhere in the enforcement path. The trust invariant from #3468 holds.
Helper documentation:
- enforceProvenancePolicy gains a cascade-order docstring (storyboard assertions on errors[0] rely on stable ordering).
- aggregateCreativePolicy gains a comment explaining the deliberate asymmetry: requirement booleans are intersected (most-restrictive wins), accepted_verifiers are unioned (allowlist semantics).
Storyboard rigor (testing-review):
- Add reject_no_provenance phase (PROVENANCE_REQUIRED): the cheapest, most likely buyer mistake had zero coverage.
- Add reject_missing_digital_source_type phase (PROVENANCE_DIGITAL_SOURCE_TYPE_MISSING). Fixture now requires digital_source_type so the policy gate fires.
- Tighten accept-phase assertion from field_present to field_value allowed_values: [created, updated]. field_present would have silently passed on action: failed.
- Comment the errors[0] indexing as a stable contract tied to enforceProvenancePolicy's cascade order, with a TODO to switch to a field_contains predicate when the SDK ships one (adcp#3803).
- Comment the brief-mode coupling for products[0] selection.
Workflow comment correction:
- True origin/main baseline was 64 storyboards / 439 legacy steps / 457 framework steps; the previous 53/388/401 floors had drifted. New floors: 65 / 446 / 464, lifting only the new creative_sales_agent/provenance_enforcement scenario (six phases). Comment corrected so the next reader doesn't reverse-engineer the baseline.
Truth-of-claim follow-up:
- Add static/compliance/source/protocols/media-buy/scenarios/provenance_truth_of_claim.yaml as a single-phase skeleton.
- Register it in KNOWN_FAILING_STORYBOARDS pointing at adcp#3802 (the follow-up issue tracking PROVENANCE_CLAIM_CONTRADICTED implementation in the training agent).
Conformance rigor follow-up:
- adcp#3803 tracks: (1) required-clean storyboard allowlist alongside the floors, (2) errors[*] field_contains predicate in the storyboard validator, (3) storyboard run in pre-push hook.
Local conformance both modes: 65/65 clean; 446 passing steps legacy / 464 framework, exactly matching the new floors.
Refs: #3468, #3777, #3792.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
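The sanitization item in the commit above (strip C0/C1 controls, cap length) might look roughly like the following. This is a sketch of the idea only, not the repository's `sanitizeForError`: the name `sanitizeForErrorSketch`, the 200-character default cap, and the decision to also strip DEL (U+007F) are all assumptions.

```typescript
// Hypothetical stand-in for the helper described in the commit message.
// Strips C0 controls (U+0000..U+001F), DEL (U+007F, an added assumption),
// and C1 controls (U+0080..U+009F), then caps the length.
function sanitizeForErrorSketch(value: string, maxLength: number = 200): string {
  const stripped = value.replace(/[\u0000-\u001F\u007F-\u009F]/g, "");
  return stripped.length > maxLength ? stripped.slice(0, maxLength) : stripped;
}

// Example: a buyer-controlled agent_url carrying a newline that could
// otherwise forge an extra line in a log or transcript.
const hostileUrl = "https://verifier.example.com/\nFAKE LOG LINE";
const safeMessage =
  "verify_agent.agent_url not accepted: " + sanitizeForErrorSketch(hostileUrl);
```

Stripping before interpolation, rather than escaping at render time, means every downstream consumer of the error message sees the same single-line string.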
Summary
Adds two new optional arrays to provenance.json and a provenance_requirements object to creative-policy.json, addressing the must-carry baseline expansion proposed in #2854 (Option A) plus the field shape for embedded content provenance (Track 1).
Problem
The current provenance schema carries sidecar manifest references (c2pa.manifest_url) and third-party detection results (verification), but has no way to declare that provenance metadata is embedded within the content stream itself. Ad-server transcoding, CMS ingestion, and CDN re-encoding can all break file-level bindings. Content that passes through multiple intermediaries arrives at the publisher stripped of sidecar provenance.
Separately, there is no structured way for a seller to require specific provenance fields beyond the existing provenance_required boolean.
Changes
provenance.json - two new optional arrays
embedded_provenance declares that provenance metadata is carried within the content stream. Each entry identifies one embedding layer via:
- method (new enum: manifest_wrapper or provenance_markers)
- provider (organization that performed the embedding)
- verify_url (verification endpoint for governance agents)
- standard (optional, e.g., c2pa for Section A.7)
- embedded_at (optional, ISO 8601 timestamp)
watermarks declares that content watermarks have been applied. Each entry identifies one watermarking layer via:
- media_type (new enum: audio, image, video, text)
- provider (organization that applied the watermark)
- verify_url (optional, verification endpoint)
- c2pa_action (optional: c2pa.watermarked.bound or c2pa.watermarked.unbound)
- embedded_at (optional, ISO 8601 timestamp)
Why two fields, not one
The distinction follows C2PA's normative taxonomy:
- embedded_provenance maps to binding assertions and manifest embedding (Section A.7).
- watermarks maps to the c2pa.watermarked.* action family. A watermark encodes an identifier (who generated it, who owns it) but does not carry a provenance record.
A single asset can carry both. An article might have provenance markers in its text (embedded provenance) and a spread-spectrum watermark in an accompanying audio clip (watermark). The fields serve different governance questions: "Can I verify the provenance chain?" vs. "Can I detect who generated this?"
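To make the "a single asset can carry both" case concrete, here is a hedged sketch of the two entry shapes as TypeScript types, following the field list in this summary (which uses verify_url; later commits in this PR replace it with verify_agent). Modeling `provider` as a plain string, and the example provider names and URL, are assumptions for illustration.

```typescript
type EmbeddedProvenanceMethod = "manifest_wrapper" | "provenance_markers";
type WatermarkMediaType = "audio" | "image" | "video" | "text";
type C2paWatermarkAction = "c2pa.watermarked.bound" | "c2pa.watermarked.unbound";

interface EmbeddedProvenanceEntry {
  method: EmbeddedProvenanceMethod;
  provider: string; // assumption: modeled as a plain string here
  verify_url: string;
  standard?: string;
  embedded_at?: string; // ISO 8601 timestamp
}

interface WatermarkEntry {
  media_type: WatermarkMediaType;
  provider: string; // assumption: modeled as a plain string here
  verify_url?: string;
  c2pa_action?: C2paWatermarkAction;
  embedded_at?: string; // ISO 8601 timestamp
}

interface ProvenanceDeclaration {
  embedded_provenance?: EmbeddedProvenanceEntry[];
  watermarks?: WatermarkEntry[];
}

// The article-plus-audio example from the text; names and URL are placeholders.
const articleWithAudio: ProvenanceDeclaration = {
  embedded_provenance: [
    {
      method: "provenance_markers",
      provider: "Example Embedder",
      verify_url: "https://verify.example.com/check",
      embedded_at: "2025-01-15T12:00:00Z",
    },
  ],
  watermarks: [
    {
      media_type: "audio",
      provider: "Example Watermarker",
      c2pa_action: "c2pa.watermarked.bound",
    },
  ],
};

// Hypothetical helper: count the layers declared of each kind.
function countLayers(p: ProvenanceDeclaration): { embedded: number; watermarks: number } {
  return {
    embedded: (p.embedded_provenance || []).length,
    watermarks: (p.watermarks || []).length,
  };
}
```

Keeping the two arrays separate lets a receiver answer each governance question independently: an empty `watermarks` array says nothing about whether the provenance chain is verifiable.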
creative-policy.json - provenance_requirements object
Expands the existing provenance_required boolean with structured, field-level requirements:
- require_digital_source_type: seller requires a digital_source_type declaration
- require_disclosure_metadata: seller requires disclosure metadata
- require_embedded_provenance: seller requires at least one embedded_provenance entry
All three are optional booleans. Existing seller agents that do not read the object are unaffected.
Enforcement gap disclosure: provenance_requirements in this version is a declaration framework, not an enforcement guarantee. A non-enforcing seller silently accepts non-compliant creative. The should-verify receiver-obligation RFC (deferred Track 2 from #2854) is intended to close this gap.
New enum files
- enums/embedded-provenance-method.json (manifest_wrapper, provenance_markers)
- enums/watermark-media-type.json (audio, image, video, text)
Existing c2pa field
The existing c2pa.manifest_url field is unchanged. Its description now notes that file-level bindings break during transcoding and points to embedded_provenance for pipelines with intermediaries.
How this addresses #2854
- provenance_requirements
- embedded_provenance with verify_url
- embedded_provenance provides the transcoding-resilient alternative
Channel coverage
- embedded_provenance (text provenance markers on editorial artifact); verify_url, checks for publisher's markers
- embedded_provenance (text provenance markers on AI response)
- watermarks (video/audio watermark on creative)
- watermarks (audio watermark on creative)
- c2pa sidecar or embedded_provenance (manifest wrapper)
Standards alignment
- embedded_provenance aligns with binding assertions (c2pa.hash.data, c2pa.soft-binding) and format-specific manifest embedding (Section A.7). watermarks aligns with the c2pa.watermarked.bound / c2pa.watermarked.unbound action taxonomy.
- digital_source_type alignment. IPTC's vocabulary describes how content was produced, not what provenance it carries. The new fields address a gap that IPTC does not cover.
Wire compatibility
All new fields are optional. No existing required fields are changed. Existing agents ignore unknown fields per additionalProperties: true. This is a purely additive, non-breaking change.