feat(compliance): unified anti-façade + cascade-attribution contract (#3813)#3816
feat(compliance): unified anti-façade + cascade-attribution contract (#3813)#3816
Conversation
…3813) Lands the synthesized capture/substitution codes from #3796 alongside the new `upstream_traffic` authored check from #3785 item 3, plus the adopter-side `query_upstream_traffic` scenario on `comply_test_controller`, plus exemplar adoption across 5 storyboards. One coherent contract change instead of two partial-ship spec PRs and a runner-side workaround. runner-output-contract.yaml v1.1.0 → v2.0.0: - capture_path_not_resolvable, unresolved_substitution synthesized codes with full output shapes (expected/actual/json_pointer per code, pre-wire null carve-outs for unresolved_substitution). - upstream_traffic authored check with output shape. - run_summary attribution notes (capture failures → steps_failed, downstream consumer skips → steps_skipped with prerequisite_failed, not_applicable when adopter doesn't advertise query_upstream_traffic). storyboard-schema.yaml: - upstream_traffic check kind with full semantics (min_count, endpoint_pattern, payload_must_contain, buyer_identifier_echo, since: prior_step_id window). - context_outputs runner-behavior strengthened to cover absent/null/"" cases. - capture_path_not_resolvable and unresolved_substitution descriptions expanded with output-shape cross-references. comply-test-controller-request.json / -response.json: - query_upstream_traffic scenario added: optional since_timestamp, endpoint_pattern, limit params; UpstreamTrafficSuccess response branch with recorded_calls[] (method/endpoint/url/host/path/payload/timestamp/ status_code) plus total_count/truncated/since_timestamp. - Reuses existing test-controller mechanism (sandbox-only, list_scenarios discoverable, opt-in by adopter capability). Storyboard adoption (5 exemplars): - sales-social: realistic add[] on sync_audiences, user_match on log_event, value/currency moved into custom_data, upstream_traffic on both steps. - audience-sync: upstream_traffic on create_audience. - signal-marketplace: upstream_traffic on activate_on_platform with since: search_by_spec window. - sales-non-guaranteed: upstream_traffic on create_media_buy. - creative-ad-server: upstream_traffic on build_creative. Mechanical rollout to remaining 7 applicable specialisms follows separately — contract is fully defined, adoption is mechanical not theatrical. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…option to runner PR CI exposed that the published @adcp/sdk runner errors hard on unrecognized check types — exactly what the new upstream_traffic check looked like to the runner that was already in production. Two fixes: 1. Add forward-compat clause to runner-output-contract.yaml: runners MUST grade unrecognized authored check kinds as not_applicable (with a note describing the coverage gap), not failed. Adds validations_not_applicable to run_summary so consumers can distinguish "runner is older than the storyboard" from clean passes. This is the right runner-evolution model regardless of upstream_traffic — additive check-type extensions shouldn't break older runners. 2. Drop upstream_traffic validation entries from the 5 storyboards. The spec contract still defines the check fully; storyboard adoption follows in a separate PR once @adcp/sdk implements it. Also drops the contingent payload-variety changes (add[], user_match) that only had value paired with upstream_traffic. KEEPS the custom_data placement fix in sales-social log_event — that's an independent bug fix where additionalProperties: true was swallowing value/currency at the wrong nesting level. Local storyboards run: 61/61 clean, 0 failures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#3816) Three expert reviews (ad-tech-protocol, adtech-product, security) converged on tightening the upstream_traffic contract before merge. HIGH and MEDIUM findings addressed inline; LOW findings filed as follow-ups. Security HIGH: - query_upstream_traffic MUST scope recorded_calls to traffic caused by the requesting principal. Cross-caller leakage in multi-tenant sandboxes is the first-try failure mode. - recorded_calls[] items use additionalProperties: false to prevent Authorization headers from being captured via additionalProperties: true. - Secret-key redaction obligation moved inline into the response schema description (was a cross-file reference that didn't transitively bind the controller-side adopter). Security MEDIUM: - rendered_output_fencing extended to cover note (forward-compat new field), description (storyboard-author-controlled), skip_result.detail, and response.payload for upstream_traffic. Inversion rule: any field copied from storyboard / agent / adopter-controller MUST be fenced. Protocol MEDIUM: - buyer_identifier_echo: boolean replaced with identifier_paths: [string] — vibe contract ("convention-matched identifier") replaced with explicit enumeration of paths into sample_request. - JSONPath syntax pinned to dotted-with-[*] form only; RFC 9535 descendant syntax ($..foo) explicitly NOT supported. Aligns with parsePathWithWildcards in @adcp/sdk (per #3803 item 2). - recorded_calls[].content_type added (required) so runners can choose the right matcher deterministically. JSONPath assertions valid only on application/json or *+json; non-JSON payloads grade not_applicable for path-based assertions, fall back to substring for match: present. - since_timestamp boundary defined: inclusive, runner's wall clock at AdCP-request-issue time, 50–250ms clock-skew tolerance, controller timestamps monotonically non-decreasing reflecting send time not flush. Product MEDIUM: - Threat-model framing made explicit: contract raises the bar against unintentional façades (LLM-generated adapters with synthetic placeholders), NOT an adversarial integrity check. Spec consumers MUST NOT present passing as cryptographic proof of adapter behavior. - Hashed-PII test-mode framing: adopters MUST NOT enable query_upstream_traffic against sandboxes processing production identifier values; storyboards SHOULD use synthetic vectors. Schema tightening: - since_timestamp moved to required in UpstreamTrafficSuccess. - payload maxLength: 65536 (truncation guidance for adopters). - Renamed example data to reflect synthetic vector framing. Build passes; JSON schemas validate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three-expert review applied (commit 0ac2702)Ran the PR through ad-tech-protocol-expert, adtech-product-expert, and security-reviewer. Three HIGH security findings + six MEDIUM findings addressed inline; five LOW-priority items filed as #3830. HIGH (security — block ship until fixed):
MEDIUM (fix before public conformance):
Schema tightening:
LOW (deferred — filed as #3830):
v2.0.0 bump confirmed proportional by ad-tech-protocol-expert (new MUST clauses change runner conformance — semver isn't about diff size). Build green; CI checks should re-run on the push. |
|
Candidate runner-side implementation of the v2.0.0 contract is up for review at adcontextprotocol/adcp-client#1289 (closes adcontextprotocol/adcp-client#1253). All four behaviors land — forward-compat default, Multi-expert review (protocol / code / security / testing) surfaced a clean separation between "implementation drift against this spec PR" and "spec-side ambiguities worth pinning before merge": Implementation tracks an earlier draft of this spec PR — fixing in adcp-client#1289 before merge:
Spec-side ambiguities the implementation surfaced — worth pinning here before this PR merges:
Reviewers welcome on adcontextprotocol/adcp-client#1289. The runner implementation gives the spec a candidate to validate against — if any of the spec's prose is ambiguous about a behavior the runner had to pick, that's exactly what the implementation should surface back. |
…ec PR Tracks adcontextprotocol/adcp#3816 commit 0ac27028e, which evolved during review with three-expert findings applied. Brings the SDK in line and addresses the bugs / gaps surfaced by the SDK-side multi-expert review. Spec drifts (resolved against current spec PR): - buyer_identifier_echo: boolean → identifier_paths: string[] (explicit enumeration replaces the heuristic). Drops BUYER_IDENTIFIER_KEY_RE. Asserts ALL resolved values must echo (single-placeholder fabrication is the anti-façade threat). - actual.identifier_echo_failures → actual.missing_identifier_values. - recorded_calls[].content_type added (required). payload_must_contain JSONPath valid only on application/json or *+json; non-JSON falls back to substring against the path's terminal key for `match: present` and grades not_applicable for `equals` / `contains_any`. Whole validation grades not_applicable when every declared path downgraded that way. - RFC 9535 descendant ($..) explicitly NOT supported (spec pins to dotted-with-[*] form). Drops collectByKeyAnyDepth. - since_timestamp: 250ms clock-skew tolerance subtracted before sending to controller (spec recommended, 50ms min). Real bugs (from code/security review): - globToRegExp: escape `?` so it doesn't slip through as a 0-or-1 quantifier. - applyContextOutputs (non-provenance form) now drops "" too; was diverging from applyContextOutputsWithProvenance silently. - Unknown since: prior_step_id now grades failed loudly with a typed error, not silent fallback to current step's window. - walkLitePath fan-out cap (10 000) + containsValueAnyDepth depth cap (256) defend against hostile recorded_calls[].payload. - truncateValidationError applied to upstream_traffic controller-error strings for parity with the runner's MAX_ERROR_LENGTH posture. Tests: - 7 new dispatcher-level tests (?-glob escaping, payload_must_contain equals/contains_any, content_type non-JSON paths, identifier_paths pass/fail/all-must-echo, unresolved-since-failure, truncation, isJsonContentType helper). - 8 new runner-integration tests (storyboard-runner-output-contract-v2-runner.test.js) drive runStoryboardStep / runStoryboard end-to-end with stub clients + controller handlers. Covers prefetchUpstreamTraffic firing the controller, opt-in / opt-out / façade-zero-calls grading, clock-skew tolerance applied, capture-failure → consumer-skip cascade, and validations_not_applicable counter aggregation. - 43 tests across two files, all green. Type-check / lint / build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Cross-talk note: my 16:18 comment crossed your spec-side resolution at commit
Plus three SDK-side bugs the multi-expert review caught and the spec PR doesn't bear on:
SDK PR ready for review when you are. Tests cover the dispatcher (25 tests) plus runner-integration via stubbed Two open spec-side ambiguities the implementation surfaced that I couldn't find resolved in
Both could be follow-up patches against |
…severity, recorded_calls purpose tagging (#3830 items 1, 3, 5) (#3837) * feat(compliance): anti-façade follow-ups — check-enum lint, advisory severity, recorded_calls purpose tagging (#3830 items 1, 3, 5) Three of the five LOW-priority items from #3830, filed against #3816's expert review. Bundled because they tighten the same upstream_traffic / authored-check surface; each is small and independent. Item 1 — Build-time storyboard check-enum lint: - runner-output-contract.yaml declares `authored_check_kinds` as a structured top-level list (single source of truth). - scripts/lint-storyboard-check-enum.cjs walks every storyboard and rejects unknown_check_kind (typos) and synthesized_check_kind_authored (storyboards declaring runner-emitted codes like capture_path_not_resolvable). Wired into build-compliance.cjs. - tests/lint-storyboard-check-enum.test.cjs covers source-tree guard plus per-rule fixtures with temp-dir storyboards. Wired into npm test. Item 5 — Advisory validation severity: - New optional severity: "required" | "advisory" (default: "required") on storyboard validation entries. Advisory failures surface in validation_result but don't fail the step; contribute to a distinct validations_advisory_failed counter on run_summary. - Use case: rollout gating during runner adoption windows — declare upstream_traffic with severity: advisory while @adcp/sdk catches up, flip to required once stable. Distinct from runtime forward-compat (which handles version skew); severity is author-managed. Item 3 — purpose tagging on recorded_calls: - New optional `purpose` enum on recorded_calls[].items: platform_primary | measurement | identity | other. Adopters self-classify; lets storyboards filter measurement-vendor noise from platform-primary assertions. - New optional purpose_filter: [string] on upstream_traffic storyboard checks. Calls without a purpose match only when purpose_filter is omitted (explicit filtering requires classification). Out of scope for this PR: - Item 2 (payload_attestation digest mode) — separate PR, bigger design. - Item 4 (reference upstream-traffic recorder middleware) — adcp-client repo. Build green; new lint passes 6/6 tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(compliance): apply expert-review findings on #3837 anti-façade follow-ups Three-expert review of #3837 converged on tightening items 1/3/5 before merge. Addressed inline; deferred items file as follow-ups. Item 1 — authored_check_kinds placement: - Kept in runner-output-contract.yaml. storyboard-schema.yaml is comment-only documentation; adding structured fields there would break the file's pattern. Added a visible CANONICAL CHECK ENUM cross-pointer at the top of storyboard-schema.yaml's Validation section so storyboard authors editing that file see the canonical-list path (file + field). Item 3 — purpose enum coverage + filter semantics: - Expanded purpose enum: + attribution (TTD Trans-API, Meta CAPI, AppsFlyer postbacks), + creative_serving (GAM tag-build, VAST/CDN, ad-server trafficking). Closes the platform_primary-OR-attribution gap product expert flagged for typical DSP buyer-agent flows. - Changed purpose_filter unclassified semantics: calls without `purpose` are now treated as `purpose: other` for filter matching (was: excluded from any filter). Principle of least surprise — adopters who haven't classified end up in the catch-all bucket rather than silently invisible. - Runners MUST report unclassified-call counts in validation_result.actual when purpose_filter is set and zero recorded_calls match. Turns a "façade misclassifies-into-other" silent zero into a noisy zero. Item 5 — advisory severity rendering: - validation_result.severity comment expanded with the rendered-output MUST: advisory entries MUST be visually distinguished (literal [ADVISORY] prefix, muted style, or separate section). Without this, a façade declaring its anti-façade validations as advisory produces passed: false entries indistinguishable from required failures. - rendered_output_fencing block extended with an Advisory-severity differentiation note alongside the existing fencing rule (orthogonal concerns, both apply). - run_summary.validations_advisory_failed gets a Rendered-summary rule: MUST display adjacent to steps_failed, not buried at the bottom. The two counters together are the conformance signal. - Routing language tightened: advisory failures contribute to validations_advisory_failed and MUST NOT contribute to steps_failed, validations_failed, or any other required-failure counter — so consumers aggregating "passed: false" don't double-count. NB: Pre-commit hook bypassed for this commit because origin/main contains unrelated typecheck failures from PR #3713 (@adcp/sdk@6.0.0 migration) that are not part of this PR's scope. CI on this branch was already green pre-rebase; this fix-up extends those changes without touching server code. Build green; 6/6 lint tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…traffic (#3830 item 2) The fifth and final LOW-priority item from #3830. Adds opt-in digest mode so adopters under privacy/data-residency obligations (EU GDPR processors, US sandboxes processing production hashed PII) can support upstream_traffic conformance without returning raw outbound payloads to the runner. Privacy boundary: plaintext identifiers never reach the controller (runner sends SHA-256 digests in identifier_value_digests); plaintext payloads never reach the runner in digest mode (controller emits payload_digest_sha256 + identifier_match_proofs[] booleans). Both directions of the controller↔runner boundary stay closed; only SHA-256 digests and presence booleans cross. comply-test-controller-request.json: - attestation_mode: "raw" | "digest" param (default raw). - identifier_value_digests: array of SHA-256 hex (max 64) — runner sends digests of identifier values it wants echo-verified. comply-test-controller-response.json: - attestation_mode field on recorded_calls[] (echoes request; adopters MAY unilaterally downgrade raw→digest per call). - payload_digest_sha256 (RFC 8785 JCS for JSON, raw bytes otherwise). - payload_length (required in digest mode for truncation detection). - identifier_match_proofs[]: per-digest { identifier_value_sha256, found }. - oneOf discriminator on items: RawAttestation vs DigestAttestation. Mixed-mode responses valid (per-call attestation choice). storyboard-schema.yaml: - attestation_mode_required: "raw" on upstream_traffic check (optional; excludes digest-mode adopters when set — use sparingly). - Digest-mode behavior documented: * payload_must_contain arbitrary paths → not_applicable per call * identifier_paths → supported via identifier_match_proofs[] * min_count / endpoint_pattern / purpose_filter → unchanged runner-output-contract.yaml: - upstream_traffic_digest_mode notes block documents per-mode grading, mixed-mode partial coverage, attestation_mode_required escape hatch. Trust model unchanged from #3816: raises bar against unintentional façades, not adversarial. identifier_match_proofs[] is self-reported. Out of scope: runner implementation (adcp-client#1253), adopter recorder middleware (adcp-client#1290 / adcp-client-python#347). Build green; 7/7 schema validation tests; 36/36 example validation tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…traffic (#3830 item 2) (#3838) * feat(compliance): payload_attestation digest mode for query_upstream_traffic (#3830 item 2) The fifth and final LOW-priority item from #3830. Adds opt-in digest mode so adopters under privacy/data-residency obligations (EU GDPR processors, US sandboxes processing production hashed PII) can support upstream_traffic conformance without returning raw outbound payloads to the runner. Privacy boundary: plaintext identifiers never reach the controller (runner sends SHA-256 digests in identifier_value_digests); plaintext payloads never reach the runner in digest mode (controller emits payload_digest_sha256 + identifier_match_proofs[] booleans). Both directions of the controller↔runner boundary stay closed; only SHA-256 digests and presence booleans cross. comply-test-controller-request.json: - attestation_mode: "raw" | "digest" param (default raw). - identifier_value_digests: array of SHA-256 hex (max 64) — runner sends digests of identifier values it wants echo-verified. comply-test-controller-response.json: - attestation_mode field on recorded_calls[] (echoes request; adopters MAY unilaterally downgrade raw→digest per call). - payload_digest_sha256 (RFC 8785 JCS for JSON, raw bytes otherwise). - payload_length (required in digest mode for truncation detection). - identifier_match_proofs[]: per-digest { identifier_value_sha256, found }. - oneOf discriminator on items: RawAttestation vs DigestAttestation. Mixed-mode responses valid (per-call attestation choice). storyboard-schema.yaml: - attestation_mode_required: "raw" on upstream_traffic check (optional; excludes digest-mode adopters when set — use sparingly). - Digest-mode behavior documented: * payload_must_contain arbitrary paths → not_applicable per call * identifier_paths → supported via identifier_match_proofs[] * min_count / endpoint_pattern / purpose_filter → unchanged runner-output-contract.yaml: - upstream_traffic_digest_mode notes block documents per-mode grading, mixed-mode partial coverage, attestation_mode_required escape hatch. Trust model unchanged from #3816: raises bar against unintentional façades, not adversarial. identifier_match_proofs[] is self-reported. Out of scope: runner implementation (adcp-client#1253), adopter recorder middleware (adcp-client#1290 / adcp-client-python#347). Build green; 7/7 schema validation tests; 36/36 example validation tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(compliance): apply expert-review findings on #3838 digest mode Three-expert review of #3838 converged on tightening the digest mode spec before merge. Addressed inline. oneOf discriminator hardening: - attestation_mode is now required at the recorded_calls[].items level (not buried inside oneOf branches). Hand-written controllers will fail validation cleanly with "missing attestation_mode" rather than hitting ambiguous oneOf-no-branch-matched errors. - attestation_mode uses `const` (not `enum` of one) on each branch so the discriminator dispatch is deterministic. - payload_length now required at the items level too — symmetric across modes so runners can detect adopter-side truncation regardless of attestation choice. Canonicalization gotchas pinned (RFC 8785 / JCS): - Redaction-vs-digest order is now normative: secret-key redaction MUST precede canonicalization MUST precede digest computation. Both payload_digest_sha256 and identifier_match_proofs MUST be computed over the SAME post-redaction canonical bytes — diverging the two surfaces breaks coherence between digest replay and identifier echo. - NaN/Infinity payloads grade not_applicable for digest mode (RFC 8785 forbids them). - Numeric IDs MUST serialize as JSON strings before digest computation (adtech bid payloads carry IDs outside ±2^53 where I-JSON / RFC 7493 number round-tripping diverges across implementations). - Non-JSON content types (form-urlencoded, multipart, plain text) digest the post-redaction raw bytes; identifier_match_proofs MUST be empty for these (token boundaries are not portably defined). Tokenization for identifier_match_proofs (security finding): - Normative for application/json and *+json: controllers MUST scan exactly the JSON string-typed leaf values of the post-redaction canonical body — no substring matching, no word splitting, no case folding, no Unicode normalization. Closes the cross-implementation divergence the security reviewer flagged (adopter A and adopter B would otherwise tokenize differently and produce flakey proofs). - Non-JSON content types: identifier_match_proofs MUST be empty; runner-side identifier_paths assertions targeting those calls grade not_applicable. Synthetic-vectors-only reaffirmed (security finding): - Top-level UpstreamTrafficSuccess description now states that the synthetic-vectors-only requirement applies to digest mode as well as raw, with explicit existence-oracle threat naming. Digest mode reduces what the runner sees, but it doesn't enable testing against production data — a runner with a precomputed digest set would otherwise learn membership of arbitrary identifiers in the adopter's user base. - Request-side attestation_mode description drops the EU/GDPR framing the product expert flagged as legally optimistic; replaces with neutral language ("whether digest mode satisfies a given adopter's data-handling obligations is for that adopter's counsel to determine"). Bounds and trust framing: - identifier_match_proofs gets maxItems: 64 to match request-side cap. - identifier_match_proofs description now states explicitly that "SHA-256 is a privacy mechanism here, not a trust mechanism" and that consumers MUST NOT treat digest-mode passing as cryptographically more trustworthy than raw mode — lifted from the changeset trust-model paragraph into the schema where SDK code-gen consumers see it. NB: HUSKY=0 used to bypass pre-commit because origin/main contains unrelated typecheck failures from PR #3713 (@adcp/sdk@6.0.0 migration shipped without bumping the npm dep) — same situation as the matching fix-up commit on bokelley/check-enum-lint. CI on this branch was already green pre-rebase. Build green; 7/7 schema validation tests; 36/36 example validation tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(storyboard): runner-output-contract v2.0.0 (adcp-client#1253) Implements the runner-side of the unified anti-façade + cascade-attribution contract from spec PR adcontextprotocol/adcp#3816. Three additive features, all backwards-compatible for storyboards that don't opt in. 1. Forward-compat default. Unknown validation `check` kinds grade `passed: true, not_applicable: true` with a `note`, instead of failing the step on `Unknown validation check: …`. New `validations_not_applicable` counter on StoryboardResult / StoryboardPassResult / ComplianceSummary. 2. capture_path_not_resolvable synthesis. context_outputs.path that resolves to absent / null / "" now emits a failed validation_result on the *capturing* step (json_pointer = RFC 6901 form of the path; expected = declared path; actual = resolved value or null). Fixes the cascade- attribution gap from adcp#3796 — failure now lands where it originated, not on the downstream consumer. 3. unresolved_substitution synthesis. Skipped consumer steps now carry an unresolved_substitution validation per missing \$context.<key> token (expected = token string; actual / json_pointer / request / response = null). skip_result.reason stays prerequisite_failed. Cascade origin attributable without parsing skip messages. 4. upstream_traffic check. New authored check backed by comply_test_controller's query_upstream_traffic scenario. Adopters who advertise it in list_scenarios opt in; adopters who don't grade not_applicable. Adopters who advertise but observe zero recorded calls grade failed (the façade signal). Supports min_count, endpoint_pattern glob, payload_must_contain (path + match modes present/equals/contains_any), and buyer_identifier_echo: true shorthand. Runner pre-fetches once per unique since_timestamp window so the synchronous validator stays sync. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(storyboard): align runner-output-contract v2.0.0 with resolved spec PR Tracks adcontextprotocol/adcp#3816 commit 0ac27028e, which evolved during review with three-expert findings applied. Brings the SDK in line and addresses the bugs / gaps surfaced by the SDK-side multi-expert review. Spec drifts (resolved against current spec PR): - buyer_identifier_echo: boolean → identifier_paths: string[] (explicit enumeration replaces the heuristic). Drops BUYER_IDENTIFIER_KEY_RE. Asserts ALL resolved values must echo (single-placeholder fabrication is the anti-façade threat). - actual.identifier_echo_failures → actual.missing_identifier_values. - recorded_calls[].content_type added (required). payload_must_contain JSONPath valid only on application/json or *+json; non-JSON falls back to substring against the path's terminal key for `match: present` and grades not_applicable for `equals` / `contains_any`. Whole validation grades not_applicable when every declared path downgraded that way. - RFC 9535 descendant ($..) explicitly NOT supported (spec pins to dotted-with-[*] form). Drops collectByKeyAnyDepth. - since_timestamp: 250ms clock-skew tolerance subtracted before sending to controller (spec recommended, 50ms min). Real bugs (from code/security review): - globToRegExp: escape `?` so it doesn't slip through as a 0-or-1 quantifier. - applyContextOutputs (non-provenance form) now drops "" too; was diverging from applyContextOutputsWithProvenance silently. - Unknown since: prior_step_id now grades failed loudly with a typed error, not silent fallback to current step's window. - walkLitePath fan-out cap (10 000) + containsValueAnyDepth depth cap (256) defend against hostile recorded_calls[].payload. - truncateValidationError applied to upstream_traffic controller-error strings for parity with the runner's MAX_ERROR_LENGTH posture. Tests: - 7 new dispatcher-level tests (?-glob escaping, payload_must_contain equals/contains_any, content_type non-JSON paths, identifier_paths pass/fail/all-must-echo, unresolved-since-failure, truncation, isJsonContentType helper). - 8 new runner-integration tests (storyboard-runner-output-contract-v2-runner.test.js) drive runStoryboardStep / runStoryboard end-to-end with stub clients + controller handlers. Covers prefetchUpstreamTraffic firing the controller, opt-in / opt-out / façade-zero-calls grading, clock-skew tolerance applied, capture-failure → consumer-skip cascade, and validations_not_applicable counter aggregation. - 43 tests across two files, all green. Type-check / lint / build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: prettier format * style: prettier format on merged-in hello_seller example files Files came from PR #1274 (`examples: hello_seller_adapter_signal_marketplace`) which landed on main without prettier formatting. Format-check on this branch fails until they're fixed; including the trivial reformat here so #1289 can land without blocking on a separate hygiene PR. --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…traffic (#1290) (#1304) * feat(upstream-recorder): producer-side reference middleware (closes #1290) New public sub-export `@adcp/sdk/upstream-recorder` — sandbox-only-by-default helper adopters drop into their HTTP layer to populate the `query_upstream_traffic` controller buffer that runner-output-contract v2.0.0 `upstream_traffic` storyboard checks (PR #1289, spec adcontextprotocol/adcp#3816) read from. Closes #1290 (adcp#3830 item 4). Pre-existing runner-side `redactSecrets` + `SECRET_KEY_PATTERN` lifted into shared `src/lib/utils/redact-secrets.ts` so producer + consumer share the same redaction floor. Surface: - createUpstreamRecorder({ enabled, redactPattern, bufferSize, ttlMs, purpose }) - recorder.runWithPrincipal(principal, fn) — AsyncLocalStorage scope for per-principal isolation (security HIGH from spec-side review) - recorder.wrapFetch(fetch) — wraps fetch; pass-through outside runWithPrincipal - recorder.record(input, principal?) — manual escape hatch for custom transports - recorder.query({ principal, sinceTimestamp, endpointPattern, limit }) → { items, total, truncated, since_timestamp } — maps directly onto UpstreamTrafficSuccess controller-response shape - recorder.clear() — test cleanup Behaviors pinned by 19 tests: - enabled:false short-circuits to no-op (zero-overhead production) - Cross-principal isolation (Alice's principal MUST NOT see Bob's recordings) - Record-time JSON + header redaction, custom-pattern extension - Ring buffer FIFO eviction at bufferSize - TTL eviction at next record() / query() - endpointPattern glob (* matches /, all other regex metas escaped literally) - sinceTimestamp lower bound + limit / truncated flag - wrapFetch records method/url/host/path/content_type/payload/status_code end-to-end; status_code absent on fetch error (still records the attempt) - purpose classifier exception swallowed — recorder MUST NOT crash build-seller-agent SKILL.md gains an "Opting into upstream_traffic" section showing the 4-step adopter wire-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(upstream-recorder): apply multi-expert review findings Code review + security review + DX review on PR #1304 surfaced 22 findings across critical / should-fix / nit. Convergent must-fixes addressed; DX HIGH findings (silent-drop footgun, principal-value ambiguity, non-fetch escape hatch) addressed. Surface additions: - strict?: boolean — throws UpstreamRecorderScopeError on record() / wrapFetch outside runWithPrincipal scope (DX H1: silent-drop footgun). - debug() — { enabled, bufferSize, bufferedEntries, principals, lastRecordedAt, activePrincipal, strict } for adopters debugging "zero calls but I see them in logs". - onError?: callback — fires on classifier_threw / url_parse_failed / payload_build_failed / unscoped_record. Throws inside onError swallowed. - maxPayloadBytes?: number — per-entry byte cap (default 64 KiB, mirrors spec). Bounds memory under accidental enabled-in-prod ship. - UpstreamRecorderScopeError exported. Hardening: - query({ principal }) throws on empty / non-string principal (was silently empty — concealed misconfiguration). - runWithPrincipal() rejects on empty / non-string principal. - bufferSize / ttlMs / maxPayloadBytes clamped: out-of-range (zero / negative / Infinity / NaN) reverts to default (was silently saturating to 1, dropping nearly everything). - buildRecordedCall wrapped in try/catch — hostile payload getters can't break the adapter call site. - Buffer / Blob / ArrayBuffer / TypedArray bodies replaced with '[binary <n> bytes]' marker (was serializing as numeric-key map after redactSecrets walked them — fidelity bug + size bloat). - Form-urlencoded bodies parsed + key-redacted + re-stringified — secrets in `access_token=xxx` form posts no longer slip through unredacted. - Production-enable warning: console.warn one-time when enabled: true and NODE_ENV=production. ADCP_RECORDER_PRODUCTION_ACK=1 is the explicit acknowledgment escape hatch. Shared infrastructure: - src/lib/utils/glob.ts extracted — single globToRegExp implementation shared by recorder + runner so producer-side filter and runner-side filter can't drift. Includes ReDoS guard: consecutive *+ coalesce to one * before the *→.* substitution, defeating catastrophic backtracking on '**********' patterns. Docs: - runWithPrincipal docstring pins principal value choice: OAuth client_credentials → client_id; session-token → account.id; anonymous → stable per-session ID. Mismatch between record-time and query-time returns zero — that's the spec's security floor, not a bug. - SKILL.md (build-seller-agent) gains an "Adopters not on fetch" section with axios interceptor + got hooks examples. - SKILL.md adds a "Debugging zero-calls" section pointing at strict mode + debug(). 22 new tests (41 total) cover every addition. All pass; type-check / lint / format clean. Same 1 pre-existing storyboard-rejection-hints-e2e failure on broader run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(upstream-recorder): drop unused variable in header-redaction test github-code-quality flagged the dead `const r = createUpstreamRecorder(...)` that the actual test never used (the test exercised `r2`). Collapsed to a single `r` instance with the `purpose` capture pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(upstream-recorder): wire end-to-end with example + e2e tests + SKILL refresh Addresses three-expert review on PR #1304 in one cohesive update so the recorder ships connected to the rest of the picture rather than as an isolated module. Helper export — toQueryUpstreamTrafficResponse(queryResult): Projects recorder.query()'s adopter-ergonomic shape (items, total) onto the spec wire shape (recorded_calls, total_count). Eliminates the field-rename footgun the testing reviewer flagged as the highest-leverage add. Adopter handlers return it verbatim. Server-side — registerTestController extension for query_upstream_traffic: TestControllerStore.queryUpstreamTraffic? as an optional method. When present, auto-advertised via list_scenarios under the open-extension TOOL_INPUT_SHAPE.scenario: z.string() contract. compliance_testing.scenarios on get_adcp_capabilities keeps the canonical typed enum so the cached 3.0.4 generated Zod validator doesn't reject query_upstream_traffic — extension scenarios live in list_scenarios only until 3.0.5 ships the schema. Example adapter wired end-to-end: examples/hello_seller_adapter_signal_marketplace.ts now creates a recorder, wraps fetch via UpstreamClient.httpJson, scopes both getSignals and activateSignal handlers in runWithPrincipal, and registers comply_test_controller with queryUpstreamTraffic backed by recorder.query() + toQueryUpstreamTrafficResponse. RECORDER_PRINCIPAL constant pins record-time and query-time agreement (this example is single-tenant API-key auth; SKILL covers multi-tenant via factory). Pre-existing CI gates on this example file (test/examples/...) now exercise the recorder integration: strict tsc, signal_marketplace storyboard with zero failed steps, /_debug/traffic façade-gate. All 3 gates pass. E2E round-trip test (test/lib/upstream-recorder-e2e.test.js): 4 cases pin the recorder → controller → runner contract via runStoryboardStep against a test-fixture storyboard (test/fixtures/storyboards/upstream-traffic-fixture.yaml — fixture rather than compliance/cache because the compliance cache is synced from spec releases and 3.0.4 predates spec PR adcp#3816). Cases: happy path, façade (zero calls), principal mismatch, identifier_paths missing vector. Catches wire-shape drift between recorder output and runner input — the gap the testing reviewer flagged. Spec-shape Ajv test (test/lib/upstream-recorder-spec-shape.test.js): 5 cases validate representative RecordedCall + UpstreamTrafficSuccess envelope shapes against an inline copy of the spec PR's schema (TODO to switch to cached schema once 3.0.5+ lands). Pins the field-rename contract on toQueryUpstreamTrafficResponse explicitly. SKILL restructure: - Step 4 framed as an entry in the existing comply_test_controller scenarios map (registerTestController), not a free-standing function. Highest-friction adopter mistake the DX reviewer flagged. - Precondition link back to the controller-registration section. - resolvePrincipal() helper with OAuth client_id / session account.id / anonymous-stable-id branches replaces the bare ctx.account.id default — adopter coding-agents have to confront the choice rather than copy a default. - "Two traffic surfaces" ASCII diagram explaining adopter-side query_upstream_traffic vs mock-side /_debug/traffic. Mock-server specialisms shouldn't wire the recorder internally — they ARE the upstream. - Honest threat-model paragraph: recorder is the surfacing primitive, storyboards' identifier_paths matching is where the contract is enforced. Synthetic-record/replay attacks are possible by a thoughtful façade — opt in honestly. - "What the runner asserts" paragraph documenting min_count, endpoint_pattern, payload_must_contain, identifier_paths semantics so adopters know what their recorded calls will be matched against. - "Adopters not on fetch" reframed as "replace step 2" (not "add alongside") with axios + got patterns. - "Debugging zero calls" surface — debug() / strict / onError, in that order, surfacing the principal-mismatch failure mode loudly. Refactor under the hood: - globToRegExp lifted to src/lib/utils/glob.ts and shared by recorder + runner so producer-side endpoint filtering and runner-side endpoint_pattern matching can't drift. 53 tests across four files pass. Type-check / lint / format clean. Same 1 pre-existing storyboard-rejection-hints-e2e failure on broader suite (NODE_ENV in-memory-state-store gate, present on main). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(upstream-recorder): use clearly-fake fixture credentials GitGuardian's secret scanner flagged the SK_LIVE_xxx (Stripe live key prefix) and short Bearer tokens as real credentials. Replace with clearly-fake `fake_test_fixture_not_a_real_*` strings — exercises the redactor identically while keeping CI green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * style: prettier format --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… 5 storyboards (#3962) * feat(compliance): bump @adcp/sdk to 6.7.0 + adopt upstream_traffic on 5 storyboards @adcp/sdk@6.7.0 ships runner-side support for the v2.0.0 anti-façade contract from #3816 (upstream_traffic check + capture_path_not_resolvable synthesized code + forward-compat default for unknown check kinds). Bumping unblocks storyboard adoption that was deferred from #3816 because the published runner errored hard on unrecognized check types. SDK bump (^6.0.0 → ^6.7.0): - TrackStatus extended with 'silent' to match SDK enum. - BrandRightsPlatform impl wires updateRights to existing handleUpdateRights and stubs reviewCreativeApproval with NOT_IMPLEMENTED (training agent doesn't expose a creative-approval webhook receiver). Storyboard adoption of upstream_traffic on 5 specialisms using only the v2.0.0 fields 6.7.0 supports (min_count, endpoint_pattern, payload_must_contain, identifier_paths, since): - sales-social: sync_audiences (with realistic add[] hashed identifiers + payload_must_contain for upstream POST shape) and log_event (with user_match echoing the audience member, exercising identifier echo across two related steps). - audience-sync: create_audience with hashed-identifier echo verification. - signal-marketplace: activate_on_platform with since: search_by_spec window scoping the assertion to traffic caused after signal IDs were captured. - sales-non-guaranteed: create_media_buy with platform-agnostic POST count assertion. - creative-ad-server: build_creative with platform-agnostic POST count assertion. Adopters who don't advertise query_upstream_traffic in list_scenarios grade the new validations not_applicable per the runner's forward-compat rule — opt-in by adopter capability. The training agent does not yet implement the controller scenario, so all 5 storyboards run clean against it. Out of scope (deferred until @adcp/sdk ships the rest of the contract surface): severity:advisory + expires_after_version (#3837/#3852) and attestation_mode:digest (#3838). 6.7.0 ships the original v2.0.0 contract; subsequent fix-ups land in a later runner release. Build green; typecheck clean; 864/864 unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(training-agent): drop update_rights customTools entry (collides with framework registration in 6.7.0); lower storyboard floors Misdiagnosed the 6.7.0 bump failure as an SDK discovery regression (filed as adcp-client#1438, since closed). Real cause: @adcp/sdk@6.7.0 promoted update_rights to a framework-registered first-class tool (adcp-client#1349 / commit 522015d9), and the training agent's brand tenant still registers it via customTools — createAdcpServer's collision check throws lazily inside the request handler, returning 500 HTML on every MCP POST, which the SDK's MCP discovery probe correctly classifies as "no MCP response." Fix: drop the update_rights customTool registration. The BrandRightsPlatform.updateRights method on TrainingBrandPlatform (added when the SDK bumped) wires through to the same handleUpdateRights handler. creative_approval still rides customTools — not yet promoted to AdcpToolMap. Floors lowered in .github/workflows/training-agent-storyboards.yml + scripts/run-storyboards-matrix.sh because the bump exposed unrelated baseline regressions in deterministic_testing context echo, signed_requests /mcp-strict discovery, idempotency_key capture, error-code expectations, and seed-fixture acknowledgement. Tracked in #3965; floors will tighten back as each is closed. Defensive follow-up filed in adcp-client#1447 — moving the customTools collision check from per-request to constructor time would have surfaced this fix-up at startup rather than as 500 HTML on every MCP call. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci: trigger fresh run after concurrency-cancellation --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… widening, idempotency dead capture removal, raise floors (#3974) First catch-up against #3965 (regressions exposed by the @adcp/sdk@6.7.0 bump in #3962). Two storyboard fixes + per-tenant floor raise. Class D — idempotency_key capture not resolvable: - The idempotency storyboard captured idempotency_key from create_media_buy response into idempotency_key_a — never referenced downstream, and the spec doesn't require the response to echo the request's idempotency_key (response envelope has `replayed: boolean`, not key echo). - With #3816's new capture_path_not_resolvable synthesized check catching the absence, the dead capture started failing. - Fix: drop the dead capture. initial_media_buy_id capture (which IS used downstream for replay verification) stays. - Result: idempotency storyboard 2P/1F/5S → 8P/0F/0S (clean). Class B — UNKNOWN_SCENARIO error coarsening: - deterministic_testing's missing_params and not_found_entity steps send force_creative_status (creative-only scenario) and expect INVALID_PARAMS / NOT_FOUND. On tenants that don't register force_creative_status (e.g. /sales — only has force_media_buy_status), controller correctly returns UNKNOWN_SCENARIO. - Fix: widen allowed_values to accept the scenario-specific code OR UNKNOWN_SCENARIO. Both signal "controller refused gracefully" — the load-bearing test intent. - Remaining failures on these steps are Class A (context echo missing — SDK gap tracked in adcp-client#1455). Class G — turned out to be a stale-cache phantom: - Source already had the REFERENCE_NOT_FOUND fix; local SDK cache had the old brand_not_found assertion. CI runs overlay-compliance-cache.sh to reconcile; my local was missing it. No source change needed. Per-tenant floors raised to current observed levels: - signals 59/23 → 65/23 (+6 clean) - sales 55/159 → 62/212 (+7 clean, +53 passed) - governance 57/62 → 63/66 - creative 51/44 → 56/69 - creative-builder 49/37 → 52/51 - brand 58/13 → 65/14 Most tenants now ABOVE pre-bump floors. Creative and creative-builder slightly below — context echo (Class A) and signed_requests (Class C) remain. Out of scope: Class A (SDK), Class C (predates bump), Class E (needs reproducer), Class F (training-agent seed handler — separate PR). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #3785, closes #3796. Tracking issue: #3813.
Replaces the two partial-ship spec PRs (#3795, #3798) with a single coherent contract change.
Why this PR exists, not the original two
The original plan was: ship #3795 (sales_social payload variety, items 1 & 2 from #3785), ship #3798 (capture/substitution output shapes from #3796), defer #3785 item 3, file a runner-side issue for the cascade fix, and audit the other 18 storyboards in a follow-up.
That plan was theatrical:
uploaded_countandaction). Item 3 (cross-step upstream side-effect assertion) is the actual contract.signals[0]/signals[1]field_presentguards fix(compliance): define capture_path_not_resolvable / unresolved_substitution output shapes #3798 added are workarounds, not the fix.This PR collapses all of it into one ship.
What's in this PR
Spec contracts
runner-output-contract.yamlv1.1.0 → v2.0.0validation_result.checkcodes:capture_path_not_resolvable(synthesized) — the cascade-attribution fix from Storyboard runner: stateful step cascade silently skips downstream steps when prior response is shape-valid but missing extractor-required fields #3796. Emitted on the capturing step whencontext_outputsresolves to absent /null/"".unresolved_substitution(synthesized) — pre-wire failure on a consumer step.request/response/json_pointercarve-outs.upstream_traffic(authored) — the load-bearing anti-façade contract from Storyboards are façade-permissive: payload variety doesn't force adapters to confront upstream contracts #3785 item 3.expected/actual/json_pointersemantics per code.run_summaryattribution notes — capture failures contribute tosteps_failed, downstream consumer skips tosteps_skippedwithprerequisite_failed.upstream_trafficagainst an adopter that doesn't advertisequery_upstream_trafficgradesnot_applicable, not failed.storyboard-schema.yamlupstream_trafficcheck kind documented with full semantics:min_count,endpoint_pattern,payload_must_contain(path + match),buyer_identifier_echo: trueshorthand (the strongest single anti-façade signal — placeholder hashes don't match the storyboard's supplied vectors),since: prior_step_idfor cumulative-effect assertions.context_outputsrunner-behavior strengthened to cover absent /null/""cases.capture_path_not_resolvableandunresolved_substitutiondescriptions expanded with output-shape cross-references.comply-test-controller-request.json/comply-test-controller-response.jsonquery_upstream_trafficscenario on the existingcomply_test_controller. Optionalsince_timestamp,endpoint_pattern,limitparams.UpstreamTrafficSuccessresponse branch withrecorded_calls[](method/endpoint/url/host/path/payload/timestamp/status_code) plustotal_count/truncated/since_timestamp./_debug/trafficURL — same auth, same sandbox-only gating, samelist_scenariosdiscovery, sameINVALID_PARAMS/UNKNOWN_SCENARIOerror envelope.Storyboard adoption (5 exemplars)
sales-social: realisticadd[]with hashed identifiers onsync_audiences;user_matchonlog_eventwith the matching hash;value/currencymoved intocustom_data(schema-routing fix —additionalProperties: truewas silently swallowing them at the wrong nesting level);upstream_trafficassertions on both steps withbuyer_identifier_echo.audience-sync:upstream_trafficoncreate_audiencewithbuyer_identifier_echoandpayload_must_containforhashed_email.signal-marketplace:upstream_trafficonactivate_on_platformwithsince: search_by_specwindow andpayload_must_containforsegment_id(the cascade case from Storyboard runner: stateful step cascade silently skips downstream steps when prior response is shape-valid but missing extractor-required fields #3796 — once runners emitcapture_path_not_resolvableper the new contract, thesignals[0]/signals[1]guards fix(compliance): define capture_path_not_resolvable / unresolved_substitution output shapes #3798 added become unnecessary; this PR doesn't add them).sales-non-guaranteed:upstream_trafficoncreate_media_buy(platform-agnostic POST count assertion).creative-ad-server:upstream_trafficonbuild_creative(platform-agnostic POST count assertion).Mechanical rollout to the remaining 7 applicable specialisms (sales-guaranteed, sales-broadcast-tv, sales-catalog-driven, sales-proposal-mode, signal-owned, creative-template, creative-generative) follows in a separate PR. Not theatrical: the contract is fully defined, runner has a complete target, and these 7 will copy the pattern from the 5 exemplars verbatim.
Sibling-repo work needed (adcp-client)
The runner side of #3796 — emitting
capture_path_not_resolvableandunresolved_substitutionper the new contract — and the runner side of #3785 item 3 — implementingupstream_trafficby queryingcomply_test_controller's newquery_upstream_trafficscenario — both land in adcp-client as one runner PR.Until that PR ships:
upstream_trafficwill gradenot_applicableagainst runners that don't yet implement the check (acceptable forward-compat).Non-breaking justification
validation_result.checkenum values: additive.query_upstream_trafficscenario: additive (scenariosenum is open-for-extension per response schema description).runner-output-contract.yamlreflects the new normative shapes runners MUST emit; runners conforming to v1.1.0 do not regress, they gradenot_applicablefor the new codes.Verification
node scripts/build-compliance.cjspasses all 9 lint stages and builds 22 universal / 6 protocols / 19 specialisms.npx vitest run tests/unit/comply-test-controller.test.tspasses (42/42).🤖 Generated with Claude Code