adcontextprotocol · bokelley · May 11, 2026
diff --git a/.changeset/idempotency-rule-9-and-10.md b/.changeset/idempotency-rule-9-and-10.md
@@ -0,0 +1,19 @@
+---
+"adcontextprotocol": minor
+---
+
+spec(idempotency): add normative rules for concurrent retries and downstream reconciliation; introduce `IDEMPOTENCY_IN_FLIGHT`
+
+Two new normative rules in `L1/security.mdx#idempotency`:
+
+**Rule 9 — Concurrent retries / first-insert-wins.** A second request carrying the same `(authenticated_agent, account_id, idempotency_key)` MAY arrive while the first is still executing. Sellers MUST resolve the race deterministically (`INSERT … ON CONFLICT DO NOTHING` on the scope tuple) and MAY pick one of two policies, behaving consistently: **wait-and-replay** (block the second request until the first completes, return cached response with `replayed: true`), or **reject-and-redirect** (return new `IDEMPOTENCY_IN_FLIGHT` code with `error.details.retry_after`). Same key with a *different* canonical payload during the in-flight window still returns `IDEMPOTENCY_CONFLICT` (rule 5). Verified against the canonical Python sales-agent (Wonderstruck) — its wait-and-replay implementation passes the new rule out of the box.
+
+**Rule 10 — Crossing service boundaries / downstream reconciliation.** When a seller invokes a downstream system (SSP, ad server, payment provider) during request handling, "errors don't cache" (rule 3) is necessary but not sufficient — a crash between downstream-accepts and local-persist leaves the seller in a "downstream unknown" state. Sellers MUST adopt one of two patterns for every downstream call whose duplicate-invocation has business consequences: **write-claim-before-invoke** (persist a claim row with `downstream_request_id` before invoking; reconcile on retry by querying the downstream by that id) or **thread-buyer-key** (pass the buyer's `idempotency_key` or a deterministic seller-side derivative as the downstream's own idempotency key). The pattern "best-effort dedup on downstream response inspection" is explicitly forbidden.
+
+**New error code: `IDEMPOTENCY_IN_FLIGHT`** (held for 3.1 per the wire-stability policy). Recovery: transient. Buyers MUST retry with the **same** `idempotency_key` after `error.details.retry_after` — minting a fresh key on this code turns a safe retry into a double-execution race.
+
+**Transitional note on `SERVICE_UNAVAILABLE + retry_after`.** Both reference implementations today (the Python sales-agent at `wonderstruck.sales-agent.scope3.com` and the `@adcp/sdk` middleware) implement wait-and-replay, the rule 9 alternative to reject-and-redirect, and never need to emit `IDEMPOTENCY_IN_FLIGHT`. SDKs that previously emitted `SERVICE_UNAVAILABLE + retry_after: 1` on the in-flight branch are NOT out of compliance with rule 9 as long as they adopt wait-and-replay end-to-end; `IDEMPOTENCY_IN_FLIGHT` is only required when a seller picks reject-and-redirect. The `@adcp/sdk` middleware swap from `SERVICE_UNAVAILABLE` to `IDEMPOTENCY_IN_FLIGHT` is tracked separately (adcp-client follow-up); it's a wire-code tightening, not a behavioral change.
+
+**Storyboard coverage.** `static/compliance/source/universal/idempotency.yaml` gains a `concurrent_retry` phase using two new cross-response check kinds (`cross_response_count_distinct`, `cross_response_field_equal`) that operate on the resolved response set across N parallel dispatches. The runner contract is documented in the new `test-kits/parallel-dispatch-runner.yaml`; runners without parallel-dispatch support skip the phase with a stable `not_applicable` marker. The SDK/runner implementation is tracked separately (adcp-client follow-up).
+
+Author skill (`skills/call-adcp-agent/SKILL.md`) and the buyer-facing `docs/protocol/calling-an-agent.mdx` are updated so buyers know to wait and retry with the same key on `IDEMPOTENCY_IN_FLIGHT` rather than mint a fresh key.
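
The first-insert-wins resolution summarized under Rule 9 can be sketched with SQLite's `INSERT … ON CONFLICT DO NOTHING`. This is a minimal illustration, not the reference implementation: the table layout, the function names (`claim`, `complete`, `release`), and the returned state labels are all hypothetical, and the hashing step only approximates RFC 8785 canonicalization.

```python
import hashlib
import json
import sqlite3

# Hypothetical layout; the rule only requires a unique constraint on the scope tuple.
SCHEMA = """
CREATE TABLE idempotency (
    agent TEXT NOT NULL,
    account_id TEXT NOT NULL,
    idempotency_key TEXT NOT NULL,
    payload_hash TEXT NOT NULL,  -- real hash stored at INSERT time, not a sentinel
    response TEXT,               -- NULL while the first request is in flight
    PRIMARY KEY (agent, account_id, idempotency_key)
)
"""
WHERE = "WHERE agent = ? AND account_id = ? AND idempotency_key = ?"


def payload_hash(payload: dict) -> str:
    # Approximation only: sorted keys and compact separators.
    # A conforming seller would use a real RFC 8785 (JCS) canonicalizer.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def claim(db, agent, account_id, key, payload):
    """Return (state, cached_response); state is owner, in_flight, conflict or replay."""
    digest = payload_hash(payload)
    cur = db.execute(
        "INSERT INTO idempotency (agent, account_id, idempotency_key, payload_hash) "
        "VALUES (?, ?, ?, ?) ON CONFLICT DO NOTHING",
        (agent, account_id, key, digest),
    )
    db.commit()
    if cur.rowcount == 1:
        return "owner", None  # first row to land owns execution
    stored_hash, response = db.execute(
        "SELECT payload_hash, response FROM idempotency " + WHERE,
        (agent, account_id, key),
    ).fetchone()
    if stored_hash != digest:
        return "conflict", None  # maps to IDEMPOTENCY_CONFLICT (rule 5)
    if response is None:
        return "in_flight", None  # wait-and-replay or IDEMPOTENCY_IN_FLIGHT
    return "replay", response


def complete(db, agent, account_id, key, response):
    # Success path: populate the response slot so later retries replay it.
    db.execute("UPDATE idempotency SET response = ? " + WHERE, (response, agent, account_id, key))
    db.commit()


def release(db, agent, account_id, key):
    # Failure or handler timeout: the key returns to the "never seen" state (rule 3).
    db.execute("DELETE FROM idempotency " + WHERE + " AND response IS NULL", (agent, account_id, key))
    db.commit()
```

In this sketch the `in_flight` state is where a seller applies its chosen policy: block until the owner calls `complete` and then replay, or return `IDEMPOTENCY_IN_FLIGHT` with `error.details.retry_after`. The uniform response shape and timing that rule 9 requires against probing is outside what this sketch covers.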
diff --git a/docs/building/by-layer/L1/security.mdx b/docs/building/by-layer/L1/security.mdx
@@ -299,6 +299,36 @@ This section applies only to AdCP task requests. OpenRTB bid streams have their
 
     The ceiling is per `(authenticated_agent, account)` — the same scope as the idempotency key itself (bullet 1) — so a multi-account agency does not have its per-account budgets collapsed into a single shared quota. `RATE_LIMITED` rejections MUST populate `retry_after` (seconds) per the [error handling taxonomy](/docs/building/by-layer/L3/error-handling#rate-limit-handling) and MUST NOT be cached as idempotency responses (rule 3: only successful responses are cached). Sellers SHOULD enforce `retry_after` as a cheap rejection floor — a buyer retrying before `retry_after` elapses SHOULD hit a pre-auth token bucket (e.g., at a reverse-proxy layer) rather than re-entering the full schema-validate-and-cache-check pipeline on every retry. Without this discipline, misbehaving buyers can amplify load on the rate-limiter itself.
 
+9. **Concurrent retries — first-insert-wins.** A second request carrying the same `(authenticated_agent, account_id, idempotency_key)` MAY arrive while the first request is still executing — most commonly when the buyer's transport timeout fires before the seller's downstream call returns, and the buyer retries. Sellers MUST resolve the race deterministically; they MUST NOT execute the side effect twice and MUST NOT silently drop the second request. Resolution is a `(unique constraint, INSERT … ON CONFLICT DO NOTHING)` pattern on the scope tuple: the first row to land owns execution and stores the canonical payload hash on the in-flight row (NOT a sentinel); subsequent requests observe an existing row whose response slot is not yet populated but whose payload hash IS populated.
+
+    Sellers MUST handle the second request by one of two policies and MUST behave consistently across calls — clients infer the policy from the first response within a session and apply it to subsequent retries:
+
+    - **Wait-and-replay** (preferred for fast operations, &lt;5s typical): the seller blocks the second request until the first completes, then returns the cached response with `replayed: true`. Total wall-time for the second call is bounded by the seller's request-timeout budget.
+    - **Reject-and-redirect** (preferred for slow operations involving long-running downstream calls): the seller returns `IDEMPOTENCY_IN_FLIGHT` immediately, with `error.details.retry_after` (seconds, integer) populated based on the first request's elapsed time and expected completion. Buyers MUST retry with the same `idempotency_key` after the hint elapses — a buyer that mints a fresh key on `IDEMPOTENCY_IN_FLIGHT` turns a safe retry into the exact double-execution race this rule prevents.
+
+    A second request with the same key AND a *different* canonical payload during the in-flight window MUST return `IDEMPOTENCY_CONFLICT` (rule 5), not `IDEMPOTENCY_IN_FLIGHT` — the canonical-form mismatch is computable at INSERT time against the row's stored hash, so the conflict is detectable without waiting for the first request's response. Sellers whose backing store cannot persist the real canonical hash until the handler completes (e.g., a placeholder-sentinel pattern) MUST upgrade the store to persist the hash at INSERT time before declaring rule 9 conformance — the alternative (returning `IDEMPOTENCY_IN_FLIGHT` on a same-key-different-payload race and only surfacing the conflict after the first request completes) silently delays detection of a real client bug.
+
+    Per rule 3, if the first request ultimately fails (validation error, downstream timeout, internal error), the `(in_flight)` row is released — the key returns to "never seen" state and a subsequent retry re-executes from scratch. Sellers MUST bound the lifetime of an in-flight row to their declared per-task handler timeout, and MUST release the row (treat as failed per rule 3) when that timeout fires — even if the downstream has not yet responded. Without this bound, a hung handler indefinitely returns `IDEMPOTENCY_IN_FLIGHT` for the same key, locking the buyer out of any safe retry path.
+
+    Sellers using reject-and-redirect MUST set `error.details.retry_after` to a value no greater than `replay_ttl_seconds` (declared in `capabilities.idempotency`). A buyer instructed to wait past the seller's own replay window is being told to wait until the response can no longer be replayed — the wait is vacuous and the buyer either ends up minting a fresh key (the failure mode this rule prevents) or hits `IDEMPOTENCY_EXPIRED` on retry. Recommended bound: an order of magnitude below the replay TTL, derived from the seller's typical handler latency rather than the TTL ceiling.
+
+    Sellers MUST NOT leak the in-flight state across the scope boundary: an attacker probing a candidate key MUST receive the same response shape and timing whether the row exists, is in flight, or has never existed.
+
+10. **Crossing service boundaries — downstream reconciliation.** Sellers commonly invoke downstream systems during request handling — SSP/ad-server calls on `create_media_buy`, payment-provider calls on billing operations, governance-agent calls on `check_governance`. These calls have their own failure modes that can leave the seller in a "downstream unknown" state: the network connection dropped after the downstream accepted the request but before its response arrived; the seller process crashed mid-call; a region failover swapped the worker before the response was persisted. Rule 3 (only successful responses cached) is necessary but not sufficient: a seller that simply doesn't cache and re-executes on retry will double-invoke the downstream and create duplicate side effects there.
+
+    **Conformance grading.** This rule is reviewer-graded, not programmatically graded by the compliance storyboard suite. Black-box observation cannot distinguish "the seller has a claim row" from "the seller got lucky on the test run." The `parallel_dispatch_runner` test-kit lists rule-10 conformance under `reviewer_checks` — sellers attesting to rule-10 conformance MUST surface their operational runbook describing which pattern applies to which downstream, and reviewers verify the implementation against that runbook. The other normative rules (1–9) are programmatically graded.
+
+    Sellers MUST adopt one of two reconciliation patterns for every downstream call whose duplicate-invocation has business consequences (resource creation, payment movement, irreversible state change). Read-only downstream calls (cache lookups, eligibility checks that don't write) are exempt — but borderline cases like fraud-scoring lookups that also write to a downstream audit log count as writes for this rule (the audit log entry is the side effect).
+
+    - **Write-claim-before-invoke (preferred default).** Before invoking the downstream, the seller persists a "claim" row in the same transaction as the idempotency cache row — typically `{idempotency_key, downstream_provider, downstream_request_id, status: 'invoked', invoked_at}` — using the seller-generated `downstream_request_id` it will pass to the downstream as the downstream's own correlation/idempotency identifier. On retry, before invoking the downstream again, the seller MUST look up the claim row by `(idempotency_key, downstream_provider)` and reconcile: query the downstream by `downstream_request_id` to determine the true outcome, then resume cache population from there. The seller MUST NOT treat a missing local record as "downstream call did not happen" — a crash between downstream-accepts and local-persist is exactly the case where it did happen and the local record is missing. If the downstream reports no record of `downstream_request_id` (the claim row was persisted but the seller crashed before invoking), the seller MUST treat the call as not-yet-invoked and proceed with the invocation; the claim row already reserves the `downstream_request_id`, so the downstream's own idempotency will dedup any subsequent retry. On an ambiguous response from the downstream lookup (transient 5xx, network error, malformed response), the seller MUST fail closed — return a transient error to the buyer (so the buyer retries against the same `idempotency_key` per rule 9) rather than proceed with invocation on an unauthenticated "no record" signal.
+    - **Thread-buyer-key (acceptable when the downstream protocol supports it).** The seller passes a per-downstream-provider derivative of the buyer's `idempotency_key` as the downstream's own idempotency key — typically `HMAC(K_provider, idempotency_key)` where `K_provider` is derived from the seller's KMS-managed root keyed by provider identity (one key per downstream, not one shared seller secret across all downstreams). Per-provider derivation prevents cross-provider replay if any single downstream is compromised; a shared seller secret across all downstreams collapses every provider into a single key-exposure blast radius. The downstream's at-most-once guarantee then covers the case the seller's local persistence missed. The seller MUST still write a claim row on the success path so the cached response can be populated correctly, but the downstream itself becomes the source of truth on retry. The seller MUST NOT pass the buyer's raw `idempotency_key` to any downstream operated by a different trust principal — the buyer's key is a capability token within its TTL (see "Keys are security-sensitive" below) and forwarding it across a trust boundary widens the capability surface. "Different trust principal" means any system the seller does not operate under the same security boundary; passing the raw key to a purely intra-tenant microservice the seller owns end-to-end (same KMS, same audit log, same operator) does NOT cross a trust boundary and is permitted, though per-provider derivation is still the better default.
+
+    Sellers MUST document which pattern applies to which downstream in their operational runbook. Sellers MUST NOT use a third pattern of "best-effort dedup on downstream response inspection" — comparing the downstream's response payload to a cached fingerprint to decide whether the call already happened — because the downstream's response shape changes across versions and the fingerprint is a synchronization bug waiting to happen. A claim row OR a threaded key. Not pattern-match-on-response.
+
+    Sellers MUST NOT include the buyer's `idempotency_key` (or any reversible derivative thereof) in error envelopes returned to the buyer when those errors originated from the downstream. Downstream errors that mention the seller's per-downstream-provider key (or the buyer's key, if the seller incorrectly threaded it raw) MUST be re-keyed or stripped before propagating to the buyer — otherwise a downstream error message becomes a cross-trust-boundary key-disclosure surface.
+
+    The buyer-visible consequence of this rule: when a seller invokes a slow downstream and the buyer retries during the window, the seller's response on the second request is determined by the seller's policy under rule 9 (`IDEMPOTENCY_IN_FLIGHT` or wait-and-replay), not by the downstream's behavior. Buyers do not need to know which downstream is in the path — the seller MUST present a uniform retry surface regardless.
+
 #### Payload equivalence
 
 "Equivalent" means **identical canonical JSON form**, not field-by-field semantic comparison. Sellers MUST determine equivalence by hashing the canonical form and comparing hashes. The canonical form is [RFC 8785 JSON Canonicalization Scheme (JCS)](https://www.rfc-editor.org/rfc/rfc8785) — number serialization, key ordering, and escaping all follow JCS §3 normatively.

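The two rule 10 reconciliation patterns in the `security.mdx` hunk above can be sketched as follows. Treat this as an illustration under stated assumptions rather than a prescribed design: `derive_downstream_key`, `next_action`, the `Lookup` enum, and the action labels are hypothetical names, and the per-provider key is passed in as raw bytes where a real seller would derive it from a KMS-managed root.

```python
import hashlib
import hmac
from enum import Enum


class Lookup(Enum):
    FOUND = "found"          # downstream has a record for downstream_request_id
    NO_RECORD = "no_record"  # downstream answered authoritatively: never seen
    AMBIGUOUS = "ambiguous"  # transient 5xx, network error, malformed response


def derive_downstream_key(provider_key: bytes, idempotency_key: str) -> str:
    # Thread-buyer-key: HMAC(K_provider, idempotency_key), one key per provider,
    # so the buyer's raw key never crosses the trust boundary.
    return hmac.new(provider_key, idempotency_key.encode("utf-8"), hashlib.sha256).hexdigest()


def next_action(claim_row, lookup):
    """Decide what a retry does under write-claim-before-invoke.

    claim_row: the persisted claim (a dict with downstream_request_id), or None.
    lookup: callable taking downstream_request_id and returning a Lookup value.
    """
    if claim_row is None:
        # The claim is written before any invocation, so no claim means no call yet.
        return "persist_claim_then_invoke"
    result = lookup(claim_row["downstream_request_id"])
    if result is Lookup.FOUND:
        return "resume_from_downstream_outcome"
    if result is Lookup.NO_RECORD:
        # The claim row already reserves the id, so the downstream can dedup.
        return "invoke_with_reserved_request_id"
    # Ambiguous lookup: fail closed so the buyer retries with the same key.
    return "fail_closed_transient_error"
```

The decision function deliberately has no branch that inspects a downstream response payload for similarity, since the rule forbids best-effort dedup on response inspection.
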
diff --git a/docs/protocol/calling-an-agent.mdx b/docs/protocol/calling-an-agent.mdx
@@ -26,7 +26,8 @@ Every mutating tool requires an `idempotency_key` (UUID).
 
 - **Same key on retry** → server replays the **same response**, byte-for-byte. Use this for transport-level retries (timeout, 5xx, dropped connection).
 - **Fresh key** → **new operation**, regardless of body. Generating a new UUID because the previous attempt failed is the most common way naïve callers create duplicate media buys.
-- **Same key, different body** → server-defined; most agents return the original cached response and ignore the body change. Don't rely on it.
+- **Same key, different canonical body** → `IDEMPOTENCY_CONFLICT`. Sellers MUST reject (rule 5 in [security.mdx#idempotency](/docs/building/by-layer/L1/security#idempotency)) — do not silently apply the second body, do not silently replay the first response.
+- **Same key while the first request is still running** → the seller either blocks and replays the first response (wait-and-replay) or returns `IDEMPOTENCY_IN_FLIGHT` with `error.details.retry_after` (rule 9 in [security.mdx#idempotency](/docs/building/by-layer/L1/security#idempotency)). On `IDEMPOTENCY_IN_FLIGHT`, wait and retry with the **same key**; minting a fresh key on this code turns a safe retry into a double-execution race.
 
 For async flows, the replayed response carries the **same `task_id`** so polling continues against the same task instead of forking.
 

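The buyer-side behavior described in the `calling-an-agent.mdx` bullets above might look like the following sketch. The `send` callable, the dict-shaped response, and the `max_attempts` default are assumptions made for illustration; only the `IDEMPOTENCY_IN_FLIGHT` code and the `error.details.retry_after` field come from the spec text.

```python
import time
import uuid


def call_with_retry(send, payload, max_attempts=5, sleep=time.sleep):
    """Call a mutating tool, retrying on IDEMPOTENCY_IN_FLIGHT with the same key.

    send: hypothetical transport callable taking (payload, idempotency_key)
          and returning the response as a dict.
    """
    key = str(uuid.uuid4())  # minted once per logical operation, never per attempt
    for _ in range(max_attempts):
        response = send(payload, key)
        error = response.get("error") or {}
        if error.get("code") != "IDEMPOTENCY_IN_FLIGHT":
            return response
        # Honor the seller's hint (seconds); fall back to 1s if it is missing.
        sleep(error.get("details", {}).get("retry_after", 1))
    raise TimeoutError("operation still in flight after %d attempts" % max_attempts)
```

Keeping key generation outside the loop is the point of the sketch: every attempt reuses the key, so a seller that has finished the first request can replay its response instead of executing again.
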
diff --git a/scripts/error-code-drift-dispositions.json b/scripts/error-code-drift-dispositions.json
@@ -46,6 +46,11 @@
       "target_version": "3.1",
       "note": "Part of field_scopes feature (#3887). Backport requires the whole feature surface, which is 3.1-shaped."
     },
+    "IDEMPOTENCY_IN_FLIGHT": {
+      "disposition": "held-for-next-minor",
+      "target_version": "3.1",
+      "note": "Pairs with security.mdx#idempotency rule 9 (first-insert-wins / concurrent-retry resolution). Lets sellers reject-and-redirect on concurrent retry instead of blocking; buyers retry with the same key after error.details.retry_after. Wire change — held for 3.1."
+    },
     "PAYMENT_TERMS_NOT_SUPPORTED": {
       "disposition": "held-for-next-minor",
       "target_version": "3.1",

diff --git a/scripts/lint-storyboard-check-enum.cjs b/scripts/lint-storyboard-check-enum.cjs
@@ -49,16 +49,17 @@ const SYNTHESIZED_CHECK_KINDS = new Set([
   'unresolved_substitution',
 ]);
 
-function loadAuthoredCheckKinds() {
+function loadKnownCheckKinds() {
   const doc = yaml.load(fs.readFileSync(CONTRACT_FILE, 'utf8'));
-  const kinds = doc && Array.isArray(doc.authored_check_kinds) ? doc.authored_check_kinds : null;
-  if (!kinds || kinds.length === 0) {
+  const authored = doc && Array.isArray(doc.authored_check_kinds) ? doc.authored_check_kinds : null;
+  if (!authored || authored.length === 0) {
     throw new Error(
       `runner-output-contract.yaml is missing the \`authored_check_kinds\` list. ` +
       `This lint reads that field as the canonical enum; restore it before running.`
     );
   }
-  return new Set(kinds);
+  const crossResponse = doc && Array.isArray(doc.cross_response_check_kinds) ? doc.cross_response_check_kinds : [];
+  return new Set([...authored, ...crossResponse]);
 }
 
 const RULE_MESSAGES = {
@@ -108,7 +109,7 @@ function* walkValidations(doc) {
 }
 
 function lint(sourceDir = SOURCE_DIR) {
-  const authoredKinds = loadAuthoredCheckKinds();
+  const authoredKinds = loadKnownCheckKinds();
   const violations = [];
 
   function lintFile(p) {
@@ -183,6 +184,6 @@ if (require.main === module) main();
 module.exports = {
   RULE_MESSAGES,
   SYNTHESIZED_CHECK_KINDS,
-  loadAuthoredCheckKinds,
+  loadKnownCheckKinds,
   lint,
 };