spec(idempotency): rules 9 (concurrent retries) + 10 (downstream reconciliation); IDEMPOTENCY_IN_FLIGHT#4402
Merged
Merged
Conversation
…m reconciliation); introduce IDEMPOTENCY_IN_FLIGHT Two new normative rules in L1/security.mdx#idempotency close the last gaps the existing 1-8 left underspecified — what happens when a second request arrives while the first is still running, and what a seller must do about downstream calls whose outcome it lost. Rule 9 (first-insert-wins). Sellers MUST resolve concurrent same-key retries deterministically via INSERT ... ON CONFLICT DO NOTHING on (authenticated_agent, account_id, idempotency_key), and MAY pick one of two policies: wait-and-replay (block, return cached) or reject-and-redirect (return new IDEMPOTENCY_IN_FLIGHT with error.details.retry_after). Verified against the Python sales-agent (Wonderstruck) — its wait-and-replay implementation passes out of the box; the new rule documents existing canonical behavior. Rule 10 (downstream reconciliation). When a seller invokes a downstream system during request handling, it MUST adopt either write-claim- before-invoke or thread-buyer-key. Best-effort dedup on downstream response inspection is explicitly forbidden. New error code IDEMPOTENCY_IN_FLIGHT (held for 3.1 per wire-stability policy). Recovery: transient. Buyers MUST retry with the SAME key after retry_after — minting a fresh key here turns a safe retry into a double-execution race. Storyboard concurrent_retry phase added to static/compliance/source/universal/idempotency.yaml, with two new cross-response check kinds (cross_response_count_distinct, cross_response_field_equal). Runner contract documented in static/compliance/source/test-kits/parallel-dispatch-runner.yaml. Runners without parallel-dispatch support skip with not_applicable. Author skill and calling-an-agent.mdx updated so buyers retry rather than mint fresh keys on the new error code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eck kinds, bound in-flight TTL, tighten rule 10 reconciliation Expert review (run-by-experts skill, 4 reviewers in parallel) surfaced 3 convergent blockers and 4 single-reviewer must-fix items on the rule-9/10 spec text and storyboard surface. All addressed: CONVERGENT (2+ reviewers): - Cross-response check kinds (cross_response_field_equal, cross_response_count_distinct) moved out of authored_check_kinds into a sibling cross_response_check_kinds enum in runner-output-contract.yaml. Prevents the per-response-interpretation footgun. Lint updated to read both arrays. Storyboard-schema.yaml gains a dedicated section documenting parallel_dispatch step block + the two check kinds. Runner-output-contract.yaml expected/actual blocks document both kinds' payload shapes. - In-flight TTL bound: rule 9 now requires sellers to bound in-flight row lifetime to their declared per-task handler timeout and release the row on timeout. Also caps retry_after at <= replay_ttl_seconds — recommends an order of magnitude below the TTL ceiling. - Rule 10 reviewer-graded note: explicit clause that rule 10 conformance is reviewer-graded, not programmatically graded by the storyboard suite. parallel_dispatch_runner already lists rule-10 as a reviewer_check; this surfaces it in the spec text too. SINGLE-REVIEWER MUST-FIX: - Rule 9 hash-availability: spec text now requires sellers to persist the canonical payload hash on INSERT (not a sentinel). Closes the gap where the SDK store's __adcp_in_flight__ placeholder prevented the rule-9-vs-rule-5 boundary from being detectable. - Rule 10 HMAC derivation rationale: tightened to per-downstream- provider derivation (HMAC(K_provider, idempotency_key)) rather than a single shared seller secret across all downstreams. Prevents cross-provider replay if any single downstream is compromised. - Rule 10 downstream-error key strip: sellers MUST NOT propagate the buyer's idempotency_key (or a reversible derivative) in error envelopes originating from a downstream. Closes a cross-trust- boundary key-disclosure surface. - Rule 10 claim-row-without-invoke: if the downstream reports no record of downstream_request_id (claim row persisted but seller crashed pre-invoke), the seller MUST proceed with the invocation rather than fail-closed. - Storyboard verify_media_buy_count narrative: corrected from "only TWO media buys" claim (false after the new concurrent_retry phase creates a third) to "captured media_buy_ids both exist and are distinct" — matches what the step's query actually asserts. Plus a transitional note in the changeset clarifying that SERVICE_UNAVAILABLE + retry_after on the in-flight branch remains conformant under wait-and-replay; IDEMPOTENCY_IN_FLIGHT is only required for reject-and-redirect. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…l-closed on ambiguous downstream lookup, rename lint helper Round-2 expert review surfaced 3 single-reviewer must-fix items (no convergent blockers). All three addressed: - Rule 10 raw-key clause was too absolute (security review #5). The prior MUST-NOT-pass-raw forbade legitimate intra-tenant patterns where a seller threads the key into a microservice it operates end-to-end (same KMS, same audit log, same operator). Reworded to forbid passing across trust boundaries specifically — same-trust- principal forwarding is permitted, though per-provider HMAC derivation remains the better default. - Rule 10 claim-row-without-invoke fail-closed on ambiguous lookup (security review #7-low). The prior "if downstream reports no record, proceed with invocation" wording assumed the downstream's lookup response is authentic. Added: on transient 5xx, network error, or malformed response, the seller MUST fail closed (return a transient error to the buyer per rule 9), not proceed on an unauthenticated "no record" signal. - Renamed lint helper loadAuthoredCheckKinds → loadKnownCheckKinds to match its actual behavior (returns the union of authored + cross- response check kinds after PR's round-2 split). Pure naming; lint output and behavior unchanged. Updated tests/lint-storyboard- check-enum.test.cjs to match. Follow-up filed: #4406 — declare capabilities.idempotency.in_flight_max_seconds so buyers can compute retry budgets bounded by the seller's handler-timeout rather than the much-wider replay_ttl_seconds ceiling. Not blocking this PR; additive 3.1 capability declaration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
bokelley
added a commit
that referenced
this pull request
May 11, 2026
…onds (#4409) Closes #4406. Follow-up to #4402 (rules 9 + 10 + IDEMPOTENCY_IN_FLIGHT). Rule 9 bounds in-flight row lifetime to the seller's per-task handler timeout but the bound is not buyer-observable today — the existing capabilities surface declares replay_ttl_seconds only (1h–7d), much wider than a realistic handler timeout. Buyers retrying on IDEMPOTENCY_IN_FLIGHT have no way to compute a tight retry budget. This adds an optional in_flight_max_seconds field to the IdempotencySupported branch of adcp.idempotency: - Optional in 3.1 (additive; SDKs that don't see it fall back to rule 9's order-of-magnitude SHOULD heuristic). - Required when supported: true in 4.0 (same migration path replay_ttl_seconds followed across 2.x → 3.x). - Bounded integer >= 1, <= 604800 at the schema layer; cross-field bound (<= replay_ttl_seconds) enforced by composed-schema test suite since JSON Schema cannot express field-relative bounds. - Forbidden on IdempotencyUnsupported branch (mirrors the existing replay_ttl_seconds treatment — no replay window means no in-flight bound). Rule 9 in security.mdx updated to point at the new capability field as the primary retry-budget bound when declared (Option A per the triage design pass); the order-of-magnitude heuristic remains the fallback when the field is absent. Composed-schema validation suite gains 5 new tests: - IdempotencySupported with in_flight_max_seconds accepts - Rejects in_flight_max_seconds: 0 (below minimum 1) - Rejects in_flight_max_seconds on unsupported branch - Schema accepts in_flight_max_seconds > replay_ttl_seconds at the schema layer (programmatic cross-field assertion catches it separately) - Cross-field assertion: in_flight_max_seconds > replay_ttl_seconds is detected 40/40 tests pass. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the last two gaps the existing idempotency rules 1–8 left underspecified, both surfaced by a recent post on idempotency that walks through the failure modes a replay cache doesn't explain:
(authenticated_agent, account_id, idempotency_key)MAY arrive while the first is still executing. Sellers MUST resolve the race deterministically viaINSERT … ON CONFLICT DO NOTHINGon the scope tuple (with the canonical payload hash persisted at INSERT — not a sentinel — so same-key/different-payload mismatches are detectable before the first request completes), and MAY adopt one of two policies (behaving consistently): wait-and-replay (block the second request, return cached) or reject-and-redirect (return newIDEMPOTENCY_IN_FLIGHTwitherror.details.retry_after, bounded above byreplay_ttl_seconds). In-flight row lifetime is bounded by the seller's declared per-task handler timeout.parallel_dispatch_runner.reviewer_checks), not programmatically graded.IDEMPOTENCY_IN_FLIGHT(held for 3.1 per wire-stability policy). Recovery: transient. Buyers MUST retry with the same key — minting a fresh one turns a safe retry into a double-execution race.Storyboard coverage
static/compliance/source/universal/idempotency.yamlgains aconcurrent_retryphase using two new cross-response check kinds (cross_response_count_distinct,cross_response_field_equal) that operate on the resolved response set across N parallel dispatches. The check kinds live in a separatecross_response_check_kindsenum (notauthored_check_kinds) to prevent per-response-interpretation footguns; storyboard-schema.yaml documents theparallel_dispatchstep block and the kinds' semantics. The runner contract is instatic/compliance/source/test-kits/parallel-dispatch-runner.yaml; runners without parallel-dispatch support grade the phasenot_applicable.Wonderstruck (Python sales-agent) validation
Ran the new rules against the canonical Python sales-agent at
wonderstruck.sales-agent.scope3.com:create_media_buycalls with the same fresh key both returnedmedia_buy_id: "4070698693", both ~5.5–6s — exactly one resource created. The new rule documents existing canonical behavior rather than inventing it.adcp_error.code: "IDEMPOTENCY_CONFLICT"correctly.replayed: trueon the concurrent-replay response (rule 4 gap, not in scope here — tracked separately at Idempotency: missing replayed: true, lowercase validation_error, stripped replay payload, error.field in CONFLICT bokelley/salesagent#342).Expert review
Ran through
run-by-experts(4 reviewers in parallel) twice. Round 1 surfaced 1 convergent blocker + 4 single-reviewer must-fix items; all addressed in commit404fec7322. Round 2 surfaced 0 convergent blockers + 3 single-reviewer must-fix items; all addressed in commit4d98a0a198. Key fixes:cross_response_check_kindsout ofauthored_check_kinds(blocker: per-response-interpretation footgun)retry_afteratreplay_ttl_seconds(vacuous-wait footgun)Follow-up issues
parallel_dispatch_runnercontract implementation (gates the new storyboard phase from running anywhere)SERVICE_UNAVAILABLE→IDEMPOTENCY_IN_FLIGHTon the in-flight branch (one-line wire-code update)capabilities.idempotency.in_flight_max_secondsso buyers can compute retry budgets bounded by the seller's handler-timeout rather than the much-widerreplay_ttl_secondsceilingreplayed: trueenvelope, lowercasevalidation_error, package-field drop on replay,fieldpopulated in IDEMPOTENCY_CONFLICTTest plan
node scripts/lint-error-codes.cjs,lint-error-code-drift.cjs,lint-storyboard-check-enum.cjs— cleannode scripts/build-schemas.cjs,build-compliance.cjs— cleannode --test tests/lint-storyboard-check-enum.test.cjs— 6/6 pass (after function rename)npm test— 35/36 pass; the one failure (TMPserve_window_sec) is pre-existing on main, not introduced by this PRmedia_buy_id🤖 Generated with Claude Code