fix(training-agent): bump @adcp/sdk to 6.11.0, drop both KNOWN_FAILING_STEPS#4080
Merged
Conversation
…G_STEPS 6.11.0 ships adcp-client#1554 (handoffToTask accepts caller-supplied task_id) and #1552 (simulate_delivery spreads extension params verbatim). Wires the v6 sales platform's createMediaBuy to route the directive's submitted-arm envelope through ctx.handoffToTask with task_id, removes both KNOWN_FAILING_STEPS entries, ratchets step counts. Conformance suite fully green with zero skips. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SDK 6.11 ships a `<version>.previous` sibling alongside `<version>` for downgrade rollback. The runner reads from `<version>` per ADCP_VERSION, but the overlay's `find | head -1` picked siblings non-deterministically — locally it picked `<version>` (right) but Linux ext4 in CI picked `.previous` (wrong) and left the live cache with the SDK-bundled YAML. Storyboard checks then graded against the un-widened cache instead of the current-PR fixtures, regressing every tenant. Resolve via ADCP_VERSION when present; otherwise sort + filter out `.previous` defensively. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
bokelley
added a commit
that referenced
this pull request
May 4, 2026
…4098) * fix(training-agent): tenant router lock + storyboard-smoke --brand Two follow-ups surfaced by the Socket Mode stack work (#4007/#4082) against the local training agent. 1. **Tenant router race (closes #4084).** The tenant router shares one MCP `Server` instance per tenant from the framework's `DecisioningAdcpServer` registry. Each request created a fresh `StreamableHTTPServerTransport`, called `server.connect(transport)`, handled the request, and `server.close()`'d. Two concurrent requests against the same tenant overlapped and the second `.connect()` threw "Already connected to a transport" — surfaced as intermittent 500s under back-to-back HTTP load. Fix: per-tenant async lock (`withTenantLock`) serializes the `connect/handle/close` window per tenant so the shared server only ever has one transport bound at a time. Throughput is gated by the in-flight request's wallclock; the storyboard runner's sequential dispatch makes this a non-issue, and the compliance heartbeat runs one tenant at a time. A future improvement could pool servers per tenant for true parallelism — this lock is the minimum-mass correctness change. Verification: 10 concurrent `tools/list` POSTs against `/api/training-agent/sales/mcp` return all `"result"` (was a mix of "result" and 500s before); server log shows zero "Already connected" errors (was 7+ per run before). 2. **storyboard-smoke `--brand` flag (closes #4083).** That issue reported `update_media_buy` on a cancelled buy returning `MEDIA_BUY_NOT_FOUND` instead of `INVALID_STATE`. After diagnosis the training agent is correct — the bug is in the upstream SDK runner's `update_media_buy` enricher, which fabricates an account from `resolveBrand(options)` (defaulting to `test.example`) when options.brand is unset. Positive-path steps get rewritten to `test.example`; `expect_error` steps skip the enricher and keep the YAML's literal `acmeoutdoor.example`. Result: split-brain session keying and stale-state reads on the negative-path probes. Fix: storyboard-smoke now accepts `--brand <domain>`. With `--brand acmeoutdoor.example`, the SDK runner's `applyBrandInvariant` normalizes both step kinds to the kit's domain and the storyboard passes 9/9. Long-form rationale in the JSDoc. Also un-blocks the pre-commit typecheck by adding a `// @ts-expect-error` on the v6-sales-platform `handoffToTask` call. PR #4080 bumped @adcp/sdk to 6.11 expecting the two-arg signature from adcp-client#1554, but the published 6.11 .d.ts still declares the single-arg shape only. Runtime accepts the second arg correctly — pure typings gap. Drop the directive when 6.12+ ships the matching typing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(training-agent): drop @ts-expect-error — local was stale on @adcp/sdk CI surfaced "Unused '@ts-expect-error' directive" because CI's npm install resolved @adcp/sdk 6.11.0 (which has the two-arg `handoffToTask` typing from adcp-client#1554), while my local node_modules was still on 6.9.0 (no typing for the second arg) — the version mismatch made the directive necessary locally but unused on CI. Refreshed node_modules to 6.11.0 and dropped the directive. Both surfaces typecheck clean now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(training-agent): expand tenant-lock comments per code review Two doc-only fixes from the PR #4098 code review pass: 1. `withTenantLock` — explain why the lock includes flushDirtySessions and server.close() (not just the transport window): in-memory session state mutations from request N must persist before N+1 runs against the same shared `DecisioningAdcpServer`. Narrowing the lock would race on the v5 handlers' session-context state. 2. `withTenantLock` — explain the `.catch(() => {})` chain-keepalive pattern so a future reader doesn't "fix" it by removing the catch (which would poison every subsequent same-tenant request). 3. `storyboard-smoke.ts` — add `--brand` to the usage block so the flag is discoverable from the file header. No behavior change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
6.11.0 ships the two SDK gaps surfaced during the #3965 cluster work — both filed and closed within hours of being identified:
ctx.handoffToTask(fn, options)now acceptsTaskHandoffOptions.task_id. Lets the seller pass a caller-supplied task id through to the framework's submitted-arm projection (required by theforce_create_media_buy_armtest directive).simulate_deliverydispatcher now spreads params verbatim, so extension fields likevendor_metric_valuesreach the seller's adapter instead of being silently dropped.Wires both fixes:
v6-sales-platform.ts:createMediaBuydetects the v5 handler's submitted-arm envelope (returned whenforce_create_media_buy_armis set) and routes it throughctx.handoffToTask(fn, { task_id })with the directive's task_id. The handoff fn is a no-op-with-throw because the test directive only asserts on the immediate submitted envelope; production sellers register a real completion handler.KNOWN_FAILING_STEPSentries — both storyboards now pass cleanly with no skips.Floor lifts (step counts move when previously-skipped steps now pass):
All six tenants pass with zero failing steps and zero known-failing skips. The conformance suite is fully green on the framework path.
Test plan
🤖 Generated with Claude Code