feat: add SEP-2106 structuredContent wire-shape scenario (complements #295) #308
Draft
olaservo wants to merge 1 commit into
…odelcontextprotocol#295)

modelcontextprotocol#295 added SEP-2106 conformance checks for JSON Schema 2020-12 keyword preservation in inputSchema (composition/conditional/$anchor surviving tools/list), folded into the existing JsonSchema2020_12Scenario. This PR covers the other half of SEP-2106: the loosened wire format for outputSchema and structuredContent.

Sep2106StructuredContentScenario emits six checks across two test tools:

- sep_2106_array_output_tool: outputSchema.type === 'array' at the root advertised in tools/list; tools/call returns a JSON array directly in structuredContent.
- sep_2106_primitive_output_tool: outputSchema.type === 'number' at the root; tools/call returns a raw number in structuredContent.

Both shapes are what SEP-2106's motivation section uses as the worked examples (weather forecast / get-count) -- the wire-side behaviour the SEP exists to enable -- and neither is exercised by the keyword-preservation checks.

The scenario uses raw HTTP because both the SDK Client and SDK Server validators currently reject non-object outputSchema/structuredContent: the SDK Client refuses to parse the list response, and the SDK Server returns JSON-RPC -32602 instead of emitting the call result. Raw HTTP bypasses both so the scenario can inspect what is actually on the wire.

The scenario is registered in pendingClientScenariosList (and the all list, mirroring SEP-1613) so it does not run in the active suite against the in-repo everything-server, which cannot satisfy these checks until the SDK widens CallToolResultSchema.structuredContent to unknown. Once that lands, the pending entry can be removed.

Positive verification target ships as examples/servers/typescript/sep-2106-compliant-server.ts: a bare-bones raw-Express server that speaks the SEP-2106 wire format end-to-end without an SDK in the path. A vitest in negative.test.ts spawns it and asserts every check is SUCCESS, which is what proves the scenario is not just emitting FAILURE everywhere.

Assisted by Claude Code 🦉
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
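
For illustration, the two wire shapes named in the commit message can be sketched as plain payload fragments plus a small checker. This is a minimal sketch under stated assumptions, not the scenario's actual code: the types, the `checkWireShape` helper, and the sample values (three forecast strings, a count of 42) are hypothetical. Only the tool names, the `outputSchema.type` values, and the SUCCESS/FAILURE statuses come from the description above.

```typescript
// Hypothetical, simplified types; the real scenario's types may differ.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Status = 'SUCCESS' | 'FAILURE';

interface ToolEntry {
  name: string;
  outputSchema?: { type?: string; items?: Json };
}

interface CallResult {
  structuredContent?: Json;
}

// What a SEP-2106-style tools/list could advertise (sample data, assumed).
const listedTools: ToolEntry[] = [
  { name: 'sep_2106_array_output_tool', outputSchema: { type: 'array', items: { type: 'string' } } },
  { name: 'sep_2106_primitive_output_tool', outputSchema: { type: 'number' } },
];

// What tools/call could return for each tool (sample data, assumed).
const arrayCall: CallResult = { structuredContent: ['sunny', 'cloudy', 'rain'] };
const primitiveCall: CallResult = { structuredContent: 42 };

// Returns three statuses per tool: tool found, root schema type preserved,
// and structuredContent carrying the bare array or number.
function checkWireShape(
  tools: ToolEntry[],
  toolName: string,
  expectedType: 'array' | 'number',
  result: CallResult,
): Status[] {
  const tool = tools.find((t) => t.name === toolName);
  const value = result.structuredContent;
  const shapeOk = expectedType === 'array' ? Array.isArray(value) : typeof value === 'number';
  return [
    tool !== undefined ? 'SUCCESS' : 'FAILURE',
    tool?.outputSchema?.type === expectedType ? 'SUCCESS' : 'FAILURE',
    shapeOk ? 'SUCCESS' : 'FAILURE',
  ];
}

console.log(checkWireShape(listedTools, 'sep_2106_array_output_tool', 'array', arrayCall));
console.log(checkWireShape(listedTools, 'sep_2106_primitive_output_tool', 'number', primitiveCall));
```

A result that wraps the value in an object (for example `{ result: 42 }`) would fail the third status in this sketch, which is the distinction the wire-shape checks are meant to surface.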
Summary
Complements #295 by adding the second half of SEP-2106's wire-format changes that the keyword-preservation checks don't reach: the loosened `outputSchema` (non-object `type`s at the root) and the widened `structuredContent` (any JSON value, not just records). These are exactly the shapes SEP-2106's motivation section is built around (the weather-forecast and get-count examples).

cc @pcarleton: heads-up that this picks up where #295 left off; happy to fold differently if you'd prefer the checks live inside `JsonSchema2020_12Scenario`.

What #295 covered vs. what this adds

| SEP-2106 surface | Covered by |
| --- | --- |
| `inputSchema` keeps `type: "object"`; 2020-12 vocabulary survives `tools/list` | #295 |
| `outputSchema` may be `type: "array"` at the root, advertised in `tools/list` | this PR |
| `outputSchema` may be a primitive (`type: "number"`) at the root | this PR |
| `tools/call` returns an array directly in `structuredContent` (on the wire) | this PR |
| `tools/call` returns a primitive directly in `structuredContent` (on the wire) | this PR |
| `MUST NOT` auto-deref network `$ref` (SEP security section) | (`json-schema-ref-no-deref`) |

New scenario: `sep-2106-structured-content`

Six checks across two test tools:

- `sep-2106-array-output-tool-found`
- `sep-2106-array-output-schema-preserved`: `outputSchema.type === 'array'` survives `tools/list` (SDK didn't wrap it in `{type:'object'}`)
- `sep-2106-array-structured-content`: `tools/call` returns array directly in `structuredContent`
- `sep-2106-primitive-output-tool-found`
- `sep-2106-primitive-output-schema-preserved`: `outputSchema.type === 'number'` survives
- `sep-2106-primitive-structured-content`: `tools/call` returns raw number in `structuredContent`

All six use `FAILURE` severity when they do not pass (capability-test framing, same as SEP-1613 / #295: there are no new RFC 2119 sentences in the spec diff, so the keyword-mapping rule doesn't constrain severity). The scenario is registered in `pendingClientScenariosList` (and `allClientScenariosList`), so it stays out of the active suite: the in-repo everything-server can't satisfy these checks until the SDK widens `CallToolResultSchema.structuredContent` to `unknown`. Once that lands, the pending entry comes out.

Why raw HTTP + a non-SDK reference server
Pre-SEP-2106, both sides of the SDK reject non-object `structuredContent`:

- SDK Client: refuses to parse the `tools/list` response (`outputSchema.type !== 'object'`) and the `tools/call` response (`structuredContent` not a record). The scenario uses raw `http.request` to bypass that and inspect the actual wire bytes.
- SDK Server: validates `tools/call` results against the same `CallToolResultSchema` and returns JSON-RPC `-32602` instead of letting array/primitive `structuredContent` through. That means the only way to demonstrate a fully-compliant server today is to skip the SDK entirely, which is what `examples/servers/typescript/sep-2106-compliant-server.ts` does (bare Express, ~170 lines).

The compliant server is the positive-test target. A vitest in `negative.test.ts` (port 3009) spawns it and asserts that all six checks are SUCCESS; without it, there's no way to prove the scenario isn't just emitting FAILURE everywhere.

Files

- `src/scenarios/server/sep-2106-structured-content.ts`: new scenario (raw HTTP, ~440 lines incl. helper)
- `examples/servers/typescript/sep-2106-compliant-server.ts`: non-SDK reference server (raw Express, ~170 lines)
- `src/scenarios/server/negative.test.ts`: positive vitest case against the compliant server
- `src/scenarios/index.ts`: register in `pendingClientScenariosList` + `allClientScenariosList`, comment mirrors SEP-1613 framing

Test plan

- `npm run typecheck`: clean
- `npm run lint`: clean
- `npm test`: 215/215 passing (207 before this change). The only new vitest block is the compliant-server positive case, which alone would give 208; the pre-push hook reports 215 because the scenario emits checks that the existing soft suite counters also see.
- Reviewer call: keep this as a separate scenario, or fold it into `JsonSchema2020_12Scenario` alongside #295's checks? I went with separate because the soft version gate (SKIPPED/FAILURE) doesn't quite fit here (the SDK can't return the new shape at any negotiated version), but happy to fold.
- After the SDK's `CallToolResultSchema` widening: remove the `pendingClientScenariosList` entry; the in-repo everything-server can then be extended with these tools and the scenario goes green against it.

Notes for review

- `docs/specification/draft/server/tools.mdx` adds no new RFC 2119 sentences (descriptive-only loosening), so per AGENTS.md the keyword-mapping doesn't force a severity. I chose `FAILURE` to match #295's choice for its capability checks; counter-argument: WARNING would be defensible if you consider these advisory until SDK support is widespread.
- `sep-2106-broken-schema.ts` negative-test server: not included. #295 already ships `sep-2106-stripped-schema.ts` covering the same negative-test role for the keyword-preservation checks. Adding another wire-shape negative server felt like noise; the positive case + #295's stripped server cover the failure modes adequately. Happy to add one if reviewers disagree.

Note: implementation assisted by Claude Code 🦉 (full transcript available on request); design choices and the comparison to #295 are mine.
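
To make the SDK-side rejection discussed under "Why raw HTTP + a non-SDK reference server" concrete, the two validation policies can be modelled as small functions. This is a hypothetical model only: it does not reproduce the SDK's actual schemas, and the function names and error message are invented for illustration. The only details taken from the description are the record-only restriction, the widening to `unknown`, and the JSON-RPC `-32602` error code.

```typescript
// Hypothetical model of the two policies; not the SDK's real validators.
interface JsonRpcError {
  code: number;
  message: string;
}

// JSON-RPC 2.0 "Invalid params", the code the description says the SDK server returns.
const INVALID_PARAMS = -32602;

// A "record" here means a plain JSON object: not null, not an array, not a primitive.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Pre-widening policy: structuredContent must be a record.
function validateRecordOnly(structuredContent: unknown): JsonRpcError | null {
  return isRecord(structuredContent)
    ? null
    : { code: INVALID_PARAMS, message: 'structuredContent must be an object (illustrative message)' };
}

// Post-widening policy (structuredContent typed as unknown): every value is accepted.
function validateWidened(_structuredContent: unknown): JsonRpcError | null {
  return null;
}

for (const sample of [{ count: 42 }, ['sunny', 'cloudy'], 42]) {
  console.log(JSON.stringify(sample), validateRecordOnly(sample)?.code ?? 'ok', validateWidened(sample)?.code ?? 'ok');
}
```

Under the record-only policy the array and the number are rejected before anything reaches the wire, which is why a server built on that policy cannot be the positive-test target.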