Describe the bug
The suite already emits a tools-name-format check from ToolsListScenario (#240 / #238), but it does not correctly enforce the documentation-level SHOULD rules for tool names:
- Wrong severity: violations emit FAILURE, but every normative sentence in the spec prose is SHOULD / SHOULD NOT, not MUST. Per AGENTS.md, SHOULD requirements must emit WARNING (Tier-1 CI still treats WARNING as a failure; see #245).
- Wrong rules: the check validates 1–64 chars and ^[A-Za-z0-9_./-]+$ (from the original SEP-986 draft). The published 2025-11-25 and draft prose says 1–128 chars and allows only A–Z, a–z, 0–9, _, -, . (no forward slash).
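- Wrong version gate: ToolsListScenario is tagged introducedIn: '2025-06-18', but the Tool Names normative prose exists only from 2025-11-25 onward. Proof from the spec source on GitHub (modelcontextprotocol/modelcontextprotocol, docs/specification/*/server/tools.mdx):
  - 2025-03-26/server/tools.mdx: ### Tool (L177) jumps to ### Tool Result (L193); no #### Tool Names heading.
  - 2025-06-18/server/tools.mdx: same structure, ### Tool (L180) then ### Tool Result (L198); no #### Tool Names.
  - 2025-11-25/server/tools.mdx: #### Tool Names (L217) under ### Tool.
  - draft/server/tools.mdx: #### Tool Names (L308).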
- Misleading scenario prose: tools-list description and comments say tool names “MUST” match the format; the spec says SHOULD.
- No negative vitest: unlike other wire/format checks (e.g. sep-2164-*, sep-2549-*), there is no deliberately non-conformant fixture proving the check catches violations.
- Schema vs docs gap: in schema/draft/schema.json, Tool.name (via BaseMetadata) is an unconstrained string with no pattern, minLength, or maxLength. Conformance is the right place to test the documentation SHOULD until/unless the schema adds constraints.
Timeline note:
- SEP-986 was opened 2025-07-16, nearly a month after the 2025-06-18 dated spec was published.
- The rule did not exist anywhere in the spec tree at the time that revision was cut; it landed first in draft (PR #1603, 2025-10-09) and only entered a dated release with 2025-11-25.
- Tagging the check introducedIn: '2025-06-18' therefore backdates a requirement that postdates that version.
Format drift (64 + / → 128, no /)
- The finalized SEP file was not updated when the rule entered the spec: seps/986-specify-format-for-tool-names.md still says 1–64 chars and allows / (example: user-profile/update).
- When SEP-986 was integrated into the spec, PR #1603 wrote #### Tool Names as 1–128 chars, [A-Za-z0-9_.-] only (no /; examples use admin.tools.list, not user/profile/update). See the PR #1603 tools.mdx diff.
- No separate SEP introduced that change; the dated spec prose diverged from the SEP markdown at merge time. PR #1603’s additional context explicitly links typescript-sdk#900, whose validation matches the spec text (128 max, rejects / in tests) rather than the SEP file. This plausibly means the reference TypeScript SDK behavior drove what landed in the spec.
- Conformance #238 / #240 then encoded the stale SEP rules (64 + /), not the PR #1603 spec diff. The misalignment is still open on the SEP side: Solido commented on modelcontextprotocol#986 (Feb 2026) about the 2025-11-25 spec vs the SEP on /, and typescript-sdk#1502 was closed in favor of tracking against the current spec in #1512.
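To make the drift concrete, the sketch below evaluates a few names under both rule sets. The two regular expressions are transcribed from the rules quoted in this issue (stale SEP markdown vs. 2025-11-25 prose); the constant and function names are hypothetical and are not taken from the harness.

```typescript
// Stale SEP-986 markdown: 1-64 chars, '/' allowed (what the harness validates today).
const STALE_SEP_RULE = /^[A-Za-z0-9_.\/-]{1,64}$/;
// 2025-11-25 / draft spec prose: 1-128 chars, no '/'.
const DATED_SPEC_RULE = /^[A-Za-z0-9_.-]{1,128}$/;

// Hypothetical helper: reports whether a name passes each rule set.
function compareRules(name: string): { sep: boolean; spec: boolean } {
  return { sep: STALE_SEP_RULE.test(name), spec: DATED_SPEC_RULE.test(name) };
}

const samples = [
  'admin.tools.list',     // passes both
  'user-profile/update',  // passes the SEP rule only ('/' is not in the dated spec charset)
  'a'.repeat(100),        // passes the dated spec only (longer than 64 chars)
  'bad name',             // fails both (space)
];

for (const name of samples) {
  const { sep, spec } = compareRules(name);
  console.log(`${name.slice(0, 24)}: sep=${sep} spec=${spec}`);
}
```

The second and third samples are the cases where the current check and the dated spec disagree, so they are natural candidates for unit tests once the rules are corrected.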
No equivalent hint exists elsewhere in the 2025-03-26 or 2025-06-18 trees (only “unique identifier” prose, unconstrained Tool.name: string in schema, and illustrative examples like get_weather — not charset/length SHOULDs).
The check should run only from 2025-11-25 onward (including draft), not on 2025-06-18 or 2025-03-26.
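A minimal sketch of such a gate follows. It assumes dated spec versions are ISO dates (so plain string comparison orders them) and that draft is always newer than any dated release. The function names are hypothetical; the harness's real specVersionAtLeast helper may have a different signature or behavior.

```typescript
// Spec versions as used on the command line, e.g. '2025-06-18' or 'draft'.
type SpecVersion = string;

// Hypothetical comparison helper (not the harness's implementation).
// Assumption: dated versions are YYYY-MM-DD, so lexicographic order equals
// chronological order; 'draft' is treated as newer than every dated version.
function versionAtLeast(actual: SpecVersion, minimum: SpecVersion): boolean {
  if (actual === 'draft') return true;
  if (minimum === 'draft') return false;
  return actual >= minimum;
}

// First dated spec that contains the Tool Names prose, per this issue.
const TOOL_NAMES_INTRODUCED_IN: SpecVersion = '2025-11-25';

// Decides whether tools-name-format should be emitted at all.
function shouldRunToolNameCheck(specVersion: SpecVersion): boolean {
  return versionAtLeast(specVersion, TOOL_NAMES_INTRODUCED_IN);
}
```

With this gate, 2025-03-26 and 2025-06-18 skip the check entirely instead of reporting a result for a requirement those revisions do not contain.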
When was this introduced?
| Milestone | Date / version | Notes |
|---|---|---|
| SEP-986 opened | 2025-07-16 | Original SEP text: 1–64 chars, includes /, all SHOULD |
| SEP merged into draft spec | 2025-10-09 (PR #1603) | docs/specification/draft/server/tools.mdx |
| TypeScript SDK validation | 2025-11-11 (typescript-sdk#900) | SDK warns on non-conformant names |
| Not present | 2025-06-18, 2025-03-26 | No #### Tool Names in tools.mdx for those revisions (see 2025-06-18, 2025-03-26) |
| First dated spec with Tool Names prose | 2025-11-25 | #### Tool Names at L217: 1–128 chars, no / |
| Draft spec | 2026-07-28 | #### Tool Names at L308: same SHOULD rules (+ disambiguation note for aggregated clients) |
To Reproduce
Steps to reproduce the behavior:
- Run tools-list against a reference server on --spec-version 2025-11-25:
  node dist/index.js server --url '<server>' --scenario tools-list --spec-version 2025-11-25
- Observe check tools-name-format: it runs even though the requirement is not in 2025-06-18 prose, and uses the 64-char / slash-allowed rules.
- Point the suite at a server advertising a tool named "bad name" (contains a space) or 'a'.repeat(100) (valid under the 2025-11-25 length rule, invalid under the harness’s 64-char rule). The behavior does not match the spec text.
- Inspect buildToolsNameFormatCheck in src/scenarios/server/tools.ts: violations return status: 'FAILURE' instead of 'WARNING'.
- Search src/scenarios/server/negative.test.ts and examples/servers/typescript/: there is no fixture for invalid tool names.
Expected behavior
For --spec-version 2025-11-25 and draft only:
- After tools/list, validate each advertised tool.name against the spec prose for that version:
  - Length SHOULD be 1–128 (inclusive)
  - Characters SHOULD be [A-Za-z0-9_.-] only (no spaces, commas, /, etc.)
  - (Optional follow-up checks: uniqueness within server, case-sensitivity; these are harder to test passively from a single list call)
- Emit WARNING (not FAILURE) when any name violates the SHOULD rules; SUCCESS when all conform; INFO when tools is empty (current behavior is fine).
- Do not emit the check for 2025-03-26 or 2025-06-18 (gate via introducedIn: '2025-11-25' on the check or scenario, or specVersionAtLeast(ctx.specVersion, '2025-11-25')).
- Update scenario description/comments to say SHOULD, not MUST.
- Add a negative vitest: broken server fixture advertising "bad tool name" → expect tools-name-format WARNING.
- Add src/seps/sep-986.yaml traceability rows (one check ID per SHOULD sentence exercised, or one consolidated tools-name-format row if that matches manifest convention).
- Resolve SEP-986 (64 + /) vs 2025-11-25 spec (128, no /) against the spec diff before coding. The check must track the dated spec text, not the stale SEP markdown alone.
For --spec-version 2025-06-18 / 2025-03-26: check should not run (no false signal).
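The expected behavior above could be sketched roughly as follows. This is an illustration under stated assumptions, not the existing buildToolsNameFormatCheck: the result shape, the invalidNames field, and the function name are hypothetical. Only the check ID, the statuses, and the 1–128 / [A-Za-z0-9_.-] rule come from this issue.

```typescript
type CheckStatus = 'SUCCESS' | 'WARNING' | 'INFO';

// Hypothetical result shape; the harness's real check type may differ.
interface ToolNameCheckResult {
  id: string;
  status: CheckStatus;
  description: string;
  invalidNames: string[];
}

// 2025-11-25 / draft prose: 1-128 chars from A-Z, a-z, 0-9, '_', '-', '.'.
const TOOL_NAME_RULE = /^[A-Za-z0-9_.-]{1,128}$/;

function checkToolNames(tools: Array<{ name: string }>): ToolNameCheckResult {
  const base = {
    id: 'tools-name-format',
    description:
      'Tool names SHOULD be 1-128 characters and match ^[A-Za-z0-9_.-]+$',
  };
  // Nothing to validate: informational only.
  if (tools.length === 0) {
    return { ...base, status: 'INFO', invalidNames: [] };
  }
  const invalidNames = tools
    .map((tool) => tool.name)
    .filter((name) => !TOOL_NAME_RULE.test(name));
  // SHOULD-level rule, so violations are WARNING rather than FAILURE.
  const status: CheckStatus = invalidNames.length > 0 ? 'WARNING' : 'SUCCESS';
  return { ...base, status, invalidNames };
}
```

A negative fixture then only needs a tools/list result containing one name such as "bad tool name" and an assertion that the status is WARNING.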
Logs
Example from the current (incorrect) implementation when a name violates the harness rule:
[tools-name-format] FAILURE Tool names are 1-64 characters and match ^[A-Za-z0-9_./-]+$
Expected after fix:
[tools-name-format] WARNING Tool names SHOULD be 1-128 characters and match ^[A-Za-z0-9_.-]+$
Additional context
Precedent for enforcing documentation SHOULD semantics
This repo already treats SHOULD-level spec prose as WARNING checks that Tier-1 SDKs must still pass:
| Check | Spec level | Severity | Location |
|---|---|---|---|
| sep-2164-error-code | SHOULD return -32602 | WARNING | src/scenarios/server/resources.ts |
| sep-2164-data-uri | SHOULD include URI in error data | WARNING | same |
| sep-2575-server-list-changed-* | SHOULD notify on list changes | WARNING | src/scenarios/server/stateless.ts |
| sep-2243-server-reject-error-code (custom headers) | mixed | WARNING where SHOULD | src/scenarios/server/http-standard-headers.ts |
| CIMD metadata | SHOULD support | WARNING | src/scenarios/authorization-server/ |
Convention: AGENTS.md — Severity follows the spec keyword.
Negative-test precedent: src/scenarios/server/negative.test.ts + examples/servers/typescript/sep-*.ts broken fixtures.
JSON Schema gap
Tool.name in the machine-readable schema is only:
{ "type": "string", "description": "..." }
No pattern or length bounds. “Strict tool names” is therefore a prose SHOULD requirement until the schema catches up — same class of problem as other doc-only constraints conformance already tests.
Prior issues and PRs (attempted, but problem remains)
Several upstream efforts touched SEP-986 / tool names. None fully resolved the conformance gaps listed above (SHOULD severity, 2025-11-25 rules, version gate, negative fixture, traceability).
Conformance repo (modelcontextprotocol/conformance)
| # | Type | State | What it did | What it left unfixed |
|---|---|---|---|---|
| #238 | Issue | Closed | Requested tools-name-format on tools-list | Framed requirements as MUST (spec says SHOULD); cited original SEP-986 rules (1–64, / allowed); no version gate, no negative test |
| #240 | PR | Merged 2026-04-24 | Added buildToolsNameFormatCheck + unit tests in tools.ts / tools.test.ts; closed #238 | FAILURE not WARNING; still validates 64 chars + /; scenario stays introducedIn: '2025-06-18'; no broken-server vitest; no sep-986.yaml. Reviewer @pcarleton noted the check mostly validates harness-written tool names, not live servers; no follow-up for a negative fixture |
| #313 | Issue | Closed | Flagged broken specReferences URL (SEP/SEP-986.md → 404) in tools.ts | Docs link only; no validation logic change |
| #314 | PR | Closed without merge | Proposed fixing the two 404 specReferences URLs (SEP-986 → issue #986; EMA oauth path) | Never landed; current code uses the live spec-site URL instead |
No open issue or PR in conformance tracks correcting #240's gaps.
Spec repo (modelcontextprotocol/modelcontextprotocol)
| # | Type | State | What it did | Gap |
|---|---|---|---|---|
| #986 | SEP | Closed / Final | Original SEP: 1–64 chars, / allowed, all SHOULD | Superseded in dated spec by 1–128, no / |
| #1063 | Issue | Closed | User report: spaces in tool names break Claude client; schema accepts any string | Predates SEP-986 prose; no conformance artifact |
| #1603 | PR | Merged 2025-10-09 | Added Tool Names section to draft spec docs | Prose only; Tool.name in schema.json remains unconstrained string |
SDK repos (runtime validation — not conformance coverage)
These implement warn-at-registration (or log) in SDKs. They do not replace a server-side conformance check against a live tools/list response.
| Repo | # | State | Notes |
|---|---|---|---|
| typescript-sdk | #900 | Merged 2025-11-11 | Validates 1–128, no /; matches 2025-11-25 prose, not original SEP-986 markdown |
| typescript-sdk | #1502 | Closed | Reported SDK “non-conformance” with original SEP-986 (64 + /); maintainer pointed at updated spec; redirected to #1512 |
| typescript-sdk | #1512 | Open | Tracking client/provider non-compliance, not conformance harness fixes |
| python-sdk | #1655 | Merged 2025-11-24 | Warn-on-register validation |
| python-sdk | #1550 | Closed without merge | Duplicate/overlapping SEP-986 PR (128-char rules); superseded by #1655 |
| go-sdk | #640 | Merged 2025-11-19 | Log-on-register validation; closes #621 |
| kotlin-sdk | #695 | Merged 2026-04-14 | Server-side tool name validation; closes #417 |
Takeaway for this issue
#240 is the only merged conformance change to date. It closed #238 prematurely relative to the current 2025-11-25 / draft prose and AGENTS.md severity rules. This issue is a fix-and-complete follow-up, not greenfield work — consider referencing #238/#240 in the GitHub issue and optionally reopening #238 rather than filing from scratch.
Adjacent open policy (not tool-name-specific)
- conformance#245: whether SHOULD-level WARNING checks count toward Tier-1 (relevant if severity is corrected to WARNING)
Acceptance criteria
- tools-name-format uses WARNING for SHOULD violations on 2025-11-25 and draft
- […] /), confirmed against spec diff
- sep-986.yaml traceability added if required by manifest workflow