Tolerate spec-violating list methods on backend init#5232
Conversation
When a backend advertises resources or prompts capability in its initialize response but returns JSON-RPC -32601 to resources/list or prompts/list, treat that as "the backend has no resources/prompts" rather than a fatal init error. This unblocks third-party MCP servers (e.g. Atlassian Rovo) whose initialize response contradicts their implemented method set, so users still get the backend's tools instead of silently losing them. Implements changes for issue #5231: - Recover from errors.Is(err, mcp.ErrMethodNotFound) on resources/list and prompts/list with a WARN log naming the backend and method. - Keep tools/list failures and non-(-32601) errors from list methods fatal so we are not silencing arbitrary failures. - Add table-driven unit tests with a fake JSON-RPC backend covering the recoverable, fatal, and regression-guard cases.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5232 +/- ##
==========================================
+ Coverage 67.91% 67.97% +0.05%
==========================================
Files 610 612 +2
Lines 62522 62741 +219
==========================================
+ Hits 42464 42648 +184
- Misses 16879 16909 +30
- Partials 3179 3184 +5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
lorr1
left a comment
There was a problem hiding this comment.
Multi-agent review summary
Recommendation: COMMENT (non-blocking).
The fix is correct, well-scoped, and end-to-end safe. errors.Is(listErr, mcp.ErrMethodNotFound) is the right discriminator against the mcp-go v0.49.0 sentinel (JSONRPCErrorDetails.AsError() wraps it via fmt.Errorf("%w: ..."), so errors.Is matches both bare and wrapped variants). Switch ordering is nil-safe. tools/list strictness is preserved. Empty caps.Resources/caps.Prompts flow correctly through pkg/vmcp/session/factory.go and the aggregator merge — a downstream client resources/list against vMCP simply returns an empty merged set.
Four specialist reviewers (Go correctness, MCP protocol, test coverage, general code quality) consulted. Codex cross-review was attempted but timed out; not blocking.
Summary of inline findings
| # | Severity | Theme |
|---|---|---|
| 1 | MEDIUM | Switch default arm (the success-path populate loop) is unverified |
| 2 | LOW | Comment phrasing ("spec violation" / "empty resource set") |
| 3 | LOW | wantToolsCalled bool redundant with int counters |
| 4 | LOW | assert.True(strings.Contains(...)) should be assert.Contains |
| 5 | LOW | WARN log lacks workload name and base URL |
| 6 | LOW | errors.Is won't match HTTP-level method-missing variants (404/501) |
Out-of-scope / informational
- Pre-existing: list calls do not loop on
nextCursor— file a follow-up issue if pagination matters. - PR description references
mcp-go v0.43.2;go.modactually pinsv0.49.0. Sentinel and wrapping semantics are identical, so the code is correct — doc nit only. - WARN may be noisy under reconnect churn; rate-limit per
(backendID, method)only if it shows up in practice. - vMCP does not currently forward
notifications/resources/list_changed, so the empty-recovery path does not create phantom subscriptions today.
Addresses #5232 review comments: - MEDIUM mcp_session_capabilities_test.go (3211850392): add success-path rows for resources and prompts so the switch's default arm (the populate loop) is exercised, including BackendID field-mapping assertions. - LOW mcp_session.go (3211850393): tighten the recovery comment in both the resources and prompts switch arms to a single rationale-focused sentence. - LOW mcp_session_capabilities_test.go (3211850399): replace wantToolsCalled bool with wantToolsCalls int so the assertion shape matches the resources/prompts counters and zero-call cases become expressible. - LOW mcp_session_capabilities_test.go (3211850401): use assert.ErrorContains in place of assert.True(strings.Contains(...)) and drop the now-unused strings import. - LOW mcp_session.go (3211850404): enrich the WARN log in both arms with workload name and base URL so the breadcrumb is grep-friendly without an ID-to-name lookup. - LOW mcp_session.go (3211850406): note in the recovery comment that HTTP-level method absence is intentionally fatal, scoping the tolerance to JSON-RPC -32601.
Summary
A vMCP backend that advertises
capabilities.resources(orcapabilities.prompts) in itsinitializeresponse but does not actually implementresources/list(orprompts/list) currently fails per-session backend init outright. Every tool from that backend silently disappears fromtools/list, leaving users with no signal that anything went wrong — a real problem against third-party servers like Atlassian's Rovo MCP that we don't control. This PR makes the per-session bootstrap tolerate that specific spec violation while keeping all other failure modes fatal.In
initAndQueryCapabilities, a JSON-RPC-32601 Method not foundfromresources/listorprompts/listis now treated as "the backend has no resources/prompts" rather than a fatal init error, with a WARN log recording the spec violation.tools/listerrors and any non--32601error from the resources/prompts list calls remain fatal.Closes #5231
Changes Made
pkg/vmcp/session/internal/backend/mcp_session.gomcp.ErrMethodNotFoundfromresources/listandprompts/listviaerrors.Isand recover with an empty result set instead of returning an error.backendIDand the offending list method, so operators can flag the upstream bug without ERROR-level noise.tools/liststrict (a backend with no tool surface is not useful to expose) and leave non--32601list errors fatal (we are not silencing arbitrary failures).pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go(new)Implementation Details
-32601detection relies onmcp-go'sJSONRPCErrorDetails.AsError()wrapping the sentinelmcp.ErrMethodNotFound, soerrors.Is(listErr, mcp.ErrMethodNotFound)is the correct discriminator across upstream message variations (e.g., the doubled"method not found: Method not found"string seen in production).Testing
pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.gocover the acceptance criteria:resources/listreturns-32601after the server advertisedresources-> init succeeds with no resources, tools remain reachable.prompts/listreturns-32601after the server advertisedprompts-> init succeeds with no prompts, tools remain reachable.-32601simultaneously -> init still succeeds, tools remain reachable.tools/listreturns-32601-> init still fails (regression guard).resources/listreturns-32601INTERNAL_ERROR(non--32601) -> init still fails (regression guard).prompts/listreturnsINVALID_PARAMS(non--32601) -> init still fails (regression guard).task test(unit tests) andtask lint-fix(linting) run clean on the touched package.Additional Notes
https://mcp.atlassian.com/v1/mcp) called out in the issue's acceptance criteria has not been performed in this PR; the unit tests exercise the same code path with a fake backend that produces the equivalent-32601response. Reviewers who have the Rovo wiring available should confirmtools/listreturns the workload-prefixed Atlassian tools after OAuth completes.initializeresponse and then unconditionally call X. Out of scope for this PR.