
Tolerate spec-violating list methods on backend init#5232

Merged
tgrunnagle merged 2 commits into main from issue_5231 on May 11, 2026

Conversation

@tgrunnagle
Contributor

Summary

A vMCP backend that advertises capabilities.resources (or capabilities.prompts) in its initialize response but does not actually implement resources/list (or prompts/list) currently fails per-session backend init outright. Every tool from that backend silently disappears from tools/list, leaving users with no signal that anything went wrong — a real problem against third-party servers like Atlassian's Rovo MCP that we don't control. This PR makes the per-session bootstrap tolerate that specific spec violation while keeping all other failure modes fatal.

In initAndQueryCapabilities, a JSON-RPC -32601 Method not found from resources/list or prompts/list is now treated as "the backend has no resources/prompts" rather than a fatal init error, with a WARN log recording the spec violation. tools/list errors and any non--32601 error from the resources/prompts list calls remain fatal.

Closes #5231

Changes Made

pkg/vmcp/session/internal/backend/mcp_session.go

  • Detect mcp.ErrMethodNotFound from resources/list and prompts/list via errors.Is and recover with an empty result set instead of returning an error.
  • Emit a WARN log on recovery, including backendID and the offending list method, so operators can flag the upstream bug without ERROR-level noise.
  • Leave tools/list strict (a backend with no tool surface is not useful to expose) and leave non--32601 list errors fatal (we are not silencing arbitrary failures).

pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go (new)

  • Table-driven tests covering each acceptance-criteria scenario via a fake streamable-http MCP backend.

Implementation Details

  • The -32601 detection relies on mcp-go's JSONRPCErrorDetails.AsError() wrapping the sentinel mcp.ErrMethodNotFound, so errors.Is(listErr, mcp.ErrMethodNotFound) is the correct discriminator across upstream message variations (e.g., the doubled "method not found: Method not found" string seen in production).
  • The recovery is scoped narrowly to the spec-violation case. Transport errors, timeouts, and other JSON-RPC error codes still abort init, preserving visibility into genuine problems.

Testing

  • New unit tests in pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go cover the acceptance criteria:
    • resources/list returns -32601 after the server advertised resources -> init succeeds with no resources, tools remain reachable.
    • prompts/list returns -32601 after the server advertised prompts -> init succeeds with no prompts, tools remain reachable.
    • Both list methods return -32601 simultaneously -> init still succeeds, tools remain reachable.
    • tools/list returns -32601 -> init still fails (regression guard).
    • resources/list returns INTERNAL_ERROR (non--32601) -> init still fails (regression guard).
    • prompts/list returns INVALID_PARAMS (non--32601) -> init still fails (regression guard).
  • task test (unit tests) and task lint-fix (linting) run clean on the touched package.
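The scenario matrix above can be expressed as a table of cases. The sketch below is not the test file from this PR: it replaces the fake streamable-http backend with plain error values, and `initBackend`, `scenario`, and the three error variables are hypothetical names used only to show the shape of the table.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for JSON-RPC error classes (assumptions for this sketch).
var (
	errMethodNotFound = errors.New("method not found") // -32601
	errInvalidParams  = errors.New("invalid params")   // -32602
	errInternal       = errors.New("internal error")   // -32603
)

// initBackend is a hypothetical reduction of the bootstrap policy:
// tools/list is strict, while resources/list and prompts/list tolerate
// only errMethodNotFound.
func initBackend(toolsErr, resourcesErr, promptsErr error) error {
	if toolsErr != nil {
		return fmt.Errorf("tools/list: %w", toolsErr)
	}
	optional := []struct {
		method string
		err    error
	}{
		{"resources/list", resourcesErr},
		{"prompts/list", promptsErr},
	}
	for _, o := range optional {
		if o.err != nil && !errors.Is(o.err, errMethodNotFound) {
			return fmt.Errorf("%s: %w", o.method, o.err)
		}
	}
	return nil
}

// scenario is one row of the table: injected errors and the expected outcome.
type scenario struct {
	name                               string
	toolsErr, resourcesErr, promptsErr error
	wantErr                            bool
}

func scenarios() []scenario {
	return []scenario{
		{name: "resources/list -32601", resourcesErr: errMethodNotFound},
		{name: "prompts/list -32601", promptsErr: errMethodNotFound},
		{name: "both -32601", resourcesErr: errMethodNotFound, promptsErr: errMethodNotFound},
		{name: "tools/list -32601", toolsErr: errMethodNotFound, wantErr: true},
		{name: "resources/list internal error", resourcesErr: errInternal, wantErr: true},
		{name: "prompts/list invalid params", promptsErr: errInvalidParams, wantErr: true},
	}
}

func main() {
	for _, sc := range scenarios() {
		err := initBackend(sc.toolsErr, sc.resourcesErr, sc.promptsErr)
		fmt.Printf("%-32s wantErr=%-5v gotErr=%v\n", sc.name, sc.wantErr, err != nil)
	}
}
```

In a real test file each row would typically run under `t.Run(sc.name, ...)` against the fake backend, with additional assertions on the returned tools, resources, and prompts.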

Additional Notes

  • Manual verification against the Atlassian Rovo MCP server (https://mcp.atlassian.com/v1/mcp) called out in the issue's acceptance criteria has not been performed in this PR; the unit tests exercise the same code path with a fake backend that produces the equivalent -32601 response. Reviewers who have the Rovo wiring available should confirm tools/list returns the workload-prefixed Atlassian tools after OAuth completes.
  • The same fragility class is worth a follow-up audit elsewhere in the session bootstrap (subscriptions, completions) — anywhere we derive "the server supports method X" from the initialize response and then unconditionally call X. Out of scope for this PR.

When a backend advertises resources or prompts capability in its
initialize response but returns JSON-RPC -32601 to resources/list or
prompts/list, treat that as "the backend has no resources/prompts"
rather than a fatal init error. This unblocks third-party MCP servers
(e.g. Atlassian Rovo) whose initialize response contradicts their
implemented method set, so users still get the backend's tools instead
of silently losing them.

Implements changes for issue #5231:
- Recover from errors.Is(err, mcp.ErrMethodNotFound) on resources/list
  and prompts/list with a WARN log naming the backend and method.
- Keep tools/list failures and non-(-32601) errors from list methods
  fatal so we are not silencing arbitrary failures.
- Add table-driven unit tests with a fake JSON-RPC backend covering
  the recoverable, fatal, and regression-guard cases.
@github-actions github-actions Bot added the size/M Medium PR: 300-599 lines changed label May 8, 2026
@tgrunnagle tgrunnagle marked this pull request as ready for review May 8, 2026 20:36
@codecov

codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.97%. Comparing base (9211a36) to head (a5648e9).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5232      +/-   ##
==========================================
+ Coverage   67.91%   67.97%   +0.05%     
==========================================
  Files         610      612       +2     
  Lines       62522    62741     +219     
==========================================
+ Hits        42464    42648     +184     
- Misses      16879    16909      +30     
- Partials     3179     3184       +5     


Contributor

@lorr1 lorr1 left a comment


Multi-agent review summary

Recommendation: COMMENT (non-blocking).

The fix is correct, well-scoped, and end-to-end safe. errors.Is(listErr, mcp.ErrMethodNotFound) is the right discriminator against the mcp-go v0.49.0 sentinel (JSONRPCErrorDetails.AsError() wraps it via fmt.Errorf("%w: ..."), so errors.Is matches both bare and wrapped variants). Switch ordering is nil-safe. tools/list strictness is preserved. Empty caps.Resources/caps.Prompts flow correctly through pkg/vmcp/session/factory.go and the aggregator merge — a downstream client resources/list against vMCP simply returns an empty merged set.

Four specialist reviewers (Go correctness, MCP protocol, test coverage, general code quality) consulted. Codex cross-review was attempted but timed out; not blocking.

Summary of inline findings

#  Severity  Theme
1  MEDIUM    Switch default arm (the success-path populate loop) is unverified
2  LOW       Comment phrasing ("spec violation" / "empty resource set")
3  LOW       wantToolsCalled bool redundant with int counters
4  LOW       assert.True(strings.Contains(...)) should be assert.Contains
5  LOW       WARN log lacks workload name and base URL
6  LOW       errors.Is won't match HTTP-level method-missing variants (404/501)

Out-of-scope / informational

  • Pre-existing: list calls do not loop on nextCursor — file a follow-up issue if pagination matters.
  • PR description references mcp-go v0.43.2; go.mod actually pins v0.49.0. Sentinel and wrapping semantics are identical, so the code is correct — doc nit only.
  • WARN may be noisy under reconnect churn; rate-limit per (backendID, method) only if it shows up in practice.
  • vMCP does not currently forward notifications/resources/list_changed, so the empty-recovery path does not create phantom subscriptions today.

Comment thread pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go
Comment thread pkg/vmcp/session/internal/backend/mcp_session.go Outdated
Comment thread pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go Outdated
Comment thread pkg/vmcp/session/internal/backend/mcp_session_capabilities_test.go Outdated
Comment thread pkg/vmcp/session/internal/backend/mcp_session.go
Comment thread pkg/vmcp/session/internal/backend/mcp_session.go
Addresses #5232 review comments:
- MEDIUM mcp_session_capabilities_test.go (3211850392): add success-path
  rows for resources and prompts so the switch's default arm (the populate
  loop) is exercised, including BackendID field-mapping assertions.
- LOW mcp_session.go (3211850393): tighten the recovery comment in both
  the resources and prompts switch arms to a single rationale-focused
  sentence.
- LOW mcp_session_capabilities_test.go (3211850399): replace
  wantToolsCalled bool with wantToolsCalls int so the assertion shape
  matches the resources/prompts counters and zero-call cases become
  expressible.
- LOW mcp_session_capabilities_test.go (3211850401): use
  assert.ErrorContains in place of assert.True(strings.Contains(...))
  and drop the now-unused strings import.
- LOW mcp_session.go (3211850404): enrich the WARN log in both arms
  with workload name and base URL so the breadcrumb is grep-friendly
  without an ID-to-name lookup.
- LOW mcp_session.go (3211850406): note in the recovery comment that
  HTTP-level method absence is intentionally fatal, scoping the
  tolerance to JSON-RPC -32601.
@github-actions github-actions Bot added size/M Medium PR: 300-599 lines changed and removed size/M Medium PR: 300-599 lines changed labels May 10, 2026
@tgrunnagle tgrunnagle merged commit c68fdb2 into main May 11, 2026
86 of 88 checks passed
@tgrunnagle tgrunnagle deleted the issue_5231 branch May 11, 2026 14:05

Labels

size/M Medium PR: 300-599 lines changed


Development

Successfully merging this pull request may close these issues.

vMCP backend init fails fatally when server advertises resources capability without resources/list

3 participants