Ambiguity in capability tests for SEP-2575 #279

@anubhav756

Problem

In the stateless scenario of the conformance suite (#273), the mock server runs a simple lifecycle test (negotiating version, listing tools, and cancelling a tool call). It does not execute any roots or elicitation flows.

Because these optional capabilities are never exercised on the wire during this test run, the mock server cannot tell a correct client from a faulty one:

  • If the mock server sees that a client omits elicitation, how does it know whether the client is correct or is buggy and forgot to declare it?
  • If the mock server sees that a client declares elicitation, how does it know whether the client is correct or is lying and doesn't actually support it?

Background

Optional vs. Required Capabilities

  • Client capabilities like roots (filesystem access), sampling (AI model queries), and elicitation (user prompt inputs) are optional capabilities in the MCP spec.
  • These tests verify the optional client capabilities in the stateless MCP scenario.
  • Note: A client is conformant under the spec even if it only supports tools and omits the others entirely.

Implication Rule

  • The specification dictates: "Clients that support [optional capability, e.g., elicitation] MUST declare it."
  • If a client does not support a capability, it MUST NOT declare it (otherwise it misleads the server, and the flow fails once the server tries to use that capability).
  • Therefore: {Supports Capability} iff {Declares Capability}
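The biconditional above has four cases, two of which are violations. A minimal Python sketch that enumerates them; the function name `consistent` and the labels are illustrative only and do not come from the conformance suite:

```python
from itertools import product


def consistent(supports: bool, declares: bool) -> bool:
    """True when the client honours 'supports iff declares'."""
    return supports == declares


# Enumerate all four combinations of (supports, declares).
for supports, declares in product([True, False], repeat=2):
    if consistent(supports, declares):
        label = "ok"
    elif supports:
        label = "forgetful: supports but does not declare"
    else:
        label = "lying: declares but does not support"
    print(f"supports={supports!s:5} declares={declares!s:5} -> {label}")
```

The difficulty raised in this issue is that the mock server only observes `declares`; `supports` is never visible in the stateless scenario.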

Potential Solutions

Option 1 (Recommended): Permissive Check

  • If the capability is present → verify it is a valid object {} → SUCCESS.

  • If absent → SUCCESS (since it is optional).

  • Pros:

    • Simple and client-blind.
    • Requires no extra config.
  • Cons:

    • Weak verification.
      • A buggy client that forgot to declare it passes anyway.
      • A lying client that erroneously declared it passes anyway.
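As a rough illustration of Option 1, the check could look like the sketch below. It assumes the client's declared capabilities arrive as a parsed JSON object (a Python `dict`); the function name `permissive_check` and the string results are hypothetical, not the suite's actual API:

```python
from typing import Any, Mapping


def permissive_check(capabilities: Mapping[str, Any], name: str) -> str:
    """Return 'SUCCESS' or 'FAILURE' for one optional capability."""
    if name not in capabilities:
        # Absent: the capability is optional, so there is nothing to verify.
        return "SUCCESS"
    # Present: it must be a JSON object (a dict once parsed), e.g. {}.
    return "SUCCESS" if isinstance(capabilities[name], dict) else "FAILURE"
```

For example, `permissive_check({}, "elicitation")` and `permissive_check({"elicitation": {}}, "elicitation")` both succeed, while `permissive_check({"elicitation": True}, "elicitation")` fails. Neither a forgetful nor a lying client is distinguishable here, which is the weakness listed under Cons.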

Option 2: Strict Assertion + Conformance Baseline

  • The mock server strictly asserts that the client MUST declare all capabilities.

  • If any are missing → FAILURE.

  • Is the capability supported by the client under test?

    • Yes

      • Declares all → passes with SUCCESS.
      • Forgets to declare one → FAILURE (build breaks).
    • No

      • Omits them → the checks fail in the harness.
      • The developer lists these failures in the expected-failures file (conformance-baseline.yml).
      • CI passes cleanly.
      • If the client erroneously declares them → the check passes → the baseline becomes stale → the runner fails the build (catches the lie!).
  • Pros:

    • Catches both forgetful and lying clients.
    • Keeps the mock server code clean and client-blind.
    • Uses standard conformance baseline mechanisms.
  • Cons:

    • Tools-only clients (like MCP Toolbox) are perfectly conformant with the stateless protocol, yet they would be forced to add these failures to their conformance-baseline.yml files.
    • The MCP Working Group could find this annoying.
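The baseline mechanics this option relies on can be sketched as a set comparison. This is a simplified model, not the runner's real code: it assumes the baseline could list individual checks, and the entry name `declares-elicitation` is made up for illustration:

```python
def reconcile(failed: set[str], baseline: set[str]) -> tuple[bool, list[str]]:
    """Compare actual failures against the expected-failures baseline.

    Returns (build_ok, problems). The build is ok only when the two sets match.
    """
    unexpected = sorted(failed - baseline)  # failed, but not listed as expected
    stale = sorted(baseline - failed)       # listed as expected, but now passes
    problems = [f"unexpected failure: {name}" for name in unexpected]
    problems += [f"stale baseline entry: {name}" for name in stale]
    return (not problems, problems)
```

A tools-only client that omits elicitation and lists the hypothetical `declares-elicitation` entry gets `reconcile({"declares-elicitation"}, {"declares-elicitation"})`, which reports a clean build. If that client later declares elicitation, the check passes, the failed set becomes empty, and the stale entry fails the build.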

Reason for Rejection

  • The baseline file (conformance-baseline.yml) only accepts scenario names (e.g., - tools_call, - sse-retry, - auth/metadata-default).
  • It does not support baseline tracking at the individual check slug level, so a single capability check cannot be listed as an expected failure.
  • So this option is not applicable.

Option 3: Probe-and-Callback

  • The mock server actively probes the client during a tool call by returning an InputRequiredResult containing an ElicitRequest.
    • A client that supports it must resolve and retry (declaring support).
    • A client that doesn't must fail/abort.
  • Pros: Dynamic and client-blind.
  • Cons:

    • High runtime complexity.
    • Duplicates the elicitation defaults scenario logic in the simple stateless scenario.
    • Risks race conditions and CI timeouts.
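The classification step of the probe could be sketched as follows. Everything here is hypothetical: the dictionary shapes only stand in for InputRequiredResult and ElicitRequest and do not reproduce their real schemas, and the `elicitation_result` key is an invented marker for "the client resolved the probe and retried":

```python
from typing import Optional


def make_probe() -> dict:
    """Build the stand-in probe the mock server returns from a tool call."""
    return {
        "kind": "InputRequiredResult",  # hypothetical field layout
        "request": {"kind": "ElicitRequest", "message": "conformance probe"},
    }


def classify(declared: bool, retry: Optional[dict]) -> str:
    """Judge the client from its declaration and its reaction to the probe.

    `retry` is the retried request, or None if the client failed/aborted.
    """
    resolved = retry is not None and "elicitation_result" in retry
    if declared == resolved:
        return "SUCCESS"
    if declared:
        return "FAILURE: declared elicitation but did not resolve the probe"
    return "FAILURE: resolved the probe without declaring elicitation"
```

The sketch leaves out the parts that make this option expensive: waiting for the retry, timing out a client that never answers, and correlating the retry with the original tool call.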

Recommendation

I recommend adopting Option 2 because it provides a verification guarantee in both directions of the implication, requires no runtime complexity, and is client-blind and spec-compliant.
