Architecture

sarmakska edited this page Jun 7, 2026 · 3 revisions

mcp-server-toolkit is a thin, opinionated layer over the Model Context Protocol. A tool author writes one async function and gets schema validation, auth, rate limiting, tracing, and two transports for free.

graph TD
  C[MCP client]
  C -->|stdio JSON-RPC| S[transports/stdio.py]
  C -->|streamable HTTP POST /mcp| H[transports/http.py]
  H --> A[auth: api_key / oauth + rate limit]
  S --> B[protocol.dispatch_batch]
  A --> B
  B -->|per message| D[protocol.dispatch]
  D --> R[registry.Registry.call]
  R --> V[jsonschema validation]
  R --> T[OpenTelemetry span]
  R --> P1[plugins/filesystem]
  R --> P2[plugins/sarmalink]

Components

| Module | Responsibility |
| --- | --- |
| server.py | Lifecycle: select transport, set up telemetry, import plugins |
| protocol.py | MCP 1.0 JSON-RPC dispatch shared by both transports |
| registry.py | Decorator tool registry: schema generation, validation, span wrapping |
| transports/stdio.py | JSON-RPC 2.0 loop over stdin/stdout, one message per line |
| transports/http.py | FastAPI app: POST /mcp, REST /tools, /health, auth, rate limiting |
| auth/api_key.py | Constant-time API key comparison |
| auth/oauth.py | OAuth 2.1 resource server: JWT validation against issuer JWKS |
| auth/ratelimit.py | Per-client token bucket |
| oauth_client.py | OAuth 2.1 PKCE client flow for obtaining tokens |
| telemetry.py | OpenTelemetry tracer provider and structlog configuration |
| config.py | Settings from the environment, MCP_ prefix |
| cli.py | run, doctor, init, login |
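
The two auth primitives in the table, constant-time key comparison and a per-client token bucket, can be illustrated with the standard library alone. This is a minimal sketch, not the toolkit's code: check_api_key, TokenBucket, and their parameters are hypothetical names chosen for the example.

```python
import hmac
import time
from typing import Callable


def check_api_key(presented: str, expected: str) -> bool:
    """Compare two keys in constant time so timing does not leak a prefix match."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


class TokenBucket:
    """Hypothetical bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(
        self,
        capacity: float,
        rate: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A per-client limiter would keep one bucket per client identifier, for example in a dict keyed by API key or token subject. How the toolkit keys its buckets is not stated here.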

Protocol layer

Both transports parse the request body and hand it to protocol.dispatch_batch. That entry point routes a single JSON object to protocol.dispatch, and a top-level JSON array (a JSON-RPC 2.0 batch) to a concurrent fan-out over dispatch. protocol.dispatch is the single source of truth for MCP behaviour and handles the following methods:

  • initialize: negotiates the protocol version (newest supported wins if the client asks for something unknown) and returns serverInfo and capabilities.
  • notifications/initialized: acknowledged with no response, as notifications must be.
  • ping: returns an empty result.
  • tools/list: returns advertised tools, including outputSchema where declared.
  • tools/call: validates arguments, runs the handler, returns content blocks. A string becomes a text block; a dict is JSON-encoded and also returned as structuredContent. Validation errors map to -32602, unknown tools to -32601, and handler exceptions to a tool result with isError: true.
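
The tools/call rules above can be sketched as follows. The function names (to_tool_result, tool_error, rpc_error, call_tool_sketch) are hypothetical, argument validation is left out, and the fallback for return types other than str and dict is an assumption, since the page does not describe it.

```python
import json
from typing import Any, Awaitable, Callable, Dict

METHOD_NOT_FOUND = -32601  # unknown tool, per the mapping described above
INVALID_PARAMS = -32602    # validation failure (validation itself is not shown here)


def to_tool_result(value: Any) -> dict:
    """Turn a handler's return value into a tools/call result."""
    if isinstance(value, str):
        return {"content": [{"type": "text", "text": value}]}
    if isinstance(value, dict):
        return {
            "content": [{"type": "text", "text": json.dumps(value)}],
            "structuredContent": value,
        }
    # Assumption: anything else is stringified into a single text block.
    return {"content": [{"type": "text", "text": str(value)}]}


def tool_error(exc: Exception) -> dict:
    """A handler exception becomes a normal result flagged with isError."""
    return {"content": [{"type": "text", "text": str(exc)}], "isError": True}


def rpc_error(request_id: Any, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


async def call_tool_sketch(
    request_id: Any,
    name: str,
    arguments: dict,
    tools: Dict[str, Callable[..., Awaitable[Any]]],
) -> dict:
    handler = tools.get(name)
    if handler is None:
        return rpc_error(request_id, METHOD_NOT_FOUND, f"unknown tool: {name}")
    try:
        result = to_tool_result(await handler(**arguments))
    except Exception as exc:  # handler failures are reported in-band
        result = tool_error(exc)
    return {"jsonrpc": "2.0", "id": request_id, "result": result}
```

The split matters to clients: a protocol-level error means the request itself was wrong, while isError: true means the request was fine and the tool failed.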

Batch requests

dispatch_batch implements JSON-RPC 2.0 batching: members are dispatched concurrently with asyncio.gather, the responses are returned as an array in input order, and notification members (no id) are omitted. An empty array is rejected with -32600, a batch of only notifications produces no response body, and a non-object member yields a per-member -32600 error rather than failing the whole batch. MCP revision 2025-06-18 removed batching, so this stays a transport-level convenience for clients negotiating 2025-03-26 or 2024-11-05. The server nevertheless accepts a batch from any client, whatever version it negotiated, so older clients are not rejected.
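
The batching rules above can be sketched as follows. This is an illustration under assumptions, not the toolkit's source: dispatch_batch_sketch and its dispatch parameter are hypothetical, and dispatch is assumed to return None for notifications.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional, Union

INVALID_REQUEST = -32600

# Assumed shape of the per-message dispatcher: returns None for notifications.
Dispatch = Callable[[dict], Awaitable[Optional[dict]]]


def _error(request_id: Any, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


async def dispatch_batch_sketch(payload: Any, dispatch: Dispatch) -> Union[dict, list, None]:
    if isinstance(payload, dict):
        return await dispatch(payload)  # single message: no fan-out
    if not isinstance(payload, list):
        return _error(None, INVALID_REQUEST, "Invalid Request")
    if not payload:
        return _error(None, INVALID_REQUEST, "empty batch")

    async def one(member: Any) -> Optional[dict]:
        # A malformed member fails on its own instead of failing the batch.
        if not isinstance(member, dict):
            return _error(None, INVALID_REQUEST, "batch member must be an object")
        return await dispatch(member)

    # gather preserves input order even though members run concurrently.
    responses = await asyncio.gather(*(one(m) for m in payload))
    replies = [r for r in responses if r is not None]  # drop notifications
    return replies or None  # only notifications: no response body
```

Because asyncio.gather returns results in argument order, no extra bookkeeping is needed to keep responses aligned with requests.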

Tool registration flow

@registry.tool("search_docs", description="Search internal docs")
async def search_docs(query: str, limit: int = 10) -> dict:
    return {"results": [...]}

The decorator inspects type hints with typing.get_type_hints, maps each parameter to a JSON Schema fragment (str to string, int to integer, list[str] to an array of strings, X | None to the inner type), and marks parameters without a default as required. The schema sets additionalProperties: false. Handlers must be async; the decorator raises at registration time otherwise.
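
The mapping from type hints to JSON Schema can be sketched as follows. The helper names fragment and input_schema are hypothetical, and the handling of float, bool, dict, and multi-member unions is an assumption that goes beyond the cases listed above.

```python
import inspect
import types
from typing import Any, Callable, Union, get_args, get_origin, get_type_hints

_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}
# `X | None` yields types.UnionType on Python 3.10+, Optional[X] yields typing.Union.
_UNIONS = (Union, getattr(types, "UnionType", Union))


def fragment(tp: Any) -> dict:
    """Map one annotation to a JSON Schema fragment."""
    origin = get_origin(tp)
    if origin in _UNIONS:
        inner = [a for a in get_args(tp) if a is not type(None)]
        if len(inner) == 1:
            return fragment(inner[0])  # X | None collapses to the inner type
        return {"anyOf": [fragment(a) for a in inner]}  # assumption
    if origin is list or tp is list:
        args = get_args(tp)
        return {"type": "array", "items": fragment(args[0]) if args else {}}
    if origin is dict or tp is dict:
        return {"type": "object"}
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    return {}  # unknown annotation: accept anything


def input_schema(func: Callable[..., Any]) -> dict:
    """Build the input schema for an async handler, rejecting sync functions."""
    if not inspect.iscoroutinefunction(func):
        raise TypeError(f"{func.__name__} must be declared with async def")
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = fragment(hints.get(name, Any))
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }
```

Applied to search_docs above, this sketch would mark query as required and limit as optional, since only limit has a default.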

Validation and tracing

Registry.call validates arguments against the input schema, opens a span named tool.<name>, runs the handler, records duration and any error, then validates the return value against output_schema when declared. The span exports through OTLP when MCP_OTEL_ENDPOINT is set; otherwise tracing is a no-op and structured logs still flow to stderr.
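
The order of operations in Registry.call can be sketched with stand-ins so the example runs without third-party packages. Everything named here is hypothetical: validate stands in for a jsonschema-style validator taking (instance, schema), recording_span stands in for an OpenTelemetry span, and placing output validation after the span closes is an assumption.

```python
import time
from contextlib import contextmanager
from typing import Any, Awaitable, Callable, Iterator, List, Optional

Validator = Callable[[Any, dict], None]  # raises on invalid input


@contextmanager
def recording_span(name: str, sink: List[dict]) -> Iterator[dict]:
    """Stand-in for a tracing span: records name, duration, and any error."""
    record = {"name": name, "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        sink.append(record)


async def call_tool_sketch(
    name: str,
    handler: Callable[..., Awaitable[Any]],
    arguments: dict,
    input_schema: dict,
    output_schema: Optional[dict],
    validate: Validator,
    spans: List[dict],
) -> Any:
    validate(arguments, input_schema)  # 1. reject bad arguments before any work
    with recording_span(f"tool.{name}", spans):  # 2. span named tool.<name>
        result = await handler(**arguments)  # 3. run the handler
    if output_schema is not None:
        validate(result, output_schema)  # 4. check the return value if declared
    return result
```

Validating before the span opens keeps rejected calls out of the handler timing data, which is one plausible reading of the sequence described above.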

Transport selection

stdio: the client launches the server as a subprocess and exchanges JSON-RPC over stdin/stdout. Logs go to stderr so stdout stays a clean message channel, as the MCP specification requires.
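
A line-delimited stdio loop of this kind can be sketched as follows. The name serve_stdio_sketch and the handle callback are hypothetical, the blocking read inside an async function is a simplification, and the -32700 reply to unparseable input follows JSON-RPC 2.0 rather than anything stated on this page.

```python
import json
import sys
from typing import Any, Awaitable, Callable, Optional, TextIO

PARSE_ERROR = -32700  # JSON-RPC 2.0 code for invalid JSON


async def serve_stdio_sketch(
    handle: Callable[[Any], Awaitable[Any]],
    stdin: Optional[TextIO] = None,
    stdout: Optional[TextIO] = None,
    log: Optional[TextIO] = None,
) -> None:
    """Read one JSON-RPC message per line, write one reply per line."""
    stdin = stdin or sys.stdin
    stdout = stdout or sys.stdout
    log = log or sys.stderr  # diagnostics never touch stdout
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            payload = json.loads(line)
        except json.JSONDecodeError as exc:
            print(f"parse error: {exc}", file=log)
            reply = {
                "jsonrpc": "2.0",
                "id": None,
                "error": {"code": PARSE_ERROR, "message": "Parse error"},
            }
        else:
            reply = await handle(payload)
        if reply is not None:  # notifications produce no output
            stdout.write(json.dumps(reply) + "\n")
            stdout.flush()  # the client is waiting on a pipe
```

Flushing after each reply matters because stdout is a pipe to the client, and a buffered reply would leave the client waiting.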

Streamable HTTP: clients POST JSON-RPC to /mcp. Auth and rate limiting run as a FastAPI dependency on the protected routes; /health is always open for readiness probes. A small REST surface (GET /tools, POST /tools/{name}) is provided for quick inspection.

Same registry, same plugins, same handlers. The transport is a shell around protocol.dispatch.
