Observability

Two layers: structured logs always, span export when you opt in.

Structured logs

The server logs through structlog as JSON, written to stderr so the stdio transport keeps stdout reserved for JSON-RPC. Every tool call emits a tool_call event with the tool name and duration; failures emit tool_call_failed with the error. A successful call looks like this:

{"timestamp": "2026-05-31T12:34:56", "level": "info", "event": "tool_call", "tool": "list_files", "duration_ms": 1.42}

Pipe stderr to your log aggregator.
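If no aggregator is in place yet, the same stream can be summarized with a few lines of Python. The sketch below is only an illustration: the helper name summarize_tool_calls is hypothetical, and it assumes that tool_call_failed events carry the same "tool" field as tool_call events, which the example line above does not confirm.

```python
import json
from collections import defaultdict


def summarize_tool_calls(lines):
    """Group tool_call durations by tool and count tool_call_failed events.

    `lines` is any iterable of strings, e.g. the captured stderr of the server.
    """
    durations = defaultdict(list)
    failures = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything on stderr that is not a JSON log line
        if not isinstance(record, dict):
            continue
        tool = record.get("tool")  # assumption: both event types carry "tool"
        if record.get("event") == "tool_call":
            durations[tool].append(record.get("duration_ms", 0.0))
        elif record.get("event") == "tool_call_failed":
            failures[tool] += 1
    return dict(durations), dict(failures)
```

Feeding it sys.stdin lets it sit at the end of a shell pipeline that redirects the server's stderr.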

OpenTelemetry span export

Span export is opt-in. Set an OTLP collector endpoint and every tool call is exported as a span:

MCP_OTEL_ENDPOINT=https://otel.your-domain.com:4317
MCP_SERVICE_NAME=mcp-server-toolkit

Each span is named tool.<name> and carries mcp.tool.name, mcp.tool.argument_count, mcp.tool.duration_ms, and mcp.tool.error when the handler raises. The span is created in Registry.call, so it wraps validation and execution and records exceptions. When MCP_OTEL_ENDPOINT is unset, tracing is a no-op and only the structured logs flow.
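To make the span shape concrete, here is a plain-Python model of what one call might record. It deliberately does not use the OpenTelemetry SDK and is not the toolkit's Registry.call: traced_call, tracing_enabled, and the sink list are hypothetical names, and storing repr(exc) in mcp.tool.error is an assumption about that attribute's value.

```python
import os
import time


def tracing_enabled(environ=os.environ):
    """Tracing is on only when MCP_OTEL_ENDPOINT is set to a non-empty value."""
    return bool(environ.get("MCP_OTEL_ENDPOINT"))


def traced_call(name, handler, arguments, sink):
    """Run handler(**arguments) and append one span-like record to `sink`.

    The record mirrors the span name and attributes described above.
    """
    attributes = {
        "mcp.tool.name": name,
        "mcp.tool.argument_count": len(arguments),
    }
    start = time.perf_counter()
    try:
        return handler(**arguments)
    except Exception as exc:
        # assumption: the error attribute holds a string form of the exception
        attributes["mcp.tool.error"] = repr(exc)
        raise
    finally:
        # runs on success and on failure, so the duration is always recorded
        attributes["mcp.tool.duration_ms"] = (time.perf_counter() - start) * 1000.0
        sink.append({"name": f"tool.{name}", "attributes": attributes})
```

Recording in the finally clause is what lets a single record cover both outcomes, which matches the description of the span wrapping validation and execution.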

Health endpoint

GET /health is always open and returns:

{"ok": true, "tools_registered": 6, "uptime_seconds": 12345.6}

Use it as a readiness or liveness probe; the container image already wires it into a Docker HEALTHCHECK.
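Outside Docker, a probe can be a short script. The following is a minimal sketch using only the standard library; parse_health and probe are hypothetical helper names, and treating "ok": true as the sole health criterion is an assumption.

```python
import json
import urllib.request


def parse_health(body):
    """Return True only when the /health payload is an object with "ok": true."""
    payload = json.loads(body)
    return isinstance(payload, dict) and payload.get("ok") is True


def probe(base_url, timeout=2.0):
    """GET <base_url>/health; any network, HTTP, or parse error counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200 and parse_health(resp.read())
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; bad JSON raises ValueError
        return False
```

A wrapper that exits non-zero when probe(...) returns False is enough for most orchestrators.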

What to alert on

Signal	Threshold	Why
5xx rate	> 1% over 5 min	Internal errors
Tool latency P99 (`mcp.tool.duration_ms`)	> 5s	Downstream slowness
401 rate	sustained spike	Possible credential attack
429 rate	sustained	Rate limit too tight or abuse
Memory growth	sustained over 1h	Likely leak
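The first two rows can be checked mechanically. The sketch below assumes the caller has already collected the HTTP status codes and tool durations for the evaluation window; p99 and evaluate are hypothetical names, and nearest-rank is just one of several percentile definitions, so a monitoring backend may report slightly different values.

```python
def p99(durations_ms):
    """Nearest-rank 99th percentile; returns 0.0 for an empty window."""
    if not durations_ms:
        return 0.0
    ordered = sorted(durations_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def evaluate(statuses, durations_ms):
    """Return the names of the signals that breach the thresholds in the table."""
    alerts = []
    if statuses:
        server_errors = sum(1 for s in statuses if 500 <= s <= 599)
        if server_errors / len(statuses) > 0.01:  # > 1% 5xx
            alerts.append("5xx rate")
    if p99(durations_ms) > 5000:  # > 5s, expressed in milliseconds
        alerts.append("tool latency P99")
    return alerts
```

For example, a window of 100 requests with two 500 responses breaches the 5xx threshold, while a single slow call out of 100 does not move the P99.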

What to ignore

Occasional single tool timeouts; the client retries.
401s from idle fuzzing of a public URL.
Cold-start latency on the first request after a deploy.
