Testing toolkit for Go MCP server authors.
The Model Context Protocol (MCP) ecosystem in Go has two competing server frameworks (mark3labs/mcp-go and modelcontextprotocol/go-sdk) plus a growing set of domain-specific MCP servers (GitHub, Grafana, Kubernetes, Terraform, …). Every project ends up hand-rolling roughly the same test plumbing: an in-process client to drive the server, a way to record real sessions for regression tests, and an assertion harness for tool behaviour.
mcpharness fills that gap with a small SDK-neutral surface.
- mcpharness.Client: the neutral interface every adapter implements. Two adapters ship today: mark3 for mark3labs/mcp-go (8.7k ⭐, the de-facto Go MCP framework), and sdk for modelcontextprotocol/go-sdk (4.6k ⭐, the official Anthropic SDK).
- Recorder wraps any Client and writes every call (initialize, tools/list, tools/call, resources/list, resources/read) to a JSON Lines stream.
- Replay reads a recorded stream back and returns a deterministic Client that asserts each call matches the recording. It catches three regression classes: wrong method, wrong params, extra/missing calls.
- FuzzCallTool plugs any Client + tool name into Go's native *testing.F fuzz infrastructure. It applies a per-iteration timeout, fails on panic / hang / transport error, and accepts IsError=true as a handled-error signal.
- Snapshot: golden-file regression for any value with stable JSON canonicalisation. The first run creates the baseline; subsequent runs diff. Set MCPHARNESS_UPDATE_SNAPSHOTS=1 to bulk-regenerate.
- conformance.Run: bridge to Anthropic's official conformance test harness. Drives npx @modelcontextprotocol/conformance from go test and fails loudly on any scenario regression. Skips automatically when Node.js is unavailable.
You can test with your framework's own client, and for simple smoke tests you should. mcpharness exists for the moments when one client isn't enough:
- Test against multiple framework versions without rewriting tests: the neutral Client interface stays put when the underlying framework's request types churn.
- Record once, replay forever: capture a real session against your production server, commit the file, then run deterministic regression tests in CI without standing up the real server.
- Catch divergence early: Replay fails loudly on wrong method, wrong params, or missing/extra calls, so a behaviour drift surfaces as a precise test failure rather than a silent wrong assertion.
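The first point, a neutral interface that absorbs request-type churn, is the ordinary adapter pattern. The following is a minimal sketch under stated assumptions: ToolCaller, reqA, reqB, and both adapters are invented for illustration and are not the real mcpharness.Client, mark3, or sdk types. It only shows how one check can run unchanged against two differently shaped backends.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolCaller is a stand-in for a neutral client surface. The real
// mcpharness.Client has more methods; this sketch keeps one.
type ToolCaller interface {
	CallTool(name string, args map[string]any) (string, error)
}

// Two imaginary frameworks exposing the same tool through different
// request types, the kind of churn an adapter is meant to absorb.
type reqA struct {
	Tool string
	Args map[string]any
}

type reqB struct {
	Name      string
	Arguments map[string]any
}

func handleA(r reqA) (string, error) { return echo(r.Tool, r.Args) }
func handleB(r reqB) (string, error) { return echo(r.Name, r.Arguments) }

// echo is the shared tool logic behind both imaginary frameworks.
func echo(tool string, args map[string]any) (string, error) {
	if tool != "echo" {
		return "", fmt.Errorf("unknown tool %q", tool)
	}
	text, ok := args["text"].(string)
	if !ok {
		return "", errors.New("missing string argument: text")
	}
	return text, nil
}

// Each adapter translates the neutral call into its framework's request type.
type adapterA struct{}
type adapterB struct{}

func (adapterA) CallTool(name string, args map[string]any) (string, error) {
	return handleA(reqA{Tool: name, Args: args})
}

func (adapterB) CallTool(name string, args map[string]any) (string, error) {
	return handleB(reqB{Name: name, Arguments: args})
}

// checkEcho is written once against the neutral interface.
func checkEcho(c ToolCaller) error {
	got, err := c.CallTool("echo", map[string]any{"text": "ping"})
	if err != nil {
		return err
	}
	if got != "ping" {
		return fmt.Errorf("got %q, want %q", got, "ping")
	}
	return nil
}

func main() {
	fmt.Println("A:", checkEcho(adapterA{}))
	fmt.Println("B:", checkEcho(adapterB{}))
}
```

If a framework renames a request field, only its adapter changes; checkEcho and every other test written against the interface stay as they are.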
go get github.com/ultramcu/mcpharness@latest
package mcpserver_test

import (
	"context"
	"os"
	"testing"

	"github.com/mark3labs/mcp-go/mcp"
	"github.com/mark3labs/mcp-go/server"
	"github.com/ultramcu/mcpharness"
	"github.com/ultramcu/mcpharness/mark3"
)

func TestEcho(t *testing.T) {
	// Build a server with a single "echo" tool.
	srv := server.NewMCPServer("echo", "0.1.0")
	srv.AddTool(
		mcp.NewTool("echo", mcp.WithString("text", mcp.Required())),
		func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
			text, _ := req.Params.Arguments.(map[string]any)["text"].(string)
			return mcp.NewToolResultText(text), nil
		},
	)

	// Drive it in-process through the mark3 adapter.
	client, err := mark3.New(srv)
	if err != nil {
		t.Fatal(err)
	}
	defer client.Close()

	if _, err := client.Initialize(context.Background()); err != nil {
		t.Fatal(err)
	}
	res, err := client.CallTool(context.Background(), "echo", map[string]any{"text": "ping"})
	if err != nil {
		t.Fatal(err)
	}
	if res.IsError {
		t.Fatal("tool returned IsError")
	}
	// res.Content[0] == map[string]any{"type":"text", "text":"ping"}
}
// Phase 1: record a real session into testdata/echo.jsonl
func TestRecord(t *testing.T) {
	ctx := context.Background()
	f, err := os.Create("testdata/echo.jsonl")
	if err != nil {
		t.Fatal(err)
	}
	defer f.Close()
	// buildServer is assumed to be your own helper returning the server under test.
	real, err := mark3.New(buildServer(t))
	if err != nil {
		t.Fatal(err)
	}
	rec := mcpharness.NewRecorder(real, f)
	defer rec.Close()
	rec.Initialize(ctx)
	rec.CallTool(ctx, "echo", map[string]any{"text": "ping"})
}

// Phase 2: in CI, replay deterministically without the real server
func TestReplay(t *testing.T) {
	ctx := context.Background()
	f, err := os.Open("testdata/echo.jsonl")
	if err != nil {
		t.Fatal(err)
	}
	defer f.Close()
	replay := mcpharness.NewReplay(t, f)
	defer replay.Close() // fails the test if any recorded entries were not consumed
	replay.Initialize(ctx) // asserts seq=1 matches
	replay.CallTool(ctx, "echo", map[string]any{"text": "ping"}) // asserts seq=2 matches
}
If the second-phase test calls a method that doesn't match the recording, or passes different params, Replay calls t.Fatalf with a precise diff, so there is no silent drift.
import "github.com/ultramcu/mcpharness/conformance"

func TestMCPConformance(t *testing.T) {
	srv := startMyServerOnRandomPort(t) // your own HTTP transport setup
	conformance.Run(t, srv.URL)         // skips if npx not on PATH
}
Narrow the run to a single suite for faster iteration:
conformance.Run(t, srv.URL, conformance.WithSuite("core"))
func FuzzEchoTool(f *testing.F) {
	srv := buildEchoServer()
	client, err := mark3.New(srv)
	if err != nil {
		f.Fatal(err)
	}
	defer client.Close()
	// The trailing maps are seed inputs for the fuzz corpus.
	mcpharness.FuzzCallTool(f, client, "echo",
		map[string]any{"text": "hello"},
		map[string]any{"text": ""},
		map[string]any{},
	)
}
// go test -fuzz=FuzzEchoTool -fuzztime=30s ./...
Inputs that don't decode as JSON objects are silently skipped. Inputs that make the tool panic, hang past the per-iteration timeout, or surface a transport error fail the fuzz iteration, but a tool returning IsError=true is treated as a valid handled-error path.
res, _ := client.CallTool(ctx, "echo", map[string]any{"text": "ping"})
mcpharness.Snapshot(t, "echo-ping", res)
First run writes testdata/snapshots/echo-ping.json and logs that a baseline was created. Subsequent runs compare byte-for-byte after stable JSON canonicalisation. To intentionally regenerate after a behaviour change, set the env var:
MCPHARNESS_UPDATE_SNAPSHOTS=1 go test ./...
- v0.1: Client + Recorder + Replay + mark3labs adapter. (shipped)
- v0.2: adapter for modelcontextprotocol/go-sdk; conformance.Run bridge to the official npx @modelcontextprotocol/conformance harness. (shipped)
- v0.3 (this release): FuzzCallTool harness on top of Go's native *testing.F; Snapshot golden-file helper with MCPHARNESS_UPDATE_SNAPSHOTS env override.
- v0.4+: HTTP-transport spawner helper to make the conformance bridge fully turnkey; resource-template support in the Client surface; multi-content ReadResource accessor.
mcpharness follows SemVer. Until 1.0, any minor version bump may include breaking API changes (we'll keep them minimal and well-documented in the CHANGELOG).
Issues and PRs welcome. Please open an issue first for any non-trivial change so we can align on direction before you spend time on a PR.
MIT © 2026 ultramcu