Skip to content

feat(core): add core/mcp and core/session packages#13

Merged
cjimti merged 2 commits intomainfrom
feat/bootstrap-mcp-session
May 7, 2026
Merged

feat(core): add core/mcp and core/session packages#13
cjimti merged 2 commits intomainfrom
feat/bootstrap-mcp-session

Conversation

@cjimti
Copy link
Copy Markdown
Contributor

@cjimti cjimti commented May 7, 2026

Closes #5. Part of #1 (project bootstrap, v0.1).

This is phase 4 of 7 in the v0.1 bootstrap. It lands the multi-server MCP client wrapper and the JSON Lines session store. With this merged, every collaborator the agent loop will need (events, provider, mcp, session) is in place — phase 5 (loop / router / approval) is the next step.

Summary

  • core/mcp/ — thin wrapper over github.com/modelcontextprotocol/go-sdk v1.6.0. Manages parallel connections to one or more MCP servers, aggregates the tool catalog with server__tool namespacing, routes tool calls back to the originating server, and retries transient failures with bounded backoff.
  • core/session/ — append-only chat history with JSON Lines persistence. Token-aware truncation behind a Tokenizer interface; v1 ships a byte-length heuristic.
  • Build tooling: Makefile fuzz-quick now uses an isolated mktemp fuzz cache so the corpus does not grow between local runs and trip the fuzz runner's deadline. FUZZTIME raised from 5s to 15s.

Phase mapping

Phase Issue Branch Status
1. Repo skeleton & hygiene #2 feat/bootstrap-skeleton merged (#9)
2. CI & release pipeline #3 feat/bootstrap-ci merged (#10)
3. core/event + core/provider #4 feat/bootstrap-event-provider merged (#11)
4. core/mcp + core/session (this PR) #5 feat/bootstrap-mcp-session open
5. core/loop + core/router + core/approval #6 feat/bootstrap-loop-router-approval next
6. cmd/ask CLI #7 feat/bootstrap-ask
7. examples/acme-revenue + ADR + architecture doc #8 feat/bootstrap-acme-example

What lands

core/mcp/client.go (742 lines)

  • Client: parallel connections to N MCP servers via errgroup. One-shot lifecycle — Connect must be called exactly once. Failure or Close latches the gate; constructing a new Client is the only way to retry.
  • ServerConfig validation at construction: rejects empty/whitespace Name, leading/trailing whitespace in Name, duplicate names, namespace separator in Name, empty/whitespace Endpoint, malformed HTTP URLs (url.Parse with scheme + host required), and stdio Endpoint with no command tokens. Every error wraps ErrConfig.
  • Headers field is wired: HTTP-based transports get a custom http.RoundTripper that injects the configured headers on every outbound request. Defensive copy at construction so caller-side mutation does not propagate.
  • Tool namespacing: server__tool separator (__). SplitName / JoinName parser pair, exhaustively fuzz-tested.
  • Aggregated Catalog: sorted deterministically (server name, then bare tool name) so two Connect calls produce identical Tools ordering even though the underlying sessions map is randomized. Defensive copies through Catalog.copy.
  • Pagination: Resources, Prompts, and the catalog fetch use the SDK's iterator helpers (sess.Resources/Prompts/Tools returning iter.Seq2) so multi-page servers are fully enumerated. Lock-and-release pattern: the client lock is held only long enough to copy the session pointer, then released for iteration. A slow listing on one server cannot starve concurrent operations on other servers.
  • Call: routes by SplitName; retries on every non-context error with bounded backoff (exponential + full jitter, default 500ms base / 30s cap / 5 attempts). Returns ErrServerUnavailable on exhaustion. Tool-level errors flow through ToolResult.IsError, not Go errors.
  • Backoff.Delay: clamps negative attempts to 0; caps the shift at 31 to avoid wraparound; clamps math.MaxInt64 so int64(d)+1 doesn't overflow into MinInt64.
  • WithDialer option lets tests inject in-memory transports per-Client (no package globals) so parallel tests are race-free.
  • contentToToolContent: recognizes TextContent, ImageContent, AudioContent, EmbeddedResource, ResourceLink. Unknown SDK content types fall back to a JSON dump in Text with a diagnostic on marshal failure.
  • Stdio subprocess env inheritance: documented at package level with the recommended hardening (operator-side env scrub).
  • Sentinels: ErrConfig, ErrUnknownServer, ErrServerUnavailable, ErrInvalidName.

core/mcp/catalog.go (93 lines)

  • Catalog struct with flat Tools slice and ToolsByServer map.
  • Toolkit grouping with a pluggable ToolkitClassifier.
  • PrefixClassifier groups by the substring before the first underscore in the bare name (the Plexara datahub_* / trino_* / s3_* / memory_* pattern). Names with no underscore, leading underscore (_foo), or empty bare name fall into misc.

core/session/session.go (324 lines)

  • Session is an append-only chat history value with ID, Created, Updated, and []provider.Message. Intentionally not concurrency-safe (caller serializes).
  • Tokenizer interface; LengthHeuristic ships as the v1 implementation (bytes / 4 by default). Replaceable for v1.1.
  • Truncate drops oldest non-system messages until under maxTokens. The system prompt is never dropped, even when it alone exceeds the budget. A single oversize non-system message is dropped along with everything before it; the session may be left empty.
  • Save / Load use JSON Lines: a session.header envelope followed by N session.message envelopes. Forward-compat skipping for unknown envelope types; loud-fail (ErrFormat) on missing type field.
  • messageEnvelope and toolCallEnvelope carry explicit snake_case JSON tags so the persisted format stays interoperable. Upstream provider.ToolCall (no JSON tags) never reaches the wire directly.

Tests (1247 lines across 4 files)

  • core/mcp/client_test.go (730 lines, 17+ tests) — uses an in-memory sdkmcp.Server wired through WithDialer for hermetic testing. Covers: SplitName/JoinName round-trip, all New validation paths, parallel Connect aggregation, Call routing, unknown server, invalid name, parallel-connect-failure cleanup, backoff bounds and growth, post-Close ErrUnknownServer, twice-Connect rejection, failed-Connect latching, deterministic catalog ordering, multi-page pagination (server with PageSize: 1), defensive catalog copy.
  • core/mcp/catalog_test.goPrefixClassifier cases, custom classifiers, empty catalogs, leading-underscore names.
  • core/mcp/fuzz_test.goFuzzSplitName round-trip; uses errors.Is(err, ErrInvalidName) for sentinel checks.
  • core/session/session_test.go (325 lines, 13 tests) — New, Append updates timestamp, full Save / Load round-trip, error cases, unknown envelope skip, missing-type rejection, snake-case wire format pin, blank-line tolerance, LengthHeuristic math, all Truncate edge cases.
  • core/session/fuzz_test.goFuzzLoad decoder fuzz with reproducible (fixed-timestamp) seeds.

Local quality gate

Check Result
make verify (full gate) PASS end-to-end
go test -race -shuffle=on -count=1 ./... all packages green
core/event coverage 90.0%
core/mcp coverage ~91%
core/provider coverage 93.2%
core/session coverage ~96%
4 fuzz targets 15s each, 0 panics
golangci-lint run ./... 0 issues
Semgrep (Docker, OWASP / golang / secrets / security-audit) 0 findings
govulncheck no vulnerabilities
5-arch cross-compile build matrix all green

Process — Pre-commit Review Loop (PRL)

Followed the local review-gate hook on this commit. The hook ran an adversarial sub-agent review across two diff hashes (the hash flips when fixes change the diff, restarting the round counter). Across all rounds, ~50 findings were surfaced. Every critical bug was fixed in the working tree before commit. Documented design choices were left in place with explicit code comments. The commit body enumerates every addressed and every deferred finding for traceability.

Critical bugs caught and fixed

  • Headers field was documented but never wired into HTTP transports — silent auth-token drop. Now flows through a custom http.RoundTripper.
  • Backoff.Delay panicked on negative attempts (runtime error: negative shift amount).
  • Backoff.Delay could overflow int64(d) + 1 when Cap = math.MaxInt64, causing rand.Int64N to panic.
  • Close cleared sessions but left a stale catalog readable via Catalog().
  • A failed Connect did not latch the gate; a retry could partially re-initialize.
  • Resources/Prompts held the read-lock for full iteration → starved concurrent ops on unrelated servers.
  • SDK iterator-based pagination was missing — silent truncation past page 1.
  • provider.ToolCall persistence used Pascal-case JSON keys (no JSON tags upstream) — wire format would not interop with any other reader.
  • A session line with empty type field was silently dropped on Load.
  • Fuzz seeds called time.Now() — corpus was non-reproducible.
  • Missing case for ResourceLink in contentToToolContent.
  • convertToolResult panicked on nil result.
  • Whitespace-only Endpoint passed New; whitespace in Name also now rejected.
  • Catalog ordering was non-deterministic across Connect calls.
  • HTTP Endpoint URL syntax not validated at New (now url.Parse with scheme + host required).
  • fakeServer goroutines could call t.Logf after test end.
  • Backoff retry test would pass with a hardcoded zero return; added a growth assertion.

Acknowledged design choices, documented in code

  • Catalog is a one-time snapshot at Connect time; no live-refresh on tools/list_changed notifications in v1.
  • Connect holds the write-lock for full I/O duration; ctx cancellation is the abort path.
  • Save is non-atomic; callers do temp+rename.
  • Session is not concurrency-safe; caller serializes.
  • Call retries on every non-context error in v1; future versions may classify jsonrpc2 codes (-32601 Method not found, -32602 Invalid params) as non-retryable.

Out of scope (deferred to later phases)

  • core/loop agent loop and core/router toolkit classifier (phase 5)
  • core/approval (phase 5)
  • SQL safety middleware (v0.2 per spec)
  • Token-aware truncation via runtime /tokenize (v0.3)
  • Live MCP tools/list_changed notification handling (post-v1)

Test plan

  • CI lint passes: gofmt, go vet, golangci-lint, golangci-lint config verify, go mod verify, go mod tidy -diff
  • test (ubuntu-latest) and test (macos-latest) green; coverage upload with the core flag (best-effort while Codecov registration is pending — see chore(ci): re-enable codecov fail_ci_if_error after Codecov registration #12)
  • build matrix green across darwin/{amd64,arm64}, linux/{amd64,arm64}, windows/amd64
  • gosec clean; semgrep clean; govulncheck clean; trivy fs clean
  • analyze go (CodeQL) clean
  • dependency-review reports the new transitive deps (github.com/modelcontextprotocol/go-sdk, github.com/google/jsonschema-go, golang.org/x/oauth2, etc.); confirm all licenses are within the allowlist (Apache-2.0, BSD-2/3-Clause, MIT, etc.)
  • conventional commit PR title check passes (feat(core): ...)
  • ci pass aggregator green
  • After merge, phase 5 (Phase 5: core/loop + core/router + core/approval #6) can wire the agent loop on top of the now-stable Provider, Client, and Session surfaces

Implements phase 4 of the v0.1 bootstrap (#1). Adds the
multi-server MCP client wrapper and the JSON Lines session store,
both built against [github.com/modelcontextprotocol/go-sdk] v1.6.0.

core/mcp/
  - Client manages parallel connections to one or more MCP
    servers via errgroup; one-shot lifecycle (Connect must be
    called exactly once; failure or Close latches the gate).
  - ServerConfig validation at construction: rejects empty/
    whitespace Name, leading/trailing whitespace in Name,
    duplicate names, namespace-separator in Name, empty/
    whitespace Endpoint, malformed HTTP URLs (url.Parse with
    scheme + host required), and stdio Endpoint with no
    command tokens. All errors wrap ErrConfig.
  - Headers field is now wired into HTTP-based transports via
    a custom http.RoundTripper that injects them on every
    request. Defensive copy at construction so post-construction
    mutation does not propagate.
  - Tool names namespaced as "server__tool"; SplitName /
    JoinName parser pair, FuzzSplitName fuzz target.
  - Aggregated Catalog sorted deterministically (server name,
    then bare tool name). Defensive copies through Catalog.copy.
  - Resources / Prompts / Tools use SDK iterator helpers for
    transparent pagination. Lock-and-release pattern so a slow
    listing on one server does not starve concurrent ops on
    others.
  - Call routes by SplitName, retries with bounded backoff
    (exponential + full jitter, default 500ms base / 30s cap /
    5 attempts). Backoff.Delay clamps negatives to 0, caps shift
    at 31, guards math.MaxInt64 overflow.
  - WithDialer lets tests inject in-memory transports per-Client
    (no package globals) so parallel tests are race-free.
  - PrefixClassifier groups tools by prefix-before-first-
    underscore. Names with no underscore, leading underscore,
    or empty bare name fall into "misc".
  - contentToToolContent recognizes TextContent, ImageContent,
    AudioContent, EmbeddedResource, ResourceLink. Unknown SDK
    content types fall back to a JSON dump in Text with a
    diagnostic on marshal failure.
  - Stdio subprocess env inheritance documented at package
    level with the recommended hardening.

core/session/
  - Session is an append-only chat history value
    (intentionally not concurrency-safe; serialize at the
    caller).
  - Tokenizer interface with v1 LengthHeuristic
    (bytes-per-token, default 4); v1.1 may swap in a real
    tokenizer behind the same interface.
  - Truncate drops oldest non-system messages until under
    budget; system prompt is never dropped. Edge cases
    documented (single oversize message, oversize system
    prompt, may leave Messages empty).
  - Save / Load use JSON Lines: a "session.header" envelope
    followed by N "session.message" envelopes. Forward-compat
    skipping for unknown envelope types; loud-fail on missing
    type field.
  - messageEnvelope and toolCallEnvelope have explicit
    snake_case JSON tags so the persisted format stays
    interoperable; upstream provider.ToolCall (no JSON tags)
    never reaches the wire directly.
  - FuzzLoad guards the decoder against arbitrary input. Fuzz
    seeds use a fixed timestamp (not time.Now()) for corpus
    reproducibility.

Build tooling
  - Makefile fuzz-quick uses an isolated mktemp fuzz cache so
    the corpus does not grow between local runs and trip the
    fuzz runner's deadline. FUZZTIME default raised from 5s
    to 15s. Per-target -timeout=120s headroom.

Process
  Followed the Pre-commit Review Loop (PRL) under the local
  review-gate hook. Multiple adversarial-review rounds across
  two diff hashes surfaced 18 + 9 + 1 + 22 ≈ 50 findings total.
  Critical / real-bug findings addressed in the working tree:
    - Headers field documented but never wired into transports.
    - Backoff.Delay panicked on negative attempts.
    - Backoff.Delay int64+1 overflow with math.MaxInt64 cap.
    - Close left a stale catalog readable via Catalog().
    - Failed Connect did not latch the gate (could re-attempt).
    - Resources/Prompts held the RLock long enough to starve
      concurrent ops on unrelated servers.
    - SDK iterator-based pagination was missing (silent
      truncation past page 1).
    - provider.ToolCall persistence used Pascal-case JSON keys.
    - Empty type field in a session line was silently dropped.
    - Fuzz seed used time.Now() (corpus not reproducible).
    - Missing ResourceLink case in contentToToolContent.
    - convertToolResult panicked on nil result.
    - Whitespace-only Endpoint passed New (Now also rejects
      whitespace in Name).
    - Catalog ordering non-deterministic (Tools order varied).
    - URL syntax not validated for HTTP transports at New.
    - fakeServer goroutines could call t.Logf after test end.
    - Backoff retry test would pass with hardcoded 0 (added
      growth assertion).

  Acknowledged design choices documented in code:
    - Catalog is a one-time snapshot; no live-refresh in v1.
    - Connect holds the write-lock for full I/O duration;
      ctx cancellation is the abort path.
    - Save is non-atomic; caller uses temp+rename.
    - Session is not concurrency-safe.
    - Call retries on every non-context error; future versions
      may classify jsonrpc2 codes.

Verified locally
  - make verify             PASS end-to-end
  - go test -race           all packages green
  - 4 fuzz targets          15s each, 0 panics
  - 5-arch cross-compile    all green

Closes #5.

Reviewed-by: pre-commit-gate (artifact .claude/.last-review.md)
Comment thread core/mcp/client.go Fixed
Comment thread core/mcp/client.go Fixed
Comment thread core/mcp/client.go Fixed
…osec suppression for exec

Two gosec findings on PR #13 came through GitHub Advanced Security
even though `make verify` was clean locally. Root cause: the
suppressions used `//nolint:gosec` (golangci-lint dialect) but the
standalone gosec binary that writes the SARIF Advanced Security
ingests does not honor `nolint`. Two findings:

  1. G404 — math/rand/v2 used for backoff jitter.
  2. G204 — exec.Command with variable argv (the stdio MCP server
     command line from operator config).

For G404 (jitter): the right fix is not "suppress better," it's
"use the better function." Switched to `crypto/rand.Int(rand.Reader,
big.NewInt(int64(d)+1))`. The per-call cost is negligible against
the network I/O backoff is gating, and there is no longer anything
to suppress for any scanner. The math/rand and math/big imports are
also cleaned up; the elaborate dual-suppression comment block is
gone.

For G204 (exec): there is no "better function" — exec.Command IS
the API for stdio subprocesses, and argv genuinely comes from the
maintainer's `ServerConfig` (trusted by design). Replaced the
`//nolint:gosec` annotation with gosec's native `// #nosec G204 --
argv is operator-trusted config` syntax, which is honored by both
the standalone gosec used in security.yml and by gosec running
inside golangci-lint.

Backoff.Delay now handles the (effectively impossible) error from
rand.Int by returning the clamped delay as a safe fallback rather
than panicking — `crypto/rand.Reader` is documented as always
available on supported platforms, but defense in depth.

Verified locally
  - make verify                  PASS end-to-end
  - go test -race                 backoff tests still green
  - TestBackoff_GrowsAcrossAttempts (statistical) still passes
  - golangci-lint                 0 issues
  - gosec (standalone)            0 issues (verified G404 absent,
                                   G204 properly suppressed via
                                   #nosec)
  - semgrep                       0 findings

Reviewed-by: pre-commit-gate (artifact .claude/.last-review.md)
@cjimti cjimti merged commit c25e1c2 into main May 7, 2026
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Phase 4: core/mcp + core/session

2 participants