Skip to content

P2.3 Move middleware chain under Serve; remove authz + annotation mw #5441

@tgrunnagle

Description

@tgrunnagle

Description

Relocate the full HTTP middleware chain so the new Serve-built *Server uses it —
recovery, header-validation, write-timeout (SSE), auth, audit (+ backend enrichment),
telemetry, MCP-parsing — preserving the existing wrapping/execution order. On the
Serve path, authz and annotation-enrichment are omitted via the existing
cfg.AuthzMiddleware != nil nil-guard
: Serve leaves AuthzMiddleware nil, so the
guards at server.go:606/:614 skip both blocks, and authorization is enforced instead
by the core admission seam from #5438. The authz/annotation blocks are NOT deleted
in Phase 2
— they remain in the shared (*Server).Handler so the still-live
server.New path keeps enforcing authz with no regression; their physical removal is
deferred to Phase 3 (#5445), after server.New is routed through Serve.

Context

Part of the vMCP interface refactor (epic #5419), which splits the
server.New god-object into a core (New(cfg) -> VMCP) and a transport
(Serve(ctx, VMCP, serverCfg) -> *Server). Today the entire MCP-endpoint
middleware chain is assembled in (*Server).Handler (server.go:574-675),
wrapped outermost-last so execution runs recovery → header-val → write-timeout →
auth+parser → audit → discovery → annotation-enrichment → authz →
backend-enrichment → MCP-parsing → telemetry → handler. Two of those layers —
AuthzMiddleware (applied only when cfg.AuthzMiddleware != nil,
server.go:606) and AnnotationEnrichmentMiddleware (server.go:615, gated on
the same condition, block 611-617) — exist solely to make the Cedar authorization
decision on the HTTP path, with annotation-enrichment injecting Tool.Annotations
into context so Cedar when-clauses evaluate.

Sequencing constraint (why nothing is deleted here). (*Server).Handler is a
single shared method read from s.config; both server.New's *Server and a
Serve-built *Server use it. In Phase 2, server.New is not yet routed through
Serve (that is Phase 3 / #5445), and the core admission seam is not in its call
path. If this task deleted the authz/annotation blocks from the shared Handler,
server.New would lose authorization enforcement entirely until Phase 3 — a real
regression window. So this task makes the layers inert on the Serve path via the
existing nil-guard
(Serve leaves cfg.AuthzMiddleware nil) rather than deleting them.
On the server.New path AuthzMiddleware stays non-nil, so authz keeps running.

Per architecture.md "Admission seam (R1)" and "What stays as transport middleware (NOT
moved to core)" (lines 193-198), authorization is re-platformed into the core admission
seam (#5438), which sources Tool.Annotations directly from the aggregated
capabilities and injects them via authorizers.WithToolAnnotations — reusing the
existing converter (annotation_enrichment.go:92). Everything else stays as
transport middleware: incoming auth (OIDC/local/anonymous + parser), audit and its
backend enrichment, recovery, header-validation, write-timeout, telemetry, and the
MCP-parsing layer.

The discovery middleware itself is retired (guarded to the legacy path) separately by
#5442 (direct VMCP.ListTools/CallTool), so it is out of scope here.

Parent Story: #5431
Dependencies: #5439 (the Serve skeleton + ServerConfig the chain is wired into), #5438 (cross-story; the core admission seam must exist so the Serve path can omit authz without losing enforcement)
Blocks: #5442

Acceptance Criteria

  • The MCP-endpoint middleware chain (recovery, header-validation, write-timeout SSE, auth, audit + backend enrichment, telemetry, MCP-parsing) is wired so the Serve-built *Server produces it, with the existing wrapping order — and therefore execution order — preserved.
  • On the Serve path, no authz/annotation HTTP layer is applied — achieved by leaving cfg.AuthzMiddleware nil so the existing nil-guards (server.go:606, :614) skip both the authz block (606-609) and the AnnotationEnrichmentMiddleware block (611-617). The blocks are NOT deleted in this PR.
  • The authz and annotation-enrichment blocks remain intact in the shared (*Server).Handler so the still-live server.New path continues to enforce authz with no regression (its AuthzMiddleware stays non-nil). annotation_enrichment.go (incl. convertAnnotations) is left in place — the core seam reuses convertAnnotations (P1.5 Core admission seam (bounded rewrite) #5438). Physical removal of the inert blocks is deferred to Phase 3 (P3.2 Reduce server.New body to the wrapper #5445).
  • The cfg.AuditConfig != nil backend-enrichment layer (server.go:599-602) and the audit layer (server.go:651) are preserved with unchanged behavior on both paths.
  • Behavioral-parity suite is green: MCP responses, audit events, and telemetry metrics/labels match pre-refactor behavior on the server.New path, and the Serve path advertises/denies via the core admission seam with allow/deny outcomes identical to the middleware path (cross-checked against the R1 suite from P1.5 Core admission seam (bounded rewrite) #5438) — no authz regression on either path.
  • PR is ≤ 400 LOC and ≤ 10 files changed (excluding tests/docs/generated).
  • server.New signature and observable behavior unchanged.
  • All tests pass (task test); lint clean (task lint-fix).
  • Code reviewed and approved.

Technical Approach

Recommended Implementation

Wire the MCP-endpoint middleware chain (server.go:574-675) so the Serve-built
*Server produces it. Under the guard-based deferral, the chain construction stays in
the shared (*Server).Handler (both server.New's *Server and Serve's *Server
use it); this task ensures the Serve path yields the correct chain via configuration,
not by deleting layers. Keep the wrapping idiom intact: each layer wraps the previous
mcpHandler, and because Go middleware executes in reverse wrapping order, the execution
sequence is unchanged so long as the wrap sequence is preserved. Carry forward, in the
same relative positions:

  • telemetry.Provider.Middleware (server.go:583-586) and the second
    mcpparser.ParsingMiddleware (server.go:595) — innermost layers; the
    parsing layer must stay before telemetry so recordMetrics can label metrics
    with the real mcp_method.
  • backendEnrichmentMiddleware when cfg.AuditConfig != nil
    (server.go:599-602).
  • auditor.Middleware when cfg.AuditConfig != nil, including the
    AuditConfig.Validate() + NewAuditorWithTransport construction
    (server.go:640-653).
  • AuthMiddleware when configured (server.go:656-659).
  • headerValidatingMiddleware (server.go:662),
    transportmiddleware.WriteTimeout(cfg.EndpointPath) (server.go:669), and
    recovery.Middleware as the outermost layer (server.go:672).

Omit on the Serve path via the existing nil-guard — do NOT delete:

  • The authz block (server.go:606-609) and the annotation-enrichment block
    (server.go:611-617) stay in the shared Handler. Serve populates its *Server
    with cfg.AuthzMiddleware == nil, so the existing guards skip both blocks on the
    Serve path; the core admission seam (P1.5 Core admission seam (bounded rewrite) #5438) enforces the decision instead. On the
    server.New path AuthzMiddleware is still set, so the blocks still run — preserving
    authz with no regression.
  • Leave annotation_enrichment.go and convertAnnotations in place; the core seam
    reuses convertAnnotations. The inert blocks (and any code that becomes dead once
    server.New routes through Serve) are physically removed in Phase 3 (P3.2 Reduce server.New body to the wrapper #5445).

Do not rewrite the chain's leading comment (server.go:574-579) to claim authz/
annotation are gone — they remain in the shared Handler until Phase 3. If a brief note
helps, add one line stating the layers are guarded-off on the Serve path and removed in
Phase 3 (go-style "Keep Comments Synchronized With Code").

The discovery middleware (server.go:619-637) is left in place and unguarded by this
task
; #5442 adds the core != nil guard that skips it on the Serve path and wires
the direct VMCP.ListTools/CallTool calls. Keep this task surgical so the parity surface
stays flat and the PR stays within limits.

Patterns & Frameworks

  • Reserve middleware for cross-cutting concerns only — authorization is
    request-type-specific business logic and belongs on the domain object; routing it
    through the core admission seam (and omitting the HTTP layer on the Serve path)
    retires vmcp-anti-patterns.md Implement secret store #4 (middleware overuse) and fix(typo): corrects readme #1/Create proxy subcommand #7 (context
    stuffing/mutable context) on the domain path — completed once the inert layers are
    deleted in Phase 3.
  • Guard, don't delete (yet) — the shared (*Server).Handler must serve both the
    legacy server.New chain (authz on) and the new Serve chain (authz omitted via nil
    AuthzMiddleware) until Phase 3; deleting shared middleware while server.New depends
    on it would open an authz-regression window.
  • mcp-go SDK boundary stays in Serve — telemetry/audit/parsing layers are
    transport concerns and remain in the transport package (Core Principle fix(typo): corrects readme #1;
    vmcp-anti-patterns.md Implement secret injection #5).
  • stdlib testing.T + testify, no Ginkgo in pkg/vmcp/server; preserve
    SPDX headers on touched files (testing.md; go-style).

Code Pointers

  • pkg/vmcp/server/server.go:574-675 — the full middleware-wrapping block; the
    comment at 574-579 documents the wrap-vs-execution order.
  • pkg/vmcp/server/server.go:583-595 — telemetry Middleware + the second
    ParsingMiddleware (innermost; parsing-before-telemetry ordering is
    load-bearing for metric labels).
  • pkg/vmcp/server/server.go:599-602backendEnrichmentMiddleware, applied
    only when cfg.AuditConfig != nil; preserve.
  • pkg/vmcp/server/server.go:606-609 — authz layer applied only when
    cfg.AuthzMiddleware != nil; kept, inert on the Serve path (Serve leaves it
    nil). Decision enforced by the core admission seam (P1.5 Core admission seam (bounded rewrite) #5438). Physical removal: Phase 3.
  • pkg/vmcp/server/server.go:611-617AnnotationEnrichmentMiddleware, gated on the
    same cfg.AuthzMiddleware != nil condition (guard at :614, wrap inside 611-617);
    kept, inert on the Serve path. Physical removal: Phase 3.
  • pkg/vmcp/server/server.go:640-653 — audit middleware construction
    (AuditConfig.Validate, NewAuditorWithTransport, auditor.Middleware at
    651); preserve.
  • pkg/vmcp/server/server.go:656-673 — auth, header-validation, write-timeout,
    recovery (outermost) layers; preserve order.
  • pkg/vmcp/server/annotation_enrichment.goAnnotationEnrichmentMiddleware
    (the if-guard at server.go:614, wrapped in the 611-617 block) and its
    findToolAnnotations/convertAnnotations helpers; left in place. convertAnnotations
    is reused by the core seam (P1.5 Core admission seam (bounded rewrite) #5438) per architecture.md (181-183) — do not delete it.
  • pkg/vmcp/server/serve.go (from P2.1 Serve skeleton + ServerConfig #5439) — the transport entry point; Serve
    populates the *Server's config with AuthzMiddleware == nil so the shared Handler
    omits authz/annotation on the Serve path.
  • architecture.md "What stays as transport middleware (NOT moved to core)"
    (lines 193-198); "Admission seam (R1)" annotations source (176-183).

Component Interfaces

No new exported types or interfaces are introduced. The work wires the shared
middleware chain for the Serve path and relies on the existing nil-guard. Shapes
referenced (existing signatures; literal placeholders for package-qualified types):

// All standard http middleware: func(http.Handler) http.Handler, wrapped onto
// mcpHandler in the order below (executes in reverse). Produced by the shared
// (*Server).Handler for BOTH server.New and Serve:
//   recovery.Middleware
//   transportmiddleware.WriteTimeout(cfg.EndpointPath)
//   headerValidatingMiddleware
//   cfg.AuthMiddleware            // when non-nil
//   cfg.AuthzMiddleware           // when non-nil  <-- nil on the Serve path (omitted)
//   AnnotationEnrichmentMiddleware// when cfg.AuthzMiddleware != nil <-- omitted on Serve path
//   auditor.Middleware            // when cfg.AuditConfig != nil
//   (*Server).backendEnrichmentMiddleware // when cfg.AuditConfig != nil
//   mcpparser.ParsingMiddleware
//   cfg.TelemetryProvider.Middleware(cfg.Name, "streamable-http") // when non-nil
//
// On the Serve path: cfg.AuthzMiddleware == nil, so the existing :606/:614 guards
// skip authz + annotation-enrichment (decision moves to the core seam, #5438).
// The blocks are NOT deleted here — physical removal is Phase 3 (#5445).

The core admission seam (Admission + Cedar adapter) and the direct sourcing
of Tool.Annotations are defined upstream in #5438. This task only ensures the
Serve path omits the HTTP authz/annotation layers via the nil-guard; it does not
implement the seam or delete the layers.

Testing Strategy

Unit Tests (pkg/vmcp/server, stdlib testing.T + testify)

  • The chain wraps layers in the expected order on both paths: assert the
    observable execution order (recovery outermost → ... → telemetry innermost →
    handler), e.g. via a probe handler / ordered marker.
  • On the Serve path (cfg.AuthzMiddleware == nil), authz and annotation-enrichment
    layers do not appear; on the server.New path (AuthzMiddleware != nil), they
    do — proving the shared Handler serves both modes and the blocks are not deleted.
  • Per-layer guards still hold: with AuditConfig == nil, neither audit nor
    backend-enrichment is applied; with AuthMiddleware == nil, auth is skipped;
    with TelemetryProvider == nil, telemetry is skipped.

Integration / Behavioral Parity Tests

  • MCP responses (initialize, tools/list, tools/call) are byte-for-byte
    equivalent to pre-refactor behavior on the server.New path.
  • Audit events emitted for tools/call (and other audited methods) are
    unchanged in shape and content, including backend-name enrichment.
  • Telemetry metrics carry the correct mcp_method labels (parsing-before-
    telemetry ordering preserved).
  • No authz regression: the server.New path still enforces authz via its
    HTTP middleware (unchanged); the Serve path enforces the same allow/deny via the
    core admission seam (cross-checked against the R1 security/parity suite from P1.5 Core admission seam (bounded rewrite) #5438).

Edge Cases

  • Anonymous / nil-identity requests behave identically (core seam allow-all when
    cfg.Authz is absent; HTTP authz nil-guard skips on the Serve path).
  • SSE (GET + Accept: text/event-stream) requests still bypass the
    server WriteTimeout via the write-timeout layer.
  • Panics in inner layers are still caught by the outermost recovery layer.

Out of Scope

References

  • RFC THV-0076: /Users/trey/Documents/GitHub/stacklok/toolhive-rfcs/rfcs/THV-0076-vmcp-core-interface.md
  • architecture.md "What stays as transport middleware (NOT moved to core)" (193-198), "Admission seam (R1)" annotations source (176-183), "Serve (transport) wiring relocated" (285-290), PR-Sized Decomposition (P2.3, lines 414-416).
  • research.md R1 → core admission seam decision; carried-forward Server methods.
  • Parent story: Phase 2: Serve transport helper, re-home transport, replace discovery #5431
  • Epic: vMCP interface refactor #5419

Metadata

Metadata

Assignees

No one assigned

    Labels

    needs-triageIssue needs initial triage by a maintainerrefactorvmcpVirtual MCP Server related issues

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions