Skip to content

Wire rate limit middleware into proxy runner chain#4652

Open
jerm-dro wants to merge 3 commits intomainfrom
jerm-dro/rate-limit-middleware
Open

Wire rate limit middleware into proxy runner chain#4652
jerm-dro wants to merge 3 commits intomainfrom
jerm-dro/rate-limit-middleware

Conversation

@jerm-dro
Copy link
Copy Markdown
Contributor

@jerm-dro jerm-dro commented Apr 7, 2026

Summary

Rate limiting CRD types and the token-bucket Limiter package were
merged in #4577. This PR wires the limiter into the proxy runner
middleware chain so rate limiting is enforced at runtime.

  • Adds HTTP middleware that reads the parsed MCP request from context,
    rate-limits only tools/call requests, returns HTTP 429 with
    JSON-RPC error code -32029 and Retry-After header on rejection,
    and fails open on Redis errors
  • Registers the middleware factory and positions it after auth and MCP
    parser in the middleware chain
  • Maps spec.rateLimiting from the CRD to the RunConfig in the
    operator reconciler, carrying the CRD type directly
  • Adds Ginkgo E2E acceptance tests that deploy Redis + MCPServer in a
    Kind cluster and verify rate limit enforcement

Closes #4551

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)

Unit tests (12 new):

  • Middleware handler: allowed, rejected (429 + -32029), fail-open,
    missing MCP context (500), non-tools/call passthrough
  • Wiring: addRateLimitMiddleware nil/valid, PopulateMiddlewareConfigs
    presence/absence
  • Operator: spec.rateLimiting flows to RunConfig

E2E tests (2 new):

  • AC7: Burst of requests exceeds shared limit, 4th request gets HTTP
    429 with JSON-RPC -32029 and retryAfterSeconds
  • AC8: MCPServer with both shared and per-tool config is accepted and
    reaches Running phase

Changes

File Change
pkg/ratelimit/middleware.go New: factory, handler, JSON-RPC error helpers, MiddlewareParams
pkg/runner/middleware.go Register factory, add helper, call after MCP parser
pkg/runner/config.go Add RateLimitConfig and RateLimitNamespace fields
pkg/runner/config_builder.go Add WithRateLimitConfig builder option
cmd/thv-operator/controllers/mcpserver_runconfig.go Map spec.rateLimiting to RunConfig
test/e2e/thv-operator/acceptance_tests/ New: Ginkgo E2E suite with Redis deploy and rate limit tests

Special notes for reviewers

  • The middleware errors with HTTP 500 if ParsedMCPRequest is missing
    from context — this catches middleware ordering bugs at runtime
  • Only tools/call requests are rate-limited; other MCP methods pass
    through unconditionally
  • Redis client is created from SessionRedisConfig + the
    THV_SESSION_REDIS_PASSWORD env var, sharing connection info with
    session storage
  • E2E tests follow the test/e2e/thv-operator/virtualmcp/ patterns

Large PR Justification

1140 lines across 11 files, but 844 lines are tests (unit + E2E
infrastructure). The 296 lines of production code span 5 files that
form a single logical change: middleware + wiring + operator mapping.

Generated with Claude Code

@github-actions github-actions bot added the size/XL Extra large PR: 1000+ lines changed label Apr 7, 2026
@jerm-dro jerm-dro marked this pull request as draft April 7, 2026 21:20
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 61.16505% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.84%. Comparing base (9f38bbe) to head (0f476a5).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
pkg/ratelimit/middleware.go 54.05% 33 Missing and 1 partial ⚠️
pkg/runner/middleware.go 71.42% 3 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4652      +/-   ##
==========================================
- Coverage   68.88%   68.84%   -0.05%     
==========================================
  Files         506      509       +3     
  Lines       52449    52642     +193     
==========================================
+ Hits        36130    36240     +110     
- Misses      13525    13607      +82     
- Partials     2794     2795       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jerm-dro jerm-dro force-pushed the jerm-dro/rate-limit-middleware branch from 15f2cd5 to 90ebabc Compare April 7, 2026 21:33
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 7, 2026
@jerm-dro jerm-dro force-pushed the jerm-dro/rate-limit-middleware branch from 90ebabc to 7063692 Compare April 7, 2026 21:47
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 7, 2026
@jerm-dro jerm-dro force-pushed the jerm-dro/rate-limit-middleware branch from 7063692 to d28ef3e Compare April 7, 2026 22:05
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 7, 2026
@jerm-dro jerm-dro force-pushed the jerm-dro/rate-limit-middleware branch from d28ef3e to 4d1d6aa Compare April 7, 2026 22:28
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 7, 2026
jerm-dro and others added 3 commits April 7, 2026 16:06
Implement the rate limit middleware factory and handler that wires into
the proxy runner middleware chain.

The handler:
- Errors if MCP parser context is missing (middleware ordering bug)
- Only rate-limits tools/call requests; other methods pass through
- Fails open on Redis errors (logs warning, allows request)
- Returns HTTP 429 with JSON-RPC -32029 error and Retry-After header

The factory pings Redis at startup to fail fast on misconfiguration
rather than silently failing open on every request.

Tests use a dummy Limiter implementation to verify handler behavior
without Redis: allowed, rejected, fail-open, missing context, and
non-tools/call passthrough.

Part of #4551

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Register the rate limit middleware factory and add it to the proxy
runner middleware chain, positioned after MCP parser (needs tool name
from context) and before validating webhooks.

The operator reconciler maps spec.rateLimiting directly to the
RunConfig, carrying the CRD type without intermediate config types.
The middleware factory creates a Redis client from session storage
config and builds the Limiter at startup.

Move populateScalingConfig before PopulateMiddlewareConfigs in the
reconciler so SessionRedis config is available when the rate limit
middleware reads it. Validate that Redis session storage is configured
when rate limiting is enabled, with a clear error message.

Part of #4551

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ginkgo-based E2E tests that deploy Redis and an MCPServer with rate
limiting in a Kind cluster and verify runtime enforcement:

- AC7: Send requests exceeding the shared limit, verify HTTP 429 with
  JSON-RPC error code -32029 and retryAfterSeconds in error data
- AC8: Deploy MCPServer with both shared and per-tool rate limit config,
  verify the CRD is accepted and the server reaches Running phase

Creates a NodePort service targeting the MCPServer proxy pods for
test access. Uses the correct Accept header for streamable-http.

Part of #4551

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jerm-dro jerm-dro force-pushed the jerm-dro/rate-limit-middleware branch from 4d1d6aa to 0f476a5 Compare April 7, 2026 23:06
@github-actions github-actions bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 7, 2026
@jerm-dro jerm-dro marked this pull request as ready for review April 7, 2026 23:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Configure global rate limits on MCPServer

1 participant