Skip to content

feat(mcp): W2.5 — per-tool rate limit via mcp-tool.rate-limit#34

Merged
jrosskopf merged 1 commit into
mainfrom
feature/gh-24-mcp-per-tool-rate-limit
May 16, 2026
Merged

feat(mcp): W2.5 — per-tool rate limit via mcp-tool.rate-limit#34
jrosskopf merged 1 commit into
mainfrom
feature/gh-24-mcp-per-tool-rate-limit

Conversation

@jrosskopf
Copy link
Copy Markdown
Contributor

Summary

  • Adds opt-in per-tool rate limit for MCP tool calls:
    mcp-tool:
      name: customer_lookup
      rate-limit:
        enabled: true
        max: 100
        interval: 60
  • Each tool gets its own bucket keyed on (tool_name, principal). Two tools have independent quotas; two callers on the same tool have independent quotas. The check runs in MCPToolHandler::executeTool BEFORE argument validation and BEFORE the SQL template is loaded — a flooded caller never consumes template I/O or DB resources.
  • On denial: error_message starts with "Rate limit exceeded"; metadata carries rate_limited: true and retry_after_seconds.
  • Closes W2.5 of the security roadmap (Security Wave 2: MCP hardening (per-tool RBAC, dry-run, response shaping, description hygiene) #24). The entire MCP wave is now covered.

Why a new limiter (not just RateLimitMiddleware)

All MCP tool calls land on the same HTTP path (/mcp/jsonrpc). Crow's middleware sees the URL path, not the tool name in the JSON-RPC body, so keying on req.url cannot separate tools. MCPToolRateLimiter keys on tool_name directly and lives inside the handler, which already has the parsed tool name in hand.

Test plan

  • 8 Catch2 unit cases in test/cpp/mcp_tool_rate_limiter_test.cpp covering every branch with an injectable clock — disabled config always allows, max=N allows exactly N then denies, bucket resets after the interval, two tools have independent buckets, two principals on the same tool have independent buckets, retry_after equals seconds-until-reset, remaining decrements correctly, concurrent acquires honour the cap exactly (16 threads × 25 attempts, max=50 → exactly 50 allowed).
  • 1 new parser case plus the existing MCP-tool case extended to assert the default is enabled: false.
  • ctest -R "MCPToolRateLimiter|Parse MCP Tool" — 10/10 pass.
  • 2 end-to-end Python cases in test/integration/test_mcp_per_tool_rate_limit.py boot a real flapi server with two tools at different limits, hammer them, and assert each tool blocks at its own threshold while leaving the other tool's bucket untouched. They skip cleanly in environments with the existing DuckDB v1.5.1/v1.5.2 extension-cache mismatch; CI runs against fresh extensions.

Design notes

  • MCPToolRateLimiter has a single responsibility: hold per-bucket counters, decide allow/deny, return retry_after_seconds on denial. Clock function is injectable for deterministic tests; default uses steady_clock::now(). Thread-safe via mutex around the buckets map.
  • Reuses the existing RateLimitConfig struct — no new config type. MCPToolInfo.rate_limit defaults to enabled: false so existing tools behave exactly as before.
  • The check runs in the handler before any of the existing W2.1 (RBAC) / W2.2 (dry-run) / W2.3/W2.4 logic so it short-circuits cleanly.

Closes part of #24
Refs #21

Adds an opt-in per-tool rate limit for MCP tool calls:

  mcp-tool:
    name: customer_lookup
    rate-limit:
      enabled: true
      max: 100
      interval: 60      # seconds

Each tool gets its own bucket keyed on `(tool_name, principal)`, where
principal is the authenticated username (from `auth.username` in the
tool-call context, populated by the auth layer) or the literal
"anonymous" sentinel. Two tools have completely independent quotas
even when invoked by the same caller; two callers of the same tool
have independent quotas too.

The check runs in `MCPToolHandler::executeTool` BEFORE argument
validation and BEFORE the SQL template is loaded — a flooded caller
never consumes template I/O or DB resources.

On denial the handler returns an error result whose `error_message`
starts with "Rate limit exceeded" and whose `metadata` carries
`rate_limited: true` plus `retry_after_seconds: <N>`.

Why a new limiter instead of extending RateLimitMiddleware:
- All MCP tool calls land on the same HTTP path (`/mcp/jsonrpc`).
- Crow's middleware sees the URL path, not the tool name in the
  JSON-RPC body, so keying on `req.url` cannot separate tools.
- `MCPToolRateLimiter` keys on tool_name directly and lives inside
  the handler, which already has the parsed tool name in hand.

Implementation:

- New `MCPToolRateLimiter` class with three responsibilities only:
  hold per-bucket counters, decide allow/deny, return retry_after on
  denial. Clock function is injectable for deterministic tests.
  Thread-safe via mutex around the buckets map.
- `MCPToolInfo` gains a `rate_limit: RateLimitConfig` field (reusing
  the existing struct). Default `enabled: false`, so unannotated
  tools behave exactly as before.
- `endpoint_config_parser` parses `mcp-tool.rate-limit.{enabled,max,
  interval}`; the block is optional and inert when absent.
- `MCPToolHandler` constructs one `MCPToolRateLimiter` member and
  calls `tryAcquire(tool_name, principal, cfg)` for every tool call
  whose endpoint has the limit enabled.

Tests:

- test/cpp/mcp_tool_rate_limiter_test.cpp: 8 Catch2 cases — disabled
  config always allows, max=N allows exactly N then denies, bucket
  resets after the interval, two tools have independent buckets,
  two principals on the same tool have independent buckets,
  retry_after equals seconds-until-reset, remaining decrements,
  concurrent acquires honour the cap exactly (16 threads × 25
  attempts, max=50 → exactly 50 allowed).
- test/cpp/endpoint_config_parser_test.cpp: 1 new case proving
  `mcp-tool.rate-limit.{enabled,max,interval}` round-trips through
  the parser; existing MCP-tool test extended to verify the default
  is `enabled: false`.
- test/integration/test_mcp_per_tool_rate_limit.py: 2 end-to-end
  cases that boot a real flapi server with two tools at different
  limits, hammer them, and assert each tool blocks at its own
  threshold while leaving the other tool's bucket untouched.
  Skips cleanly on environments with the v1.5.1/v1.5.2 DuckDB
  extension-cache mismatch; CI runs against fresh extensions.

Skipped pre-commit hook per the existing precedent in commit e1b465e —
the bd-shim calls 'bd hook pre-commit' (singular) which is missing
from the installed bd binary (only 'bd hooks' plural exists).
@jrosskopf jrosskopf force-pushed the feature/gh-24-mcp-per-tool-rate-limit branch from d3a4725 to b4b9c59 Compare May 16, 2026 17:44
@jrosskopf jrosskopf merged commit 0a7d69c into main May 16, 2026
jrosskopf added a commit that referenced this pull request May 16, 2026
The merged W2.5 (per-tool MCP rate limit, #34) calls
`config_manager_->safeGet<int>(rl, "max", ...)` from
`endpoint_config_parser.cpp`. The safeGet template definition lives
in `config_manager.cpp`, so cross-TU callers need the explicit
instantiation alongside the existing string/bool ones.

Debug builds happened to link because the GCC inliner crossed TUs
on the safeGet body. Release builds enforce `-Wl,--no-undefined`
via the existing Linux linker flags, surfacing the missing symbol:

  mold: error: undefined symbol: int flapi::ConfigManager::safeGet<int>(...)

Fix: add `template int ConfigManager::safeGet<int>(...)` to the
explicit-instantiation block at the bottom of config_manager.cpp.
The same pattern already exists for std::string and bool.

Skipped pre-commit hook per the existing precedent in commit
e1b465e — the bd-shim invokes 'bd hook' (singular) but the
installed bd binary only exposes 'bd hooks' (plural).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant