feat(mcp): W2.5 - per-tool rate limit via mcp-tool.rate-limit #34
Merged
Adds an opt-in per-tool rate limit for MCP tool calls:

    mcp-tool:
      name: customer_lookup
      rate-limit:
        enabled: true
        max: 100
        interval: 60  # seconds

Each tool gets its own bucket keyed on `(tool_name, principal)`, where
principal is the authenticated username (from `auth.username` in the
tool-call context, populated by the auth layer) or the literal
"anonymous" sentinel. Two tools have completely independent quotas
even when invoked by the same caller; two callers of the same tool
have independent quotas too.
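
The keying rule above can be sketched as follows. This is a minimal illustration, not the code from this PR: `BucketKey`, `resolvePrincipal` and `makeBucketKey` are hypothetical names, the `std::optional` parameter stands in for `auth.username`, and treating an empty username as anonymous is an assumption.

```cpp
#include <optional>
#include <string>
#include <tuple>

// Hypothetical key type: one bucket per (tool_name, principal) pair.
struct BucketKey {
    std::string tool_name;
    std::string principal;

    // Lexicographic ordering so the key can be used in a std::map.
    bool operator<(const BucketKey& other) const {
        return std::tie(tool_name, principal) <
               std::tie(other.tool_name, other.principal);
    }
};

// Falls back to the "anonymous" sentinel when no user is authenticated.
// Assumption: an empty username is treated the same as a missing one.
inline std::string resolvePrincipal(const std::optional<std::string>& username) {
    if (username.has_value() && !username->empty()) {
        return *username;
    }
    return "anonymous";
}

// Combines the tool name with the resolved principal.
inline BucketKey makeBucketKey(const std::string& tool_name,
                               const std::optional<std::string>& username) {
    return BucketKey{tool_name, resolvePrincipal(username)};
}
```

Because both fields take part in the ordering, changing either the tool or the caller yields a distinct key and therefore a distinct quota.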
The check runs in `MCPToolHandler::executeTool` BEFORE argument
validation and BEFORE the SQL template is loaded — a flooded caller
never consumes template I/O or DB resources.
On denial the handler returns an error result whose `error_message`
starts with "Rate limit exceeded" and whose `metadata` carries
`rate_limited: true` plus `retry_after_seconds: <N>`.
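
The ordering and the denial shape described above might look roughly like this. It is a sketch under stated assumptions: `LimitDecision`, `ToolResult`, `makeRateLimitedResult` and `executeToolSketch` are hypothetical stand-ins for the handler's real types, metadata is modelled as a string map rather than whatever structure the handler actually uses, and the text after the "Rate limit exceeded" prefix is invented for illustration.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical stand-ins for the handler's real types.
struct LimitDecision {
    bool allowed = true;
    int retry_after_seconds = 0;
};

struct ToolResult {
    bool success = false;
    std::string error_message;
    std::map<std::string, std::string> metadata;  // assumption: string map
};

// Builds the denial result: message prefix plus the two metadata fields.
inline ToolResult makeRateLimitedResult(const std::string& tool_name,
                                        int retry_after_seconds) {
    ToolResult result;
    result.error_message = "Rate limit exceeded for tool '" + tool_name + "'";
    result.metadata["rate_limited"] = "true";
    result.metadata["retry_after_seconds"] = std::to_string(retry_after_seconds);
    return result;
}

// Rate-limit check first; validation and template work run only if allowed.
inline ToolResult executeToolSketch(
    const std::string& tool_name,
    const std::function<LimitDecision()>& check_limit,
    const std::function<bool()>& validate_arguments,
    const std::function<std::string()>& load_and_run_template) {
    const LimitDecision decision = check_limit();
    if (!decision.allowed) {
        return makeRateLimitedResult(tool_name, decision.retry_after_seconds);
    }
    ToolResult result;
    if (!validate_arguments()) {
        result.error_message = "Invalid arguments";
        return result;
    }
    result.success = true;
    result.metadata["output"] = load_and_run_template();
    return result;
}
```

Passing the later stages as callbacks makes the early return easy to observe: on denial, neither callback is invoked.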
Why a new limiter instead of extending RateLimitMiddleware:
- All MCP tool calls land on the same HTTP path (`/mcp/jsonrpc`).
- Crow's middleware sees the URL path, not the tool name in the
JSON-RPC body, so keying on `req.url` cannot separate tools.
- `MCPToolRateLimiter` keys on tool_name directly and lives inside
the handler, which already has the parsed tool name in hand.
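
A toy comparison of the two keying strategies, to make the argument above concrete. `ParsedCall` and both helper functions are hypothetical, and `order_lookup` is an invented second tool name; nothing here uses Crow's actual API.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, already-parsed view of an MCP call: the HTTP path plus the
// tool name taken from the JSON-RPC body.
struct ParsedCall {
    std::string url_path;
    std::string tool_name;
};

// Distinct buckets when keying on the URL path, which is all a path-based
// middleware can see.
inline std::size_t bucketsByPath(const std::vector<ParsedCall>& calls) {
    std::set<std::string> keys;
    for (const auto& call : calls) {
        keys.insert(call.url_path);
    }
    return keys.size();
}

// Distinct buckets when keying on the tool name, which the handler has
// available after parsing the body.
inline std::size_t bucketsByTool(const std::vector<ParsedCall>& calls) {
    std::set<std::string> keys;
    for (const auto& call : calls) {
        keys.insert(call.tool_name);
    }
    return keys.size();
}
```

For two calls to different tools on `/mcp/jsonrpc`, path keying collapses them into one bucket while tool keying keeps them apart.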
Implementation:
- New `MCPToolRateLimiter` class with three responsibilities only:
hold per-bucket counters, decide allow/deny, return retry_after on
denial. Clock function is injectable for deterministic tests.
Thread-safe via mutex around the buckets map.
- `MCPToolInfo` gains a `rate_limit: RateLimitConfig` field (reusing
the existing struct). Default `enabled: false`, so unannotated
tools behave exactly as before.
- `endpoint_config_parser` parses `mcp-tool.rate-limit.{enabled,max,
interval}`; the block is optional and inert when absent.
- `MCPToolHandler` constructs one `MCPToolRateLimiter` member and
calls `tryAcquire(tool_name, principal, cfg)` for every tool call
whose endpoint has the limit enabled.
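
A minimal sketch of a limiter with the responsibilities listed above, assuming a fixed-window counter (the tests mention "bucket resets after the interval"). `ToolRateLimiterSketch` and `AcquireResult` are hypothetical names, the `RateLimitConfig` fields are guessed from the YAML keys, and rounding `retry_after_seconds` up is an assumption rather than the PR's confirmed behaviour.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Assumed shape of the existing RateLimitConfig; fields mirror the YAML keys.
struct RateLimitConfig {
    bool enabled = false;
    int max = 100;
    int interval = 60;  // seconds
};

struct AcquireResult {
    bool allowed = true;
    int remaining = 0;
    int retry_after_seconds = 0;
};

class ToolRateLimiterSketch {
public:
    using Clock = std::chrono::steady_clock;
    using ClockFn = std::function<Clock::time_point()>;

    // Default clock is steady_clock; tests can inject a fake one.
    ToolRateLimiterSketch() : now_([] { return Clock::now(); }) {}
    explicit ToolRateLimiterSketch(ClockFn now) : now_(std::move(now)) {}

    AcquireResult tryAcquire(const std::string& tool_name,
                             const std::string& principal,
                             const RateLimitConfig& cfg) {
        if (!cfg.enabled) {
            return AcquireResult{true, 0, 0};  // disabled config always allows
        }
        const auto now = now_();
        const auto window = std::chrono::seconds(cfg.interval);

        std::lock_guard<std::mutex> lock(mutex_);
        Bucket& bucket = buckets_[std::make_pair(tool_name, principal)];
        // Start a fresh window on first use or once the interval has elapsed.
        if (!bucket.started || now - bucket.window_start >= window) {
            bucket.started = true;
            bucket.window_start = now;
            bucket.count = 0;
        }
        if (bucket.count >= cfg.max) {
            // Round up so a caller that waits this long lands in a new window.
            const auto left = bucket.window_start + window - now;
            const auto secs = std::chrono::ceil<std::chrono::seconds>(left);
            return AcquireResult{false, 0, static_cast<int>(secs.count())};
        }
        ++bucket.count;
        return AcquireResult{true, cfg.max - bucket.count, 0};
    }

private:
    struct Bucket {
        bool started = false;
        Clock::time_point window_start{};
        int count = 0;
    };

    ClockFn now_;
    std::mutex mutex_;  // guards buckets_
    std::map<std::pair<std::string, std::string>, Bucket> buckets_;
};
```

Injecting the clock as a `std::function` lets a test advance time by mutating a captured variable, so window resets and `retry_after_seconds` can be checked without sleeping.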
Tests:
- test/cpp/mcp_tool_rate_limiter_test.cpp: 8 Catch2 cases — disabled
config always allows, max=N allows exactly N then denies, bucket
resets after the interval, two tools have independent buckets,
two principals on the same tool have independent buckets,
retry_after equals seconds-until-reset, remaining decrements,
concurrent acquires honour the cap exactly (16 threads × 25
attempts, max=50 → exactly 50 allowed).
- test/cpp/endpoint_config_parser_test.cpp: 1 new case proving
`mcp-tool.rate-limit.{enabled,max,interval}` round-trips through
the parser; existing MCP-tool test extended to verify the default
is `enabled: false`.
- test/integration/test_mcp_per_tool_rate_limit.py: 2 end-to-end
cases that boot a real flapi server with two tools at different
limits, hammer them, and assert each tool blocks at its own
threshold while leaving the other tool's bucket untouched.
Skips cleanly on environments with the v1.5.1/v1.5.2 DuckDB
extension-cache mismatch; CI runs against fresh extensions.
Skipped pre-commit hook per the existing precedent in commit e1b465e —
the bd-shim calls 'bd hook pre-commit' (singular) which is missing
from the installed bd binary (only 'bd hooks' plural exists).
jrosskopf added a commit that referenced this pull request on May 16, 2026
The merged W2.5 (per-tool MCP rate limit, #34) calls `config_manager_->safeGet<int>(rl, "max", ...)` from `endpoint_config_parser.cpp`. The safeGet template definition lives in `config_manager.cpp`, so cross-TU callers need the explicit instantiation alongside the existing string/bool ones.

Debug builds happened to link because the GCC inliner crossed TUs on the safeGet body. Release builds enforce `-Wl,--no-undefined` via the existing Linux linker flags, surfacing the missing symbol:

    mold: error: undefined symbol: int flapi::ConfigManager::safeGet<int>(...)

Fix: add `template int ConfigManager::safeGet<int>(...)` to the explicit-instantiation block at the bottom of config_manager.cpp. The same pattern already exists for std::string and bool.

Skipped pre-commit hook per the existing precedent in commit e1b465e: the bd-shim invokes 'bd hook' (singular) but the installed bd binary only exposes 'bd hooks' (plural).
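
The explicit-instantiation pattern behind this fix can be illustrated in a single file. This is a simplified sketch: `ConfigSketch` is a hypothetical stand-in for `ConfigManager`, and the `safeGet` signature below is invented for the example because the real one is elided in the commit message.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-in for ConfigManager.
class ConfigSketch {
public:
    // Declaration only: this is all that other translation units would see
    // when they include the header.
    template <typename T>
    T safeGet(const std::map<std::string, T>& node, const std::string& key,
              const T& fallback) const;
};

// Definition: in a real project this would live in the .cpp file, so it is
// invisible to other translation units.
template <typename T>
T ConfigSketch::safeGet(const std::map<std::string, T>& node,
                        const std::string& key, const T& fallback) const {
    const auto it = node.find(key);
    return it != node.end() ? it->second : fallback;
}

// Explicit instantiation definitions: these emit the symbols that callers in
// other translation units link against. Leaving out the int line is the kind
// of omission that produces an undefined-symbol error at link time.
template std::string ConfigSketch::safeGet<std::string>(
    const std::map<std::string, std::string>&, const std::string&,
    const std::string&) const;
template bool ConfigSketch::safeGet<bool>(
    const std::map<std::string, bool>&, const std::string&, const bool&) const;
template int ConfigSketch::safeGet<int>(
    const std::map<std::string, int>&, const std::string&, const int&) const;
```

Each type used from outside the defining file needs its own instantiation line, which is why adding an `int` caller required a matching addition to the block.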
Summary
- Adds an opt-in per-tool rate limit for MCP tool calls via `mcp-tool.rate-limit`. Each tool gets its own bucket keyed on `(tool_name, principal)`. Two tools have independent quotas; two callers on the same tool have independent quotas.
- The check runs in `MCPToolHandler::executeTool` BEFORE argument validation and BEFORE the SQL template is loaded, so a flooded caller never consumes template I/O or DB resources.
- On denial, `error_message` starts with "Rate limit exceeded"; `metadata` carries `rate_limited: true` and `retry_after_seconds`.

Why a new limiter (not just RateLimitMiddleware)
All MCP tool calls land on the same HTTP path (`/mcp/jsonrpc`). Crow's middleware sees the URL path, not the tool name in the JSON-RPC body, so keying on `req.url` cannot separate tools. `MCPToolRateLimiter` keys on `tool_name` directly and lives inside the handler, which already has the parsed tool name in hand.

Test plan
- 8 Catch2 cases in `test/cpp/mcp_tool_rate_limiter_test.cpp` covering every branch with an injectable clock: disabled config always allows, max=N allows exactly N then denies, bucket resets after the interval, two tools have independent buckets, two principals on the same tool have independent buckets, retry_after equals seconds-until-reset, remaining decrements correctly, concurrent acquires honour the cap exactly (16 threads × 25 attempts, max=50 → exactly 50 allowed).
- 1 new case in `test/cpp/endpoint_config_parser_test.cpp` proving `mcp-tool.rate-limit.{enabled,max,interval}` round-trips through the parser; the existing MCP-tool test is extended to verify the default is `enabled: false`.
- `ctest -R "MCPToolRateLimiter|Parse MCP Tool"`: 10/10 pass.
- 2 end-to-end cases in `test/integration/test_mcp_per_tool_rate_limit.py` boot a real flapi server with two tools at different limits, hammer them, and assert each tool blocks at its own threshold while leaving the other tool's bucket untouched. They skip cleanly in environments with the existing DuckDB v1.5.1/v1.5.2 extension-cache mismatch; CI runs against fresh extensions.

Design notes
- `MCPToolRateLimiter` has a single responsibility: hold per-bucket counters, decide allow/deny, return `retry_after_seconds` on denial. Clock function is injectable for deterministic tests; default uses `steady_clock::now()`. Thread-safe via mutex around the buckets map.
- Reuses the existing `RateLimitConfig` struct; no new config type. `MCPToolInfo.rate_limit` defaults to `enabled: false` so existing tools behave exactly as before.

Closes part of #24
Refs #21