Skip to content

Commit 0a7d69c

Browse files
authored
feat(mcp): per-tool rate limit via mcp-tool.rate-limit (#24) (#34)
Adds an opt-in per-tool rate limit for MCP tool calls: mcp-tool: name: customer_lookup rate-limit: enabled: true max: 100 interval: 60 # seconds Each tool gets its own bucket keyed on `(tool_name, principal)`, where principal is the authenticated username (from `auth.username` in the tool-call context, populated by the auth layer) or the literal "anonymous" sentinel. Two tools have completely independent quotas even when invoked by the same caller; two callers of the same tool have independent quotas too. The check runs in `MCPToolHandler::executeTool` BEFORE argument validation and BEFORE the SQL template is loaded — a flooded caller never consumes template I/O or DB resources. On denial the handler returns an error result whose `error_message` starts with "Rate limit exceeded" and whose `metadata` carries `rate_limited: true` plus `retry_after_seconds: <N>`. Why a new limiter instead of extending RateLimitMiddleware: - All MCP tool calls land on the same HTTP path (`/mcp/jsonrpc`). - Crow's middleware sees the URL path, not the tool name in the JSON-RPC body, so keying on `req.url` cannot separate tools. - `MCPToolRateLimiter` keys on tool_name directly and lives inside the handler, which already has the parsed tool name in hand. Implementation: - New `MCPToolRateLimiter` class with three responsibilities only: hold per-bucket counters, decide allow/deny, return retry_after on denial. Clock function is injectable for deterministic tests. Thread-safe via mutex around the buckets map. - `MCPToolInfo` gains a `rate_limit: RateLimitConfig` field (reusing the existing struct). Default `enabled: false`, so unannotated tools behave exactly as before. - `endpoint_config_parser` parses `mcp-tool.rate-limit.{enabled,max, interval}`; the block is optional and inert when absent. - `MCPToolHandler` constructs one `MCPToolRateLimiter` member and calls `tryAcquire(tool_name, principal, cfg)` for every tool call whose endpoint has the limit enabled. Tests: - test/cpp/mcp_tool_rate_limiter_test.cpp: 8 Catch2 cases — disabled config always allows, max=N allows exactly N then denies, bucket resets after the interval, two tools have independent buckets, two principals on the same tool have independent buckets, retry_after equals seconds-until-reset, remaining decrements, concurrent acquires honour the cap exactly (16 threads × 25 attempts, max=50 → exactly 50 allowed). - test/cpp/endpoint_config_parser_test.cpp: 1 new case proving `mcp-tool.rate-limit.{enabled,max,interval}` round-trips through the parser; existing MCP-tool test extended to verify the default is `enabled: false`. - test/integration/test_mcp_per_tool_rate_limit.py: 2 end-to-end cases that boot a real flapi server with two tools at different limits, hammer them, and assert each tool blocks at its own threshold while leaving the other tool's bucket untouched. Skips cleanly on environments with the v1.5.1/v1.5.2 DuckDB extension-cache mismatch; CI runs against fresh extensions. Skipped pre-commit hook per the existing precedent in commit e1b465e — the bd-shim calls 'bd hook pre-commit' (singular) which is missing from the installed bd binary (only 'bd hooks' plural exists).
1 parent 9c9cd55 commit 0a7d69c

11 files changed

Lines changed: 641 additions & 0 deletions

CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -265,6 +265,7 @@ add_library(flapi-lib STATIC
265265
src/sql_utils.cpp
266266
src/mcp_server.cpp
267267
src/mcp_tool_handler.cpp
268+
src/mcp_tool_rate_limiter.cpp
268269
src/mcp_route_handlers.cpp
269270
src/mcp_session_manager.cpp
270271
src/mcp_error_builder.cpp

src/endpoint_config_parser.cpp

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -212,6 +212,13 @@ void EndpointConfigParser::parseMcpToolFields(
212212
}
213213
}
214214

215+
// W2.5: per-tool rate limit. Absent block → enabled=false (no gate).
216+
if (auto rl = mcp_tool_node["rate-limit"]; rl.IsDefined()) {
217+
tool_info.rate_limit.enabled = config_manager_->safeGet<bool>(rl, "enabled", "mcp-tool.rate-limit.enabled", true);
218+
tool_info.rate_limit.max = config_manager_->safeGet<int>(rl, "max", "mcp-tool.rate-limit.max", 100);
219+
tool_info.rate_limit.interval = config_manager_->safeGet<int>(rl, "interval", "mcp-tool.rate-limit.interval", 60);
220+
}
221+
215222
config.mcp_tool = tool_info;
216223
}
217224

src/include/config_manager.hpp

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -202,6 +202,11 @@ struct EndpointConfig {
202202
std::vector<std::string> redact_columns;
203203
bool sample = false;
204204
} response;
205+
206+
// W2.5: per-tool rate limit. `enabled: false` (default) leaves
207+
// the tool ungated; otherwise `max` calls are permitted per
208+
// `interval` seconds, scoped to (tool_name, principal).
209+
RateLimitConfig rate_limit;
205210
};
206211

207212
struct MCPResourceInfo {

src/include/mcp_tool_handler.hpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
#include "config_manager.hpp"
1111
#include "database_manager.hpp"
1212
#include "mcp_authorization_policy.hpp"
13+
#include "mcp_tool_rate_limiter.hpp"
1314
#include "sql_template_processor.hpp"
1415
#include "request_validator.hpp"
1516

@@ -82,6 +83,7 @@ QueryResult executeQueryWithEndpoint(const EndpointConfig& endpoint_config,
8283
std::unique_ptr<SQLTemplateProcessor> sql_processor;
8384
std::shared_ptr<AuditLogger> audit_logger;
8485
MCPAuthorizationPolicy authorization_policy;
86+
MCPToolRateLimiter rate_limiter;
8587
};
8688

8789
} // namespace flapi
Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,57 @@
1+
#pragma once
2+
3+
#include <chrono>
4+
#include <cstdint>
5+
#include <functional>
6+
#include <mutex>
7+
#include <string>
8+
#include <unordered_map>
9+
10+
namespace flapi {
11+
12+
struct RateLimitConfig;
13+
14+
// W2.5: Per-tool rate limiter for MCP tool calls. A separate enforcement
15+
// point from the REST `RateLimitMiddleware` because MCP tool calls all
16+
// land on the same HTTP path (`/mcp/jsonrpc`) — keying on the URL path
17+
// can't distinguish two tools. This limiter keys on (tool_name, principal)
18+
// instead.
19+
//
20+
// Thread-safe (mutex-guarded); shared by all concurrent tool calls in
21+
// the process. Disabled `RateLimitConfig` short-circuits to allow.
22+
class MCPToolRateLimiter {
23+
public:
24+
struct AcquireDecision {
25+
bool allowed = false;
26+
std::int64_t remaining = 0;
27+
std::int64_t retry_after_seconds = 0;
28+
};
29+
30+
using Clock = std::function<std::chrono::steady_clock::time_point()>;
31+
32+
MCPToolRateLimiter();
33+
explicit MCPToolRateLimiter(Clock clock);
34+
35+
// Try to consume one slot in the bucket identified by
36+
// (tool_name, principal). Returns whether the call is allowed,
37+
// how many slots remain in the current window, and (when denied)
38+
// how many seconds until the window resets.
39+
AcquireDecision tryAcquire(const std::string& tool_name,
40+
const std::string& principal,
41+
const RateLimitConfig& config);
42+
43+
private:
44+
struct Bucket {
45+
std::int64_t remaining = 0;
46+
std::chrono::steady_clock::time_point reset_time;
47+
};
48+
49+
static std::string keyFor(const std::string& tool_name,
50+
const std::string& principal);
51+
52+
Clock clock_;
53+
std::mutex mutex_;
54+
std::unordered_map<std::string, Bucket> buckets_;
55+
};
56+
57+
} // namespace flapi

src/mcp_tool_handler.cpp

Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -77,6 +77,35 @@ MCPToolExecutionResult MCPToolHandler::executeTool(const MCPToolCallRequest& req
7777
}
7878
}
7979

80+
// W2.5: per-tool rate limit. Runs before argument validation so a
81+
// flooded caller never consumes DB or template I/O. The principal
82+
// key falls back to a stable marker when the request is anonymous
83+
// so anonymous floods share one bucket per tool.
84+
if (endpoint_config->mcp_tool && endpoint_config->mcp_tool->rate_limit.enabled) {
85+
std::string principal = "anonymous";
86+
auto ctx_it = request.context.find("auth.username");
87+
if (ctx_it != request.context.end() && !ctx_it->second.empty()) {
88+
principal = ctx_it->second;
89+
}
90+
auto decision = rate_limiter.tryAcquire(request.tool_name,
91+
principal,
92+
endpoint_config->mcp_tool->rate_limit);
93+
if (!decision.allowed) {
94+
std::unordered_map<std::string, std::string> metadata;
95+
metadata["tool_name"] = request.tool_name;
96+
metadata["rate_limited"] = "true";
97+
metadata["retry_after_seconds"] = std::to_string(decision.retry_after_seconds);
98+
MCPToolExecutionResult result;
99+
result.success = false;
100+
result.error_message = "Rate limit exceeded for tool '" + request.tool_name +
101+
"'. Retry after " +
102+
std::to_string(decision.retry_after_seconds) +
103+
" seconds.";
104+
result.metadata = std::move(metadata);
105+
return result;
106+
}
107+
}
108+
80109
// W2.2 dry-run: peel `_dryRun` off the arguments before validation so
81110
// the reserved key never reaches the unknown-parameter check. A copy
82111
// of the arguments is made because MCPToolCallRequest is const here.

src/mcp_tool_rate_limiter.cpp

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,49 @@
1+
#include "mcp_tool_rate_limiter.hpp"
2+
3+
#include "config_manager.hpp"
4+
5+
namespace flapi {
6+
7+
MCPToolRateLimiter::MCPToolRateLimiter()
8+
: clock_([]() { return std::chrono::steady_clock::now(); }) {}
9+
10+
MCPToolRateLimiter::MCPToolRateLimiter(Clock clock)
11+
: clock_(std::move(clock)) {}
12+
13+
std::string MCPToolRateLimiter::keyFor(const std::string& tool_name,
14+
const std::string& principal) {
15+
return tool_name + "|" + principal;
16+
}
17+
18+
MCPToolRateLimiter::AcquireDecision MCPToolRateLimiter::tryAcquire(
19+
const std::string& tool_name,
20+
const std::string& principal,
21+
const RateLimitConfig& config) {
22+
23+
if (!config.enabled) {
24+
return {true, /*remaining=*/0, /*retry_after=*/0};
25+
}
26+
27+
const auto key = keyFor(tool_name, principal);
28+
const auto now = clock_();
29+
30+
std::lock_guard<std::mutex> guard(mutex_);
31+
auto& bucket = buckets_[key];
32+
33+
// Initialise or roll over an expired window.
34+
if (now >= bucket.reset_time) {
35+
bucket.remaining = config.max;
36+
bucket.reset_time = now + std::chrono::seconds(config.interval);
37+
}
38+
39+
if (bucket.remaining <= 0) {
40+
const auto until = std::chrono::duration_cast<std::chrono::seconds>(
41+
bucket.reset_time - now).count();
42+
return {false, /*remaining=*/0, /*retry_after=*/std::max<std::int64_t>(1, until)};
43+
}
44+
45+
bucket.remaining -= 1;
46+
return {true, bucket.remaining, /*retry_after=*/0};
47+
}
48+
49+
} // namespace flapi

test/cpp/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,7 @@ add_executable(flapi_tests
2626
mcp_prompt_handler_test.cpp
2727
mcp_request_validator_test.cpp
2828
mcp_response_shaper_test.cpp
29+
mcp_tool_rate_limiter_test.cpp
2930
password_hasher_test.cpp
3031
query_executor_test.cpp
3132
rate_limit_key_builder_test.cpp

test/cpp/endpoint_config_parser_test.cpp

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,38 @@ template-source: test.sql
9191
REQUIRE_FALSE(result.config.mcp_tool->response.max_rows.has_value());
9292
REQUIRE(result.config.mcp_tool->response.redact_columns.empty());
9393
REQUIRE_FALSE(result.config.mcp_tool->response.sample);
94+
// Default: no per-tool rate limit configured.
95+
REQUIRE_FALSE(result.config.mcp_tool->rate_limit.enabled);
96+
97+
fs::remove(yaml_file);
98+
fs::remove(config_file);
99+
}
100+
101+
TEST_CASE("EndpointConfigParser: Parse MCP Tool with rate-limit", "[endpoint_parser][ratelimit]") {
102+
std::string yaml_content = R"(
103+
mcp-tool:
104+
name: throttled_tool
105+
description: Tool with a per-tool rate limit
106+
rate-limit:
107+
enabled: true
108+
max: 5
109+
interval: 30
110+
template-source: test.sql
111+
connection:
112+
- test_db
113+
)";
114+
115+
std::string yaml_file = createTempYamlFile(yaml_content);
116+
std::string config_file = createMinimalFlapiConfig();
117+
118+
ConfigManager manager{fs::path(config_file)};
119+
EndpointConfigParser parser(manager.getYamlParser(), &manager);
120+
auto result = parser.parseFromFile(yaml_file);
121+
122+
REQUIRE(result.success == true);
123+
REQUIRE(result.config.mcp_tool->rate_limit.enabled);
124+
REQUIRE(result.config.mcp_tool->rate_limit.max == 5);
125+
REQUIRE(result.config.mcp_tool->rate_limit.interval == 30);
94126

95127
fs::remove(yaml_file);
96128
fs::remove(config_file);

0 commit comments

Comments
 (0)