feat(mcp): W2.4 — per-tool response shaping (redact, cap, sample) by jrosskopf · Pull Request #30 · DataZooDE/flapi

jrosskopf · 2026-05-16T14:57:12Z

Summary

Adds opt-in mcp-tool.response config with three independent knobs:
- redact-columns: [ssn, salary] replaces values in every row with "<redacted>".
- max-rows: 1000 caps the result-set length.
- sample: true collapses the response to { row_count, columns, sampled: true }, no rows.
Closes W2.4 of the security roadmap (Security Wave 2: MCP hardening (per-tool RBAC, dry-run, response shaping, description hygiene) #24). With this, the vendor email's full "response inspection" pitch is covered in-product.

Test plan

10 Catch2 unit cases in test/cpp/mcp_response_shaper_test.cpp exercise every combination — no-op, max-rows alone, redact alone, both composed, sample wins, non-array passthrough, empty array, zero cap, missing redact column.
3 additional parser cases prove the default tool config is inert, the full response block round-trips, and a sample-only block parses cleanly.
ctest -R "MCPResponseShaper|response_shape|Parse MCP Tool" — 13/13 pass.
3 end-to-end Python cases in test/integration/test_mcp_response_shaping.py boot a real flapi server with three shaped tools and verify redact, max-rows, and sample independently against a deterministic in-memory VALUES result. They skip cleanly in environments with the existing DuckDB v1.5.1/v1.5.2 extension-cache mismatch; CI exercises them against fresh extensions.
Reviewer: confirm the redaction sentinel string "<redacted>" is acceptable, or pick a different default.

Design notes

MCPResponseShaper is a pure single-responsibility transformer: (json_string, ResponseShape) → json_string. No QueryResult, no ConfigManager, no handler internals — easiest possible test surface.
ResponseShape::isNoOp() short-circuits the shape call when nothing is configured, so the path is free for tools without the block.
sample: true deliberately wins over the other knobs — sample mode never emits row data, so combining it with redact_columns or max_rows is structurally inert. The test pins this contract.
Non-array payloads (e.g., the dry-run JSON from feat(mcp): W2.2 — shadow / dry-run mode for tool calls #29) pass through unchanged; we don't impose row-shape semantics on objects we don't understand.

Closes part of #24
Refs #21

Adds opt-in response shaping for MCP tool results. Three independent knobs under `mcp-tool.response`: mcp-tool: name: customer_lookup response: max-rows: 1000 # cap result-set length redact-columns: [ssn] # mask listed columns with "<redacted>" sample: true # return only row_count + column names Defaults are all inert, so existing tools see no behaviour change. The shaper runs in MCPToolHandler::executeTool on the read path, after formatResult and before the JSON envelope leaves the server. Write tools are unaffected (response shape applies to SELECT results). Implementation: - New MCPResponseShaper class — pure transformer with one responsibility: take a JSON string and a ResponseShape config, return a shaped JSON string. No QueryResult / ConfigManager / handler internals on the dependency graph; trivially unit-tested. - MCPToolInfo gains a nested `ResponseShape` struct with `max_rows` (optional), `redact_columns`, `sample`. `isNoOp()` short-circuits the shape call when nothing is configured. - endpoint_config_parser parses `mcp-tool.response.{max-rows, redact-columns, sample}` from YAML. Each field is independent. - MCPToolHandler builds the shape config from MCPToolInfo and runs the shaper; emits `response_shaped: true` in tool metadata when shaping fires. Semantics: - `sample: true` wins over the other knobs and emits a summary object with `row_count`, `columns: [...]`, `sampled: true` — no row data. - Otherwise `redact_columns` is applied first (replaces the listed values in every row with the literal string `"<redacted>"`), then `max_rows` truncates the resulting array. - Non-array payloads (e.g., dry-run results) pass through unchanged. - `max_rows: 0` is a legitimate "suppress everything" choice. - Missing redact columns are tolerated as a no-op. Tests: - test/cpp/mcp_response_shaper_test.cpp: 10 Catch2 cases covering every combination — no-op, max-rows alone, redact alone, both composed, sample wins, non-array passthrough, empty array, zero cap, missing redact column. - test/cpp/endpoint_config_parser_test.cpp: three additional cases proving the default tool config is inert, the full response block round-trips through the parser, and a sample-only block parses cleanly. - test/integration/test_mcp_response_shaping.py: three end-to-end cases that boot a real flapi server with three role-shaped tools and exercise redact, max-rows, and sample independently against a deterministic in-memory result set. Skips cleanly on environments with the v1.5.1/v1.5.2 DuckDB extension-cache mismatch; CI runs against fresh extensions. Skipped pre-commit hook per the existing precedent in commit e1b465e — the bd-shim calls 'bd hook pre-commit' (singular) which is missing from the installed bd binary (only 'bd hooks' plural exists).

jrosskopf force-pushed the feature/gh-24-mcp-response-shaping branch from 97756cc to 8104358 Compare May 16, 2026 17:41

jrosskopf merged commit 9c9cd55 into main May 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mcp): W2.4 — per-tool response shaping (redact, cap, sample)#30

feat(mcp): W2.4 — per-tool response shaping (redact, cap, sample)#30
jrosskopf merged 1 commit into
mainfrom
feature/gh-24-mcp-response-shaping

jrosskopf commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jrosskopf commented May 16, 2026

Summary

Test plan

Design notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant