v0.2.0

@ShawnChen-Sirius ShawnChen-Sirius released this 19 May 02:51
8b4b070

Highlights

Security hardening for CHDB_MCP_FILE_ALLOWLIST, plus a new list_functions tool, per-query structured logging, a wall-clock query timeout, and version bumps that keep
pyproject.toml, __init__, server.json, and PyPI in sync.

Breaking change

When CHDB_MCP_FILE_ALLOWLIST is set, the gate now blocks every chDB table function that is not on the safe-by-construction list: numbers, values, view, merge, dictionary,
generateRandom, mergeTree* introspection, timeSeries*. This is stricter than both v0.1.0's "advisory" framing and the v0.1.1 hand-listed denylist. Callers that relied on s3() / url() /
remote() via query() with the allowlist set must either unset the allowlist or route file access through query_file().

When CHDB_MCP_FILE_ALLOWLIST is unset (the default), behavior is unchanged.

Closed allowlist bypasses

  • query_file() now scans the user SQL before {file} substitution, so a UNION smuggling a second file('/etc/passwd', …) past the path-only check is rejected.
  • Scanner normalizes backtick- and double-quote-wrapped function names before matching, so quoting the identifier no longer bypasses it.
  • Denylist replaced with an inverted design: a small SAFE_TABLE_FUNCTIONS set plus a runtime snapshot of system.table_functions. The gate now catches executable, python, cosn,
    oss, fileCluster, urlCluster, paimon*, iceberg*Azure/S3/HDFS, prometheusQuery*, ytsaurus, etc.
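
As a rough illustration of the inverted design, the sketch below flags any call whose name appears in a snapshot of known table functions but not in a small safe set, after stripping backtick or double-quote wrapping. It is not the project's implementation: blocked_table_functions, the regex, the case-insensitive comparison, and the contents of the safe set here are assumptions, and the snapshot is passed in as a plain set rather than read from system.table_functions.

```python
import re

# Hypothetical stand-in for SAFE_TABLE_FUNCTIONS. Only the names spelled out
# in the notes are listed; the mergeTree* and timeSeries* families are omitted.
SAFE_TABLE_FUNCTIONS = frozenset(
    name.casefold()
    for name in ("numbers", "values", "view", "merge", "dictionary", "generateRandom")
)

# An identifier, optionally wrapped in backticks or double quotes, followed by "(".
_CALL = re.compile(r'(?:`([^`]+)`|"([^"]+)"|([A-Za-z_][A-Za-z0-9_]*))\s*\(')


def blocked_table_functions(sql: str, known_table_functions: set[str]) -> list[str]:
    """Return called names that are known table functions but not on the safe set.

    known_table_functions plays the role of the runtime snapshot of
    system.table_functions. Names are compared case-insensitively, which is a
    conservative assumption of this sketch.
    """
    known = {name.casefold() for name in known_table_functions}
    blocked = []
    for match in _CALL.finditer(sql):
        # Exactly one of the three groups is set; quoting is already stripped.
        name = (match.group(1) or match.group(2) or match.group(3)).casefold()
        if name in known and name not in SAFE_TABLE_FUNCTIONS:
            blocked.append(name)
    return blocked


snapshot = {"file", "url", "s3", "numbers", "executable"}
print(blocked_table_functions("SELECT * FROM `file`('/etc/passwd', 'CSV')", snapshot))
print(blocked_table_functions("SELECT count() FROM numbers(10)", snapshot))
```

Because the check is "known but not safe", a table function added in a future chDB release would be blocked by default as soon as it shows up in the snapshot, which is the main advantage over a hand-maintained denylist. This sketch does not mask string literals or comments, so it can over-report; it errs on the side of blocking.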

New tool

  • list_functions(pattern=None) — returns name, is_aggregate, case_insensitive, alias_to from system.functions, optionally substring-filtered.

Resource caps

  • CHDB_MCP_MAX_RESULT_BYTES (default 1 MiB) is now enforced engine-side via max_block_size + max_result_bytes + result_overflow_mode='break', not just a post-hoc Python slice.
  • New CHDB_MCP_QUERY_TIMEOUT_SEC (default 30, 0 disables) injects chDB's max_execution_time.
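
The mapping from the two environment variables to engine settings might look like the sketch below. The function names, the integer parsing, the max_block_size value, and the idea of appending a SETTINGS clause are all assumptions made for illustration; the notes state only which settings are involved and what the defaults are.

```python
import os
from collections.abc import Mapping


def resource_settings(env: Mapping[str, str] = os.environ) -> dict[str, object]:
    """Translate the two environment variables into ClickHouse settings."""
    max_bytes = int(env.get("CHDB_MCP_MAX_RESULT_BYTES", 1024 * 1024))  # default 1 MiB
    timeout = int(env.get("CHDB_MCP_QUERY_TIMEOUT_SEC", 30))  # default 30 s
    settings: dict[str, object] = {
        "max_result_bytes": max_bytes,
        "result_overflow_mode": "break",  # stop producing rows instead of raising
        "max_block_size": 8192,  # hypothetical value; the notes do not give one
    }
    if timeout > 0:  # 0 disables the timeout
        settings["max_execution_time"] = timeout
    return settings


def settings_clause(settings: Mapping[str, object]) -> str:
    """Render the settings as a SQL SETTINGS clause (one possible injection point)."""
    parts = [
        f"{key} = {value!r}" if isinstance(value, str) else f"{key} = {value}"
        for key, value in settings.items()
    ]
    return " SETTINGS " + ", ".join(parts)


print(settings_clause(resource_settings({})))
```

Enforcing the byte cap inside the engine means an oversized result is cut off while the query runs, rather than being fully materialized and then sliced in Python.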

SQL escape hardening

  • quote_string escapes backslashes too — payloads using ClickHouse's \' escape form can no longer break out of the literal.
  • Scanner masks string literals, line comments, and block comments in a single left-to-right pass.
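
Both ideas can be sketched in a few lines. The quote_string below mirrors the behavior described above (backslashes escaped before quotes), and mask_sql is a hypothetical name for a single-pass masker that blanks out string literals, -- line comments, and non-nested /* */ block comments so a later scan cannot be fooled by their contents. Neither is taken from the project's source.

```python
def quote_string(value: str) -> str:
    """Wrap value in single quotes, escaping backslashes first, then quotes."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def mask_sql(sql: str) -> str:
    """Replace literals and comments with spaces in one left-to-right pass.

    The output has the same length as the input, so offsets stay valid.
    """
    out: list[str] = []
    i, n = 0, len(sql)
    while i < n:
        if sql[i] == "'":
            j = i + 1
            while j < n:
                if sql[j] == "\\" and j + 1 < n:
                    j += 2  # backslash escape: skip the escaped character
                elif sql[j] == "'" and j + 1 < n and sql[j + 1] == "'":
                    j += 2  # doubled quote inside the literal
                elif sql[j] == "'":
                    j += 1  # closing quote
                    break
                else:
                    j += 1
        elif sql.startswith("--", i):
            end = sql.find("\n", i)
            j = n if end == -1 else end  # keep the newline itself
        elif sql.startswith("/*", i):
            end = sql.find("*/", i + 2)
            j = n if end == -1 else end + 2
        else:
            out.append(sql[i])
            i += 1
            continue
        out.append(" " * (j - i))
        i = j
    return "".join(out)


print(quote_string("it's"))
print(mask_sql("SELECT 'file(x)' /* s3( */ FROM t -- url("))
```

Escaping the backslash first matters: if only the quote were escaped, an input ending in a backslash followed by a quote would leave the quote unescaped after substitution and terminate the literal early.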

Quality of life

  • Handshake reports chdb-mcp v0.2.0 instead of the mcp SDK version.
  • Structured per-tool logging: tool=X fmt=Y elapsed_ms=Z bytes=N truncated=T sql=… on stderr.
  • InitializeResult.capabilities.prompts / resources return None rather than empty objects.
  • CI now enforces mypy --strict src/chdb_mcp.
  • server.json synced to 0.2.0 and lists CHDB_MCP_QUERY_TIMEOUT_SEC.
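
For the structured log line mentioned above, a formatter along these lines would produce the key=value layout shown. It is only a sketch: format_log_line, the whitespace collapsing, the 200-character cap on the SQL text, and the integer rendering of elapsed_ms are assumptions, not details from the release.

```python
import sys


def format_log_line(
    tool: str,
    fmt: str,
    elapsed_ms: float,
    nbytes: int,
    truncated: bool,
    sql: str,
    max_sql: int = 200,  # hypothetical cap on the logged SQL text
) -> str:
    """Render one per-tool log record in key=value form."""
    one_line = " ".join(sql.split())  # collapse newlines and runs of whitespace
    if len(one_line) > max_sql:
        one_line = one_line[:max_sql] + "..."
    return (
        f"tool={tool} fmt={fmt} elapsed_ms={int(elapsed_ms)} "
        f"bytes={nbytes} truncated={truncated} sql={one_line}"
    )


# Log to stderr so stdout stays free for the MCP stdio transport.
print(format_log_line("query", "CSV", 12.7, 42, False, "SELECT\n  1"), file=sys.stderr)
```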

Compatibility

Python 3.11 – 3.14, Linux + macOS, tested on chDB 4.1.7 / ClickHouse 26.3.9.

PRs: #4 (initial hardening), #5 (post-review allowlist redesign + 0.2.0 bump).