mcp-data-platform-v1.56.0

Released by github-actions on 19 Apr 03:50 (commit a70a3f3)

Highlights

New: trino_export — Query-to-Asset Export (#331)

A new MCP tool that executes a validated SQL query and writes the full result set directly to S3 as an immutable portal asset, bypassing the LLM token budget. The agent shapes queries interactively with trino_query, then hands off bulk execution to trino_export when the shape is finalized.

Parameters: sql, format (csv/json/markdown/text), name, plus optional connection, description, tags, limit, idempotency_key, timeout_seconds, create_public_link.

Returns: Asset metadata only (ID, portal URL, share URL, row count, size) — data goes to S3, not through the LLM.
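
To make the parameter list concrete, here is a minimal sketch of what a trino_export invocation could look like as an MCP tools/call request. Only the parameter names come from the notes above; the SQL, the asset name, the tag values, and the value types (for example, tags as a list of strings) are illustrative assumptions.

```python
import json

# Hypothetical values; only the parameter names come from the release notes.
arguments = {
    "sql": "SELECT order_id, total FROM sales.orders WHERE order_date >= DATE '2024-01-01'",
    "format": "csv",                      # csv, json, markdown, or text
    "name": "orders-2024",
    "description": "Orders placed since 2024-01-01",
    "tags": ["sales", "export"],          # assumed to be a list of strings
    "limit": 50000,
    "idempotency_key": "orders-2024-run-1",
    "timeout_seconds": 120,
    "create_public_link": False,
}

# MCP tool invocations are JSON-RPC 2.0 requests with method "tools/call".
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "trino_export", "arguments": arguments},
}

print(json.dumps(request, indent=2))
```

In practice an MCP client library builds this envelope; the sketch only shows which fields the agent would fill in once the query shape is final.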

Key capabilities:

  • Four output formats — CSV (with formula injection escaping), JSON, Markdown, plain text
  • Hard size caps — configurable max_rows (default 100K) and max_bytes (default 100 MB) per deployment
  • Query timeout — configurable default_timeout / max_timeout, separate from size caps
  • Idempotency — idempotency_key prevents duplicate assets on retry, backed by a partial unique index with race condition handling
  • Sensitivity inheritance — if a query touches a PII-classified table (via DataHub tags), the resulting asset automatically inherits _sys-classification:pii tags
  • Public share links — create_public_link: true generates a shareable URL for automation pipelines, with default notice text
  • Read-only enforcement — exports are SELECT-only by definition, so read-only mode applies regardless of the deployment's read_only setting
  • Provenance — captures session tool call history, source tables, query SQL, and format in the asset's provenance metadata
  • No partial state — no asset record is created unless the S3 write fully succeeds
  • Persona/RBAC gate — requires explicit authorization via tools allow/deny patterns
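
The notes do not say how the CSV formula injection escaping works. A common approach, sketched here as an assumption rather than the tool's actual behavior, is to prefix any string cell that starts with a formula trigger character with a single quote so spreadsheet applications treat it as text. The function names are hypothetical.

```python
import csv
import io

# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")

def escape_cell(value):
    """Prefix risky string cells with a single quote so they render as text."""
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value

def rows_to_csv(header, rows):
    """Serialize rows to CSV text, escaping every data cell on the way out."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([escape_cell(cell) for cell in row])
    return buf.getvalue()

print(rows_to_csv(["name", "note"], [["alice", "=SUM(A1:A9)"], ["bob", "ok"]]))
```

Numeric values pass through unchanged, but a string such as "-5" would be prefixed as well, which is the usual trade-off with this technique.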

Configuration:

portal:
  export:
    enabled: true          # auto-enabled when portal + trino configured
    max_rows: 100000
    max_bytes: 104857600   # 100 MB
    default_timeout: "5m"
    max_timeout: "10m"
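
One plausible reading of how default_timeout, max_timeout, and the per-call timeout_seconds interact is sketched below. Clamping an oversized request to max_timeout is an assumption; the server could just as well reject it. The helper names are hypothetical, and the parser only handles single-unit durations like the ones in the config above.

```python
import re

_UNITS = {"s": 1, "m": 60, "h": 3600}

def parse_duration(text):
    """Parse a simple single-unit duration such as "5m" or "90s" into seconds."""
    match = re.fullmatch(r"(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]

def effective_timeout(requested_seconds, default_timeout="5m", max_timeout="10m"):
    """Use the default when no timeout is requested; never exceed the maximum.

    Clamping is an assumption; a server could reject oversized requests instead.
    """
    if requested_seconds is None:
        return parse_duration(default_timeout)
    return min(requested_seconds, parse_duration(max_timeout))

print(effective_timeout(None), effective_timeout(120), effective_timeout(3600))
```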

New migration: 000033_export_idempotency adds idempotency_key column with partial unique index to portal_assets.
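
The idea behind a partial unique index for idempotency can be illustrated with an in-memory SQLite database. This is a sketch only: the production schema is not shown in these notes, the index name and the extra columns are hypothetical, and the conflict handling is one possible way to resolve a retry or race.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE portal_assets (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    idempotency_key TEXT
);
-- Partial unique index: uniqueness applies only to rows that carry a key.
CREATE UNIQUE INDEX idx_assets_idempotency_key
    ON portal_assets (idempotency_key)
    WHERE idempotency_key IS NOT NULL;
""")

def create_asset(name, idempotency_key=None):
    """Insert an asset, or return the existing id when the key was already used."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO portal_assets (name, idempotency_key) VALUES (?, ?)",
                (name, idempotency_key),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # A previous (or concurrent) request won the insert; reuse its row.
        row = conn.execute(
            "SELECT id FROM portal_assets WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        return row[0]

first = create_asset("orders-2024", "run-1")
retry = create_asset("orders-2024", "run-1")
print(first == retry)
```

Letting the database reject the duplicate, instead of checking for the key before inserting, is what keeps two concurrent requests from both creating an asset.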

Large Asset Preview Guard

Assets exceeding 2 MB are no longer loaded inline in the portal viewer. A "too large to preview" message with a download button is shown instead. This prevents the browser from choking on multi-megabyte CSV or JSON exports.

Affected surfaces:

  • Authenticated portal (asset viewer, preview modal, admin viewer)
  • Public viewer (backend skips S3 fetch, renders download prompt)
  • New download endpoint: GET /portal/view/{token}/content for public share content downloads
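
The guard amounts to a size check before any content is fetched. A minimal sketch follows, assuming "2 MB" means 2 × 1024 × 1024 bytes (consistent with the 104857600-byte value used for 100 MB in the export config); the function names are hypothetical.

```python
PREVIEW_LIMIT_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" means 2 MiB

def preview_mode(size_bytes, limit=PREVIEW_LIMIT_BYTES):
    """Return "inline" for assets at or under the limit, else "download"."""
    return "inline" if size_bytes <= limit else "download"

def public_download_path(token):
    """Build the public download path from the release notes for a share token."""
    return f"/portal/view/{token}/content"

print(preview_mode(500_000), preview_mode(50_000_000))
print(public_download_path("example-token"))
```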

Provenance Middleware: Multiple Harvest Tools

MCPProvenanceMiddleware now accepts variadic tool names for provenance harvesting. Both save_artifact and trino_export trigger session tool call capture.
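
As a rough illustration of the variadic pattern, the following Python analogue accepts any number of tool names and snapshots the session history when one of them is called. It is not the project's middleware: the class, its methods, and the choice to include the triggering call in the snapshot are all assumptions.

```python
class ProvenanceMiddleware:
    """Hypothetical Python analogue; the real middleware is not shown in the notes."""

    def __init__(self, *harvest_tools):
        # Variadic argument: any number of tool names can trigger a harvest.
        self.harvest_tools = frozenset(harvest_tools)
        self.session_calls = []
        self.harvested = []

    def on_tool_call(self, tool_name, arguments):
        """Record the call, then harvest the history if the tool is a trigger."""
        self.session_calls.append({"tool": tool_name, "arguments": arguments})
        if tool_name in self.harvest_tools:
            # Snapshot the session history so far, including this call.
            self.harvested.append(list(self.session_calls))

mw = ProvenanceMiddleware("save_artifact", "trino_export")
mw.on_tool_call("trino_query", {"sql": "SELECT 1"})
mw.on_tool_call("trino_export", {"sql": "SELECT 1", "format": "csv", "name": "one"})
print(len(mw.harvested), len(mw.harvested[0]))
```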

Other Changes

  • docs: comprehensive portal documentation with screenshots (#329)
  • docs: complete OpenAPI spec with examples for all endpoints (#330)
  • deps: bump go.opentelemetry.io/otel/sdk from 1.41.0 to 1.43.0 (#328)
  • deps: bump dompurify from 3.3.2 to 3.4.0 (#327)

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.56.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.56.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.56.0_linux_amd64.tar.gz