mcp-data-platform-v0.21.1

Released by github-actions on 17 Feb 08:32 · commit d46a761

What's New in v0.21.1

Knowledge insights now track their origin, a unified connections tool replaces per-toolkit variants, and cleanup routines across the platform get proper structured logging.

Knowledge Insight Source Tracking

The capture_insight tool gains a source field that distinguishes where an insight came from:

Source            When to use
user (default)    User shares domain knowledge during conversation
agent_discovery   Agent figures something out independently by sampling data, finding join relationships, or identifying quality patterns
enrichment_gap    Agent flags a metadata gap it cannot resolve from the data alone; needs admin attention

Backward compatible: existing insights default to user. The field is filterable in both the MCP tool and admin API.

Migration 000010 adds a source column (TEXT NOT NULL DEFAULT 'user') to the knowledge_insights table, along with an index on it.
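As a rough illustration of the three source values and the user default described above, a client or handler might normalize the field along these lines. This is a sketch only: the InsightSource type, the constant names, and NormalizeSource are hypothetical and do not come from the platform's code, and rejecting unknown values is an assumption about how validation could work.

```go
package main

import "fmt"

// InsightSource is a hypothetical type name; the release notes only
// document the three string values below.
type InsightSource string

const (
	SourceUser           InsightSource = "user"
	SourceAgentDiscovery InsightSource = "agent_discovery"
	SourceEnrichmentGap  InsightSource = "enrichment_gap"
)

// NormalizeSource returns SourceUser for an empty value, mirroring the
// documented default, and an error for anything outside the three
// documented values (assumed behavior, not confirmed by the release notes).
func NormalizeSource(raw string) (InsightSource, error) {
	switch s := InsightSource(raw); s {
	case "":
		return SourceUser, nil
	case SourceUser, SourceAgentDiscovery, SourceEnrichmentGap:
		return s, nil
	default:
		return "", fmt.Errorf("unknown insight source %q", raw)
	}
}

func main() {
	for _, raw := range []string{"", "agent_discovery", "guess"} {
		src, err := NormalizeSource(raw)
		fmt.Printf("%q -> %q (err: %v)\n", raw, src, err)
	}
}
```

Defaulting an omitted field to user matches the backward-compatible behavior noted above, so callers written before v0.21.1 need no changes.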

Expanded agent guidance prompt — three new sections teach LLM agents when to self-capture discoveries vs. ask the user:

  • Agent-Discovered Insights: When and how to record findings with source: "agent_discovery" (e.g., discovering column semantics via SELECT DISTINCT, finding undocumented joins, identifying refresh cadence)
  • When to Ask the User Instead: Ambiguous interpretations, high-impact classifications (PII, deprecation), insufficient data to draw conclusions
  • When NOT to Capture: Trivially obvious gaps without added meaning, speculative interpretations without query evidence, repeated gaps within a session

Unified list_connections Tool

Per-toolkit connection listing tools (trino_list_connections, datahub_list_connections, s3_list_connections) are replaced by a single list_connections platform tool that reports all configured data connections across every toolkit in one call. This reduces tool clutter and gives agents a single entry point to discover what data sources are available.

{
  "connections": [
    {"kind": "trino", "name": "prod", "connection": "prod-trino"},
    {"kind": "datahub", "name": "primary", "connection": "primary-datahub"},
    {"kind": "s3", "name": "data-lake", "connection": "data-lake-s3"}
  ],
  "count": 3
}
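
A consumer could decode that result with plain encoding/json. The sketch below assumes the payload has exactly the shape shown above; the Go type and function names (Connection, ListConnectionsResult, ParseConnections, NamesByKind) are illustrative and not part of the platform's API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Connection mirrors one entry of the "connections" array shown above.
type Connection struct {
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	Connection string `json:"connection"`
}

// ListConnectionsResult mirrors the top-level object shown above.
type ListConnectionsResult struct {
	Connections []Connection `json:"connections"`
	Count       int          `json:"count"`
}

// ParseConnections decodes a list_connections result payload.
func ParseConnections(payload []byte) (ListConnectionsResult, error) {
	var res ListConnectionsResult
	if err := json.Unmarshal(payload, &res); err != nil {
		return ListConnectionsResult{}, fmt.Errorf("decode list_connections result: %w", err)
	}
	return res, nil
}

// NamesByKind groups connection names by toolkit kind.
func NamesByKind(res ListConnectionsResult) map[string][]string {
	out := make(map[string][]string)
	for _, c := range res.Connections {
		out[c.Kind] = append(out[c.Kind], c.Name)
	}
	return out
}

// sample is the example payload from the release notes.
const sample = `{
  "connections": [
    {"kind": "trino", "name": "prod", "connection": "prod-trino"},
    {"kind": "datahub", "name": "primary", "connection": "primary-datahub"},
    {"kind": "s3", "name": "data-lake", "connection": "data-lake-s3"}
  ],
  "count": 3
}`

func main() {
	res, err := ParseConnections([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("count:", res.Count)
	fmt.Println("trino connections:", NamesByKind(res)["trino"])
}
```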

The pkg/tools placeholder package is removed — it existed only to hold an example toolkit that was never used in production.

Structured Logging and Cleanup Improvements

  • LOG_LEVEL environment variable: Configure slog JSON logging level at startup (debug, info, warn, error). Defaults to info.
  • Cleanup goroutines: Audit log cleanup, OAuth token/code expiration, and session cleanup routines now use slog.Warn for error reporting instead of silently discarding errors.
  • Lifecycle rollback: Extracted a rollback() helper that logs individual stop-callback failures during startup rollback instead of ignoring them.
  • Config cleanup: Removed unused UserPersonas field from persona mapper and unused IdleTimeout field from platform config.
  • Server factory simplification: NewWithDefaults() returns (*mcp.Server, error) only — toolkit lifecycle is now fully managed by the platform, not the server factory.

Breaking Changes

  • Per-toolkit *_list_connections tools removed: Agents that called trino_list_connections, datahub_list_connections, or s3_list_connections should use list_connections instead.
  • internal/server.NewWithDefaults() signature change: Returns (*mcp.Server, error) instead of (*mcp.Server, Toolkit, error). Only affects consumers using the library API directly.
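
For clients that still hold the removed tool names in configuration or prompts, a small translation layer is one possible migration path. The sketch below is illustrative only: MigrateToolCall and legacyKinds are hypothetical names, and the returned kind is meant for filtering the unified result on the client side, since the notes do not say whether list_connections accepts a filter argument.

```go
package main

import "fmt"

// legacyKinds maps each removed per-toolkit tool to the "kind" value the
// unified list_connections result reports for that toolkit.
var legacyKinds = map[string]string{
	"trino_list_connections":   "trino",
	"datahub_list_connections": "datahub",
	"s3_list_connections":      "s3",
}

// MigrateToolCall returns the tool name to call in v0.21.1 and, for a
// removed tool, the kind to filter the result by on the client side.
// Tool names that were not removed pass through unchanged.
func MigrateToolCall(tool string) (name, kind string, legacy bool) {
	if k, ok := legacyKinds[tool]; ok {
		return "list_connections", k, true
	}
	return tool, "", false
}

func main() {
	for _, tool := range []string{"trino_list_connections", "capture_insight"} {
		name, kind, legacy := MigrateToolCall(tool)
		fmt.Printf("%s -> %s (kind filter: %q, legacy: %v)\n", tool, name, kind, legacy)
	}
}
```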

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v0.21.1

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_0.21.1_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_0.21.1_linux_amd64.tar.gz