mcp-data-platform-v0.27.0

github-actions released this 23 Feb 08:38
· 329 commits to main since this release
2a39f91

Session-Aware Workflow Gating & Tool Description Overrides (#152)

LLM agents consistently skip DataHub discovery and jump straight to trino_query, bypassing the curated business context that makes this platform valuable. The static require_datahub_check hint fired on every query regardless of whether discovery had already happened, creating hint fatigue that agents learned to ignore.

v0.27.0 replaces the static hint with four complementary layers that adapt to actual agent behavior within a session.

Built-in Description Overrides (zero config)

The platform now ships with built-in description overrides for trino_query and trino_execute that instruct agents to call datahub_search first. These activate automatically in tools/list responses — no configuration required. Agents see the guidance at tool discovery time, before they even attempt a query.

A new MCPDescriptionOverrideMiddleware intercepts tools/list responses and replaces descriptions for matching tool names. Deployers can customize or extend overrides via configuration:

tools:
  description_overrides:
    trino_query: "My custom description..."    # Overrides the built-in default
    my_custom_tool: "Explain what this does"   # Adds new override

Config entries take precedence over built-in defaults.
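The precedence rule can be pictured as a two-step map merge followed by a rewrite of the tools/list entries. The following is a minimal sketch only: the Tool type, the function names, and the built-in wording are hypothetical stand-ins, not the code in pkg/middleware/mcp_descriptions.go.

```go
package main

import "fmt"

// Tool is a hypothetical, simplified stand-in for a tools/list entry.
type Tool struct {
	Name        string
	Description string
}

// builtinOverrides imitates the shipped defaults; the wording is illustrative.
var builtinOverrides = map[string]string{
	"trino_query":   "Run a SQL query. Call datahub_search first.",
	"trino_execute": "Execute a SQL statement. Call datahub_search first.",
}

// mergeOverrides returns a new map in which config entries win over built-ins.
func mergeOverrides(builtin, config map[string]string) map[string]string {
	merged := make(map[string]string, len(builtin)+len(config))
	for name, desc := range builtin {
		merged[name] = desc
	}
	for name, desc := range config {
		merged[name] = desc // same key: the config value replaces the default
	}
	return merged
}

// applyOverrides returns a copy of tools with descriptions replaced for
// every tool whose name has an override; other tools pass through unchanged.
func applyOverrides(tools []Tool, overrides map[string]string) []Tool {
	out := make([]Tool, len(tools))
	for i, t := range tools {
		if desc, ok := overrides[t.Name]; ok {
			t.Description = desc
		}
		out[i] = t
	}
	return out
}

func main() {
	config := map[string]string{
		"trino_query":    "My custom description...",
		"my_custom_tool": "Explain what this does",
	}
	tools := []Tool{
		{Name: "trino_query", Description: "original"},
		{Name: "trino_execute", Description: "original"},
		{Name: "datahub_search", Description: "original"},
	}
	for _, t := range applyOverrides(tools, mergeOverrides(builtinOverrides, config)) {
		fmt.Printf("%s: %s\n", t.Name, t.Description)
	}
}
```

In this sketch trino_query picks up the configured text, trino_execute keeps the built-in default, and datahub_search is left alone because no override names it.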

Session-Aware Workflow Tracking

A new SessionWorkflowTracker maintains per-session state recording which tool categories have been called. For each session it records whether any discovery tool (all 11 datahub_* tools by default) has been called before a query tool (trino_query, trino_execute). The tracker follows the same concurrent-map + TTL-cleanup pattern as SessionEnrichmentCache.

Key behavior: warning counts reset to zero when any discovery tool is called, rewarding agents that comply with the discovery-first workflow.

Adaptive Rule Enforcement with Escalation

MCPRuleEnforcementMiddleware has been rewritten to accept a RuleEnforcementConfig struct and now has two code paths:

  • Session-aware path (when WorkflowTracker is configured): checks actual session state, warns only when discovery hasn't happened, and escalates after a configurable number of repeated violations.
  • Static fallback path (no tracker): preserves existing require_datahub_check behavior for backward compatibility.

Opt-in via configuration:

workflow:
  require_discovery_before_query: true
  # discovery_tools: []           # Defaults to all datahub_* tools
  # query_tools: []               # Defaults to trino_query, trino_execute
  # warning_message: ""           # Custom warning text
  escalation:
    after_warnings: 3             # Escalate after 3 ignored warnings
    # escalation_message: ""      # Custom escalation ({count} placeholder)

Without the workflow section, the platform falls back to the existing static hint, so existing YAML configurations behave exactly as before.
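The session-aware decision can be sketched as a small function over per-session state. The names below are hypothetical, and two behaviors are assumptions rather than documented facts: escalation begins once more than after_warnings warnings have been issued, and {count} expands to the running warning count.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Escalation mirrors the workflow.escalation YAML keys; the Go names are hypothetical.
type Escalation struct {
	AfterWarnings int
	Message       string // may contain a {count} placeholder
}

// session holds the two facts the check needs.
type session struct {
	discovered bool
	warnings   int
}

// checkQuery returns the text to prepend to a query result, or "" when
// discovery has already happened in this session.
func checkQuery(s *session, warning string, esc Escalation) string {
	if s.discovered {
		return ""
	}
	s.warnings++
	// Assumption: escalate only after more than AfterWarnings warnings.
	if esc.AfterWarnings > 0 && s.warnings > esc.AfterWarnings {
		return strings.ReplaceAll(esc.Message, "{count}", strconv.Itoa(s.warnings))
	}
	return warning
}

func main() {
	const warning = "Call datahub_search before querying."
	esc := Escalation{
		AfterWarnings: 3,
		Message:       "Discovery skipped {count} times; call datahub_search now.",
	}
	s := &session{}
	for i := 0; i < 4; i++ {
		fmt.Println(checkQuery(s, warning, esc))
	}
	// A discovery call would mark the session and reset the count.
	s.discovered, s.warnings = true, 0
	fmt.Printf("%q\n", checkQuery(s, warning, esc))
}
```

With after_warnings set to 3, the first three undiscovered queries in this sketch get the plain warning and the fourth gets the escalation message; once discovery is recorded, nothing is prepended.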

Enrichment Discovery Note (Soft Signal Layer)

When MCPSemanticEnrichmentMiddleware enriches a query result but no discovery has occurred in the session, a soft note is appended inline with the enrichment context:

Note: No DataHub discovery has been performed in this session yet. Call datahub_search to understand the business context, ownership, and data quality of the tables you are querying.

This fires only when enrichment actually added content (not on errors or unenriched results) and disappears once discovery occurs.
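The three conditions (enrichment added content, the result is not an error, no discovery yet) reduce to a single guard. The sketch below is illustrative only; the function name and its parameters are hypothetical, and only the note text is taken from the release notes.

```go
package main

import "fmt"

// discoveryNote is the note text quoted in the release notes.
const discoveryNote = "Note: No DataHub discovery has been performed in this session yet. " +
	"Call datahub_search to understand the business context, ownership, and data quality " +
	"of the tables you are querying."

// withDiscoveryNote appends the note to the enrichment text only when
// enrichment produced content, the result is not an error, and the session
// has not yet called a discovery tool. Otherwise the input is returned as-is.
func withDiscoveryNote(enrichment string, isError, discovered bool) string {
	if isError || enrichment == "" || discovered {
		return enrichment
	}
	return enrichment + "\n\n" + discoveryNote
}

func main() {
	fmt.Println(withDiscoveryNote("Owner: data-eng", false, false)) // note appended
	fmt.Println(withDiscoveryNote("Owner: data-eng", false, true))  // discovery done: no note
}
```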

How the Layers Work Together

| Layer | Fires at | What agents see |
|---|---|---|
| Description overrides | Tool discovery (tools/list) | Modified tool descriptions with "call datahub_search first" |
| Rule enforcement | Query execution (pre-handler) | Warning prepended to results, escalating after threshold |
| Enrichment note | Response enrichment (post-handler) | Soft note inline with semantic context |
| Session tracking | Auth middleware (every call) | Nothing directly; feeds data to the other layers |

Middleware Chain (Updated)

tools/list path:
  Icons → DescOverrides → Visibility → Apps Metadata → handler

tools/call path:
  Auth/Authz (+RecordToolCall) → Audit → Rules (+session check)
    → Client Logging → Enrichment (+discovery note) → handler

New Files

| File | Purpose |
|---|---|
| pkg/middleware/mcp_descriptions.go | Description override middleware + built-in defaults + merge logic |
| pkg/middleware/session_workflow.go | Per-session workflow tracker with concurrent map + TTL cleanup |

Breaking Changes

  • MCPRuleEnforcementMiddleware signature changed from (engine, hints) to (cfg RuleEnforcementConfig). This only affects Go library consumers who call the function directly. Binary users and YAML configuration are unaffected.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v0.27.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_0.27.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_0.27.0_linux_amd64.tar.gz