mcp-data-platform-v0.24.0
Highlights
v0.24.0 delivers significant token savings for semantic enrichment — reducing repeat-query overhead by up to 90% and eliminating bloat from undocumented columns. Enrichment token cost is now tracked through the audit pipeline for observability.
Features
Omit empty column descriptions from enrichment (#130)
Columns with no meaningful catalog metadata are now filtered from enrichment responses. Previously, every column was included even when the catalog had no description, tags, glossary terms, PII flags, or inherited metadata — common with auto-ingested schemas.
- New ColumnContext.HasContent() method checks for meaningful metadata before inclusion
- When all columns lack metadata, a column_context_note replaces the empty column_context map
- Impact: A 50-column table with 5 documented columns drops from ~3,000 to ~500 enrichment tokens
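The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: only the HasContent method name and the column_context_note key come from these notes. The struct fields, the filterColumns helper, and the note text are assumptions made for the example.

```go
package main

import "fmt"

// ColumnContext is a hypothetical shape for per-column catalog metadata.
// The field names are assumptions; only HasContent is named in the notes.
type ColumnContext struct {
	Description   string
	Tags          []string
	GlossaryTerms []string
	IsPII         bool
	InheritedFrom string
}

// HasContent reports whether the column carries any meaningful metadata.
func (c ColumnContext) HasContent() bool {
	return c.Description != "" ||
		len(c.Tags) > 0 ||
		len(c.GlossaryTerms) > 0 ||
		c.IsPII ||
		c.InheritedFrom != ""
}

// filterColumns (hypothetical helper) keeps only columns with content.
// When nothing survives, it returns a nil map and a note that a caller
// could emit as column_context_note instead of an empty column_context.
func filterColumns(cols map[string]ColumnContext) (map[string]ColumnContext, string) {
	kept := make(map[string]ColumnContext)
	for name, c := range cols {
		if c.HasContent() {
			kept[name] = c
		}
	}
	if len(kept) == 0 {
		return nil, "No column metadata available in the catalog."
	}
	return kept, ""
}

func main() {
	cols := map[string]ColumnContext{
		"id":    {},
		"email": {Description: "Customer email", IsPII: true},
	}
	kept, note := filterColumns(cols)
	fmt.Println(len(kept), note == "")
}
```

Returning the note alongside the map keeps the "all columns undocumented" case explicit for the caller, rather than leaving it to infer meaning from an empty map.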
Compact dedup summary mode (#131)
The "summary" dedup mode previously re-sent the full semantic_context (description, owners, tags, glossary terms, custom properties) on repeat queries — just without column details. Now it sends a compact_context containing only critical safety-relevant fields:
| Field | Included when |
|---|---|
| urn | Always (if set) |
| domain | Always (if set) |
| deprecation | Only when actively deprecated |
| quality_score | Always (if set) |
| critical_tags | Tags matching: pii, sensitive, quality, restricted, confidential |
Description, owners, full tag list, glossary terms, and custom properties are omitted since they were already sent in the full enrichment earlier in the session.
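As a rough sketch of how the compact form could be derived from the full context: the JSON keys below follow the table above, but the Go type names, the SemanticContext fields, the toCompact helper, and the matching rule (case-insensitive substring match against the listed keywords) are all assumptions. The notes do not specify how tags are matched.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// SemanticContext is a hypothetical stand-in for the full enrichment payload.
type SemanticContext struct {
	URN             string
	Domain          string
	Description     string
	Owners          []string
	Tags            []string
	Deprecated      bool
	DeprecationNote string
	QualityScore    *float64
}

// CompactContext mirrors the fields listed in the table above.
type CompactContext struct {
	URN          string   `json:"urn,omitempty"`
	Domain       string   `json:"domain,omitempty"`
	Deprecation  string   `json:"deprecation,omitempty"`
	QualityScore *float64 `json:"quality_score,omitempty"`
	CriticalTags []string `json:"critical_tags,omitempty"`
}

var criticalKeywords = []string{"pii", "sensitive", "quality", "restricted", "confidential"}

// isCritical assumes a case-insensitive substring match; the real rule may differ.
func isCritical(tag string) bool {
	lower := strings.ToLower(tag)
	for _, kw := range criticalKeywords {
		if strings.Contains(lower, kw) {
			return true
		}
	}
	return false
}

// toCompact (hypothetical helper) drops description, owners, and
// non-critical tags, keeping only the safety-relevant fields.
func toCompact(sc SemanticContext) CompactContext {
	cc := CompactContext{URN: sc.URN, Domain: sc.Domain, QualityScore: sc.QualityScore}
	if sc.Deprecated {
		cc.Deprecation = sc.DeprecationNote
		if cc.Deprecation == "" {
			cc.Deprecation = "deprecated"
		}
	}
	for _, tag := range sc.Tags {
		if isCritical(tag) {
			cc.CriticalTags = append(cc.CriticalTags, tag)
		}
	}
	return cc
}

func main() {
	score := 0.92
	full := SemanticContext{
		URN:          "urn:example:orders",
		Domain:       "sales",
		Description:  "Order facts",
		Owners:       []string{"data-team"},
		Tags:         []string{"PII", "finance"},
		QualityScore: &score,
	}
	out, err := json.Marshal(map[string]CompactContext{"compact_context": toCompact(full)})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```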
Enrichment token tracking (#131)
Enrichment token cost is now estimated and recorded in every audit event:
- Token estimation uses a len(chars) / 4 heuristic applied before and after enrichment
- Full enrichment token counts are stored per-table in the session cache
- Dedup enrichment measures actual tokens sent vs. what full enrichment would have cost
- Two new fields in audit events: enrichment_tokens_full and enrichment_tokens_dedup
- Cumulative counters on SessionEnrichmentCache for runtime diagnostics
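The estimation heuristic is simple enough to show in a few lines. The sketch below is illustrative only: the function names and the tokenCounters type are hypothetical, and it assumes "chars" means Unicode code points rather than bytes, which the notes do not state.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// estimateTokens applies the len(chars) / 4 heuristic.
// Counting runes instead of bytes is an assumption.
func estimateTokens(text string) int {
	return utf8.RuneCountInString(text) / 4
}

// enrichmentTokens estimates the cost added by enrichment as the
// difference between the response after and before enrichment.
func enrichmentTokens(before, after string) int {
	delta := estimateTokens(after) - estimateTokens(before)
	if delta < 0 {
		return 0
	}
	return delta
}

// tokenCounters is a hypothetical stand-in for the cumulative counters
// the notes describe on SessionEnrichmentCache.
type tokenCounters struct {
	Full  int // what full enrichment would have cost
	Dedup int // what was actually sent
}

func (c *tokenCounters) record(full, dedup int) {
	c.Full += full
	c.Dedup += dedup
}

func (c *tokenCounters) saved() int {
	return c.Full - c.Dedup
}

func main() {
	raw := `{"rows":[1,2,3]}`
	enriched := raw + `{"compact_context":{"urn":"urn:example:orders"}}`

	var counters tokenCounters
	counters.record(750, enrichmentTokens(raw, enriched))
	fmt.Println("full:", counters.Full, "dedup:", counters.Dedup, "saved:", counters.saved())
}
```

A character-count heuristic is only an approximation of real tokenizer output, so the recorded values are best read as relative indicators rather than exact token counts.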
Database Migration
Migration 000011 adds two columns to audit_logs:
ALTER TABLE audit_logs ADD COLUMN enrichment_tokens_full INTEGER DEFAULT 0;
ALTER TABLE audit_logs ADD COLUMN enrichment_tokens_dedup INTEGER DEFAULT 0;
This migration runs automatically on startup. Existing rows default to 0.
Backward Compatibility
- Session persistence is fully backward-compatible. Existing persisted sessions using the old map[string]time.Time format are parsed correctly with TokenCount: 0. New sessions persist as SentTableEntry with both sent_at and token_count.
- The compact_context key replaces semantic_context in dedup summary responses. Clients consuming the "summary" dedup mode should update to look for compact_context instead.
- The "reference" and "none" dedup modes are unchanged.
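One way to accept both persisted formats is a custom UnmarshalJSON that first tries the legacy bare timestamp and then falls back to the new object form. This is a sketch under assumptions: the SentTableEntry name and the sent_at and token_count keys come from these notes, while the decoding strategy and the parseSent helper are illustrative and may not match the actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// SentTableEntry records when a table's enrichment was sent and its token count.
type SentTableEntry struct {
	SentAt     time.Time `json:"sent_at"`
	TokenCount int       `json:"token_count"`
}

// UnmarshalJSON accepts both the legacy value (a bare timestamp string,
// as produced by map[string]time.Time) and the new object form.
func (e *SentTableEntry) UnmarshalJSON(data []byte) error {
	// Legacy format: the value is a quoted RFC 3339 timestamp.
	var t time.Time
	if err := json.Unmarshal(data, &t); err == nil {
		*e = SentTableEntry{SentAt: t} // TokenCount stays 0
		return nil
	}
	// New format: decode through an alias type so this method is not
	// invoked recursively.
	type plain SentTableEntry
	var p plain
	if err := json.Unmarshal(data, &p); err != nil {
		return err
	}
	*e = SentTableEntry(p)
	return nil
}

// parseSent (hypothetical helper) decodes a persisted table-to-entry map.
func parseSent(raw string) (map[string]SentTableEntry, error) {
	var sent map[string]SentTableEntry
	if err := json.Unmarshal([]byte(raw), &sent); err != nil {
		return nil, err
	}
	return sent, nil
}

func main() {
	// "orders" uses the legacy format, "users" uses the new one.
	raw := `{"orders":"2024-01-02T03:04:05Z","users":{"sent_at":"2024-01-02T03:04:05Z","token_count":420}}`
	sent, err := parseSent(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(sent["orders"].TokenCount, sent["users"].TokenCount)
}
```

Legacy entries decode with a token count of 0, so any savings calculation that depends on a stored full count would presumably have nothing to compare against until the table is enriched again in a new session.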
Installation
Homebrew (macOS)
brew install txn2/tap/mcp-data-platform
Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform
Docker
docker pull ghcr.io/txn2/mcp-data-platform:v0.24.0
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-data-platform_0.24.0_linux_amd64.tar.gz.sigstore.json \
mcp-data-platform_0.24.0_linux_amd64.tar.gz