mcp-data-platform-v0.16.0

Released by github-actions on 10 Feb 19:41 (commit 956695e, 372 commits to main since this release)

This release delivers the knowledge management system — the platform's first write-back capability. Domain experts capture insights during sessions, admins review and approve them, and approved changes are written directly to the DataHub catalog with full rollback support.

Closes #81, #65, #69.

Knowledge Capture (capture_insight tool) — #65

A low-friction tool available to all personas for recording domain knowledge discovered during data exploration.

  • Six categories: correction, business_context, data_quality, usage_guidance, relationship, enhancement
  • Confidence levels: high, medium, low
  • Rich context: entity URNs (up to 10), related columns (up to 20), suggested catalog actions (up to 5)
  • Session-aware: automatically captures user identity, session ID, and active persona from platform context
  • New migration: 000006_knowledge_insights creates the knowledge_insights table, with JSONB columns for entity URNs, related columns, and suggested actions
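
The limits listed above can be sketched as a small validation step. This is an illustrative Python sketch, not the platform's code: the `Insight` field names and the `validate_insight` helper are hypothetical, and only the category names, confidence levels, and size limits come from these notes.

```python
from dataclasses import dataclass, field

# Enumerations and limits as stated in the release notes.
CATEGORIES = {
    "correction", "business_context", "data_quality",
    "usage_guidance", "relationship", "enhancement",
}
CONFIDENCE_LEVELS = {"high", "medium", "low"}
MAX_ENTITY_URNS = 10
MAX_RELATED_COLUMNS = 20
MAX_SUGGESTED_ACTIONS = 5


@dataclass
class Insight:
    # Hypothetical field names; the real tool schema may differ.
    category: str
    confidence: str
    text: str
    entity_urns: list = field(default_factory=list)
    related_columns: list = field(default_factory=list)
    suggested_actions: list = field(default_factory=list)


def validate_insight(insight: Insight) -> list:
    """Return a list of problems; an empty list means the insight is valid."""
    problems = []
    if insight.category not in CATEGORIES:
        problems.append(f"unknown category: {insight.category}")
    if insight.confidence not in CONFIDENCE_LEVELS:
        problems.append(f"unknown confidence: {insight.confidence}")
    if len(insight.entity_urns) > MAX_ENTITY_URNS:
        problems.append(f"too many entity URNs (max {MAX_ENTITY_URNS})")
    if len(insight.related_columns) > MAX_RELATED_COLUMNS:
        problems.append(f"too many related columns (max {MAX_RELATED_COLUMNS})")
    if len(insight.suggested_actions) > MAX_SUGGESTED_ACTIONS:
        problems.append(f"too many suggested actions (max {MAX_SUGGESTED_ACTIONS})")
    return problems
```

Returning every problem at once, rather than raising on the first, keeps the tool low-friction: a caller can fix the whole payload in one pass.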

Knowledge Apply (apply_knowledge tool) — #81

An admin-only tool for the full insight lifecycle: review, approve, synthesize, apply, and rollback.

| Action | What it does |
| --- | --- |
| bulk_review | Summary of all pending insights grouped by entity |
| review | Insights for a specific entity alongside current DataHub metadata |
| approve / reject | Status transitions with reviewer notes |
| synthesize | Builds structured change proposals from approved insights |
| apply | Writes changes to DataHub, records changeset for rollback |

Supported catalog changes: update_description, add_tag, add_glossary_term, add_documentation, flag_quality_issue. All changes are atomic per entity — if any write fails, nothing is applied.
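
One way to picture the all-or-nothing guarantee is to stage changes on a copy and commit only if every write succeeds. The Python sketch below uses an in-memory dict as a stand-in for the catalog; `apply_changes_atomically`, `rollback`, and `simple_write` are hypothetical names, and a real DataHub writer would more likely need compensating writes than a local copy.

```python
import copy


class WriteError(Exception):
    """Raised when a single catalog write cannot be performed."""


def simple_write(metadata, change_type, value):
    # Handles two of the supported change types; anything else fails.
    if change_type == "update_description":
        metadata["description"] = value
    elif change_type == "add_tag":
        metadata.setdefault("tags", []).append(value)
    else:
        raise WriteError(f"unsupported change type: {change_type}")


def apply_changes_atomically(catalog, entity_urn, changes, write=simple_write):
    """Apply every change to one entity, or none of them.

    Returns a changeset with before/after snapshots for later rollback.
    """
    before = copy.deepcopy(catalog.get(entity_urn, {}))
    working = copy.deepcopy(before)
    for change_type, value in changes:
        write(working, change_type, value)  # may raise; catalog is still untouched
    catalog[entity_urn] = working  # commit only after all writes succeeded
    return {
        "entity_urn": entity_urn,
        "before": before,
        "after": copy.deepcopy(working),
    }


def rollback(catalog, changeset):
    """Restore the entity to the snapshot taken before the changeset."""
    catalog[changeset["entity_urn"]] = copy.deepcopy(changeset["before"])
```

Because the "before" snapshot is stored with the changeset, a rollback does not depend on the catalog's current state.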

Insight lifecycle

pending → approved → applied → rolled_back
       ↘ rejected
       ↘ superseded (when newer insight is captured for same entity)
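
The diagram can be read as a small state machine. The Python sketch below assumes that rejected and superseded both branch from pending, and that rejected, superseded, and rolled_back are terminal; the `ALLOWED_TRANSITIONS` table and `transition` helper are hypothetical names.

```python
# Transition table read off the lifecycle diagram (an interpretation, not
# confirmed platform behavior).
ALLOWED_TRANSITIONS = {
    "pending": {"approved", "rejected", "superseded"},
    "approved": {"applied"},
    "applied": {"rolled_back"},
    "rejected": set(),
    "superseded": set(),
    "rolled_back": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise ValueError for an illegal move."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

For example, `transition("pending", "approved")` returns `"approved"`, while `transition("rejected", "applied")` raises `ValueError`.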

Admin REST API

Full HTTP endpoints for managing insights and changesets outside the MCP protocol:

  • GET/PUT /api/v1/admin/knowledge/insights — list, filter, edit, transition status
  • GET /api/v1/admin/knowledge/insights/stats — aggregated counts by status, category, entity, confidence
  • GET /api/v1/admin/knowledge/changesets — list and filter applied changesets
  • POST /api/v1/admin/knowledge/changesets/{id}/rollback — revert changes using stored previous-value snapshots

All admin endpoints require admin authentication with 401/403 enforcement.
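
As a rough illustration of calling these endpoints from a script, the Python sketch below builds the requests without sending them. The paths come from the list above; the `status` query parameter, the Bearer token scheme, the host name, and both helper names are assumptions and may not match the actual API.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_PATH = "/api/v1/admin/knowledge"


def build_list_insights_request(base_url, token, **filters):
    """Build a GET request for the insights list (filter names are hypothetical)."""
    url = f"{base_url}{BASE_PATH}/insights"
    query = urlencode(sorted(filters.items()))
    if query:
        url += f"?{query}"
    # Assumption: admin authentication is passed as a Bearer token.
    return Request(url, method="GET", headers={"Authorization": f"Bearer {token}"})


def build_rollback_request(base_url, token, changeset_id):
    """Build a POST request that asks the server to roll back one changeset."""
    safe_id = quote(str(changeset_id), safe="")
    url = f"{base_url}{BASE_PATH}/changesets/{safe_id}/rollback"
    return Request(url, method="POST", headers={"Authorization": f"Bearer {token}"})
```

Passing either request to `urllib.request.urlopen` would send it; a 401 or 403 response surfaces as `urllib.error.HTTPError`.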

New database migrations

  • 000007_knowledge_lifecycle — adds review columns (reviewed_by, reviewed_at, review_notes) to knowledge_insights
  • 000008_knowledge_changesets — creates knowledge_changesets table with before/after JSONB snapshots; adds apply-tracking columns to knowledge_insights

Configuration

Both tools are gated by the existing persona tool-filter. capture_insight is registered whenever knowledge.enabled: true. apply_knowledge is not registered at all unless knowledge.apply.enabled: true — it does not exist as a callable tool until explicitly opted in. Persona filters are a second layer on top of that.

knowledge:
  enabled: true
  apply:
    enabled: true                  # apply_knowledge tool only exists when true
    datahub_connection: primary
    require_confirmation: true     # apply action requires explicit confirm param

personas:
  definitions:
    admin:
      tools:
        allow: ["*"]               # full access including apply_knowledge
    analyst:
      tools:
        allow: ["trino_*", "datahub_*", "capture_insight"]
        # apply_knowledge not listed — even if registered, persona filter blocks it
    etl:
      tools:
        allow: ["trino_*", "s3_*"]
        deny: ["capture_insight"]  # non-interactive, no knowledge capture

DataHub write-back

Write-back is implemented with the mcp-datahub v0.5.0 client: it updates descriptions, adds and removes tags, adds glossary terms, and adds documentation links. Every write records previous_value for rollback. When write-back is not configured, a no-op fallback is used.

Config Schema Versioning — #69

Configuration files now support apiVersion for forward-compatible schema evolution:

  • apiVersion: v1 is the current (and default) version
  • Version lifecycle: current → deprecated (with warnings) → removed (with migration guide)
  • New --migrate-config CLI flag normalizes config files and prepends explicit version markers
  • Backward compatible: configs without apiVersion default to v1
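
The version rules above can be sketched as a single resolution step. In this Python sketch the `VERSION_STATUS` table and `resolve_api_version` are hypothetical; the notes only confirm that v1 is current and is the default when apiVersion is absent.

```python
import warnings

DEFAULT_API_VERSION = "v1"
# Hypothetical lifecycle table; only "v1 is current" comes from the notes.
VERSION_STATUS = {"v1": "current"}


def resolve_api_version(config, status_table=VERSION_STATUS):
    """Return the effective apiVersion, warning or failing per its lifecycle."""
    version = config.get("apiVersion", DEFAULT_API_VERSION)
    status = status_table.get(version)
    if status is None:
        raise ValueError(f"unknown apiVersion: {version}")
    if status == "removed":
        raise ValueError(
            f"apiVersion {version} has been removed; see the migration guide"
        )
    if status == "deprecated":
        warnings.warn(f"apiVersion {version} is deprecated", DeprecationWarning)
    return version
```

A config with no apiVersion key resolves to `"v1"`, which matches the backward-compatibility rule.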

New packages

| Package | Purpose |
| --- | --- |
| pkg/toolkits/knowledge/ | Knowledge toolkit, insight/changeset stores, DataHub writer |
| pkg/admin/ | Admin REST API handler, knowledge endpoints, auth middleware |

Dependencies

  • github.com/txn2/mcp-datahub 0.4.4 → 0.5.0 (adds write-back support)

Stats

75 files changed, +14,856 / -269 lines