mcp-data-platform-v1.52.0

github-actions released this 09 Apr 08:30
5c1ced1

Memory Layer

Agents now accumulate knowledge across sessions. Corrections, preferences, business context, and domain expertise persist instead of disappearing when a session ends.

Memory is the storage layer. Knowledge capture is the governance workflow on top: admins review observations and decide what belongs in the DataHub catalog.

New Tools

Tool | Purpose
memory_manage | Create, update, archive, and list memories. Agents call this proactively when users share context.
memory_recall | Retrieve memories via entity lookup, vector similarity (pgvector), DataHub lineage traversal, or all combined.

Both tools are opt-in per persona (add memory_* to tools.allow).
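
To illustrate how a wildcard entry such as memory_* could select both tools, here is a minimal Python sketch. It assumes shell-style glob matching; the platform's actual matching rules are not described in these notes. The allowed_tools helper and the trino_query tool name are hypothetical.

```python
from fnmatch import fnmatchcase


def allowed_tools(allow_patterns, available_tools):
    """Return the tools whose names match at least one allow pattern.

    Assumption: patterns are shell-style globs, compared case-sensitively.
    """
    return [
        tool
        for tool in available_tools
        if any(fnmatchcase(tool, pattern) for pattern in allow_patterns)
    ]


# "trino_query" is a made-up tool name used only for contrast.
tools = ["memory_manage", "memory_recall", "capture_insight", "trino_query"]
print(allowed_tools(["memory_*"], tools))  # ['memory_manage', 'memory_recall']
```

A persona without a matching pattern would see neither memory tool, which is what "opt-in" means here.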

Proactive Capture

Agents no longer wait to be told "capture this." When a user says something like "stores close at 9pm" or "that column excludes returns," the agent records it automatically. The capture_insight and memory_manage tool descriptions now instruct agents to capture knowledge during normal conversation.

Cross-Injection

Active memories are automatically attached to Trino, DataHub, and S3 toolkit responses. When a query touches a dataset that has related memories, the context shows up without anyone asking for it.
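
The idea can be sketched in a few lines of Python. This is an illustration only, not the platform's implementation: the dictionary shapes, the entity_urns and memories keys, and the inject_memories helper are all assumptions made for the example.

```python
def inject_memories(response, memories):
    """Return a copy of a toolkit response with related active memories attached.

    Assumptions: a response lists the entity URNs it touched under
    "entity_urns", and each memory carries "entity_urn", "status", "content".
    """
    touched = set(response.get("entity_urns", []))
    related = [
        m for m in memories
        if m["status"] == "active" and m["entity_urn"] in touched
    ]
    enriched = dict(response)
    if related:
        enriched["memories"] = [m["content"] for m in related]
    return enriched


memories = [
    {"entity_urn": "urn:example:sales", "status": "active",
     "content": "that column excludes returns"},
    {"entity_urn": "urn:example:stores", "status": "stale",
     "content": "stores close at 9pm"},
]
response = {"rows": 3, "entity_urns": ["urn:example:sales"]}
print(inject_memories(response, memories)["memories"])
```

Responses that touch no dataset with active memories pass through unchanged.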

Staleness Detection

A background watcher checks active memories against DataHub entity state. When a referenced dataset is deprecated or its schema changes, the memory is flagged as stale and excluded from default recall. Admins review stale memories via memory_manage(command='review_stale').
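
A rough Python sketch of one watcher pass follows. The field names (schema_hash_at_capture, deprecated, schema_hash) and both helpers are hypothetical; the notes do not say how the platform tracks schema changes, so comparing a stored hash is only one plausible approach.

```python
def flag_stale(memories, entity_state):
    """Mark active memories stale when their dataset is deprecated or changed.

    Assumptions: entity_state maps an entity URN to its current DataHub state,
    and each memory stored a schema hash at capture time.
    """
    for memory in memories:
        state = entity_state.get(memory["entity_urn"])
        if state is None or memory["status"] != "active":
            continue
        schema_changed = state["schema_hash"] != memory["schema_hash_at_capture"]
        if state["deprecated"] or schema_changed:
            memory["status"] = "stale"
    return memories


def default_recall(memories):
    """Default recall returns only active memories, so stale ones drop out."""
    return [m for m in memories if m["status"] == "active"]


memories = [
    {"id": 1, "entity_urn": "urn:example:orders", "status": "active",
     "schema_hash_at_capture": "abc"},
    {"id": 2, "entity_urn": "urn:example:stores", "status": "active",
     "schema_hash_at_capture": "def"},
]
state = {
    "urn:example:orders": {"deprecated": False, "schema_hash": "xyz"},
    "urn:example:stores": {"deprecated": False, "schema_hash": "def"},
}
print([m["id"] for m in default_recall(flag_stale(memories, state))])  # [2]
```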

Vector Search

Memory records are embedded with Ollama (nomic-embed-text, 768 dimensions), and the embeddings are stored and searched with pgvector. Semantic recall finds relevant memories even when the exact entity URN is not known. To enable it, set memory.embedding.provider: ollama in your platform config.
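
To show what similarity-based recall means, here is a small pure-Python sketch that ranks records by cosine similarity. It is a conceptual illustration only: in the platform the search runs in PostgreSQL via pgvector, the distance metric it uses is not stated in these notes, and real embeddings have 768 dimensions rather than the 3 used here. The record shape and helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_embedding, records, k=2):
    """Return the content of the k records most similar to the query."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_embedding, r["embedding"]),
        reverse=True,
    )
    return [r["content"] for r in ranked[:k]]


# Toy 3-dimensional embeddings standing in for 768-dimensional ones.
records = [
    {"content": "stores close at 9pm", "embedding": [0.9, 0.1, 0.0]},
    {"content": "that column excludes returns", "embedding": [0.0, 1.0, 0.0]},
    {"content": "holiday hours differ", "embedding": [0.7, 0.7, 0.0]},
]
print(top_k([1.0, 0.0, 0.0], records))
# ['stores close at 9pm', 'holiday hours differ']
```

Because ranking depends on vector closeness rather than an exact key, a query phrased differently from the stored memory can still retrieve it.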

Unified Knowledge & Memory Portal

The admin portal merges the previous Knowledge and Memory pages into a single "Knowledge & Memory" page with five tabs: Overview, Knowledge Capture, All Memory, Changesets, and Help.

Users get their own "Knowledge & Memory" page showing their captured insights and memories (read-only).

Knowledge Toolkit Refactor

capture_insight and apply_knowledge now write to and read from memory_records instead of the legacy knowledge_insights table. Migration 000031 handles the data migration automatically. Existing insights are preserved with their original status.

Enabled by Default

Memory, Knowledge, Portal, and Audit toolkits now default to enabled when a database is available. Set enabled: false on a toolkit to disable it explicitly. Existing deployments that already enable these features need no config change.

Admin and Portal API

New REST endpoints for memory management:

  • GET /api/v1/admin/memory/records -- list, filter, paginate
  • GET /api/v1/admin/memory/records/stats -- aggregated counts
  • GET/PUT/DELETE /api/v1/admin/memory/records/{id} -- get, update, archive
  • GET /api/v1/portal/memory/records -- user-scoped list
  • GET /api/v1/portal/memory/records/stats -- user-scoped stats
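
As an example of calling one of these endpoints, the Python sketch below builds (but does not send) a request for the admin stats route. The base URL, port, and bearer-token authentication are assumptions, since these notes do not describe how the API is served or secured; the build_request helper is hypothetical. Filter and pagination parameter names are also not given here, so the params argument is left generic.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_request(base_url, path, token, params=None):
    """Build a GET request for a memory endpoint without sending it.

    Assumptions: the API accepts a bearer token and returns JSON.
    """
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    if params:
        url += "?" + urlencode(params)
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return Request(url, headers=headers)


# Hypothetical host and token; the path comes from the list above.
req = build_request(
    "http://localhost:8080",
    "/api/v1/admin/memory/records/stats",
    "example-token",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then parse the body with json.load.
```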

Configuration

memory:
  embedding:
    provider: ollama
    ollama:
      url: "http://localhost:11434"
      model: "nomic-embed-text"
  staleness:
    enabled: true
    interval: 15m
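
The interval value reads like a duration string. The Python sketch below shows how a value such as 15m could be turned into seconds, assuming Go-style duration syntax; that syntax is an assumption, and the sketch handles only whole-number s, m, and h units. The parse_interval helper is hypothetical.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_interval(text):
    """Convert a duration like '15m' or '1h30m' to seconds.

    Raises ValueError for anything other than whole-number s/m/h components.
    """
    parts = re.findall(r"(\d+)([smh])", text)
    # Reject input with leftovers that the pattern did not consume.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported interval: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


print(parse_interval("15m"))  # 900
```

With the configuration above, the staleness watcher would run every 900 seconds.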

Dev Environment

The dev PostgreSQL image is now pgvector/pgvector:pg16 (was postgres:16-alpine) to support the pgvector extension. Run ollama pull nomic-embed-text locally for semantic recall.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.52.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.52.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.52.0_linux_amd64.tar.gz