mcp-data-platform-v1.52.0
Memory Layer
Agents now accumulate knowledge across sessions. Corrections, preferences, business context, and domain expertise persist instead of disappearing when a session ends.
Memory is the storage layer. Knowledge capture is the governance workflow on top: admins review observations and decide what belongs in the DataHub catalog.
New Tools
| Tool | Purpose |
|---|---|
| memory_manage | Create, update, archive, and list memories. Agents call this proactively when users share context. |
| memory_recall | Retrieve memories via entity lookup, vector similarity (pgvector), DataHub lineage traversal, or all combined. |
Both tools are opt-in per persona (add memory_* to tools.allow).
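As a rough illustration of the opt-in, a single wildcard entry such as memory_* would cover both new tools. The sketch below uses Python's fnmatch as a stand-in for the platform's own matcher, which is an assumption; the persona dictionary layout and the is_allowed helper are hypothetical and only mirror the tools.allow key named above.

```python
from fnmatch import fnmatchcase

# Hypothetical persona definition; only the tools.allow key comes from the notes.
persona = {"tools": {"allow": ["memory_*"]}}


def is_allowed(tool_name: str, persona: dict) -> bool:
    """Return True if any allow pattern matches the tool name (case-sensitive)."""
    patterns = persona.get("tools", {}).get("allow", [])
    return any(fnmatchcase(tool_name, pattern) for pattern in patterns)


for tool in ("memory_manage", "memory_recall", "capture_insight"):
    print(tool, is_allowed(tool, persona))
```

With only memory_* in the list, capture_insight would still need its own allow entry.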
Proactive Capture
Agents no longer wait to be told "capture this." When a user says something like "stores close at 9pm" or "that column excludes returns," the agent records it automatically. The capture_insight and memory_manage tool descriptions now instruct agents to capture knowledge during normal conversation.
Cross-Injection
Active memories are automatically attached to Trino, DataHub, and S3 toolkit responses. When a query touches a dataset that has related memories, the context shows up without anyone asking for it.
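The idea can be sketched as a small enrichment step that runs after a toolkit produces its response. This is an assumption-laden illustration, not the platform's implementation: the Memory class, the attach_memories helper, the "memories" response field, and the example URN are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    entity_urn: str          # DataHub URN the memory refers to
    content: str
    status: str = "active"   # only active memories are injected in this sketch


def attach_memories(response: dict, touched_urns: list[str],
                    memories: list[Memory]) -> dict:
    """Return a copy of the response with related active memories attached."""
    related = [
        m.content
        for m in memories
        if m.status == "active" and m.entity_urn in touched_urns
    ]
    enriched = dict(response)
    if related:
        enriched["memories"] = related  # hypothetical field name
    return enriched


# Illustrative data only.
SALES_URN = "urn:li:dataset:(urn:li:dataPlatform:trino,retail.sales,PROD)"
memories = [
    Memory(SALES_URN, "that column excludes returns"),
    Memory(SALES_URN, "superseded note", status="archived"),
]

print(attach_memories({"row_count": 3}, [SALES_URN], memories))
```

A response for a dataset with no related memories is returned unchanged, so callers see no extra field unless there is context to show.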
Staleness Detection
A background watcher checks active memories against DataHub entity state. When a referenced dataset is deprecated or its schema changes, the memory is flagged as stale and excluded from default recall. Admins review stale memories via memory_manage(command='review_stale').
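A minimal sketch of one watcher pass, assuming each memory keeps a fingerprint of the schema it was captured against. The EntityState and Memory classes, the schema_hash fields, and the flag_stale and default_recall helpers are hypothetical names chosen for illustration; how the platform actually detects schema changes is not described here.

```python
from dataclasses import dataclass


@dataclass
class EntityState:
    deprecated: bool
    schema_hash: str              # hypothetical fingerprint of the current schema


@dataclass
class Memory:
    id: str
    entity_urn: str
    schema_hash_at_capture: str   # hypothetical fingerprint stored with the memory
    status: str = "active"


def flag_stale(memories: list[Memory], catalog: dict[str, EntityState]) -> list[str]:
    """Mark active memories stale when their entity is deprecated or its schema changed."""
    flagged = []
    for memory in memories:
        if memory.status != "active":
            continue
        state = catalog.get(memory.entity_urn)
        if state is None:
            continue  # entity not found: left untouched in this sketch
        if state.deprecated or state.schema_hash != memory.schema_hash_at_capture:
            memory.status = "stale"
            flagged.append(memory.id)
    return flagged


def default_recall(memories: list[Memory]) -> list[Memory]:
    """Default recall skips anything that is not active."""
    return [m for m in memories if m.status == "active"]


# Illustrative data only.
catalog = {
    "urn:a": EntityState(deprecated=False, schema_hash="v1"),
    "urn:b": EntityState(deprecated=True, schema_hash="v1"),
    "urn:c": EntityState(deprecated=False, schema_hash="v2"),
}
memories = [
    Memory("m1", "urn:a", "v1"),
    Memory("m2", "urn:b", "v1"),
    Memory("m3", "urn:c", "v1"),
]

flagged = flag_stale(memories, catalog)
print(flagged, [m.id for m in default_recall(memories)])
```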
Vector Search
Memory records are embedded with Ollama (nomic-embed-text, 768 dimensions) and stored and searched with pgvector. Semantic recall finds relevant memories even when the exact entity URN is not known. Configure with memory.embedding.provider: ollama in your platform config.
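To show what semantic recall does conceptually, the sketch below ranks records by cosine similarity in plain Python. It is only an illustration: in the platform the comparison happens inside PostgreSQL through pgvector, the distance metric actually used is not stated in these notes, and the three-dimensional toy vectors stand in for real 768-dimensional embeddings. The recall helper and the record layout are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_vec: list[float], records: list[dict], top_k: int = 2) -> list[str]:
    """Return the content of the top_k records most similar to the query vector."""
    ranked = sorted(
        records,
        key=lambda r: cosine_similarity(query_vec, r["embedding"]),
        reverse=True,
    )
    return [r["content"] for r in ranked[:top_k]]


# Toy data: made-up embeddings, not model output.
records = [
    {"content": "stores close at 9pm", "embedding": [0.9, 0.1, 0.0]},
    {"content": "that column excludes returns", "embedding": [0.1, 0.9, 0.1]},
    {"content": "unrelated note", "embedding": [0.0, 0.2, 0.9]},
]

print(recall([0.8, 0.2, 0.0], records, top_k=1))
```

No entity URN appears in the query; the ranking depends only on how close the vectors are.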
Unified Knowledge & Memory Portal
The admin portal merges the previous Knowledge and Memory pages into a single "Knowledge & Memory" page with five tabs: Overview, Knowledge Capture, All Memory, Changesets, and Help.
Users get their own "Knowledge & Memory" page showing their captured insights and memories (read-only).
Knowledge Toolkit Refactor
capture_insight and apply_knowledge now write to and read from memory_records instead of the legacy knowledge_insights table. Migration 000031 handles the data migration automatically. Existing insights are preserved with their original status.
Enabled by Default
Memory, Knowledge, Portal, and Audit toolkits now default to enabled when a database is available. Set enabled: false to explicitly disable. No config change needed for existing deployments with these features already enabled.
Admin and Portal API
New REST endpoints for memory management:
GET /api/v1/admin/memory/records -- list, filter, paginate
GET /api/v1/admin/memory/records/stats -- aggregated counts
GET/PUT/DELETE /api/v1/admin/memory/records/{id} -- get, update, archive
GET /api/v1/portal/memory/records -- user-scoped list
GET /api/v1/portal/memory/records/stats -- user-scoped stats
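A small sketch of how a client might build requests for the single-record admin endpoint. The base URL, the admin_record_request helper, and the record ID are hypothetical, and authentication is left out because these notes do not describe it. The request is only constructed here, not sent.

```python
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # hypothetical host and port
ALLOWED_METHODS = {"GET", "PUT", "DELETE"}  # methods listed for /records/{id}


def admin_record_request(record_id: str, method: str = "GET") -> Request:
    """Build (but do not send) a request for one admin memory record."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    url = f"{BASE_URL}/api/v1/admin/memory/records/{record_id}"
    return Request(url, method=method)


# DELETE archives the record, per the endpoint list above.
req = admin_record_request("example-id", "DELETE")
print(req.get_method(), req.full_url)
```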
Configuration
memory:
  embedding:
    provider: ollama
    ollama:
      url: "http://localhost:11434"
      model: "nomic-embed-text"
  staleness:
    enabled: true
    interval: 15m
Dev Environment
The dev PostgreSQL image is now pgvector/pgvector:pg16 (was postgres:16-alpine) to support the pgvector extension. Run ollama pull nomic-embed-text locally for semantic recall.
Installation
Homebrew (macOS)
brew install txn2/tap/mcp-data-platform
Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform
Docker
docker pull ghcr.io/txn2/mcp-data-platform:v1.52.0
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-data-platform_1.52.0_linux_amd64.tar.gz.sigstore.json \
mcp-data-platform_1.52.0_linux_amd64.tar.gz