Skip to content

mcp-data-platform-v1.72.0

Choose a tag to compare

@github-actions github-actions released this 31 May 01:09
· 54 commits to main since this release
8bb8dc8

Overview

This release adds platform_find_tools, a discovery tool that ranks the platform's own registered tools by semantic similarity to a natural-language task description, and makes the shared index-jobs framework (introduced in v1.71.0) multi-consumer. It is the second consumer of that framework, after api-catalog operation embeddings.

It carries one additive database migration (see Upgrade notes). There are no changes to existing tools, the admin API, or configuration.

What platform_find_tools does

platform_find_tools(query, limit) embeds a natural-language query and returns the most relevant registered tools by cosine similarity, each with a score, filtered to the tools the caller's persona is permitted to call. It is the tool-catalog analogue of api_list_endpoints semantic ranking. When no embedding provider is configured or the index is empty, it falls back to a lexical name/description match and says so in a note.

Positioning: this is the application-level form of semantic tool discovery and the embedding-index foundation for server-side tools/list filtering (MCP SEP-1821). The protocol-level filter, where the server returns only the relevant subset, is tracked separately; this release ships the index and ranking it will build on.

How it works

  • A "tools" consumer on the shared index-jobs queue embeds every globally-visible tool descriptor (name, description, and a parameter-schema summary) into a new tool_embeddings table (pgvector). Unlike the api-catalog consumer, whose corpus is a database table, the tool corpus is the in-process registry, so the source enumerates the live tools through an internal listing and the embed pass is persona-neutral.
  • The tools consumer re-syncs on the reconciler's schedule rather than via count-based gap detection: a description edit or a visibility flip changes the live descriptors without changing the stored count, so the worker re-enumerates each sweep and its text-hash dedup re-embeds only the descriptors that actually changed (an unchanged pass costs no embedding calls). Tool additions, removals, description-override edits, and visibility flips are picked up automatically.
  • Ranking uses pgvector cosine distance against tool_embeddings. Persona scoping is applied at read time with the same predicate the tools/list visibility middleware uses (one shared implementation, so the two cannot diverge). Embeddings are persona-neutral and indexed once; results are filtered per caller, not duplicated per persona.

Framework change

The index-jobs wiring no longer gates the entire queue on the api-catalog store being present. The shared store, worker, reaper, and reconciler now wire whenever a database and a configured embedding provider are available. api-catalog registers only when its catalog store exists; the tools consumer always registers. This makes the framework genuinely multi-consumer.

Upgrade notes

  • Migration 000053_tool_embeddings adds the pgvector(768) table keyed (source_id, tool_name). No foreign key (tools are not database rows). The 768 dimensionality matches the platform's pinned embedding model, consistent with api_catalog_operation_embeddings.
  • No operator action is required. On a deployment with a database and a configured embedding provider, the tools index builds on the next boot and re-syncs automatically. On deployments without an embedding provider, indexing stays dormant and platform_find_tools falls back to lexical matching.

Compatibility

  • Existing MCP tools, the admin REST API, and configuration are unchanged.
  • The api-catalog embedding consumer (its source, sink, and admin status endpoints) is behaviorally unchanged; only the queue's wiring gate moved.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.72.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.72.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.72.0_linux_amd64.tar.gz