mcp-data-platform-v1.71.0

@github-actions github-actions released this 30 May 23:18
3ec5a1a

Overview

This release generalizes the api-catalog embedding queue into a reusable, source-kind-agnostic indexing-job framework (pkg/indexjobs) and migrates the api-catalog toolkit to consume it as the first client. It is an internal infrastructure change: there are no new tools, no admin-API changes, and no configuration changes for users or operators. It does carry a database migration (see Upgrade notes below).

The framework is the foundation for upcoming semantic-search consumers (tool discovery, prompt library, knowledge-insight recall, portal asset search), each of which will plug in as a small Source + Sink rather than forking the queue.

Upgrade notes

This release applies two migrations on startup:

  • 000051_index_jobs adds the shared index_jobs queue table, keyed on an opaque (source_kind, source_id) pair.
  • 000052_drop_api_catalog_embedding_jobs removes the old per-toolkit api_catalog_embedding_jobs table.

No operator action is required. The dropped table held only transient queue rows. On first boot after the upgrade, the reconciler compares each spec's operation_count against its persisted vector count and re-enqueues any gaps; the worker re-converges them through the new queue. During that brief reconvergence window, api_list_endpoints semantic/hybrid ranking falls back to lexical for any not-yet-reindexed spec, exactly as it does for a freshly added spec.
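
The gap check described above can be pictured with a small sketch. This is an illustration only, not the project's reconciler: the specCounts type, the findGaps function, and the sample spec IDs are hypothetical, and it assumes a "gap" means fewer persisted vectors than declared operations.

```go
package main

import "fmt"

// specCounts is a hypothetical row shape: one api-catalog spec with its
// declared operation_count and the number of vectors persisted for it.
type specCounts struct {
	SpecID         string
	OperationCount int
	VectorCount    int
}

// findGaps returns the IDs of specs that have fewer persisted vectors than
// operations. Under the assumption stated above, these are the specs a
// reconciler would re-enqueue.
func findGaps(specs []specCounts) []string {
	var gaps []string
	for _, s := range specs {
		if s.VectorCount < s.OperationCount {
			gaps = append(gaps, s.SpecID)
		}
	}
	return gaps
}

func main() {
	specs := []specCounts{
		{SpecID: "petstore", OperationCount: 20, VectorCount: 20}, // fully indexed
		{SpecID: "billing", OperationCount: 35, VectorCount: 12},  // partially indexed
		{SpecID: "new-spec", OperationCount: 8, VectorCount: 0},   // never indexed
	}
	fmt.Println(findGaps(specs))
}
```

In this sketch a fully indexed spec is skipped, which matches the note that existing embedding data is not recomputed.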

The api-catalog vector table (api_catalog_operation_embeddings) is untouched and keeps its ON DELETE CASCADE to api_catalog_specs, so no embedding data is recomputed or moved and spec deletion still cascades to its vectors.

Rollback is supported: the 000052 down migration recreates api_catalog_embedding_jobs.

What changed

New: pkg/indexjobs framework

A Postgres-backed job queue generic over (source_kind, source_id). The queue mechanics are the proven pattern from the api-catalog queue: lease-based claim with FOR UPDATE SKIP LOCKED, exponential-backoff retry, a reaper that releases expired leases, LISTEN/NOTIFY low-latency wake-ups, and a periodic gap reconciler. Consumers implement two small contracts:

  • Source declares what text to embed for a source_id and an optional post-embed hook.
  • Sink declares where vectors live and how to detect gaps for that kind.
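
To make the two contracts concrete, here is a minimal sketch of how they might look in Go. The method names and signatures are assumptions made for illustration; the release notes do not spell out the real pkg/indexjobs interfaces, and Registry, demoSource, and demoSink are hypothetical.

```go
package main

import (
	"context"
	"fmt"
)

// Source is a hypothetical shape for the "what to embed" contract.
type Source interface {
	Kind() string
	// Texts returns item key -> text to embed for one source_id.
	Texts(ctx context.Context, sourceID string) (map[string]string, error)
	// AfterEmbed is the optional post-embed hook; it may be a no-op.
	AfterEmbed(ctx context.Context, sourceID string) error
}

// Sink is a hypothetical shape for the "where vectors live" contract.
type Sink interface {
	Kind() string
	Store(ctx context.Context, sourceID, itemKey string, vector []float32) error
	// Gaps returns source_ids of this kind that are missing vectors.
	Gaps(ctx context.Context) ([]string, error)
}

type pair struct {
	source Source
	sink   Sink
}

// Registry maps a source_kind to its Source/Sink pair.
type Registry map[string]pair

// Register rejects mismatched kinds and duplicate registrations.
func (r Registry) Register(src Source, snk Sink) error {
	if src.Kind() != snk.Kind() {
		return fmt.Errorf("kind mismatch: source %q, sink %q", src.Kind(), snk.Kind())
	}
	if _, dup := r[src.Kind()]; dup {
		return fmt.Errorf("kind %q already registered", src.Kind())
	}
	r[src.Kind()] = pair{source: src, sink: snk}
	return nil
}

// demoSource and demoSink are toy in-memory implementations.
type demoSource struct{}

func (demoSource) Kind() string { return "demo" }
func (demoSource) Texts(_ context.Context, id string) (map[string]string, error) {
	return map[string]string{id + "#1": "list pets"}, nil
}
func (demoSource) AfterEmbed(context.Context, string) error { return nil }

type demoSink struct{ stored map[string][]float32 }

func (s *demoSink) Kind() string { return "demo" }
func (s *demoSink) Store(_ context.Context, sourceID, itemKey string, v []float32) error {
	s.stored[sourceID+"/"+itemKey] = v
	return nil
}
func (s *demoSink) Gaps(context.Context) ([]string, error) { return nil, nil }

func main() {
	ctx := context.Background()
	reg := Registry{}
	sink := &demoSink{stored: map[string][]float32{}}
	if err := reg.Register(demoSource{}, sink); err != nil {
		fmt.Println("register:", err)
		return
	}
	p := reg["demo"] // a worker would route on the job row's source_kind
	texts, _ := p.source.Texts(ctx, "spec-1")
	for key := range texts {
		_ = p.sink.Store(ctx, "spec-1", key, []float32{0.1, 0.2}) // placeholder vector
	}
	_ = p.source.AfterEmbed(ctx, "spec-1")
	fmt.Println("stored vectors:", len(sink.stored))
}
```

Keying the registry on the kind string is one plausible way for a single worker pool to serve several consumers; the real framework may route differently.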

The framework owns everything in between: SHA-256 text-hash dedup, batched embedding-provider calls, chunk-boundary progress, incremental persistence, and the full claim/lease/retry/reaper/reconcile state machine. One worker pool, one reaper, and one reconciler serve every registered kind, routing by the source_kind on each job row.
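
As an illustration of the text-hash dedup step, the sketch below hashes each item's text with SHA-256 and selects only the items whose hash differs from a previously stored one. The textHash and needsEmbedding names, the hex encoding, and the map-based storage are assumptions for this example, not the framework's actual code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// textHash returns the hex-encoded SHA-256 digest of the text.
func textHash(text string) string {
	sum := sha256.Sum256([]byte(text))
	return hex.EncodeToString(sum[:])
}

// needsEmbedding returns the keys whose current text hash differs from the
// stored hash (or that have no stored hash yet), sorted for stable output.
// Items with an unchanged hash can skip the embedding-provider call.
func needsEmbedding(current, storedHashes map[string]string) []string {
	var out []string
	for key, text := range current {
		if storedHashes[key] != textHash(text) {
			out = append(out, key)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	stored := map[string]string{
		"GET /pets":  textHash("List all pets"),
		"POST /pets": textHash("Create a pet"),
	}
	current := map[string]string{
		"GET /pets":    "List all pets",            // unchanged: skipped
		"POST /pets":   "Create a pet (with tags)", // text changed: re-embed
		"DELETE /pets": "Delete a pet",             // new: embed
	}
	fmt.Println(needsEmbedding(current, stored))
}
```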

api-catalog migrated to the framework

The api-catalog toolkit is the first consumer (pkg/toolkits/apigateway/catalogindex). Its Sink writes the existing api_catalog_operation_embeddings table; an AdminStore backs the admin handler from index_jobs joined to the api-catalog tables.

  • The admin endpoints (/api/v1/admin/api-catalogs/{id}/embedding-status, embedding-health, embedding-jobs) keep identical URLs and JSON shapes.
  • The apigateway.embed_jobs.* configuration keys (workers, batch_size, lease_duration, embed_timeout) are unchanged.

Removed

The pkg/toolkits/apigateway/embedjobs package is removed; its behavior is now provided by pkg/indexjobs plus the api-catalog Source/Sink.

Compatibility

  • MCP tools: unchanged.
  • Admin REST API: unchanged (URLs and response shapes).
  • Configuration: unchanged.
  • Database: two migrations (one additive, one cleanup) applied automatically; rollback supported.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.71.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.71.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.71.0_linux_amd64.tar.gz