Skip to content

mcp-data-platform-v1.64.0

Choose a tag to compare

@github-actions github-actions released this 19 May 20:51
· 96 commits to main since this release
ffed470

Highlights

  • API gateway embed-job queue is observable and parallelizable. A long-running spec embed no longer sits silent at 0/N indexed for minutes. The worker publishes a chunk-progress counter on every batch boundary so the catalog badge ticks upward live, and a new config knob lets multiple worker goroutines drain the queue in parallel.
  • Ollama batch embedding is now actually batched. EmbedBatch calls Ollama's POST /api/embed (singular path, plural input field) instead of N sequential POST /api/embeddings. Modern servers return one HTTP round-trip per batch; older servers are detected on the first 404 and the path transparently falls back. A 164-operation spec on CPU-only Ollama drops from ~10 minutes to model-inference floor.
  • Embedding provider misconfiguration is visible and fail-safe. When memory.embedding.provider is unset the platform substitutes the noop placeholder and now: logs one actionable WARN at startup, refuses to wire the apigateway embed-job queue (no more [0,0,...,0] rows in api_catalog_operation_embeddings masquerading as indexed), persists memory writes with Embedding: nil, exposes the state via GET /api/v1/admin/embedding/status, and surfaces an amber banner in the API Catalogs and Memory panels.
  • SRE-grade observability at the chokepoints. Prometheus metrics on every tools/call (latency P50/P95/P99, error rate, in-flight counts) and on every outbound api_invoke_endpoint (HTTP status class, transport-error category, per-connection latency). Cardinality-bounded by design: a closed allow-list of attributes, user-identifying fields kept off labels.
  • Correct gateway semantics for api_invoke_endpoint. A 4xx/5xx response from the upstream is no longer reported as a gateway failure. Wire-level HTTP from the REST shim stays 200 with the upstream code embedded in the payload; only true transport failures and timeouts surface as 502 / 504.
  • Auth chain output is quieter and easier to correlate. Successful fallthroughs between OIDC, API-key, and admin-key auth no longer log routine fallthroughs as warnings. Every auth decision carries a request-correlation ID so multi-step flows can be reconstructed from a single grep.

Operator-visible changes

New configuration

apigateway:
  embed_jobs:
    workers: 1   # default; raise to 2 or 4 if you have many specs and a fast embedder

CPU-only embedders typically saturate at 1 because the bottleneck is the model. Increase only when you have evidence the worker is the bottleneck.

New admin endpoints

Method Path Purpose
GET /api/v1/admin/embedding/status Returns {kind, model, dimension, status}. status="unconfigured" indicates the noop placeholder is in use; the portal renders an amber banner in this state and the apigateway embed-job queue refuses to start.
GET /metrics Prometheus scrape endpoint. Histograms + counters on tool-call latency, outbound apigateway latency by HTTP status class, and audit-write rate.

Wire-format additions (backward compatible, omitempty)

GET /api/v1/admin/api-catalogs/{id}/embedding-status response gains:

{
  "embedded_so_far": 12
}

embedded_so_far is the worker's in-flight chunk-progress counter. While job_status == "running" the portal renders this against operation_count so a long embed pass shows incremental progress instead of staying at 0/N until the final atomic upsert commits embedding_count.

Database migration

000046_api_catalog_embedding_jobs_progress adds embedded_so_far INTEGER NOT NULL DEFAULT 0 to api_catalog_embedding_jobs. On PostgreSQL 16 this is metadata-only for INTEGER columns; no table rewrite, no extended lock. Existing deployments roll forward automatically on platform start.

UI

  • API Catalogs panel: per-spec badge ticks indexing N/M upward during a running embed, distinct from queued (amber, 0/M) and N/N indexed (green).
  • API Catalogs + Memory tab on My Knowledge: amber banner when memory.embedding.provider is unset, naming the config key and clarifying that semantic features are disabled while lexical / keyword / entity-URN lookup still works.

Detailed changes

Features

  • #428 / #431. Prometheus observability at platform chokepoints. Histograms + counters on the MCPToolCallMiddleware and on api_invoke_endpoint outbound HTTP. Two instrumentation sites cover every toolkit at once: every tool call goes through the middleware, every model-driven API call goes through the gateway. Cardinality allowlist: tool, connection, persona, status_category, http_status_class, toolkit_kind. Forbidden labels: anything user-identifying (request IDs go on trace spans, not Prometheus labels). Pure Prometheus exporter; no OTel agent required.
  • #430 / #437. Incremental embed progress + configurable worker concurrency. Worker publishes embedded_so_far at every chunk boundary via a new Store.UpdateProgress call; the column lives in api_catalog_embedding_jobs and is read alongside the existing job-status fields. New apigateway.embed_jobs.workers config (default 1) spawns N goroutines that share the queue; existing FOR UPDATE SKIP LOCKED + lease guarantee keeps them from racing on the same job. After each successful Claim the worker notifies a sibling so a backlog drains in parallel. Worker exposes Concurrency() for assertion in wiring tests.

Bug fixes

  • #429 / #436. Silent noop embedding provider stored [0,0,...,0] vectors. Five defenses now refuse to persist zero vectors: apigateway wiring skips the queue when the embedder is noop; memory handleRemember and handleUpdate persist Embedding: nil instead of zero vectors; knowledge capture_insight skips the memory-store update under noop; Provider interface gains Kind() string + a package-level IsConfigured(p) bool helper for one-line guards; startup logs one WARN naming the config key. Existing recall-side guard at memory/recall.go:127 already refused all-zero query embeddings; the write side now mirrors that.
  • #435. Ollama EmbedBatch was named "batch" but didn't batch. Sent N sequential POST /api/embeddings calls (one per text). Now sends one POST /api/embed per batch with the full input array. Detects servers that lack the batch endpoint (HTTP 404) on the first call, logs one WARN naming the URL and model, transparently falls back to the singular path. Probe is paid at most once per provider lifetime. Eliminates 31 of every 32 HTTP round-trips on modern Ollama for the default batch size of 32.
  • #432 / #433. api_invoke_endpoint conflated upstream and transport failures. A 4xx/5xx response from the upstream was wrapped as a tool error (IsError=true) and the REST shim translated it to a 502, hiding the real upstream status. Now the wrapping CallToolResult reports IsError=false for any upstream response with a status code; Status carries the upstream HTTP code; the REST shim returns 200 with the upstream code embedded in the body. Only true transport failures (DNS, dial, TLS, mid-stream EOF) and request-timeouts surface as IsError=true and map to wire 502 / 504 respectively. Outcome classification (OutcomeOK, OutcomeUpstream4xx, OutcomeUpstream5xx, OutcomeTransportErr, OutcomeUpstreamTimeout) flows through to audit logs via _meta.audit_outcome so operators can grep by category.
  • #434. Auth chain fallthrough noise and missing request correlation. OIDC followed by API-key followed by admin-key authentication logged every successful fallthrough as a WARN; an admin-key request produced two warnings for every call. Successful fallthroughs are now silent (Debug level); only auth failures or unexpected errors log at WARN. Every auth decision carries a request-correlation ID propagated from the inbound request through every chain step, so a multi-step auth flow can be reconstructed by grepping a single ID.

Upgrade notes

  • Migration is automatic. migrate.Run on startup applies 000046. No operator action required.
  • No breaking config changes. apigateway.embed_jobs.workers is new and defaults to 1, which preserves prior single-goroutine behavior.
  • No breaking wire-format changes. New JSON fields are omitempty; consumers that ignore unknown fields are unaffected.
  • The Kind() string addition to the public embedding.Provider interface is a source-incompatible change for external implementations. Both in-tree implementations (Ollama, noop) are updated. Anyone vendoring this module and implementing embedding.Provider externally needs to add Kind() string returning a stable identifier.
  • First scrape of /metrics exposes zero values. Prometheus alerting rules that fire on absent series should be reviewed; on a freshly-deployed v1.64.0 some histograms only appear after the first relevant call.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.64.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.64.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.64.0_linux_amd64.tar.gz

Full changelog

v1.63.0...v1.64.0