mcp-data-platform-v1.64.1
Highlights
Hotfix for the api-gateway embed-jobs worker timing out on every spec write against CPU-only Ollama. On v1.64.0, saving a spec with 26 or more operations left the badge cycling between indexing and failed indefinitely; this release stops that loop and adds the regression test that would have caught it before v1.64.0 shipped.
What changed
v1.64.0 introduced the batched /api/embed endpoint for embedding requests but kept the 30-second HTTP timeout that was tuned for the older singular /api/embeddings path. A batched POST of 26+ texts on CPU-only Ollama routinely exceeds 30 seconds, so the worker timed out on every attempt, retried five times, and failed terminally. The reconciler then re-enqueued the job because the operation count and the embedding count still disagreed, so the cycle never converged.
The fix is scoped to the asynchronous worker. Request-path embedding callers (memory_recall, memory_manage, capture_insight, apigateway queryVectorFor) continue to use the 30s default because their singular calls return in 1-3 seconds on CPU Ollama and operators do not want a wedged Ollama to hold an MCP tools/call open for minutes. Only the embed-jobs worker gets the longer timeout, scoped via a new config knob.
Operator-visible changes
New configuration
apigateway:
  embed_jobs:
    workers: 1
    embed_timeout: 5m

| Field | Default | Affects |
|---|---|---|
| embed_jobs.embed_timeout | 5m | The api-gateway embed-jobs worker's batched /api/embed POSTs against Ollama. Request-path embedding calls remain governed by memory.embedding.ollama.timeout (default 30s). |
GPU-backed Ollama deployments can tighten this value to keep the worker's failure floor short; CPU-only deployments should leave it at the default.
Behavior preserved
- No wire-format changes.
- No database migrations.
- No public API changes.
- No config breakage: deployments that did not set the new key get the documented 5m default automatically.
Upgrade notes
- No operator action required beyond rolling the pod. The new default behaves correctly for both CPU and GPU Ollama deployments. Operators who already worked around the bug on v1.64.0 by setting memory.embedding.ollama.timeout: 5m can leave that override in place or remove it; the worker no longer depends on it.
- Tightening the worker timeout on GPU embedders is now possible. Operators on fast embedders who want a short failure floor on the worker (e.g. 30 seconds) can set apigateway.embed_jobs.embed_timeout: 30s without affecting other consumers.
Detailed changes
- #445 / #446. Scope the long batch timeout to the embed-jobs worker. embedding.DefaultTimeout stays at 30s. New apigateway.embed_jobs.embed_timeout config (default 5m). New Platform.workerEmbedder() helper constructs a dedicated Ollama provider with that timeout; the shared p.embeddingProv used by request-path callers is unchanged. Three new unit tests cover the scoping, including a regression-prevention assertion that the shared instance keeps its 30s default when the worker timeout is configured.

  New internal/testollama helper and pkg/platform/integration_embedjobs_realollama_test.go (build tag integration) drive the production embedding path end-to-end against a real Ollama container running nomic-embed-text. Pre-v1.64.1 behavior would fail this test on the same batch shape that caused the incident; post-fix it passes with margin. This pattern is reusable by upcoming embedding consumers (DataHub semantic search, prompt discovery, knowledge insights recall, portal asset search) so synthetic-delay stubs are no longer the only coverage available.
Workaround on v1.64.0 (no upgrade required)
Add timeout: 5m under memory.embedding.ollama in the platform configmap and roll the deployment.
Installation
Homebrew (macOS)
brew install txn2/tap/mcp-data-platform
Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform
Docker
docker pull ghcr.io/txn2/mcp-data-platform:v1.64.1
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-data-platform_1.64.1_linux_amd64.tar.gz.sigstore.json \
mcp-data-platform_1.64.1_linux_amd64.tar.gz