Skip to content

mcp-data-platform-v1.61.9

Choose a tag to compare

@github-actions github-actions released this 16 May 07:09
· 112 commits to main since this release
1bb5c4f

Highlights

This release closes a class of related issues against the api toolkit and OAuth-backed connections. Three PRs ship together because each one was reachable from the same operator workflow: catalog a multi-spec vendor API, attach a connection, and use it under sustained load.

OAuth refresh-token rotation no longer takes connections out of service (#414)

Two concurrent refresh attempts for the same connection-OAuth row each loaded the persisted refresh_token, each POSTed it to the IdP, and against any IdP that enforces one-time-use rotation (RFC 6749 §6, the Microsoft Entra rotation-enforced configuration, Keycloak with rotation enabled, Salesforce) the loser received invalid_grant for a token the winner had just consumed. The loser was misclassified as a revoked credential and the persisted row was deleted, taking the connection out of service after the first concurrent refresh window. Symptom in production: a connection works for several invocations then sticks at "refresh failed" until an operator re-authorizes.

The Store interface now exposes a Lock(ctx, key) method that returns an exclusive lock release function. PostgresStore implements it via pg_advisory_lock acquired on a dedicated *sql.Conn, keyed by a 64-bit FNV-1a hash of "connoauth:" + kind + "/" + name. Postgres advisory locks are session-scoped, so a crashed lock holder is auto-released by the database on connection close. The lock is held across processes: two replicas calling Lock for the same key serialize at the database level. MemoryStore implements it via a sync.Map of per-key *sync.Mutex. Both backends return an idempotent release function safe to defer.

Source.Token and Source.Reacquire acquire the lock before the refresh path and re-read the persisted row after acquisition. A caller that finds the access token still valid after the lock returns it without an IdP round-trip, so a fleet of concurrent callers across replicas produces exactly one refresh per rotation interval per key.

A companion fix preserves the IdP's RFC 6749 error and error_description fields in the sanitized token-fetch error. The previous output ("status=400") forced every diagnostic round trip through operator log access. The new output ("status=400 error=invalid_grant error_description=Token is not active") disambiguates the common refresh failure modes from a single client-visible string. The description is bounded to 200 characters and scrubbed of CR/LF and URL-shaped substrings.

api_list_endpoints diagnostics and resilience (#415)

The semantic and hybrid ranking modes shipped with three independent weaknesses that compounded into Error occurred during tool execution on multi-spec catalogs:

  • rankWithMode, queryVectorFor, and ensureEmbeddings all dropped the embedder's error on the floor. A misconfigured Ollama URL, an upstream timeout, or a model with a context-window cap all surfaced as the same generic fallback Note.
  • EmbedBatch was called once with every operation's text. A 350-operation catalog produced one large POST that risked timing out or hitting an upstream context-window cap.
  • A panic anywhere in the ranking pipeline propagated up to the MCP SDK, which presented it as a generic tool-execution error with no traceability.
  • indexOf keyed only on (Method, Path). Multi-spec catalogs where the same tuple legitimately exists in two specs received the first-seen spec's embedding for every match, producing silently wrong rankings.

Fixes:

  • rankWithMode now returns a fallbackReason string. handleListEndpoints interpolates the reason into the Note: "ranking ""semantic"" fell back to lexical: embedding provider not configured on this connection", "... embed query: <provider error>", "... query embedding is the zero vector (misconfigured embedding model)".
  • ensureEmbeddings and queryVectorFor return errors. Errors are also logged with the connection name, mode, and the underlying error so the cause is traceable from the pod log.
  • embedInBatches chunks at embedBatchSize = 32 and short-circuits the loop on the first batch error.
  • handleListEndpoints defers a recover that returns "internal error in list_endpoints (connection=<name>): <panic message>" and marks the connection's embed cache as failed so a misbehaving provider cannot be re-hit on every subsequent call. The follow-up call falls back cleanly to lexical with a Note.
  • indexOf now keys on (Method, Path, Spec).

api_list_endpoints discovery filters (#415)

The lexical filter was a single substring match against the whole query string. For a 350-operation catalog across 9 specs:

  • "gift list" matched zero operations because no single field contained the phrase, even though plenty of operations contained both words.
  • The spec name appeared in the output but was not a search target.
  • There was no way to filter to one section. The model had to scan all 350 operations.

Fixes:

  • rankOperations tokenizes the query on whitespace and requires every token to substring-match at least one searchable field.
  • operationFieldsContain adds the spec name to the searched fields.
  • ListEndpointsInput.Spec (new) restricts results to one component spec, applied before ranking so the rank limit acts within the chosen spec.
  • lexicalScore extracts the per-token AND check so the hybrid scorer's lexical signal uses the same multi-token semantics.

Spec base paths reach the operation surface (#415)

Operation paths in api_list_endpoints were taken verbatim from each OpenAPI document's paths: map. Vendor specs that declare servers[] with a versioned base (https://host/foo/v1) need that segment in front of every operation path or the invoke-time URL join produces a 404.

  • specBasePath extracts the path component of servers[0].url and normalizes relative URLs to start with /.
  • computeEffectiveBasePath drops the prefix when the connection's base_url already contains it as a suffix, so the invoke-time URL join never doubles the segment.
  • specState.effectiveBasePath stores the resolved value per spec. buildOperationIndex (at registration time) and collectOperationMatches (at runtime) use the same value. api_list_endpoints, api_get_endpoint_schema, and the synthesized operationID for endpoints without an explicit one all now agree on a single full path.

Non-null operations (#415)

A zero-match query produced "operations": null while the catalog-less branch produced "operations": []. Clients had to handle two empty shapes. rankOperations now initializes the result slice so zero-match returns "operations": [] consistently.

OAuth status card no longer renders an empty error banner (#413)

Clicking "Refresh now" against a connection whose IdP refresh failed produced a red banner with no text. The catch block rendered err.message directly; some failure paths surfaced an ApiError whose detail was the empty string (HTTP/2 fetches expose an empty statusText, and any server path that wrote a problem+json body with an empty detail field reached the client as ApiError(status, "")). The banner now uses formatActionError which always returns a non-empty operator-readable string: detail if non-empty, else "<label> (HTTP <status>)", else the caller's label. Render guards on actionMsg.text and status.last_error refuse to render the destructive banner when the text would be empty.

Behavior changes worth noting

  • Store.Lock is a new method on the connoauth.Store interface. Both built-in implementations (MemoryStore, PostgresStore) provide it; no external implementers exist.
  • OperationSummary.Path from api_list_endpoints and EndpointSchemaOutput.Path from api_get_endpoint_schema now include the spec's base path when servers[].url declares one. For connections whose base_url already contains the segment, the dedupe keeps the behavior unchanged. The model passes whatever path the catalog tool returned to api_invoke_endpoint directly.
  • api_list_endpoints accepts a new optional spec filter input.
  • ranking=hybrid's lexical signal is now per-token AND rather than phrase-match. Multi-token intent queries are no longer systematically underweighted under hybrid mode.

Changelog

  • 1bb5c4f: fix(apigateway): ranking diagnostics, discovery filters, spec base paths, non-null operations (#415)
  • 902ac2b: fix(connoauth): distributed refresh lock; preserve IdP error code and description (#414)
  • 6ec0b1c: fix(ui): OAuth status card no longer renders an empty error banner (#413)

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.61.9

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.61.9_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.61.9_linux_amd64.tar.gz