v1.67.0

Eight PRs spanning a new outbound-auth mode, a security fix on stateless HTTP, audit improvements, smarter outbound content negotiation, an off-by-default cross-toolkit enrichment fallback, two admin-portal redesigns, and the connections-editor cleanup that fell out of building mTLS.

Highlights

mTLS for the API gateway, plus a per-connection CA bundle for upstreams behind a private root.
Security fix: a cross-user provenance leak on stateless HTTP transport is closed.
Audit logs gain caller-class tagging (mcp / rest / admin) and automatic monthly partition rotation.
api_invoke_endpoint content-type is now driven by the connection's OpenAPI catalog when present.
Cross-toolkit enrichment gets an opt-in semantic-similarity fallback when URN equality misses.
Personas page is rebuilt around a single always-on editor with tabbed Permissions / AI behavior.
Connections page: cleaner sidebar, validated identifier input, help-modal-driven auth and TLS reference, no more confusing connection_name override slot.

Security

Cross-user provenance leak on stateless HTTP transport (#451)

extractSessionID returned the literal "stdio" sentinel whenever the SDK could not surface a session ID. On HTTP transport every stateless caller (no Mcp-Session-Id header, no initialize handshake) inherited that sentinel and pooled into one shared "stdio" bucket used by ProvenanceTracker, the session gate, the dedup cache, and the workflow tracker.

ProvenanceTracker.Record already guards against empty session IDs to prevent cross-user mixing, but the "stdio" sentinel bypassed that guard. The result: a harvest tool (save_artifact, trino_export, api_export) called by any stateless HTTP caller could return provenance the pool had accumulated regardless of who invoked it.

This release leaves SessionID empty for stateless HTTP callers, which makes the existing empty-skip guard fire and prevents any bucket from accumulating. Stdio still uses the sentinel (one logical client per process). The fix is pinned by TestProvenanceNoCrossUserMixing_StatelessHTTP, which wires two distinct principals through the assembled middleware chain and asserts that Harvest("stdio") and Harvest("") both return zero entries.
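The interaction between the sentinel and the empty-skip guard can be sketched as follows. This is a minimal model, not the platform's code: `tracker`, `newTracker`, and `sessionIDFor` are hypothetical stand-ins for ProvenanceTracker and extractSessionID, and the entry strings are made up for illustration.

```go
package main

import "fmt"

// tracker is a hypothetical stand-in for ProvenanceTracker: it buckets
// recorded entries by session ID.
type tracker struct {
	buckets map[string][]string
}

func newTracker() *tracker {
	return &tracker{buckets: make(map[string][]string)}
}

// Record skips empty session IDs so stateless callers never share a bucket.
func (t *tracker) Record(sessionID, entry string) {
	if sessionID == "" {
		return
	}
	t.buckets[sessionID] = append(t.buckets[sessionID], entry)
}

// Harvest returns and clears the entries recorded for a session.
func (t *tracker) Harvest(sessionID string) []string {
	entries := t.buckets[sessionID]
	delete(t.buckets, sessionID)
	return entries
}

// sessionIDFor models the fixed behavior (assumed shape): only the stdio
// transport falls back to the "stdio" sentinel; stateless HTTP gets "".
func sessionIDFor(transport, sdkSessionID string) string {
	if sdkSessionID != "" {
		return sdkSessionID
	}
	if transport == "stdio" {
		return "stdio"
	}
	return ""
}

func main() {
	t := newTracker()
	// Two distinct stateless HTTP principals, neither with a session ID.
	t.Record(sessionIDFor("http", ""), "alice: trino_query")
	t.Record(sessionIDFor("http", ""), "bob: trino_query")
	fmt.Println(len(t.Harvest("stdio")), len(t.Harvest("")))
}
```

With the pre-fix behavior, both Record calls would have landed in the "stdio" bucket and either caller could have harvested the other's entries.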

Operator impact: for HTTP-transport deployments, pc.SessionID for stateless callers changes from "stdio" to "". Audit handles empty SessionID (column default ''). Memory and knowledge stores are scoped by persona / email and never bucket on session ID alone. The session gate, dedup cache, and workflow tracker still write to an empty-key bucket for stateless HTTP, which is a known cosmetic regression but carries no data leakage.

The PR also reduces resources/list log noise and adds user_id / persona attribution to the surviving DEBUG line.

New: mTLS auth and private-CA trust (#468, closes #466)

Adds standard RFC 5246 / 8446 client certificate authentication to the HTTP API gateway toolkit alongside the existing header-based auth modes, plus a per-connection CA bundle for verifying upstreams behind a private root.

Why this matters

mTLS is the IETF standard for HTTPS client authentication. Common upstreams that need it and that the toolkit could not reach until this release:

Service mesh peering (Istio, Linkerd, Consul Connect) where workload identity is a mesh-issued client cert.
PKI-fronted enterprise APIs that pre-date OAuth.
Healthcare integration engines (Mirth Connect, Rhapsody, InterSystems IRIS HealthShare).
Financial messaging: SWIFT REST surfaces, Open Banking / FAPI profiles, bank-direct payment APIs.
FedRAMP / DoD-boundary endpoints.
HashiCorp Vault when the configured auth method is cert/.
Kubernetes API server and etcd direct REST access.
Apache Kafka REST Proxy, Schema Registry, NiFi, and similar Apache projects when deployed with the standard security profile.
Any HTTPS service signed by a private CA (the CA-bundle half is useful on its own for bearer-over-private-CA cases).

Two orthogonal TLS concerns on every `kind: api` connection

| Field | Type | Encrypted | Purpose |
| --- | --- | --- | --- |
| `mtls_client_cert_pem` | PEM | no | Client certificate chain (leaf first). |
| `mtls_client_key_pem` | PEM | yes | Private key matching the cert. |
| `tls_ca_bundle_pem` | PEM bundle | no | Extra CAs trusted when verifying the upstream's TLS cert. Appended to the system root pool, never substitutes. |

Both groups are optional. An internal HTTPS service behind a private CA may only need the bundle; an upstream that wants client auth but has a public TLS cert needs only cert + key; an upstream that wants both sets all three. With auth_mode: mtls, the cert IS the credential; with any other mode the cert layers on top (bearer + mTLS, oauth2 + mTLS, etc.).

There is no insecure_skip_verify toggle. Self-signed endpoints require pasting their CA into tls_ca_bundle_pem.
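As a rough illustration of "appended to the system root pool, never substitutes", the sketch below builds a tls.Config from a base pool plus an optional bundle. The function names `rootPoolWithBundle` and `tlsConfigFor` are hypothetical, and the MinVersion setting is an assumption of this sketch, not something the release states.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// rootPoolWithBundle returns a copy of base with the CAs from bundlePEM
// appended. base would normally come from x509.SystemCertPool().
func rootPoolWithBundle(base *x509.CertPool, bundlePEM []byte) (*x509.CertPool, error) {
	if base == nil {
		base = x509.NewCertPool()
	}
	pool := base.Clone() // never mutate the shared system pool
	if len(bundlePEM) == 0 {
		return pool, nil
	}
	if !pool.AppendCertsFromPEM(bundlePEM) {
		return nil, errors.New("tls_ca_bundle_pem: no parseable CERTIFICATE block")
	}
	return pool, nil
}

// tlsConfigFor combines the two orthogonal concerns: trust (RootCAs) and
// optional client identity (Certificates). InsecureSkipVerify is never set.
func tlsConfigFor(base *x509.CertPool, bundlePEM []byte, clientCert *tls.Certificate) (*tls.Config, error) {
	pool, err := rootPoolWithBundle(base, bundlePEM)
	if err != nil {
		return nil, err
	}
	cfg := &tls.Config{RootCAs: pool, MinVersion: tls.VersionTLS12} // MinVersion is an assumption
	if clientCert != nil {
		cfg.Certificates = []tls.Certificate{*clientCert}
	}
	return cfg, nil
}

func main() {
	base, err := x509.SystemCertPool()
	if err != nil {
		base = x509.NewCertPool() // fall back to an empty pool for the demo
	}
	cfg, err := tlsConfigFor(base, nil, nil)
	fmt.Println(cfg != nil, err)
}
```

Because the bundle is appended to a clone of the base pool, publicly trusted upstreams keep working on a connection that also carries a private CA.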

Write-time validation

Cert and key are mutually required: set both, or set neither.
The key must match the cert (checked via tls.X509KeyPair, which verifies that the private key corresponds to the leaf certificate's public key).
Key strength: RSA at least 2048 bits, ECDSA P-256 / P-384 / P-521, or Ed25519. Weaker keys are rejected.
The CA bundle must contain at least one parseable CERTIFICATE block when set.
auth_mode: mtls requires both cert and key.
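The rules above could be checked along these lines. This is a sketch under assumptions: `validateClientMaterial` and `demoPair` are hypothetical names, the error messages are invented, and the CA-bundle rule is left out for brevity. Only tls.X509KeyPair is named by the release itself.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/ed25519"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/rsa"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// validateClientMaterial applies the cert/key write-time rules (hypothetical helper).
func validateClientMaterial(authMode string, certPEM, keyPEM []byte) error {
	if len(certPEM) == 0 && len(keyPEM) == 0 {
		if authMode == "mtls" {
			return errors.New("auth_mode mtls requires both cert and key")
		}
		return nil // neither set: nothing to validate
	}
	if len(certPEM) == 0 || len(keyPEM) == 0 {
		return errors.New("cert and key must be set together")
	}
	pair, err := tls.X509KeyPair(certPEM, keyPEM) // fails when the key does not match the cert
	if err != nil {
		return fmt.Errorf("invalid cert/key pair: %w", err)
	}
	switch key := pair.PrivateKey.(type) {
	case *rsa.PrivateKey:
		if key.N.BitLen() < 2048 {
			return errors.New("RSA key shorter than 2048 bits")
		}
	case *ecdsa.PrivateKey:
		switch key.Curve.Params().BitSize {
		case 256, 384, 521:
		default:
			return errors.New("unsupported ECDSA curve")
		}
	case ed25519.PrivateKey:
		// Ed25519 has a single fixed strength.
	default:
		return errors.New("unsupported key type")
	}
	return nil
}

// demoPair generates a throwaway self-signed ECDSA P-256 pair for the demo.
func demoPair() (certPEM, keyPEM []byte, err error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo-client"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	keyDER, err := x509.MarshalPKCS8PrivateKey(key)
	if err != nil {
		return nil, nil, err
	}
	certPEM = pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der})
	keyPEM = pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: keyDER})
	return certPEM, keyPEM, nil
}

func main() {
	certPEM, keyPEM, err := demoPair()
	if err != nil {
		panic(err)
	}
	fmt.Println(validateClientMaterial("mtls", certPEM, keyPEM))
	fmt.Println(validateClientMaterial("mtls", nil, nil))
	fmt.Println(validateClientMaterial("bearer", certPEM, nil))
}
```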

OAuth modes inherit the CA bundle

connoauth.Config gains a CABundlePEM field threaded through the token-exchange and refresh paths. IdPs behind a private CA work end-to-end for both oauth2_client_credentials and oauth2_authorization_code, on both the initial Connect (code exchange) and the silent refresh path. Client mTLS material is intentionally NOT presented to the IdP (separate security domain).

Cert expiry surfaced (not enforced)

GET responses on /api/v1/admin/connection-instances/api/{name} include mtls_cert_not_after (RFC3339 UTC) computed from the leaf cert. The portal renders a color-coded badge (green at 30 or more days remaining, amber under 30, red on expired). The field is server-derived: PUT bodies are stripped of it and GET strips any stale persisted value before recomputing, so removing a cert cannot leave a phantom expiry behind. The toolkit does not refuse to make calls with an expired cert (the upstream's TLS layer will).
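The badge thresholds can be expressed as a small pure function. This is a sketch of the stated semantics only: `expiryBadge`, `formatNotAfter`, and `fixedNow` are hypothetical names, the portal's real implementation lives in the UI, and parsing the leaf certificate to obtain NotAfter is omitted.

```go
package main

import (
	"fmt"
	"time"
)

// amberWindow is the 30-day threshold described in the release notes.
const amberWindow = 30 * 24 * time.Hour

// expiryBadge maps a leaf certificate's NotAfter to a badge color.
func expiryBadge(notAfter, now time.Time) string {
	switch {
	case now.After(notAfter):
		return "red" // expired
	case notAfter.Sub(now) < amberWindow:
		return "amber" // under 30 days remaining
	default:
		return "green" // 30 or more days remaining
	}
}

// formatNotAfter renders the value the way mtls_cert_not_after is described:
// RFC3339, UTC.
func formatNotAfter(notAfter time.Time) string {
	return notAfter.UTC().Format(time.RFC3339)
}

// fixedNow gives the demo a stable reference point.
func fixedNow() time.Time {
	return time.Date(2025, time.January, 1, 0, 0, 0, 0, time.UTC)
}

func main() {
	now := fixedNow()
	for _, days := range []int{45, 30, 29, -1} {
		notAfter := now.AddDate(0, 0, days)
		fmt.Println(formatNotAfter(notAfter), expiryBadge(notAfter, now))
	}
}
```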

Connections editor cleanup that landed with mTLS

connection_name operator-facing override removed. It was always 1:1 with the instance name and surfaced as a confusing second name slot. Existing DB rows that carry the legacy key are tolerated (value silently dropped). TestParseConfig_IgnoresLegacyConnectionNameOverride pins the contract change.
ENCRYPTION_KEY caveats dropped from help text. The platform's at-rest encryption is operator-controlled; admins filling out connection forms on a SaaS deployment cannot see or set that variable and don't need to be educated about it. Help text now reads simply "Encrypted at rest. Use [REDACTED] when re-saving."
Auth-mode help modal replaces the wall-of-text paragraph under the dropdown with a per-mode table: human-readable label as primary text, machine identifier as small mono subtitle, what the gateway sends, when to use it.
TLS / mTLS help modal replaces the inline blurb with the full setup guidance (cert + key, CA bundle, validation rules, expiry-badge semantics). The three PEM textareas and the expiry badge stay in the form itself.

Audit improvements (#452)

Caller-class tagging via `source`

Tools on the platform are reachable through three entry points, all of which fire the same MCP audit middleware. Until this release the audit row's source was hardcoded to mcp for all three. Now:

| `source` | Caller | Where set |
| --- | --- | --- |
| `mcp` | Real MCP transport (stdio or HTTP/SSE), used by agents | Default in `pkg/middleware/mcp.go` |
| `rest` | Gateway REST shim at `POST /api/v1/gateway/{connection}/invoke`, used by NiFi, cronjobs, integrations | `pkg/gatewayhttp/handler.go::connectInternalSession` |
| `admin` | Admin REST API at `POST /api/v1/admin/tools/call`, used by portal-driven tool runs | `pkg/admin/tools.go::connectInternalSession` |

Operators can now separate agent traffic from external automation in audit_logs without inferring it from user IDs.

Filter + UI surface

New query params on /admin/audit/events and /admin/audit/stats: source, toolkit_kind.
New dropdown values on /admin/audit/events/filters.
New Source column and dropdown on the portal Audit Log page; tooltips on hover; CSV export includes source.

Automatic monthly partition rotation

audit_logs is declared PARTITION BY RANGE (created_date) but until now only had a _default partition, so partitioning provided no benefit and retention DELETEs would get progressively slower at high call rates.

The audit cleanup goroutine now performs three-step maintenance on each tick:

1. Ensure upcoming audit_logs_YYYY_MM partitions exist (CREATE TABLE IF NOT EXISTS ... PARTITION OF audit_logs FOR VALUES FROM (...) TO (...), starting at next month to avoid partition-key conflicts on brownfield deployments).
2. Cleanup DELETE for the retention window.
3. Drop fully-expired named partitions, which reclaims storage in roughly constant time no matter how many rows high-volume callers have written.

Step failures are logged and isolated; a transient partition error never blocks the retention DELETE.
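The partition naming and the step isolation might look roughly like this. It is a sketch, not the cleanup goroutine itself: `partitionDDL`, `upcomingPartitions`, `step`, `runSteps`, and `errTransient` are hypothetical, and the date-literal format in the bounds is an assumption about how created_date is compared.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// partitionDDL builds the CREATE statement for one monthly partition.
func partitionDDL(year int, month time.Month) string {
	start := time.Date(year, month, 1, 0, 0, 0, 0, time.UTC)
	end := start.AddDate(0, 1, 0) // exclusive upper bound: first day of the next month
	return fmt.Sprintf(
		"CREATE TABLE IF NOT EXISTS audit_logs_%04d_%02d PARTITION OF audit_logs FOR VALUES FROM ('%s') TO ('%s')",
		start.Year(), int(start.Month()), start.Format("2006-01-02"), end.Format("2006-01-02"))
}

// upcomingPartitions returns DDL for n months, starting at the month after
// (year, month) so rows already stored for the current month cannot conflict.
func upcomingPartitions(year int, month time.Month, n int) []string {
	first := time.Date(year, month, 1, 0, 0, 0, 0, time.UTC).AddDate(0, 1, 0)
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		m := first.AddDate(0, i, 0)
		out = append(out, partitionDDL(m.Year(), m.Month()))
	}
	return out
}

// step is one unit of maintenance work.
type step struct {
	name string
	run  func() error
}

var errTransient = errors.New("transient partition error")

// runSteps runs every step and collects failures instead of stopping at the
// first one, so a partition error cannot block the retention DELETE.
func runSteps(steps []step) map[string]error {
	failed := make(map[string]error)
	for _, s := range steps {
		if err := s.run(); err != nil {
			failed[s.name] = err
		}
	}
	return failed
}

func main() {
	for _, ddl := range upcomingPartitions(2025, time.November, 2) {
		fmt.Println(ddl)
	}
	failed := runSteps([]step{
		{"ensure_partitions", func() error { return errTransient }},
		{"retention_delete", func() error { return nil }},
		{"drop_expired", func() error { return nil }},
	})
	fmt.Println(failed)
}
```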

Catalog-driven Content-Type negotiation (#454, fixes #453)

api_invoke_endpoint previously chose the outbound Content-Type purely from the runtime type of the body argument: objects and arrays became application/json, strings became text/plain. When a tool-call layer pre-serialized a structured argument before delivery, the body arrived at the gateway as a string and went out as text/plain; charset=UTF-8. JSON-strict upstreams responded 400 even though the catalog clearly declared requestBody.content.application/json on the operation.

The fix consults the connection's OpenAPI catalog at invoke time. Selection order:

1. An explicit Content-Type in the model's headers argument always wins.
2. When the resolved (method, path) declares application/json, a string body that parses as JSON passes through verbatim with Content-Type: application/json. Object / array / scalar bodies keep today's application/json behavior. Strings that don't parse as JSON fall back to text/plain.
3. When the resolved operation declares a single non-JSON media type (application/xml, text/csv, etc.), string bodies are sent verbatim with that media type.
4. No catalog match falls through to today's type-driven behavior unchanged.
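The selection order above can be sketched as a single function. This is an illustrative model, not the gateway's encoder: `chooseContentType` is a hypothetical name, and two cases the notes leave unspecified are treated as assumptions here (non-string bodies under a non-JSON declaration stay application/json; several declared non-JSON types fall back to type-driven behavior).

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	mediaJSON = "application/json"
	mediaText = "text/plain; charset=UTF-8"
)

// chooseContentType picks the outbound Content-Type.
//   - explicitHeader: Content-Type from the caller's headers argument ("" if none)
//   - declared: media types the catalog declares for the resolved operation
//     (nil when there is no catalog or no match)
//   - body: the body argument as delivered to the gateway
func chooseContentType(explicitHeader string, declared []string, body any) string {
	if explicitHeader != "" {
		return explicitHeader // 1. explicit header always wins
	}
	s, isString := body.(string)
	if !isString {
		return mediaJSON // objects, arrays, scalars keep the existing behavior
	}
	if len(declared) == 0 {
		return mediaText // 4. no catalog match: type-driven behavior
	}
	for _, mt := range declared {
		if mt == mediaJSON {
			if json.Valid([]byte(s)) {
				return mediaJSON // 2. pre-serialized JSON passes through as JSON
			}
			return mediaText
		}
	}
	if len(declared) == 1 {
		return declared[0] // 3. single non-JSON media type
	}
	return mediaText // assumption: ambiguous declarations fall back
}

func main() {
	jsonOp := []string{mediaJSON}
	fmt.Println(chooseContentType("", jsonOp, `{"a":1}`))
	fmt.Println(chooseContentType("", jsonOp, "hello"))
	fmt.Println(chooseContentType("", []string{"text/csv"}, "a,b"))
	fmt.Println(chooseContentType("", nil, `{"a":1}`))
}
```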

Path matching prefers literal templates over {name} placeholders for the same concrete path, so /users/me wins over /users/{id} deterministically (a 50-iteration test guards against Go's randomized map iteration). api_export runs through the same encoder, so the fix applies equally to the export path.
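One way to get a deterministic literal-over-placeholder preference is to score each matching template by its number of literal segments and iterate candidates in sorted order. The sketch below is an assumption about the approach, not the gateway's actual matcher; `literalSegments` and `resolveTemplate` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// literalSegments reports whether template matches path and, if so, how many
// of its segments are literals rather than {name} placeholders.
func literalSegments(template, path string) (int, bool) {
	t := strings.Split(strings.Trim(template, "/"), "/")
	p := strings.Split(strings.Trim(path, "/"), "/")
	if len(t) != len(p) {
		return 0, false
	}
	literals := 0
	for i, seg := range t {
		if strings.HasPrefix(seg, "{") && strings.HasSuffix(seg, "}") {
			continue // a placeholder matches any single segment
		}
		if seg != p[i] {
			return 0, false
		}
		literals++
	}
	return literals, true
}

// resolveTemplate picks the matching template with the most literal segments.
// Keys are sorted first so the result never depends on map iteration order.
func resolveTemplate(templates map[string]struct{}, path string) (string, bool) {
	keys := make([]string, 0, len(templates))
	for k := range templates {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	best, bestLiterals, found := "", -1, false
	for _, k := range keys {
		if n, ok := literalSegments(k, path); ok && n > bestLiterals {
			best, bestLiterals, found = k, n, true
		}
	}
	return best, found
}

func main() {
	catalog := map[string]struct{}{"/users/{id}": {}, "/users/me": {}}
	fmt.Println(resolveTemplate(catalog, "/users/me"))
	fmt.Println(resolveTemplate(catalog, "/users/42"))
}
```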

Backward compatibility: connections without a catalog, paths that don't resolve in the catalog, and any call that supplies an explicit Content-Type header all preserve the prior wire output character-for-character. The only behavior change is for catalog-resolved calls without a caller header and a body that is a JSON-parseable string: previously text/plain, now application/json.

Semantic-similarity enrichment fallback (#455, closes #444)

When a Trino-table URN-equality lookup misses on the semantic provider and the operator opts in via the new injection.semantic_fallback flag, the enrichment middleware now calls SearchTables with Mode=semantic and surfaces the top-K hits as suggested matches. The appended payload is wrapped under semantic_fallback, tagged with match_kind: semantic, and carries a human-readable note flagging the result as similarity-inferred rather than URN-resolved.

Off by default. Only fires on the single-table enrichment path (trino_describe_table and similar); multi-table SQL-query enrichment is unchanged. New audit column enrichment_match_kind records urn, semantic, or empty, so operators can measure the false-positive rate of similarity-based suggestions with a SQL aggregate.

injection:
  semantic_fallback: true        # default: false
  semantic_fallback_top_k: 1     # default: 1, clamped to [1, 10]
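The clamping and rank-based cutoff implied by semantic_fallback_top_k can be sketched as below. `clampTopK` and `topHits` are hypothetical names, and representing search hits as plain strings is an assumption made to keep the example short.

```go
package main

import "fmt"

const (
	minTopK = 1
	maxTopK = 10
)

// clampTopK forces the configured value into the documented [1, 10] range.
func clampTopK(k int) int {
	if k < minTopK {
		return minTopK
	}
	if k > maxTopK {
		return maxTopK
	}
	return k
}

// topHits keeps the first k entries of an already-ranked result list.
func topHits(ranked []string, k int) []string {
	k = clampTopK(k)
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	ranked := []string{"sales.orders", "sales.order_items", "crm.orders_archive"}
	fmt.Println(clampTopK(0), clampTopK(25))
	fmt.Println(topHits(ranked, 2))
}
```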

Deferred from the issue: per-result similarity scoring is upstream-blocked. DataHub's GraphQL SearchResult type does not currently surface a per-result relevance score on any version; the threshold gate (issue acceptance criteria #3 and #4) will land once that's available. Top-K rank-based cutoff is the realistic substitute today.

Persona editor redesign (#456)

The Personas admin page is rebuilt around a single always-on editor.

UX

No view / edit toggle. Selecting a persona opens the editor directly; clicking elsewhere prompts an in-app modal for unsaved changes.
Sidebar auto-collapses on /admin/personas, mirroring the asset-viewer behavior, then restores on navigation away.
Right area is tabbed: "Permissions" (explorer + summary + trace + templates) and "AI Assistant Behavior" (the four large markdown editors, which previously lived cramped in the left aside).
Explorer hover state is sticky, so the resolution-trace panel no longer flickers between rows during mouse movement.
Read-only deployments (config_mode: file) correctly disable all inputs, gate Save, surface a "Read only" badge and banner, and no-op draft mutations.
Delete and source-mode notice moved into the editor header.

Backend-parity fixes in the live-preview engine

The previous editor's permission preview disagreed with the backend in four ways. All fixed:

Connection identity keys off toolkit.Connection() (what pkg/persona/filter.go::IsConnectionAllowed actually checks), not the toolkit instance name.
Empty allow on connections permits all connections (matching pkg/persona/filter.go:107-110). The UI now reflects that.
Pattern matching ports Go's filepath.Match semantics (*, ?, [abc], [^abc], [a-z], \c escapes). The previous regex-based matcher treated ? and [ as literals.
Stale save-error when switching personas: editor remounts via key={selectedName} so per-persona state doesn't leak across selections.
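For reference, the two matching rules the preview now mirrors (empty allow permits everything, and filepath.Match glob semantics) can be demonstrated in a few lines of Go. `connectionAllowed` is a hypothetical simplification: it ignores deny lists and whatever else pkg/persona/filter.go checks.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// connectionAllowed is a simplified, hypothetical model of the allow check:
// an empty allow list permits every connection, otherwise any matching
// filepath.Match pattern permits it. Malformed patterns never match.
func connectionAllowed(allow []string, connection string) bool {
	if len(allow) == 0 {
		return true
	}
	for _, pattern := range allow {
		if ok, err := filepath.Match(pattern, connection); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(connectionAllowed(nil, "anything"))                   // empty allow
	fmt.Println(connectionAllowed([]string{"prod-*"}, "prod-trino"))  // * wildcard
	fmt.Println(connectionAllowed([]string{"db?"}, "db12"))           // ? is one character
	fmt.Println(connectionAllowed([]string{"[^a-c]lpha"}, "alpha"))   // negated class
}
```

A regex-based matcher that treats ? and [ as literals would answer the last two cases differently from the backend, which is the class of disagreement this release removes.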

Connections page UX (#457)

Cleanup of the Connections admin page plus proper handling of the connection name as the machine identifier it is.

Sidebar list

Source pill removed (database vs file): not operationally useful at a glance.
Connection name on its own row in monospace, full row width.
OAuth health badge moved to its own row when present; nothing renders when health is fine, so no empty gap for the common case.

Identifier field (create form)

The Name input previously accepted any text, even though the value is used as a path segment in /api/admin/connections/{kind}/{name}, as a literal in persona allow/deny patterns, and as a primary key. There was no validation anywhere.

Now:

Label: "Name" -> "Identifier" with helper text explaining what it's for.
Live filter: keystrokes coerced to lowercase, anything outside [a-z0-9_-] stripped on the fly.
Leading-letter enforcement: HTML pattern="^[a-z][a-z0-9_-]*$" plus a JS regex check that gates Save with an inline "Identifier must start with a lowercase letter" message.
Native form niceties: autocomplete=off, autocapitalize=off, autocorrect=off, spellcheck=false, maxLength=64.
Read-only in edit mode preserved.
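The same rules are easy to mirror outside the browser. The sketch below restates the live filter and the leading-letter check in Go; it is illustrative only, since the release describes these checks in the portal form, and `sanitizeIdentifier` and `validIdentifier` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// identifierRE is the pattern quoted in the release notes.
var identifierRE = regexp.MustCompile(`^[a-z][a-z0-9_-]*$`)

const maxIdentifierLen = 64

// sanitizeIdentifier mimics the live filter: lowercase the input, then strip
// anything outside [a-z0-9_-] and cap the length.
func sanitizeIdentifier(input string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(input) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '_' || r == '-' {
			b.WriteRune(r)
		}
	}
	out := b.String()
	if len(out) > maxIdentifierLen {
		out = out[:maxIdentifierLen]
	}
	return out
}

// validIdentifier mimics the Save gate: leading lowercase letter, allowed
// characters only, at most 64 characters.
func validIdentifier(id string) bool {
	return len(id) <= maxIdentifierLen && identifierRE.MatchString(id)
}

func main() {
	for _, raw := range []string{"My API #1", "Prod_DB-2", "1st-api"} {
		clean := sanitizeIdentifier(raw)
		fmt.Printf("%q -> %q valid=%v\n", raw, clean, validIdentifier(clean))
	}
}
```

Note that the live filter alone is not enough: "1st-api" survives sanitizing unchanged, and only the leading-letter check rejects it.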

API Catalogs page fixes (#458)

Two paper-cuts on the API Catalogs admin page.

Land on the catalog you just touched

CatalogsPanel's auto-correct effect (which resets selection to catalogs[0] when selectedID isn't found) was racing the post-mutation refetch: until React Query finished refreshing the catalog list, the newly-saved catalog didn't appear in the cache and the effect snapped selection back to the first row. Fix: also gate the reset on !isFetching.

No more 1:1 ALL-CAPS slug headers

The list grouped catalogs by machine slug, useful when multiple versions of the same API are present. But the group header was rendered even for groups with exactly one entry, producing an all-caps slug shout above every single-version catalog. Fix: only render the slug header when a group has 2+ entries; single-version catalogs render flat. Multi-version groups get a small vX chip on each row inline.

While there, long display names now properly ellipsize inside the 280px aside (min-w-0 flex-1 engages flex truncation on the inner span).

Upgrade notes

Stateless HTTP SessionID is now "" instead of "stdio" (#451). Affects HTTP-transport deployments only. The change closes the leak and introduces no new data-leak risk. Cosmetic: the session gate, dedup cache, and workflow tracker still write to an empty-key bucket for stateless HTTP, which is a no-op for one-shot calls.
Audit table partitions are auto-managed (#452). Brownfield deployments: the cleanup goroutine creates audit_logs_YYYY_MM partitions for future months and drops fully-expired ones on the existing retention cadence. No manual partition steps needed.
connection_name UI field removed for API and MCP gateway connections (#468). Existing DB rows that carry the legacy key are tolerated (value dropped). No re-save required.
api_invoke_endpoint may send application/json where it previously sent text/plain (#454), but only for catalog-resolved string bodies that parse as JSON. Connections without a catalog, calls with an explicit Content-Type header, and non-JSON string bodies are unchanged.

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.67.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.67.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.67.0_linux_amd64.tar.gz

Full changelog

811fd40: feat(apigateway): mTLS, private-CA trust, and connection editor UX (#468)
0210aed: fix(ui): land on edited catalog and skip 1:1 slug headers (#458)
c50683b: feat(ui): tighten connections list and identifier input (#457)
70ef519: feat(ui): unified persona editor with tabbed permissions/AI behavior (#456)
66958b9: feat(enrichment): semantic-similarity fallback for cross-toolkit injection (#444) (#455)
be3db49: fix(apigateway): drive api_invoke_endpoint Content-Type from OpenAPI catalog (#453) (#454)
6aac341: feat(audit): tag caller class via source and rotate monthly partitions (#452)
1770d77: fix(middleware): close cross-user provenance leak on stateless HTTP transport (#451)
