Release mcp-data-platform-v1.75.0 · txn2/mcp-data-platform

Indexing dashboard: communicate state, not raw job rows

The admin Indexing dashboard now answers "is indexing healthy?" at a glance, and every red thing on it is either actionable or self-resolving. Previously it rendered three structurally different metrics (vector coverage, per-unit job state, recent job activity) side by side with no labels reconciling them, so a healthy, fully-indexed instance could read as broken: 34 / 34 indexed - 100% next to 0 succeeded and last activity never. Failures from a past deploy lingered in triage forever, and clicking Retry did nothing visible.

Highlights

Summary-first health verdict per kind. Each kind leads with a single plain-language verdict computed server-side: Healthy, Indexing, Degraded, or Idle (complete). The verdict is derived from the same counts and coverage the cards show, so the lead word and the detail metrics cannot disagree.
Fully indexed and idle no longer reads as broken. A kind at full coverage with no recent jobs (vectors seeded before the queue existed) now reads "fully indexed, idle" instead of "last activity never".
The two metric families are labelled. Vector coverage (how much is indexed) is kept visually and textually distinct from per-unit job state, so "1 succeeded" reads as "1 unit, last run succeeded", not a one-job history.
Self-resolving failure triage. A failure clears automatically once a later job for the same unit succeeds. Retry re-indexes the unit and the card clears on the next success; an explicit Dismiss is the fallback for a failure that no future success will supersede (for example a removed consumer's leftover rows).
Triage cards carry time and a drill-in. First-seen and last-seen timestamps, last-succeeded context, occurrence and attempt counts, and an expandable drill-in to the un-redacted error and the underlying job id, so a one-hour-old tombstone is distinguishable from an active incident.
Routine heartbeats are collapsed. Timer-driven reconciler successes for a unit (which each replica re-runs on its own schedule) fold into a single "synced xN" row instead of an unbounded firehose of duplicate rows.

New and changed API

GET /api/v1/admin/index-jobs - the per-kind summary gains verdict and unresolved_failures. succeeded/failed remain per-unit latest-status counts (units by last run), and unresolved_failures counts distinct units with an open failed job (the verdict's degraded trigger).
GET /api/v1/admin/index-jobs/failures - new. Returns the units with open (unresolved) failures, one per unit, most-recently-failed first, with latest_job_id, un-redacted last_error, attempts, occurrences, first_failed_at/last_failed_at, and optional last_succeeded_at.
POST /api/v1/admin/index-jobs/dismiss - new. Resolves every open failure for one unit. Idempotent.

Database migration

This release adds migration 000055_index_jobs_resolved_at, which adds a nullable resolved_at column to index_jobs plus a partial index on unresolved failures. It is additive and backward-compatible: every existing failed row is treated as unresolved until a later success supersedes it or an operator dismisses it. The migration runs automatically on startup through the platform's migration runner; no manual step is required.

Upgrade notes

No configuration changes are required.
On first start after upgrade the new migration applies automatically. A failed row now becomes "resolved" (and leaves the triage surface) when a later job for the same (source_kind, source_id) succeeds, so historical tombstones clear as their units are re-indexed.

Out of scope (planned follow-up)

The no consumer registered for source_kind failures (an old replica claiming a job for a kind it has not registered during a rolling deploy) are best fixed by the worker requeuing or skipping the job rather than terminating it. This release covers only the dashboard's presentation of those tombstones (they auto-resolve once the kind is re-registered and a job succeeds, and Dismiss clears any that never will). The worker-side fix is tracked as a separate follow-up against pkg/indexjobs/worker.go.

Changelog

feat(indexing): dashboard communicates state, not raw job rows (#509) (#510) (@cjimti)

Full changelog: v1.74.0...v1.75.0

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.75.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.75.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.75.0_linux_amd64.tar.gz

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

mcp-data-platform-v1.75.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Indexing dashboard: communicate state, not raw job rows

Highlights

New and changed API

Database migration

Upgrade notes

Out of scope (planned follow-up)

Changelog

Installation

Homebrew (macOS)

Claude Code CLI

Docker

Verification

Contributors

Uh oh!