mcp-datahub-v0.7.2

github-actions released this 19 Feb 23:13
063b27d

What's New

Fix: Null Aspect Crash in REST Client (P0)

getAspect() now correctly handles DataHub returning HTTP 200 with a null or empty aspect value. Previously, this caused json.Unmarshal panics when the entity existed but the requested aspect had never been explicitly written.

Root cause: DataHub's REST API returns {"value": null} (not 404) when an entity exists but a specific editable aspect (e.g., editableDatasetProperties, editableSchemaMetadata, globalTags) has never been written. The client was passing this null value directly to json.Unmarshal, which crashed.

Fix: A new isNullOrEmptyJSON() check in getAspect() detects nil, empty, whitespace-only, and null literal responses and returns ErrNotFound. All existing read-modify-write operations (UpdateDescription, AddTag, RemoveTag, AddGlossaryTerm, AddLink) benefit from this fix since they already handle ErrNotFound by initializing a default struct.

Affected scenarios:

  • Entities created via ingestion that have never had editable properties set
  • Entities where tags, glossary terms, or documentation links have never been explicitly written
  • Any apply_knowledge or write operation targeting a "fresh" entity

New: Column-Level Description Writes

Added Client.UpdateColumnDescription(ctx, urn, fieldPath, description) for setting editable descriptions on individual columns/fields via the editableSchemaMetadata REST API aspect.

How it works:

  1. Reads the current editableSchemaMetadata via getAspect() (initializes empty struct if not found)
  2. Finds the matching fieldPath entry or appends a new one
  3. Updates the description while preserving existing globalTags and glossaryTerms as raw JSON
  4. Writes back via postIngestProposal()

Design choice: The editableFieldInfo struct uses json.RawMessage for GlobalTags and GlossaryTerms to avoid deserializing/reserializing metadata the caller didn't intend to modify. This prevents accidental data loss during read-modify-write cycles.
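
To illustrate why the raw-JSON approach matters, the sketch below round-trips the same payload through a partially typed struct and through a json.RawMessage field. The type names, the sample payload, and the "context" attribute are illustrative assumptions; the point is only that a typed struct silently drops any attribute it does not declare, while raw bytes survive unchanged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sample is an illustrative per-column entry with an attribute ("context")
// that the typed struct below does not know about.
const sample = `{"fieldPath":"email","globalTags":{"tags":[{"tag":"urn:li:tag:PII","context":"set-by-steward"}]}}`

// typedTags models only part of the tag payload (hypothetical).
type typedTags struct {
	Tags []struct {
		Tag string `json:"tag"`
	} `json:"tags"`
}

// typedField decodes tags into Go structs, losing undeclared attributes.
type typedField struct {
	FieldPath  string    `json:"fieldPath"`
	GlobalTags typedTags `json:"globalTags"`
}

// rawField keeps tags as opaque bytes.
type rawField struct {
	FieldPath  string          `json:"fieldPath"`
	GlobalTags json.RawMessage `json:"globalTags,omitempty"`
}

// roundTrip decodes input into v (a pointer) and encodes it again.
func roundTrip(input string, v any) (string, error) {
	if err := json.Unmarshal([]byte(input), v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	typedOut, err := roundTrip(sample, &typedField{})
	if err != nil {
		panic(err)
	}
	rawOut, err := roundTrip(sample, &rawField{})
	if err != nil {
		panic(err)
	}
	fmt.Println("typed:", typedOut) // the "context" attribute is gone
	fmt.Println("raw:  ", rawOut)   // the payload is unchanged
}
```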

// Update a column description
err := client.UpdateColumnDescription(ctx,
    "urn:li:dataset:(urn:li:dataPlatform:trino,catalog.schema.table,PROD)",
    "email",
    "Customer email address used for account verification",
)

Housekeeping

  • Added #nosec G704 annotations on httpClient.Do() calls where URLs are constructed from configured server endpoints, suppressing false-positive SSRF warnings from gosec

Changelog

Bug Fixes

  • 063b27d fix: handle null getAspect responses, add UpdateColumnDescription (#58)

CI

  • 000985d ci: bump github/codeql-action from 4.32.2 to 4.32.3 (#56)

Upgrading

This is a backwards-compatible patch release. No configuration changes required.

Go module:

go get github.com/txn2/mcp-datahub@v0.7.2

If you use the client library directly: The new UpdateColumnDescription method is additive. No existing API changed its signature or behavior, with one exception: getAspect() now returns ErrNotFound for null values where it previously crashed. Callers that relied on receiving a nil json.RawMessage from getAspect() should handle ErrNotFound instead.

Installation

Claude Desktop (macOS/Windows)

Download the .mcpb bundle for your platform and double-click to install:

  • macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_0.7.2_darwin_arm64.mcpb
  • macOS Intel: mcp-datahub_0.7.2_darwin_amd64.mcpb
  • Windows: mcp-datahub_0.7.2_windows_amd64.mcpb

Homebrew (macOS)

brew install txn2/tap/mcp-datahub

Claude Code CLI

claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub

Docker

docker pull ghcr.io/txn2/mcp-datahub:v0.7.2

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-datahub_0.7.2_linux_amd64.tar.gz.sigstore.json \
  mcp-datahub_0.7.2_linux_amd64.tar.gz