mcp-datahub-v0.7.2
What's New
Fix: Null Aspect Crash in REST Client (P0)
getAspect() now correctly handles DataHub returning HTTP 200 with a null or empty aspect value. Previously, this caused json.Unmarshal panics when the entity existed but the requested aspect had never been explicitly written.
Root cause: DataHub's REST API returns {"value": null} (not 404) when an entity exists but a specific editable aspect (e.g., editableDatasetProperties, editableSchemaMetadata, globalTags) has never been written. The client was passing this null value directly to json.Unmarshal, which crashed.
Fix: A new isNullOrEmptyJSON() check in getAspect() detects nil, empty, whitespace-only, and null literal responses and returns ErrNotFound. All existing read-modify-write operations (UpdateDescription, AddTag, RemoveTag, AddGlossaryTerm, AddLink) benefit from this fix since they already handle ErrNotFound by initializing a default struct.
Affected scenarios:
- Entities created via ingestion that have never had editable properties set
- Entities where tags, glossary terms, or documentation links have never been explicitly written
- Any apply_knowledge or write operation targeting a "fresh" entity
New: Column-Level Description Writes
Added Client.UpdateColumnDescription(ctx, urn, fieldPath, description) for setting editable descriptions on individual columns/fields via the editableSchemaMetadata REST API aspect.
How it works:
- Reads the current editableSchemaMetadata via getAspect() (initializes empty struct if not found)
- Finds the matching fieldPath entry or appends a new one
- Updates the description while preserving existing globalTags and glossaryTerms as raw JSON
- Writes back via postIngestProposal()
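The find-or-append step in the list above could be sketched roughly as follows. The struct shapes, the JSON key names, and the upsertFieldDescription helper are hypothetical simplifications, and the read (getAspect) and write (postIngestProposal) steps are left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// editableFieldInfo is a simplified stand-in for the client's struct.
type editableFieldInfo struct {
	FieldPath     string          `json:"fieldPath"`
	Description   string          `json:"description,omitempty"`
	GlobalTags    json.RawMessage `json:"globalTags,omitempty"`
	GlossaryTerms json.RawMessage `json:"glossaryTerms,omitempty"`
}

// editableSchemaMetadata holds the per-field entries (JSON key name is assumed).
type editableSchemaMetadata struct {
	Fields []editableFieldInfo `json:"editableSchemaFieldInfo"`
}

// upsertFieldDescription is a hypothetical helper: it updates the entry whose
// FieldPath matches, or appends a new entry when none exists.
func upsertFieldDescription(meta *editableSchemaMetadata, fieldPath, description string) {
	for i := range meta.Fields {
		if meta.Fields[i].FieldPath == fieldPath {
			meta.Fields[i].Description = description
			return
		}
	}
	meta.Fields = append(meta.Fields, editableFieldInfo{
		FieldPath:   fieldPath,
		Description: description,
	})
}

func main() {
	// Start from an empty struct, as when the aspect has never been written.
	var meta editableSchemaMetadata
	upsertFieldDescription(&meta, "email", "Customer email address")
	upsertFieldDescription(&meta, "email", "Customer email address used for account verification")

	out, err := json.Marshal(meta)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Calling the helper twice with the same field path leaves a single entry carrying the latest description.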
Design choice: The editableFieldInfo struct uses json.RawMessage for GlobalTags and GlossaryTerms to avoid deserializing/reserializing metadata the caller didn't intend to modify. This prevents accidental data loss during read-modify-write cycles.
// Update a column description
err := client.UpdateColumnDescription(ctx,
	"urn:li:dataset:(urn:li:dataPlatform:trino,catalog.schema.table,PROD)",
	"email",
	"Customer email address used for account verification",
)
Housekeeping
- Added #nosec G704 annotations on httpClient.Do() calls where URLs are constructed from configured server endpoints, suppressing false-positive SSRF warnings from gosec
Changelog
Bug Fixes
063b27d fix: handle null getAspect responses, add UpdateColumnDescription (#58)
CI
000985d ci: bump github/codeql-action from 4.32.2 to 4.32.3 (#56)
Upgrading
This is a backwards-compatible patch release. No configuration changes required.
Go module:
go get github.com/txn2/mcp-datahub@v0.7.2
If you use the client library directly: The new UpdateColumnDescription method is additive, and no existing API changed its signature. The one behavioral change is that getAspect() now returns ErrNotFound for null values where it previously crashed. Any custom code that relied on receiving a nil json.RawMessage from getAspect() should handle ErrNotFound instead.
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_0.7.2_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_0.7.2_darwin_amd64.mcpb
- Windows: mcp-datahub_0.7.2_windows_amd64.mcpb
Homebrew (macOS)
brew install txn2/tap/mcp-datahub
Claude Code CLI
claude mcp add datahub \
  -e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
  -e DATAHUB_TOKEN=your-token \
  -- mcp-datahub
Docker
docker pull ghcr.io/txn2/mcp-datahub:v0.7.2
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-datahub_0.7.2_linux_amd64.tar.gz.sigstore.json \
mcp-datahub_0.7.2_linux_amd64.tar.gz