mcp-datahub-v0.7.4
Bug Fixes
Fix: Non-ASCII characters in DataHub write operations (#61)
Fixed ingestProposal validation failures when writing metadata containing non-ASCII Unicode characters such as em dash (—), en dash (–), bullets (•), or CJK characters.
Root cause: DataHub's GenericAspect.value is typed as bytes in the RestLi PDL schema, which uses Avro-style encoding permitting only characters U+0000–U+00FF. Go's json.Marshal produces raw UTF-8 multi-byte sequences for characters above this range, causing RestLi to reject the request with HTTP 400 before the handler executes.
Fix: Added escapeNonASCII() to convert non-ASCII runes to \uXXXX JSON escape sequences in the inner aspect JSON before it is embedded in the GenericAspect.value string. A fast path returns pure-ASCII content unchanged without allocating, so the common case costs only a single scan of the input. Supplementary characters (U+10000 and above) are encoded as UTF-16 surrogate pairs.
Affected operations: UpdateDescription, UpdateColumnDescription, AddTag, AddGlossaryTerm, AddLink — any write where the aspect content contains characters above U+00FF.
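The escaping step described above can be sketched as follows. This is an illustrative reconstruction based only on the description in these notes, not the code shipped in the release; the function body, buffer sizing, and lowercase hex digits are assumptions. Escaping every non-ASCII rune is safe here because, in valid JSON, non-ASCII characters can only occur inside string values.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
	"unicode/utf8"
)

// escapeNonASCII is a hypothetical sketch of the helper named in these notes.
// It rewrites every rune above U+007F as a \uXXXX JSON escape sequence.
func escapeNonASCII(s string) string {
	// Fast path: scan the bytes once and return the input unchanged
	// (no allocation) when all of them are ASCII.
	ascii := true
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			ascii = false
			break
		}
	}
	if ascii {
		return s
	}

	var b strings.Builder
	b.Grow(len(s) + 16) // assumed headroom; escapes are longer than the raw bytes
	for _, r := range s {
		switch {
		case r < utf8.RuneSelf:
			// ASCII passes through untouched.
			b.WriteByte(byte(r))
		case r > 0xFFFF:
			// Supplementary plane: emit a UTF-16 surrogate pair.
			hi, lo := utf16.EncodeRune(r)
			fmt.Fprintf(&b, `\u%04x\u%04x`, hi, lo)
		default:
			// Basic Multilingual Plane: a single \uXXXX escape.
			fmt.Fprintf(&b, `\u%04x`, r)
		}
	}
	return b.String()
}
```

Under these assumptions, escapeNonASCII("café") returns the ASCII-only string `caf\u00e9`, which a JSON parser decodes back to the original text.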
Installation
Claude Desktop (macOS/Windows)
Download the .mcpb bundle for your platform and double-click to install:
- macOS Apple Silicon (M1/M2/M3/M4): mcp-datahub_0.7.4_darwin_arm64.mcpb
- macOS Intel: mcp-datahub_0.7.4_darwin_amd64.mcpb
- Windows: mcp-datahub_0.7.4_windows_amd64.mcpb
Homebrew (macOS)
brew install txn2/tap/mcp-datahub
Claude Code CLI
claude mcp add datahub \
-e DATAHUB_URL=https://your-datahub.example.com/api/graphql \
-e DATAHUB_TOKEN=your-token \
-- mcp-datahub
Docker
docker pull ghcr.io/txn2/mcp-datahub:v0.7.4
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-datahub_0.7.4_linux_amd64.tar.gz.sigstore.json \
mcp-datahub_0.7.4_linux_amd64.tar.gz