mcp-data-platform-v1.44.0
DataHub 1.4.x Integration
Integrates DataHub 1.4.x features into the platform's semantic enrichment layer and knowledge management tool. Bumps txn2/mcp-datahub from v1.2.0 to v1.4.0.
Dependency Update
| Dependency | Previous | New |
|---|---|---|
| txn2/mcp-datahub | v1.2.0 | v1.4.0 |
Semantic Enrichment
Three new categories of DataHub metadata now surface automatically in Trino cross-injection responses. No configuration changes are needed — the new fields appear when the connected DataHub instance is running 1.4.x and the entity has the relevant metadata. On DataHub 1.3.x, responses are unchanged (graceful degradation via `omitempty` JSON tags on zero-value fields).
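The graceful degradation described above can be sketched in Go as follows. This is a minimal illustration, not the platform's actual code: the `SemanticContext` and `StructuredProperty` types and the `description` field are hypothetical stand-ins, and only the `omitempty` behavior of `encoding/json` is the point.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StructuredProperty and SemanticContext are hypothetical types for
// illustration; the platform's real structs may differ.
type StructuredProperty struct {
	QualifiedName string `json:"qualified_name"`
	DisplayName   string `json:"display_name,omitempty"`
	Values        []any  `json:"values,omitempty"`
}

type SemanticContext struct {
	Description          string               `json:"description,omitempty"`
	StructuredProperties []StructuredProperty `json:"structured_properties,omitempty"`
	ActiveIncidents      int                  `json:"active_incidents,omitempty"`
}

// render marshals a context to JSON. Fields tagged omitempty are dropped
// when they hold their zero value (nil slice, 0, empty string).
func render(sc SemanticContext) string {
	b, err := json.Marshal(sc)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// A 1.3.x-style entity: the new fields are nil/zero and vanish from the output.
	fmt.Println(render(SemanticContext{Description: "orders table"}))

	// A 1.4.x-style entity with one structured property and one active incident.
	fmt.Println(render(SemanticContext{
		Description: "orders table",
		StructuredProperties: []StructuredProperty{
			{QualifiedName: "io.acryl.privacy.retentionTime", DisplayName: "Retention Time", Values: []any{90}},
		},
		ActiveIncidents: 1,
	}))
}
```

The first call prints only the `description` key, which is why a 1.3.x deployment sees no change in its responses.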
Structured Properties
Typed custom metadata — retention policies, data classifications, SLAs, and other organization-defined properties — now appears in the `semantic_context` of Trino tool responses.
{
  "semantic_context": {
    "structured_properties": [
      {
        "qualified_name": "io.acryl.privacy.retentionTime",
        "display_name": "Retention Time",
        "values": [90]
      }
    ]
  }
}
- Display names and string values are sanitized through the injection prevention pipeline
- Numeric and boolean values pass through as-is
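The two rules above amount to a type check per value. The sketch below assumes a hypothetical `sanitizeString` helper that merely strips control characters; the real injection prevention pipeline is not shown in these notes and is certainly more thorough.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeString is a hypothetical stand-in for the injection prevention
// pipeline. Here it only drops control characters and trims whitespace.
func sanitizeString(s string) string {
	cleaned := strings.Map(func(r rune) rune {
		if r < 0x20 || r == 0x7f {
			return -1 // a negative return value drops the rune
		}
		return r
	}, s)
	return strings.TrimSpace(cleaned)
}

// sanitizeValues sanitizes string values and passes every other type
// (numbers, booleans) through unchanged.
func sanitizeValues(values []any) []any {
	out := make([]any, len(values))
	for i, v := range values {
		if s, ok := v.(string); ok {
			out[i] = sanitizeString(s)
		} else {
			out[i] = v
		}
	}
	return out
}

func main() {
	fmt.Println(sanitizeValues([]any{90, true, " PII\n"}))
}
```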
Incident Warnings
Active incidents surface as warnings across all enrichment contexts, similar to existing deprecation warnings:
- Full context: `active_incidents` count + full `incidents` array (URN, type, title, description, state, created)
- Compact context: `active_incidents` count only (safety signal for deduped responses)
- Multi-table context: `active_incidents` count only
Incident titles and descriptions are sanitized.
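One way to picture the three context levels is a builder that always sets the count and attaches the array only for the full context. The type names, JSON keys for individual incident fields, the `int64` timestamp, and the `ACTIVE` state value below are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Incident carries the fields listed for the full context.
// The struct layout and JSON keys are assumptions.
type Incident struct {
	URN         string `json:"urn"`
	Type        string `json:"type"`
	Title       string `json:"title"`
	Description string `json:"description,omitempty"`
	State       string `json:"state"`
	Created     int64  `json:"created"`
}

// IncidentWarning is the hypothetical fragment merged into an enrichment context.
type IncidentWarning struct {
	ActiveIncidents int        `json:"active_incidents,omitempty"`
	Incidents       []Incident `json:"incidents,omitempty"`
}

type ContextLevel int

const (
	FullContext ContextLevel = iota
	CompactContext
	MultiTableContext
)

// buildIncidentWarning always reports the count of active incidents and
// includes the full array only for the full context.
func buildIncidentWarning(level ContextLevel, active []Incident) IncidentWarning {
	w := IncidentWarning{ActiveIncidents: len(active)}
	if level == FullContext {
		w.Incidents = active
	}
	return w
}

func main() {
	active := []Incident{{URN: "urn:li:incident:abc123", Type: "OPERATIONAL", Title: "Late load", State: "ACTIVE"}}
	for _, level := range []ContextLevel{FullContext, CompactContext, MultiTableContext} {
		b, _ := json.Marshal(buildIncidentWarning(level, active))
		fmt.Println(string(b))
	}
}
```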
Data Contract Status
Freshness, schema, and data quality assertion results appear as a bundled pass/fail quality signal:
{
  "semantic_context": {
    "data_contract": {
      "status": "FAILING",
      "assertion_results": [
        {"type": "FRESHNESS", "result_type": "FAILURE"}
      ]
    }
  }
}
- Full context: Always included when present
- Compact context: Only included when status is `FAILING` (surfaces problems, suppresses noise)
- Multi-table context: Always included when present
- Contract status is system-generated metadata and passes through without sanitization
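The inclusion rules can be expressed as a small filter. In this sketch the level names (`"full"`, `"compact"`, `"multi_table"`), the function name, and the `PASSING` status value are hypothetical; only `FAILING` and the JSON keys come from the example above.

```go
package main

import "fmt"

type AssertionResult struct {
	Type       string `json:"type"`
	ResultType string `json:"result_type"`
}

type DataContract struct {
	Status           string            `json:"status"`
	AssertionResults []AssertionResult `json:"assertion_results,omitempty"`
}

// contractFor returns the contract to embed at the given level, or nil to
// omit it. Compact responses keep the contract only when it is failing.
func contractFor(level string, dc *DataContract) *DataContract {
	if dc == nil {
		return nil
	}
	if level == "compact" && dc.Status != "FAILING" {
		return nil
	}
	return dc
}

func main() {
	passing := &DataContract{Status: "PASSING"} // status value assumed for illustration
	failing := &DataContract{
		Status:           "FAILING",
		AssertionResults: []AssertionResult{{Type: "FRESHNESS", ResultType: "FAILURE"}},
	}
	for _, level := range []string{"full", "compact", "multi_table"} {
		fmt.Printf("%s: passing included=%t, failing included=%t\n",
			level, contractFor(level, passing) != nil, contractFor(level, failing) != nil)
	}
}
```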
Knowledge Management (`apply_knowledge`)
Four new change types extend the `apply_knowledge` tool for managing structured properties and incidents through the knowledge workflow. All require DataHub 1.4.x.
Structured Property Management
| Change Type | Target | Detail |
|---|---|---|
| `set_structured_property` | Property qualified name or URN (e.g., `io.acryl.privacy.retentionTime`) | Value or JSON array of values (e.g., `90`, `"PII"`, `[90, "PII"]`) |
| `remove_structured_property` | Property qualified name or URN | Removal reason |
- Qualified names are automatically normalized to `urn:li:structuredProperty:` URN format
- Values are parsed with type preservation: integers stay `int64`, floats stay `float64`, strings stay strings — both for scalar and JSON array inputs
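A minimal sketch of both steps, assuming hypothetical helper names (`normalizePropertyURN`, `parseValues`) and assuming that input which is not valid JSON is treated as a plain string. It uses `json.Decoder.UseNumber` so that integers are not silently widened to `float64`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

const structuredPropertyPrefix = "urn:li:structuredProperty:"

// normalizePropertyURN accepts either a qualified name or a full URN and
// always returns the URN form.
func normalizePropertyURN(target string) string {
	if strings.HasPrefix(target, structuredPropertyPrefix) {
		return target
	}
	return structuredPropertyPrefix + target
}

// convert maps a json.Number to int64 when it parses as an integer and to
// float64 otherwise; every other decoded value is returned unchanged.
func convert(v any) any {
	n, ok := v.(json.Number)
	if !ok {
		return v
	}
	if i, err := n.Int64(); err == nil {
		return i
	}
	if f, err := n.Float64(); err == nil {
		return f
	}
	return n.String()
}

// parseValues turns the detail field into a slice of typed values.
// Input that is not a single valid JSON document is kept as one string.
func parseValues(detail string) []any {
	dec := json.NewDecoder(strings.NewReader(detail))
	dec.UseNumber()
	var raw any
	if err := dec.Decode(&raw); err != nil || dec.More() {
		return []any{detail}
	}
	if arr, ok := raw.([]any); ok {
		out := make([]any, len(arr))
		for i, v := range arr {
			out[i] = convert(v)
		}
		return out
	}
	return []any{convert(raw)}
}

func main() {
	fmt.Println(normalizePropertyURN("io.acryl.privacy.retentionTime"))
	for _, detail := range []string{`90`, `1.5`, `"PII"`, `[90, "PII"]`} {
		for _, v := range parseValues(detail) {
			fmt.Printf("%s -> %T %v\n", detail, v, v)
		}
	}
}
```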
Incident Management
| Change Type | Target | Detail |
|---|---|---|
| `raise_incident` | Incident title | Optional description |
| `resolve_incident` | Incident URN (e.g., `urn:li:incident:abc123`) | Resolution message |
- `raise_incident` creates `OPERATIONAL` type incidents by default
- `resolve_incident` operates on the incident URN directly (the `entity_urn` field identifies which entity the changeset applies to, but the resolution targets the incident itself)
Interface Changes
DataHubWriter Interface
Four new methods are added to the `DataHubWriter` interface in `pkg/toolkits/knowledge/datahub_writer.go`:
UpsertStructuredProperties(ctx context.Context, urn, propertyURN string, values []any) error
RemoveStructuredProperty(ctx context.Context, urn, propertyURN string) error
RaiseIncident(ctx context.Context, entityURN, title, description string) (string, error)
ResolveIncident(ctx context.Context, incidentURN, message string) error
Noop implementations are provided for deployments without DataHub write-back configured.
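A no-op implementation of the four methods might look like the sketch below. The interface here lists only the new methods (the real `DataHubWriter` has more), and the type name `noopWriter`, the `run` helper, and the choice to return `nil` errors with an empty incident URN are assumptions rather than the project's confirmed behavior.

```go
package main

import (
	"context"
	"fmt"
)

// DataHubWriter lists only the four methods added in this release.
type DataHubWriter interface {
	UpsertStructuredProperties(ctx context.Context, urn, propertyURN string, values []any) error
	RemoveStructuredProperty(ctx context.Context, urn, propertyURN string) error
	RaiseIncident(ctx context.Context, entityURN, title, description string) (string, error)
	ResolveIncident(ctx context.Context, incidentURN, message string) error
}

// noopWriter is a hypothetical writer that accepts every call and does nothing.
type noopWriter struct{}

func (noopWriter) UpsertStructuredProperties(context.Context, string, string, []any) error {
	return nil
}

func (noopWriter) RemoveStructuredProperty(context.Context, string, string) error {
	return nil
}

func (noopWriter) RaiseIncident(context.Context, string, string, string) (string, error) {
	return "", nil // no incident is created, so there is no URN to return
}

func (noopWriter) ResolveIncident(context.Context, string, string) error {
	return nil
}

// Compile-time check that noopWriter satisfies the interface.
var _ DataHubWriter = noopWriter{}

// run raises an incident through any writer and returns the resulting URN.
func run(w DataHubWriter) (string, error) {
	return w.RaiseIncident(context.Background(), "urn:li:dataset:example", "Late load", "")
}

func main() {
	urn, err := run(noopWriter{})
	fmt.Printf("urn=%q err=%v\n", urn, err)
}
```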
Internal Improvements
- Complexity management: `dispatchChange()` is split into `dispatchChange()`, `dispatchCuratedQuery()`, and `dispatchV14Change()` to stay within the cyclomatic complexity limit (≤10)
- Error handling: Unknown change types now return an error instead of silently succeeding
- Sanitization: Structured property display names/values and incident titles/descriptions go through the injection prevention pipeline with detection logging
- Extracted constant: `errFmtExecuting` constant satisfies `goconst` lint rule
Backward Compatibility
This release is fully backward compatible. No configuration changes are required.
- DataHub 1.3.x: The upstream entity response omits the new fields (nil/zero), and `omitempty` JSON tags ensure they are absent from enrichment output. Existing behavior is unchanged.
- DataHub 1.4.x: New fields appear automatically when the entity has structured properties, active incidents, or data contracts.
- `apply_knowledge`: The four new change types are additive. Existing change types (`update_description`, `add_tag`, etc.) are unaffected.
Quality
- 90.9% total test coverage
- All new functions at 80–100% coverage
- 0 lint issues
- Backward compatibility verified by `TestGetTableContext_V13Compat`
Closes
- #181 — Structured properties in semantic enrichment
- #182 — Incident status in semantic enrichment
- #183 — Data contract status in semantic enrichment
- #184 — Structured properties in apply_knowledge
- #185 — Incidents in apply_knowledge
- #186 — DataHub 1.4.x Upgrade (tracking issue)
Installation
Homebrew (macOS)
brew install txn2/tap/mcp-data-platform
Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform
Docker
docker pull ghcr.io/txn2/mcp-data-platform:v1.44.0
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-data-platform_1.44.0_linux_amd64.tar.gz.sigstore.json \
mcp-data-platform_1.44.0_linux_amd64.tar.gz