mcp-data-platform-v1.44.0

@github-actions github-actions released this 15 Mar 22:41
dcec056

DataHub 1.4.x Integration

Integrates DataHub 1.4.x features into the platform's semantic enrichment layer and knowledge management tool. Bumps txn2/mcp-datahub from v1.2.0 to v1.4.0.

Dependency Update

| Dependency | Previous | New |
|---|---|---|
| txn2/mcp-datahub | v1.2.0 | v1.4.0 |

Semantic Enrichment

Three new categories of DataHub metadata now surface automatically in Trino cross-injection responses. No configuration changes are needed — the new fields appear when the connected DataHub instance is running 1.4.x and the entity has the relevant metadata. On DataHub 1.3.x, responses are unchanged (graceful degradation via `omitempty` JSON tags on zero-value fields).
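
The degradation described above relies on standard `encoding/json` behavior: a field tagged `omitempty` is left out of the output when it holds its zero value. The sketch below illustrates that mechanism only. The `SemanticContext` and `StructuredProperty` types and the `render` helper are hypothetical, trimmed-down stand-ins, not the platform's actual types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StructuredProperty mirrors the JSON shape shown in these notes.
// The Go type itself is an assumption made for illustration.
type StructuredProperty struct {
	QualifiedName string `json:"qualified_name"`
	DisplayName   string `json:"display_name,omitempty"`
	Values        []any  `json:"values"`
}

// SemanticContext is a hypothetical, reduced enrichment payload.
type SemanticContext struct {
	Description          string               `json:"description,omitempty"`
	StructuredProperties []StructuredProperty `json:"structured_properties,omitempty"`
	ActiveIncidents      int                  `json:"active_incidents,omitempty"`
}

// render marshals a context to JSON and returns "" on failure.
func render(sc SemanticContext) string {
	b, err := json.Marshal(sc)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// Simulates a DataHub 1.3.x response: the new fields stay at their
	// zero values, so omitempty drops them from the output.
	v13 := SemanticContext{Description: "orders table"}
	fmt.Println(render(v13))

	// Simulates a DataHub 1.4.x response with metadata present.
	v14 := SemanticContext{
		Description: "orders table",
		StructuredProperties: []StructuredProperty{{
			QualifiedName: "io.acryl.privacy.retentionTime",
			DisplayName:   "Retention Time",
			Values:        []any{90},
		}},
		ActiveIncidents: 1,
	}
	fmt.Println(render(v14))
}
```

In this sketch the first call prints only the `description` field, while the second also includes `structured_properties` and `active_incidents`.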

Structured Properties

Typed custom metadata — retention policies, data classifications, SLAs, and other organization-defined properties — now appears in the `semantic_context` of Trino tool responses.

{
  "semantic_context": {
    "structured_properties": [
      {
        "qualified_name": "io.acryl.privacy.retentionTime",
        "display_name": "Retention Time",
        "values": [90]
      }
    ]
  }
}
  • Display names and string values are sanitized through the injection prevention pipeline
  • Numeric and boolean values pass through as-is
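
The type-based split described in the two bullets above can be sketched as follows. These notes do not show the injection prevention pipeline itself, so `sanitizeString` here is a hypothetical placeholder that only strips control characters, and `sanitizeValues` is likewise an illustrative name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// sanitizeString is a hypothetical stand-in for the platform's injection
// prevention pipeline. It only removes control characters.
func sanitizeString(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsControl(r) {
			return -1 // a negative rune tells strings.Map to drop the character
		}
		return r
	}, s)
}

// sanitizeValues cleans string values and passes every other type
// (numbers, booleans) through unchanged.
func sanitizeValues(values []any) []any {
	out := make([]any, len(values))
	for i, v := range values {
		if s, ok := v.(string); ok {
			out[i] = sanitizeString(s)
			continue
		}
		out[i] = v
	}
	return out
}

func main() {
	fmt.Println(sanitizeValues([]any{"PII\x00", 90, true}))
}
```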

Incident Warnings

Active incidents surface as warnings across all enrichment contexts, similar to existing deprecation warnings:

  • Full context: `active_incidents` count + full `incidents` array (URN, type, title, description, state, created)
  • Compact context: `active_incidents` count only (safety signal for deduped responses)
  • Multi-table context: `active_incidents` count only

Incident titles and descriptions are sanitized.

Data Contract Status

Freshness, schema, and data quality assertion results appear as a bundled pass/fail quality signal:

{
  "semantic_context": {
    "data_contract": {
      "status": "FAILING",
      "assertion_results": [
        {"type": "FRESHNESS", "result_type": "FAILURE"}
      ]
    }
  }
}
  • Full context: Always included when present
  • Compact context: Only included when status is `FAILING` (surfaces problems, suppresses noise)
  • Multi-table context: Always included when present
  • Contract status is system-generated metadata and passes through without sanitization
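
The inclusion rules above reduce to a small predicate. The following is a minimal sketch under assumptions: `ContextLevel`, `DataContract`, and `includeContract` are hypothetical names, and `"PASSING"` is an assumed status value (only `FAILING` appears in these notes).

```go
package main

import "fmt"

// ContextLevel is a hypothetical enum for the three enrichment contexts.
type ContextLevel int

const (
	Full ContextLevel = iota
	Compact
	MultiTable
)

// DataContract is a reduced, illustrative form of the contract payload.
type DataContract struct {
	Status string
}

// includeContract reports whether the contract should be emitted for the
// given context level. A nil contract is never emitted; the compact
// context emits it only when the status is FAILING.
func includeContract(level ContextLevel, dc *DataContract) bool {
	if dc == nil {
		return false
	}
	if level == Compact {
		return dc.Status == "FAILING"
	}
	return true
}

func main() {
	passing := &DataContract{Status: "PASSING"} // assumed status value
	failing := &DataContract{Status: "FAILING"}

	fmt.Println(includeContract(Full, passing))       // true
	fmt.Println(includeContract(Compact, passing))    // false
	fmt.Println(includeContract(Compact, failing))    // true
	fmt.Println(includeContract(MultiTable, passing)) // true
}
```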

Knowledge Management (`apply_knowledge`)

Four new change types extend the `apply_knowledge` tool for managing structured properties and incidents through the knowledge workflow. All require DataHub 1.4.x.

Structured Property Management

| Change Type | Target | Detail |
|---|---|---|
| `set_structured_property` | Property qualified name or URN (e.g., `io.acryl.privacy.retentionTime`) | Value or JSON array of values (e.g., `90`, `"PII"`, `[90, "PII"]`) |
| `remove_structured_property` | Property qualified name or URN | Removal reason |

  • Qualified names are automatically normalized to `urn:li:structuredProperty:` URN format
  • Values are parsed with type preservation: integers stay `int64`, floats stay `float64`, strings stay strings — both for scalar and JSON array inputs

Incident Management

| Change Type | Target | Detail |
|---|---|---|
| `raise_incident` | Incident title | Optional description |
| `resolve_incident` | Incident URN (e.g., `urn:li:incident:abc123`) | Resolution message |

  • `raise_incident` creates `OPERATIONAL` type incidents by default
  • `resolve_incident` operates on the incident URN directly (the `entity_urn` field identifies which entity the changeset applies to, but the resolution targets the incident itself)

Interface Changes

DataHubWriter Interface

Four new methods were added to the `DataHubWriter` interface in `pkg/toolkits/knowledge/datahub_writer.go`:

UpsertStructuredProperties(ctx context.Context, urn, propertyURN string, values []any) error
RemoveStructuredProperty(ctx context.Context, urn, propertyURN string) error
RaiseIncident(ctx context.Context, entityURN, title, description string) (string, error)
ResolveIncident(ctx context.Context, incidentURN, message string) error

No-op implementations are provided for deployments without DataHub write-back configured.


Internal Improvements

  • Complexity management: `dispatchChange()` was split into `dispatchChange()`, `dispatchCuratedQuery()`, and `dispatchV14Change()` to stay within the cyclomatic complexity limit (≤10)
  • Error handling: Unknown change types now return an error instead of silently succeeding
  • Sanitization: Structured property display names/values and incident titles/descriptions go through the injection prevention pipeline with detection logging
  • Extracted constant: `errFmtExecuting` constant satisfies `goconst` lint rule

Backward Compatibility

This release is fully backward compatible. No configuration changes are required.

  • DataHub 1.3.x: The upstream entity response omits the new fields (nil/zero), and `omitempty` JSON tags ensure they are absent from enrichment output. Existing behavior is unchanged.
  • DataHub 1.4.x: New fields appear automatically when the entity has structured properties, active incidents, or data contracts.
  • `apply_knowledge`: The four new change types are additive. Existing change types (`update_description`, `add_tag`, etc.) are unaffected.

Quality

  • 90.9% total test coverage
  • All new functions at 80–100% coverage
  • 0 lint issues
  • Backward compatibility verified by `TestGetTableContext_V13Compat`

Closes

  • #181 — Structured properties in semantic enrichment
  • #182 — Incident status in semantic enrichment
  • #183 — Data contract status in semantic enrichment
  • #184 — Structured properties in apply_knowledge
  • #185 — Incidents in apply_knowledge
  • #186 — DataHub 1.4.x Upgrade (tracking issue)

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v1.44.0

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_1.44.0_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_1.44.0_linux_amd64.tar.gz