mcp-data-platform-v0.22.3

Released by github-actions on 20 Feb 01:57 · commit 8b674d3

Knowledge Pipeline Improvements

This release addresses the remaining issues found during v0.22.2 regression testing of the knowledge pipeline (capture_insight → apply_knowledge). Three changes improve usability, reduce tag namespace pollution, and restore descriptive validation errors.

New: remove_tag Change Type

The apply_knowledge tool now supports remove_tag as a change type, enabling tag removal from DataHub entities. Tag names are automatically normalized to full URNs — deprecated becomes urn:li:tag:deprecated.

{
  "action": "apply",
  "entity_urn": "urn:li:dataset:(...)",
  "changes": [
    {"change_type": "remove_tag", "target": "", "detail": "urn:li:tag:QualityIssue"}
  ],
  "confirm": true
}

This enables cleanup workflows — for example, removing a QualityIssue tag after the underlying data quality problem has been resolved.
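
The short-name-to-URN normalization described above can be sketched as follows. This is a minimal illustration, not the server's implementation: the function name normalize_tag_urn is hypothetical, and the real normalization may handle cases (casing, whitespace, validation) that this sketch does not.

```python
# Illustrative sketch only; the name and behavior details are assumptions.
TAG_URN_PREFIX = "urn:li:tag:"


def normalize_tag_urn(name: str) -> str:
    """Return a full tag URN, leaving already-qualified URNs untouched."""
    name = name.strip()
    if name.startswith(TAG_URN_PREFIX):
        return name
    return TAG_URN_PREFIX + name


print(normalize_tag_urn("deprecated"))               # urn:li:tag:deprecated
print(normalize_tag_urn("urn:li:tag:QualityIssue"))  # urn:li:tag:QualityIssue
```

Under this assumption, passing either the short name or the full URN in detail would resolve to the same tag.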

Redesigned: flag_quality_issue

Before (v0.22.2): Created dynamic slugified tags like quality_issue_missing_column_descriptions, quality_issue_nulls_in_30_of_rows, etc. This polluted the DataHub tag namespace with one-off tags that were hard to search and manage.

After (v0.22.3): Adds a single fixed urn:li:tag:QualityIssue tag. The detail text (e.g., "Missing column descriptions for 5 fields") is stored as context in the knowledge store for admin review, not encoded in the tag name.

Benefits:

  • Clean tag namespace — one tag for all quality issues, searchable in DataHub
  • Rich detail preserved in knowledge store alongside the insight that flagged it
  • Easy cleanup — remove_tag with urn:li:tag:QualityIssue clears the flag
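
To make the redesign concrete, here is a rough sketch of the v0.22.3 behavior: every flag resolves to the same fixed tag, while the free-text detail is recorded separately. The function name, the list used as a stand-in knowledge store, and the record layout are all assumptions for illustration; the second detail string is likewise made up.

```python
# Illustrative sketch only; the store and record shape are hypothetical.
QUALITY_ISSUE_TAG = "urn:li:tag:QualityIssue"


def flag_quality_issue(entity_urn: str, detail: str, knowledge_store: list) -> str:
    """Record the detail text and return the single fixed tag to apply."""
    knowledge_store.append(
        {"entity_urn": entity_urn, "tag": QUALITY_ISSUE_TAG, "detail": detail}
    )
    return QUALITY_ISSUE_TAG


store = []  # stand-in for the knowledge store
tag_a = flag_quality_issue(
    "urn:li:dataset:(...)", "Missing column descriptions for 5 fields", store
)
tag_b = flag_quality_issue("urn:li:dataset:(...)", "Nulls in 30% of rows", store)

# Both issues map to the same tag; only the stored detail differs.
print(tag_a == tag_b)
print([record["detail"] for record in store])
```

The design choice is that the tag answers "is something wrong with this entity?" while the knowledge store answers "what exactly?".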

Fixed: Enum Validation UX

Before (v0.22.2): JSON schema enum constraints caused the MCP transport layer to reject invalid values with generic error messages before they reached the server. LLM clients received unhelpful errors like "invalid value" with no guidance on valid options.

After (v0.22.3): Enum constraints have been removed from the tool schemas. Valid values are listed in each field's description, so LLM clients see them during tool discovery, and server-side validation returns descriptive errors:

invalid category "revenue_data": must be one of: correction, business_context,
data_quality, usage_guidance, relationship, enhancement
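
A server-side check that produces an error of this shape could look roughly like the sketch below. The function name validate_category is hypothetical and the list of categories is copied from the error message above; the actual server code may differ.

```python
# Illustrative sketch only; not the server's validation code.
VALID_CATEGORIES = (
    "correction",
    "business_context",
    "data_quality",
    "usage_guidance",
    "relationship",
    "enhancement",
)


def validate_category(value: str) -> str:
    """Return the value if valid, else raise an error that lists the options."""
    if value not in VALID_CATEGORIES:
        allowed = ", ".join(VALID_CATEGORIES)
        raise ValueError(f'invalid category "{value}": must be one of: {allowed}')
    return value


try:
    validate_category("revenue_data")
except ValueError as exc:
    print(exc)
```

Because the message names every accepted value, an LLM client can correct itself on the next call without re-reading the tool description.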

All Supported Change Types (v0.22.3)

Change Type         | Description
------------------- | -----------
update_description  | Update entity or column description (use target: "column:<fieldPath>" for columns)
add_tag             | Add a tag (auto-normalizes short names to urn:li:tag:<name>)
remove_tag          | Remove a tag (same URN normalization)
add_glossary_term   | Associate a glossary term (auto-normalizes to urn:li:glossaryTerm:<name>)
flag_quality_issue  | Add fixed QualityIssue tag; detail stored in knowledge store
add_documentation   | Add documentation link (target = URL, detail = description)
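
As a client-side convenience, a request combining several change types might be assembled as in the sketch below. The field names follow the remove_tag example earlier in these notes; the helper names, the column path "column:email", and the assumption that detail carries the new description or the tag name are illustrative only.

```python
# Illustrative sketch only; helper names and the column path are hypothetical.
import json

CHANGE_TYPES = {
    "update_description",
    "add_tag",
    "remove_tag",
    "add_glossary_term",
    "flag_quality_issue",
    "add_documentation",
}


def make_change(change_type: str, detail: str, target: str = "") -> dict:
    """Build one change entry, rejecting change types not listed in v0.22.3."""
    if change_type not in CHANGE_TYPES:
        raise ValueError(f"unsupported change type: {change_type}")
    return {"change_type": change_type, "target": target, "detail": detail}


def build_apply_request(entity_urn: str, changes: list, confirm: bool = True) -> dict:
    """Wrap change entries in the apply_knowledge request shape shown above."""
    return {
        "action": "apply",
        "entity_urn": entity_urn,
        "changes": list(changes),
        "confirm": confirm,
    }


request = build_apply_request(
    "urn:li:dataset:(...)",
    [
        # Assumed: detail holds the new description text for the column.
        make_change("update_description", "Customer email address", target="column:email"),
        # Assumed: detail holds the tag name, mirroring the remove_tag example.
        make_change("add_tag", "deprecated"),
    ],
)
print(json.dumps(request, indent=2))
```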

Documentation

Updated governance workflow, tools reference, llms.txt, llms-full.txt, and changelog.


Upgrade Notes

  • No breaking changes. All existing apply_knowledge calls continue to work.
  • flag_quality_issue now creates urn:li:tag:QualityIssue instead of urn:li:tag:quality_issue_<slug>. Previously created dynamic tags remain in DataHub — use remove_tag to clean them up if desired.
  • LLM clients that relied on schema-level enum constraints for validation should now read valid values from field descriptions instead. Server-side validation is unchanged.
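
For the optional cleanup of previously created dynamic tags, the remove_tag changes could be generated along these lines. This is a sketch under assumptions: how you obtain an entity's current tag URNs is outside its scope, the helper name is hypothetical, and the sample tag list is invented for illustration.

```python
# Illustrative sketch only; the input tag list must come from your own lookup.
LEGACY_PREFIX = "urn:li:tag:quality_issue_"


def legacy_cleanup_changes(tag_urns: list) -> list:
    """Build remove_tag changes for tags matching the old dynamic naming scheme."""
    return [
        {"change_type": "remove_tag", "target": "", "detail": urn}
        for urn in tag_urns
        if urn.startswith(LEGACY_PREFIX)
    ]


current_tags = [
    "urn:li:tag:quality_issue_missing_column_descriptions",
    "urn:li:tag:QualityIssue",
    "urn:li:tag:deprecated",
]
changes = legacy_cleanup_changes(current_tags)
print(changes)
```

The resulting list can be placed in the changes array of an apply request; the fixed urn:li:tag:QualityIssue tag is left alone because it does not match the legacy prefix.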

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v0.22.3

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_0.22.3_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_0.22.3_linux_amd64.tar.gz