
fix(mcp): strip json_metadata and position_json from get_dashboard_info response #39101

Merged
aminghadersohi merged 6 commits into apache:master from aminghadersohi:amin/strip-dashboard-json-metadata-position-json
Apr 9, 2026

Conversation

@aminghadersohi
Contributor

Summary

The get_dashboard_info MCP tool returned excessively large responses (up to 2.4MB for dashboards with many charts and filters). The main culprits were:

  • json_metadata (~0.5MB on large dashboards): Raw JSON blob containing color schemes, cross-filter scopes, shared_label_colors, and other internal configuration not useful for LLM consumption
  • position_json (grows with chart count): Full internal Superset dashboard layout tree with every node's children, parents, height, width, etc.
  • Charts array: Included verbose per-chart fields (form_data, tags, owners, timestamps) not needed in dashboard context

Changes

  1. Removed json_metadata and position_json raw fields from DashboardInfo response schema
  2. Added structured native_filters field — extracts only filter name, type, and targets from json_metadata (the useful part for LLMs)
  3. Added cross_filters_enabled boolean field extracted from json_metadata
  4. Replaced ChartInfo with DashboardChartSummary in dashboard context — lightweight model with only: id, slice_name, viz_type, datasource_name, url, description
  5. Updated dashboard_serializer, serialize_dashboard_object, generate_dashboard, and add_chart_to_existing_dashboard to use new models
  6. Added comprehensive tests for filter extraction helpers and response slimming
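The chart slimming in item 4 can be illustrated with a minimal sketch. It uses a standard-library dataclass rather than the Pydantic model in the PR, and the helper name `summarize_chart` and the sample chart dict are assumptions made for illustration; only the six field names come from the description above.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Optional


@dataclass
class DashboardChartSummary:
    """Lightweight chart view with the six fields listed in the PR description."""

    id: Optional[int] = None
    slice_name: Optional[str] = None
    viz_type: Optional[str] = None
    datasource_name: Optional[str] = None
    url: Optional[str] = None
    description: Optional[str] = None


def summarize_chart(chart: dict[str, Any]) -> DashboardChartSummary:
    """Hypothetical helper: copy only the summary fields, drop everything else."""
    return DashboardChartSummary(
        **{f.name: chart.get(f.name) for f in fields(DashboardChartSummary)}
    )


# Sample input is invented; form_data, tags, and owners stand in for the
# verbose per-chart fields that the dashboard response no longer carries.
full_chart = {
    "id": 7,
    "slice_name": "Revenue by region",
    "viz_type": "bar",
    "datasource_name": "sales",
    "url": "/explore/?slice_id=7",
    "description": None,
    "form_data": {"metrics": ["sum__revenue"]},
    "tags": ["finance"],
    "owners": [{"id": 1}],
}
summary = asdict(summarize_chart(full_chart))
```

Selecting an explicit allow-list of fields, rather than deleting known-heavy ones, means that a field added to the full chart model later does not leak into the dashboard response by default.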

Impact

This significantly reduces get_dashboard_info response size (from potentially 2.4MB to a small fraction) while retaining all information useful for LLM workflows: dashboard metadata, filter configuration, and chart summaries.

Test plan

  • All 86 existing dashboard MCP tests pass
  • New tests for _extract_native_filters (7 cases: None, empty, invalid JSON, no config, non-list, valid, skip non-dict)
  • New tests for _extract_cross_filters_enabled (5 cases: None, empty, true, false, non-bool)
  • New tests verifying DashboardInfo no longer has json_metadata/position_json
  • New tests verifying charts are lightweight DashboardChartSummary (no form_data, tags, owners)
  • ruff check + format pass
  • pre-commit hooks pass (mypy, ruff, etc.)

…fo response

The get_dashboard_info MCP tool returned excessively large responses
(up to 2.4MB) due to raw json_metadata (~0.5MB of color schemes,
cross-filter scopes, etc.) and position_json (internal layout tree).

Changes:
- Remove json_metadata and position_json raw fields from DashboardInfo
- Extract only useful filter info into structured native_filters field
- Add cross_filters_enabled boolean field
- Replace full ChartInfo with lightweight DashboardChartSummary in
  dashboard context (id, name, viz_type, datasource_name, url, description)
- Update dashboard_serializer, serialize_dashboard_object, and
  generate_dashboard/add_chart_to_existing_dashboard to use new models
- Add comprehensive tests for filter extraction and response slimming
@bito-code-review
Contributor

bito-code-review Bot commented Apr 3, 2026

Code Review Agent Run #7a2e42

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/mcp_service/utils/permissions_utils.py - 1
    • Security: Potential data exposure from removed field restrictions · Line 47-48
      Removing 'json_metadata' and 'position_json' from SENSITIVE_FIELDS['dashboard'] eliminates permission checks for these fields, potentially exposing sensitive configuration or internal data. The original comments indicate 'json_metadata' may contain sensitive configuration and 'position_json' is internal layout data. If these fields can indeed hold sensitive information, this change could create a security vulnerability by allowing unauthorized access.
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/mcp_service/dashboard/schemas.py - 1
Review Details
  • Files reviewed - 7 · Commit Range: 2d34ede..2d34ede
    • superset/mcp_service/common/schema_discovery.py
    • superset/mcp_service/dashboard/schemas.py
    • superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py
    • superset/mcp_service/dashboard/tool/generate_dashboard.py
    • superset/mcp_service/utils/permissions_utils.py
    • tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
    • tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset. You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot Bot added the api (Related to the REST API) and dashboard (Namespace | Anything related to the Dashboard) labels Apr 3, 2026
@codecov

codecov Bot commented Apr 3, 2026

Codecov Report

❌ Patch coverage is 25.00000% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.42%. Comparing base (b3a402d) to head (2b51055).
⚠️ Report is 58 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| superset/mcp_service/dashboard/schemas.py | 35.29% | 44 Missing ⚠️ |
| superset/mcp_service/utils/response_utils.py | 0.00% | 28 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #39101      +/-   ##
==========================================
- Coverage   64.52%   64.42%   -0.10%     
==========================================
  Files        2536     2543       +7     
  Lines      131208   131987     +779     
  Branches    30457    30572     +115     
==========================================
+ Hits        84661    85039     +378     
- Misses      45084    45463     +379     
- Partials     1463     1485      +22     
Flag Coverage Δ
hive 40.02% <25.00%> (-0.03%) ⬇️
mysql 60.69% <25.00%> (-0.19%) ⬇️
postgres 60.77% <25.00%> (-0.19%) ⬇️
presto 41.82% <25.00%> (+1.76%) ⬆️
python 62.35% <25.00%> (-0.19%) ⬇️
sqlite 60.40% <25.00%> (-0.18%) ⬇️
unit 100.00% <ø> (ø)


Comment thread superset/mcp_service/dashboard/schemas.py
Comment thread superset/mcp_service/dashboard/schemas.py Outdated
Address PR review comments:
- Add isinstance(metadata, dict) guard in _extract_native_filters
  and _extract_cross_filters_enabled to handle non-object JSON
  (e.g. "[]", "123") gracefully instead of raising AttributeError
- Add isinstance(raw_targets, list) guard for corrupted filter
  targets to prevent Pydantic ValidationError
- Fix stale ChartInfo reference in module docstring
- Add tests for non-dict top-level JSON edge cases
@github-actions github-actions Bot removed the api (Related to the REST API) label Apr 4, 2026
Refactor both _extract_native_filters and _extract_cross_filters_enabled
to use a shared _parse_json_metadata helper that safely handles None,
invalid JSON, and non-dict JSON values (e.g. "[]", "123").

This addresses review feedback about isinstance(dict) guards by
centralizing the validation in one place.
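A minimal sketch of the shared-helper pattern described in these two commits follows. It is not the code from schemas.py: the function names drop the leading underscore, the output dict shape is invented, and the json_metadata keys (`native_filter_configuration`, `filterType`, `cross_filters_enabled`) and the choice to return `None` for a missing or non-boolean flag are assumptions.

```python
import json
from typing import Any, Optional


def parse_json_metadata(raw: Optional[str]) -> Optional[dict[str, Any]]:
    """Return the decoded object, or None for None, empty, invalid, or non-dict JSON."""
    if not raw:
        return None
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return None
    # "[]" and "123" are valid JSON but not objects, so .get() would fail on them.
    return parsed if isinstance(parsed, dict) else None


def extract_native_filters(raw: Optional[str]) -> list[dict[str, Any]]:
    """Keep only name, type, and targets for each well-formed filter entry."""
    metadata = parse_json_metadata(raw)
    if metadata is None:
        return []
    config = metadata.get("native_filter_configuration")  # assumed key name
    if not isinstance(config, list):
        return []
    filters = []
    for item in config:
        if not isinstance(item, dict):
            continue  # skip corrupted entries
        raw_targets = item.get("targets")
        targets = (
            [t for t in raw_targets if isinstance(t, dict)]
            if isinstance(raw_targets, list)
            else []
        )
        filters.append(
            {
                "name": item.get("name"),
                "filter_type": item.get("filterType"),  # assumed key name
                "targets": targets,
            }
        )
    return filters


def extract_cross_filters_enabled(raw: Optional[str]) -> Optional[bool]:
    """Return the flag only when it is a real boolean (assumed behavior)."""
    metadata = parse_json_metadata(raw)
    if metadata is None:
        return None
    value = metadata.get("cross_filters_enabled")  # assumed key name
    return value if isinstance(value, bool) else None
```

With the type check in one place, both extractors treat `"[]"` and `"123"` the same way they treat invalid JSON, so neither needs its own `isinstance(metadata, dict)` guard.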
Add parentheses to @pytest.mark.asyncio and @pytest.fixture
decorators to match CI pre-commit ruff version.
@bito-code-review
Contributor

bito-code-review Bot commented Apr 4, 2026

Code Review Agent Run #3628d0

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: 2d34ede..41ef88d
    • superset/mcp_service/dashboard/schemas.py
    • tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
    • tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

@mistercrunch
Member

Claude commenting from a session with @mistercrunch 👋

Nice cleanup — dropping 2.4MB responses is clearly the right call. One concern worth considering:

Silent omission vs. explicit omission

Right now position_json and most of json_metadata are silently dropped. The agent receives native_filters and cross_filters_enabled as structured extractions (great!), but has no way to know that layout data or other metadata fields exist at all.

Here's an idea: instead of silently omitting, surface a stub that tells the agent what it's missing and how to get it:

{
  "native_filters": [...],
  "cross_filters_enabled": true,
  "_omitted": {
    "position_json": "omitted (~42kb layout tree)",
    "json_metadata": "omitted (~18kb raw blob) — native_filters and cross_filters_enabled extracted above"
  }
}

Or even simpler — keep the field in the schema but return a sentinel string:

position_json: str | None = "[omitted: layout tree too large for context — use get_dashboard_raw if needed]"

The distinction matters because:

  • Silent drop → agent doesn't know what it doesn't know; may confidently give wrong answers about layout or filter config
  • Explicit stub → agent can tell the user "I don't have layout data" or decide to call a follow-up tool to fetch it

The native_filters extraction is a good pattern. position_json disappearing with no trace is the gap. Either a _omitted_fields metadata key, a stub sentinel, or a companion tool (get_dashboard_layout) that lets the agent fetch it on demand would close it.

Address reviewer feedback: instead of silently dropping json_metadata
and position_json, include an omitted_fields dict that tells the LLM
agent what was stripped, approximate sizes, and why.

Introduces a reusable OmittedFieldsBuilder utility in response_utils.py
that any MCP tool serializer can use for consistent omission metadata.

Pattern follows industry best practices (mcp-git-polite, Axiom, Blockscout)
for explicit omission signaling over silent drops.
Copilot AI review requested due to automatic review settings April 9, 2026 15:56
@pull-request-size pull-request-size Bot added size/XL and removed size/L labels Apr 9, 2026
Comment thread superset/mcp_service/dashboard/schemas.py
@netlify

netlify Bot commented Apr 9, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit f62af42
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/69d7cc38afd7b70008a629b7
😎 Deploy Preview https://deploy-preview-39101--superset-docs-preview.netlify.app
Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR reduces the payload size of the get_dashboard_info MCP tool by removing large raw dashboard fields and replacing them with lightweight, structured summaries plus explicit omission metadata.

Changes:

  • Removed json_metadata and position_json from DashboardInfo, adding native_filters, cross_filters_enabled, and omitted_fields instead.
  • Replaced full chart serialization with lightweight DashboardChartSummary in dashboard context.
  • Added a reusable OmittedFieldsBuilder utility and expanded unit tests around extraction/omission behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
|---|---|
| tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py | Updates tool tests to reflect new default columns and removed heavy fields. |
| tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py | Adds tests for omission behavior and json_metadata extraction helpers. |
| superset/mcp_service/utils/response_utils.py | Introduces OmittedFieldsBuilder for consistent omission metadata. |
| superset/mcp_service/utils/permissions_utils.py | Removes permissions handling for fields no longer exposed in MCP dashboard responses. |
| superset/mcp_service/dashboard/tool/generate_dashboard.py | Switches dashboard chart serialization to lightweight summaries. |
| superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py | Switches updated dashboard response chart serialization to lightweight summaries. |
| superset/mcp_service/dashboard/schemas.py | Implements new schema fields + extraction helpers + chart summary serializer. |
| superset/mcp_service/common/schema_discovery.py | Updates dashboard field descriptions to match new response shape. |

Comment thread superset/mcp_service/dashboard/tool/generate_dashboard.py
Comment thread superset/mcp_service/dashboard/tool/generate_dashboard.py
Comment thread superset/mcp_service/dashboard/schemas.py
Comment thread superset/mcp_service/dashboard/schemas.py Outdated
@aminghadersohi
Contributor Author

Great feedback. Implemented explicit omission metadata — the response now includes an omitted_fields dict:

"omitted_fields": {
    "position_json": "Omitted (~42 KB) — Internal layout tree with component positions/hierarchy. Not useful for analysis or LLM context.",
    "json_metadata": "Omitted (~18 KB), useful parts extracted — native_filters and cross_filters_enabled extracted into dedicated fields above."
}

This follows the pattern from mcp-git-polite (truncated + reason), Axiom (result provenance), and Blockscout (explicit indicators). The OmittedFieldsBuilder utility in response_utils.py is reusable by any MCP tool serializer.
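For readers who want a feel for how such a builder could produce the dict above, here is a rough sketch. Only the class name comes from this PR; the `add` and `build` methods, the `format_size` helper, the rounding, and the message format are assumptions and may differ from what response_utils.py actually does.

```python
from typing import Optional


def format_size(num_bytes: int) -> str:
    """Hypothetical helper: render an approximate human-readable size."""
    if num_bytes < 1024:
        return f"{num_bytes} B"
    if num_bytes < 1024 * 1024:
        return f"{round(num_bytes / 1024)} KB"
    return f"{num_bytes / (1024 * 1024):.1f} MB"


class OmittedFieldsBuilder:
    """Illustrative stand-in: collects one explanatory message per stripped field."""

    def __init__(self) -> None:
        self._omitted: dict[str, str] = {}

    def add(self, name: str, value: Optional[str], reason: str) -> "OmittedFieldsBuilder":
        # Only report a field that actually had content on the source object.
        if value:
            size = format_size(len(value.encode("utf-8")))
            self._omitted[name] = f"Omitted (~{size}): {reason}"
        return self

    def build(self) -> Optional[dict[str, str]]:
        # Return None when nothing was stripped so the key can be left out.
        return dict(self._omitted) or None


# Placeholder payload of 42 * 1024 bytes; json_metadata is absent here.
omitted = (
    OmittedFieldsBuilder()
    .add("position_json", "x" * (42 * 1024), "Internal layout tree.")
    .add("json_metadata", None, "Useful parts extracted into dedicated fields.")
    .build()
)
```

Measuring the size at serialization time keeps the hint honest: the agent sees roughly how much context it would spend if it fetched the field through a follow-up tool.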

For the companion tool idea (get_dashboard_layout), created a follow-up story: sc-103485. That will let the agent explicitly fetch layout data on demand when it sees the omission hint.

See commit f62af42.

@bito-code-review
Contributor

bito-code-review Bot commented Apr 9, 2026

Code Review Agent Run #0b84e8

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/mcp_service/utils/response_utils.py - 1
    • Time-specific docstring comment · Line 28-28
      The module docstring includes time-specific language 'as of 2026' in the industry context section. Per AGENTS.md guidelines, code comments should avoid time-specific words like 'now', 'currently', or dates to prevent them from becoming outdated. Consider rephrasing to 'Industry context:' without the date.
      Code suggestion
       @@ -28,1 +28,1 @@
      -Industry context (as of 2026):
      +Industry context:
Review Details
  • Files reviewed - 3 · Commit Range: 41ef88d..f62af42
    • superset/mcp_service/dashboard/schemas.py
    • superset/mcp_service/utils/response_utils.py
    • tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

…load guard

- Filter non-dict entries from native filter targets to prevent
  Pydantic ValidationError on corrupted json_metadata
- Rename _serialize_chart_summary to serialize_chart_summary (public
  API since it's imported cross-module)
- Use `if chart_id is not None:` instead of `if chart_id:` to handle
  falsy-but-valid IDs
- Use getattr() for json_metadata/position_json in dashboard_serializer
  to avoid triggering SQLAlchemy lazy-loads on deferred columns
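The difference between the two `chart_id` checks is easy to see in isolation. The sketch below uses invented helper names and sample IDs; it only demonstrates the Python truthiness behavior behind that bullet, not the code in the tool modules.

```python
from typing import Optional


def keep_truthy(ids: list[Optional[int]]) -> list[int]:
    """Mirrors `if chart_id:`; drops None but also drops 0."""
    return [chart_id for chart_id in ids if chart_id]


def keep_not_none(ids: list[Optional[int]]) -> list[int]:
    """Mirrors `if chart_id is not None:`; drops only missing IDs."""
    return [chart_id for chart_id in ids if chart_id is not None]


sample_ids = [0, 3, None, 7]  # 0 is falsy but still a valid integer
truthy_result = keep_truthy(sample_ids)
not_none_result = keep_not_none(sample_ids)
```

Here `truthy_result` loses the `0` entry while `not_none_result` keeps it, which is the case the stricter check is meant to cover.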
@bito-code-review
Contributor

bito-code-review Bot commented Apr 9, 2026

Code Review Agent Run #542f75

Actionable Suggestions - 0
Review Details
  • Files reviewed - 3 · Commit Range: f62af42..2b51055
    • superset/mcp_service/dashboard/schemas.py
    • superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py
    • superset/mcp_service/dashboard/tool/generate_dashboard.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Member

@mistercrunch mistercrunch left a comment

LGTM

@aminghadersohi aminghadersohi merged commit 680cef0 into apache:master Apr 9, 2026
65 checks passed
michael-s-molina pushed a commit that referenced this pull request Apr 13, 2026

Labels

dashboard (Namespace | Anything related to the Dashboard), size/XL


3 participants