fix(mcp): strip json_metadata and position_json from get_dashboard_info response by aminghadersohi · Pull Request #39101 · apache/superset

aminghadersohi · 2026-04-03T21:55:49Z

Summary

The get_dashboard_info MCP tool returned excessively large responses (up to 2.4MB for dashboards with many charts and filters). The main culprits were:

json_metadata (~0.5MB on large dashboards): Raw JSON blob containing color schemes, cross-filter scopes, shared_label_colors, and other internal configuration not useful for LLM consumption
position_json (grows with chart count): Full internal Superset dashboard layout tree with every node's children, parents, height, width, etc.
Charts array: Included verbose per-chart fields (form_data, tags, owners, timestamps) not needed in dashboard context

Changes

Removed json_metadata and position_json raw fields from DashboardInfo response schema
Added structured native_filters field — extracts only filter name, type, and targets from json_metadata (the useful part for LLMs)
Added cross_filters_enabled boolean field extracted from json_metadata
Replaced ChartInfo with DashboardChartSummary in dashboard context — lightweight model with only: id, slice_name, viz_type, datasource_name, url, description
Updated dashboard_serializer, serialize_dashboard_object, generate_dashboard, and add_chart_to_existing_dashboard to use new models
Added comprehensive tests for filter extraction helpers and response slimming
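
As a rough illustration of the chart slimming described above, the sketch below shows one way to project a full chart object down to the six listed fields. It is not the PR's code: the PR's `DashboardChartSummary` is a model in `schemas.py`, whereas `ChartSummarySketch` and `summarize_chart` are hypothetical names, and a stdlib dataclass is used so the example runs without dependencies.

```python
from dataclasses import asdict, dataclass, fields
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ChartSummarySketch:
    """Hypothetical stand-in for DashboardChartSummary (fields from the PR text)."""

    id: Optional[int] = None
    slice_name: Optional[str] = None
    viz_type: Optional[str] = None
    datasource_name: Optional[str] = None
    url: Optional[str] = None
    description: Optional[str] = None


def summarize_chart(chart: Any) -> ChartSummarySketch:
    """Copy only the summary fields; anything else on the chart is ignored."""
    return ChartSummarySketch(
        **{f.name: getattr(chart, f.name, None) for f in fields(ChartSummarySketch)}
    )


# A fake chart carrying verbose fields (form_data, owners) that should not survive.
example_chart = SimpleNamespace(
    id=7,
    slice_name="Revenue",
    viz_type="bar",
    datasource_name="orders",
    url="/explore/?slice_id=7",
    description=None,
    form_data={"metrics": ["sum__amount"]},
    owners=["admin"],
)
example_summary = asdict(summarize_chart(example_chart))
```

Because the projection is driven by the summary's own field list, a verbose attribute added to the chart later cannot leak into the dashboard response by accident.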

Impact

This significantly reduces get_dashboard_info response size (from potentially 2.4MB to a small fraction) while retaining all information useful for LLM workflows: dashboard metadata, filter configuration, and chart summaries.

Test plan

All 86 existing dashboard MCP tests pass
New tests for _extract_native_filters (7 cases: None, empty, invalid JSON, no config, non-list, valid, skip non-dict)
New tests for _extract_cross_filters_enabled (5 cases: None, empty, true, false, non-bool)
New tests verifying DashboardInfo no longer has json_metadata/position_json
New tests verifying charts are lightweight DashboardChartSummary (no form_data, tags, owners)
ruff check + format pass
pre-commit hooks pass (mypy, ruff, etc.)

…fo response

The get_dashboard_info MCP tool returned excessively large responses (up to 2.4MB) due to raw json_metadata (~0.5MB of color schemes, cross-filter scopes, etc.) and position_json (internal layout tree).

Changes:
- Remove json_metadata and position_json raw fields from DashboardInfo
- Extract only useful filter info into structured native_filters field
- Add cross_filters_enabled boolean field
- Replace full ChartInfo with lightweight DashboardChartSummary in dashboard context (id, name, viz_type, datasource_name, url, description)
- Update dashboard_serializer, serialize_dashboard_object, and generate_dashboard/add_chart_to_existing_dashboard to use new models
- Add comprehensive tests for filter extraction and response slimming

bito-code-review · 2026-04-03T21:55:59Z

Code Review Agent Run #7a2e42

Actionable Suggestions - 0

Additional Suggestions - 1

superset/mcp_service/utils/permissions_utils.py - 1
- Security: Potential data exposure from removed field restrictions · Line 47-48
  
  Removing 'json_metadata' and 'position_json' from SENSITIVE_FIELDS['dashboard'] eliminates permission checks for these fields, potentially exposing sensitive configuration or internal data. The original comments indicate 'json_metadata' may contain sensitive configuration and 'position_json' is internal layout data. If these fields can indeed hold sensitive information, this change could create a security vulnerability by allowing unauthorized access.

Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

superset/mcp_service/dashboard/schemas.py - 1
- API Response Structure Change · Line 362-362

Review Details

Files reviewed - 7 · Commit Range: 2d34ede..2d34ede
- superset/mcp_service/common/schema_discovery.py
- superset/mcp_service/dashboard/schemas.py
- superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py
- superset/mcp_service/dashboard/tool/generate_dashboard.py
- superset/mcp_service/utils/permissions_utils.py
- tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
- tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

/review - Manually triggers a full AI review.
/pause - Pauses automatic reviews on this pull request.
/resume - Resumes automatic reviews.
/resolve - Marks all Bito-posted review comments as resolved.
/abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset. You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by

codecov · 2026-04-03T22:00:01Z

Codecov Report

❌ Patch coverage is 25.00000% with 72 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.42%. Comparing base (b3a402d) to head (2b51055).
⚠️ Report is 58 commits behind head on master.

Files with missing lines	Patch %	Lines
superset/mcp_service/dashboard/schemas.py	35.29%	44 Missing ⚠️
superset/mcp_service/utils/response_utils.py	0.00%	28 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #39101      +/-   ##
==========================================
- Coverage   64.52%   64.42%   -0.10%     
==========================================
  Files        2536     2543       +7     
  Lines      131208   131987     +779     
  Branches    30457    30572     +115     
==========================================
+ Hits        84661    85039     +378     
- Misses      45084    45463     +379     
- Partials     1463     1485      +22

Flag	Coverage Δ
hive	`40.02% <25.00%> (-0.03%)`	⬇️
mysql	`60.69% <25.00%> (-0.19%)`	⬇️
postgres	`60.77% <25.00%> (-0.19%)`	⬇️
presto	`41.82% <25.00%> (+1.76%)`	⬆️
python	`62.35% <25.00%> (-0.19%)`	⬇️
sqlite	`60.40% <25.00%> (-0.18%)`	⬇️
unit	`100.00% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Address PR review comments:
- Add isinstance(metadata, dict) guard in _extract_native_filters and _extract_cross_filters_enabled to handle non-object JSON (e.g. "[]", "123") gracefully instead of raising AttributeError
- Add isinstance(raw_targets, list) guard for corrupted filter targets to prevent Pydantic ValidationError
- Fix stale ChartInfo reference in module docstring
- Add tests for non-dict top-level JSON edge cases

Refactor both _extract_native_filters and _extract_cross_filters_enabled to use a shared _parse_json_metadata helper that safely handles None, invalid JSON, and non-dict JSON values (e.g. "[]", "123"). This addresses review feedback about isinstance(dict) guards by centralizing the validation in one place.
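
The centralized validation described in this commit message could look roughly like the sketch below. This is an assumption-laden illustration rather than the PR's implementation: the function names drop the leading underscore to mark them as hypothetical, the `native_filter_configuration`, `filterType`, and `cross_filters_enabled` keys are assumed to be where Superset stores this data in json_metadata, and the return values for bad input (empty list, `None`) are guesses.

```python
import json
from typing import Any, Optional


def parse_json_metadata(raw: Optional[str]) -> Optional[dict[str, Any]]:
    """Return the parsed metadata dict, or None for missing, invalid, or non-dict JSON."""
    if not raw:
        return None
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return None
    # Valid JSON such as "[]" or "123" is not usable as metadata.
    return parsed if isinstance(parsed, dict) else None


def extract_native_filters(raw: Optional[str]) -> list[dict[str, Any]]:
    """Keep only name, type, and targets for each well-formed filter entry."""
    metadata = parse_json_metadata(raw)
    if metadata is None:
        return []
    config = metadata.get("native_filter_configuration")  # assumed key
    if not isinstance(config, list):
        return []
    filters = []
    for entry in config:
        if not isinstance(entry, dict):
            continue  # skip corrupted entries instead of failing the whole response
        raw_targets = entry.get("targets")
        targets = (
            [t for t in raw_targets if isinstance(t, dict)]
            if isinstance(raw_targets, list)
            else []
        )
        filters.append(
            {
                "name": entry.get("name"),
                "filter_type": entry.get("filterType"),  # assumed key
                "targets": targets,
            }
        )
    return filters


def extract_cross_filters_enabled(raw: Optional[str]) -> Optional[bool]:
    """Return the flag only when it is a real boolean."""
    metadata = parse_json_metadata(raw)
    if metadata is None:
        return None
    value = metadata.get("cross_filters_enabled")  # assumed key
    return value if isinstance(value, bool) else None
```

With the parsing in one helper, both extractors inherit the same handling of `None`, malformed JSON, and non-object JSON, so neither needs its own isinstance(dict) guard.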

Add parentheses to @pytest.mark.asyncio and @pytest.fixture decorators to match CI pre-commit ruff version.

bito-code-review · 2026-04-04T02:02:46Z

Code Review Agent Run #3628d0

Actionable Suggestions - 0

Review Details

Files reviewed - 3 · Commit Range: 2d34ede..41ef88d
- superset/mcp_service/dashboard/schemas.py
- tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
- tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

mistercrunch · 2026-04-06T17:17:37Z

Claude commenting from a session with @mistercrunch 👋

Nice cleanup — dropping 2.4MB responses is clearly the right call. One concern worth considering:

Silent omission vs. explicit omission

Right now position_json and most of json_metadata are silently dropped. The agent receives native_filters and cross_filters_enabled as structured extractions (great!), but has no way to know that layout data or other metadata fields exist at all.

Here's an idea: instead of silently omitting, surface a stub that tells the agent what it's missing and how to get it:

{
  "native_filters": [...],
  "cross_filters_enabled": true,
  "_omitted": {
    "position_json": "omitted (~42kb layout tree)",
    "json_metadata": "omitted (~18kb raw blob) — native_filters and cross_filters_enabled extracted above"
  }
}

Or even simpler — keep the field in the schema but return a sentinel string:

position_json: str | None = "[omitted: layout tree too large for context — use get_dashboard_raw if needed]"

The distinction matters because:

Silent drop → agent doesn't know what it doesn't know; may confidently give wrong answers about layout or filter config
Explicit stub → agent can tell the user "I don't have layout data" or decide to call a follow-up tool to fetch it

The native_filters extraction is a good pattern. position_json disappearing with no trace is the gap. Either a _omitted_fields metadata key, a stub sentinel, or a companion tool (get_dashboard_layout) that lets the agent fetch it on demand would close it.

Address reviewer feedback: instead of silently dropping json_metadata and position_json, include an omitted_fields dict that tells the LLM agent what was stripped, approximate sizes, and why. Introduces a reusable OmittedFieldsBuilder utility in response_utils.py that any MCP tool serializer can use for consistent omission metadata. Pattern follows industry best practices (mcp-git-polite, Axiom, Blockscout) for explicit omission signaling over silent drops.

netlify · 2026-04-09T16:02:09Z

✅ Deploy Preview for superset-docs-preview ready!

Name	Link
🔨 Latest commit	`f62af42`
🔍 Latest deploy log	https://app.netlify.com/projects/superset-docs-preview/deploys/69d7cc38afd7b70008a629b7
😎 Deploy Preview	https://deploy-preview-39101--superset-docs-preview.netlify.app

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR reduces the payload size of the get_dashboard_info MCP tool by removing large raw dashboard fields and replacing them with lightweight, structured summaries plus explicit omission metadata.

Changes:

Removed json_metadata and position_json from DashboardInfo, adding native_filters, cross_filters_enabled, and omitted_fields instead.
Replaced full chart serialization with lightweight DashboardChartSummary in dashboard context.
Added a reusable OmittedFieldsBuilder utility and expanded unit tests around extraction/omission behavior.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

File	Description
`tests/unit_tests/mcp_service/dashboard/tool/test_dashboard_tools.py`	Updates tool tests to reflect new default columns and removed heavy fields.
`tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py`	Adds tests for omission behavior and json_metadata extraction helpers.
`superset/mcp_service/utils/response_utils.py`	Introduces `OmittedFieldsBuilder` for consistent omission metadata.
`superset/mcp_service/utils/permissions_utils.py`	Removes permissions handling for fields no longer exposed in MCP dashboard responses.
`superset/mcp_service/dashboard/tool/generate_dashboard.py`	Switches dashboard chart serialization to lightweight summaries.
`superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py`	Switches updated dashboard response chart serialization to lightweight summaries.
`superset/mcp_service/dashboard/schemas.py`	Implements new schema fields + extraction helpers + chart summary serializer.
`superset/mcp_service/common/schema_discovery.py`	Updates dashboard field descriptions to match new response shape.

aminghadersohi · 2026-04-09T16:07:16Z

Great feedback. Implemented explicit omission metadata — the response now includes an omitted_fields dict:

"omitted_fields": {
    "position_json": "Omitted (~42 KB) — Internal layout tree with component positions/hierarchy. Not useful for analysis or LLM context.",
    "json_metadata": "Omitted (~18 KB), useful parts extracted — native_filters and cross_filters_enabled extracted into dedicated fields above."
}

This follows the pattern from mcp-git-polite (truncated + reason), Axiom (result provenance), and Blockscout (explicit indicators). The OmittedFieldsBuilder utility in response_utils.py is reusable by any MCP tool serializer.

For the companion tool idea (get_dashboard_layout), created a follow-up story: sc-103485. That will let the agent explicitly fetch layout data on demand when it sees the omission hint.

See commit f62af42.
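
For readers who want a feel for how an omitted_fields dict like the one above might be assembled, here is a minimal sketch. It is not the `OmittedFieldsBuilder` from `response_utils.py`, whose interface is not shown in this thread: the class name `OmittedFieldsSketch`, the `omit` and `build` methods, and the message format are all hypothetical.

```python
from typing import Optional


class OmittedFieldsSketch:
    """Hypothetical builder that records which fields were stripped and why."""

    def __init__(self) -> None:
        self._omitted: dict[str, str] = {}

    def omit(self, name: str, raw_value: Optional[str], reason: str) -> "OmittedFieldsSketch":
        # Only report fields that actually had content; an empty field has nothing to hint at.
        if raw_value:
            size_kb = len(raw_value.encode("utf-8")) / 1024
            self._omitted[name] = f"Omitted (~{size_kb:.0f} KB): {reason}"
        return self  # allow chained calls

    def build(self) -> dict[str, str]:
        # Return a copy so callers cannot mutate the builder's state.
        return dict(self._omitted)


# Example: a 2 KiB layout blob is reported, a missing metadata blob is not.
example_omitted = (
    OmittedFieldsSketch()
    .omit("position_json", "x" * 2048, "internal layout tree")
    .omit("json_metadata", None, "useful parts extracted into dedicated fields")
    .build()
)
```

Reporting an approximate size alongside the reason gives the agent enough context to decide whether a follow-up fetch is worth the tokens.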

bito-code-review · 2026-04-09T16:58:46Z

Code Review Agent Run #0b84e8

Actionable Suggestions - 0

Additional Suggestions - 1

superset/mcp_service/utils/response_utils.py - 1
- Time-specific docstring comment · Line 28-28
  
  The module docstring includes time-specific language 'as of 2026' in the industry context section. Per AGENTS.md guidelines, code comments should avoid time-specific words like 'now', 'currently', or dates to prevent them from becoming outdated. Consider rephrasing to 'Industry context:' without the date.
  Code suggestion
  @@ -28,1 +28,1 @@ -Industry context (as of 2026): +Industry context:

Review Details

Files reviewed - 3 · Commit Range: 41ef88d..f62af42
- superset/mcp_service/dashboard/schemas.py
- superset/mcp_service/utils/response_utils.py
- tests/unit_tests/mcp_service/dashboard/test_dashboard_schemas.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

…load guard
- Filter non-dict entries from native filter targets to prevent Pydantic ValidationError on corrupted json_metadata
- Rename _serialize_chart_summary to serialize_chart_summary (public API since it's imported cross-module)
- Use `if chart_id is not None:` instead of `if chart_id:` to handle falsy-but-valid IDs
- Use getattr() for json_metadata/position_json in dashboard_serializer to avoid triggering SQLAlchemy lazy-loads on deferred columns
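
The difference between the two chart_id checks is easy to miss, so here is a small sketch of it. The helper names and the URL format are illustrative assumptions, not code from the PR.

```python
from typing import Optional


def chart_url_truthy(chart_id: Optional[int]) -> Optional[str]:
    # Truthiness check: treats 0 the same as None, so a valid ID of 0 is dropped.
    if chart_id:
        return f"/explore/?slice_id={chart_id}"
    return None


def chart_url_explicit(chart_id: Optional[int]) -> Optional[str]:
    # Explicit None check: only a genuinely missing ID is skipped.
    if chart_id is not None:
        return f"/explore/?slice_id={chart_id}"
    return None
```

The two functions agree for every ID except falsy ones such as 0, which is the case the commit addresses.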

bito-code-review · 2026-04-09T19:24:26Z

Code Review Agent Run #542f75

Actionable Suggestions - 0

Review Details

Files reviewed - 3 · Commit Range: f62af42..2b51055
- superset/mcp_service/dashboard/schemas.py
- superset/mcp_service/dashboard/tool/add_chart_to_existing_dashboard.py
- superset/mcp_service/dashboard/tool/generate_dashboard.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

mistercrunch

LGTM

pull-request-size Bot added the size/L label Apr 3, 2026

dosubot Bot added api Related to the REST API dashboard Namespace | Anything related to the Dashboard labels Apr 3, 2026

codeant-ai-for-open-source Bot reviewed Apr 3, 2026

Comment thread superset/mcp_service/dashboard/schemas.py

Comment thread superset/mcp_service/dashboard/schemas.py Outdated

github-actions Bot removed the api Related to the REST API label Apr 4, 2026

aminghadersohi added 2 commits April 3, 2026 21:02

fix(mcp): apply ruff 0.9.7 PT023/PT001 auto-fixes to test file

41ef88d

Add parentheses to @pytest.mark.asyncio and @pytest.fixture decorators to match CI pre-commit ruff version.

aminghadersohi mentioned this pull request Apr 4, 2026

fix(mcp): add dynamic response truncation for oversized info tool responses #39107

Merged

Copilot AI review requested due to automatic review settings April 9, 2026 15:56

pull-request-size Bot added size/XL and removed size/L labels Apr 9, 2026

codeant-ai-for-open-source Bot reviewed Apr 9, 2026

Comment thread superset/mcp_service/dashboard/schemas.py

Copilot AI reviewed Apr 9, 2026

Comment thread superset/mcp_service/dashboard/tool/generate_dashboard.py

Comment thread superset/mcp_service/dashboard/tool/generate_dashboard.py

Comment thread superset/mcp_service/dashboard/schemas.py

Comment thread superset/mcp_service/dashboard/schemas.py Outdated

Copilot started reviewing on behalf of aminghadersohi April 9, 2026 16:20 View session

mistercrunch approved these changes Apr 9, 2026

aminghadersohi merged commit 680cef0 into apache:master Apr 9, 2026
65 checks passed

michael-s-molina pushed a commit that referenced this pull request Apr 13, 2026

fix(mcp): strip json_metadata and position_json from get_dashboard_in…

e2d6afa

…fo response (#39101) (cherry picked from commit 680cef0)
