fix(mcp): apply cached adhoc filters to chart retrieval#40099
Conversation
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
afb879f to
1a58c1d
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #40099 +/- ##
==========================================
+ Coverage 64.14% 64.17% +0.02%
==========================================
Files 2590 2590
Lines 138030 138022 -8
Branches 32019 32014 -5
==========================================
+ Hits 88544 88575 +31
+ Misses 47967 47926 -41
- Partials 1519 1521 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR fixes an MCP chart retrieval inconsistency where cached adhoc_filters (via form_data_key) were not being normalized into concrete query fields (filters/where/having) for the follow-up retrieval tools, causing unfiltered results in get_chart_data / get_chart_preview (and related SQL paths). It introduces a shared normalization step in superset/mcp_service/chart/chart_helpers.py and reuses it across MCP chart tools before QueryContext construction.
Changes:
- Added shared form-data normalization helpers (
prepare_form_data_for_query,apply_form_data_filters_to_query) to convert/merge adhoc + legacy filters and apply them to query payloads. - Updated MCP tools (
get_chart_data,get_chart_preview,get_chart_sql) to normalize cached/saved form data before buildingQueryContextqueries. - Added unit tests asserting cached
adhoc_filtersare converted intofiltersfor preview/data query construction.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
superset/mcp_service/chart/chart_helpers.py |
Adds shared normalization + query-application helpers for filters/where/having/time_range. |
superset/mcp_service/chart/tool/get_chart_data.py |
Normalizes cached/saved form_data and applies normalized filters to built queries. |
superset/mcp_service/chart/tool/get_chart_preview.py |
Normalizes preview query payloads so adhoc filters constrain ASCII/table/vega previews. |
superset/mcp_service/chart/tool/get_chart_sql.py |
Uses shared normalization before SQL query-context construction and applies normalized fields to query dicts. |
tests/unit_tests/mcp_service/chart/test_chart_helpers.py |
Adds test coverage for preserving/combining legacy filters with adhoc_filters. |
tests/unit_tests/mcp_service/chart/tool/test_get_chart_data.py |
Adds test asserting adhoc filters become concrete filters for unsaved chart queries. |
tests/unit_tests/mcp_service/chart/tool/test_get_chart_preview.py |
Adds test asserting table preview converts cached adhoc filters into query filters. |
|
Code review — on behalf of Amin Ghadersohi (routed by EngCodeReviewBot) Overall the approach is correct and the DRY refactor is clean. The shared normalization pipeline mirrors what Explore/viz.py does, and the test coverage is solid. A few things to verify or address: 1. The previous code only forwarded 2. Import inside function body in # _build_single_query_dict, near the end
from superset.mcp_service.chart.chart_helpers import (
apply_form_data_filters_to_query,
)
apply_form_data_filters_to_query(qd, form_data)This import is inside the function, inconsistent with the rest of the file. Move it to the top-level imports with the other 3. Potential redundancy in
4. The function modifies 5.
The core fix in |
richardfogaca
left a comment
There was a problem hiding this comment.
Posting on Richard's behalf — this is his PR reviewer agent. Forward any pushback to him and he'll loop me back in.
Left a few notes below. The two functional items at the top look worth a second look before merge; everything else is non-blocking. All line numbers verified against HEAD 1a58c1d4.
Functional — worth investigating before merge
-
superset/mcp_service/chart/tool/get_chart_sql.py:197+:227-233_build_single_query_dictnow callsapply_form_data_filters_to_query(qd, form_data), which copiesform_data["where"]andform_data["having"]onto every query. By the time_build_mixed_timeseries_secondaryinvokes_build_single_query_dict(line 219),prepare_form_data_for_queryat line 278 has already written the primary's SQL adhoc clauses intoform_data["where"]/form_data["having"], so the secondaryqdinherits the primary's SQL filters. Theadhoc_filters_bhandler at lines 227-233 then only overridesqd["filters"], leaving the leakedwhere/havingin place.Concrete repro: a
mixed_timeserieschart whose primary has an SQLadhoc_filtersentry (or whose secondary'sadhoc_filters_bcontains an SQL clause). The generated Query 2 will include primary's SQL predicate even when the user replaced it withadhoc_filters_b.Would it be worth building the secondary
qdfrom a secondary-onlyform_datasnapshot (soapply_form_data_filters_to_queryonly sees secondary clauses), or having theadhoc_filters_bblock explicitly overwriteqd["where"]/qd["having"]alongsideqd["filters"]? -
superset/mcp_service/chart/tool/get_chart_data.py:564-572When
request.extra_form_datais provided and the chart has a savedquery_context, this branch normalizesqc_form_dataand then runsapply_form_data_filters_to_query(query, qc_form_data)for every saved query. Becauseqc_form_data["filters"]is rebuilt bysplit_adhoc_filters_into_base_filterspurely fromqc_form_data["adhoc_filters"], any pre-existingqueries[i]["filters"]in the saved query_context that wasn't derivable from the form_data's adhoc filters gets silently overwritten.This is mostly idempotent for normal saves where the chart_data API generated
queries[i]["filters"]from the same adhoc list, but it's a real hazard for imported / hand-edited / migrated query_contexts. WDYT — guard the overwrite ("only assign if query has nofiltersyet, otherwise merge"), or add a regression test that pins a saved query_context whosequeries[i]["filters"]is non-derivable and asserts those filters survive whenextra_form_datais present?
Test correctness
-
tests/unit_tests/mcp_service/chart/tool/test_get_chart_preview.py:286-363test_table_preview_converts_cached_adhoc_filters_to_query_filterspassesform_data_key="cached-key"in the request, butTablePreviewStrategy.generate(get_chart_preview.py:312-355) reads onlyself.chart.params. Nothing in this test actually exercises theform_data_keycache path — the test name, docstring, and the headline regression in the PR description suggest it does.Two options: (a) rename + redocstring as "saved-chart adhoc filters become query filters in table preview", or (b) extend the test to monkey-patch
get_cached_form_dataand assert the strategy reads the cache whenchart.paramsis empty / contradicts the cached state. (b) is the higher-value path because it would also pin the actual bug this PR set out to fix.
Other suggestions (optional)
-
superset/mcp_service/chart/chart_helpers.py:103-109,tool/get_chart_sql.py:193-195,:228,:272Several inline imports introduced by the PR (
superset.utils.corehelpers,apply_form_data_filters_to_query,prepare_form_data_for_query,split_adhoc_filters_into_base_filters) don't have circular-import concerns — the modules they pull from are already imported at top level elsewhere inmcp_service. Could we move these to module-level imports? Small readability + perf win, and matches the rest of the file. -
superset/mcp_service/chart/chart_helpers.py:72-87vstool/get_chart_sql.py:159-177chart_helpers.resolve_datasource_engineandget_chart_sql._resolve_engineare byte-for-byte equivalent. Since this PR introduces the public helper, would it be worth deleting the local copy and importing the shared one? Also worth noting:prepare_form_data_for_queryalready resolves the engine internally (chart_helpers.py:131-133), so the explicit_resolve_enginecall atget_chart_sql.py:277runs the same datasource DAO lookup a second time per request. Threading the engine through, or dropping the duplicate, would save a lookup. -
superset/mcp_service/chart/chart_helpers.py:127-128form_data["extra_form_data"] = extra_form_dataunconditionally replaces an existingextra_form_dataalready in the cached form data. Cached Explore state can carry a dashboard-suppliedextra_form_data; this silently drops it. Small suggestion — either merge dicts (with caller-supplied taking precedence) or add a short comment justifying the replace semantics. -
superset/mcp_service/chart/tool/get_chart_sql.py:186-198_build_single_query_dictnow setstime_range(line 188) andfilters(line 190) manually and then callsapply_form_data_filters_to_query(line 197), which sets the same two fields again. Lines 187-190 are now redundant once the helper runs (keeprow_limitsince the helper doesn't handle it). Happy to keep as-is if you'd rather not churn this. -
Test coverage gap
Two scenarios that aren't exercised: (1)
extra_form_datamerged with a savedquery_context(the branch in the second functional bullet above), and (2) themixed_timeseriessecondaryadhoc_filters_bpath inget_chart_sql.py:227-233. Both are first-touched by this PR.
Praise
-
superset/mcp_service/chart/chart_helpers.py:90-134The helper correctly handles the case
convert_legacy_filters_into_adhocdoesn't: when bothadhoc_filtersand legacyfilters/having/whereare present, the upstream util would silently drop the legacy fields (itsif not form_data.get("adhoc_filters")guard) and thendelthem. Pre-merging at lines 111-125 before calling the upstream chain preserves both filter sources — which is exactly the bug class the PR description describes (Karenshowing up undergender = boy). Nice fix at the right level.
|
Richard's agent here. I pushed 85ea985 to address the latest review notes:
Validation: focused MCP chart tests pass locally ( |
|
Richard's agent here. I pushed 02b5b27 to address the follow-up review notes:
Verification: focused MCP chart tests passed locally: 150 passed. |
aminghadersohi
left a comment
There was a problem hiding this comment.
Nice fix — the root cause is correctly identified and the normalization pipeline mirrors what Superset's Explore path already does.
What I verified:
- Filter normalization pipeline is correct:
convert_legacy_filters_into_adhocguards withif not form_data.get("adhoc_filters"):before processing legacy fields, then deletes them — no double-application risk merge_extra_form_datais additive (list concat foradhoc_filters, last-write-wins for scalars) — request-levelextra_form_datacannot clear cached filters- Mixed-timeseries secondary query correctly replaces (not inherits) primary filters when
adhoc_filters_bis present viaqd.pop(clause, None) - No cross-user filter leakage:
form_data_keycache access control is unchanged viz_typeat_query_from_form_data:749is used at lines 809/818 — not dead code
One thing worth a follow-up: The new inline imports in chart_helpers.py (resolve_datasource_engine lines 86–87, prepare_form_data_for_query lines 115–121, _build_mixed_timeseries_secondary line 355, build_query_context_from_form_data line 481) don't have the # avoid circular import comment that the existing functions in the same file already use. Not a blocker — the code is correct — but it would keep the convention consistent for future readers.
Minor: except Exception: # noqa: BLE001 at line 94 could use a one-line comment explaining why the broad catch is appropriate here.
Otherwise this is clean work — good test coverage including the adversarial case (contradictory filter sources → zero rows), and a significant DRY improvement across three tools.
|
Bito Automatic Review Skipped – PR Already Merged |
SUMMARY
generate_chartcan return aform_data_keyfor an unsaved chart. When that chart is filtered, for examplegender = boyon thebirth_namesdataset, the generated preview correctly applies the filter.The bug was in the follow-up MCP retrieval path:
get_chart_infoshowed the cachedform_data_keystill had the expectedadhoc_filters.get_chart_dataandget_chart_previewused the same key but returned unfiltered data.gender = boychart could include girl-only names such asKarenorSharon.This happened because
get_chart_dataandget_chart_previewbuilt theQueryObjectpayload directly from cached form data. They copied concrete fields likefilters,where, andhaving, but did not first normalize cachedadhoc_filtersthe way the Explore/viz query path does.What changed
This PR adds a shared MCP chart normalization step before
QueryContextconstruction. It converts cachedadhoc_filtersinto concrete query fields, preserves existing legacy filter fields, mergesextra_form_dataadditively, and reuses the same normalized payload for data, preview, and SQL.Reviewer focus
The important boundary is
superset/mcp_service/chart/chart_helpers.py: retrieval tools should normalize form data once there, then build data, preview, and SQL queries from the same normalized fields.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A - backend MCP chart retrieval change.
Concrete example from
birth_names:Karenexists only asgender = girl.Result:
Karen | girl | 257961, with no boy entry.Before this fix,
get_chart_datacould returnKarenfor a chart filtered togender = boy. After this fix, the same cached form-data request returns only boy rows, and table preview excludes girl-only rows.TESTING INSTRUCTIONS
Validation
Live MCP/API validation in a local Superset compose runtime: cached table form data for
birth_nameswithadhoc_filters: gender == boy, then calledget_chart_info,get_chart_data, andget_chart_previewwith the sameform_data_key.get_chart_infoshowed the cached adhoc filter,get_chart_datareturned onlyboy, and preview did not containgirl.Live MCP/API adversarial validation in the same runtime: combined cached
gender == boywith requestextra_form_datagender == girl. The contradictory filters returned zero rows, confirming filter sources merge instead of replacing each other.Live MCP/API SQL parity validation in the same runtime: cached legacy
filters: gender == boyplus cachedadhoc_filters: gender == girlproduced zero preview rows, andget_chart_sqlgenerated SQL containing both predicates.ADDITIONAL INFORMATION