feat(mcp): chart type plugin registry for extensible generate_chart#39922
feat(mcp): chart type plugin registry for extensible generate_chart#39922aminghadersohi wants to merge 16 commits into
Conversation
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
The inline context is insufficient to determine what 'this' refers to in the question. No diff hunk or suggestion snippet is provided for the thread in superset/mcp_service/chart/chart_utils.py. |
There was a problem hiding this comment.
Pull request overview
This PR introduces a chart-type plugin architecture for the MCP chart pipeline, centralizing per-chart-type behavior (pre-validation, column extraction/normalization, form_data mapping, runtime warnings, naming, viz_type resolution) into one plugin per supported chart_type. This reduces the previous need to update multiple dispatch sites when adding or modifying chart types.
Changes:
- Added a
ChartTypePluginprotocol +BaseChartPlugin, a global plugin registry, and 7 built-in chart plugins (xy,table,pie,pivot_table,mixed_timeseries,handlebars,big_number). - Refactored schema/dataset/runtime validation and form_data mapping to dispatch via the registry instead of
isinstance/if/elifchains. - Enhanced tool/docs output by expanding
generate_chartdocstring chart types and addingchart_type_display_nametoChartInfo.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| superset/mcp_service/chart/validation/schema_validator.py | Switch pre-validation dispatch to registry/plugins; refactor Pydantic error enhancement. |
| superset/mcp_service/chart/validation/runtime/init.py | Replace XY-only runtime checks with per-plugin runtime warning dispatch. |
| superset/mcp_service/chart/validation/dataset_validator.py | Delegate column extraction/normalization to plugins; apply aggregation validation for all chart types. |
| superset/mcp_service/chart/tool/generate_chart.py | Update tool docstring to list all supported chart_type values. |
| superset/mcp_service/chart/schemas.py | Add chart_type_display_name to ChartInfo; resolve display name via registry mapping. |
| superset/mcp_service/chart/registry.py | Introduce global registry for chart plugins + viz_type display-name lookup. |
| superset/mcp_service/chart/plugin.py | Add plugin protocol and base class defining the plugin responsibilities. |
| superset/mcp_service/chart/plugins/init.py | Register the built-in plugins on import. |
| superset/mcp_service/chart/plugins/xy.py | Implement XY plugin (pre-validate, refs, mapping, naming, viz_type, runtime warnings). |
| superset/mcp_service/chart/plugins/table.py | Implement Table plugin. |
| superset/mcp_service/chart/plugins/pie.py | Implement Pie plugin. |
| superset/mcp_service/chart/plugins/pivot_table.py | Implement Pivot Table plugin. |
| superset/mcp_service/chart/plugins/mixed_timeseries.py | Implement Mixed Timeseries plugin. |
| superset/mcp_service/chart/plugins/handlebars.py | Implement Handlebars plugin. |
| superset/mcp_service/chart/plugins/big_number.py | Implement Big Number plugin including post-map trendline temporal validation. |
| superset/mcp_service/chart/chart_utils.py | Replace mapping/name/viz_type dispatch with registry/plugin calls. |
| superset/mcp_service/app.py | Import plugins at MCP app startup; update default instructions about chart type naming. |
891676f to
1e2b541
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #39922 +/- ##
==========================================
- Coverage 64.18% 64.12% -0.07%
==========================================
Files 2591 2602 +11
Lines 138471 139388 +917
Branches 32120 32254 +134
==========================================
+ Hits 88883 89381 +498
- Misses 48056 48469 +413
- Partials 1532 1538 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Internal architecture review — findings & resolutions Ran an independent codex review of the plugin registry design. Summary of findings and their status:
The thread-safety fix ( |
f5ba09f to
7dc3563
Compare
…generation Replaces four scattered dispatch locations (schema_validator, dataset_validator, chart_utils, runtime validator) with a central ChartTypePlugin registry. Each of the 7 supported chart types (xy, table, pie, pivot_table, mixed_timeseries, handlebars, big_number) now owns its pre-validation, column extraction, form_data mapping, post-map validation, column normalization, and runtime warnings in a single plugin class. Key changes: - Add ChartTypePlugin protocol and BaseChartPlugin base class (plugin.py) - Add ChartTypeRegistry with register/get/all_types helpers (registry.py) - Add 7 chart type plugins under chart/plugins/ with full coverage - Fix 5-type column validation gap: pie, pivot_table, mixed_timeseries, handlebars, and big_number now participate in dataset column validation (previously silently skipped) - Move BigNumber trendline temporal check to BigNumberChartPlugin.post_map_validate() - Add get_runtime_warnings() to plugin protocol; XYChartPlugin implements format/cardinality checks, removing isinstance(config, XYChartConfig) from RuntimeValidator - Fix stale generate_chart.py docstring listing only 'xy' and 'table' chart types - Add missing pie, pivot_table, mixed_timeseries handlers to _enhance_validation_error; refactor into a data-driven lookup table to stay within complexity limits - Fix empty details fallback in Pydantic error handler
Each ChartTypePlugin now declares:
- display_name: human-readable label for the chart_type discriminator
(e.g. "Line / Bar / Area / Scatter Chart", "Pivot Table")
- native_viz_types: dict mapping every Superset-internal viz_type the
plugin produces to a user-friendly name
(e.g. {"echarts_timeseries_line": "Line Chart", "echarts_area": "Area Chart"})
The registry gains display_name_for_viz_type(viz_type) which searches
all plugins' native_viz_types maps, replacing the need for a separate
viz_type_display_names.json or viz_type_names.py module.
ChartInfo gains a chart_type_display_name field populated via the registry,
so list_charts / get_chart_info return human-readable chart type names.
The MCP system instructions now reference display names rather than
internal viz_type identifiers.
H1: Delete 7 dead _pre_validate_* static methods from SchemaValidator
— exact duplicates of plugin pre_validate() methods, never called
after _pre_validate_chart_type() was updated to delegate to plugin.
H2: Inline DatasetValidator._normalize_xy_config/_normalize_table_config
into XYChartPlugin/TableChartPlugin.normalize_column_refs() and delete
both DatasetValidator helper methods. The 5 other plugins already
called _get_canonical_column_name directly; XY and Table now match.
H3: Add generate_name()/resolve_viz_type() to ChartTypePlugin protocol
and BaseChartPlugin, implement in all 7 plugins. Replace the 7-arm
isinstance chain in generate_chart_name() and the 7-arm elif chain
in _resolve_viz_type() with single-line registry dispatch.
H4: Add a sync comment above _CHART_TYPE_ERROR_HINTS to document that
it must stay in sync with the plugin registry.
M4: Move logger=logging.getLogger(__name__) from inside
XYChartPlugin.get_runtime_warnings() to module scope.
…xes, test repairs On top of the dead-code elimination in the previous commit: - Add lazy _ensure_plugins_loaded() bootstrap to ChartTypeRegistry so the registry is populated even without importing app.py (fixes isolated test runs) - Delegate _RegistryProxy methods to module-level functions so bootstrap runs - Guard register() against empty chart_type strings - Add generate_name + resolve_viz_type to ChartTypePlugin Protocol and BaseChartPlugin; delegate generate_chart_name/_resolve_viz_type in chart_utils to the plugin registry - Add _with_context static helper to BaseChartPlugin (shared by all plugins) - Fix stale 'five methods' → 'eight methods' docstring in plugin.py - Add TypeVar _C to normalize_column_names so mypy infers correct return type - Fix broken tests: update _pre_validate_big_number_config → _pre_validate_chart_type, remove deleted TestNormalizeXYConfig/TestNormalizeTableConfig classes, update runtime validator tests for removed _validate_format_compatibility / _validate_cardinality methods, add x is not None narrowing guards Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nal corrections, cardinality suggestions - Remove redundant local imports from BigNumberChartPlugin.post_map_validate() now that BigNumberChartConfig and is_column_truly_temporal are at top level - Add explanatory comments on the two remaining local get_registry imports in chart_utils.py and dataset_validator.py (circular import prevention) - Fix schema_validator.py and generate_chart.py docstring: XY 'x' field is optional (defaults to dataset primary datetime column), not required - Propagate cardinality suggestions alongside warnings in XYChartPlugin - Clarify app.py instructions: chart_type_display_name is null for viz_types outside the 7 generate_chart-supported types Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All per-method local imports in the 7 chart plugins were moved to module-level. None of them create circular imports: schemas.py, chart_utils.py, and dataset_validator.py are safe to import at plugin load time because those modules guard their own registry lookups with local imports. - big_number: add map_big_number_config, _big_number_chart_what, _summarize_filters, DatasetValidator to top-level imports - pie: add map_pie_config, _pie_chart_what, _summarize_filters, PieChartConfig, DatasetValidator to top-level imports - xy: add map_xy_config, _xy_chart_what/context, XYChartConfig, DatasetValidator, FormatTypeValidator, CardinalityValidator to top-level imports - table: add map_table_config, _table_chart_what, _summarize_filters, TableChartConfig, DatasetValidator to top-level imports - pivot_table: add map_pivot_table_config, _pivot_table_what, _summarize_filters, PivotTableChartConfig, DatasetValidator to top-level imports - mixed_timeseries: add map_mixed_timeseries_config, _mixed_timeseries_what, _summarize_filters, MixedTimeseriesChartConfig, DatasetValidator to top-level - handlebars: add map_handlebars_config, _handlebars_chart_what, _summarize_filters, HandlebarsChartConfig, DatasetValidator to top-level imports Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Split long string literal in schema_validator.py line 202 (E501, 94 > 88 chars) - Apply ruff format auto-fixes to big_number.py, handlebars.py, and test_get_chart_data.py
- Move error_schemas import above _C TypeVar definition (E402) - Split two over-length comment lines to ≤88 chars (E501, lines 268 and 380)
_ensure_plugins_loaded() used an unprotected boolean flag, making it unsafe under concurrent first-call scenarios (e.g. gunicorn multi-thread workers). Double-checked locking with threading.Lock eliminates the race. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
update_chart was only running SchemaValidator + Tier-2 compile check,
silently skipping DatasetValidator's column-existence + fuzzy-match
and column-name normalisation layers that generate_chart runs.
A typo like {name: "reveneu"} would save the broken chart and only
surface as a render-time failure in the browser.
Now matches generate_chart pipeline:
- Layer 2: DatasetValidator.validate_against_dataset() — column
existence check with fuzzy-match "did you mean?" suggestions returned
to the LLM before any DB write occurs
- Layer 4: DatasetValidator.normalize_column_names() — case
normalisation so "order_date" resolves to "OrderDate" if that is the
canonical dataset name
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nts, remove column regex P1.1 registry.py: move _plugins_loaded=True to after successful import so a failed load doesn't permanently poison the registry. P1.3 schemas.py: remove overly restrictive ColumnRef.name / FilterClause.column / BigNumberChartConfig.temporal_column regex that blocked valid column names containing parentheses, slashes, and other SQL-common characters. P2.3 (DRY): eliminate _CHART_TYPE_ERROR_HINTS second-registry in schema_validator.py by adding schema_error_hint() to ChartTypePlugin protocol, BaseChartPlugin default, and all 7 plugin classes. SchemaValidator now delegates to the plugin registry instead of maintaining a parallel dict. P3.3 test_registry.py: add full registry unit-test coverage (register, get, all_types, is_registered, display_name_for_viz_type, proxy methods, duplicate warning, empty chart_type validation, insertion-order guarantee). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… column validation Split an 89-char comment line and an over-limit condition in update_chart.py to satisfy the ruff E501 rule. Also applied ruff format. Two TestUpdateChartValidationGate tests expected CHART_VALIDATION_FAILED but received CHART_DATASET_NOT_FOUND because _validate_update_against_dataset calls DatasetValidator.validate_against_dataset before validate_and_compile, and the existing mocks provided a Mock() object for chart.datasource whose .id attribute is an auto-generated MagicMock (not a real int). Added a patch for DatasetValidator.validate_against_dataset returning (True, None) so the column-validation tier is bypassed and the test reaches the mocked validate_and_compile response as intended. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onfig Introduces a dynamic filter layer in the chart type registry so operators can disable individual plugins (e.g. `handlebars`) without a code deploy: - `MCP_DISABLED_CHART_PLUGINS: frozenset[str]` — static deny-list in mcp_config.py - `MCP_CHART_PLUGIN_ENABLED_FUNC: Callable[[str], bool] | None` — dynamic hook for Harness/Split/per-user targeting; takes precedence over the deny-list - Both keys are propagated through `get_mcp_config()` defaults registry.py changes: - `_PluginFilterConfig` frozen dataclass replaces two bare globals so configure() replaces them atomically (no torn reads under concurrency) - `configure(disabled, enabled_func)` — called at app init; accepts any iterable for `disabled`; validates `enabled_func` is callable - `_is_plugin_enabled()` — reads config once, fails closed on callable exception - `get()` / `all_types()` / `is_enabled()` apply the filter at lookup time; `is_registered()` and `display_name_for_viz_type()` intentionally bypass it so callers can distinguish "unknown" vs "disabled" and existing charts still resolve display names for disabled viz types schema_validator.py: two-step pre-check — `is_registered()` for unknown types, `is_enabled()` for disabled ones, with distinct `DISABLED_CHART_TYPE` error code. Wiring: - `SupersetAppInitializer.configure_mcp_chart_registry()` called after `configure_feature_flags()` in `init_app()` - `flask_singleton.py` re-calls `registry.configure()` after the MCP config overlay so MCP-specific overrides in `superset_config.py` take effect in standalone MCP mode Tests: 28 cases in test_registry_filters.py covering deny-list, callable hook, fail-closed on exception, all_types() filtering, display_name bypass, atomic reconfigure, and configure() with list/tuple/frozenset inputs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
7dc3563 to
2e7a86f
Compare
| if config_dict.get("metric") and not config_dict["metric"].get("saved_metric"): | ||
| config_dict["metric"]["name"] = DatasetValidator._get_canonical_column_name( | ||
| config_dict["metric"]["name"], dataset_context | ||
| ) |
There was a problem hiding this comment.
Suggestion: Saved metrics are skipped during name normalization, so case-insensitive validation can pass but the emitted metric string can still use the wrong casing and fail later in Superset's exact-name metric lookup. Normalize saved metric names against dataset metrics (not just raw columns) so the final form_data always uses canonical metric names. [api mismatch]
Severity Level: Critical 🚨
- ❌ Big Number charts with mis-cased saved metrics fail.
- ⚠️ Validation passes but Superset query errors at runtime.
- ⚠️ LLM clients see confusing success-then-fail behavior.Steps of Reproduction ✅
1. Define a dataset in Superset with a saved metric whose name differs in case from the
desired client input, e.g. metric_name="Revenue" (saved metric) on a dataset that
ValidationPipeline will load via DatasetValidator._get_dataset_context at
`superset/mcp_service/chart/validation/dataset_validator.py:197-252`.
2. Call the MCP tool `generate_chart` at
`superset/mcp_service/chart/tool/generate_chart.py:95` with a Big Number config using the
saved metric but with different casing, e.g. `"metric": {"name": "revenue",
"saved_metric": true}` in `GenerateChartRequest.config` (chart_type="big_number").
3. ValidationPipeline.validate_request_with_warnings at
`superset/mcp_service/chart/validation/pipeline.py:90-55` invokes
DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, which calls
`_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
`col_ref.name.lower()` against `available_metrics` lowercased, so `"revenue"` is accepted
as a valid saved metric even though its canonical name is `"Revenue"`.
4. The same pipeline then normalizes column names via
DatasetValidator.normalize_column_names at `dataset_validator.py:94-141`, which delegates
to BigNumberChartPlugin.normalize_column_refs at
`superset/mcp_service/chart/plugins/big_number.py:189-203`; because the metric has
`"saved_metric": true`, the guard at lines 192-195 skips `_get_canonical_column_name`, so
the metric name remains `"revenue"` and is never canonicalized to `"Revenue"`.
5. After validation, `generate_chart` calls `map_config_to_form_data` at
`superset/mcp_service/chart/chart_utils.py:93-131`, which routes to
BigNumberChartPlugin.to_form_data at `big_number.py:132-135`; `map_big_number_config` at
`chart_utils.py:763-799` builds form_data using `create_metric_object(config.metric)` at
`chart_utils.py:11-51`, which returns the raw `col.name` string `"revenue"` for saved
metrics.
6. When Superset executes the query, its SQLA connector resolves metrics via a
`metrics_by_name` dict (see docstring and usage at `chart_utils.py:491-499` and
`superset/connectors/sqla/models.py:1758-38`); because the dict keys are the canonical
metric names (e.g. `"Revenue"`), the mis-cased `"revenue"` key is missing and Superset
raises `QueryObjectValidationError("Metric '%(metric)s' does not exist")`, causing the Big
Number chart generation to fail at runtime even though validation passed.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/plugins/big_number.py
**Line:** 192:195
**Comment:**
*Api Mismatch: Saved metrics are skipped during name normalization, so case-insensitive validation can pass but the emitted metric string can still use the wrong casing and fail later in Superset's exact-name metric lookup. Normalize saved metric names against dataset metrics (not just raw columns) so the final form_data always uses canonical metric names.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fixThere was a problem hiding this comment.
The normalization is already in place. Both big_number.py and pie.py branch on saved_metric in normalize_column_refs: when saved_metric=True, the name is resolved through DatasetValidator._get_canonical_metric_name(), which performs a case-insensitive lookup exclusively against available_metrics and returns the canonical dataset name. Non-saved metrics go through _get_canonical_column_name instead. The described bug doesn't exist in the current code.
| def _norm_col_list(key: str) -> None: | ||
| if config_dict.get(key): | ||
| for col in config_dict[key]: | ||
| col["name"] = DatasetValidator._get_canonical_column_name( | ||
| col["name"], dataset_context | ||
| ) | ||
|
|
||
| _norm_col_list("rows") | ||
| _norm_col_list("metrics") | ||
| _norm_col_list("columns") |
There was a problem hiding this comment.
Suggestion: Metrics are normalized with a helper that prefers matching dataset columns before metrics, so a saved metric that case-insensitively collides with a column name can be rewritten to the column identifier and stop resolving as a saved metric. Handle metric normalization with metric-aware precedence (or branch on saved_metric) to preserve metric identity. [logic error]
Severity Level: Critical 🚨
- ❌ Pivot table charts can fail on valid saved metrics.
- ⚠️ Column-first canonicalization corrupts saved metric identity.
- ⚠️ Users get runtime metric-not-found errors after validation.Steps of Reproduction ✅
1. Consider a dataset where a physical column and a saved metric share the same
case-insensitive name but differ in canonical casing, e.g. available_columns includes
{"name": "Revenue"} and available_metrics includes {"name": "REVENUE"} as built in
DatasetValidator._get_dataset_context at
`superset/mcp_service/chart/validation/dataset_validator.py:197-252`.
2. A client calls the MCP `generate_chart` tool at
`superset/mcp_service/chart/tool/generate_chart.py:95` with a pivot_table config
referencing the saved metric using a lowercased name, e.g. `"metrics": [{"name":
"revenue", "saved_metric": true}]` in `PivotTableChartConfig`
(`superset/mcp_service/chart/schemas.py:243-261`).
3. ValidationPipeline.validate_request_with_warnings at
`superset/mcp_service/chart/validation/pipeline.py:90-55` runs
DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, which calls
`_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
`col_ref.name.lower()` against `available_metrics` lowercased, so `"revenue"` is accepted
as a valid saved metric in spite of the ambiguous column name.
4. In the column-normalization layer, ValidationPipeline._normalize_column_names at
`validation/pipeline.py:137-181` calls DatasetValidator.normalize_column_names at
`dataset_validator.py:94-141`, which looks up the PivotTableChartPlugin and invokes its
normalize_column_refs at `superset/mcp_service/chart/plugins/pivot_table.py:120-134`;
`_norm_col_list("metrics")` at lines 123-131 runs `_get_canonical_column_name` at
`dataset_validator.py:46-77` for each metric, and because that helper checks
`available_columns` before `available_metrics`, the shared lowercase name `"revenue"` is
canonicalized to the column's name `"Revenue"` instead of the saved metric's canonical
name `"REVENUE"`.
5. The now-normalized config is used in `generate_chart` to build Superset form_data via
map_config_to_form_data at `chart_utils.py:93-131`, which routes to
PivotTableChartPlugin.to_form_data at `pivot_table.py:107-110`; `map_pivot_table_config`
at `superset/mcp_service/chart/chart_utils.py:841-871` builds the `"metrics"` list by
calling `create_metric_object(col)` at `chart_utils.py:11-51`, which returns the string
`"Revenue"` for the saved metric because `col.saved_metric` is still True.
6. At query execution time, Superset resolves metric strings via `metrics_by_name` (see
docstring at `chart_utils.py:491-499` and access pattern at
`superset/connectors/sqla/models.py:1758-38`); this dict is keyed by the canonical saved
metric name `"REVENUE"`, so the rewritten `"Revenue"` key is missing, causing
`QueryObjectValidationError("Metric '%(metric)s' does not exist")` and breaking pivot
table chart execution for datasets where saved metric names collide case-insensitively
with column names.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/plugins/pivot_table.py
**Line:** 123:132
**Comment:**
*Logic Error: Metrics are normalized with a helper that prefers matching dataset columns before metrics, so a saved metric that case-insensitively collides with a column name can be rewritten to the column identifier and stop resolving as a saved metric. Handle metric normalization with metric-aware precedence (or branch on `saved_metric`) to preserve metric identity.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| if "x" not in config: | ||
| missing_fields.append("'x' (X-axis temporal column)") | ||
| if "y" not in config: | ||
| missing_fields.append("'y' (primary Y-axis metrics)") | ||
| if "y_secondary" not in config: | ||
| missing_fields.append("'y_secondary' (secondary Y-axis metrics)") |
There was a problem hiding this comment.
Suggestion: The pre-validation only checks for x, y, and y_secondary keys, but the schema explicitly accepts aliases (x_axis, metrics, metrics_b). Valid payloads that use those aliases will be rejected before Pydantic parsing, causing false validation failures. Make pre-validation honor the same aliases as the schema. [api mismatch]
Severity Level: Major ⚠️
- ❌ SchemaValidator rejects mixed_timeseries configs using alias keys.
- ⚠️ Direct callers cannot rely on x_axis/metrics/metrics_b aliases.Steps of Reproduction ✅
1. Use the schema-level entrypoint `SchemaValidator.validate_request` at
`superset/mcp_service/chart/validation/schema_validator.py:40-57` with a raw request dict
shaped like:
`{"dataset_id": 1, "config": {"chart_type": "mixed_timeseries", "x_axis": {"name":
"ds"}, "metrics": [{"name": "revenue", "aggregate": "SUM"}], "metrics_b": [{"name":
"orders", "aggregate": "COUNT"}]}}`.
2. Inside `_pre_validate` at `schema_validator.py:64-143`, the function extracts `config =
data["config"]` and dispatches to `_pre_validate_chart_type(chart_type, config)` at
`schema_validator.py:145-173`, passing this raw config dict (still using the alias keys
`x_axis`, `metrics`, and `metrics_b`).
3. `_pre_validate_chart_type` looks up the plugin from the registry and calls
`plugin.pre_validate(config)` at `schema_validator.py:190-201`; for
`chart_type="mixed_timeseries"` this resolves to `MixedTimeseriesChartPlugin.pre_validate`
defined in `superset/mcp_service/chart/plugins/mixed_timeseries.py:44-55`.
4. `MixedTimeseriesChartPlugin.pre_validate` only checks for `"x"`, `"y"`, and
`"y_secondary"` keys (lines 50-55) and does not recognize the schema-supported aliases
`x_axis`, `metrics`, and `metrics_b` defined on `MixedTimeseriesChartConfig` in
`superset/mcp_service/chart/schemas.py:18-51`, so `missing_fields` becomes non-empty, a
`ChartGenerationError` is returned, and `SchemaValidator.validate_request` exits early at
line 51, rejecting a payload that would successfully parse via Pydantic if alias keys were
honored.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/plugins/mixed_timeseries.py
**Line:** 50:55
**Comment:**
*Api Mismatch: The pre-validation only checks for `x`, `y`, and `y_secondary` keys, but the schema explicitly accepts aliases (`x_axis`, `metrics`, `metrics_b`). Valid payloads that use those aliases will be rejected before Pydantic parsing, causing false validation failures. Make pre-validation honor the same aliases as the schema.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| return None | ||
|
|
||
| def extract_column_refs(self, config: Any) -> list[ColumnRef]: | ||
| if not isinstance(config, MixedTimeseriesChartConfig): | ||
| return [] | ||
| refs: list[ColumnRef] = [config.x] |
There was a problem hiding this comment.
Suggestion: Metric name normalization rewrites every entry in y/y_secondary without checking saved_metric, so saved metrics can be incorrectly remapped to similarly named dataset columns (column lookup is prioritized), breaking saved-metric resolution later. Skip canonical-column normalization for items marked as saved_metric. [logic error]
Severity Level: Critical 🚨
- ❌ Mixed_timeseries charts using saved metrics may fail at runtime.
- ⚠️ Saved metrics shadowed by columns resolve to wrong identifiers.Steps of Reproduction ✅
1. Call the MCP tool `generate_chart` defined in
`superset/mcp_service/chart/tool/generate_chart.py:95-132` with a `GenerateChartRequest`
whose `config.chart_type` is `"mixed_timeseries"` and whose `y`/`y_secondary` lists
contain saved metrics, e.g. `{"name": "count", "saved_metric": true}` as described by
`ColumnRef.saved_metric` in `superset/mcp_service/chart/schemas.py:27-58`.
2. Inside `generate_chart`, the code invokes
`ValidationPipeline.validate_request_with_warnings(request.model_dump())` at
`generate_chart.py:205-210`; in `ValidationPipeline.validate_request_with_warnings`
(`superset/mcp_service/chart/validation/pipeline.py:90-145`), after schema and dataset
validation, it calls `_normalize_column_names` at `pipeline.py:227-260`.
3. `_normalize_column_names` delegates to `DatasetValidator.normalize_column_names(config,
dataset_id, dataset_context)` at
`superset/mcp_service/chart/validation/dataset_validator.py:351-398`; this looks up the
`mixed_timeseries` plugin from the registry and calls
`plugin.normalize_column_refs(config, dataset_context)`, which is
`MixedTimeseriesChartPlugin.normalize_column_refs` at
`superset/mcp_service/chart/plugins/mixed_timeseries.py:84-106`.
4. In `normalize_column_refs`, helper `_norm_list` (lines 93-98) iterates over every entry
in `y` and `y_secondary` and unconditionally rewrites `col["name"]` using
`DatasetValidator._get_canonical_column_name(col["name"], dataset_context)`, ignoring
`col.get("saved_metric")`; `_get_canonical_column_name` at `dataset_validator.py:303-334`
searches `dataset_context.available_columns` first and only then `available_metrics`, so
in a dataset that defines both a column `"COUNT"` and a saved metric `"count"`, a
saved-metric reference `{"name": "count", "saved_metric": true}` in `y` is normalized to
`"COUNT"`. Later, `map_mixed_timeseries_config` in
`superset/mcp_service/chart/chart_utils.py:21-46` calls `create_metric_object` on these
`ColumnRef` objects; for saved metrics it returns `col.name` (now `"COUNT"`) as the metric
identifier. Because Superset's metric lookup expects the original metric name (e.g.
`"count"`), the chart's metrics no longer resolve correctly, causing mixed_timeseries
charts using such saved metrics to fail or behave incorrectly at query time.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/plugins/mixed_timeseries.py
**Line:** 93:98
**Comment:**
*Logic Error: Metric name normalization rewrites every entry in `y`/`y_secondary` without checking `saved_metric`, so saved metrics can be incorrectly remapped to similarly named dataset columns (column lookup is prioritized), breaking saved-metric resolution later. Skip canonical-column normalization for items marked as `saved_metric`.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| is_col_valid, col_error = DatasetValidator.validate_against_dataset( | ||
| parsed_config, dataset.id | ||
| ) | ||
| if not is_col_valid and col_error is not None: | ||
| logger.warning( | ||
| "update_chart column validation failed for chart %s: %s", | ||
| getattr(chart, "id", None), | ||
| col_error, | ||
| ) | ||
| return GenerateChartResponse.model_validate( | ||
| { | ||
| "chart": None, | ||
| "error": col_error.model_dump(), | ||
| "success": False, | ||
| "schema_version": "2.0", | ||
| "api_version": "v1", | ||
| } | ||
| ) |
There was a problem hiding this comment.
Suggestion: This adds a full dataset validation call before validate_and_compile, but validate_and_compile already performs the same dataset validation internally using the already-loaded dataset object. The extra call triggers redundant dataset context fetching and duplicate validation work on every update request, increasing latency and database load. [performance]
Severity Level: Major ⚠️
- ⚠️ update_chart validation calls DatasetValidator twice per request.
- ⚠️ Extra dataset queries increase chart-update latency and load.Steps of Reproduction ✅
1. Call the MCP `update_chart` tool defined at
`superset/mcp_service/chart/tool/update_chart.py:304-595` with `generate_preview=False`
and a non-null `config` so that `parsed_config` is set (`update_chart.py:413-415`) and a
full visualization update runs.
2. After building the update payload via `_build_update_payload`
(`update_chart.py:87-121`), `update_chart` extracts `new_form_data` and, before
persisting, invokes `_validate_update_against_dataset(parsed_config, new_form_data,
chart)` inside the `mcp.update_chart.validation` log context at `update_chart.py:450-453`.
3. Inside `_validate_update_against_dataset` (`update_chart.py:167-251`), once the dataset
ORM object is resolved (`dataset` from `DatasetDAO.find_by_id` at 178-180), the new code
block at lines 199-205 calls `DatasetValidator.validate_against_dataset(parsed_config,
dataset.id)` without passing a `dataset_context`, causing `DatasetValidator` to fetch
dataset context from the database via `_get_dataset_context`
(`superset/mcp_service/chart/validation/dataset_validator.py:197-253`) and run full column
and aggregation validation once.
4. Immediately afterwards, `_validate_update_against_dataset` calls
`validate_and_compile(parsed_config, form_data, dataset, run_compile_check=True)` at
`update_chart.py:222-224`, and `validate_and_compile` in
`superset/mcp_service/chart/compile.py:19-84` builds a new `dataset_context` from the same
ORM dataset and calls `DatasetValidator.validate_against_dataset(config, dataset.id,
dataset_context=dataset_context)` again (compile.py:48-52), resulting in two complete
Tier-1 dataset validation passes and two dataset-context constructions for every
update_chart request.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 203:220
**Comment:**
*Performance: This adds a full dataset validation call before `validate_and_compile`, but `validate_and_compile` already performs the same dataset validation internally using the already-loaded dataset object. The extra call triggers redundant dataset context fetching and duplicate validation work on every update request, increasing latency and database load.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fixAdd _get_canonical_metric_name() to DatasetValidator that searches only available_metrics, preventing a column with matching case-insensitive name from shadowing a saved metric's canonical casing. Update all 7 chart plugins (xy, table, pie, big_number, handlebars, mixed_timeseries, pivot_table) to branch on saved_metric flag: saved metrics now go through _get_canonical_metric_name while regular column refs continue to use _get_canonical_column_name. Fix pre_validate alias handling in xy and mixed_timeseries plugins to accept Pydantic AliasChoices keys (metrics/x_axis/metrics_b) so payloads using canonical Superset field names are not incorrectly rejected. Add TestGetCanonicalMetricName, TestSavedMetricNormalizationCorrectness, and TestPreValidateAliasHandling test classes covering the collision case and alias acceptance. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A saved metric and a regular column with the same input name resolve to different display labels after normalization (saved metrics use the dataset's actual casing). Using a plain string key incorrectly flags them as duplicates; keying on (saved_metric, label) avoids the false collision. Fixes test_xy_saved_metric_uses_metric_casing.
There was a problem hiding this comment.
Code Review Agent Run #8680f9
Actionable Suggestions - 7
-
superset/mcp_service/chart/schemas.py - 1
- CWE-390: Bare Exception Handler · Line 491-496
-
superset/mcp_service/chart/plugin.py - 1
- Invalid type annotation string · Line 175-175
-
superset/mcp_service/chart/tool/update_chart.py - 2
- Redundant DB query in validation · Line 203-204
- Missing test coverage for normalization · Line 416-432
-
superset/mcp_service/chart/plugins/handlebars.py - 1
- Duplicate validation with schema · Line 44-124
-
tests/unit_tests/mcp_service/chart/validation/test_runtime_validator.py - 1
- Test coverage regression · Line 213-234
-
superset/mcp_service/chart/validation/runtime/__init__.py - 1
- Avoid catching blind Exception · Line 107-107
Additional Suggestions - 6
-
superset/initialization/__init__.py - 1
-
Missing error handling for MCP init · Line 750-750Wrap the call to `configure_mcp_chart_registry()` in a try-except block to catch any exceptions from `registry.configure()` and log them so that `init_app()` can continue even if MCP chart registry configuration fails.
-
-
superset/mcp_service/chart/validation/schema_validator.py - 2
-
Import inside function body · Line 151-152The `from superset.mcp_service.chart.registry import get_registry` import is placed inside the function body at lines 151-152. This causes the import to execute on every call to `_pre_validate_chart_type`. Move it to the module-level imports for better performance.
Code suggestion
--- superset/mcp_service/chart/validation/schema_validator.py +++ superset/mcp_service/chart/validation/schema_validator.py @@ -29,6 +29,7 @@ from superset.mcp_service.chart.schemas import ( ) from superset.mcp_service.common.error_schemas import ChartGenerationError +from superset.mcp_service.chart.registry import get_registry logger = logging.getLogger(__name__) @@ -148,9 +149,6 @@ class SchemaValidator: config: Dict[str, Any], ) -> Tuple[bool, ChartGenerationError | None]: """Validate chart type and dispatch to plugin pre-validation.""" - from superset.mcp_service.chart.registry import get_registry - registry = get_registry()
-
Misleading variable name · Line 176-176Variable `valid_types` on line 176 is misleading — `registry.all_types()` returns only enabled chart types (per registry.py:168), so the name suggests all registered types when it actually contains only enabled ones. Rename to `enabled_types` for clarity.
Code suggestion
--- superset/mcp_service/chart/validation/schema_validator.py (lines 175-182) +++ superset/mcp_service/chart/validation/schema_validator.py @@ -173,12 +173,12 @@ class SchemaValidator: ) if not registry.is_enabled(chart_type): - valid_types = ", ".join(registry.all_types()) + enabled_types = ", ".join(registry.all_types()) return False, ChartGenerationError( error_type="disabled_chart_type", message=f"Chart type '{chart_type}' is not enabled on this instance", details=f"Chart type '{chart_type}' is registered but has been " f"disabled by the operator. " - f"Enabled chart types: {valid_types}", + f"Enabled chart types: {enabled_types}", suggestions=[ - f"Use one of the enabled chart types: {valid_types}", + f"Use one of the enabled chart types: {enabled_types}", "Contact your administrator if you believe this is an error", ], error_code="DISABLED_CHART_TYPE", )
-
-
superset/mcp_service/chart/plugins/xy.py - 3
-
Private member access in DatasetValidator · Line 114-114Accessing private method `_get_canonical_column_name`. Consider using a public API or adding a public wrapper method.
-
Private member access in DatasetValidator · Line 115-115Accessing private method `_get_canonical_metric_name`. Consider using a public API or adding a public wrapper method.
-
Private member access in DatasetValidator · Line 129-129Accessing private method `_normalize_filters`. Consider using a public API or adding a public wrapper method.
-
Filtered by Review Rules
Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
-
superset/mcp_service/chart/registry.py - 1
- Broad exception catch · Line 87-87
Review Details
-
Files reviewed - 27 · Commit Range:
d3af275..f90f1cd- superset/initialization/__init__.py
- superset/mcp_service/app.py
- superset/mcp_service/chart/chart_utils.py
- superset/mcp_service/chart/plugin.py
- superset/mcp_service/chart/plugins/__init__.py
- superset/mcp_service/chart/plugins/big_number.py
- superset/mcp_service/chart/plugins/handlebars.py
- superset/mcp_service/chart/plugins/mixed_timeseries.py
- superset/mcp_service/chart/plugins/pie.py
- superset/mcp_service/chart/plugins/pivot_table.py
- superset/mcp_service/chart/plugins/table.py
- superset/mcp_service/chart/plugins/xy.py
- superset/mcp_service/chart/registry.py
- superset/mcp_service/chart/schemas.py
- superset/mcp_service/chart/tool/generate_chart.py
- superset/mcp_service/chart/tool/update_chart.py
- superset/mcp_service/chart/validation/dataset_validator.py
- superset/mcp_service/chart/validation/runtime/__init__.py
- superset/mcp_service/chart/validation/schema_validator.py
- superset/mcp_service/flask_singleton.py
- superset/mcp_service/mcp_config.py
- tests/unit_tests/mcp_service/chart/test_big_number_chart.py
- tests/unit_tests/mcp_service/chart/test_registry.py
- tests/unit_tests/mcp_service/chart/test_registry_filters.py
- tests/unit_tests/mcp_service/chart/tool/test_update_chart.py
- tests/unit_tests/mcp_service/chart/validation/test_column_name_normalization.py
- tests/unit_tests/mcp_service/chart/validation/test_runtime_validator.py
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
-
/review- Manually triggers a full AI review. -
/pause- Pauses automatic reviews on this pull request. -
/resume- Resumes automatic reviews. -
/resolve- Marks all Bito-posted review comments as resolved. -
/abort- Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
| try: | ||
| from superset.mcp_service.chart.registry import display_name_for_viz_type | ||
|
|
||
| _display_name = display_name_for_viz_type(_viz_type) if _viz_type else None | ||
| except Exception: | ||
| _display_name = None |
There was a problem hiding this comment.
Replace the broad except Exception: with except (ImportError, AttributeError): to avoid masking unexpected errors while still handling expected import or attribute issues. (See also: CWE-390)
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| """ | ||
| ... | ||
|
|
||
| def schema_error_hint(self) -> "ChartGenerationError | None": |
There was a problem hiding this comment.
Remove the quotes around the return type annotation in the ChartTypePlugin protocol for schema_error_hint so it reads: def schema_error_hint(self) -> ChartGenerationError | None:.
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| is_col_valid, col_error = DatasetValidator.validate_against_dataset( | ||
| parsed_config, dataset.id |
There was a problem hiding this comment.
Lines 203-204 call DatasetValidator.validate_against_dataset(parsed_config, dataset.id) which internally invokes _get_dataset_context(dataset_id) to fetch the dataset from the database. However, _validate_update_against_dataset already has the dataset available from lines 178-180. This causes a redundant database query per update operation. Pass the pre-fetched dataset via the dataset_context parameter to eliminate the duplicate lookup.
Code suggestion
Check the AI-generated fix before applying
--- superset/mcp_service/chart/tool/update_chart.py
+++ superset/mcp_service/chart/tool/update_chart.py
@@ -200,8 +200,14 @@ def _validate_update_against_dataset(
# Column existence + fuzzy-match validation
# (mirrors generate_chart pipeline layer 2)
from superset.mcp_service.chart.validation.dataset_validator import DatasetValidator
+ from superset.mcp_service.chart.compile import build_dataset_context_from_orm
+
+ # Reuse the already-fetched dataset to avoid a redundant DB query
+ dataset_context = build_dataset_context_from_orm(dataset)
- is_col_valid, col_error = DatasetValidator.validate_against_dataset(
- parsed_config, dataset.id
+ is_col_valid, col_error = DatasetValidator.validate_against_dataset(
+ parsed_config, dataset.id, dataset_context=dataset_context
)
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| # Normalize column case to match dataset canonical names | ||
| # (mirrors generate_chart pipeline layer 4) | ||
| chart_datasource_id = getattr(chart, "datasource_id", None) | ||
| if parsed_config is not None and chart_datasource_id is not None: | ||
| from superset.mcp_service.chart.validation.dataset_validator import ( | ||
| DatasetValidator, | ||
| NORMALIZATION_EXCEPTIONS, | ||
| ) | ||
|
|
||
| try: | ||
| parsed_config = DatasetValidator.normalize_column_names( | ||
| parsed_config, chart.datasource_id | ||
| ) | ||
| except NORMALIZATION_EXCEPTIONS as e: | ||
| logger.warning( | ||
| "Column normalization failed for chart %s: %s", chart.id, e | ||
| ) |
There was a problem hiding this comment.
The new column normalization feature (lines 416-432) has no test coverage. Tests should verify that column names are properly normalized to match dataset canonical names and that exceptions are handled gracefully without crashing the update operation.
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| def pre_validate( | ||
| self, | ||
| config: dict[str, Any], | ||
| ) -> ChartGenerationError | None: | ||
| if "handlebars_template" not in config: | ||
| return ChartGenerationError( | ||
| error_type="missing_handlebars_template", | ||
| message="Handlebars chart missing required field: handlebars_template", | ||
| details=( | ||
| "Handlebars charts require a 'handlebars_template' string " | ||
| "containing Handlebars HTML template markup" | ||
| ), | ||
| suggestions=[ | ||
| "Add 'handlebars_template' with a Handlebars HTML template", | ||
| "Data is available as {{data}} array in the template", | ||
| "Example: '<ul>{{#each data}}<li>{{this.name}}: " | ||
| "{{this.value}}</li>{{/each}}</ul>'", | ||
| ], | ||
| error_code="MISSING_HANDLEBARS_TEMPLATE", | ||
| ) | ||
|
|
||
| template = config.get("handlebars_template") | ||
| if not isinstance(template, str) or not template.strip(): | ||
| return ChartGenerationError( | ||
| error_type="invalid_handlebars_template", | ||
| message="Handlebars template must be a non-empty string", | ||
| details=( | ||
| "The 'handlebars_template' field must be a non-empty string " | ||
| "containing valid Handlebars HTML template markup" | ||
| ), | ||
| suggestions=[ | ||
| "Ensure handlebars_template is a non-empty string", | ||
| "Example: '<ul>{{#each data}}<li>{{this.name}}</li>{{/each}}</ul>'", | ||
| ], | ||
| error_code="INVALID_HANDLEBARS_TEMPLATE", | ||
| ) | ||
|
|
||
| query_mode = config.get("query_mode", "aggregate") | ||
| if query_mode not in ("aggregate", "raw"): | ||
| return ChartGenerationError( | ||
| error_type="invalid_query_mode", | ||
| message="Invalid query_mode for handlebars chart", | ||
| details="query_mode must be either 'aggregate' or 'raw'", | ||
| suggestions=[ | ||
| "Use 'aggregate' for aggregated data (default)", | ||
| "Use 'raw' for individual rows", | ||
| ], | ||
| error_code="INVALID_QUERY_MODE", | ||
| ) | ||
|
|
||
| if query_mode == "raw" and not config.get("columns"): | ||
| return ChartGenerationError( | ||
| error_type="missing_raw_columns", | ||
| message="Handlebars chart in 'raw' mode requires 'columns'", | ||
| details=( | ||
| "When query_mode is 'raw', you must specify which columns " | ||
| "to include in the query results" | ||
| ), | ||
| suggestions=[ | ||
| "Add 'columns': [{'name': 'column_name'}] for raw mode", | ||
| "Or use query_mode='aggregate' with 'metrics' and optional 'groupby'", # noqa: E501 | ||
| ], | ||
| error_code="MISSING_RAW_COLUMNS", | ||
| ) | ||
|
|
||
| if query_mode == "aggregate" and not config.get("metrics"): | ||
| return ChartGenerationError( | ||
| error_type="missing_aggregate_metrics", | ||
| message="Handlebars chart in 'aggregate' mode requires 'metrics'", | ||
| details=( | ||
| "When query_mode is 'aggregate' (default), you must specify " | ||
| "at least one metric with an aggregate function" | ||
| ), | ||
| suggestions=[ | ||
| "Add 'metrics': [{'name': 'column', 'aggregate': 'SUM'}]", | ||
| "Or use query_mode='raw' with 'columns' for individual rows", | ||
| ], | ||
| error_code="MISSING_AGGREGATE_METRICS", | ||
| ) | ||
|
|
||
| return None |
There was a problem hiding this comment.
The pre_validate method duplicates validation logic already implemented in HandlebarsChartConfig.validate_query_fields() (schemas.py:1055-1087). This includes checking for missing metrics in aggregate mode (line 109) and missing columns in raw mode (line 94). The schema validator also performs additional checks (mutual exclusivity, aggregate function presence) that pre_validate lacks. Having both creates maintenance divergence risk.
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| @@ -226,28 +220,15 @@ def test_validate_table_chart_skips_xy_validations(self): | |||
| ], | |||
| ) | |||
|
|
|||
| # These should not be called for table charts | |||
| with ( | |||
| patch( | |||
| "superset.mcp_service.chart.validation.runtime.RuntimeValidator." | |||
| "_validate_format_compatibility" | |||
| ) as mock_format, | |||
| patch( | |||
| "superset.mcp_service.chart.validation.runtime.RuntimeValidator." | |||
| "_validate_cardinality" | |||
| ) as mock_cardinality, | |||
| patch( | |||
| "superset.mcp_service.chart.validation.runtime.RuntimeValidator." | |||
| "_validate_chart_type" | |||
| ) as mock_chart_type, | |||
| ): | |||
| # Mock chart type validator to return no warnings | |||
| # Plugin runtime dispatches to TableChartPlugin which returns no warnings. | |||
| # Chart type suggester is also stubbed to return no warnings. | |||
| with patch( | |||
| "superset.mcp_service.chart.validation.runtime.RuntimeValidator." | |||
| "_validate_chart_type" | |||
| ) as mock_chart_type: | |||
| mock_chart_type.return_value = ([], []) | |||
|
|
|||
| is_valid, error = RuntimeValidator.validate_runtime_issues(config, 1) | |||
|
|
|||
| # Format and cardinality validation should not be called for table charts | |||
| mock_format.assert_not_called() | |||
| mock_cardinality.assert_not_called() | |||
| assert is_valid is True | |||
| assert error is None | |||
There was a problem hiding this comment.
Test test_validate_table_chart_skips_xy_validations no longer verifies that XY-specific validations are skipped. Original assertions mock_format.assert_not_called() and mock_cardinality.assert_not_called() were removed, weakening test coverage for the skip behavior.
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
| if plugin is None: | ||
| return [] | ||
| return plugin.get_runtime_warnings(config, dataset_id) | ||
| except Exception as exc: |
There was a problem hiding this comment.
Replace bare Exception catch with specific exception types (e.g., ImportError, AttributeError, RuntimeError) to avoid masking unexpected errors.
Code suggestion
Check the AI-generated fix before applying
| except Exception as exc: | |
| except (ImportError, AttributeError, RuntimeError) as exc: |
Code Review Run #8680f9
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
… redundant annotation quotes - schemas.py: CWE-390 bare except → add `as exc` + debug log so display-name lookup failures are observable (BLE001 suppressed: intentional fallback) - plugin.py: remove redundant forward-reference quotes from schema_error_hint return type (from __future__ import annotations already makes all annotations lazy strings) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Code Review Agent Run #40e077Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
SUMMARY
Introduces a
ChartTypePluginprotocol and central registry that consolidates all chart-type-specific logic into a single plugin class per chart type. Previously, adding a new chart type required synchronised edits to four separate dispatch locations. With this change it requires one new file.The problem: 4 dispatch points that had to stay in sync
Before this PR,
generate_chartlogic was split across:schema_validator.py_pre_validate_*static methods, one per typedataset_validator.pyisinstancebranches in_extract_column_referencesandnormalize_column_nameschart_utils.pyif/elifchain inmap_config_to_form_data,generate_chart_name,_resolve_viz_typeruntime/__init__.pyisinstance(config, XYChartConfig)hardcoded branchResult: 5 of 7 chart types (
pie,pivot_table,mixed_timeseries,handlebars,big_number) silently skipped column validation — a bug discovered only when charts failed at query time.Architecture
New file layout
Request flow (before → after)
flowchart LR subgraph BEFORE["Before: 4 separate dispatch points"] direction TB A1[schema_validator.py\n7 pre_validate methods] A2[dataset_validator.py\nisinstance branches] A3[chart_utils.py\nif/elif chains] A4[runtime/__init__.py\nXY hardcoding] end subgraph AFTER["After: single registry lookup"] direction TB B1[registry.get chart_type] --> B2[plugin.pre_validate] B1 --> B3[plugin.extract_column_refs] B1 --> B4[plugin.to_form_data] B1 --> B5[plugin.post_map_validate] B1 --> B6[plugin.normalize_column_refs] B1 --> B7[plugin.get_runtime_warnings] B1 --> B8[plugin.generate_name] B1 --> B9[plugin.resolve_viz_type] end BEFORE -.->|refactored to| AFTERPlugin protocol
classDiagram class ChartTypePlugin { <<Protocol>> +str chart_type +str display_name +dict native_viz_types +pre_validate(config) ChartGenerationError|None +extract_column_refs(config) list[ColumnRef] +to_form_data(config, dataset_id) dict +post_map_validate(config, form_data, dataset_id) ChartGenerationError|None +normalize_column_refs(config, dataset_context) Any +get_runtime_warnings(config, dataset_id) list[str] +generate_name(config, dataset_name) str +resolve_viz_type(config) str } class BaseChartPlugin { +pre_validate() None +extract_column_refs() [] +to_form_data() NotImplementedError +post_map_validate() None +normalize_column_refs() config unchanged +get_runtime_warnings() [] +generate_name() "Chart" +resolve_viz_type() "unknown" +_with_context(what, context)$ str } class XYChartPlugin class PieChartPlugin class TableChartPlugin class BigNumberChartPlugin class PivotTableChartPlugin class MixedTimeseriesChartPlugin class HandlebarsChartPlugin ChartTypePlugin <|.. BaseChartPlugin BaseChartPlugin <|-- XYChartPlugin BaseChartPlugin <|-- PieChartPlugin BaseChartPlugin <|-- TableChartPlugin BaseChartPlugin <|-- BigNumberChartPlugin BaseChartPlugin <|-- PivotTableChartPlugin BaseChartPlugin <|-- MixedTimeseriesChartPlugin BaseChartPlugin <|-- HandlebarsChartPluginRegistry bootstrap (lazy, no circular imports)
sequenceDiagram participant Caller as Caller (test / tool) participant Registry as registry.py participant Plugins as plugins/__init__.py Caller->>Registry: get("xy") Registry->>Registry: _ensure_plugins_loaded() alt first call Registry->>Plugins: import (triggers register() calls) Plugins-->>Registry: 7 plugins registered end Registry-->>Caller: XYChartPlugin instanceThe registry self-bootstraps on first lookup — no dependency on
app.pybeing imported first, so isolated unit tests work without the full Flask app.How to add a new chart type
Everything goes in one new file. Here is a minimal working example for a hypothetical
funnelchart:Step 1 — Add the Pydantic schema (
chart/schemas.py):Step 2 — Create the plugin (
chart/plugins/funnel.py):Step 3 — Register it (
chart/plugins/__init__.py):Step 4 — Add a Pydantic error hint (
chart/validation/schema_validator.py):That is the complete change set — no edits to
chart_utils.py,dataset_validator.py, orruntime/__init__.pydispatch logic.Bug fixes included
pie,pivot_table,mixed_timeseries,handlebars, andbig_numbercharts previously skipped dataset column validation entirely (silent false-pass). All 7 types now run column existence and aggregation checks.BigNumberChartPlugin.post_map_validate().isinstance(config, XYChartConfig)hardcoding fromRuntimeValidator; XY-specific format/cardinality checks now live inXYChartPlugin.get_runtime_warnings().generate_chart.pylisted only'xy'and'table'as valid chart types; updated to list all 7.pie,pivot_table,mixed_timeseriescases to_enhance_validation_error; refactored to a data-driven lookup table to reduce cyclomatic complexity.app.pyget a fully populated registry.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A — backend-only change, no UI modifications.
TESTING INSTRUCTIONS
Manual verification:
python -m superset.mcp_service.servercolumn_not_founderror with fuzzy suggestions (previously: silent pass, runtime failure)big_numberchart withshow_trendline=truebut notemporal_column→ should error at validationgenerate_chartfor all 7 chart types to confirm no regressionsADDITIONAL INFORMATION