Skip to content

feat(mcp): chart type plugin registry for extensible generate_chart#39922

Open
aminghadersohi wants to merge 16 commits into
masterfrom
mcp-chart-tools-rearch
Open

feat(mcp): chart type plugin registry for extensible generate_chart#39922
aminghadersohi wants to merge 16 commits into
masterfrom
mcp-chart-tools-rearch

Conversation

@aminghadersohi
Copy link
Copy Markdown
Contributor

@aminghadersohi aminghadersohi commented May 6, 2026

SUMMARY

Introduces a ChartTypePlugin protocol and central registry that consolidates all chart-type-specific logic into a single plugin class per chart type. Previously, adding a new chart type required synchronised edits to four separate dispatch locations. With this change it requires one new file.

The problem: 4 dispatch points that had to stay in sync

Before this PR, generate_chart logic was split across:

Location Responsibility
schema_validator.py 7 _pre_validate_* static methods, one per type
dataset_validator.py isinstance branches in _extract_column_references and normalize_column_names
chart_utils.py if/elif chain in map_config_to_form_data, generate_chart_name, _resolve_viz_type
runtime/__init__.py isinstance(config, XYChartConfig) hardcoded branch

Result: 5 of 7 chart types (pie, pivot_table, mixed_timeseries, handlebars, big_number) silently skipped column validation — a bug discovered only when charts failed at query time.


Architecture

New file layout

superset/mcp_service/chart/
├── plugin.py          # ChartTypePlugin Protocol + BaseChartPlugin base class
├── registry.py        # Central registry with lazy bootstrap
└── plugins/
    ├── __init__.py    # Imports and registers all 7 built-in plugins
    ├── xy.py
    ├── table.py
    ├── pie.py
    ├── pivot_table.py
    ├── mixed_timeseries.py
    ├── handlebars.py
    └── big_number.py

Request flow (before → after)

flowchart LR
    subgraph BEFORE["Before: 4 separate dispatch points"]
        direction TB
        A1[schema_validator.py\n7 pre_validate methods]
        A2[dataset_validator.py\nisinstance branches]
        A3[chart_utils.py\nif/elif chains]
        A4[runtime/__init__.py\nXY hardcoding]
    end

    subgraph AFTER["After: single registry lookup"]
        direction TB
        B1[registry.get chart_type] --> B2[plugin.pre_validate]
        B1 --> B3[plugin.extract_column_refs]
        B1 --> B4[plugin.to_form_data]
        B1 --> B5[plugin.post_map_validate]
        B1 --> B6[plugin.normalize_column_refs]
        B1 --> B7[plugin.get_runtime_warnings]
        B1 --> B8[plugin.generate_name]
        B1 --> B9[plugin.resolve_viz_type]
    end

    BEFORE -.->|refactored to| AFTER
Loading

Plugin protocol

classDiagram
    class ChartTypePlugin {
        <<Protocol>>
        +str chart_type
        +str display_name
        +dict native_viz_types
        +pre_validate(config) ChartGenerationError|None
        +extract_column_refs(config) list[ColumnRef]
        +to_form_data(config, dataset_id) dict
        +post_map_validate(config, form_data, dataset_id) ChartGenerationError|None
        +normalize_column_refs(config, dataset_context) Any
        +get_runtime_warnings(config, dataset_id) list[str]
        +generate_name(config, dataset_name) str
        +resolve_viz_type(config) str
    }

    class BaseChartPlugin {
        +pre_validate() None
        +extract_column_refs() []
        +to_form_data() NotImplementedError
        +post_map_validate() None
        +normalize_column_refs() config unchanged
        +get_runtime_warnings() []
        +generate_name() "Chart"
        +resolve_viz_type() "unknown"
        +_with_context(what, context)$ str
    }

    class XYChartPlugin
    class PieChartPlugin
    class TableChartPlugin
    class BigNumberChartPlugin
    class PivotTableChartPlugin
    class MixedTimeseriesChartPlugin
    class HandlebarsChartPlugin

    ChartTypePlugin <|.. BaseChartPlugin
    BaseChartPlugin <|-- XYChartPlugin
    BaseChartPlugin <|-- PieChartPlugin
    BaseChartPlugin <|-- TableChartPlugin
    BaseChartPlugin <|-- BigNumberChartPlugin
    BaseChartPlugin <|-- PivotTableChartPlugin
    BaseChartPlugin <|-- MixedTimeseriesChartPlugin
    BaseChartPlugin <|-- HandlebarsChartPlugin
Loading

Registry bootstrap (lazy, no circular imports)

sequenceDiagram
    participant Caller as Caller (test / tool)
    participant Registry as registry.py
    participant Plugins as plugins/__init__.py

    Caller->>Registry: get("xy")
    Registry->>Registry: _ensure_plugins_loaded()
    alt first call
        Registry->>Plugins: import (triggers register() calls)
        Plugins-->>Registry: 7 plugins registered
    end
    Registry-->>Caller: XYChartPlugin instance
Loading

The registry self-bootstraps on first lookup — no dependency on app.py being imported first, so isolated unit tests work without the full Flask app.


How to add a new chart type

Everything goes in one new file. Here is a minimal working example for a hypothetical funnel chart:

Step 1 — Add the Pydantic schema (chart/schemas.py):

class FunnelChartConfig(ChartConfig):
    chart_type: Literal["funnel"] = "funnel"
    metric: ColumnRef
    groupby: ColumnRef
    filters: list[FilterConfig] | None = None

Step 2 — Create the plugin (chart/plugins/funnel.py):

from superset.mcp_service.chart.plugin import BaseChartPlugin
from superset.mcp_service.chart.schemas import ColumnRef
from superset.mcp_service.common.error_schemas import ChartGenerationError


class FunnelChartPlugin(BaseChartPlugin):
    chart_type = "funnel"
    display_name = "Funnel Chart"
    native_viz_types = {"funnel": "Funnel Chart"}

    def pre_validate(self, config: dict) -> ChartGenerationError | None:
        if "metric" not in config or "groupby" not in config:
            return ChartGenerationError(
                error_type="missing_funnel_fields",
                message="Funnel chart requires 'metric' and 'groupby'",
                suggestions=["Add metric: {'name': 'col', 'aggregate': 'SUM'}"],
                error_code="MISSING_FUNNEL_FIELDS",
            )
        return None

    def extract_column_refs(self, config) -> list[ColumnRef]:
        from superset.mcp_service.chart.schemas import FunnelChartConfig
        if not isinstance(config, FunnelChartConfig):
            return []
        refs = [config.metric, config.groupby]
        if config.filters:
            refs.extend(ColumnRef(name=f.column) for f in config.filters)
        return refs

    def to_form_data(self, config, dataset_id=None) -> dict:
        return {
            "viz_type": "funnel",
            "metric": {"column": {"column_name": config.metric.name},
                       "aggregate": config.metric.aggregate},
            "groupby": [config.groupby.name],
        }

    def resolve_viz_type(self, config) -> str:
        return "funnel"

    def generate_name(self, config, dataset_name=None) -> str:
        return self._with_context(
            f"Funnel: {config.metric.aggregate}({config.metric.name})",
            dataset_name,
        )

    def normalize_column_refs(self, config, dataset_context):
        from superset.mcp_service.chart.schemas import FunnelChartConfig
        from superset.mcp_service.chart.validation.dataset_validator import DatasetValidator
        d = config.model_dump()
        get = DatasetValidator._get_canonical_column_name
        d["metric"]["name"] = get(d["metric"]["name"], dataset_context)
        d["groupby"]["name"] = get(d["groupby"]["name"], dataset_context)
        DatasetValidator._normalize_filters(d, dataset_context)
        return FunnelChartConfig.model_validate(d)

Step 3 — Register it (chart/plugins/__init__.py):

from superset.mcp_service.chart.plugins.funnel import FunnelChartPlugin
# ...existing imports...
register(FunnelChartPlugin())

Step 4 — Add a Pydantic error hint (chart/validation/schema_validator.py):

# In _CHART_TYPE_ERROR_HINTS — keep in sync with plugins/
"funnel": {
    "metric": "Add metric: {'name': 'col', 'aggregate': 'SUM'}",
    "groupby": "Add groupby: {'name': 'stage_column'}",
},

That is the complete change set — no edits to chart_utils.py, dataset_validator.py, or runtime/__init__.py dispatch logic.


Bug fixes included

  • 5-type column validation gap: pie, pivot_table, mixed_timeseries, handlebars, and big_number charts previously skipped dataset column validation entirely (silent false-pass). All 7 types now run column existence and aggregation checks.
  • BigNumber trendline check: Moved out of the monolithic dispatcher into BigNumberChartPlugin.post_map_validate().
  • Runtime validator decoupling: Removed isinstance(config, XYChartConfig) hardcoding from RuntimeValidator; XY-specific format/cardinality checks now live in XYChartPlugin.get_runtime_warnings().
  • Stale docstring: generate_chart.py listed only 'xy' and 'table' as valid chart types; updated to list all 7.
  • Pydantic error handling: Added missing pie, pivot_table, mixed_timeseries cases to _enhance_validation_error; refactored to a data-driven lookup table to reduce cyclomatic complexity.
  • Registry bootstrap: Registry self-bootstraps on first access so unit tests running without app.py get a fully populated registry.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — backend-only change, no UI modifications.

TESTING INSTRUCTIONS

# Run the MCP unit tests for chart functionality
pytest tests/unit_tests/mcp_service/chart/ -x -q

# Spot-check column validation now works for previously-unvalidated types
pytest tests/unit_tests/mcp_service/chart/validation/ -x -q

Manual verification:

  1. Start the MCP service: python -m superset.mcp_service.server
  2. Create a pie chart with a non-existent dimension column → should get a column_not_found error with fuzzy suggestions (previously: silent pass, runtime failure)
  3. Create a big_number chart with show_trendline=true but no temporal_column → should error at validation
  4. Run generate_chart for all 7 chart types to confirm no regressions

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration
  • Introduces new feature or API
  • Removes existing feature or API

@netlify
Copy link
Copy Markdown

netlify Bot commented May 6, 2026

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 7dc3563
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a0e25c10bff6d0008bd723e
😎 Deploy Preview https://deploy-preview-39922--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

Comment thread superset/mcp_service/chart/chart_utils.py
Comment thread superset/mcp_service/chart/validation/dataset_validator.py
@bito-code-review
Copy link
Copy Markdown
Contributor

The inline context is insufficient to determine what 'this' refers to in the question. No diff hunk or suggestion snippet is provided for the thread in superset/mcp_service/chart/chart_utils.py.

Comment thread superset/mcp_service/chart/plugins/big_number.py Outdated
Comment thread superset/mcp_service/chart/plugins/big_number.py Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a chart-type plugin architecture for the MCP chart pipeline, centralizing per-chart-type behavior (pre-validation, column extraction/normalization, form_data mapping, runtime warnings, naming, viz_type resolution) into one plugin per supported chart_type. This reduces the previous need to update multiple dispatch sites when adding or modifying chart types.

Changes:

  • Added a ChartTypePlugin protocol + BaseChartPlugin, a global plugin registry, and 7 built-in chart plugins (xy, table, pie, pivot_table, mixed_timeseries, handlebars, big_number).
  • Refactored schema/dataset/runtime validation and form_data mapping to dispatch via the registry instead of isinstance / if/elif chains.
  • Enhanced tool/docs output by expanding generate_chart docstring chart types and adding chart_type_display_name to ChartInfo.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
superset/mcp_service/chart/validation/schema_validator.py Switch pre-validation dispatch to registry/plugins; refactor Pydantic error enhancement.
superset/mcp_service/chart/validation/runtime/init.py Replace XY-only runtime checks with per-plugin runtime warning dispatch.
superset/mcp_service/chart/validation/dataset_validator.py Delegate column extraction/normalization to plugins; apply aggregation validation for all chart types.
superset/mcp_service/chart/tool/generate_chart.py Update tool docstring to list all supported chart_type values.
superset/mcp_service/chart/schemas.py Add chart_type_display_name to ChartInfo; resolve display name via registry mapping.
superset/mcp_service/chart/registry.py Introduce global registry for chart plugins + viz_type display-name lookup.
superset/mcp_service/chart/plugin.py Add plugin protocol and base class defining the plugin responsibilities.
superset/mcp_service/chart/plugins/init.py Register the built-in plugins on import.
superset/mcp_service/chart/plugins/xy.py Implement XY plugin (pre-validate, refs, mapping, naming, viz_type, runtime warnings).
superset/mcp_service/chart/plugins/table.py Implement Table plugin.
superset/mcp_service/chart/plugins/pie.py Implement Pie plugin.
superset/mcp_service/chart/plugins/pivot_table.py Implement Pivot Table plugin.
superset/mcp_service/chart/plugins/mixed_timeseries.py Implement Mixed Timeseries plugin.
superset/mcp_service/chart/plugins/handlebars.py Implement Handlebars plugin.
superset/mcp_service/chart/plugins/big_number.py Implement Big Number plugin including post-map trendline temporal validation.
superset/mcp_service/chart/chart_utils.py Replace mapping/name/viz_type dispatch with registry/plugin calls.
superset/mcp_service/app.py Import plugins at MCP app startup; update default instructions about chart type naming.

Comment thread superset/mcp_service/chart/registry.py Outdated
Comment thread superset/mcp_service/chart/validation/schema_validator.py
Comment thread superset/mcp_service/chart/validation/schema_validator.py
Comment thread superset/mcp_service/chart/validation/schema_validator.py Outdated
Comment thread superset/mcp_service/chart/validation/runtime/__init__.py
Comment thread superset/mcp_service/chart/validation/runtime/__init__.py
Comment thread superset/mcp_service/chart/plugins/xy.py
Comment thread superset/mcp_service/chart/tool/generate_chart.py
Comment thread superset/mcp_service/app.py Outdated
Comment thread superset/mcp_service/chart/plugins/pie.py Outdated
@aminghadersohi aminghadersohi force-pushed the mcp-chart-tools-rearch branch from 891676f to 1e2b541 Compare May 7, 2026 22:19
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 34.33476% with 459 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.12%. Comparing base (a183582) to head (42a293a).
⚠️ Report is 46 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/chart/plugins/xy.py 28.75% 57 Missing ⚠️
...rset/mcp_service/chart/plugins/mixed_timeseries.py 26.86% 49 Missing ⚠️
superset/mcp_service/chart/registry.py 42.35% 44 Missing and 5 partials ⚠️
superset/mcp_service/chart/plugins/big_number.py 28.78% 47 Missing ⚠️
superset/mcp_service/chart/plugins/handlebars.py 29.03% 44 Missing ⚠️
superset/mcp_service/chart/plugins/pivot_table.py 31.03% 40 Missing ⚠️
superset/mcp_service/chart/plugins/pie.py 37.50% 30 Missing ⚠️
superset/mcp_service/chart/plugins/table.py 40.00% 27 Missing ⚠️
.../mcp_service/chart/validation/dataset_validator.py 14.81% 23 Missing ⚠️
superset/mcp_service/chart/plugin.py 64.70% 18 Missing ⚠️
... and 6 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #39922      +/-   ##
==========================================
- Coverage   64.18%   64.12%   -0.07%     
==========================================
  Files        2591     2602      +11     
  Lines      138471   139388     +917     
  Branches    32120    32254     +134     
==========================================
+ Hits        88883    89381     +498     
- Misses      48056    48469     +413     
- Partials     1532     1538       +6     
Flag Coverage Δ
hive 39.47% <34.33%> (+0.06%) ⬆️
mysql 58.84% <34.33%> (-0.21%) ⬇️
postgres 58.92% <34.33%> (-0.21%) ⬇️
presto 41.13% <34.33%> (+0.04%) ⬆️
python 60.47% <34.33%> (-0.10%) ⬇️
sqlite 58.57% <34.33%> (-0.21%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@aminghadersohi
Copy link
Copy Markdown
Contributor Author

Internal architecture review — findings & resolutions

Ran an independent codex review of the plugin registry design. Summary of findings and their status:

Finding Severity Status
_ensure_plugins_loaded() flag not lock-protected — race condition on concurrent first-calls MEDIUM ✅ Fixed in eface3b — added threading.Lock with double-checked locking
Duplicate register() call silently overwrites MEDIUM Already handled — registry.py lines 68-71 emit logger.warning on overwrite, no silent overwrite
extract_column_refs() returns [] for unrecognised configs MEDIUM Accepted — correct for the 7 known types (plugins always registered); unknown types have no column schema to validate against
BaseChartPlugin.extract_column_refs() no-op default NIT Accepted — base class provides a safe default; concrete plugins override for their types
Local imports in chart_utils.py / dataset_validator.py NIT Already addressed in previous commit with explanatory comments on why they must stay local (circular dependency prevention)

The thread-safety fix (eface3bf) is the only code change — the others are either already handled or accepted design decisions.

@aminghadersohi aminghadersohi force-pushed the mcp-chart-tools-rearch branch from f5ba09f to 7dc3563 Compare May 20, 2026 21:21
@aminghadersohi aminghadersohi marked this pull request as ready for review May 20, 2026 21:50
@dosubot dosubot Bot added change:backend Requires changing the backend risk:refactor High risk as it involves large refactoring work labels May 20, 2026
aminghadersohi and others added 5 commits May 21, 2026 10:10
…generation

Replaces four scattered dispatch locations (schema_validator, dataset_validator,
chart_utils, runtime validator) with a central ChartTypePlugin registry. Each of
the 7 supported chart types (xy, table, pie, pivot_table, mixed_timeseries,
handlebars, big_number) now owns its pre-validation, column extraction, form_data
mapping, post-map validation, column normalization, and runtime warnings in a
single plugin class.

Key changes:
- Add ChartTypePlugin protocol and BaseChartPlugin base class (plugin.py)
- Add ChartTypeRegistry with register/get/all_types helpers (registry.py)
- Add 7 chart type plugins under chart/plugins/ with full coverage
- Fix 5-type column validation gap: pie, pivot_table, mixed_timeseries, handlebars,
  and big_number now participate in dataset column validation (previously silently skipped)
- Move BigNumber trendline temporal check to BigNumberChartPlugin.post_map_validate()
- Add get_runtime_warnings() to plugin protocol; XYChartPlugin implements
  format/cardinality checks, removing isinstance(config, XYChartConfig) from RuntimeValidator
- Fix stale generate_chart.py docstring listing only 'xy' and 'table' chart types
- Add missing pie, pivot_table, mixed_timeseries handlers to _enhance_validation_error;
  refactor into a data-driven lookup table to stay within complexity limits
- Fix empty details fallback in Pydantic error handler
Each ChartTypePlugin now declares:
- display_name: human-readable label for the chart_type discriminator
  (e.g. "Line / Bar / Area / Scatter Chart", "Pivot Table")
- native_viz_types: dict mapping every Superset-internal viz_type the
  plugin produces to a user-friendly name
  (e.g. {"echarts_timeseries_line": "Line Chart", "echarts_area": "Area Chart"})

The registry gains display_name_for_viz_type(viz_type) which searches
all plugins' native_viz_types maps, replacing the need for a separate
viz_type_display_names.json or viz_type_names.py module.

ChartInfo gains a chart_type_display_name field populated via the registry,
so list_charts / get_chart_info return human-readable chart type names.
The MCP system instructions now reference display names rather than
internal viz_type identifiers.
H1: Delete 7 dead _pre_validate_* static methods from SchemaValidator
    — exact duplicates of plugin pre_validate() methods, never called
    after _pre_validate_chart_type() was updated to delegate to plugin.

H2: Inline DatasetValidator._normalize_xy_config/_normalize_table_config
    into XYChartPlugin/TableChartPlugin.normalize_column_refs() and delete
    both DatasetValidator helper methods. The 5 other plugins already
    called _get_canonical_column_name directly; XY and Table now match.

H3: Add generate_name()/resolve_viz_type() to ChartTypePlugin protocol
    and BaseChartPlugin, implement in all 7 plugins. Replace the 7-arm
    isinstance chain in generate_chart_name() and the 7-arm elif chain
    in _resolve_viz_type() with single-line registry dispatch.

H4: Add a sync comment above _CHART_TYPE_ERROR_HINTS to document that
    it must stay in sync with the plugin registry.

M4: Move logger=logging.getLogger(__name__) from inside
    XYChartPlugin.get_runtime_warnings() to module scope.
…xes, test repairs

On top of the dead-code elimination in the previous commit:
- Add lazy _ensure_plugins_loaded() bootstrap to ChartTypeRegistry so the
  registry is populated even without importing app.py (fixes isolated test runs)
- Delegate _RegistryProxy methods to module-level functions so bootstrap runs
- Guard register() against empty chart_type strings
- Add generate_name + resolve_viz_type to ChartTypePlugin Protocol and
  BaseChartPlugin; delegate generate_chart_name/_resolve_viz_type in
  chart_utils to the plugin registry
- Add _with_context static helper to BaseChartPlugin (shared by all plugins)
- Fix stale 'five methods' → 'eight methods' docstring in plugin.py
- Add TypeVar _C to normalize_column_names so mypy infers correct return type
- Fix broken tests: update _pre_validate_big_number_config → _pre_validate_chart_type,
  remove deleted TestNormalizeXYConfig/TestNormalizeTableConfig classes,
  update runtime validator tests for removed _validate_format_compatibility /
  _validate_cardinality methods, add x is not None narrowing guards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nal corrections, cardinality suggestions

- Remove redundant local imports from BigNumberChartPlugin.post_map_validate()
  now that BigNumberChartConfig and is_column_truly_temporal are at top level
- Add explanatory comments on the two remaining local get_registry imports in
  chart_utils.py and dataset_validator.py (circular import prevention)
- Fix schema_validator.py and generate_chart.py docstring: XY 'x' field is
  optional (defaults to dataset primary datetime column), not required
- Propagate cardinality suggestions alongside warnings in XYChartPlugin
- Clarify app.py instructions: chart_type_display_name is null for viz_types
  outside the 7 generate_chart-supported types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aminghadersohi and others added 8 commits May 21, 2026 10:10
All per-method local imports in the 7 chart plugins were moved to module-level.
None of them create circular imports: schemas.py, chart_utils.py, and
dataset_validator.py are safe to import at plugin load time because those
modules guard their own registry lookups with local imports.

- big_number: add map_big_number_config, _big_number_chart_what,
  _summarize_filters, DatasetValidator to top-level imports
- pie: add map_pie_config, _pie_chart_what, _summarize_filters, PieChartConfig,
  DatasetValidator to top-level imports
- xy: add map_xy_config, _xy_chart_what/context, XYChartConfig, DatasetValidator,
  FormatTypeValidator, CardinalityValidator to top-level imports
- table: add map_table_config, _table_chart_what, _summarize_filters,
  TableChartConfig, DatasetValidator to top-level imports
- pivot_table: add map_pivot_table_config, _pivot_table_what, _summarize_filters,
  PivotTableChartConfig, DatasetValidator to top-level imports
- mixed_timeseries: add map_mixed_timeseries_config, _mixed_timeseries_what,
  _summarize_filters, MixedTimeseriesChartConfig, DatasetValidator to top-level
- handlebars: add map_handlebars_config, _handlebars_chart_what, _summarize_filters,
  HandlebarsChartConfig, DatasetValidator to top-level imports

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Split long string literal in schema_validator.py line 202 (E501, 94 > 88 chars)
- Apply ruff format auto-fixes to big_number.py, handlebars.py, and test_get_chart_data.py
- Move error_schemas import above _C TypeVar definition (E402)
- Split two over-length comment lines to ≤88 chars (E501, lines 268 and 380)
_ensure_plugins_loaded() used an unprotected boolean flag, making it
unsafe under concurrent first-call scenarios (e.g. gunicorn multi-thread
workers). Double-checked locking with threading.Lock eliminates the race.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
update_chart was only running SchemaValidator + Tier-2 compile check,
silently skipping DatasetValidator's column-existence + fuzzy-match
and column-name normalisation layers that generate_chart runs.

A typo like {name: "reveneu"} would save the broken chart and only
surface as a render-time failure in the browser.

Now matches generate_chart pipeline:
- Layer 2: DatasetValidator.validate_against_dataset() — column
  existence check with fuzzy-match "did you mean?" suggestions returned
  to the LLM before any DB write occurs
- Layer 4: DatasetValidator.normalize_column_names() — case
  normalisation so "order_date" resolves to "OrderDate" if that is the
  canonical dataset name

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nts, remove column regex

P1.1 registry.py: move _plugins_loaded=True to after successful import so a
failed load doesn't permanently poison the registry.

P1.3 schemas.py: remove overly restrictive ColumnRef.name / FilterClause.column
/ BigNumberChartConfig.temporal_column regex that blocked valid column names
containing parentheses, slashes, and other SQL-common characters.

P2.3 (DRY): eliminate _CHART_TYPE_ERROR_HINTS second-registry in
schema_validator.py by adding schema_error_hint() to ChartTypePlugin protocol,
BaseChartPlugin default, and all 7 plugin classes. SchemaValidator now delegates
to the plugin registry instead of maintaining a parallel dict.

P3.3 test_registry.py: add full registry unit-test coverage (register, get,
all_types, is_registered, display_name_for_viz_type, proxy methods, duplicate
warning, empty chart_type validation, insertion-order guarantee).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… column validation

Split an 89-char comment line and an over-limit condition in update_chart.py
to satisfy the ruff E501 rule. Also applied ruff format.

Two TestUpdateChartValidationGate tests expected CHART_VALIDATION_FAILED but
received CHART_DATASET_NOT_FOUND because _validate_update_against_dataset calls
DatasetValidator.validate_against_dataset before validate_and_compile, and the
existing mocks provided a Mock() object for chart.datasource whose .id attribute
is an auto-generated MagicMock (not a real int). Added a patch for
DatasetValidator.validate_against_dataset returning (True, None) so the
column-validation tier is bypassed and the test reaches the mocked
validate_and_compile response as intended.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…onfig

Introduces a dynamic filter layer in the chart type registry so operators can
disable individual plugins (e.g. `handlebars`) without a code deploy:

- `MCP_DISABLED_CHART_PLUGINS: frozenset[str]` — static deny-list in mcp_config.py
- `MCP_CHART_PLUGIN_ENABLED_FUNC: Callable[[str], bool] | None` — dynamic hook
  for Harness/Split/per-user targeting; takes precedence over the deny-list
- Both keys are propagated through `get_mcp_config()` defaults

registry.py changes:
- `_PluginFilterConfig` frozen dataclass replaces two bare globals so
  configure() replaces them atomically (no torn reads under concurrency)
- `configure(disabled, enabled_func)` — called at app init; accepts any
  iterable for `disabled`; validates `enabled_func` is callable
- `_is_plugin_enabled()` — reads config once, fails closed on callable exception
- `get()` / `all_types()` / `is_enabled()` apply the filter at lookup time;
  `is_registered()` and `display_name_for_viz_type()` intentionally bypass it
  so callers can distinguish "unknown" vs "disabled" and existing charts still
  resolve display names for disabled viz types

schema_validator.py: two-step pre-check — `is_registered()` for unknown types,
`is_enabled()` for disabled ones, with distinct `DISABLED_CHART_TYPE` error code.

Wiring:
- `SupersetAppInitializer.configure_mcp_chart_registry()` called after
  `configure_feature_flags()` in `init_app()`
- `flask_singleton.py` re-calls `registry.configure()` after the MCP config
  overlay so MCP-specific overrides in `superset_config.py` take effect in
  standalone MCP mode

Tests: 28 cases in test_registry_filters.py covering deny-list, callable hook,
fail-closed on exception, all_types() filtering, display_name bypass, atomic
reconfigure, and configure() with list/tuple/frozenset inputs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@aminghadersohi aminghadersohi force-pushed the mcp-chart-tools-rearch branch from 7dc3563 to 2e7a86f Compare May 21, 2026 10:10
Comment on lines +192 to +195
if config_dict.get("metric") and not config_dict["metric"].get("saved_metric"):
config_dict["metric"]["name"] = DatasetValidator._get_canonical_column_name(
config_dict["metric"]["name"], dataset_context
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Saved metrics are skipped during name normalization, so case-insensitive validation can pass but the emitted metric string can still use the wrong casing and fail later in Superset's exact-name metric lookup. Normalize saved metric names against dataset metrics (not just raw columns) so the final form_data always uses canonical metric names. [api mismatch]

Severity Level: Critical 🚨
- ❌ Big Number charts with mis-cased saved metrics fail.
- ⚠️ Validation passes but Superset query errors at runtime.
- ⚠️ LLM clients see confusing success-then-fail behavior.
Steps of Reproduction ✅
1. Define a dataset in Superset with a saved metric whose name differs in case from the
desired client input, e.g. metric_name="Revenue" (saved metric) on a dataset that
ValidationPipeline will load via DatasetValidator._get_dataset_context at
`superset/mcp_service/chart/validation/dataset_validator.py:197-252`.

2. Call the MCP tool `generate_chart` at
`superset/mcp_service/chart/tool/generate_chart.py:95` with a Big Number config using the
saved metric but with different casing, e.g. `"metric": {"name": "revenue",
"saved_metric": true}` in `GenerateChartRequest.config` (chart_type="big_number").

3. ValidationPipeline.validate_request_with_warnings at
`superset/mcp_service/chart/validation/pipeline.py:90-55` invokes
DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, which calls
`_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
`col_ref.name.lower()` against `available_metrics` lowercased, so `"revenue"` is accepted
as a valid saved metric even though its canonical name is `"Revenue"`.

4. The same pipeline then normalizes column names via
DatasetValidator.normalize_column_names at `dataset_validator.py:94-141`, which delegates
to BigNumberChartPlugin.normalize_column_refs at
`superset/mcp_service/chart/plugins/big_number.py:189-203`; because the metric has
`"saved_metric": true`, the guard at lines 192-195 skips `_get_canonical_column_name`, so
the metric name remains `"revenue"` and is never canonicalized to `"Revenue"`.

5. After validation, `generate_chart` calls `map_config_to_form_data` at
`superset/mcp_service/chart/chart_utils.py:93-131`, which routes to
BigNumberChartPlugin.to_form_data at `big_number.py:132-135`; `map_big_number_config` at
`chart_utils.py:763-799` builds form_data using `create_metric_object(config.metric)` at
`chart_utils.py:11-51`, which returns the raw `col.name` string `"revenue"` for saved
metrics.

6. When Superset executes the query, its SQLA connector resolves metrics via a
`metrics_by_name` dict (see docstring and usage at `chart_utils.py:491-499` and
`superset/connectors/sqla/models.py:1758-38`); because the dict keys are the canonical
metric names (e.g. `"Revenue"`), the mis-cased `"revenue"` key is missing and Superset
raises `QueryObjectValidationError("Metric '%(metric)s' does not exist")`, causing the Big
Number chart generation to fail at runtime even though validation passed.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/plugins/big_number.py
**Line:** 192:195
**Comment:**
	*Api Mismatch: Saved metrics are skipped during name normalization, so case-insensitive validation can pass but the emitted metric string can still use the wrong casing and fail later in Superset's exact-name metric lookup. Normalize saved metric names against dataset metrics (not just raw columns) so the final form_data always uses canonical metric names.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The normalization is already in place. Both big_number.py and pie.py branch on saved_metric in normalize_column_refs: when saved_metric=True, the name is resolved through DatasetValidator._get_canonical_metric_name(), which performs a case-insensitive lookup exclusively against available_metrics and returns the canonical dataset name. Non-saved metrics go through _get_canonical_column_name instead. The described bug doesn't exist in the current code.

Comment thread superset/mcp_service/chart/plugins/pie.py Outdated
Comment on lines +123 to +132
def _norm_col_list(key: str) -> None:
if config_dict.get(key):
for col in config_dict[key]:
col["name"] = DatasetValidator._get_canonical_column_name(
col["name"], dataset_context
)

_norm_col_list("rows")
_norm_col_list("metrics")
_norm_col_list("columns")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Metrics are normalized with a helper that prefers matching dataset columns before metrics, so a saved metric that case-insensitively collides with a column name can be rewritten to the column identifier and stop resolving as a saved metric. Handle metric normalization with metric-aware precedence (or branch on saved_metric) to preserve metric identity. [logic error]

Severity Level: Critical 🚨
- ❌ Pivot table charts can fail on valid saved metrics.
- ⚠️ Column-first canonicalization corrupts saved metric identity.
- ⚠️ Users get runtime metric-not-found errors after validation.
Steps of Reproduction ✅
1. Consider a dataset where a physical column and a saved metric share the same
case-insensitive name but differ in canonical casing, e.g. available_columns includes
{"name": "Revenue"} and available_metrics includes {"name": "REVENUE"} as built in
DatasetValidator._get_dataset_context at
`superset/mcp_service/chart/validation/dataset_validator.py:197-252`.

2. A client calls the MCP `generate_chart` tool at
`superset/mcp_service/chart/tool/generate_chart.py:95` with a pivot_table config
referencing the saved metric using a lowercased name, e.g. `"metrics": [{"name":
"revenue", "saved_metric": true}]` in `PivotTableChartConfig`
(`superset/mcp_service/chart/schemas.py:243-261`).

3. ValidationPipeline.validate_request_with_warnings at
`superset/mcp_service/chart/validation/pipeline.py:90-55` runs
DatasetValidator.validate_against_dataset at `dataset_validator.py:55-72`, which calls
`_validate_saved_metrics` at `dataset_validator.py:485-46`; this checks
`col_ref.name.lower()` against `available_metrics` lowercased, so `"revenue"` is accepted
as a valid saved metric in spite of the ambiguous column name.

4. In the column-normalization layer, ValidationPipeline._normalize_column_names at
`validation/pipeline.py:137-181` calls DatasetValidator.normalize_column_names at
`dataset_validator.py:94-141`, which looks up the PivotTableChartPlugin and invokes its
normalize_column_refs at `superset/mcp_service/chart/plugins/pivot_table.py:120-134`;
`_norm_col_list("metrics")` at lines 123-131 runs `_get_canonical_column_name` at
`dataset_validator.py:46-77` for each metric, and because that helper checks
`available_columns` before `available_metrics`, the shared lowercase name `"revenue"` is
canonicalized to the column's name `"Revenue"` instead of the saved metric's canonical
name `"REVENUE"`.

5. The now-normalized config is used in `generate_chart` to build Superset form_data via
map_config_to_form_data at `chart_utils.py:93-131`, which routes to
PivotTableChartPlugin.to_form_data at `pivot_table.py:107-110`; `map_pivot_table_config`
at `superset/mcp_service/chart/chart_utils.py:841-871` builds the `"metrics"` list by
calling `create_metric_object(col)` at `chart_utils.py:11-51`, which returns the string
`"Revenue"` for the saved metric because `col.saved_metric` is still True.

6. At query execution time, Superset resolves metric strings via `metrics_by_name` (see
docstring at `chart_utils.py:491-499` and access pattern at
`superset/connectors/sqla/models.py:1758-38`); this dict is keyed by the canonical saved
metric name `"REVENUE"`, so the rewritten `"Revenue"` key is missing, causing
`QueryObjectValidationError("Metric '%(metric)s' does not exist")` and breaking pivot
table chart execution for datasets where saved metric names collide case-insensitively
with column names.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/plugins/pivot_table.py
**Line:** 123:132
**Comment:**
	*Logic Error: Metrics are normalized with a helper that prefers matching dataset columns before metrics, so a saved metric that case-insensitively collides with a column name can be rewritten to the column identifier and stop resolving as a saved metric. Handle metric normalization with metric-aware precedence (or branch on `saved_metric`) to preserve metric identity.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +50 to +55
if "x" not in config:
missing_fields.append("'x' (X-axis temporal column)")
if "y" not in config:
missing_fields.append("'y' (primary Y-axis metrics)")
if "y_secondary" not in config:
missing_fields.append("'y_secondary' (secondary Y-axis metrics)")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The pre-validation only checks for x, y, and y_secondary keys, but the schema explicitly accepts aliases (x_axis, metrics, metrics_b). Valid payloads that use those aliases will be rejected before Pydantic parsing, causing false validation failures. Make pre-validation honor the same aliases as the schema. [api mismatch]

Severity Level: Major ⚠️
- ❌ SchemaValidator rejects mixed_timeseries configs using alias keys.
- ⚠️ Direct callers cannot rely on x_axis/metrics/metrics_b aliases.
Steps of Reproduction ✅
1. Use the schema-level entrypoint `SchemaValidator.validate_request` at
`superset/mcp_service/chart/validation/schema_validator.py:40-57` with a raw request dict
shaped like:

   `{"dataset_id": 1, "config": {"chart_type": "mixed_timeseries", "x_axis": {"name":
   "ds"}, "metrics": [{"name": "revenue", "aggregate": "SUM"}], "metrics_b": [{"name":
   "orders", "aggregate": "COUNT"}]}}`.

2. Inside `_pre_validate` at `schema_validator.py:64-143`, the function extracts `config =
data["config"]` and dispatches to `_pre_validate_chart_type(chart_type, config)` at
`schema_validator.py:145-173`, passing this raw config dict (still using the alias keys
`x_axis`, `metrics`, and `metrics_b`).

3. `_pre_validate_chart_type` looks up the plugin from the registry and calls
`plugin.pre_validate(config)` at `schema_validator.py:190-201`; for
`chart_type="mixed_timeseries"` this resolves to `MixedTimeseriesChartPlugin.pre_validate`
defined in `superset/mcp_service/chart/plugins/mixed_timeseries.py:44-55`.

4. `MixedTimeseriesChartPlugin.pre_validate` only checks for `"x"`, `"y"`, and
`"y_secondary"` keys (lines 50-55) and does not recognize the schema-supported aliases
`x_axis`, `metrics`, and `metrics_b` defined on `MixedTimeseriesChartConfig` in
`superset/mcp_service/chart/schemas.py:18-51`, so `missing_fields` becomes non-empty, a
`ChartGenerationError` is returned, and `SchemaValidator.validate_request` exits early at
line 51, rejecting a payload that would successfully parse via Pydantic if alias keys were
honored.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/plugins/mixed_timeseries.py
**Line:** 50:55
**Comment:**
	*Api Mismatch: The pre-validation only checks for `x`, `y`, and `y_secondary` keys, but the schema explicitly accepts aliases (`x_axis`, `metrics`, `metrics_b`). Valid payloads that use those aliases will be rejected before Pydantic parsing, causing false validation failures. Make pre-validation honor the same aliases as the schema.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +93 to +98
return None

def extract_column_refs(self, config: Any) -> list[ColumnRef]:
if not isinstance(config, MixedTimeseriesChartConfig):
return []
refs: list[ColumnRef] = [config.x]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Metric name normalization rewrites every entry in y/y_secondary without checking saved_metric, so saved metrics can be incorrectly remapped to similarly named dataset columns (column lookup is prioritized), breaking saved-metric resolution later. Skip canonical-column normalization for items marked as saved_metric. [logic error]

Severity Level: Critical 🚨
- ❌ Mixed_timeseries charts using saved metrics may fail at runtime.
- ⚠️ Saved metrics shadowed by columns resolve to wrong identifiers.
Steps of Reproduction ✅
1. Call the MCP tool `generate_chart` defined in
`superset/mcp_service/chart/tool/generate_chart.py:95-132` with a `GenerateChartRequest`
whose `config.chart_type` is `"mixed_timeseries"` and whose `y`/`y_secondary` lists
contain saved metrics, e.g. `{"name": "count", "saved_metric": true}` as described by
`ColumnRef.saved_metric` in `superset/mcp_service/chart/schemas.py:27-58`.

2. Inside `generate_chart`, the code invokes
`ValidationPipeline.validate_request_with_warnings(request.model_dump())` at
`generate_chart.py:205-210`; in `ValidationPipeline.validate_request_with_warnings`
(`superset/mcp_service/chart/validation/pipeline.py:90-145`), after schema and dataset
validation, it calls `_normalize_column_names` at `pipeline.py:227-260`.

3. `_normalize_column_names` delegates to `DatasetValidator.normalize_column_names(config,
dataset_id, dataset_context)` at
`superset/mcp_service/chart/validation/dataset_validator.py:351-398`; this looks up the
`mixed_timeseries` plugin from the registry and calls
`plugin.normalize_column_refs(config, dataset_context)`, which is
`MixedTimeseriesChartPlugin.normalize_column_refs` at
`superset/mcp_service/chart/plugins/mixed_timeseries.py:84-106`.

4. In `normalize_column_refs`, helper `_norm_list` (lines 93-98) iterates over every entry
in `y` and `y_secondary` and unconditionally rewrites `col["name"]` using
`DatasetValidator._get_canonical_column_name(col["name"], dataset_context)`, ignoring
`col.get("saved_metric")`; `_get_canonical_column_name` at `dataset_validator.py:303-334`
searches `dataset_context.available_columns` first and only then `available_metrics`, so
in a dataset that defines both a column `"COUNT"` and a saved metric `"count"`, a
saved-metric reference `{"name": "count", "saved_metric": true}` in `y` is normalized to
`"COUNT"`. Later, `map_mixed_timeseries_config` in
`superset/mcp_service/chart/chart_utils.py:21-46` calls `create_metric_object` on these
`ColumnRef` objects; for saved metrics it returns `col.name` (now `"COUNT"`) as the metric
identifier. Because Superset's metric lookup expects the original metric name (e.g.
`"count"`), the chart's metrics no longer resolve correctly, causing mixed_timeseries
charts using such saved metrics to fail or behave incorrectly at query time.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/plugins/mixed_timeseries.py
**Line:** 93:98
**Comment:**
	*Logic Error: Metric name normalization rewrites every entry in `y`/`y_secondary` without checking `saved_metric`, so saved metrics can be incorrectly remapped to similarly named dataset columns (column lookup is prioritized), breaking saved-metric resolution later. Skip canonical-column normalization for items marked as `saved_metric`.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment thread superset/mcp_service/chart/plugins/table.py Outdated
Comment thread superset/mcp_service/chart/plugins/xy.py Outdated
Comment on lines +203 to +220
is_col_valid, col_error = DatasetValidator.validate_against_dataset(
parsed_config, dataset.id
)
if not is_col_valid and col_error is not None:
logger.warning(
"update_chart column validation failed for chart %s: %s",
getattr(chart, "id", None),
col_error,
)
return GenerateChartResponse.model_validate(
{
"chart": None,
"error": col_error.model_dump(),
"success": False,
"schema_version": "2.0",
"api_version": "v1",
}
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: This adds a full dataset validation call before validate_and_compile, but validate_and_compile already performs the same dataset validation internally using the already-loaded dataset object. The extra call triggers redundant dataset context fetching and duplicate validation work on every update request, increasing latency and database load. [performance]

Severity Level: Major ⚠️
- ⚠️ update_chart validation calls DatasetValidator twice per request.
- ⚠️ Extra dataset queries increase chart-update latency and load.
Steps of Reproduction ✅
1. Call the MCP `update_chart` tool defined at
`superset/mcp_service/chart/tool/update_chart.py:304-595` with `generate_preview=False`
and a non-null `config` so that `parsed_config` is set (`update_chart.py:413-415`) and a
full visualization update runs.

2. After building the update payload via `_build_update_payload`
(`update_chart.py:87-121`), `update_chart` extracts `new_form_data` and, before
persisting, invokes `_validate_update_against_dataset(parsed_config, new_form_data,
chart)` inside the `mcp.update_chart.validation` log context at `update_chart.py:450-453`.

3. Inside `_validate_update_against_dataset` (`update_chart.py:167-251`), once the dataset
ORM object is resolved (`dataset` from `DatasetDAO.find_by_id` at 178-180), the new code
block at lines 199-205 calls `DatasetValidator.validate_against_dataset(parsed_config,
dataset.id)` without passing a `dataset_context`, causing `DatasetValidator` to fetch
dataset context from the database via `_get_dataset_context`
(`superset/mcp_service/chart/validation/dataset_validator.py:197-253`) and run full column
and aggregation validation once.

4. Immediately afterwards, `_validate_update_against_dataset` calls
`validate_and_compile(parsed_config, form_data, dataset, run_compile_check=True)` at
`update_chart.py:222-224`, and `validate_and_compile` in
`superset/mcp_service/chart/compile.py:19-84` builds a new `dataset_context` from the same
ORM dataset and calls `DatasetValidator.validate_against_dataset(config, dataset.id,
dataset_context=dataset_context)` again (compile.py:48-52), resulting in two complete
Tier-1 dataset validation passes and two dataset-context constructions for every
update_chart request.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 203:220
**Comment:**
	*Performance: This adds a full dataset validation call before `validate_and_compile`, but `validate_and_compile` already performs the same dataset validation internally using the already-loaded dataset object. The extra call triggers redundant dataset context fetching and duplicate validation work on every update request, increasing latency and database load.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

aminghadersohi and others added 2 commits May 21, 2026 22:37
Add _get_canonical_metric_name() to DatasetValidator that searches only
available_metrics, preventing a column with matching case-insensitive name
from shadowing a saved metric's canonical casing.

Update all 7 chart plugins (xy, table, pie, big_number, handlebars,
mixed_timeseries, pivot_table) to branch on saved_metric flag: saved
metrics now go through _get_canonical_metric_name while regular column
refs continue to use _get_canonical_column_name.

Fix pre_validate alias handling in xy and mixed_timeseries plugins to
accept Pydantic AliasChoices keys (metrics/x_axis/metrics_b) so payloads
using canonical Superset field names are not incorrectly rejected.

Add TestGetCanonicalMetricName, TestSavedMetricNormalizationCorrectness,
and TestPreValidateAliasHandling test classes covering the collision case
and alias acceptance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A saved metric and a regular column with the same input name resolve
to different display labels after normalization (saved metrics use the
dataset's actual casing).  Using a plain string key incorrectly flags
them as duplicates; keying on (saved_metric, label) avoids the false
collision.

Fixes test_xy_saved_metric_uses_metric_casing.
Copy link
Copy Markdown
Contributor

@bito-code-review bito-code-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #8680f9

Actionable Suggestions - 7
  • superset/mcp_service/chart/schemas.py - 1
  • superset/mcp_service/chart/plugin.py - 1
  • superset/mcp_service/chart/tool/update_chart.py - 2
  • superset/mcp_service/chart/plugins/handlebars.py - 1
  • tests/unit_tests/mcp_service/chart/validation/test_runtime_validator.py - 1
  • superset/mcp_service/chart/validation/runtime/__init__.py - 1
Additional Suggestions - 6
  • superset/initialization/__init__.py - 1
    • Missing error handling for MCP init · Line 750-750
      Wrap the call to `configure_mcp_chart_registry()` in a try-except block to catch any exceptions from `registry.configure()` and log them so that `init_app()` can continue even if MCP chart registry configuration fails.
  • superset/mcp_service/chart/validation/schema_validator.py - 2
    • Import inside function body · Line 151-152
      The `from superset.mcp_service.chart.registry import get_registry` import is placed inside the function body at lines 151-152. This causes the import to execute on every call to `_pre_validate_chart_type`. Move it to the module-level imports for better performance.
      Code suggestion
      --- superset/mcp_service/chart/validation/schema_validator.py
      +++ superset/mcp_service/chart/validation/schema_validator.py
       @@ -29,6 +29,7 @@ from superset.mcp_service.chart.schemas import (
        )
        from superset.mcp_service.common.error_schemas import ChartGenerationError
      +from superset.mcp_service.chart.registry import get_registry
       
        logger = logging.getLogger(__name__)
       
       @@ -148,9 +149,6 @@ class SchemaValidator:
                config: Dict[str, Any],
            ) -> Tuple[bool, ChartGenerationError | None]:
                """Validate chart type and dispatch to plugin pre-validation."""
      -        from superset.mcp_service.chart.registry import get_registry
      -
                registry = get_registry()
    • Misleading variable name · Line 176-176
      Variable `valid_types` on line 176 is misleading — `registry.all_types()` returns only enabled chart types (per registry.py:168), so the name suggests all registered types when it actually contains only enabled ones. Rename to `enabled_types` for clarity.
      Code suggestion
      --- superset/mcp_service/chart/validation/schema_validator.py (lines 175-182)
      +++ superset/mcp_service/chart/validation/schema_validator.py
       @@ -173,12 +173,12 @@ class SchemaValidator:
                    )
       
                if not registry.is_enabled(chart_type):
      -            valid_types = ", ".join(registry.all_types())
      +            enabled_types = ", ".join(registry.all_types())
                    return False, ChartGenerationError(
                        error_type="disabled_chart_type",
                        message=f"Chart type '{chart_type}' is not enabled on this instance",
                        details=f"Chart type '{chart_type}' is registered but has been "
                        f"disabled by the operator. "
      -                f"Enabled chart types: {valid_types}",
      +                f"Enabled chart types: {enabled_types}",
                        suggestions=[
      -                    f"Use one of the enabled chart types: {valid_types}",
      +                    f"Use one of the enabled chart types: {enabled_types}",
                            "Contact your administrator if you believe this is an error",
                        ],
                        error_code="DISABLED_CHART_TYPE",
                    )
  • superset/mcp_service/chart/plugins/xy.py - 3
    • Private member access in DatasetValidator · Line 114-114
      Accessing private method `_get_canonical_column_name`. Consider using a public API or adding a public wrapper method.
    • Private member access in DatasetValidator · Line 115-115
      Accessing private method `_get_canonical_metric_name`. Consider using a public API or adding a public wrapper method.
    • Private member access in DatasetValidator · Line 129-129
      Accessing private method `_normalize_filters`. Consider using a public API or adding a public wrapper method.
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • superset/mcp_service/chart/registry.py - 1
Review Details
  • Files reviewed - 27 · Commit Range: d3af275..f90f1cd
    • superset/initialization/__init__.py
    • superset/mcp_service/app.py
    • superset/mcp_service/chart/chart_utils.py
    • superset/mcp_service/chart/plugin.py
    • superset/mcp_service/chart/plugins/__init__.py
    • superset/mcp_service/chart/plugins/big_number.py
    • superset/mcp_service/chart/plugins/handlebars.py
    • superset/mcp_service/chart/plugins/mixed_timeseries.py
    • superset/mcp_service/chart/plugins/pie.py
    • superset/mcp_service/chart/plugins/pivot_table.py
    • superset/mcp_service/chart/plugins/table.py
    • superset/mcp_service/chart/plugins/xy.py
    • superset/mcp_service/chart/registry.py
    • superset/mcp_service/chart/schemas.py
    • superset/mcp_service/chart/tool/generate_chart.py
    • superset/mcp_service/chart/tool/update_chart.py
    • superset/mcp_service/chart/validation/dataset_validator.py
    • superset/mcp_service/chart/validation/runtime/__init__.py
    • superset/mcp_service/chart/validation/schema_validator.py
    • superset/mcp_service/flask_singleton.py
    • superset/mcp_service/mcp_config.py
    • tests/unit_tests/mcp_service/chart/test_big_number_chart.py
    • tests/unit_tests/mcp_service/chart/test_registry.py
    • tests/unit_tests/mcp_service/chart/test_registry_filters.py
    • tests/unit_tests/mcp_service/chart/tool/test_update_chart.py
    • tests/unit_tests/mcp_service/chart/validation/test_column_name_normalization.py
    • tests/unit_tests/mcp_service/chart/validation/test_runtime_validator.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment on lines +491 to +496
try:
from superset.mcp_service.chart.registry import display_name_for_viz_type

_display_name = display_name_for_viz_type(_viz_type) if _viz_type else None
except Exception:
_display_name = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWE-390: Bare Exception Handler

Replace the broad except Exception: with except (ImportError, AttributeError): to avoid masking unexpected errors while still handling expected import or attribute issues. (See also: CWE-390)

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment thread superset/mcp_service/chart/plugin.py Outdated
"""
...

def schema_error_hint(self) -> "ChartGenerationError | None":
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Invalid type annotation string

Remove the quotes around the return type annotation in the ChartTypePlugin protocol for schema_error_hint so it reads: def schema_error_hint(self) -> ChartGenerationError | None:.

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment on lines +203 to +204
is_col_valid, col_error = DatasetValidator.validate_against_dataset(
parsed_config, dataset.id
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant DB query in validation

Lines 203-204 call DatasetValidator.validate_against_dataset(parsed_config, dataset.id) which internally invokes _get_dataset_context(dataset_id) to fetch the dataset from the database. However, _validate_update_against_dataset already has the dataset available from lines 178-180. This causes a redundant database query per update operation. Pass the pre-fetched dataset via the dataset_context parameter to eliminate the duplicate lookup.

Code suggestion
Check the AI-generated fix before applying
 --- superset/mcp_service/chart/tool/update_chart.py
 +++ superset/mcp_service/chart/tool/update_chart.py
 @@ -200,8 +200,14 @@ def _validate_update_against_dataset(
      # Column existence + fuzzy-match validation
      # (mirrors generate_chart pipeline layer 2)
      from superset.mcp_service.chart.validation.dataset_validator import DatasetValidator
 +    from superset.mcp_service.chart.compile import build_dataset_context_from_orm
 +
 +    # Reuse the already-fetched dataset to avoid a redundant DB query
 +    dataset_context = build_dataset_context_from_orm(dataset)
 
 -    is_col_valid, col_error = DatasetValidator.validate_against_dataset(
 -        parsed_config, dataset.id
 +    is_col_valid, col_error = DatasetValidator.validate_against_dataset(
 +        parsed_config, dataset.id, dataset_context=dataset_context
      )

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment on lines +416 to +432
# Normalize column case to match dataset canonical names
# (mirrors generate_chart pipeline layer 4)
chart_datasource_id = getattr(chart, "datasource_id", None)
if parsed_config is not None and chart_datasource_id is not None:
from superset.mcp_service.chart.validation.dataset_validator import (
DatasetValidator,
NORMALIZATION_EXCEPTIONS,
)

try:
parsed_config = DatasetValidator.normalize_column_names(
parsed_config, chart.datasource_id
)
except NORMALIZATION_EXCEPTIONS as e:
logger.warning(
"Column normalization failed for chart %s: %s", chart.id, e
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for normalization

The new column normalization feature (lines 416-432) has no test coverage. Tests should verify that column names are properly normalized to match dataset canonical names and that exceptions are handled gracefully without crashing the update operation.

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment on lines +44 to +124
def pre_validate(
self,
config: dict[str, Any],
) -> ChartGenerationError | None:
if "handlebars_template" not in config:
return ChartGenerationError(
error_type="missing_handlebars_template",
message="Handlebars chart missing required field: handlebars_template",
details=(
"Handlebars charts require a 'handlebars_template' string "
"containing Handlebars HTML template markup"
),
suggestions=[
"Add 'handlebars_template' with a Handlebars HTML template",
"Data is available as {{data}} array in the template",
"Example: '<ul>{{#each data}}<li>{{this.name}}: "
"{{this.value}}</li>{{/each}}</ul>'",
],
error_code="MISSING_HANDLEBARS_TEMPLATE",
)

template = config.get("handlebars_template")
if not isinstance(template, str) or not template.strip():
return ChartGenerationError(
error_type="invalid_handlebars_template",
message="Handlebars template must be a non-empty string",
details=(
"The 'handlebars_template' field must be a non-empty string "
"containing valid Handlebars HTML template markup"
),
suggestions=[
"Ensure handlebars_template is a non-empty string",
"Example: '<ul>{{#each data}}<li>{{this.name}}</li>{{/each}}</ul>'",
],
error_code="INVALID_HANDLEBARS_TEMPLATE",
)

query_mode = config.get("query_mode", "aggregate")
if query_mode not in ("aggregate", "raw"):
return ChartGenerationError(
error_type="invalid_query_mode",
message="Invalid query_mode for handlebars chart",
details="query_mode must be either 'aggregate' or 'raw'",
suggestions=[
"Use 'aggregate' for aggregated data (default)",
"Use 'raw' for individual rows",
],
error_code="INVALID_QUERY_MODE",
)

if query_mode == "raw" and not config.get("columns"):
return ChartGenerationError(
error_type="missing_raw_columns",
message="Handlebars chart in 'raw' mode requires 'columns'",
details=(
"When query_mode is 'raw', you must specify which columns "
"to include in the query results"
),
suggestions=[
"Add 'columns': [{'name': 'column_name'}] for raw mode",
"Or use query_mode='aggregate' with 'metrics' and optional 'groupby'", # noqa: E501
],
error_code="MISSING_RAW_COLUMNS",
)

if query_mode == "aggregate" and not config.get("metrics"):
return ChartGenerationError(
error_type="missing_aggregate_metrics",
message="Handlebars chart in 'aggregate' mode requires 'metrics'",
details=(
"When query_mode is 'aggregate' (default), you must specify "
"at least one metric with an aggregate function"
),
suggestions=[
"Add 'metrics': [{'name': 'column', 'aggregate': 'SUM'}]",
"Or use query_mode='raw' with 'columns' for individual rows",
],
error_code="MISSING_AGGREGATE_METRICS",
)

return None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate validation with schema

The pre_validate method duplicates validation logic already implemented in HandlebarsChartConfig.validate_query_fields() (schemas.py:1055-1087). This includes checking for missing metrics in aggregate mode (line 109) and missing columns in raw mode (line 94). The schema validator also performs additional checks (mutual exclusivity, aggregate function presence) that pre_validate lacks. Having both creates maintenance divergence risk.

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Comment on lines 213 to 234
@@ -226,28 +220,15 @@ def test_validate_table_chart_skips_xy_validations(self):
],
)

# These should not be called for table charts
with (
patch(
"superset.mcp_service.chart.validation.runtime.RuntimeValidator."
"_validate_format_compatibility"
) as mock_format,
patch(
"superset.mcp_service.chart.validation.runtime.RuntimeValidator."
"_validate_cardinality"
) as mock_cardinality,
patch(
"superset.mcp_service.chart.validation.runtime.RuntimeValidator."
"_validate_chart_type"
) as mock_chart_type,
):
# Mock chart type validator to return no warnings
# Plugin runtime dispatches to TableChartPlugin which returns no warnings.
# Chart type suggester is also stubbed to return no warnings.
with patch(
"superset.mcp_service.chart.validation.runtime.RuntimeValidator."
"_validate_chart_type"
) as mock_chart_type:
mock_chart_type.return_value = ([], [])

is_valid, error = RuntimeValidator.validate_runtime_issues(config, 1)

# Format and cardinality validation should not be called for table charts
mock_format.assert_not_called()
mock_cardinality.assert_not_called()
assert is_valid is True
assert error is None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coverage regression

Test test_validate_table_chart_skips_xy_validations no longer verifies that XY-specific validations are skipped. Original assertions mock_format.assert_not_called() and mock_cardinality.assert_not_called() were removed, weakening test coverage for the skip behavior.

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

if plugin is None:
return []
return plugin.get_runtime_warnings(config, dataset_id)
except Exception as exc:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid catching blind Exception

Replace bare Exception catch with specific exception types (e.g., ImportError, AttributeError, RuntimeError) to avoid masking unexpected errors.

Code suggestion
Check the AI-generated fix before applying
Suggested change
except Exception as exc:
except (ImportError, AttributeError, RuntimeError) as exc:

Code Review Run #8680f9


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

… redundant annotation quotes

- schemas.py: CWE-390 bare except → add `as exc` + debug log so display-name
  lookup failures are observable (BLE001 suppressed: intentional fallback)
- plugin.py: remove redundant forward-reference quotes from schema_error_hint
  return type (from __future__ import annotations already makes all annotations
  lazy strings)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@bito-code-review
Copy link
Copy Markdown
Contributor

bito-code-review Bot commented May 23, 2026

Code Review Agent Run #40e077

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: f90f1cd..42a293a
    • superset/mcp_service/chart/plugin.py
    • superset/mcp_service/chart/schemas.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:backend Requires changing the backend review:draft risk:refactor High risk as it involves large refactoring work size/XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants