fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls by kgabryje · Pull Request #38625 · apache/superset

kgabryje · 2026-03-13T09:52:29Z

User description

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Reduces failed MCP tool calls by giving LLMs better error messages and accepting common field name
variations:

Actionable validation errors — The parse_request decorator now catches Pydantic ValidationError and
re-raises it as a ToolError with field-level details and a list of required fields, so the LLM can
self-correct on retry instead of getting raw Pydantic internals.
Field alias tolerance — Adds AliasChoices to OpenSqlLabRequest so both
database_id/database_connection_id and sql/query are accepted, since LLMs frequently use these
interchangeably.
Defensive type check in execute_sql — Validates that statement data is a DataFrame before calling
.to_dict(), returning a proper error response instead of an unhandled exception.

Also updates MCP app instructions to guide LLMs toward list_datasets for finding database_id and uses
consistent sql terminology.

TESTING INSTRUCTIONS

Run the updated unit tests: pytest tests/unit_tests/mcp_service/utils/test_schema_utils.py
Verify validation errors now raise ToolError with required field names
Test open_sql_lab_with_context with both database_id and database_connection_id, and both sql and
query

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

cc @aminghadersohi

CodeAnt-AI Description

Improve MCP tool request validation, field aliasing, and SQL result handling

What Changed

Validation errors from MCP tool requests are converted into clear ToolError messages that list which fields failed and which fields are required, enabling LLMs to self-correct on retry.
Open SQL Lab tool accepts common field name variations: database_id or database_connection_id for the database, and sql or query for the pre-filled SQL; instructions now guide LLMs to use list_datasets to find a dataset's database_id.
SQL execution now detects when returned statement data is not a table and returns a controlled error with an explanation instead of raising an internal exception.
Unit tests updated to expect ToolError with field details for invalid tool requests.

Impact

✅ Fewer failed LLM tool calls due to unclear validation errors
✅ Clearer request error messages for LLM retries
✅ Fewer internal crashes when SQL returns unexpected data

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

…d LLM tool calls

bito-code-review · 2026-03-13T09:52:39Z

Code Review Agent Run #d515f1

Actionable Suggestions - 0

Review Details

Files reviewed - 6 · Commit Range: 882e030..882e030
- superset/mcp_service/app.py
- superset/mcp_service/sql_lab/schemas.py
- superset/mcp_service/sql_lab/tool/execute_sql.py
- superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py
- superset/mcp_service/utils/schema_utils.py
- tests/unit_tests/mcp_service/utils/test_schema_utils.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

/review - Manually triggers a full AI review.
/pause - Pauses automatic reviews on this pull request.
/resume - Resumes automatic reviews.
/resolve - Marks all Bito-posted review comments as resolved.
/abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by

codeant-ai-for-open-source · 2026-03-13T09:54:02Z

Sequence Diagram

This PR improves MCP tool reliability by accepting common field aliases and converting validation failures into actionable ToolError messages. It also adds a defensive type check when converting SQL execution results to prevent unhandled exceptions.

sequenceDiagram
    participant LLM as LLM Client
    participant MCP as MCP Tool Layer
    participant Schema as Request Schema
    participant SQL as SQL Processing

    LLM->>MCP: Call SQL tool with parameters
    MCP->>Schema: Parse request with alias support

    alt Invalid or missing required fields
        Schema-->>MCP: Validation errors
        MCP-->>LLM: ToolError with field details and required fields
    else Valid request
        Schema-->>MCP: Normalized request
        MCP->>SQL: Execute query or build SQL Lab context
        SQL-->>MCP: Statement result data
        alt Data is table format
            MCP-->>LLM: Success response with rows or SQL Lab URL
        else Data has unexpected type
            MCP-->>LLM: Structured data conversion error
        end
    end

Generated by CodeAnt AI

codeant-ai-for-open-source · 2026-03-13T09:58:06Z

superset/mcp_service/sql_lab/schemas.py

+    sql: str | None = Field(
+        None,
+        description="SQL to pre-populate in the editor",
+        validation_alias=AliasChoices("sql", "query"),
+    )


Suggestion: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to None. [logic error]

Severity Level: Major ⚠️

- ⚠️ SQL Lab context autofill skipped for whitespace SQL. - ⚠️ Users receive blank/invalid prefilled editor query.

Suggested change

sql: str | None = Field(

None,

description="SQL to pre-populate in the editor",

validation_alias=AliasChoices("sql", "query"),

)

sql: str | None = Field(

None,

description="SQL to pre-populate in the editor",

validation_alias=AliasChoices("sql", "query"),

)

@field_validator("sql")

@classmethod

def normalize_sql(cls, v: str | None) -> str | None:

if v is None:

return None

v = v.strip()

return v or None

Steps of Reproduction ✅

1. Start MCP server where `open_sql_lab_with_context` is registered via `superset/mcp_service/app.py:416-418` and documented as a normal workflow in `superset/mcp_service/app.py:106-108`. 2. Invoke tool `open_sql_lab_with_context` (function at `superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:43`) with `database_id`/`database_connection_id`, `dataset_in_context`, and `sql` (or alias `query`) set to whitespace like `" "`. 3. Request passes through `@parse_request(OpenSqlLabRequest)` at `superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:42`; `parse_request` calls `model_validate` in `superset/mcp_service/utils/schema_utils.py:190-194`, and `OpenSqlLabRequest.sql` has no normalizing validator in `superset/mcp_service/sql_lab/schemas.py:132-136`, so whitespace is accepted unchanged. 4. In `open_sql_lab_with_context`, `if request.sql:` at `superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:73` is true for whitespace, so `params["sql"]` is set; then fallback context SQL generation guarded by `if not request.sql:` at line 81 is skipped, producing a SQL Lab URL without the intended dataset-context query.

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** superset/mcp_service/sql_lab/schemas.py **Line:** 132:136 **Comment:** *Logic Error: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to `None`. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

👍 | 👎

codecov · 2026-03-13T10:03:47Z

Codecov Report

❌ Patch coverage is 12.50000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.40%. Comparing base (56d6bb1) to head (882e030).
⚠️ Report is 1 commits behind head on master.

Files with missing lines	Patch %	Lines
superset/mcp_service/utils/schema_utils.py	0.00%	10 Missing ⚠️
superset/mcp_service/sql_lab/tool/execute_sql.py	0.00%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #38625      +/-   ##
==========================================
- Coverage   65.02%   64.40%   -0.63%     
==========================================
  Files        1817     2529     +712     
  Lines       72319   128964   +56645     
  Branches    23033    29723    +6690     
==========================================
+ Hits        47028    83054   +36026     
- Misses      25291    44464   +19173     
- Partials        0     1446    +1446

Flag	Coverage Δ
hive	`40.75% <12.50%> (?)`
mysql	`61.88% <12.50%> (?)`
postgres	`61.95% <12.50%> (?)`
presto	`40.76% <12.50%> (?)`
python	`63.57% <12.50%> (?)`
sqlite	`61.58% <12.50%> (?)`
unit	`100.00% <ø> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…d LLM tool calls (#38625) (cherry picked from commit d91b968)

…d LLM tool calls (apache#38625)

fix(mcp): Improve validation errors and field aliases to reduce faile…

882e030

…d LLM tool calls

kgabryje requested a review from Antonio-RiveroMartnez March 13, 2026 09:52

pull-request-size bot added the size/M label Mar 13, 2026

dosubot bot added the change:backend Requires changing the backend label Mar 13, 2026

codeant-ai-for-open-source bot added the size:M This PR changes 30-99 lines, ignoring generated files label Mar 13, 2026

Antonio-RiveroMartnez approved these changes Mar 13, 2026

View reviewed changes

codeant-ai-for-open-source bot reviewed Mar 13, 2026

View reviewed changes

aminghadersohi approved these changes Mar 13, 2026

View reviewed changes

kgabryje merged commit d91b968 into apache:master Mar 13, 2026
85 checks passed

michael-s-molina pushed a commit that referenced this pull request Mar 17, 2026

fix(mcp): Improve validation errors and field aliases to reduce faile…

d170dec

…d LLM tool calls (#38625) (cherry picked from commit d91b968)

aminghadersohi pushed a commit to aminghadersohi/superset that referenced this pull request Mar 17, 2026

fix(mcp): Improve validation errors and field aliases to reduce faile…

0b111de

…d LLM tool calls (apache#38625)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls#38625

fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls#38625
kgabryje merged 1 commit intoapache:masterfrom
kgabryje:fix/mcp_validation

kgabryje commented Mar 13, 2026 •

edited by codeant-ai-for-open-source bot

Loading

Uh oh!

bito-code-review bot commented Mar 13, 2026 •

edited

Loading

Code Review Agent Run #d515f1

Uh oh!

codeant-ai-for-open-source bot commented Mar 13, 2026

Uh oh!

codeant-ai-for-open-source bot Mar 13, 2026

Uh oh!

codecov bot commented Mar 13, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

-    sql: str | None = Field(
-        None,
-        description="SQL to pre-populate in the editor",
-        validation_alias=AliasChoices("sql", "query"),
-    )
+    sql: str | None = Field(
+        None,
+        description="SQL to pre-populate in the editor",
+        validation_alias=AliasChoices("sql", "query"),
+    )
+    @field_validator("sql")
+    @classmethod
+    def normalize_sql(cls, v: str | None) -> str | None:
+        if v is None:
+            return None
+        v = v.strip()
+        return v or None

Conversation

kgabryje commented Mar 13, 2026 • edited by codeant-ai-for-open-source bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

CodeAnt-AI Description

What Changed

Impact

Checking Your Pull Request

Talking to CodeAnt AI

Example

Preserve Org Learnings with CodeAnt

Example

Retrigger review

Check Your Repository Health

Uh oh!

bito-code-review bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Agent Run #d515f1

Uh oh!

codeant-ai-for-open-source bot commented Mar 13, 2026

Sequence Diagram

Uh oh!

codeant-ai-for-open-source bot Mar 13, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Mar 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kgabryje commented Mar 13, 2026 •

edited by codeant-ai-for-open-source bot

Loading

bito-code-review bot commented Mar 13, 2026 •

edited

Loading

codecov bot commented Mar 13, 2026 •

edited

Loading