Skip to content

fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls#38625

Merged
kgabryje merged 1 commit intoapache:masterfrom
kgabryje:fix/mcp_validation
Mar 13, 2026
Merged

fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls#38625
kgabryje merged 1 commit intoapache:masterfrom
kgabryje:fix/mcp_validation

Conversation

@kgabryje
Copy link
Member

@kgabryje kgabryje commented Mar 13, 2026

User description

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Reduces failed MCP tool calls by giving LLMs better error messages and accepting common field name
variations:

  1. Actionable validation errors — The parse_request decorator now catches Pydantic ValidationError and
    re-raises it as a ToolError with field-level details and a list of required fields, so the LLM can
    self-correct on retry instead of getting raw Pydantic internals.
  2. Field alias tolerance — Adds AliasChoices to OpenSqlLabRequest so both
    database_id/database_connection_id and sql/query are accepted, since LLMs frequently use these
    interchangeably.
  3. Defensive type check in execute_sql — Validates that statement data is a DataFrame before calling
    .to_dict(), returning a proper error response instead of an unhandled exception.

Also updates MCP app instructions to guide LLMs toward list_datasets for finding database_id and uses
consistent sql terminology.

TESTING INSTRUCTIONS

  • Run the updated unit tests: pytest tests/unit_tests/mcp_service/utils/test_schema_utils.py
  • Verify validation errors now raise ToolError with required field names
  • Test open_sql_lab_with_context with both database_id and database_connection_id, and both sql and
    query

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

cc @aminghadersohi


CodeAnt-AI Description

Improve MCP tool request validation, field aliasing, and SQL result handling

What Changed

  • Validation errors from MCP tool requests are converted into clear ToolError messages that list which fields failed and which fields are required, enabling LLMs to self-correct on retry.
  • Open SQL Lab tool accepts common field name variations: database_id or database_connection_id for the database, and sql or query for the pre-filled SQL; instructions now guide LLMs to use list_datasets to find a dataset's database_id.
  • SQL execution now detects when returned statement data is not a table and returns a controlled error with an explanation instead of raising an internal exception.
  • Unit tests updated to expect ToolError with field details for invalid tool requests.

Impact

✅ Fewer failed LLM tool calls due to unclear validation errors
✅ Clearer request error messages for LLM retries
✅ Fewer internal crashes when SQL returns unexpected data

💡 Usage Guide

Checking Your Pull Request

Every time you make a pull request, our system automatically looks through it. We check for security issues, mistakes in how you're setting up your infrastructure, and common code problems. We do this to make sure your changes are solid and won't cause any trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again, by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Mar 13, 2026

Code Review Agent Run #d515f1

Actionable Suggestions - 0
Review Details
  • Files reviewed - 6 · Commit Range: 882e030..882e030
    • superset/mcp_service/app.py
    • superset/mcp_service/sql_lab/schemas.py
    • superset/mcp_service/sql_lab/tool/execute_sql.py
    • superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py
    • superset/mcp_service/utils/schema_utils.py
    • tests/unit_tests/mcp_service/utils/test_schema_utils.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot bot added the change:backend Requires changing the backend label Mar 13, 2026
@codeant-ai-for-open-source codeant-ai-for-open-source bot added the size:M This PR changes 30-99 lines, ignoring generated files label Mar 13, 2026
@codeant-ai-for-open-source
Copy link
Contributor

Sequence Diagram

This PR improves MCP tool reliability by accepting common field aliases and converting validation failures into actionable ToolError messages. It also adds a defensive type check when converting SQL execution results to prevent unhandled exceptions.

sequenceDiagram
    participant LLM as LLM Client
    participant MCP as MCP Tool Layer
    participant Schema as Request Schema
    participant SQL as SQL Processing

    LLM->>MCP: Call SQL tool with parameters
    MCP->>Schema: Parse request with alias support

    alt Invalid or missing required fields
        Schema-->>MCP: Validation errors
        MCP-->>LLM: ToolError with field details and required fields
    else Valid request
        Schema-->>MCP: Normalized request
        MCP->>SQL: Execute query or build SQL Lab context
        SQL-->>MCP: Statement result data
        alt Data is table format
            MCP-->>LLM: Success response with rows or SQL Lab URL
        else Data has unexpected type
            MCP-->>LLM: Structured data conversion error
        end
    end
Loading

Generated by CodeAnt AI

Comment on lines +132 to +136
sql: str | None = Field(
None,
description="SQL to pre-populate in the editor",
validation_alias=AliasChoices("sql", "query"),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to None. [logic error]

Severity Level: Major ⚠️
- ⚠️ SQL Lab context autofill skipped for whitespace SQL.
- ⚠️ Users receive blank/invalid prefilled editor query.
Suggested change
sql: str | None = Field(
None,
description="SQL to pre-populate in the editor",
validation_alias=AliasChoices("sql", "query"),
)
sql: str | None = Field(
None,
description="SQL to pre-populate in the editor",
validation_alias=AliasChoices("sql", "query"),
)
@field_validator("sql")
@classmethod
def normalize_sql(cls, v: str | None) -> str | None:
if v is None:
return None
v = v.strip()
return v or None
Steps of Reproduction ✅
1. Start MCP server where `open_sql_lab_with_context` is registered via
`superset/mcp_service/app.py:416-418` and documented as a normal workflow in
`superset/mcp_service/app.py:106-108`.

2. Invoke tool `open_sql_lab_with_context` (function at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:43`) with
`database_id`/`database_connection_id`, `dataset_in_context`, and `sql` (or alias `query`)
set to whitespace like `" "`.

3. Request passes through `@parse_request(OpenSqlLabRequest)` at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:42`; `parse_request` calls
`model_validate` in `superset/mcp_service/utils/schema_utils.py:190-194`, and
`OpenSqlLabRequest.sql` has no normalizing validator in
`superset/mcp_service/sql_lab/schemas.py:132-136`, so whitespace is accepted unchanged.

4. In `open_sql_lab_with_context`, `if request.sql:` at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:73` is true for
whitespace, so `params["sql"]` is set; then fallback context SQL generation guarded by `if
not request.sql:` at line 81 is skipped, producing a SQL Lab URL without the intended
dataset-context query.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/sql_lab/schemas.py
**Line:** 132:136
**Comment:**
	*Logic Error: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to `None`.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
👍 | 👎

@codecov
Copy link

codecov bot commented Mar 13, 2026

Codecov Report

❌ Patch coverage is 12.50000% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.40%. Comparing base (56d6bb1) to head (882e030).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/utils/schema_utils.py 0.00% 10 Missing ⚠️
superset/mcp_service/sql_lab/tool/execute_sql.py 0.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #38625      +/-   ##
==========================================
- Coverage   65.02%   64.40%   -0.63%     
==========================================
  Files        1817     2529     +712     
  Lines       72319   128964   +56645     
  Branches    23033    29723    +6690     
==========================================
+ Hits        47028    83054   +36026     
- Misses      25291    44464   +19173     
- Partials        0     1446    +1446     
Flag Coverage Δ
hive 40.75% <12.50%> (?)
mysql 61.88% <12.50%> (?)
postgres 61.95% <12.50%> (?)
presto 40.76% <12.50%> (?)
python 63.57% <12.50%> (?)
sqlite 61.58% <12.50%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@kgabryje kgabryje merged commit d91b968 into apache:master Mar 13, 2026
85 checks passed
michael-s-molina pushed a commit that referenced this pull request Mar 17, 2026
aminghadersohi pushed a commit to aminghadersohi/superset that referenced this pull request Mar 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:backend Requires changing the backend size/M size:M This PR changes 30-99 lines, ignoring generated files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants