fix(mcp): Improve validation errors and field aliases to reduce failed LLM tool calls #38625
Code Review Agent Run #d515f1
Actionable Suggestions - 0
Sequence Diagram
This PR improves MCP tool reliability by accepting common field aliases and converting validation failures into actionable ToolError messages. It also adds a defensive type check when converting SQL execution results to prevent unhandled exceptions.
sequenceDiagram
participant LLM as LLM Client
participant MCP as MCP Tool Layer
participant Schema as Request Schema
participant SQL as SQL Processing
LLM->>MCP: Call SQL tool with parameters
MCP->>Schema: Parse request with alias support
alt Invalid or missing required fields
Schema-->>MCP: Validation errors
MCP-->>LLM: ToolError with field details and required fields
else Valid request
Schema-->>MCP: Normalized request
MCP->>SQL: Execute query or build SQL Lab context
SQL-->>MCP: Statement result data
alt Data is table format
MCP-->>LLM: Success response with rows or SQL Lab URL
else Data has unexpected type
MCP-->>LLM: Structured data conversion error
end
end
Generated by CodeAnt AI
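The request path in the diagram (alias-aware parsing, then either a normalized request or an actionable error) can be sketched without any dependencies. This is only an illustration of the idea, not the PR's implementation, which relies on Pydantic's `AliasChoices`. The `ToolError` class here is a stand-in, and `FIELD_ALIASES`, `REQUIRED_FIELDS`, and the choice of `database_id` as the canonical name are assumptions made for the example.

```python
class ToolError(Exception):
    """Hypothetical stand-in for the MCP framework's ToolError."""


# Canonical field name -> accepted spellings (assumed mapping for illustration).
FIELD_ALIASES = {
    "database_id": ("database_id", "database_connection_id"),
    "sql": ("sql", "query"),
}
# Assumed required fields for a SQL execution request.
REQUIRED_FIELDS = ("database_id", "sql")


def parse_request(raw: dict) -> dict:
    """Map aliases to canonical names, then report missing fields clearly."""
    normalized = {}
    for canonical, spellings in FIELD_ALIASES.items():
        for name in spellings:
            if name in raw:
                normalized[canonical] = raw[name]
                break  # first matching spelling wins

    missing = [field for field in REQUIRED_FIELDS if field not in normalized]
    if missing:
        details = "; ".join(f"{field}: field required" for field in missing)
        raise ToolError(
            f"Invalid request: {details}. "
            f"Required fields: {', '.join(REQUIRED_FIELDS)}"
        )
    return normalized
```

A caller that sends `{"database_connection_id": 1, "query": "SELECT 1"}` gets back a request keyed by `database_id` and `sql`, while a caller that omits the database field receives a message naming the missing field and the full list of required fields, which is the kind of feedback an LLM can act on when retrying.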
sql: str | None = Field(
    None,
    description="SQL to pre-populate in the editor",
    validation_alias=AliasChoices("sql", "query"),
)
Suggestion: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to None. [logic error]
Severity Level: Major ⚠️
- ⚠️ SQL Lab context autofill skipped for whitespace SQL.
- ⚠️ Users receive blank/invalid prefilled editor query.

sql: str | None = Field(
    None,
    description="SQL to pre-populate in the editor",
    validation_alias=AliasChoices("sql", "query"),
)

@field_validator("sql")
@classmethod
def normalize_sql(cls, v: str | None) -> str | None:
    if v is None:
        return None
    v = v.strip()
    return v or None
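The effect of the suggested normalization can be sketched in plain Python, independent of Pydantic. The `build_params` function and its fallback query are hypothetical and only mimic the truthiness checks the suggestion refers to; the real tool builds its dataset-context SQL differently.

```python
from __future__ import annotations


def normalize_sql(value: str | None) -> str | None:
    """Trim the SQL string and treat empty or whitespace-only input as None."""
    if value is None:
        return None
    value = value.strip()
    return value or None


def build_params(sql: str | None, dataset: str | None) -> dict:
    """Hypothetical stand-in for the tool's truthiness-based branching."""
    params = {}
    if sql:
        params["sql"] = sql
    if not sql and dataset:
        # Assumed fallback query, for illustration only.
        params["sql"] = f"SELECT * FROM {dataset} LIMIT 100"
    return params


# Whitespace is truthy, so the fallback is skipped without normalization.
raw = build_params("   ", "sales")
fixed = build_params(normalize_sql("   "), "sales")
```

Here `raw` carries the blank string through to the editor, while `fixed` falls back to the generated query because the normalized value is `None`.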
Steps of Reproduction ✅
1. Start MCP server where `open_sql_lab_with_context` is registered via
`superset/mcp_service/app.py:416-418` and documented as a normal workflow in
`superset/mcp_service/app.py:106-108`.
2. Invoke tool `open_sql_lab_with_context` (function at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:43`) with
`database_id`/`database_connection_id`, `dataset_in_context`, and `sql` (or alias `query`)
set to whitespace like `" "`.
3. Request passes through `@parse_request(OpenSqlLabRequest)` at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:42`; `parse_request` calls
`model_validate` in `superset/mcp_service/utils/schema_utils.py:190-194`, and
`OpenSqlLabRequest.sql` has no normalizing validator in
`superset/mcp_service/sql_lab/schemas.py:132-136`, so whitespace is accepted unchanged.
4. In `open_sql_lab_with_context`, `if request.sql:` at
`superset/mcp_service/sql_lab/tool/open_sql_lab_with_context.py:73` is true for
whitespace, so `params["sql"]` is set; then fallback context SQL generation guarded by
`if not request.sql:` at line 81 is skipped, producing a SQL Lab URL without the intended
dataset-context query.
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/sql_lab/schemas.py
**Line:** 132:136
**Comment:**
*Logic Error: The optional SQL field accepts whitespace-only strings as valid input. Downstream logic checks truthiness to decide whether to auto-generate dataset context SQL, and whitespace is truthy, so context generation is skipped and users get a blank/invalid prefilled query. Normalize this field by trimming and converting empty strings to `None`.*
Validate the correctness of the flagged issue. If it is correct, how can I resolve this? If you propose a fix, implement it and keep it concise.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #38625       +/-   ##
===========================================
- Coverage   65.02%   64.40%    -0.63%
===========================================
  Files        1817     2529      +712
  Lines       72319   128964    +56645
  Branches    23033    29723     +6690
===========================================
+ Hits        47028    83054    +36026
- Misses      25291    44464    +19173
- Partials        0     1446     +1446
User description
SUMMARY
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
Reduces failed MCP tool calls by giving LLMs better error messages and accepting common field name variations:
- […] re-raises it as a ToolError with field-level details and a list of required fields, so the LLM can self-correct on retry instead of getting raw Pydantic internals.
- […] database_id/database_connection_id and sql/query are accepted, since LLMs frequently use these interchangeably.
- […] .to_dict(), returning a proper error response instead of an unhandled exception.

Also updates MCP app instructions to guide LLMs toward list_datasets for finding database_id and uses consistent sql terminology.
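The defensive check before `.to_dict()` mentioned in the summary might look roughly like the sketch below. It is an assumption about the shape of the guard, not the PR's code: `rows_from_result`, the response keys, and the duck-typed `FakeFrame` stand-in are all hypothetical, and the real check may test for a concrete table type instead.

```python
from typing import Any


def rows_from_result(data: Any) -> dict:
    """Convert statement result data to rows, or return a structured error."""
    to_dict = getattr(data, "to_dict", None)
    if not callable(to_dict):
        # Unexpected type: report it instead of raising AttributeError.
        return {
            "success": False,
            "error": f"Unexpected result data type: {type(data).__name__}",
        }
    return {"success": True, "rows": to_dict(orient="records")}


class FakeFrame:
    """Minimal table-like object used only for this illustration."""

    def __init__(self, rows: list) -> None:
        self._rows = rows

    def to_dict(self, orient: str = "records") -> list:
        return list(self._rows)


ok = rows_from_result(FakeFrame([{"a": 1}]))
bad = rows_from_result("not a table")
```

With this shape, `ok` carries the rows and `bad` is an error response that names the offending type, so the caller never sees an unhandled exception from the conversion step.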
TESTING INSTRUCTIONS
query
ADDITIONAL INFORMATION
cc @aminghadersohi
CodeAnt-AI Description
Improve MCP tool request validation, field aliasing, and SQL result handling
What Changed
Impact
✅ Fewer failed LLM tool calls due to unclear validation errors
✅ Clearer request error messages for LLM retries
✅ Fewer internal crashes when SQL returns unexpected data