fix(mcp): simplify tool schemas in BM25 search results to prevent LLM argument misformatting #38914

Open
kgabryje wants to merge 2 commits into apache:master from kgabryje:fix/mcp-schema-resolution

Conversation

@kgabryje
Member

@kgabryje kgabryje commented Mar 27, 2026

User description

SUMMARY

When the BM25 tool search transform is enabled, the LLM discovers tools via search_tools and invokes them through a call_tool proxy. The tool schemas returned in search results contain raw Pydantic-generated JSON Schema with $ref pointers and anyOf[string, Object] unions (from the @parse_request decorator's str | Model annotation). The LLM reads these schemas as text and must reconstruct the argument structure when calling call_tool, but the indirection of $ref lookups and the ambiguity of anyOf unions cause it to sometimes omit the required request wrapper and pass inner properties directly (e.g. {"config": ...} instead of {"request": {"config": ...}}).
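
To make the failure mode concrete, here is a minimal sketch of the two argument shapes. The helper name `missing_required` and the empty `config` payload are illustrative assumptions and are not taken from the Superset codebase; the check only looks at the top-level `required` list of a tool's input schema.

```python
from typing import Any


def missing_required(arguments: dict[str, Any], input_schema: dict[str, Any]) -> list[str]:
    """Return the top-level required keys that are absent from `arguments`.

    Hypothetical helper for illustration only.
    """
    return [key for key in input_schema.get("required", []) if key not in arguments]


# A tool input schema whose single required property is the `request` wrapper.
schema = {
    "type": "object",
    "properties": {"request": {"type": "object"}},
    "required": ["request"],
}

# What the LLM sometimes sends: inner properties passed directly.
# The contents of `config` are elided here and left empty on purpose.
unwrapped = {"config": {}}

# What the tool expects: the same payload nested under `request`.
wrapped = {"request": unwrapped}

print(missing_required(unwrapped, schema))
print(missing_required(wrapped, schema))
```

The unwrapped call is reported as missing `request`, while the wrapped call has no missing keys.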

This PR adds a schema simplification step in _serialize_tools_without_output_schema that:

  • Inlines all $ref pointers so the schema is self-contained (no $defs section needed)
  • Collapses anyOf[string, Object] to just the structured object variant, making the request wrapper and
    its inner properties immediately visible. Field-level metadata (description, default) is preserved
    during the collapse.
  • Preserves oneOf variants (discriminated unions) with resolved refs
  • Only collapses when the pattern is exactly one string + one object variant (the @parse_request
    pattern). All other unions are kept intact.

Before (what the LLM saw):
{"properties": {"request": {"anyOf": [{"type": "string"}, {"$ref": "#/$defs/ListChartsRequest"}]}},
"$defs": {"ListChartsRequest": {"properties": {"filters": {"items": {"$ref": "#/$defs/ChartFilter"}},
...}}, "ChartFilter": {...}}}

After:
{"properties": {"request": {"type": "object", "properties": {"filters": {"type": "array", "items":
{"type": "object", "properties": {"col": ..., "opr": ..., "value": ...}}}, "page": {"type": "integer",
"default": 1}}}}, "required": ["request"]}

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

Unit tests should pass

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

CodeAnt-AI Description

Make MCP tool search results easier to follow and use

What Changed

  • Tool search results now show a self-contained input schema instead of indirect references, so the required request shape is visible at a glance.
  • Optional fields are shown as a single field type, while nested object details inside arrays are expanded in place.
  • Mixed choices still keep their variants, but the schema no longer hides the important object structure behind references.
  • Added coverage for common schema shapes to confirm the visible request format stays intact.

Impact

✅ Fewer malformed tool calls
✅ Clearer request shapes in search results
✅ Less trial and error when invoking MCP tools


@bito-code-review
Contributor

bito-code-review bot commented Mar 27, 2026

Code Review Agent Run #caa3c4

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 32ad1fb..32ad1fb
    • superset/mcp_service/server.py
    • tests/unit_tests/mcp_service/test_tool_search_transform.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


@dosubot dosubot bot added the change:backend Requires changing the backend label Mar 27, 2026
@codeant-ai-for-open-source codeant-ai-for-open-source bot added the size:L This PR changes 100-499 lines, ignoring generated files label Mar 27, 2026
@codeant-ai-for-open-source
Contributor

Sequence Diagram

This PR updates the MCP server's BM25 tool search flow so that, before returning search results, it simplifies each tool's input schema by resolving references and collapsing unions, making it easier for the LLM to construct correctly wrapped call_tool arguments.

sequenceDiagram
    participant LLM
    participant MCPServer
    participant SchemaSimplifier

    LLM->>MCPServer: search_tools query
    MCPServer->>MCPServer: rank tools with BM25 search
    MCPServer->>SchemaSimplifier: simplify tool input schemas
    SchemaSimplifier-->>MCPServer: flat schemas without refs or string unions
    MCPServer-->>LLM: search_tools results with simplified schemas
    LLM->>MCPServer: call_tool with correctly wrapped request

Generated by CodeAnt AI
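
For readers unfamiliar with the ranking step in the diagram, the following is a small sketch of textbook Okapi BM25 scoring over tool descriptions. It is not the transform used by the MCP server, whose implementation is not shown in this conversation; the function name `bm25_rank`, the whitespace tokenizer, the parameter defaults `k1=1.5` and `b=0.75`, and the sample tool descriptions are all assumptions.

```python
import math
from collections import Counter


def bm25_rank(query: str, docs: dict[str, str], k1: float = 1.5, b: float = 0.75) -> list[str]:
    """Rank document names by BM25 score against `query`, best match first.

    Illustrative only: lowercase whitespace tokenization and the
    log(1 + ...) idf variant are choices made for this sketch.
    """
    if not docs:
        return []
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs or 1.0

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter[str] = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    scores: dict[str, float] = {}
    for name, tokens in tokenized.items():
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores[name] = score
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical tool descriptions, used only to exercise the function.
tools = {
    "list_charts": "list charts with filters and pagination",
    "get_dashboard_info": "get dashboard details by id",
}
print(bm25_rank("list charts", tools))
```

In the flow described above, the schemas of the top-ranked tools would then be simplified before being returned to the LLM.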

@codecov

codecov bot commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 0% with 53 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.54%. Comparing base (e045f49) to head (941ea19).
⚠️ Report is 24 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| superset/mcp_service/server.py | 0.00% | 53 Missing ⚠️ |

❌ Your project check has failed because the head coverage (99.85%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #38914      +/-   ##
==========================================
- Coverage   65.80%   64.54%   -1.26%     
==========================================
  Files        1823     2536     +713     
  Lines       73181   130825   +57644     
  Branches    23448    30339    +6891     
==========================================
+ Hits        48156    84440   +36284     
- Misses      25025    44916   +19891     
- Partials        0     1469    +1469     
| Flag | Coverage Δ |
|---|---|
| hive | 40.31% <0.00%> (?) |
| mysql | 61.23% <0.00%> (?) |
| postgres | 61.32% <0.00%> (?) |
| presto | 40.33% <0.00%> (?) |
| python | 62.92% <0.00%> (?) |
| sqlite | 60.94% <0.00%> (?) |
| unit | 100.00% <ø> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.


@netlify

netlify bot commented Mar 27, 2026

Deploy Preview for superset-docs-preview ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 32ad1fb |
| 🔍 Latest deploy log | https://app.netlify.com/projects/superset-docs-preview/deploys/69c6af1ca506b2000803e1ff |
| 😎 Deploy Preview | https://deploy-preview-38914--superset-docs-preview.netlify.app |

@bito-code-review
Contributor

bito-code-review bot commented Mar 27, 2026

Code Review Agent Run #863ae9

Actionable Suggestions - 0
Review Details
  • Files reviewed - 2 · Commit Range: 32ad1fb..941ea19
    • superset/mcp_service/server.py
    • tests/unit_tests/mcp_service/test_tool_search_transform.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful


