Add Exa MCP server + fix tool schema validation for Google AI Studio models #32

Closed
Copilot wants to merge 7 commits into main from copilot/add-mcp-server-and-gemma-schema-support

Contributor

Copilot AI commented Apr 8, 2026

Google AI Studio (Gemini/Gemma via OpenRouter) rejects tool schemas where required lists property names absent from properties. Many MCP servers ship schemas with this violation, making these models unusable when any such tool is loaded.

Schema sanitization (src/llm.rs)

Added sanitize_schema() — a recursive serde_json::Value mutator applied to every tool definition in chat_with_model() before the API call:

  • Filters required[] to only retain keys that exist in properties
  • Drops required entirely if the filtered array is empty
  • Recurses into nested properties and items schemas
  • No-op for already-valid schemas → all existing models unaffected
fn sanitize_schema(schema: &mut serde_json::Value) {
    let Some(obj) = schema.as_object_mut() else { return };
    let defined_props: Option<HashSet<String>> = obj
        .get("properties").and_then(|p| p.as_object())
        .map(|p| p.keys().cloned().collect());
    if let Some(ref props) = defined_props {
        if let Some(req) = obj.get_mut("required") {
            if let Some(arr) = req.as_array_mut() {
                arr.retain(|v| v.as_str().map(|s| props.contains(s)).unwrap_or(false));
            }
        }
    }
    // remove empty required, recurse into properties + items …
}
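
The excerpt above elides the last two steps (removing an empty required array and recursing). A minimal sketch of the full behavior described in the bullets might look like the following. It is not the code from src/llm.rs: the `Json` enum is a hypothetical stand-in for serde_json::Value so that the example compiles with the standard library alone, and the `object` and `string_array` helpers exist only to build test values.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for serde_json::Value (only the variants this sketch needs).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Illustration helper: builds a JSON object from key/value pairs.
fn object(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Illustration helper: builds a JSON array of strings.
fn string_array(names: &[&str]) -> Json {
    Json::Arr(names.iter().map(|n| Json::Str(n.to_string())).collect())
}

fn sanitize_schema(schema: &mut Json) {
    // Anything that is not an object has no `required` list to fix.
    let Json::Obj(obj) = schema else { return };

    // Names declared under `properties`, if that key holds an object.
    let defined: Option<HashSet<String>> = match obj.get("properties") {
        Some(Json::Obj(props)) => Some(props.keys().cloned().collect()),
        _ => None,
    };

    // Without `properties`, `required` is left untouched.
    if let Some(defined) = &defined {
        let mut now_empty = false;
        if let Some(Json::Arr(required)) = obj.get_mut("required") {
            // Keep only string entries that name a declared property.
            required.retain(|v| matches!(v, Json::Str(s) if defined.contains(s)));
            now_empty = required.is_empty();
        }
        if now_empty {
            obj.remove("required");
        }
    }

    // Recurse into each nested property schema and into `items`.
    if let Some(Json::Obj(props)) = obj.get_mut("properties") {
        for child in props.values_mut() {
            sanitize_schema(child);
        }
    }
    if let Some(items) = obj.get_mut("items") {
        sanitize_schema(items);
    }
}
```

With this sketch, a schema whose required list is `["query", "ghost"]` and whose properties define only `query` ends up with `["query"]`, while a schema that has required but no properties is returned unchanged.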

Exa MCP server

  • setup/index.html: Added exa to MCP_CATALOG under Browser & Web — bridges https://mcp.exa.ai/mcp via npx mcp-remote, requires EXA_API_KEY
  • config.example.toml: Added commented [[mcp_servers]] example block for Exa

Copilot AI and others added 6 commits April 8, 2026 09:20
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: chinkan <16433287+chinkan@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: chinkan <16433287+chinkan@users.noreply.github.com>
…patibility

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: chinkan <16433287+chinkan@users.noreply.github.com>
Add test_sanitize_schema_recurses_into_items to verify that sanitize_schema
properly recurses into array items schemas and removes undefined required fields.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: chinkan <16433287+chinkan@users.noreply.github.com>
Owner

chinkan commented Apr 9, 2026

@copilot exa should be under browse and web category

Contributor Author

Copilot AI commented Apr 9, 2026

> @copilot exa should be under browse and web category

Done — moved exa to the Browser & Web category in commit cfe7a13. It now sits alongside playwright, brave-search, firecrawl, and fetch.

Copilot AI requested a review from chinkan April 9, 2026 00:56
@chinkan chinkan marked this pull request as ready for review April 9, 2026 01:12
Copilot AI review requested due to automatic review settings April 9, 2026 01:12

github-actions bot commented Apr 9, 2026

Claude finished @chinkan's task in 1m 37s


Code Review: PR #32 — Add Exa MCP server + fix tool schema validation for Google AI Studio models

Tasks:

  • Understand the PR scope and context
  • Read src/llm.rs changes (schema sanitization)
  • Read setup/index.html changes (Exa in catalog)
  • Read config.example.toml changes
  • Review tests for schema sanitization
  • Provide comprehensive review feedback

Overall Assessment

This is a clean, well-structured PR. The core schema sanitization fix is correct and the implementation is solid. The Exa integration follows established patterns. A few gaps worth noting before merging.


src/llm.rs: sanitize_schema()

What's good:

  • The function is correctly no-op for already-valid schemas, so existing models are unaffected
  • Applied in the right place — in chat_with_model() before the API call, on a cloned copy so the original ToolDefinition is never mutated
  • The two-phase approach (filter required, then remove if empty) is correct
  • test_sanitize_schema_no_properties_no_panic handles the edge case where required exists without properties — leaving it untouched is the right call

Missing recursive cases:

The function recurses into properties values and items, but misses two other schema locations where required violations can appear:

  1. additionalProperties — MCP tools sometimes use additionalProperties as a sub-schema:
{ "additionalProperties": { "type": "object", "properties": {"x": {}}, "required": ["x", "ghost"] } }

Ghost entries in the nested required won't be cleaned.

  2. allOf / anyOf / oneOf — composed schemas in MCP tools are less common but possible. Not critical for a first pass.
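
One way to cover both missing cases is to route all recursion through a single helper that lists every direct sub-schema. The sketch below is an assumption about how that could look, not the PR's code: `Json` is a hypothetical stand-in for serde_json::Value (boolean and numeric values are omitted), and `child_schemas_mut`, `object`, and `string_array` are names invented for this illustration.

```rust
use std::collections::{BTreeMap, HashSet};

/// Hypothetical stand-in for serde_json::Value (only the variants this sketch needs).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Illustration helper: builds a JSON object from key/value pairs.
fn object(pairs: Vec<(&str, Json)>) -> Json {
    Json::Obj(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Illustration helper: builds a JSON array of strings.
fn string_array(names: &[&str]) -> Json {
    Json::Arr(names.iter().map(|n| Json::Str(n.to_string())).collect())
}

/// Collects mutable references to every direct sub-schema of `obj`.
fn child_schemas_mut(obj: &mut BTreeMap<String, Json>) -> Vec<&mut Json> {
    let mut children = Vec::new();
    for (key, value) in obj.iter_mut() {
        match key.as_str() {
            // Each property value is itself a schema.
            "properties" => {
                if let Json::Obj(props) = value {
                    children.extend(props.values_mut());
                }
            }
            // Single sub-schema keywords; non-object values are skipped later.
            "items" | "additionalProperties" => children.push(value),
            // Composition keywords hold an array of schemas.
            "allOf" | "anyOf" | "oneOf" => {
                if let Json::Arr(branches) = value {
                    children.extend(branches.iter_mut());
                }
            }
            _ => {}
        }
    }
    children
}

fn sanitize_schema(schema: &mut Json) {
    let Json::Obj(obj) = schema else { return };

    let defined: Option<HashSet<String>> = match obj.get("properties") {
        Some(Json::Obj(props)) => Some(props.keys().cloned().collect()),
        _ => None,
    };
    if let Some(defined) = &defined {
        let mut now_empty = false;
        if let Some(Json::Arr(required)) = obj.get_mut("required") {
            required.retain(|v| matches!(v, Json::Str(s) if defined.contains(s)));
            now_empty = required.is_empty();
        }
        if now_empty {
            obj.remove("required");
        }
    }

    // One loop now covers properties, items, additionalProperties, and composition.
    for child in child_schemas_mut(obj) {
        sanitize_schema(child);
    }
}
```

Keeping the keyword list in one place means a later addition (for example another sub-schema keyword) only touches `child_schemas_mut`, not the filtering logic.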

Minor: HashSet import is inline rather than at module level

std::collections::HashSet is written as a fully qualified path at llm.rs:111 rather than imported with use std::collections::HashSet; at module level. Since HashSet isn't used elsewhere in the module, this is fine as-is; it is just a style note.

Missing logging when schema is modified

When sanitize_schema actually removes a field, there's no log. A warn! or debug! when a property is dropped from required would make it much easier to diagnose compatibility issues in production:

// before retain(): let original_len = arr.len();
// after retain():
if arr.len() < original_len {
    warn!(removed = original_len - arr.len(), "Filtered invalid required fields from tool schema");
}
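
A slightly fuller sketch of the same idea records which names were dropped, so the log line can say what was removed and not only how many. The helper name `drop_undefined_required` is hypothetical, the required list is simplified to a `Vec<String>`, and `eprintln!` stands in for the `warn!` macro so the example needs only the standard library.

```rust
use std::collections::HashSet;

/// Hypothetical helper: filters `required` in place and returns the names it dropped.
fn drop_undefined_required(required: &mut Vec<String>, defined: &HashSet<String>) -> Vec<String> {
    let mut removed = Vec::new();
    required.retain(|name| {
        let keep = defined.contains(name);
        if !keep {
            // Remember the dropped name for the diagnostic below.
            removed.push(name.clone());
        }
        keep
    });
    if !removed.is_empty() {
        // Stand-in for a warn!/debug! call in the real code base.
        eprintln!(
            "Filtered {} invalid required field(s) from tool schema: {:?}",
            removed.len(),
            removed
        );
    }
    removed
}
```

Returning the dropped names keeps the helper easy to unit test, and the caller can decide whether the event deserves a warning or only debug-level output.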

Test coverage is comprehensive. All 6 new tests cover the key cases: invalid fields filtered, empty array removed, valid schema untouched, no properties → no panic, nested properties recursion, items recursion. No gaps in the happy/sad paths tested.


setup/index.html

The single-line addition (llm.rs:493 equivalent) follows the catalog entry format exactly — same field names, column alignment, consistent runner/args/envVars/link pattern. Category is correctly Browser & Web as requested.

Minor: The link value is https://exa.ai/mcp — verify this resolves (it's an unusual path for a docs link). Other entries point to npmjs.com/pypi.org package pages; if the Exa MCP page is at a different URL, update accordingly.


config.example.toml

The Exa block (lines 140–148) is well-commented with setup instructions and follows the format of other commented examples. Using mcp-remote as a bridge for the remote endpoint is the correct approach for an SSE (Server-Sent Events) transport.


docs/superpowers/plans/2026-04-08-exa-mcp-and-gemma-schema-fix.md

This planning document (413 lines) was committed to the repo. It's not harmful but probably doesn't belong in version control. Consider whether docs/superpowers/plans/ is intentional as a planning artifact store, or if this should be gitignored.


Summary

| Area | Status |
| --- | --- |
| Schema sanitization logic | ✅ Correct |
| required / properties filtering | ✅ Correct |
| Recursion: nested properties | ✅ Covered |
| Recursion: items | ✅ Covered |
| Recursion: additionalProperties | ⚠️ Missing |
| Logging when schema mutated | ⚠️ Missing |
| Test coverage | ✅ Comprehensive |
| Exa catalog entry | ✅ Correct category |
| config.example.toml | ✅ Clear and consistent |

The PR is mergeable as-is — the core fix addresses the stated problem correctly. The additionalProperties recursion and logging suggestions are improvements worth considering, either here or as follow-up issues.

Contributor

Copilot AI left a comment


Pull request overview

Adds compatibility handling for strict tool-schema validation (notably Google AI Studio Gemini/Gemma via OpenRouter) and introduces Exa as a selectable remote MCP server in the setup wizard.

Changes:

  • Sanitize tool parameter JSON Schemas before sending OpenRouter requests by filtering invalid required entries and recursing into nested schemas.
  • Add Exa (https://mcp.exa.ai/mcp via npx mcp-remote) to the setup wizard MCP catalog and document it in config.example.toml.
  • Add an implementation plan document for the Exa + schema fix work.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/llm.rs | Adds sanitize_schema() and applies it to tool schemas before API calls; adds unit tests for sanitizer behavior. |
| setup/index.html | Adds Exa entry to MCP_CATALOG for setup wizard configuration generation. |
| config.example.toml | Adds a commented Exa [[mcp_servers]] example block with EXA_API_KEY. |
| docs/superpowers/plans/2026-04-08-exa-mcp-and-gemma-schema-fix.md | Adds an internal implementation plan (currently inconsistent with the shipped Exa catalog entry). |

Comment on lines +45 to +49
In `setup/index.html`, locate the line containing `{ name:'threads', ...}` (around line 509).
Insert the following new entry **immediately before** the `threads` line so it sits in the `'Knowledge & Data'` group alongside `tavily` and `context7`:

```javascript
{ name:'exa', category:'Knowledge & Data', desc:'Neural web & document search (remote)', runner:'npx', args:['-y','mcp-remote','https://mcp.exa.ai/mcp','--header','x-api-key:${EXA_API_KEY}'], envVars:['EXA_API_KEY'], link:'https://exa.ai/mcp' },
```

Copilot AI Apr 9, 2026

This plan instructs adding exa under the Knowledge & Data category and shows args including --header x-api-key:${EXA_API_KEY}, but the actual implementation in setup/index.html adds exa under Browser & Web with args ['-y','mcp-remote','https://mcp.exa.ai/mcp']. Please update the plan to match the shipped catalog entry (category, insertion location, and args/auth approach) so future readers don’t follow incorrect instructions.

Expected:
```
PASS: exa found
PASS: category=Knowledge & Data
```

Copilot AI Apr 9, 2026

The verification snippet’s expected output hardcodes PASS: category=Knowledge & Data, which will fail with the current setup/index.html change (the Exa entry is categorized as Browser & Web). Adjust the expected output (and/or the regex) to reflect the implemented category so the plan remains self-consistent.

Suggested change
- PASS: category=Knowledge & Data
+ PASS: category=Browser & Web

@chinkan chinkan closed this Apr 10, 2026