A lightweight wrapper around the AI SDK that intelligently indexes and retrieves MCP tools using graph-based vector search.
MCP RAG indexes your MCP toolset into a graph structure and uses Neo4j-powered vector search to retrieve relevant tool subsets from large collections. This dramatically reduces context overhead when working with extensive tool libraries.
MCP RAG improves both efficiency and performance compared to baseline tool selection, while maintaining the same accuracy.
Benchmark Methodology: Tests simulate a realistic conversation with 5 sequential prompts, each triggering a different tool as context accumulates—mirroring real-world multi-turn interactions. All tests use the complete toolset from the GitHub MCP Server (90+ tools) to represent authentic large-scale tool selection scenarios.
See the proof in the pudding 🍰:
Base Tool Selection Results - Baseline approach passing all tools to the model.
RAG Tool Selection Results - RAG-powered intelligent filtering with vector search.
View Test Suite - Complete benchmark implementation and test cases.
```bash
npm install @mcp-rag/client @ai-sdk/openai neo4j-driver ai
```

Set your OpenAI API key:

```bash
export OPENAI_API_KEY=your_key_here
```

```ts
import { createMCPRag } from '@mcp-rag/client'
import { openai } from '@ai-sdk/openai'
import neo4j from 'neo4j-driver'
import { tool } from 'ai'
import { z } from 'zod'

const driver = neo4j.driver(
  'neo4j://localhost:7687',
  neo4j.auth.basic('neo4j', 'password')
)

const rag = createMCPRag({
  model: openai('gpt-4o-mini'),
  neo4j: driver,
  tools: {
    searchDocs: tool({
      /* ... */
    }),
    queryDatabase: tool({
      /* ... */
    }),
    sendEmail: tool({
      /* ... */
    }),
    fetchWeather: tool({
      /* ... */
    }),
    analyzeImage: tool({
      /* ... */
    }),
    // ... hundreds more tools
  },
})

await rag.sync()

const result = await rag.generateText({
  prompt: 'Search for API docs',
})
```

What does rag.sync() do?
The sync() method performs a complete synchronization of your tools to Neo4j, creating the graph structure needed for semantic search. Here's what happens under the hood:
1. Creates Vector Index: Sets up a Neo4j vector index for similarity search using 1536-dimensional embeddings (OpenAI's `text-embedding-3-small` model)
2. Generates Embeddings: For each tool in your toolset, it creates embeddings for:
   - The tool itself (name + description)
   - Each parameter (name + description)
   - The return type
3. Builds Graph Structure: Creates a graph in Neo4j with the following relationships:
   - `ToolSet` nodes that group tools together
   - `Tool` nodes with their embeddings
   - `Parameter` nodes connected to tools via `HAS_PARAM` relationships
   - `ReturnType` nodes connected to tools via `RETURNS` relationships
4. Idempotent by Design: The sync process uses `MERGE` operations, so running it multiple times won't create duplicates. It will update existing nodes if the toolset has changed.
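To make the embedding and `MERGE` ideas above concrete, here is a minimal sketch. The concatenation format, helper names, and Cypher property names are assumptions for illustration, not MCP RAG's actual internals:

```typescript
// Hypothetical shape of a tool's metadata before it is embedded.
interface ToolMeta {
  name: string
  description: string
}

// Build the text that gets embedded for a Tool node (name + description).
// The exact concatenation format MCP RAG uses is an assumption.
function toolEmbeddingText(tool: ToolMeta): string {
  return `${tool.name}: ${tool.description}`
}

// An idempotent upsert in the spirit of the MERGE-based sync: MERGE matches
// on the tool name, so re-running updates the node instead of duplicating it.
const upsertToolCypher = `
MERGE (t:Tool {name: $name})
SET t.description = $description, t.embedding = $embedding
`

const text = toolEmbeddingText({
  name: 'searchDocs',
  description: 'Search the documentation index',
})
// text === 'searchDocs: Search the documentation index'
```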
When to call it:

- After initial client creation (required before first use)
- After adding or removing tools with `addTool()` or `removeTool()`
- To force a re-index of your tools
The sync process is optimized to only run when necessary - subsequent calls to generateText() won't re-sync unless you explicitly call sync() again or modify the toolset.
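One way to implement this "only when necessary" check is to fingerprint the toolset and skip the sync when the fingerprint is unchanged. The sketch below illustrates the idea; the hashing scheme is an assumption, not MCP RAG's actual implementation:

```typescript
import { createHash } from 'node:crypto'

// Fingerprint a toolset by hashing its sorted names and descriptions.
// If the fingerprint matches the last synced one, the sync can be skipped.
function toolsetFingerprint(tools: Record<string, { description: string }>): string {
  const canonical = Object.keys(tools)
    .sort()
    .map((name) => `${name}\u0000${tools[name].description}`)
    .join('\n')
  return createHash('sha256').update(canonical).digest('hex')
}

let lastSynced: string | null = null

// Returns true only when the toolset has changed since the last sync.
function needsSync(tools: Record<string, { description: string }>): boolean {
  const fp = toolsetFingerprint(tools)
  if (fp === lastSynced) return false
  lastSynced = fp
  return true
}
```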
What does rag.generateText() do?
The generateText() method is a smart wrapper around the AI SDK's generateText function that adds automatic tool selection. Here's the workflow:
1. Ensures Migration: Automatically calls the sync process if the tools haven't been indexed yet
2. Semantic Tool Selection:
   - Generates an embedding for your prompt
   - Performs a Neo4j vector similarity search to find the most relevant tools
   - By default, selects up to 10 tools (configurable via `maxActiveTools`)
   - You can override this by passing an `activeTools` array explicitly
3. Calls AI SDK: Passes only the selected subset of tools to the AI SDK's native `generateText` function, along with your prompt and any additional options
4. Returns Full Result: Returns the complete AI SDK result wrapped in a `GenerateTextResultWrapper` object, giving you access to:
   - Tool calls made by the model
   - Token usage statistics
   - Response content
   - All other AI SDK metadata
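The selection step boils down to ranking tool embeddings by similarity to the prompt embedding and keeping the top `maxActiveTools`. A self-contained sketch of that ranking (in production the Neo4j vector index does this server-side, and real embeddings are 1536-dimensional, not the toy vectors used here):

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Rank tools by similarity to the prompt embedding and keep the top k,
// mirroring what the vector similarity search does for tool selection.
function selectTools(
  promptEmbedding: number[],
  toolEmbeddings: Record<string, number[]>,
  maxActiveTools = 10,
): string[] {
  return Object.entries(toolEmbeddings)
    .map(([name, emb]) => ({ name, score: cosineSimilarity(promptEmbedding, emb) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, maxActiveTools)
    .map((t) => t.name)
}

// Toy example: 'searchDocs' points the same way as the prompt embedding.
const selected = selectTools([1, 0], {
  searchDocs: [0.9, 0.1],
  sendEmail: [0.1, 0.9],
}, 1)
// selected → ['searchDocs']
```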
Key Benefits:
- Reduced Context Size: Only relevant tools are sent to the LLM, saving tokens
- Better Performance: Fewer tools mean faster response times
- Same AI SDK Experience: Accepts all standard AI SDK parameters and returns familiar result structures
Want to see MCP RAG in action? Check out our complete example that demonstrates intelligent tool selection with the GitHub MCP Server's 93 tools:
This example shows:
- How to mock and index all 93 GitHub MCP server tools
- Vector similarity search selecting the top 10 most relevant tools
- Real-world tool selection with detailed debug output
- Interactive testing with different prompts
Perfect for understanding how MCP RAG reduces context overhead in large toolsets!
- Graph-based indexing – Tools are indexed with their relationships and metadata
- Vector search – Neo4j-powered semantic search for tool retrieval
- AI SDK compatible – Drop-in wrapper that works with your existing AI SDK setup
- Selective loading – Only load the tools you need for each request
MIT rconnect.tech


