A Model Context Protocol (MCP) server that provides RAG (Retrieval-Augmented Generation) capabilities for GitHub Copilot. This allows you to process PDFs, create vector embeddings, and retrieve relevant context without using external LLM APIs.
- PDF Processing: Extract text from PDF documents and chunk them for indexing
- Vector Embeddings: Generate embeddings using Hugging Face transformers
- Semantic Search: Find relevant document chunks using vector similarity
- MCP Integration: Connect directly with GitHub Copilot through MCP protocol
- Install dependencies:

  ```shell
  npm install
  ```

- Build the project:

  ```shell
  npm run build
  ```

- Configure GitHub Copilot to use this MCP server by adding the following to your VS Code settings.json:
```json
{
  "mcp": {
    "inputs": [],
    "servers": {
      "rag-server": {
        "command": "node",
        "args": [
          "D:\\YASH\\Projects GGC\\capgemini projects\\mcp part 2\\dist\\index.js"
        ],
        "env": {}
      }
    }
  }
}
```

Once configured, you can use these tools in GitHub Copilot:
```
@rag-server add_pdf file_path="/path/to/document.pdf"
@rag-server search_documents query="your search query" max_results=5
@rag-server get_context query="what you're working on"
```
- PDF Processing: Uses `pdf-parse` to extract text from PDFs and chunks it into manageable pieces
- Embeddings: Generates vector embeddings using the `all-MiniLM-L6-v2` model from Hugging Face
- Vector Store: Uses FAISS for efficient similarity search
- MCP Protocol: Exposes functionality through MCP tools that GitHub Copilot can call
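The chunking and similarity-ranking steps above can be sketched in plain TypeScript. This is a minimal illustration, not the project's actual code: the real implementation lives in `src/pdf-processor.ts` and `src/vector-store.ts` and uses `pdf-parse`, the `all-MiniLM-L6-v2` model, and FAISS, so the function names, chunk sizes, and in-memory store below are hypothetical.

```typescript
// Split extracted text into overlapping chunks so context isn't lost
// at chunk boundaries (the sizes here are illustrative defaults).
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Cosine similarity: inner product on normalized vectors, which is
// effectively how FAISS ranks chunk embeddings against a query.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored chunks by similarity to the query embedding and return
// the top-k. In the real server, both the query and chunk embeddings
// would be produced by the all-MiniLM-L6-v2 model.
function topK(
  query: number[],
  store: { chunk: string; embedding: number[] }[],
  k = 5
): string[] {
  return store
    .map((e) => ({ chunk: e.chunk, score: cosineSimilarity(query, e.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((e) => e.chunk);
}
```

The overlap between consecutive chunks keeps sentences that straddle a boundary retrievable from at least one chunk, which is why chunked RAG stores rarely slice text at exact, non-overlapping offsets.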
- `src/pdf-processor.ts`: Handles PDF text extraction and embedding generation
- `src/vector-store.ts`: Manages vector storage and similarity search
- `src/index.ts`: Main MCP server implementation
- `src/mcp-config.json`: Configuration template for GitHub Copilot
This system runs locally without requiring external API keys or cloud services.