A GraphRAG demo using MongoDB Atlas as the graph and vector store, VoyageAI for embeddings, and a local LLM (via llama.cpp) for entity extraction and answering.
The pipeline ingests markdown documents, extracts entities and relationships using an LLM, stores them as a graph in MongoDB, then answers questions by anchoring on relevant entities via vector search and expanding context through multi-hop graph traversal ($graphLookup).
- **Ingest** — each document is chunked with LangChain's `MarkdownTextSplitter`. Each chunk is embedded (VoyageAI) and stored. An LLM extracts entities and relationships from the chunk, which are upserted as nodes and edges in MongoDB.
- **Query** — the question is embedded and used to find anchor entities via Atlas Vector Search. A `$graphLookup` traversal expands outward up to 2 hops, pulling in neighboring entities and evidence chunks. The assembled subgraph is printed (so you can see exactly what context the LLM received), then passed to the LLM for a final answer.
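The 2-hop expansion can be sketched as a single aggregation pipeline. This is a minimal sketch, not the script's exact code: the edge field names (`source`, `target`) and entity `name` field are assumptions.

```python
def build_expansion_pipeline(anchor_names, max_hops=2):
    """Build an aggregation pipeline that starts from the anchor entities
    found by vector search and follows edges in `relationships` outward."""
    return [
        # Start from the anchor entities.
        {"$match": {"name": {"$in": anchor_names}}},
        # $graphLookup's maxDepth is zero-based: maxDepth=1 allows 2 hops.
        {"$graphLookup": {
            "from": "relationships",
            "startWith": "$name",
            "connectFromField": "target",
            "connectToField": "source",
            "maxDepth": max_hops - 1,
            "as": "neighborhood",
        }},
    ]

pipeline = build_expansion_pipeline(["vector search", "quantization"])
# Run with: db.entities.aggregate(pipeline)
```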
| Collection | Purpose |
|---|---|
| `chunks` | Raw text chunks with embeddings |
| `entities` | Graph nodes with embeddings |
| `relationships` | Graph edges linking entities |
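For orientation, the stored documents look roughly like this. A sketch only — the field names beyond the collection names are assumptions, not the scripts' exact schema:

```python
# A raw text chunk with its embedding.
chunk = {
    "_id": "doc1#0",
    "text": "Atlas Vector Search supports ...",
    "embedding": [0.01] * 1024,  # 1024-dim VoyageAI embedding
}

# A graph node, also embedded so it can be an anchor for vector search.
entity = {
    "name": "Atlas Vector Search",
    "description": "MongoDB's vector search capability",
    "embedding": [0.02] * 1024,
    "chunk_ids": ["doc1#0"],  # evidence chunks this entity came from
}

# A graph edge linking two entities.
relationship = {
    "source": "Atlas Vector Search",
    "target": "scalar quantization",
    "relation": "supports",
}
```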
`uv` is a fast Python package manager. If you don't have it:

```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Create a virtual environment and install dependencies:

```sh
uv venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
```

Alternatively, with the standard library `venv` and pip:

```sh
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install -r requirements.txt
```

Copy `.env.example` to `.env` and fill in your credentials:

```sh
cp .env.example .env
```

```
MONGO_URI=mongodb+srv://<user>:<pass>@<cluster>.mongodb.net/
VOYAGE_API_KEY=your-voyage-api-key
```
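A fail-fast credentials check can save a confusing stack trace later. A sketch, assuming `python-dotenv` is among the dependencies; the helper name is hypothetical:

```python
import os

REQUIRED = ["MONGO_URI", "VOYAGE_API_KEY"]

def missing_vars(env):
    """Return the required variable names that are absent or empty."""
    return [k for k in REQUIRED if not env.get(k)]

# In a script, after `from dotenv import load_dotenv; load_dotenv()`:
#     if missing_vars(os.environ):
#         raise SystemExit(f"Missing in .env: {missing_vars(os.environ)}")
```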
You will also need an Atlas cluster with Vector Search enabled (M10 or higher, or a local Atlas deployment). The ingest script will attempt to create the vector search indexes automatically; if that fails, create them manually in the Atlas UI:
- Collection `graphrag.entities`, index name `entity_vector_index`, field `embedding`, 1024 dimensions, cosine similarity
- Collection `graphrag.chunks`, index name `chunk_vector_index`, field `embedding`, 1024 dimensions, cosine similarity
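Creating the indexes programmatically is also an option. A sketch using pymongo's `create_search_index` (the index and field names match the list above; pymongo is assumed to be in `requirements.txt`):

```python
def vector_index_definition(dims=1024):
    """Atlas Vector Search index definition for the `embedding` field."""
    return {
        "fields": [{
            "type": "vector",
            "path": "embedding",
            "numDimensions": dims,
            "similarity": "cosine",
        }]
    }

# With a connected client:
#     from pymongo.operations import SearchIndexModel
#     db["entities"].create_search_index(SearchIndexModel(
#         definition=vector_index_definition(),
#         name="entity_vector_index", type="vectorSearch"))
#     db["chunks"].create_search_index(SearchIndexModel(
#         definition=vector_index_definition(),
#         name="chunk_vector_index", type="vectorSearch"))
```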
The local LLM endpoint and model are hardcoded in both scripts (`http://10.0.23.6:8086/v1`, model `gemma-4`). Edit the `llm = OpenAI(...)` line to point at a different server, or swap in a real OpenAI API key.
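One way to keep that edit in a single place is to centralize the client construction. A sketch — the scripts hardcode these values inline, and the helper name here is hypothetical:

```python
LLM_BASE_URL = "http://10.0.23.6:8086/v1"
LLM_MODEL = "gemma-4"

def make_llm(base_url=LLM_BASE_URL):
    """Build an OpenAI-SDK client for the llama.cpp server.
    llama.cpp's OpenAI-compatible endpoint ignores the API key,
    so any placeholder value works."""
    from openai import OpenAI  # deferred import: only needed at query time
    return OpenAI(base_url=base_url, api_key="not-needed")
```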
Ingest docs (run once, or whenever docs change — this takes a while):

```sh
python mgraphrag-ingest.py
```

Query:

```sh
# As a command-line argument
python mgraphrag-query.py "How do I create a vector search index with scalar quantization?"

# Piped from stdin
echo "What is binary quantization?" | python mgraphrag-query.py

# Interactive (type question, then Ctrl+D)
python mgraphrag-query.py
```

The query script prints the graph context (entities, relationships, and evidence chunks) that was passed to the LLM before showing the final answer, so you can trace exactly how the result was produced.
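That printed context is just the expanded subgraph flattened into text. Assembling it might look roughly like this — a sketch of the idea, with assumed field names, not the script's exact output format:

```python
def format_context(entities, relationships, chunks):
    """Flatten the expanded subgraph into a prompt-ready context block."""
    lines = ["## Entities"]
    lines += [f"- {e['name']}: {e.get('description', '')}" for e in entities]
    lines.append("## Relationships")
    lines += [f"- {r['source']} -[{r['relation']}]-> {r['target']}"
              for r in relationships]
    lines.append("## Evidence")
    lines += [f"- {c['text']}" for c in chunks]
    return "\n".join(lines)
```

The same string is printed for inspection and then embedded in the final LLM prompt, so what you see is exactly what the model saw.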
- The embedding model is `voyage-4`. Changing it requires re-ingesting all documents, since stored embeddings must match the query embedding space.
- Ingest drops and rebuilds all three collections on each run.