
# **GraphRAG: Graph‑based Retrieval‑Augmented Generation**  
*A practical, hands‑on tutorial for Prompt Engineering students*  

---  
**Learning Objectives**  
1. Explain the motivation, benefits, and architecture of GraphRAG.  
2. Build a small knowledge graph from unstructured documents.  
3. Combine vector and graph retrieval for multi‑hop question answering.  
4. Deploy & evaluate a reusable GraphRAG pipeline with LlamaIndex.  
5. Explore advanced patterns (Neo4j, LangChain + LangGraph workflows, custom retrievers).  

> **Prerequisites:** Basic Python & Colab familiarity, an OpenAI API key (or other chat‑completion endpoint) set as an environment variable (`OPENAI_API_KEY`).  



## **Table of Contents**
1. [Why GraphRAG?](#why)  
2. [Setup: Tiny Demo Corpus](#setup)  
   * [2.1 Construct a Knowledge Graph](#build)  
3. [Graph + Vector Retrieval Pipeline](#pipeline)  
4. [Quick Evaluation](#eval)  
5. [Advanced Topics (Neo4j, LangChain, Custom Retrievers)](#advanced)  
6. [Exercises](#ex)  
7. [References](#refs)  


In [None]:

# @title ← Install core libraries (takes ~1 min)
!pip -q install llama-index==0.11.40 llama-index-graph-stores-neo4j llama-index-postprocessor-cohere-rerank neo4j networkx pyvis tiktoken openai



<a id="why"></a>
## 1 | Why GraphRAG?  

Traditional **vector‑only RAG** treats each text chunk as an isolated embedding.  
That works for topical similarity but struggles with:  

* **Multi‑hop reasoning** (e.g. _"Which director won an Oscar after collaborating with actor X?"_)  
* **Entity disambiguation** (multiple people named “Jordan”)  
* **Explainability** (show paths between facts)  

**Knowledge graphs** store **nodes** (entities/concepts) and **edges** (relationships).  
By retrieving sub‑graphs that match a query, **GraphRAG** can:  

* Provide *structured* context with explicit relations.  
* Follow relationship paths for deeper answers.  
* Reduce hallucinations by grounding in graph facts.  

> **TL;DR:** GraphRAG = Vector Search ➕ Graph Traversal ➕ LLM Generation.  
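
To make the multi‑hop and explainability points concrete, here is a minimal sketch in plain Python (no LlamaIndex). The triples are hand‑written for illustration, and the helper names `build_adjacency` and `find_path` are hypothetical; a real pipeline would extract the triples with an LLM and store them in a graph store.

```python
from collections import deque

# Hand-written (subject, relation, object) triples, for illustration only.
TRIPLES = [
    ("Christopher Nolan", "directed", "Inception"),
    ("Leonardo DiCaprio", "starred_in", "Inception"),
    ("Quentin Tarantino", "directed", "Django Unchained"),
    ("Leonardo DiCaprio", "starred_in", "Django Unchained"),
    ("Greta Gerwig", "directed", "Barbie"),
    ("Margot Robbie", "starred_in", "Barbie"),
]


def build_adjacency(triples):
    """Index triples in both directions so edges can be walked either way."""
    adj = {}
    for subj, rel, obj in triples:
        adj.setdefault(subj, []).append((rel, obj))
        adj.setdefault(obj, []).append((f"inverse_{rel}", subj))
    return adj


def find_path(adj, start, goal):
    """Breadth-first search; returns the shortest list of hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, neighbor in adj.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, rel, neighbor)]))
    return None


adj = build_adjacency(TRIPLES)
path = find_path(adj, "Leonardo DiCaprio", "Christopher Nolan")
for hop in path:
    print(" -> ".join(hop))
```

The returned path is the explanation: each hop is an explicit fact that can be shown to the user or pasted into the prompt. A vector index over isolated chunks has no equivalent structure to walk.
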



```text
┌────────────┐       ┌───────────────┐
│  Documents │──────▶│  Chunk & KG   │
└────────────┘       │  Construction │
                     └──────┬────────┘
                            │ PropertyGraph
                            ▼
                    ┌──────────────────┐
 Query  ───────────▶│  Graph Retriever │
                    └──────┬───────────┘
                            │ nodes / paths
                            ▼
                    ┌──────────────────┐
 Query  ───────────▶│ Vector Retriever │
                    └──────┬───────────┘
                            │ chunks
                            ▼
                    ┌──────────────────┐
                    │  Reranker /      │
                    │  Context Builder │
                    └──────┬───────────┘
                            │ top‑k context
                            ▼
                    ┌──────────────────┐
                    │   LLM (Generate) │
                    └──────────────────┘
```
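
The "Reranker / Context Builder" stage in the diagram can be sketched as follows. This is only an illustration under stated assumptions: token overlap stands in for a real embedding or reranking model, and the names `overlap_score` and `build_context` are hypothetical, not LlamaIndex APIs.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for an embedding model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, text):
    """Jaccard overlap between query tokens and text tokens, in [0, 1]."""
    q, t = tokenize(query), tokenize(text)
    return len(q & t) / len(q | t) if q | t else 0.0


def build_context(query, graph_facts, chunks, top_k=3):
    """Merge graph facts and text chunks, drop duplicates, keep the top_k by score."""
    candidates = []
    seen = set()
    for source, items in (("graph", graph_facts), ("vector", chunks)):
        for text in items:
            key = text.strip().lower()
            if key in seen:
                continue
            seen.add(key)
            candidates.append((overlap_score(query, text), source, text))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:top_k]


graph_facts = [
    "Christopher Nolan directed Inception",
    "Leonardo DiCaprio starred in Inception",
]
chunks = [
    "Inception stars Leonardo DiCaprio and was distributed by Warner Bros.",
    "Greta Gerwig directed Lady Bird (2017) and Barbie (2023).",
]
query = "Who directed Inception?"
context = build_context(query, graph_facts, chunks, top_k=2)

# Assemble the prompt that would be sent to the LLM in the final stage.
prompt = "Context:\n" + "\n".join(f"[{src}] {txt}" for _, src, txt in context)
prompt += f"\n\nQuestion: {query}"
print(prompt)
```

Tagging each context line with its source (`graph` or `vector`) keeps the final answer traceable to the retriever that produced the evidence.
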



<a id="setup"></a>
## 2 | Setup: Tiny Demo Corpus  

We’ll create a **toy corpus** about movies & directors to keep runtime low but still illustrate multi‑hop reasoning.


In [None]:

from llama_index.core import Document

# SimpleDirectoryReader only reads files from disk, so the in-memory
# strings are wrapped in Document objects directly.
texts = [
    """Christopher Nolan directed *Inception* (2010) and *Oppenheimer* (2023).
    *Inception* stars Leonardo DiCaprio and was distributed by Warner Bros.
    *Oppenheimer* features Cillian Murphy and explores the life of J. Robert Oppenheimer.""",

    """Greta Gerwig directed *Lady Bird* (2017) and *Barbie* (2023).
    *Barbie* stars Margot Robbie and Ryan Gosling and made $1.4 billion worldwide.
    *Lady Bird* stars Saoirse Ronan and was produced by A24.""",

    """Quentin Tarantino collaborated with Leonardo DiCaprio on *Django Unchained* (2012) and *Once Upon a Time in Hollywood* (2019).
    Tarantino is known for nonlinear storylines and stylized violence.""",
]
docs = [Document(text=t) for t in texts]
print(f"Loaded {len(docs)} documents.")



<a id="build"></a>
### 2.1 | Construct a Knowledge Graph  

We use LlamaIndex’s **`KnowledgeGraphIndex`** to extract (subject, relation, object) triplets via the LLM and store them in its default in‑memory graph store.


In [None]:

from llama_index.core import KnowledgeGraphIndex
from llama_index.llms.openai import OpenAI

# Use your own model/provider; here we assume gpt‑4o‑mini
llm = OpenAI(model="gpt-4o-mini", temperature=0.1)

# Import paths follow the llama_index.core package layout of the version pinned above.
kg_index = KnowledgeGraphIndex.from_documents(
    docs,
    llm=llm,
    max_triplets_per_chunk=8,  # keep it small
)
# Count entities via the NetworkX export (networkx was installed above)
print("KG contains", kg_index.get_networkx_graph().number_of_nodes(), "entities.")
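
Under the hood, triplet extraction amounts to asking the LLM for lines of the form `(subject, predicate, object)` and parsing them, with `max_triplets_per_chunk` capping how many are kept. The sketch below illustrates only the parsing step; the raw output is invented for illustration, the regex and `parse_triplets` are hypothetical, and the actual prompt and parser inside LlamaIndex may differ.

```python
import re

# Invented example of raw LLM output; real extraction output varies by model and prompt.
raw_output = """
(Christopher Nolan, directed, Inception)
(Inception, stars, Leonardo DiCaprio)
not a triplet
(Inception, distributed by, Warner Bros.)
(Inception, stars, Leonardo DiCaprio)
"""

TRIPLET_PATTERN = re.compile(
    r"^\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)$"
)


def parse_triplets(text, max_triplets=8):
    """Parse '(subject, predicate, object)' lines, skipping malformed and duplicate ones."""
    triplets = []
    for line in text.splitlines():
        match = TRIPLET_PATTERN.match(line.strip())
        if not match:
            continue
        triplet = tuple(part.strip() for part in match.groups())
        if triplet not in triplets:
            triplets.append(triplet)
        if len(triplets) == max_triplets:
            break
    return triplets


triplets = parse_triplets(raw_output)
entities = {t[0] for t in triplets} | {t[2] for t in triplets}
print(len(triplets), "triplets,", len(entities), "entities")
```

A lower cap gives a sparser graph that is cheaper to traverse; a higher cap captures more relations but also more noise from the extractor.
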



#### Visualise the Graph (optional)


In [None]:

from pyvis.network import Network

g_nx = kg_index.get_networkx_graph()
# cdn_resources="in_line" embeds the JavaScript in the HTML so it renders inside notebooks
net = Network(height="400px", width="600px", notebook=True, cdn_resources="in_line")
net.from_nx(g_nx)
net.show("graph.html")
print("Interactive graph saved to graph.html (opens in Colab preview).")



<a id="pipeline"></a>
## 3 | Graph + Vector Retrieval Pipeline  

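LlamaIndex provides **`KnowledgeGraphRAGRetriever`**, which  
1. extracts entities from the query,  
2. expands a sub‑graph around them (default depth = 2),  
3. converts the retrieved nodes & edges into text context.  

The retriever itself returns graph context only, so to make the pipeline hybrid we merge its results with those of a vector retriever using `QueryFusionRetriever`.  


In [None]:

from llama_index.core import VectorStoreIndex
from llama_index.core.retrievers import KnowledgeGraphRAGRetriever, QueryFusionRetriever

vector_index = VectorStoreIndex.from_documents(docs)

graph_retriever = KnowledgeGraphRAGRetriever(
    storage_context=kg_index.storage_context,  # holds the graph store built above
    llm=llm,
    graph_traversal_depth=2,
    verbose=True,
)

# Graph + vector: merge both result lists; num_queries=1 disables query rewriting
retriever = QueryFusionRetriever(
    [graph_retriever, vector_index.as_retriever(similarity_top_k=2)],
    llm=llm,
    num_queries=1,
    use_async=False,
    similarity_top_k=4,
)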
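
The first three steps listed above (entity extraction, depth‑limited expansion, conversion to text) can be sketched in plain Python. This is an illustration under simplifying assumptions, not the library's implementation: entities are spotted by verbatim string match rather than by an LLM, and `extract_entities` and `expand_subgraph` are hypothetical names.

```python
from collections import deque

# Hand-written triples mirroring the toy corpus.
TRIPLES = [
    ("Christopher Nolan", "directed", "Inception"),
    ("Christopher Nolan", "directed", "Oppenheimer"),
    ("Inception", "stars", "Leonardo DiCaprio"),
    ("Oppenheimer", "features", "Cillian Murphy"),
    ("Quentin Tarantino", "collaborated with", "Leonardo DiCaprio"),
    ("Greta Gerwig", "directed", "Barbie"),
]


def extract_entities(query, known_entities):
    """Naive entity spotting: keep known entity names that appear verbatim in the query."""
    lowered = query.lower()
    return [e for e in known_entities if e.lower() in lowered]


def expand_subgraph(triples, seeds, depth=2):
    """Collect every triple reachable within `depth` hops of the seed entities."""
    neighbors = {}
    for triple in triples:
        subj, _, obj = triple
        neighbors.setdefault(subj, []).append((obj, triple))
        neighbors.setdefault(obj, []).append((subj, triple))
    seen_nodes = set(seeds)
    collected = []
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand beyond the depth limit
        for other, triple in neighbors.get(node, []):
            if triple not in collected:
                collected.append(triple)
            if other not in seen_nodes:
                seen_nodes.add(other)
                frontier.append((other, dist + 1))
    return collected


known = sorted({t[0] for t in TRIPLES} | {t[2] for t in TRIPLES})
seeds = extract_entities("Who stars in films by Christopher Nolan?", known)
subgraph = expand_subgraph(TRIPLES, seeds, depth=2)

# Serialize the sub-graph into text lines the LLM can read as context.
context = "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in subgraph)
print(context)
```

With `depth=1` only the two `directed` edges are returned, which is not enough to name any actor; `depth=2` reaches the cast. The Tarantino edge sits three hops away and stays out, which shows how the depth limit bounds context size.
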


In [None]:

from llama_index.core.query_engine import RetrieverQueryEngine

# Wrap the retriever in a query engine that synthesizes the final answer
qa_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    llm=llm,
    response_mode="compact"
)



### 3.1 | Run Example Queries  


In [None]:

queries = [
    "Which films has Leonardo DiCaprio appeared in that were directed by Christopher Nolan?",  # multi‑hop
    "How much money did the Greta Gerwig film starring Margot Robbie gross worldwide?",        # attribute lookup
]
for q in queries:
    print("\n\033[1mQ:", q, "\033[0m")
    print(qa_engine.query(q))



<a id="eval"></a>
## 4 | Quick Evaluation  

Below we compare answers using **vector‑only retrieval** vs **GraphRAG** for a multi‑hop question.


In [None]:

baseline_qa = vector_index.as_query_engine(response_mode="compact", llm=llm)
# Answerable from the toy corpus via two hops: Nolan (through Inception) and Tarantino
question = "Which directors have worked with Leonardo DiCaprio?"
print("\nVector‑only answer →", baseline_qa.query(question))
print("\nGraphRAG answer   →", qa_engine.query(question))
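
Eyeballing two printed answers does not scale, so a small scoring helper is useful. The sketch below uses two simple measures, answer containment and token‑level F1, as one possible choice rather than a recommended standard. The example answers are invented strings, not real model outputs, and the function names are hypothetical.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation, and split into tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall against the reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    common = sum((Counter(pred) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)


def contains_answer(prediction, reference):
    """True if the reference appears as a contiguous token span in the prediction."""
    pred, ref = normalize(prediction), normalize(reference)
    n = len(ref)
    return n > 0 and any(pred[i:i + n] == ref for i in range(len(pred) - n + 1))


reference = "Inception"
# Invented answers standing in for the two systems' outputs.
answers = {
    "system_a": "The film is Inception.",
    "system_b": "I could not find a matching film.",
}
for name, answer in answers.items():
    print(name, contains_answer(answer, reference), round(token_f1(answer, reference), 2))
```

Token F1 penalizes verbose answers even when they are correct, so containment is often the more informative signal for short factual references.
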



<a id="advanced"></a>
## 5 | Advanced Topics  

* **Neo4j backend:** Swap the in‑memory store with a full Neo4j instance (`Neo4jGraphStore`). See commented code below.  
* **LangChain + LangGraph:** Use `langchain-neo4j` together with LangGraph for stateful multi‑step pipelines.  
* **Custom retriever:** Combine text‑to‑Cypher generation, path‑aware ranking, or GNN‑based link prediction.  
* **Reranking:** Add `CohereRerank` or `LLMRerank` post‑processors.  
* **Scaling tips:** Pre‑compute entity links, limit graph expansion depth, cache Cypher queries, async calls.  


In [None]:

# OPTIONAL: connect to Neo4j
# from llama_index.core import KnowledgeGraphIndex, StorageContext
# from llama_index.graph_stores.neo4j import Neo4jGraphStore
# neo4j_store = Neo4jGraphStore(
#     username="neo4j",
#     password="<password>",
#     url="bolt://<host>:7687",
# )
# kg_index_neo = KnowledgeGraphIndex.from_documents(
#     docs,
#     storage_context=StorageContext.from_defaults(graph_store=neo4j_store),
#     llm=llm,
# )



<a id="ex"></a>
## 6 | Exercises  

1. **Extraction tweak:** Change `max_triplets_per_chunk` and observe how graph density changes.  
2. **Domain shift:** Replace the toy corpus with your own class materials; rebuild the pipeline.  
3. **Prompt engineering:** Craft a system prompt that explains *why* the answer is correct using explicit graph paths.  
4. **Evaluation:** Implement BLEU or other NL‑based metrics, or an LLM grader, to quantify answer accuracy against ground truth.  
5. **Neo4j Explore:** Spin up a Neo4j AuraDB Free instance and visualize the imported graph in Neo4j Bloom.  



<a id="refs"></a>
## 7 | References  

* Neo4j Blog — “Create a GraphRAG Workflow Using LangChain & LangGraph” (2024).  
* LlamaIndex Docs — *GraphRAG v2 Cookbook* (2025).  
* Neo4j — “What is GraphRAG?” (2024).  
* LangChain‑Neo4j Partner Package Announcement (2025).  
* Peng et al., *Graph Retrieval‑Augmented Generation: A Survey* (2024).  
