# **🔷🔷Introduction to Graph RAG🔷🔷**

## **⭐01: From vectors to graphs**

### **⭕Vector RAG Limitations**

![img_1](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0601.jpeg?raw=true)

Vector-based RAG (Retrieval-Augmented Generation) systems combine **vector similarity search** with **generative models** (like GPT) to improve response accuracy by fetching relevant information from a knowledge base. While powerful, **Vector RAG has several limitations**, especially in real-world applications.

Here’s a breakdown of the key limitations:



#### **1. Semantic Drift or Irrelevant Retrieval**

**Problem:** Vector search retrieves documents based on *semantic similarity*, not factual accuracy or context relevance.

**Example:**
Suppose a user asks:

> "What are the side effects of metformin?"

The system may retrieve chunks containing:

* "The benefits of metformin in managing blood sugar"
* "Metformin's role in preventing diabetes complications"

**Issue:** These are related **semantically**, but they may **not directly mention side effects**. The model may then hallucinate or guess side effects, leading to unreliable answers.
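
The effect can be reproduced with a toy retriever. The sketch below is an illustration only: it assumes word-count vectors in place of learned embeddings, and the helper names (`bow_vector`, `cosine`) and the third chunk are hypothetical. It shows that ranking by similarity alone puts the "benefits" chunk first, while the only chunk that describes side effects ranks last because it shares almost no wording with the question.

```python
import math
import re
from collections import Counter


def bow_vector(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = "What are the side effects of metformin?"
chunks = [
    "The benefits of metformin in managing blood sugar",
    "Metformin's role in preventing diabetes complications",
    "Nausea and diarrhea are commonly reported with this drug",  # hypothetical chunk
]

query_vec = bow_vector(query)
ranked = sorted(((cosine(query_vec, bow_vector(c)), c) for c in chunks), reverse=True)
top_score, top_chunk = ranked[0]

for score, chunk in ranked:
    print(f"{score:.3f}  {chunk}")
```

Learned embeddings are far better than word counts at matching paraphrases, but the ranking criterion is the same: closeness to the query, not whether the chunk answers it.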

---

#### **2. Loss of Granularity**

**Problem:** When documents are chunked too large or too small, RAG systems struggle.

**Example:**
A 20-page medical paper is chunked into 1-page chunks. The user's query:

> "What did the clinical trial say about metformin and weight loss?"

If the specific trial is mentioned briefly in just one paragraph, vector search may **miss it** if:

* The chunk is too broad, so the one relevant paragraph is diluted by a page of unrelated text
* The embedding does not capture that fine-grained detail

---


#### **3. No True Understanding of Context**

**Problem:** Vector similarity is context-agnostic — it doesn’t consider previous turns in a conversation unless explicitly added.

**Example:**
**Conversation:**

* User: "Tell me about Python decorators."
* Then: "What about their use in Flask?"

Vector search may **not know what "their" refers to**, unless conversation history is added to the query, which RAG doesn’t do natively.
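
One simple workaround is to fold recent turns into the text that gets embedded. The sketch below is a minimal illustration under that assumption; `contextualize` and `max_turns` are hypothetical names, and production systems often ask an LLM to rewrite the follow-up into a standalone question instead of concatenating.

```python
def contextualize(history, question, max_turns=2):
    """Prefix the question with the most recent user turns before retrieval."""
    recent = history[-max_turns:]
    return " ".join(recent + [question])


history = ["Tell me about Python decorators."]
search_query = contextualize(history, "What about their use in Flask?")
print(search_query)
```

The combined string now contains "decorators", so the retriever has something concrete to match "their" against.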

---


#### **4. Stale or Outdated Knowledge**

**Problem:** Vector RAG systems are only as up-to-date as their vector store.

**Example:**
User asks:

> "What's the latest security patch in Log4j?"

If the vector DB was last updated in 2022, it might **miss 2024 disclosures**, leading to outdated or dangerous information.

---


#### **5. Embedding Quality and Limitations**

**Problem:** Embeddings might not capture important domain-specific nuances.

**Example:**
For a legal query like:

> "What precedent did the judge rely on in Roe v. Wade?"

Legal documents may contain dense citations and context-specific phrases. General-purpose embeddings may not **capture the key precedent accurately**, resulting in **weak or unrelated chunks** being retrieved.


---


#### **6. Long Documents, Poor Chunking**

**Problem:** Poor chunking strategies lead to missed context or cutoff info.

**Example:**
In a user manual, the definition of "safe operating temperature" might be split across two chunks. The RAG system may retrieve only half of the definition, leading the model to guess or produce partial info.


---


#### **7. Lack of Ranking or Reranking**

**Problem:** Vector RAG might retrieve the *top k* most similar documents but **not the most useful or trustworthy** ones.

**Example:**

> "What causes high CPU usage in Kubernetes?"

The top vector match might describe CPU metrics collection, while a lower-ranked chunk (ignored) contains the **actual explanation** for common CPU issues (e.g., liveness probes, throttling, etc.).
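
A second-stage reranker addresses this by rescoring the retrieved candidates. The sketch below is only a stand-in for that idea: the similarity values are made up, the candidate texts are hypothetical, and the keyword-overlap score replaces what would normally be a cross-encoder or LLM-based reranker.

```python
# Candidates as a first-stage vector search might return them (similarity values are invented).
candidates = [
    {"text": "CPU usage metrics are collected by the metrics server", "similarity": 0.82},
    {"text": "Aggressive liveness probes and CPU throttling commonly cause high CPU usage", "similarity": 0.74},
    {"text": "Kubernetes schedules pods onto nodes", "similarity": 0.60},
]


def rerank(question_terms, candidates, top_n=1):
    """Rescore by overlap with the question terms, using similarity as a tie-breaker."""
    def score(candidate):
        words = set(candidate["text"].lower().split())
        return (len(question_terms & words), candidate["similarity"])

    return sorted(candidates, key=score, reverse=True)[:top_n]


vector_top = max(candidates, key=lambda c: c["similarity"])
reranked_top = rerank({"cause", "causes", "high", "cpu", "usage"}, candidates)[0]

print("Vector search top hit:", vector_top["text"])
print("After reranking:      ", reranked_top["text"])
```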


---

#### **Summary Table**

| Limitation                       | Description                                         | Example                               |
| -------------------------------- | --------------------------------------------------- | ------------------------------------- |
| `Semantic Drift`                 | Retrieval is related but not precise                | Side effects vs benefits of metformin |
| `Granularity Issues`             | Too broad/narrow chunks miss details                | Specific paragraph in long PDF missed |
| `Lack of Conversational Context` | Fails to understand pronouns or dialogue continuity | "What about their use?" ambiguity     |
| `Outdated Knowledge`             | Vector store isn't real-time                        | Security patch from 2024 missing      |
| `Embedding Limitations`          | General embeddings miss domain-specific meaning     | Legal or medical terms                |
| `Poor Chunking`                  | Important info split across boundaries              | Definition split between chunks       |
| `No Reranking or Trust Score`    | Most relevant docs may be ranked lower              | Actual CPU issue not retrieved        |





### **⭕Graph Databases – Nodes and Edges**

![img_2](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0602.jpeg?raw=true)

This section introduces graph databases, and their two building blocks (nodes and edges), in the context of Graph RAG (Retrieval-Augmented Generation).

#### 🔍 **Graph Databases – Simple Explanation**

* **Graph databases** store data like a web of connected dots (like a mind map).
* These dots are called **Nodes** (represent real things like a person, model, or idea).
* The lines between them are called **Edges** (show how those things are related).

---

#### 🧩 **Components**

* **Nodes = Things (entities)**
  Example:

  * `Gpt-4` (a model)
  * `OpenAI` (a company)

* **Edges = Relationships between things**
  Example:

  * `Gpt-4` ➝ `DEVELOPED_BY` ➝ `OpenAI`
  * `ChatGPT` ➝ `MENTIONS` ➝ `Gpt-4`
  * `AutoGPT` ➝ `BASED_ON` ➝ `Gpt-4`

---

#### 🤖 **How it's used in Graph RAG**

* You can ask a question like:
  *"Who developed Gpt-4?"*

* The system follows the edge:

  * Start at node `Gpt-4`
  * Follow `DEVELOPED_BY` ➝ reach node `OpenAI`
  * ✅ Answer: `OpenAI`
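
The lookup above can be sketched in plain Python by storing each edge as a `(source, relationship, target)` triple. This is an in-memory illustration, not how a graph database is implemented; the `follow` helper is a hypothetical name.

```python
# Each edge is a (source, relationship, target) triple taken from the examples above.
triples = [
    ("Gpt-4", "DEVELOPED_BY", "OpenAI"),
    ("ChatGPT", "MENTIONS", "Gpt-4"),
    ("AutoGPT", "BASED_ON", "Gpt-4"),
]


def follow(triples, start, relation):
    """Return every node reached from `start` along edges of type `relation`."""
    return [target for source, rel, target in triples if source == start and rel == relation]


answer = follow(triples, "Gpt-4", "DEVELOPED_BY")
print(answer)  # ['OpenAI']
```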

---

#### ✅ **Why Graphs are Useful**

* Makes it easier to **trace relationships** step-by-step.
* Helps LLMs **reason clearly** and **give explainable answers**.
* Good for **complex questions** involving multiple steps or layers of knowledge.
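
For multi-step questions, the path itself is the explanation. The sketch below assumes the same toy triples as the examples above and uses a breadth-first search to answer "which organization is behind the model AutoGPT is based on?"; `find_path` is a hypothetical helper, not a library function.

```python
from collections import deque

triples = [
    ("Gpt-4", "DEVELOPED_BY", "OpenAI"),
    ("ChatGPT", "MENTIONS", "Gpt-4"),
    ("AutoGPT", "BASED_ON", "Gpt-4"),
]


def find_path(triples, start, goal):
    """Breadth-first search over directed edges; returns the list of edges used, or None."""
    adjacency = {}
    for source, rel, target in triples:
        adjacency.setdefault(source, []).append((rel, target))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


path = find_path(triples, "AutoGPT", "OpenAI")
for source, rel, target in path:
    print(f"{source} -[{rel}]-> {target}")
```

Each printed hop is a fact from the graph, so the answer can be shown to the user together with the chain of relationships that produced it.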

### **⭕Loading and Chunking Wikipedia Documents**

#### 🔄 **How Unstructured Text is Converted into Graph Data**

* 📚 **Start with Wikipedia** as the knowledge source (it's full of useful info).
* 🔍 Use **WikipediaLoader** to get articles related to your question.
* ✂️ Split the article into smaller parts (chunks) using **TokenTextSplitter**.
  ➤ This keeps each piece within the model's input size limit.
* 🧠 These smaller chunks are then used to **build a graph**, making search and reasoning easier.

---

#### 📝 **Additional Points**

* ✅ **Good chunking is important** – It keeps the meaning of the text clear when reasoning later.
* 🔧 Graphs are created from these chunks using either:

  * **LLMs (smart models)** or
  * **Rule-based systems** (predefined logic)
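
As an illustration of the rule-based option, the sketch below extracts triples with two regular expressions. The patterns, the sample sentence, and the `extract_triples` helper are all hypothetical; real rule-based extractors usually rely on an NLP pipeline rather than bare regexes.

```python
import re

# Each rule pairs a sentence pattern with the relationship type it produces.
RULES = [
    (re.compile(r"(?P<src>[\w-]+) (?:was|is) developed by (?P<dst>[\w-]+)"), "DEVELOPED_BY"),
    (re.compile(r"(?P<src>[\w-]+) is based on (?P<dst>[\w-]+)"), "BASED_ON"),
]


def extract_triples(text):
    """Apply every rule to the text and collect (source, relationship, target) triples."""
    found = []
    for pattern, relation in RULES:
        for match in pattern.finditer(text):
            found.append((match["src"], relation, match["dst"]))
    return found


chunk = "GPT-4 was developed by OpenAI. AutoGPT is based on GPT-4."
extracted = extract_triples(chunk)
print(extracted)
```

Rules like these are cheap and predictable but only catch phrasings they were written for, which is why the notebook uses an LLM for extraction instead.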

In [1]:
# Step 1: Import WikipediaLoader from langchain_community
# This class allows us to programmatically fetch Wikipedia articles based on a query.
from langchain_community.document_loaders import WikipediaLoader

# Step 2: Import TokenTextSplitter from langchain_text_splitters
# TokenTextSplitter breaks down large documents into smaller, manageable chunks based on token counts.
from langchain_text_splitters import TokenTextSplitter


# Step 3: Load Wikipedia documents related to a specific topic (e.g., "large language model")
# This returns a list of Document objects, each containing content and metadata like title and source.
raw_documents = WikipediaLoader(query="large language model").load()

# Step 4: Initialize a TokenTextSplitter
# chunk_size=100 → each chunk holds at most 100 tokens
# chunk_overlap=20 → the last 20 tokens of a chunk are repeated at the start of the next to maintain context
text_splitter = TokenTextSplitter(chunk_size=100, chunk_overlap=20)

# Step 5: Split the raw documents into smaller chunks
# Splitting only the first 3 documents from the Wikipedia query result for simplicity/performance
# This helps make downstream processing (e.g., embeddings, retrieval) more efficient and accurate
documents = text_splitter.split_documents(raw_documents[:3])

# Step 6: Print the first chunk to inspect its structure
# Useful to verify what kind of data (page content + metadata) we are working with
print(documents[0])

Output (the exact text depends on the Wikipedia revision at load time). In one run, the first chunk began with "A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation." and carried the article summary again in its `summary` metadata field. The general structure, abbreviated, is:

```
page_content='A large language model (LLM) is a computational model capable of...'  
metadata={'title': 'Large language model', 
          'summary': "A large language model (LLM) is...", 
          'source': 'https://en.wikipedia.org/wiki/Large_language_model'} 
```
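
With `chunk_size=100` and `chunk_overlap=20`, each new chunk starts 80 tokens after the previous one. The sketch below illustrates that sliding window on a list of integers standing in for token ids. It is an assumption-laden illustration: `token_windows` is a hypothetical helper, and `TokenTextSplitter` itself tokenizes real text, so its internals may differ.

```python
def token_windows(tokens, chunk_size=100, chunk_overlap=20):
    """Split a token list into windows of at most chunk_size that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return windows


tokens = list(range(250))  # stand-in for 250 token ids
windows = token_windows(tokens)
print([len(w) for w in windows])  # [100, 100, 90]
```

The final 20 tokens of each window reappear at the start of the next, which is what lets a sentence that straddles a boundary survive intact in at least one chunk.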

### **⭕Converting Text to Graph Structures**

In [3]:
# Step 1: Import ChatGroq (LLM wrapper) and LLMGraphTransformer for graph generation
from langchain_groq import ChatGroq
from langchain_experimental.graph_transformers import LLMGraphTransformer 

# Step 2: Load environment variables (e.g., API keys) from .env file
from dotenv import load_dotenv
load_dotenv()

# Step 3: Initialize the LLM (Groq's Llama 3.3 70B model)
# No temperature or max_tokens is passed here, so ChatGroq's defaults apply
llm = ChatGroq(model="llama-3.3-70b-versatile")

# Step 4: Wrap the LLM with a graph transformer to extract entities and relationships
llm_transformer = LLMGraphTransformer(llm=llm)  

# Step 5: Convert document chunks into structured graph documents (nodes + edges)
graph_documents = llm_transformer.convert_to_graph_documents(documents) 

# Step 6: Print the resulting graph structure for inspection
print(graph_documents)

Output: in one run, the first graph document contained the nodes `Llm` (type `Language model`), `Gpt` (type `Transformer`), and `Chatgpt`, `Gemini`, and `Claude` (type `Chatbot`), linked by `Llm -[INCLUDES]-> Gpt` and by `Gpt -[USED_IN]->` each of the three chatbots. Node types and relationship names depend on the article revision and on the model, so they vary between runs. An abbreviated output from another run:

```
[GraphDocument( 
    nodes=[ 
        Node(id='Llm', type='Computational model'), 
        Node(id='Language Generation', type='Concept'), 
        Node(id='Natural Language Processing Tasks', type='Concept'), 
        Node(id='Llama Family', type='Computational model'), 
        Node(id='Ibm', type='Organization'), 
        ..., Node(id='Bert', type='Computational model')], 
    relationships=[ 
        Relationship(source=Node(id='Llm', type='Computational model'), 
                     target=Node(id='Language Generation', type='Concept'), 
                     type='CAPABLE_OF'), 
        ...])] 

```

![img_3](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0603.jpeg?raw=true)

## **⭐02: Storing and Querying documents**

![img_4](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0604.jpeg?raw=true)


Once graph documents are prepared, we store them in a `Neo4j` database and interact using Cypher queries.


#### 🧠 What is **Neo4j**?

* **Neo4j** is a **graph database** that stores data in **nodes** (entities) and **relationships** (connections between entities).
* Unlike traditional relational databases, it excels at handling **complex, interconnected data** — like `social networks`, `knowledge graphs`, `recommendation systems`, etc.

---

#### 🔗 What is its role in **LangChain**/**LangGraph**?

* In **LangChain**/**LangGraph**, Neo4j is used to:

  * Store and retrieve structured knowledge (as a **Knowledge Graph**).
  * Help language models **reason** better by providing **contextual, structured data**.
  * Support **RAG** (Retrieval-Augmented Generation) pipelines where the LLM queries the knowledge graph for better answers.

---

#### ✅ Why use Neo4j with LangChain / LangGraph?

* **Structured memory**: It acts like a memory graph for agents — "who met whom, when, where, and why".
* **Better reasoning**: LLMs can retrieve and reason over structured data instead of guessing from text alone.
* **Entity linking**: Tracks and connects information about entities like people, places, or products.
* **Dynamic updates**: You can update the graph over time with new info as the agent learns.

---

#### 🔍 Example Use Case

> Imagine you're building an **AI agent that acts like a personal assistant**.

* The assistant talks to you daily and stores facts:

  * "Alex likes leg day."
  * "Alex has a herniated disc at L4-L5."
  * "Avoids barbell squats."
* These facts are stored in **Neo4j as a knowledge graph**:

  ```plaintext
  (:Person {name: "Alex"})-[:LIKES]->(:Activity {name: "leg day"})
  (:Person {name: "Alex"})-[:HAS_INJURY]->(:Injury {type: "herniated disc", location: "L4-L5"})
  ```

When the assistant is asked:

> "Should I do squats today?"

It can reason: "Alex has a herniated disc → avoid heavy spinal loading → no barbell squats."

---

#### ⚙️ How to Use Neo4j with LangChain (Code Example)

> 🛠 **Install dependencies:**

```bash
pip install -U langchain-neo4j neo4j
```

> 🧱 **Set up Neo4j graph in LangChain:**

```python
from langchain_neo4j import Neo4jGraph

graph = Neo4jGraph(
    url="bolt://localhost:7687",
    username="neo4j",
    password="your_password"
)
```

> 📥 **Ingest data into the graph:**

```python
graph.query("""
MERGE (p:Person {name: 'Faizy'})
MERGE (a:Activity {name: 'leg day'})
MERGE (p)-[:LIKES]->(a)
""")
```

> 🔍 **Query the graph using Cypher:**

```python
result = graph.query("""
MATCH (p:Person)-[:LIKES]->(a:Activity)
WHERE p.name = 'Faizy'
RETURN a.name
""")
print(result)
```
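
When names come from user input, it is safer to pass them as Cypher parameters than to paste them into the query string. The sketch below only builds the statement and its parameter dictionary, so it runs without a database; `build_like_statement` is a hypothetical helper, and the final call assumes the `graph` object created above.

```python
def build_like_statement(person, activity):
    """Return a parameterized MERGE statement and the parameters it expects."""
    cypher = (
        "MERGE (p:Person {name: $person}) "
        "MERGE (a:Activity {name: $activity}) "
        "MERGE (p)-[:LIKES]->(a)"
    )
    return cypher, {"person": person, "activity": activity}


cypher, params = build_like_statement("Faizy", "leg day")
print(cypher)
print(params)

# With a live connection (assumes the Neo4jGraph instance named `graph` from above):
# graph.query(cypher, params)
```

Because the values travel separately from the query text, quotes or Cypher keywords inside a name cannot change what the statement does.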

---

#### 🧩 Summary

| Feature | Neo4j in LangChain                                      |
| ------- | ------------------------------------------------------- |
| Type    | Graph Database                                          |
| Purpose | Store structured knowledge (nodes/edges)                |
| Benefit | Enables reasoning and memory for agents                 |
| Example | Personal assistant storing health + workout preferences |
| Tools   | `langchain-neo4j`, Cypher queries                       |


![img_5](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0605.jpeg?raw=true)

In [None]:
# === Import Required Libraries ===
from langchain_neo4j import Neo4jGraph  # Interface to connect and interact with Neo4j
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_groq import ChatGroq  # LLM provider (Groq's LLaMA models)
from langchain_core.documents import Document
from dotenv import load_dotenv  # Load environment variables from .env
import os

# === Load environment variables ===
load_dotenv()


# === Connect to Neo4j ===
# Establishes a session with the Neo4j graph database
graph = Neo4jGraph(
    url=os.getenv("NEO4J_URI"),
    username=os.getenv("NEO4J_USERNAME"),
    password=os.getenv("NEO4J_PASSWORD")
)

# === Initialize the LLM and Graph Transformer ===
# ChatGroq serves the Llama 3.3 70B model, used here to extract graph-structured info
llm = ChatGroq(model="llama-3.3-70b-versatile")
llm_transformer = LLMGraphTransformer(llm=llm)

# === Convert document chunks into graph format (nodes + relationships) ===
graph_documents = llm_transformer.convert_to_graph_documents(documents)

# === Store graph documents in Neo4j ===
graph.add_graph_documents(
    graph_documents,
    include_source=True,   # Adds MENTIONS edges linking nodes to source docs
    baseEntityLabel=True   # Adds a generic __Entity__ label to all nodes
)


In [None]:
# Refresh the graph schema to sync with the latest database structure
graph.refresh_schema()

# Print the current schema: shows node labels, properties, and relationship types
# (get_schema is a property, so it is read without parentheses)
print(graph.get_schema)

In [None]:
# Sanity check: count all nodes now stored in the graph
print(graph.query("MATCH (n) RETURN COUNT(n)"))

Example schema output from the schema cell above (abbreviated):

```
Node properties: 
Concept {id: STRING} 
Architecture {id: STRING} 
Organization {id: STRING} 
Event {id: STRING} 
Paper {id: STRING} 
The relationships: 
(:Concept)-[:DEVELOPED_BY]->(:Person) 
(:Architecture)-[:BASED_ON]->(:Concept) 
(:Organization)-[:PROPOSED]->(:Concept) 
(:Document)-[:MENTIONS]->(:Event) 
(:Paper)-[:BASED_ON]->(:Concept) 
```

```python
# Verify the Bolt connection separately from LangChain
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "your_password"))
with driver.session() as session:
    result = session.run("RETURN 1")
    print(result.single())
driver.close()  # release the driver's connections
```
If this prints a record containing `1`, the connection works. If you still get `ConnectionRefusedError`, the database is not running or the Bolt port (7687 here) is not reachable.

## **⭐03: Creating the RAG Chain**

![img_6](https://github.com/mohd-faizy/Developing_LLMs_Applications_with_LangChain/blob/main/_img/0606.jpeg?raw=true)

We now build the retrieval-augmented generation chain using graph context.

In [None]:
# Import the Cypher-based QA chain for graph querying
from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain  

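# Initialize the LLM used for Cypher generation and answer formatting
# Gemini is used here (needs the langchain-google-genai package and a GOOGLE_API_KEY);
# the commented-out line shows the Groq alternative
from langchain_google_genai import ChatGoogleGenerativeAI
# llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0.7, max_tokens=100)
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash")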

# Create a QA chain that uses the LLM and connected Neo4j graph
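chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True,  # Prints generated Cypher query and execution details
    allow_dangerous_requests=True  # Required by recent releases: generated Cypher runs against the database
)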

# Ask a natural language question → LLM generates a Cypher query → executes on Neo4j
result = chain.invoke({"query": "What is the most accurate model?"}) 

# Print the final answer from the graph database
print(f"Final answer: {result['result']}")


```
> Entering new GraphCypherQAChain chain... 
  Generated Cypher: 
  MATCH (m:Model) 
  RETURN m 
  ORDER BY m.accuracy DESC 
  LIMIT 1; 
  Full Context: 
  [{'m': {'id': 'Artificial Neural Networks'}}] 

> Finished chain. 

> Final answer: Artificial Neural Networks 
```

`GraphCypherQAChain.from_llm` also accepts arguments for customizing each stage:

- `qa_prompt` : Prompt template for result generation
- `cypher_prompt` : Prompt template for Cypher generation
- `cypher_llm` : LLM for Cypher generation
- `qa_llm` : LLM for result generation

## **⭐04: Improving Graph Retrieval**

#### **Techniques**

> **Main limitation:** the reliability of translating the user's question into a Cypher query.

##### **Strategies to improve graph retrieval system:**
- ***1. Filtering Graph Schema***
- ***2. Validating the Cypher Query***
- ***3. Few-shot prompting***

In [None]:
#=====================
# 1. Filtering 
#=====================

# Import the GraphCypherQAChain for querying Neo4j using natural language
from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain  

# Initialize the LLM (LLaMA 3.1 from Groq)
llm = ChatGroq(model="llama-3.1-8b-instant", temperature=0.7, max_tokens=100)

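# Create a QA chain with schema filtering enabled
# exclude_types=["Concept"] removes 'Concept' nodes (and their relationships) from the
# schema shown to the LLM, so generated Cypher should not reference them
chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    exclude_types=["Concept"],
    verbose=True,  # Show generated Cypher and execution logs
    allow_dangerous_requests=True  # Required by recent releases: generated Cypher runs against the database
)

# Print the filtered schema that the chain passes to the LLM
# (graph.get_schema would still show the full, unfiltered schema)
print(chain.graph_schema)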

```
Node properties: 
Document {title: STRING, id: STRING, text: STRING, summary: STRING, source: STRING} 
Organization {id: STRING} 
```
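
Conceptually, excluding a type removes it, and any relationship touching it, from the schema text the LLM sees. The sketch below illustrates that idea on a small dictionary; the schema layout and the `filter_schema` helper are assumptions made for this example, not the library's implementation.

```python
# A small, hypothetical structured schema: node labels with properties, plus relationships.
schema = {
    "node_props": {
        "Document": ["title", "id", "text", "summary", "source"],
        "Organization": ["id"],
        "Concept": ["id"],
    },
    "relationships": [
        ("Organization", "PROPOSED", "Concept"),
        ("Document", "MENTIONS", "Organization"),
    ],
}


def filter_schema(schema, exclude_types):
    """Drop excluded node labels and every relationship that starts or ends at one."""
    excluded = set(exclude_types)
    return {
        "node_props": {
            label: props for label, props in schema["node_props"].items() if label not in excluded
        },
        "relationships": [
            (start, rel, end)
            for start, rel, end in schema["relationships"]
            if start not in excluded and end not in excluded
        ],
    }


filtered = filter_schema(schema, ["Concept"])
print(list(filtered["node_props"]))
print(filtered["relationships"])
```

A smaller schema leaves the model fewer ways to write a query against node types that are irrelevant to the question.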

In [None]:
#================================
# 2. Validating the Cypher query
#================================

# Create a QA chain with Cypher validation enabled
# validate_cypher=True checks each relationship pattern in the generated query
# against the graph schema and corrects reversed relationship directions
chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True,
    validate_cypher=True,
    allow_dangerous_requests=True  # Required by recent releases: generated Cypher runs against the database
)


With `validate_cypher=True`, the chain:

1. Detects the nodes and relationships in the generated query
2. Determines the direction of each relationship
3. Checks them against the graph schema
4. Updates the direction of relationships where needed
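
The core of the direction check can be sketched in a few lines. This is a simplified illustration that works on already-parsed `(start label, relationship, end label)` triples; the real validator has to parse Cypher text first, and the `SCHEMA` set and `correct_direction` helper are hypothetical.

```python
# Relationships allowed by the graph schema, as (start label, type, end label).
SCHEMA = {
    ("Organization", "PROPOSED", "Concept"),
    ("Architecture", "BASED_ON", "Concept"),
}


def correct_direction(start, rel, end, schema=SCHEMA):
    """Return the pattern as-is if valid, flipped if only the reverse exists, else None."""
    if (start, rel, end) in schema:
        return (start, rel, end)
    if (end, rel, start) in schema:
        return (end, rel, start)
    return None


print(correct_direction("Concept", "PROPOSED", "Organization"))  # flipped to match the schema
print(correct_direction("Concept", "FOUNDED", "Organization"))   # None: not in the schema
```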

In [None]:
#======================
# 3. Few-shot prompting
#======================

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate  

# Define a few-shot set: sample questions and their correct Cypher queries
# Literal Cypher braces are doubled ({{ }}) so the prompt formatter does not treat them as placeholders
examples = [
    {
        "question": "How many notable large language models are mentioned in the article?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN count(DISTINCT m)",
    },
    {
        "question": "Which companies or organizations have developed the large language models mentioned?",
        "query": "MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {{id: 'Large Language Model'}}) RETURN DISTINCT o.id",
    },
    {
        "question": "What is the largest model size mentioned in the article, in terms of number of parameters?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN max(m.parameters) AS largest_model",
    },
]

# Format each example into a consistent prompt structure
example_prompt = PromptTemplate.from_template("User input: {question}\nCypher query: {query}")  

# Construct the few-shot Cypher prompt with schema injection
cypher_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You are a Neo4j expert. Given an input question, create a syntactically correct "
        "Cypher query to run.\n\nHere is the schema information\n{schema}.\n\n"
        "Below are a number of examples of questions and their corresponding Cypher queries."
    ),
    suffix="User input: {question}\nCypher query: ",
    input_variables=["question", "schema"],  # every placeholder used in prefix and suffix
)
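
Cypher property maps use curly braces, which are also the placeholder syntax of format-style templates. The sketch below uses plain `str.format` as a stand-in for the prompt's default formatting (an assumption made for this illustration) to show why literal braces in example queries need to be doubled.

```python
# Doubled braces survive formatting as literal single braces.
example_query = "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN count(m)"
template = (
    "Schema:\n{schema}\n\n"
    "Example query: " + example_query + "\n\n"
    "User input: {question}\nCypher query: "
)

prompt = template.format(schema="Concept {id: STRING}", question="How many models are there?")
print(prompt)

# Single braces are read as a placeholder named `id`, which has no value.
try:
    "MATCH (m:Concept {id: 'x'}) RETURN m".format(question="q")
    unescaped_ok = True
except KeyError:
    unescaped_ok = False
print("Unescaped braces formatted cleanly:", unescaped_ok)
```

Values substituted into a placeholder, such as the schema string here, are not re-scanned, so only braces written directly in the template or its examples need escaping.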


In [None]:
#=================================
# Adding Few-shot Prompt to Chain
#=================================

# Create a QA chain with:
# - Few-shot Cypher prompt for better query generation
# - Cypher validation to correct relationship directions
# - Verbose mode to log internal steps
chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    cypher_prompt=cypher_prompt,
    verbose=True,
    validate_cypher=True,
    allow_dangerous_requests=True  # Required by recent releases: generated Cypher runs against the database
)