# RAG Failure #1: The Multi-Hop Disconnect

## The Problem
Standard RAG retrieves documents by semantic similarity to the query. If the answer requires connecting **Fact A $\to$ Fact B $\to$ Fact C**, and Fact C is semantically unrelated to the original query, a single retrieval pass never surfaces Fact C, so the model cannot complete the chain.

## The Scenario: Corporate Intelligence Investigation
**Query:** "What is the primary currency used in the city where the lead engineer of Project Chimera was born?"

**The Logic Chain (Hidden in disjointed docs):**
1.  **Doc 1:** Project Chimera is led by **Dr. Elias Thorne**.
2.  **Doc 2:** Dr. Elias Thorne was born in **Valoria City**.
3.  **Doc 3:** Valoria City uses the **Valorian Credit (V-Cred)** as its currency.

**The Adversarial Noise:**
-   Mention of "Project Chimera-Next" (a different project).
-   Mention of "Valoria" in a tourism context (irrelevant).
-   Mention of other currencies (Euro, USD) in irrelevant docs.
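
The logic chain above can be sketched as a lookup that resolves one relation per hop. This is a minimal illustration, not part of the notebook's pipeline: the relation names (`led_by`, `born_in`, `uses_currency`, `managed_by`) and the helper `follow_chain` are assumptions introduced here.

```python
# Toy fact store for the scenario above. Relation names are illustrative
# assumptions, not something the documents themselves define.
FACTS = {
    ("Project Chimera", "led_by"): "Dr. Elias Thorne",
    ("Dr. Elias Thorne", "born_in"): "Valoria City",
    ("Valoria City", "uses_currency"): "Valorian Credit (V-Cred)",
    # Distractor: similar name, different project.
    ("Project Chimera-Next", "managed_by"): "Sarah Connor",
}


def follow_chain(start, relations, facts=FACTS):
    """Resolve one relation per hop; return every entity visited, in order."""
    path = [start]
    for relation in relations:
        key = (path[-1], relation)
        if key not in facts:
            raise KeyError(f"missing hop: {key}")
        path.append(facts[key])
    return path


hops = follow_chain("Project Chimera", ["led_by", "born_in", "uses_currency"])
print(" -> ".join(hops))
```

Each hop depends on the entity found by the previous one, which is why the final fact cannot be reached from the wording of the query alone.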

In [None]:
# --- Step 1: Environment Setup ---
# Install dependencies for Colab/Jupyter (versions are not pinned here; pin them for reproducible runs)
!pip install -q langchain langchain-community langchain-huggingface faiss-cpu networkx transformers sentence-transformers accelerate bitsandbytes

In [None]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
from langchain_huggingface import HuggingFacePipeline, HuggingFaceEmbeddings
import networkx as nx

# --- Step 2: Load LLM & Embeddings ---
# We use TinyLlama 1.1B Chat: small enough to run quickly, and tuned for chat-style instructions.
# The model is not gated, so no Hugging Face login is needed, and it uses the Llama architecture.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

print(f"Loading {model_id}...")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

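# Create a text-generation pipeline
# temperature=0.1 keeps sampling close to greedy, so extraction is rarely 'creative'.
# With do_sample=True the output is still not strictly deterministic; use do_sample=False for that.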
pipe = pipeline(
    "text-generation", 
    model=model, 
    tokenizer=tokenizer, 
    max_new_tokens=256, 
    temperature=0.1,  
    do_sample=True
)

llm = HuggingFacePipeline(pipeline=pipe)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

print("Model loaded. Pipeline ready.")
print("Embedding model loaded.")

Loading TinyLlama/TinyLlama-1.1B-Chat-v1.0...
Model loaded. Pipeline ready.
Embedding model loaded.
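
The effect of `temperature` in the pipeline settings can be illustrated with a small standalone sketch. The logits below are hypothetical values chosen for illustration, and `softmax_with_temperature` is a helper introduced here, not a `transformers` function. It shows that a low temperature sharpens the distribution without removing randomness, while greedy selection (what `do_sample=False` gives you in `transformers`) involves no randomness at all.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical next-token scores for three candidate tokens.
logits = [2.0, 1.5, 0.5]

probs_default = softmax_with_temperature(logits, 1.0)
probs_low = softmax_with_temperature(logits, 0.1)

# Greedy decoding: always pick the highest-scoring token (no randomness).
greedy_choice = max(range(len(logits)), key=lambda i: logits[i])

# Sampling: draw from the distribution. Low temperature makes the top token
# very likely, but the other tokens keep a non-zero probability.
rng = random.Random(0)
sampled_choice = rng.choices(range(len(logits)), weights=probs_low, k=1)[0]

print(f"T=1.0 -> {[round(p, 3) for p in probs_default]}")
print(f"T=0.1 -> {[round(p, 3) for p in probs_low]}")
print(f"greedy pick: {greedy_choice}, sampled pick: {sampled_choice}")
```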


In [None]:
# Document lives in langchain_core; the older langchain.docstore path is deprecated.
from langchain_core.documents import Document

# --- Step 3: Simulate The PDF Data ---
raw_texts = [
    # -- The Critical Chain (Split across 3 docs) --
    "Project Chimera is a classified top-secret aerospace initiative led by Chief Engineer Dr. Elias Thorne.",
    "Dr. Elias Thorne is a renowned physicist who was born and raised in the coastal metropolis of Valoria City.",
    "Valoria City is an independent economic zone that exclusively trades using the Valorian Credit (V-Cred).",
    
    # -- The Noise / Distractors (Designed to trick vector search) --
    # 1. Shares 'Project' and 'Chimera' with the query -> high similarity to the project part
    "Project Chimera-Next is a separate software subsidiary managed by Sarah Connor, focusing on AI.",
    # 2. Names real currencies and shares 'used' with the query -> high similarity to the currency part
    "The Euro and USD are commonly used in international trade, but not in all independent zones.",
    # 3. Mentions 'Valoria City' but in an irrelevant (tourism) context
    "Valoria City is a popular tourist destination known for its beaches, distinct from its economic policies."
]

docs = [Document(page_content=t) for t in raw_texts]
print(f"Created {len(docs)} documents.")
print(f"Sample Doc: [{docs[0].page_content}]")

Created 6 documents.
Sample Doc: [Project Chimera is a classified top-secret aerospace initiative led by Chief Engineer Dr. Elias Thorne.]


In [None]:
# --- Step 4: Naive RAG Implementation ---
from langchain_community.vectorstores import FAISS

print("\n--- RUNNING NAIVE RAG ---")
query = "What is the primary currency used in the city where the lead engineer of Project Chimera was born?"
print(f"Query: {query}")

# 1. Indexing
vectorstore = FAISS.from_documents(docs, embeddings)

# 2. Retrieval
# We use k=2, but the answer needs three documents: the chain has 3 hops (Project->Person->City->Currency).
# Vector search tends to return the 'Project' doc and the 'Distractor Project' doc because both share keywords with the query.
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
retrieved_docs = retriever.invoke(query)

print("\nRetrieved Context (k=2):")
context_str = ""
for i, d in enumerate(retrieved_docs):
    print(f"{i+1}. {d.page_content}")
    context_str += d.page_content + "\n"

# 3. Generation
prompt = f"<|system|>\nAnswer based ONLY on the context provided. If unsure, say you don't know.\n<|user|>\nContext:\n{context_str}\nQuestion:\n{query}\n<|assistant|>"
response = llm.invoke(prompt)

print("\nLLM Response (Naive):")
print(response.split("<|assistant|>")[-1].strip())


--- RUNNING NAIVE RAG ---
Query: What is the primary currency used in the city where the lead engineer of Project Chimera was born?

Retrieved Context (k=2):
1. Project Chimera is a classified top-secret aerospace initiative led by Chief Engineer Dr. Elias Thorne.
2. Project Chimera-Next is a separate software subsidiary managed by Sarah Connor, focusing on AI.

LLM Response (Naive):
Based on the context provided, there is no information about the primary currency used in the city where the lead engineer of Project Chimera was born.
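
The retrieval result above follows from how top-k similarity search ranks documents. The sketch below reproduces the mechanism with hand-picked three-dimensional toy vectors. These are assumptions for illustration only, not the real `all-MiniLM-L6-v2` embeddings, and the helpers `cosine` and `top_k` are introduced here rather than taken from FAISS.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda name: cosine(query_vec, doc_vecs[name]),
        reverse=True,
    )
    return ranked[:k]


# Hand-picked toy vectors (NOT real embeddings). Axes, loosely:
# [project-related, person/birthplace-related, currency-related]
query_vec = [0.8, 0.3, 0.5]
doc_vecs = {
    "chimera_lead": [0.9, 0.3, 0.1],
    "thorne_birthplace": [0.1, 0.9, 0.1],
    "valoria_currency": [0.0, 0.3, 0.9],
    "chimera_next_distractor": [0.9, 0.1, 0.1],
}

# Full ranking, most similar first.
for name in top_k(query_vec, doc_vecs, k=len(doc_vecs)):
    print(f"{cosine(query_vec, doc_vecs[name]):.3f}  {name}")

retrieved = top_k(query_vec, doc_vecs, k=2)
print("k=2 keeps:", retrieved)
```

In a corpus this small, a larger `k` would simply return everything. On a real corpus the later hops compete with many more distractors, so raising `k` is not a dependable fix.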


In [None]:
# --- Step 5: Dynamic Knowledge Graph Extraction (LLM-driven) ---
# This is a simple format-constrained extraction pipeline.
# We ask the LLM to emit "Subject | Relation | Object" and then parse on the "|" separator.

kg = nx.DiGraph()

def extract_triplets_with_llm(text):
    """
    Uses the LLM to analyze text and extract one structured relation.
    The prompt includes a single worked example (one-shot) to show the expected format.
    Returns the cleaned LLM output and the parsed fields (empty list if no "|" was found).
    """
    extraction_prompt = f"""<|system|>
    You are a Knowledge Graph Engineer. Extract the most important relationship from the sentence.
    Format: Subject | Relation | Object
    Example: "Apple Inc. was founded by Steve Jobs." -> Apple Inc. | founded_by | Steve Jobs
    <|user|>
    Sentence: {text}
    <|assistant|>"""

    # 1. Run LLM
    raw_out = llm.invoke(extraction_prompt)
    # 2. Keep only the text after the last assistant tag (the model's reply)
    cleaned_out = raw_out.split("<|assistant|>")[-1].strip()

    # 3. Parse the first line that contains the "|" separator,
    #    so extra chatter on other lines does not leak into the fields
    parts = []
    for line in cleaned_out.splitlines():
        if "|" in line:
            parts = [p.strip() for p in line.split("|")]
            break

    return cleaned_out, parts

print("\n--- INDUSTRY STANDARD KG EXTRACTION ---")

# Iterate through our documents and build the graph
for doc in docs:
    print(f"\nParsing Chunk: {doc.page_content[:90]}...")
    
    # Call the actual LLM function
    raw_response, parsed_parts = extract_triplets_with_llm(doc.page_content)
    
    if len(parsed_parts) >= 3:
        subj, rel, obj = parsed_parts[0], parsed_parts[1], parsed_parts[2]
        print(f"   [Raw LLM Output]: {raw_response}")
        print(f"   [Graph Action]: Added Edge ({subj}) -> [{rel}] -> ({obj})")
        kg.add_edge(subj, obj, relation=rel)
    else:
        print(f"   [Skipped]: LLM output format mismatch. Raw: {raw_response}")

print(f"\nGraph Statistics: {kg.number_of_nodes()} Nodes, {kg.number_of_edges()} Edges")


--- INDUSTRY STANDARD KG EXTRACTION ---
Parsing Chunk: Project Chimera is a classified top-secret aerospace initiative led by Chief Engineer Dr. Elias Thorne.
   [Raw LLM Output]: Project Chimera | led_by | Dr. Elias Thorne
   [Graph Action]: Added Edge (Project Chimera) -> [led_by] -> (Dr. Elias Thorne)

Parsing Chunk: Dr. Elias Thorne is a renowned physicist who was born and raised in the coastal metropolis of Valoria City.
   [Raw LLM Output]: Dr. Elias Thorne | born_in | Valoria City
   [Graph Action]: Added Edge (Dr. Elias Thorne) -> [born_in] -> (Valoria City)

Parsing Chunk: Valoria City is an independent economic zone that exclusively trades using the Valorian Credit (V-Cred).
   [Raw LLM Output]: Valoria City | uses_currency | Valorian Credit (V-Cred)
   [Graph Action]: Added Edge (Valoria City) -> [uses_currency] -> (Valorian Credit (V-Cred))
...
Graph Statistics: 10 Nodes, 6 Edges
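
Small models do not always respect the `Subject | Relation | Object` format, so the parsing step is worth tightening. The sketch below is one possible stricter parser, under the assumption that a valid answer sits on a single line with exactly three non-empty fields. `parse_triplet` is a hypothetical helper and is not used by the cells above.

```python
import re


def parse_triplet(raw_output):
    """Return (subject, relation, object) from the first well-formed line, else None.

    A line is accepted only if it splits into exactly three non-empty
    fields on "|". Relation names are normalized to snake_case.
    """
    for line in raw_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) == 3 and all(fields):
            subject, relation, obj = fields
            relation = re.sub(r"\s+", "_", relation.lower())
            return subject, relation, obj
    return None


print(parse_triplet("Project Chimera | led_by | Dr. Elias Thorne"))
print(parse_triplet("Sure! Here is the triplet:\nValoria City | Uses Currency | Valorian Credit (V-Cred)"))
print(parse_triplet("I could not find a relationship."))
```

Rejecting lines with more or fewer than three fields trades recall for cleaner edges: a malformed answer is skipped instead of adding a wrong node to the graph.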


In [None]:
# --- Step 6: Graph RAG Implementation (The Solver) ---

print("\n--- RUNNING GRAPH RAG SOLVER ---")

def get_graph_context(graph, query_str, hops=3):
    """
    Instead of searching for text, we search for a connected path.
    """
    # 1. Entity Resolution (find a graph node that is mentioned in the query)
    start_node = None
    for node in graph.nodes():
        # Simple substring matching for the demo. In prod, use vector similarity or fuzzy matching.
        if node in query_str and len(node) > 4:
            start_node = node
            break

    print(f"Query Entity Detected: '{start_node}'")
    if not start_node:
        return "No entity found."

    # 2. Traverse (breadth-first search) to gather context
    print("\nTraversing Graph Paths...")
    path_sentences = []

    # nx.bfs_edges walks outgoing edges breadth-first, up to `hops` levels from the start node.
    # This simulates "Multi-Hop" reasoning.
    edges = list(nx.bfs_edges(graph, source=start_node, depth_limit=hops))

    for u, v in edges:
        rel = graph[u][v]['relation']
        sentence = f"{u} is {rel} {v}."
        path_sentences.append(sentence)

    return " ".join(path_sentences)

# 1. Get Context from Graph
graph_context = get_graph_context(kg, query, hops=3)
print(f"Found Path: {graph_context}")

# 2. Generate Final Answer
print("\n--- FINAL ANSWER GENERATION ---")
prompt = f"<|system|>\nAnswer based on the context.\n<|user|>\nContext: {graph_context}\nQuestion: {query}\n<|assistant|>"
final_response = llm.invoke(prompt)

print("LLM Graph Response:")
print(final_response.split("<|assistant|>")[-1].strip())


--- RUNNING GRAPH RAG SOLVER ---
Query Entity Detected: 'Project Chimera'

Traversing Graph Paths...
Found Path: Project Chimera is led_by Dr. Elias Thorne. Dr. Elias Thorne is born_in Valoria City. Valoria City is uses_currency Valorian Credit (V-Cred).

--- FINAL ANSWER GENERATION ---
LLM Graph Response:
The primary currency used in the city where the lead engineer of Project Chimera was born is the Valorian Credit (V-Cred).
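
The entity resolution step in the solver takes the first node whose name is a substring of the query, which would pick `Project Chimera` even for a question about `Project Chimera-Next`. A slightly safer variant, sketched below, prefers the longest node name that appears in the query as a whole phrase. `resolve_entity` and its `min_length` parameter are assumptions introduced here, not part of the notebook's code.

```python
import re


def resolve_entity(query, node_names, min_length=5):
    """Return the longest node name found in the query as a whole phrase, or None."""
    matches = []
    for name in node_names:
        if len(name) < min_length:
            continue
        # The lookarounds reject matches glued to a word character or hyphen,
        # so "Project Chimera" does not match inside "Project Chimera-Next".
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        if re.search(pattern, query, flags=re.IGNORECASE):
            matches.append(name)
    return max(matches, key=len) if matches else None


nodes = ["Project Chimera", "Project Chimera-Next", "Dr. Elias Thorne", "Valoria City"]

print(resolve_entity("Who leads Project Chimera?", nodes))
print(resolve_entity("Who manages Project Chimera-Next?", nodes))
print(resolve_entity("What is the weather today?", nodes))
```

The returned name can be passed as the start node of the traversal. Fuzzy or embedding-based matching would still be needed for paraphrased or misspelled entity names.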
