# 🔍 Tutorial 3: Understanding RAG (Retrieval-Augmented Generation)

**Welcome to the exciting world of RAG!** This is where an AI learns to look things up in your documents before it answers.

## 🎯 What You'll Learn:
- What RAG means in simple terms
- How AI "understands" text with embeddings
- How to find the most relevant information
- Building a smart question-answering system
- Why RAG is better than just asking AI questions

## ⏱️ Time: 30-35 minutes
## 📚 Level: Beginner
## 📋 Prerequisites: Tutorials 1 & 2 completed

## 🤔 What is RAG?

**RAG = Retrieval-Augmented Generation**

Sounds complicated? Let's break it down:

### 🧩 The Three Parts:
1. **Retrieval**: Finding the right information
2. **Augmented**: Adding that information to help the AI
3. **Generation**: AI creates an answer using that information
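
The three parts above can be sketched in a few lines of plain Python. This is a toy illustration, not how the rest of this tutorial does it: retrieval here is simple word overlap instead of embeddings, `generate` is a placeholder rather than a real model, and the sample sentences and every function name are made up for the example.

```python
import re

# Toy "knowledge base": in a real system these would be chunks of your documents.
TOY_DOCUMENTS = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "The Eiffel Tower was completed in 1889.",
]


def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=1):
    """Retrieval: rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def augment(question, retrieved):
    """Augmented: put the retrieved text into the prompt next to the question."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}"


def generate(prompt):
    """Generation: a placeholder for the language model call (no real AI here)."""
    return f"[a model would answer here, given this prompt]\n{prompt}"


toy_question = "What is the capital of France?"
toy_retrieved = retrieve(toy_question, TOY_DOCUMENTS)
toy_prompt = augment(toy_question, toy_retrieved)
print(generate(toy_prompt))
```

The prompt that reaches the model now contains the sentence about Paris, which is the whole idea: the model answers from supplied text instead of from memory.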

### 🏠 Real-World Analogy:
Imagine you're helping a friend with homework:
- **Without RAG**: "What's the capital of France?" → Friend guesses from memory
- **With RAG**: Friend first looks it up in a textbook, then answers confidently

### 🎯 Why RAG is Amazing:
- **Up-to-date**: Can use information newer than the AI's training data
- **Accurate**: Answers are grounded in real documents, not just AI memory
- **Specific**: Can answer questions about your specific documents
- **Trustworthy**: You can check which passages the answer came from

## 📚 Step 1: Setup for RAG

Let's import everything we need to build a RAG system.

In [None]:
# Import tools for RAG
import sys

sys.path.append('..')

from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_ollama import ChatOllama, OllamaEmbeddings

print("🔍 RAG tools imported!")
print("📊 Ready to build an intelligent retrieval system")
print("🧠 This will combine document search with AI understanding")

## 🧠 Step 2: Understanding Embeddings

**Embeddings** are how AI "understands" text. Think of them as AI's way of describing text with numbers.

### 🎨 Analogy:
- **Human**: "This text is about cats" 
- **AI Embedding**: [0.2, -0.8, 0.5, 0.1, ...] (hundreds of numbers)

### 🤝 Similar Meanings = Similar Numbers:
- "Dog" and "Puppy" have similar embeddings
- "Car" and "Airplane" are less similar
- "Happy" and "Joyful" are very similar

In [None]:
# Create an embedding model
# (needs a running Ollama server with the model available: `ollama pull nomic-embed-text`)
print("🧠 Creating an embedding model...")
print("⏳ This connects to Ollama's nomic-embed-text model...")

embeddings = OllamaEmbeddings(
    model="nomic-embed-text"
)

print("✅ Embedding model ready!")
print("🔢 This model converts text into numbers that AI can understand")

# Let's see embeddings in action
print("\n🧪 EXPERIMENT: How AI Sees Text")
print("=" * 40)

# Create embeddings for different words
word1 = "cat"
word2 = "dog"
word3 = "car"

print("Converting words to AI numbers...")
embedding1 = embeddings.embed_query(word1)
embedding2 = embeddings.embed_query(word2)
embedding3 = embeddings.embed_query(word3)

print("\n📊 Results:")
print(f"'{word1}' → {len(embedding1)} numbers (first 5: {embedding1[:5]})")
print(f"'{word2}' → {len(embedding2)} numbers (first 5: {embedding2[:5]})")
print(f"'{word3}' → {len(embedding3)} numbers (first 5: {embedding3[:5]})")

print("\n💡 Each word becomes a list of numbers that capture its meaning!")
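
The cell above prints the raw numbers, but how do we tell whether two lists are "similar"? A common measure is cosine similarity, which is close to 1 when two vectors point the same way and lower when they point in different directions. Here is a minimal sketch using tiny hand-made vectors (invented for illustration, not produced by `nomic-embed-text`):

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-number "embeddings", invented for illustration only.
# Real embeddings from the model above are much longer lists.
toy_cat = [0.9, 0.8, 0.1]
toy_dog = [0.8, 0.9, 0.2]
toy_car = [0.1, 0.2, 0.9]

cat_dog = cosine_similarity(toy_cat, toy_dog)
cat_car = cosine_similarity(toy_cat, toy_car)

print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

To try it on the real embeddings, you could call `cosine_similarity(embedding1, embedding2)` and compare the result with `cosine_similarity(embedding1, embedding3)`.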

## 📄 Step 3: Prepare Documents for RAG

Let's load our research paper and split it into chunks. In the next step, we'll convert those chunks into embeddings.

In [None]:
# Load and split the document (like Tutorial 2)
print("📄 Loading research paper...")

pdf_path = "../examples/d4sc03921a.pdf"
loader = PyPDFLoader(pdf_path)
documents = loader.load()

# Split into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
chunks = text_splitter.split_documents(documents)

print(f"✅ Paper loaded and split into {len(chunks)} chunks")
print("📝 Each chunk is at most about 1000 characters")
print("🔄 Up to 200 characters of overlap between neighbouring chunks for continuity")
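
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified sketch that cuts text into fixed-size windows. It is an assumption-laden stand-in, not the real algorithm: `RecursiveCharacterTextSplitter` prefers to break at paragraph, line, and word boundaries, so its chunks are often shorter than `chunk_size` and will not look exactly like this. The function name `split_with_overlap` is made up for the example.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return pieces


sample_text = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = split_with_overlap(sample_text, chunk_size=10, chunk_overlap=4)

for number, piece in enumerate(toy_chunks, start=1):
    print(f"Chunk {number}: {piece}")
```

Each chunk starts with the last 4 characters of the previous one. That shared text is what keeps a sentence that straddles a boundary from being lost.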

## 🗃️ Step 4: Create a Vector Database

A **vector database** stores each text chunk together with its embedding. It's like a library that can quickly find the passages closest in meaning to a question.

### 🏗️ What We're Building:
1. Take each text chunk
2. Convert it to embeddings (numbers)
3. Store in a searchable database
4. When asked a question, find the most similar chunks
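
The four steps above can be sketched as a tiny in-memory store. Everything here is hypothetical: `toy_embed` just counts a few hand-picked words instead of calling a real embedding model, `ToyVectorStore` compares the question against every entry one by one (real vector databases use indexes to avoid that), and the sample sentences are invented.

```python
import math


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: counts a few hand-picked words."""
    vocabulary = ["catalyst", "reaction", "model", "training", "data"]
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, or 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text with none of the vocabulary words matches nothing
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (text, vector) pairs in a list and searches them by brute force."""

    def __init__(self, embed):
        self.embed = embed
        self.entries = []

    def add_texts(self, texts):
        # Steps 1-3: take each chunk, embed it, store it.
        for text in texts:
            self.entries.append((text, self.embed(text)))

    def search(self, query, k=3):
        # Step 4: embed the question and return the k most similar chunks.
        query_vector = self.embed(query)
        scored = [
            (cosine_similarity(query_vector, vector), text)
            for text, vector in self.entries
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


toy_store = ToyVectorStore(toy_embed)
toy_store.add_texts([
    "the catalyst speeds up the reaction",
    "the model needs more training data",
    "the reaction yield depends on the catalyst loading",
])

top_matches = toy_store.search("which catalyst drives the reaction", k=2)
print(top_matches)
```

The `k` argument plays the same role as the `k` passed to the retriever later in this step: it caps how many chunks come back.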

In [None]:
# Create a vector database
print("🗃️ Creating vector database...")
print("⏳ Converting all text chunks to embeddings... (this takes 30-60 seconds)")

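# This creates embeddings for all chunks and stores them
# Note: re-running this cell with the same persist_directory adds the chunks again;
# delete the folder first if you want to avoid duplicate search results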
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="../tutorial/chroma_db"  # Save to disk
)

print("✅ Vector database created!")
print(f"📊 Stored {len(chunks)} chunks as searchable embeddings")
print("🔍 Ready for intelligent document search!")

# Create a retriever (the search engine)
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3}  # Return top 3 most similar chunks
)

print("\n🎯 Search engine configured:")
print("   • Type: Similarity search")
print("   • Returns: Top 3 most relevant chunks")

## 🔍 Step 5: Test the Retrieval System

Let's see how well our system can find relevant information!

In [None]:
# Test our retrieval system
print("🔍 Testing the retrieval system...")

# Ask a question
test_question = "What are the main findings of this research?"

print(f"❓ Question: {test_question}")
print("🔍 Searching for relevant chunks...")

# Find relevant chunks
relevant_chunks = retriever.invoke(test_question)

print(f"\n✅ Found {len(relevant_chunks)} relevant chunks!")

# Show what we found
for i, chunk in enumerate(relevant_chunks):
    print(f"\n📄 Chunk {i+1}:")
    print(f"Page: {chunk.metadata.get('page', 'Unknown')}")
    print(f"Content preview: {chunk.page_content[:200]}...")
    print(f"Length: {len(chunk.page_content)} characters")

print("\n💡 These are the chunks whose embeddings are closest to your question!")

## 🤖 Step 6: Build the Complete RAG System

Now let's combine retrieval with AI generation to create a complete RAG system!

In [None]:
# Create the AI assistant
ai_assistant = ChatOllama(
    model="llama3.1:8b",
    temperature=0.1  # Low temperature for factual answers
)

# Create a RAG prompt template
rag_prompt = ChatPromptTemplate.from_template("""
You are a helpful assistant that answers questions based on provided context.

Context from the document:
{context}

Question: {question}

Please provide a clear, accurate answer based on the context above. 
If the context doesn't contain enough information, say so.
""")

print("🤖 RAG system components created!")
print("🔧 Components: Retriever + AI + Smart Prompt")

In [None]:
# Helper function to format retrieved documents
def format_docs(docs):
    """Combine multiple document chunks into one context"""
    return "\n\n".join(doc.page_content for doc in docs)

# Build the RAG chain
print("🔗 Building the complete RAG chain...")

# This chain: Question → Retrieve docs → Format → AI → Answer
rag_chain = (
    {
        "context": retriever | format_docs,  # Get and format relevant docs
        "question": RunnablePassthrough()    # Pass question through
    }
    | rag_prompt          # Create prompt with context and question
    | ai_assistant        # Generate answer
    | StrOutputParser()   # Clean output
)

print("✅ RAG chain created!")
print("🎯 Flow: Question → Retrieve → Context → AI → Answer")
print("🚀 Ready for intelligent question answering!")
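
If the `|` syntax feels mysterious, it may help to see roughly the same flow written as ordinary function calls. This is a sketch under assumptions, not LangChain's actual implementation: `fake_retriever` and `fake_model` are hypothetical stand-ins that return canned text, so the block runs without Ollama.

```python
def fake_retriever(question):
    """Hypothetical stand-in for `retriever`: returns canned chunk texts."""
    return ["Chunk A about the topic.", "Chunk B with more detail."]


def join_chunks(texts):
    """Same role as format_docs, but for plain strings."""
    return "\n\n".join(texts)


def fill_prompt(inputs):
    """Same role as rag_prompt: drop the values into a template."""
    return (
        f"Context from the document:\n{inputs['context']}\n\n"
        f"Question: {inputs['question']}"
    )


def fake_model(prompt):
    """Hypothetical stand-in for ai_assistant: reports what it received."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def plain_rag_chain(question):
    # The dict step: the same question feeds both branches.
    inputs = {
        "context": join_chunks(fake_retriever(question)),  # retriever | format_docs
        "question": question,                              # RunnablePassthrough()
    }
    prompt = fill_prompt(inputs)  # | rag_prompt
    reply = fake_model(prompt)    # | ai_assistant
    return reply                  # | StrOutputParser() (the real one extracts the message text)


demo_prompt = fill_prompt({"context": "ctx", "question": "q?"})
demo_reply = plain_rag_chain("What is this about?")
print(demo_reply)
```

The key point is the dictionary: the single question you pass in is used twice, once to search for context and once unchanged as the question in the prompt.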

## 🎯 Step 7: Test Your RAG System

Time for the exciting part - let's ask our RAG system questions about the research paper!

In [None]:
# Test the RAG system
print("🎯 Testing the complete RAG system!")
print("⏳ This might take 15-30 seconds...")

question = "What is this research paper about? What are the main topics?"

print(f"\n❓ Question: {question}")
print("\n🔍 RAG Process:")
print("   1. Finding relevant document chunks...")
print("   2. Providing context to AI...")
print("   3. Generating informed answer...")

# Get the answer
answer = rag_chain.invoke(question)

print("\n🤖 RAG Answer:")
print(answer)

print("\n✨ This answer is based on the actual content of the document!")
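
To check which parts of the document an answer drew on, one option is to label each chunk with its page before it goes into the prompt. Below is a sketch of such a variant of `format_docs`. The `FakeDoc` class and the sample chunks are hypothetical stand-ins with the same `page_content` and `metadata` attributes used in Step 5, so the block runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in with the two attributes used in Step 5."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs_with_pages(docs):
    """Join chunks into one context string, labelling each with its page."""
    parts = []
    for doc in docs:
        page = doc.metadata.get("page", "Unknown")
        parts.append(f"[page {page}]\n{doc.page_content}")
    return "\n\n".join(parts)


sample_docs = [
    FakeDoc("First retrieved chunk.", {"page": 0}),
    FakeDoc("Second retrieved chunk.", {"page": 4}),
    FakeDoc("Chunk with no page recorded."),
]

labelled_context = format_docs_with_pages(sample_docs)
print(labelled_context)
```

With real chunks you could try `format_docs_with_pages(retriever.invoke(question))`. The page numbers in the metadata typically start counting from 0, so page 0 would be the first page of the PDF.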

In [None]:
# Try another question
question2 = "Who are the authors of this paper and what did they study?"

print(f"❓ Question: {question2}")
print("\n🤖 RAG Answer:")

answer2 = rag_chain.invoke(question2)
print(answer2)

print("\n💡 If the chunk listing the authors was retrieved, the answer names them; if not, the AI should say the context is missing!")

In [None]:
# Try a more specific question
question3 = "What methods or techniques were used in this research?"

print(f"❓ Question: {question3}")
print("\n🤖 RAG Answer:")

answer3 = rag_chain.invoke(question3)
print(answer3)

print("\n🎯 RAG can surface specific technical details when the right chunks are retrieved!")

## 🔬 Step 8: Compare RAG vs Non-RAG

Let's see the difference between asking AI with and without document context!

In [None]:
# Compare RAG vs regular AI
print("🔬 EXPERIMENT: RAG vs Regular AI")
print("=" * 45)

comparison_question = "What are the main conclusions of the paper by Mayk Caldas Ramos?"

print(f"❓ Question: {comparison_question}")

# Get answer WITHOUT RAG (just AI memory)
print("\n🤖 Regular AI (without document):")
regular_answer = ai_assistant.invoke(comparison_question)
print(regular_answer.content)

# Get answer WITH RAG (using document)
print("\n🔍 RAG AI (with document context):")
rag_answer = rag_chain.invoke(comparison_question)
print(rag_answer)

print("\n💡 NOTICE THE DIFFERENCE:")
print("   • Regular AI: Might guess, make something up, or say it doesn't know")
print("   • RAG AI: Uses actual content retrieved from the document")
print("   • RAG answers are usually more specific because they are grounded in the document!")

## 🎮 Step 9: Interactive RAG Session

Now it's your turn to ask questions!

In [None]:
# 🎯 YOUR TURN: Ask the RAG system anything!

your_question = "What are the key findings or results mentioned in this paper?"  # 👈 Change this!

print(f"❓ Your Question: {your_question}")
print("\n🔍 RAG is searching and thinking...")

your_answer = rag_chain.invoke(your_question)

print("\n🤖 RAG Answer:")
print(your_answer)

print("\n🎉 Great! You're now using RAG to get intelligent answers from documents!")

## 🔧 Step 10: Understanding RAG Parameters

Let's learn how to tune RAG for better results.

In [None]:
# Experiment with different retrieval settings
print("🔧 TUNING RAG: Different Settings")
print("=" * 40)

question = "What methods were used?"

# Try retrieving more chunks
more_chunks_retriever = vectorstore.as_retriever(
    search_kwargs={"k": 5}  # Get 5 chunks instead of 3
)

# Build new chain with more context
more_context_chain = (
    {
        "context": more_chunks_retriever | format_docs,
        "question": RunnablePassthrough()
    }
    | rag_prompt
    | ai_assistant
    | StrOutputParser()
)

print(f"❓ Question: {question}")

print("\n🔍 RAG with 3 chunks:")
answer_3 = rag_chain.invoke(question)
print(answer_3)

print("\n📚 RAG with 5 chunks (more context):")
answer_5 = more_context_chain.invoke(question)
print(answer_5)

print("\n💡 More chunks = more context, but also more processing time!")
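
Why does a larger `k` cost more? The context handed to the model grows with every extra chunk. Here is a back-of-the-envelope sketch. The chunk lengths are invented, and the only detail taken from this tutorial is that `format_docs` joins chunks with a blank line.

```python
# Invented chunk lengths (in characters), ordered from most to least similar.
ranked_chunk_lengths = [980, 1000, 870, 990, 940, 1000]

SEPARATOR_LENGTH = len("\n\n")  # format_docs joins chunks with a blank line


def context_length(chunk_lengths, k):
    """Characters in the context when the top k chunks are joined."""
    selected = chunk_lengths[:k]
    return sum(selected) + SEPARATOR_LENGTH * max(len(selected) - 1, 0)


for k in (1, 3, 5):
    print(f"k={k}: about {context_length(ranked_chunk_lengths, k)} characters of context")
```

A longer context takes the model longer to read and has to fit within its input limit. The extra chunks are also the less similar ones, so they may add noise rather than useful detail.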

## 🎓 What You've Learned

**Fantastic work!** You've built a complete RAG system and understand how it works.

### ✅ **Key Concepts:**
- **RAG**: Combines document retrieval with AI generation
- **Embeddings**: How AI converts text to searchable numbers
- **Vector Database**: Stores and searches document embeddings
- **Retrieval**: Finding relevant document chunks
- **Context**: Providing AI with relevant information

### ✅ **Skills You've Gained:**
- Creating embeddings from text
- Building vector databases for document search
- Implementing similarity-based retrieval
- Combining retrieval with AI generation
- Tuning RAG parameters for better results

### 🚀 **What's Next:**
In **Tutorial 4**, you'll learn about **Knowledge Graphs**:
- What knowledge graphs are and how they work
- Extracting entities (people, places, concepts) from text
- Finding relationships between entities
- Visualizing knowledge networks
- Combining knowledge graphs with RAG

### 🎯 **Practice Ideas:**
- Try RAG with your own documents
- Experiment with different chunk sizes
- Test various numbers of retrieved chunks
- Compare answers with and without RAG

## 🏆 Final Challenge

Build your own specialized RAG system!

In [None]:
# 🏆 CHALLENGE: Build a Summary RAG System
print("🏆 FINAL CHALLENGE: Specialized RAG for Summaries")
print("=" * 50)

# A special prompt for generating summaries (try editing it to change the style!)
summary_rag_prompt = ChatPromptTemplate.from_template("""
You are an expert at creating clear, concise summaries of research papers.

Based on this context from a research paper:
{context}

Question: {question}

Provide a clear summary that:
1. Highlights the main points
2. Uses simple language
3. Is 2-3 sentences maximum
""")

# Build the summary RAG chain
summary_rag_chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough()
    }
    | summary_rag_prompt
    | ai_assistant
    | StrOutputParser()
)

# Test it
question = "Give me a brief summary of what this paper is about"
summary = summary_rag_chain.invoke(question)

print(f"❓ Question: {question}")
print("\n📝 Summary RAG Answer:")
print(summary)

print("\n🎉 Challenge Complete! You've built a specialized RAG system!")
print("🚀 Ready for Tutorial 4: Building Knowledge Graphs")