# Advanced RAG - Retrieval Augmented Generation 🔍

Learn to build sophisticated RAG systems with vector databases, semantic search, and context management.

## What You'll Build
- Document ingestion pipeline
- Vector database integration
- Semantic search system
- Context-aware Q&A

## 1. Setup and Dependencies

In [None]:
import os
import sys
from pathlib import Path

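# Add the shared utils directory (two levels above the working directory) to the import path.
# Path() is the relative path '.', whose .parent is also '.', so resolve from the absolute cwd instead.
sys.path.append(str(Path.cwd().parent.parent / 'utils'))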

try:
    from config import get_api_key
    api_key = get_api_key('openai')
    print("✅ Configuration loaded")
except ImportError:
    api_key = os.getenv('OPENAI_API_KEY')
    print("⚠️ Using basic config")

print("✅ API key found" if api_key else "⚠️ Set OPENAI_API_KEY")

## 2. Document Processing

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document

# Sample documents
documents = [
    Document(
        page_content="LangChain is a framework for developing applications powered by language models. It enables applications that are context-aware and can reason about their environment.",
        metadata={"source": "langchain_intro", "type": "documentation"}
    ),
    Document(
        page_content="Vector databases store high-dimensional vectors and enable semantic search. They are crucial for RAG applications as they allow finding similar content based on meaning rather than exact keywords.",
        metadata={"source": "vector_db_guide", "type": "technical"}
    ),
    Document(
        page_content="Machine learning models can be fine-tuned for specific tasks. This process involves training the model on domain-specific data to improve performance on particular use cases.",
        metadata={"source": "ml_basics", "type": "educational"}
    )
]

# Split documents
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=50
)

splits = text_splitter.split_documents(documents)
print(f"📄 Created {len(splits)} document chunks")

for i, split in enumerate(splits):
    print(f"\nChunk {i+1}: {split.page_content[:80]}...")
    print(f"Metadata: {split.metadata}")
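To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping chunking using a plain sliding window over characters. The helper `chunk_text` is hypothetical and much simpler than `RecursiveCharacterTextSplitter`, which also tries to break on natural separators such as paragraphs and spaces; the sketch only illustrates how consecutive chunks share text.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, chunk_overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap (illustrative helper)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghij" * 5  # 50 characters
demo_chunks = chunk_text(sample, chunk_size=20, chunk_overlap=5)

for i, chunk in enumerate(demo_chunks, 1):
    print(f"Chunk {i}: {chunk}")
```

The last 5 characters of each chunk reappear at the start of the next one, which is what lets a sentence that straddles a boundary stay retrievable from either side.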

## 3. Vector Store Creation

In [None]:
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

if api_key:
    # Create embeddings
    embeddings = OpenAIEmbeddings(openai_api_key=api_key)
    
    # Create vector store
    vectorstore = FAISS.from_documents(splits, embeddings)
    print("✅ Vector store created with FAISS")
    
    # Test similarity search
    query = "What is LangChain?"
    docs = vectorstore.similarity_search(query, k=2)
    
    print(f"\n🔍 Query: {query}")
    print(f"Found {len(docs)} relevant documents:")
    
    for i, doc in enumerate(docs):
        print(f"\n{i+1}. {doc.page_content}")
        print(f"   Source: {doc.metadata.get('source', 'unknown')}")
else:
    print("🔧 Demo: Vector store would index documents for semantic search")
    print("Example: Finding documents about 'LangChain' would return relevant chunks")
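The ranking step behind `similarity_search` can be sketched without any API key. The example below is a toy, assuming word counts as a stand-in "embedding" and cosine similarity as the score; `embed`, `cosine_similarity`, and `top_k` are hypothetical helpers, not LangChain or FAISS functions. Real embedding models capture meaning rather than keyword overlap, and a real vector store may use a different distance metric.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, texts, k=2):
    """Return the k (score, text) pairs with the highest similarity to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(t)), t) for t in texts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


corpus = [
    "langchain is a framework for language model applications",
    "vector databases enable semantic search",
    "models can be fine-tuned for specific tasks",
]

results = top_k("what is langchain", corpus, k=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```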

## 4. RAG Chain Implementation

In [None]:
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate

if api_key:
    # Create LLM
    llm = ChatOpenAI(
        openai_api_key=api_key,
        model_name="gpt-3.5-turbo",
        temperature=0.3
    )
    
    # Custom prompt for RAG
    rag_prompt = PromptTemplate(
        input_variables=["context", "question"],
        template="""Use the following context to answer the question. If you don't know the answer based on the context, say so.

Context: {context}

Question: {question}

Answer:"""
    )
    
    # Create RAG chain
    qa_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever(search_kwargs={"k": 2}),
        chain_type_kwargs={"prompt": rag_prompt},
        return_source_documents=True
    )
    
    print("✅ RAG chain created")
    
    # Test questions
    questions = [
        "What is LangChain used for?",
        "How do vector databases help with search?",
        "What is fine-tuning in machine learning?"
    ]
    
    for question in questions:
        result = qa_chain({"query": question})
        
        print(f"\n❓ Q: {question}")
        print(f"🤖 A: {result['result']}")
        print(f"📚 Sources: {len(result['source_documents'])} documents used")
else:
    print("🔧 Demo: RAG chain combines retrieval with generation")
    print("Process: Question → Retrieve relevant docs → Generate answer with context")
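The `"stuff"` chain type places all retrieved chunks into a single prompt. A minimal sketch of that assembly step follows, reusing the prompt text from the cell above. The helper `build_stuffed_prompt` and the blank-line separator between chunks are assumptions for illustration, not the library's internals.

```python
PROMPT_TEMPLATE = """Use the following context to answer the question. If you don't know the answer based on the context, say so.

Context: {context}

Question: {question}

Answer:"""


def build_stuffed_prompt(question, chunks):
    """Join retrieved chunks into one context string and fill the prompt (illustrative helper)."""
    context = "\n\n".join(chunks)  # assumed separator between chunks
    return PROMPT_TEMPLATE.format(context=context, question=question)


retrieved_chunks = [
    "LangChain is a framework for developing applications powered by language models.",
    "Vector databases store high-dimensional vectors and enable semantic search.",
]

stuffed = build_stuffed_prompt("What is LangChain used for?", retrieved_chunks)
print(stuffed)
```

Because every chunk lands in one prompt, this approach only works while `k` times the chunk size stays within the model's context window.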

## 5. Advanced RAG Features

In [None]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

if api_key:
    # Create conversational RAG with memory
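    memory = ConversationBufferMemory(
        memory_key="chat_history",
        return_messages=True,
        # The chain returns both 'answer' and 'source_documents';
        # tell the memory which output key to store.
        output_key="answer"
    )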
    
    conversational_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=vectorstore.as_retriever(),
        memory=memory,
        return_source_documents=True
    )
    
    print("✅ Conversational RAG chain created")
    
    # Multi-turn conversation
    conversation = [
        "What is LangChain?",
        "How can it be used with vector databases?",
        "Can you give me a practical example?"
    ]
    
    for i, question in enumerate(conversation, 1):
        result = conversational_chain({"question": question})
        
        print(f"\n{i}. ❓ {question}")
        if i == 1:
            # Show the full response for the first question only
            print(f"   🤖 {result['answer']}")
        else:
            # Truncate follow-up answers to keep the output short
            print(f"   🤖 {result['answer'][:150]}...")
else:
    print("🔧 Demo: Conversational RAG maintains context across questions")
    print("Example: Follow-up questions can reference previous answers")
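A conversational retrieval chain typically rewrites each follow-up into a standalone question before retrieving, so that a pronoun such as "it" resolves to the right topic. The sketch below only builds the text of such a rewrite prompt. The helpers `format_history` and `build_condense_prompt` and the prompt wording are hypothetical, not the library's own prompt.

```python
def format_history(turns):
    """Render (human, assistant) pairs as a plain-text transcript."""
    lines = []
    for human, assistant in turns:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {assistant}")
    return "\n".join(lines)


def build_condense_prompt(turns, follow_up):
    """Build a prompt asking an LLM to rewrite the follow-up as a standalone question."""
    return (
        "Given the conversation below, rewrite the follow-up question "
        "as a standalone question.\n\n"
        f"Chat history:\n{format_history(turns)}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


history = [
    ("What is LangChain?", "LangChain is a framework for building LLM applications."),
]

condense_prompt = build_condense_prompt(history, "How can it be used with vector databases?")
print(condense_prompt)
```

The rewritten question, not the raw follow-up, is what gets embedded and sent to the retriever in this pattern.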

## 6. RAG Evaluation and Metrics

In [None]:
def evaluate_rag_response(question, answer, source_docs, max_sources=2):
    """Compute simple heuristic RAG metrics (word overlap, source count, answer length)."""
    
    # Relevance score (simplified): share of question words that reappear in the answer
    question_words = set(question.lower().split())
    answer_words = set(answer.lower().split())
    # Guard against an empty question to avoid division by zero
    relevance = len(question_words & answer_words) / len(question_words) if question_words else 0.0
    
    # Source utilization: retrieved documents relative to the retriever's k, capped at 1.0
    source_utilization = min(len(source_docs) / max_sources, 1.0)
    
    # Answer completeness (length-based heuristic)
    completeness = min(len(answer.split()) / 50, 1.0)  # Normalize to 50 words
    
    return {
        "relevance_score": round(relevance, 2),
        "source_utilization": round(source_utilization, 2),
        "completeness_score": round(completeness, 2),
        "overall_score": round((relevance + source_utilization + completeness) / 3, 2)
    }

# Example evaluation
test_question = "What is LangChain used for?"
test_answer = "LangChain is a framework for developing applications powered by language models that are context-aware."
test_sources = ["doc1", "doc2"]

metrics = evaluate_rag_response(test_question, test_answer, test_sources)

print("📊 RAG Evaluation Metrics:")
for metric, score in metrics.items():
    print(f"   {metric}: {score}")

print("\n💡 Key RAG Metrics:")
print("   • Relevance: How well the answer addresses the question")
print("   • Source Utilization: How effectively retrieved documents are used")
print("   • Completeness: Whether the answer is comprehensive")
print("   • Faithfulness: Whether the answer is supported by the source content (not computed above)")
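Faithfulness is listed as a key metric but is not computed by `evaluate_rag_response`. One rough way to approximate it, assuming word overlap is an acceptable proxy, is the share of answer words that also appear in the retrieved sources. The helper `faithfulness_score` is a hypothetical sketch; production evaluations usually rely on an LLM judge or an entailment model rather than token matching.

```python
import string


def _tokens(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def faithfulness_score(answer, source_texts):
    """Fraction of answer tokens that occur somewhere in the source texts."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0

    source_vocab = set()
    for text in source_texts:
        source_vocab.update(_tokens(text))

    supported = sum(1 for token in answer_tokens if token in source_vocab)
    return round(supported / len(answer_tokens), 2)


sources = [
    "LangChain is a framework for developing applications powered by language models."
]

grounded = faithfulness_score("LangChain is a framework for language models.", sources)
ungrounded = faithfulness_score("Bananas grow on tropical plantations.", sources)

print(f"Grounded answer:   {grounded}")
print(f"Ungrounded answer: {ungrounded}")
```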

## 🎯 Key Takeaways

You've learned advanced RAG concepts:
- **Document Processing**: Chunking and metadata management
- **Vector Stores**: Semantic search with embeddings
- **RAG Chains**: Combining retrieval with generation
- **Conversational RAG**: Multi-turn context awareness
- **Evaluation**: Measuring RAG system performance

### Next Steps:
1. Try different vector databases (ChromaDB, Pinecone)
2. Experiment with chunking strategies
3. Build domain-specific RAG systems
4. Implement advanced retrieval techniques

### Advanced Topics:
- Multi-modal RAG (text + images)
- Graph-based retrieval
- Hybrid search (dense + sparse)
- RAG optimization techniques
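As a starting point for the hybrid search topic above, one common way to merge a dense (embedding) ranking with a sparse (keyword) ranking is reciprocal rank fusion. The sketch below assumes each retriever has already returned an ordered list of document IDs. The function name, the IDs, and the constant `k=60` are illustrative choices, not part of this notebook's pipeline.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank); higher ranks contribute more
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


dense_ranking = ["doc_a", "doc_b", "doc_c"]   # hypothetical embedding-based results
sparse_ranking = ["doc_b", "doc_d", "doc_a"]  # hypothetical keyword-based results

fused = reciprocal_rank_fusion([dense_ranking, sparse_ranking])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are still kept further down.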

In [None]:
# 🧪 Experiment with your own RAG system
print("🧪 Build your own RAG system here!")
print("Try: Different documents, chunking strategies, or retrieval methods")