# Advanced RAG with LangChain and Gemini 3 Pro

> **Created by [Build Fast with AI](https://www.buildfastwithai.com)**

This notebook demonstrates advanced retrieval-augmented generation (RAG) techniques using LangChain, including document loaders, text splitting, vector stores, and retrieval chains.

## What you'll learn:
- Using LangChain for RAG pipelines
- Advanced text splitting strategies
- Semantic search with MMR re-ranking and relevance scores
- Conversation memory with RAG
- Building production-ready RAG systems

In [None]:
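# The legacy chain and memory classes used below ship with the pre-1.0 `langchain` package,
# so it is pinned here (assumption: adjust the pin to the versions you have tested).
!pip install -q "langchain<1.0" langchain-community langchain-text-splitters langchain-google-genai chromadb pypdf

In [None]:
import os
from langchain_google_genai import GoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Vector stores and document loaders live in langchain-community
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import TextLoader
from langchain.chains import RetrievalQA, ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_core.prompts import PromptTemplate
from IPython.display import Markdown, display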

In [None]:
# Configure API key
try:
    from google.colab import userdata
    GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')
except Exception:  # not running in Colab, or the secret is unavailable
    GOOGLE_API_KEY = os.environ.get('GOOGLE_API_KEY', 'your-api-key-here')

os.environ['GOOGLE_API_KEY'] = GOOGLE_API_KEY

## 1. Initialize LangChain Components

In [None]:
# Initialize Gemini model
llm = GoogleGenerativeAI(
    model="gemini-3-pro",
    temperature=0.7,
    google_api_key=GOOGLE_API_KEY
)

# Initialize embeddings
embeddings = GoogleGenerativeAIEmbeddings(
    model="models/embedding-001",
    google_api_key=GOOGLE_API_KEY
)

print("LangChain components initialized!")

## 2. Creating Sample Documents

In [None]:
# Create sample documents about Python programming
sample_text = """
Python Programming Guide

Chapter 1: Introduction to Python
Python is a high-level, interpreted programming language known for its simplicity and readability.
Created by Guido van Rossum and first released in 1991, Python emphasizes code readability with
its notable use of significant whitespace. It supports multiple programming paradigms including
procedural, object-oriented, and functional programming.

Chapter 2: Data Structures
Python provides several built-in data structures. Lists are ordered, mutable collections that can
contain items of different types. Tuples are similar to lists but are immutable. Dictionaries are
key-value pairs that provide fast lookup times. Sets are unordered collections of unique elements.

Chapter 3: Functions and Decorators
Functions in Python are defined using the 'def' keyword. They can accept arguments and return values.
Decorators are a powerful feature that allows you to modify or enhance functions without changing
their source code. They are commonly used for logging, authentication, and memoization.

Chapter 4: Object-Oriented Programming
Python supports object-oriented programming with classes and inheritance. Classes are defined using
the 'class' keyword. They can have attributes and methods. Inheritance allows you to create new
classes based on existing ones, promoting code reuse.

Chapter 5: File Handling
Python makes it easy to work with files. The 'open()' function is used to open files in different
modes (read, write, append). Context managers with the 'with' statement ensure proper resource
management. Python can handle text files, binary files, and CSV files efficiently.

Chapter 6: Error Handling
Python uses try-except blocks for error handling. This allows you to catch and handle exceptions
gracefully. Common exceptions include ValueError, TypeError, and FileNotFoundError. You can also
create custom exceptions by inheriting from the Exception class.

Chapter 7: Modules and Packages
Python modules are files containing Python code. They can be imported using the 'import' statement.
Packages are directories containing multiple modules. The Python Package Index (PyPI) hosts thousands
of third-party packages that can be installed using pip.

Chapter 8: Web Development
Python is widely used for web development. Popular frameworks include Django, Flask, and FastAPI.
Django is a full-featured framework that follows the MTV pattern. Flask is a lightweight framework
that gives you more control. FastAPI is modern and fast, with automatic API documentation.

Chapter 9: Data Science and Machine Learning
Python is the leading language for data science and machine learning. NumPy provides numerical
computing capabilities. Pandas offers data manipulation tools. Scikit-learn provides machine learning
algorithms. TensorFlow and PyTorch are popular deep learning frameworks.

Chapter 10: Best Practices
Following PEP 8 style guide ensures code consistency. Write clear docstrings for functions and classes.
Use virtual environments to manage dependencies. Write unit tests to ensure code quality. Use version
control systems like Git. Comment your code when necessary but prefer self-documenting code.
"""

# Save to a temporary file
with open('/tmp/python_guide.txt', 'w') as f:
    f.write(sample_text)

print("Sample document created!")

## 3. Advanced Text Splitting

In [None]:
# Load document
loader = TextLoader('/tmp/python_guide.txt')
documents = loader.load()

# Create text splitter with overlap
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100,
    length_function=len,
    separators=["\n\n", "\n", " ", ""]
)

# Split documents
splits = text_splitter.split_documents(documents)

print(f"Split document into {len(splits)} chunks")
print(f"\nFirst chunk preview:\n{splits[0].page_content[:200]}...")
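
To see what `chunk_size` and `chunk_overlap` do, here is a minimal pure-Python sketch of overlapping windows. It is a simplification, not LangChain's implementation: `sliding_chunks` is a hypothetical helper that cuts at fixed character offsets, whereas `RecursiveCharacterTextSplitter` also tries each separator in order so that chunks end on paragraph or line boundaries where possible.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


demo_text = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = sliding_chunks(demo_text, chunk_size=10, chunk_overlap=4)
for chunk in demo_chunks:
    print(chunk)
```

Each chunk repeats the last four characters of the previous one, which is what lets a sentence that straddles a boundary still appear whole in at least one chunk.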

## 4. Creating Vector Store

In [None]:
# Create Chroma vector store
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=embeddings,
    collection_name="python_guide"
)

print(f"Vector store created with {len(splits)} documents")
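
Conceptually, a vector store keeps one vector per chunk and ranks chunks by their similarity to the query vector. The sketch below illustrates that idea with the standard library only. `toy_embed` and `ToyVectorStore` are hypothetical stand-ins: word counts replace the Gemini embeddings, and a sorted list replaces Chroma's index.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stores (text, vector) pairs and ranks them against a query."""

    def __init__(self, texts):
        self.entries = [(text, toy_embed(text)) for text in texts]

    def similarity_search(self, query, k=3):
        query_vec = toy_embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine_similarity(query_vec, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


toy_store = ToyVectorStore([
    "lists are ordered mutable collections",
    "decorators modify functions without changing their source code",
    "django and flask are web frameworks",
])
top_hit = toy_store.similarity_search("what do decorators do to functions", k=1)
print(top_hit)
```

A real embedding model also matches paraphrases that share no words with the query, which word counts cannot do.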

## 5. Basic Retrieval QA Chain

In [None]:
# Create retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True
)

# Test the chain
question = "What are Python decorators?"
result = qa_chain.invoke({"query": question})

print(f"Question: {question}\n")
display(Markdown(result['result']))

print("\n\nSource documents:")
for i, doc in enumerate(result['source_documents'], 1):
    print(f"\n{i}. {doc.page_content[:100]}...")
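
The `"stuff"` chain type places all retrieved chunks into a single prompt. A rough sketch of that step follows; `stuff_prompt` and its wording are hypothetical and are not LangChain's default template.

```python
def stuff_prompt(question: str, chunks: list[str]) -> str:
    """Place every retrieved chunk into one prompt, separated by blank lines."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Two sentences taken from the sample document, standing in for retrieved chunks
retrieved_chunks = [
    "Decorators are a powerful feature that allows you to modify or enhance functions.",
    "Functions in Python are defined using the 'def' keyword.",
]
stuffed = stuff_prompt("What are Python decorators?", retrieved_chunks)
print(stuffed)
```

Because every chunk lands in the same prompt, `k` times `chunk_size` has to fit comfortably inside the model's context window.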

## 6. Custom Prompt Template

In [None]:
# Define custom prompt template
custom_prompt = PromptTemplate(
    template="""You are a helpful Python programming tutor. Use the following context to answer the question.
If you don't know the answer, say so. Provide code examples when relevant.

Context: {context}

Question: {question}

Detailed Answer:""",
    input_variables=["context", "question"]
)

# Create QA chain with custom prompt
qa_chain_custom = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,
    chain_type_kwargs={"prompt": custom_prompt}
)

# Test with custom prompt
question = "How do I handle errors in Python?"
result = qa_chain_custom.invoke({"query": question})

print(f"Question: {question}\n")
display(Markdown(result['result']))

## 7. Conversational RAG with Memory

In [None]:
# Create memory
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True,
    output_key="answer"
)

# Create conversational retrieval chain
conv_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    memory=memory,
    return_source_documents=True
)

print("Conversational RAG chain created!")

In [None]:
# Multi-turn conversation
questions = [
    "What data structures does Python have?",
    "Which one is mutable?",
    "Can you give me an example of using it?"
]

for question in questions:
    print(f"\n{'='*80}")
    print(f"User: {question}")
    print(f"{'='*80}\n")
    
    result = conv_chain.invoke({"question": question})
    display(Markdown(result['answer']))
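
A follow-up such as "Which one is mutable?" cannot be retrieved on its own, so `ConversationalRetrievalChain` first asks the LLM to rewrite it as a standalone question using the chat history. The sketch below shows roughly what such a rewrite prompt might look like. `build_condense_prompt`, its wording, and the sample answer in `demo_history` are all hypothetical.

```python
def build_condense_prompt(chat_history: list[tuple[str, str]], follow_up: str) -> str:
    """Format prior turns plus a follow-up into a rewrite request for the LLM."""
    history_text = "\n".join(
        f"Human: {question}\nAssistant: {answer}" for question, answer in chat_history
    )
    return (
        "Given the conversation below, rephrase the follow-up question "
        "as a standalone question.\n\n"
        f"Chat history:\n{history_text}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


# Hypothetical first turn; a real answer would come from the chain
demo_history = [
    ("What data structures does Python have?", "Lists, tuples, dictionaries, and sets."),
]
condense_prompt = build_condense_prompt(demo_history, "Which one is mutable?")
print(condense_prompt)
```

The rewritten question, not the raw follow-up, is what gets sent to the retriever.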

## 8. Advanced Retrieval Strategies

In [None]:
# MMR (Maximal Marginal Relevance) for diverse results
mmr_retriever = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 4, "fetch_k": 10, "lambda_mult": 0.5}
)

# Test MMR retrieval
query = "Python frameworks"
docs = mmr_retriever.invoke(query)

print(f"MMR retrieval for: '{query}'\n")
for i, doc in enumerate(docs, 1):
    print(f"{i}. {doc.page_content[:100]}...\n")
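
The retriever above first fetches `fetch_k` candidates by similarity and then picks `k` of them with MMR, where `lambda_mult` trades relevance against redundancy. Here is a minimal pure-Python sketch of that selection step. `mmr_select` and the small hand-written vectors are illustrative assumptions, not LangChain's code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, candidate_vecs, k, lambda_mult=0.5):
    """Return indices of up to k candidates, balancing relevance against redundancy."""
    selected = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        def mmr_score(i):
            relevance = cosine(query_vec, candidate_vecs[i])
            # Similarity to the closest already-selected candidate (0 if none yet)
            redundancy = max(
                (cosine(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query_vec = [1.0, 1.0, 0.0]
candidates = [
    [1.0, 0.10, 0.0],  # most relevant
    [1.0, 0.05, 0.0],  # near-duplicate of the first candidate
    [0.0, 1.00, 0.0],  # slightly less relevant, but covers a different direction
]
balanced = mmr_select(query_vec, candidates, k=2, lambda_mult=0.5)
relevance_only = mmr_select(query_vec, candidates, k=2, lambda_mult=1.0)
print(balanced, relevance_only)
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking and keeps the near-duplicate; with `0.5` the second pick goes to the candidate that adds new information.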

## 9. Similarity Search with Scores

In [None]:
# Similarity search with relevance scores
query = "How to work with files?"
docs_with_scores = vectorstore.similarity_search_with_relevance_scores(
    query,
    k=3
)

print(f"Query: {query}\n")
print("Results with relevance scores:\n")

for i, (doc, score) in enumerate(docs_with_scores, 1):
    print(f"{i}. Score: {score:.3f}")
    print(f"   {doc.page_content[:150]}...\n")
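
Relevance scores are meant to fall between 0 and 1, with higher meaning more similar, so one common use is to drop weak matches before they reach the prompt. A minimal sketch, assuming a hypothetical `filter_by_score` helper and made-up scores (LangChain retrievers also offer `search_type="similarity_score_threshold"` for the same purpose):

```python
def filter_by_score(scored_items, min_score):
    """Keep (item, score) pairs whose score is at least min_score, best first."""
    kept = [(item, score) for item, score in scored_items if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up scores for illustration only; real values come from the vector store
demo_results = [
    ("chunk about modules", 0.55),
    ("chunk about file handling", 0.82),
    ("chunk about web frameworks", 0.31),
]
confident = filter_by_score(demo_results, min_score=0.5)
for item, score in confident:
    print(f"{score:.2f}  {item}")
```

An empty result can then serve as a signal to answer "I don't know" instead of passing unrelated context to the model.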

## 10. Building a Production-Ready RAG System

In [None]:
class ProductionRAGSystem:
    def __init__(self, vectorstore, llm, embeddings):
        self.vectorstore = vectorstore
        self.llm = llm
        self.embeddings = embeddings
        self.conversation_history = []
        
    def ask(
        self,
        question,
        k=3,
        search_type="similarity",
        use_context=True,
        show_sources=True
    ):
        """Ask a question with advanced retrieval."""
        # Retrieve relevant documents
        if search_type == "mmr":
            retriever = self.vectorstore.as_retriever(
                search_type="mmr",
                search_kwargs={"k": k, "fetch_k": k*2}
            )
        else:
            retriever = self.vectorstore.as_retriever(
                search_kwargs={"k": k}
            )
        
        docs = retriever.invoke(question)
        
        # Build context
        context = "\n\n".join([doc.page_content for doc in docs])
        
        # Add conversation history if requested
        history_block = ""
        if use_context and self.conversation_history:
            history_context = "\n".join([
                f"Q: {h['question']}\nA: {h['answer']}"
                for h in self.conversation_history[-2:]
            ])
            # Built outside the prompt f-string: a backslash inside an f-string
            # expression is a SyntaxError before Python 3.12
            history_block = f"Previous conversation:\n{history_context}\n\n"
        
        # Build prompt
        prompt = f"""
You are a helpful assistant. Answer the question based on the context provided.

{history_block}Context:
{context}

Question: {question}

Answer:
"""
        
        # Generate answer
        answer = self.llm.invoke(prompt)
        
        # Store in history
        self.conversation_history.append({
            "question": question,
            "answer": answer,
            "sources": docs
        })
        
        # Display results
        print(f"\n{'='*80}")
        print(f"Question: {question}")
        print(f"{'='*80}\n")
        display(Markdown(answer))
        
        if show_sources:
            print("\n\nSources:")
            for i, doc in enumerate(docs, 1):
                print(f"\n{i}. {doc.page_content[:100]}...")
        
        return answer
    
    def reset_history(self):
        """Reset conversation history."""
        self.conversation_history = []
        print("Conversation history reset.")

# Create production RAG system
rag_system = ProductionRAGSystem(vectorstore, llm, embeddings)

# Test it
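# ask() already displays the answer; assigning the return value keeps the notebook from echoing it again
_ = rag_system.ask("What are the best practices for Python programming?")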

In [None]:
# Test with follow-up questions
_ = rag_system.ask("Can you elaborate on virtual environments?")
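
The follow-up works because `ask()` prepends the last two question/answer pairs to the prompt. That windowing can be sketched on its own as below; `format_recent_history` is a hypothetical helper that mirrors the `[-2:]` slice in the class above, and the turns are placeholders.

```python
def format_recent_history(history, max_turns=2):
    """Render the last max_turns question/answer pairs as prompt text."""
    # history[-0:] would return the whole list, so handle max_turns <= 0 explicitly
    recent = history[-max_turns:] if max_turns > 0 else []
    return "\n".join(f"Q: {turn['question']}\nA: {turn['answer']}" for turn in recent)


demo_turns = [
    {"question": "q1", "answer": "a1"},
    {"question": "q2", "answer": "a2"},
    {"question": "q3", "answer": "a3"},
]
history_text = format_recent_history(demo_turns)
print(history_text)
```

Capping the window keeps the prompt size bounded as the conversation grows, at the cost of forgetting older turns.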

## Next Steps

- Build agentic systems with LangGraph
- Create multi-agent workflows with CrewAI
- Deploy RAG applications with Streamlit

---

## Learn More

Master advanced RAG techniques with the **[Gen AI Crash Course](https://www.buildfastwithai.com/genai-course)** by Build Fast with AI!
