# LangChain RAG: Complete Retrieval-Augmented Generation Pipeline

## Introduction

**RAG (Retrieval-Augmented Generation)** combines retrieval of relevant documents with LLM generation to answer questions based on your own data.

### What is RAG?

RAG enables LLMs to:
- **Answer questions** about your private documents
- **Stay up-to-date** without retraining
- **Reduce hallucinations** by grounding in source material
- **Cite sources** for transparency
- **Scale to millions** of documents

### RAG Pipeline Components

1. **Document Loading**: Read files (PDF, text, web, etc.)
2. **Text Splitting**: Break into chunks
3. **Embeddings**: Convert text to vectors
4. **Vector Store**: Store and index embeddings
5. **Retrieval**: Find relevant chunks
6. **Generation**: LLM answers using retrieved context
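
The six stages above can be sketched end to end without any external service. This is a toy illustration only: word counts stand in for real embeddings, a Python list stands in for the vector store, and the final step builds the prompt instead of calling an LLM. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def load_documents():
    # 1. Document loading: hard-coded strings stand in for files.
    return [
        "Python was created by Guido van Rossum. It was first released in 1991.",
        "JavaScript runs in web browsers. It was created by Brendan Eich.",
    ]


def split_text(text):
    # 2. Text splitting: one chunk per sentence.
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]


def embed(text):
    # 3. Embeddings: word counts as a crude stand-in for a dense vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_store(chunks):
    # 4. Vector store: a list of (vector, chunk) pairs.
    return [(embed(chunk), chunk) for chunk in chunks]


def retrieve(store, question, k=1):
    # 5. Retrieval: rank stored chunks by similarity to the question.
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]


def build_prompt(question, context_chunks):
    # 6. Generation: a real pipeline would send this prompt to an LLM.
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


chunks = [c for doc in load_documents() for c in split_text(doc)]
store = build_store(chunks)
top_chunks = retrieve(store, "Who created Python?", k=1)
print(build_prompt("Who created Python?", top_chunks))
```

The examples below replace each stand-in with a LangChain component.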

### When to Use RAG?

| ✅ Use RAG For | ❌ Don't Use For |
|----------------|------------------|
| Private documents | Public knowledge (use base LLM) |
| Frequently updated data | Static, well-known facts |
| Domain-specific Q&A | General conversation |
| Citation needed | Creative writing |

---

## Installation & Setup

In [None]:
# Install required packages
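# !pip install langchain langchain-openai langchain-community langchain-text-splitters
# !pip install chromadb faiss-cpu pypdf
# WebBaseLoader needs beautifulsoup4; HuggingFaceEmbeddings needs sentence-transformers
# !pip install beautifulsoup4 sentence-transformers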

import os
from getpass import getpass

# Set API key
if not os.getenv("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter OpenAI API Key: ")

print("API key configured!")

---

## Example 1: Simple RAG from Text

Build a complete RAG pipeline from scratch:

In [None]:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Sample documents
documents = [
    "Python is a high-level programming language known for its readability and simplicity.",
    "Python was created by Guido van Rossum and first released in 1991.",
    "Python supports multiple programming paradigms including procedural, object-oriented, and functional programming.",
    "The Zen of Python is a collection of 19 guiding principles for writing computer programs in Python.",
    "Python's standard library is extensive and includes modules for everything from web development to data science."
]

# Step 1: Create text splitter (not needed here since docs are small)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=200,
    chunk_overlap=20
)
splits = text_splitter.create_documents(documents)

# Step 2: Create embeddings and vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

# Step 3: Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# Step 4: Create RAG prompt
template = """Answer the question based only on the following context:

{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

# Step 5: Create RAG chain
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4")
    | StrOutputParser()
)

# Step 6: Ask questions
question = "Who created Python?"
answer = rag_chain.invoke(question)
print(f"Question: {question}")
print(f"Answer: {answer}")

### What Just Happened?

1. **Split documents** into chunks
2. **Embed chunks** and store in vector database
3. **Retrieve** the 2 most relevant chunks for the question
4. **Generate** answer using retrieved context
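
The dict at the start of `rag_chain` is worth a closer look: the same input string goes to every branch, and the results are collected under the dict's keys before being passed to the prompt. The sketch below imitates that behavior in plain Python. `Step` and `to_step` are hypothetical helpers written for this illustration; they are not LangChain classes and only mimic the piping idea.

```python
class Step:
    """Wraps a function so steps chain with `|` (a toy imitation, not LangChain's classes)."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # step | other: feed this step's output into the next one.
        next_step = to_step(other)
        return Step(lambda value: next_step.invoke(self.invoke(value)))

    def __ror__(self, other):
        # other | step: used when the left operand is a plain dict or function.
        return to_step(other) | self


def to_step(obj):
    if isinstance(obj, Step):
        return obj
    if isinstance(obj, dict):
        # A dict fans the same input out to every branch and collects the results.
        branches = {key: to_step(branch) for key, branch in obj.items()}
        return Step(lambda value: {key: b.invoke(value) for key, b in branches.items()})
    if callable(obj):
        return Step(obj)
    raise TypeError(f"Cannot turn {type(obj).__name__} into a Step")


def format_docs(docs):
    return "\n\n".join(docs)


# Stand-ins for the retriever, RunnablePassthrough, and the prompt template.
fake_retriever = Step(lambda question: ["Python was created by Guido van Rossum."])
passthrough = Step(lambda value: value)
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")

chain = {"context": fake_retriever | format_docs, "question": passthrough} | fill_prompt
prompt_text = chain.invoke("Who created Python?")
print(prompt_text)
```

The question reaches both branches unchanged, which is why the real chain can be invoked with a plain string.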

---

## Example 2: Document Loaders

Load documents from various sources:

In [None]:
from langchain_community.document_loaders import TextLoader, PyPDFLoader, WebBaseLoader
from pathlib import Path

# Text file loader
# loader = TextLoader("path/to/file.txt")
# docs = loader.load()

# PDF loader
# loader = PyPDFLoader("path/to/file.pdf")
# docs = loader.load()

# Web page loader
loader = WebBaseLoader("https://python.langchain.com/docs/get_started/introduction")
docs = loader.load()

print(f"Loaded {len(docs)} documents")
print(f"First doc preview: {docs[0].page_content[:200]}...")

### Common Loaders

```python
# Text files
from langchain_community.document_loaders import TextLoader
loader = TextLoader("file.txt")

# PDFs
from langchain_community.document_loaders import PyPDFLoader
loader = PyPDFLoader("file.pdf")

# Web pages
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://example.com")

# Directory (all files)
from langchain_community.document_loaders import DirectoryLoader
loader = DirectoryLoader("./docs", glob="**/*.txt")

# CSV
from langchain_community.document_loaders import CSVLoader
loader = CSVLoader("data.csv")

# JSON
from langchain_community.document_loaders import JSONLoader
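# text_content=False is needed when the selected JSON value is not a plain string
loader = JSONLoader("data.json", jq_schema=".", text_content=False)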
```

---

## Example 3: Text Splitting Strategies

Different splitters for different use cases:

In [None]:
from langchain_text_splitters import (
    RecursiveCharacterTextSplitter,
    CharacterTextSplitter,
    TokenTextSplitter
)

sample_text = """
Python is a high-level programming language. It was created by Guido van Rossum.

Python supports multiple paradigms. It is used for web development, data science, and more.

The language has a large standard library. This makes it very versatile.
"""

# RecursiveCharacterTextSplitter (RECOMMENDED)
# Tries each separator in order: paragraphs, lines, sentences, words, then single characters
recursive_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    separators=["\n\n", "\n", ".", " ", ""]
)
recursive_chunks = recursive_splitter.create_documents([sample_text])
print("RecursiveCharacterTextSplitter:")
for i, chunk in enumerate(recursive_chunks):
    print(f"Chunk {i+1}: {chunk.page_content}")

print("\n" + "="*80 + "\n")

# CharacterTextSplitter
# Splits on one separator only, then merges the pieces into chunks of up to chunk_size
char_splitter = CharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    separator="\n"
)
char_chunks = char_splitter.create_documents([sample_text])
print("CharacterTextSplitter:")
for i, chunk in enumerate(char_chunks):
    print(f"Chunk {i+1}: {chunk.page_content}")

print("\n" + "="*80 + "\n")

# TokenTextSplitter
# Split by token count (important for staying within model limits)
token_splitter = TokenTextSplitter(
    chunk_size=50,
    chunk_overlap=10
)
token_chunks = token_splitter.create_documents([sample_text])
print("TokenTextSplitter:")
for i, chunk in enumerate(token_chunks):
    print(f"Chunk {i+1}: {chunk.page_content}")

### Chunking Best Practices

| Document Type | Chunk Size | Overlap | Splitter |
|---------------|------------|---------|----------|
| General text | 500-1000 | 50-100 | RecursiveCharacter |
| Code | 300-500 | 50 | Language-specific |
| Markdown | 500-1000 | 50-100 | MarkdownHeader |
| Conversations | 500-1000 | 0 | RecursiveCharacter |
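
To see how chunk size and overlap interact, here is a minimal fixed-width chunker. It is a simplified sketch, not what `RecursiveCharacterTextSplitter` does internally (that splitter prefers natural boundaries such as paragraphs); the function name `chunk_with_overlap` is made up for this example.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-width windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(text, chunk_size=10, chunk_overlap=2)
print(chunks)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

Each chunk repeats the last two characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when the overlap is long enough.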

---

## Example 4: Embeddings Comparison

Different embedding models have different characteristics:

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.embeddings import HuggingFaceEmbeddings

text = "Python is a programming language"

# OpenAI embeddings (hosted API, billed per token)
openai_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
openai_vector = openai_embeddings.embed_query(text)
print(f"OpenAI embedding dimension: {len(openai_vector)}")
print(f"First 5 values: {openai_vector[:5]}")

# HuggingFace embeddings (free, local)
hf_embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
hf_vector = hf_embeddings.embed_query(text)
print(f"\nHuggingFace embedding dimension: {len(hf_vector)}")
print(f"First 5 values: {hf_vector[:5]}")

### Embedding Models Comparison

| Model | Dimensions | Cost | Quality | Use Case |
|-------|------------|------|---------|----------|
| OpenAI text-embedding-3-small | 1536 | $$ | Excellent | Production |
| OpenAI text-embedding-3-large | 3072 | $$$ | Best | High accuracy |
| HuggingFace all-MiniLM-L6-v2 | 384 | Free | Good | Development/Local |
| Cohere embed-english-v3.0 | 1024 | $$ | Excellent | English text |
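
The dimension column also drives storage cost. As a rough back-of-the-envelope sketch, assuming each value is stored as a 4-byte float and ignoring index overhead and metadata (both assumptions, and `raw_index_size_gb` is a hypothetical helper):

```python
def raw_index_size_gb(num_vectors, dimensions, bytes_per_value=4):
    """Raw vector storage in GB (10^9 bytes), without index overhead or metadata."""
    return num_vectors * dimensions * bytes_per_value / 1e9


models = [
    ("text-embedding-3-small", 1536),
    ("text-embedding-3-large", 3072),
    ("all-MiniLM-L6-v2", 384),
]
for name, dims in models:
    size = raw_index_size_gb(1_000_000, dims)
    print(f"{name}: {size:.2f} GB per million chunks")
```

Under these assumptions, a million chunks need about 6.14 GB at 1536 dimensions and about 1.54 GB at 384, so the smaller model is roughly four times cheaper to store.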

---

## Example 5: Vector Stores - Chroma

Chroma runs in memory by default (pass `persist_directory` to keep data on disk), which makes it convenient for development:

In [None]:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Sample documents
docs = [
    "The factory pattern is a creational design pattern.",
    "The observer pattern is a behavioral design pattern.",
    "The decorator pattern is a structural design pattern.",
    "Creational patterns deal with object creation.",
    "Behavioral patterns deal with object communication.",
    "Structural patterns deal with object composition."
]

# Create embeddings
embeddings = OpenAIEmbeddings()

# Create vector store
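# Set chunk_overlap explicitly: the default (200) exceeds chunk_size=100 and raises ValueError
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)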
splits = text_splitter.create_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

# Similarity search
query = "What is the factory pattern?"
results = vectorstore.similarity_search(query, k=2)

print(f"Query: {query}\n")
for i, doc in enumerate(results, 1):
    print(f"Result {i}: {doc.page_content}")

---

## Example 6: Vector Stores - FAISS

High-performance vector store (production-ready):

In [None]:
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Sample documents
docs = [
    "Python was created by Guido van Rossum.",
    "Python is known for its simplicity.",
    "Python has a large ecosystem of libraries.",
]

# Create FAISS vector store
embeddings = OpenAIEmbeddings()
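# Set chunk_overlap explicitly: the default (200) exceeds chunk_size=100 and raises ValueError
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)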
splits = text_splitter.create_documents(docs)
vectorstore = FAISS.from_documents(documents=splits, embedding=embeddings)

# Save to disk
vectorstore.save_local("faiss_index")

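# Load from disk. The saved index includes a pickle file, so only set
# allow_dangerous_deserialization=True for index files you created yourself.
loaded_vectorstore = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)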

# Search
results = loaded_vectorstore.similarity_search("Who made Python?", k=1)
print(results[0].page_content)

### Vector Store Comparison

| Vector Store | Persistence | Performance | Best For |
|--------------|-------------|-------------|----------|
| Chroma | Optional | Good | Development |
| FAISS | File-based | Excellent | Local production |
| Pinecone | Cloud | Excellent | Cloud production |
| Weaviate | Self-hosted/Cloud | Excellent | Enterprise |
| Qdrant | Self-hosted/Cloud | Excellent | High scale |

---

## Example 7: Retrieval Strategies

Different ways to retrieve documents:

In [None]:
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create vector store
docs = [
    "Python is great for data science.",
    "JavaScript is used for web development.",
    "Python has many data science libraries like NumPy and Pandas.",
    "JavaScript frameworks include React and Vue.",
    "Data science involves statistics and machine learning."
]
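# Set chunk_overlap explicitly: the default (200) exceeds chunk_size=100 and raises ValueError
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)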
splits = text_splitter.create_documents(docs)
vectorstore = Chroma.from_documents(splits, OpenAIEmbeddings())

# Strategy 1: Similarity search (default)
retriever_similarity = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2}
)
results = retriever_similarity.invoke("Python data science")  # get_relevant_documents is deprecated
print("Similarity search:")
for doc in results:
    print(f"- {doc.page_content}")

print("\n" + "="*80 + "\n")

# Strategy 2: MMR (Maximum Marginal Relevance) - diverse results
retriever_mmr = vectorstore.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 2, "fetch_k": 4}
)
results = retriever_mmr.invoke("Python data science")
print("MMR (diverse results):")
for doc in results:
    print(f"- {doc.page_content}")

print("\n" + "="*80 + "\n")

# Strategy 3: Similarity with threshold
retriever_threshold = vectorstore.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8}
)
results = retriever_threshold.invoke("Python data science")
print("Similarity with threshold (score > 0.8):")
for doc in results:
    print(f"- {doc.page_content}")
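
To make the MMR strategy less of a black box, here is a minimal sketch of the usual MMR selection loop on hand-made 2-D vectors. It is an illustration under stated assumptions, not LangChain's code: the vectors are invented, and `lambda_mult` is borrowed from the LangChain parameter of the same name (1.0 means pure relevance, lower values favor diversity).

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Return indices of k documents, trading relevance against redundancy."""
    candidates = list(range(len(doc_vecs)))
    selected = []
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(query_vec, doc_vecs[i])
            # Similarity to the closest document that was already picked.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 1.0]
docs = [
    [1.0, 0.8],   # most relevant
    [1.0, 0.75],  # near-duplicate of the first
    [0.5, 1.0],   # less relevant but different
]
print(mmr(query, docs, k=2, lambda_mult=0.5))  # [0, 2]
print(mmr(query, docs, k=2, lambda_mult=1.0))  # [0, 1]
```

With `lambda_mult=0.5` the near-duplicate is skipped in favor of the more distinct vector; with `lambda_mult=1.0` the loop reduces to plain similarity ranking.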

---

## Example 8: create_retrieval_chain (Simplified RAG)

Use built-in helper for complete RAG chains:

In [None]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create vector store
docs = [
    "LangChain is a framework for developing LLM applications.",
    "LangChain supports multiple LLM providers like OpenAI and Anthropic.",
    "RAG is a technique to augment LLMs with external knowledge.",
    "LCEL is LangChain's expression language for building chains."
]
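# Set chunk_overlap explicitly: the default (200) exceeds chunk_size=100 and raises ValueError
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=0)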
splits = text_splitter.create_documents(docs)
vectorstore = Chroma.from_documents(splits, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Create prompt
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know. "
    "Use three sentences maximum and keep the answer concise.\n\n"
    "{context}"
)
prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("human", "{input}")
])

# Create chains
llm = ChatOpenAI(model="gpt-4")
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

# Ask question
result = rag_chain.invoke({"input": "What is LangChain?"})
print(f"Question: {result['input']}")
print(f"Answer: {result['answer']}")
print(f"\nSource documents:")
for i, doc in enumerate(result['context'], 1):
    print(f"{i}. {doc.page_content}")

---

## Example 9: Custom RAG Chain with LCEL

Build RAG from scratch with full control:

In [None]:
from langchain_core.runnables import RunnablePassthrough, RunnableParallel
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Create vector store
docs = [
    "Design patterns are reusable solutions to common software problems.",
    "The singleton pattern ensures a class has only one instance.",
    "The factory pattern provides an interface for creating objects.",
    "The observer pattern defines one-to-many dependencies between objects."
]
vectorstore = Chroma.from_texts(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# Custom prompt
template = """You are a helpful assistant. Answer the question based on the context below.
If you can't answer, say "I don't have enough information."

Context: {context}

Question: {question}

Answer:"""
prompt = ChatPromptTemplate.from_template(template)

# Helper function
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Build custom RAG chain with LCEL
rag_chain = (
    RunnableParallel(
        context=retriever | format_docs,
        question=RunnablePassthrough()
    )
    | prompt
    | ChatOpenAI(model="gpt-4", temperature=0)
    | StrOutputParser()
)

# Test it
answer = rag_chain.invoke("What is the factory pattern?")
print(answer)

---

## Example 10: RAG with Chat History

Contextualized RAG that understands follow-up questions:

In [None]:
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Create vector store
docs = [
    "Python was created by Guido van Rossum in 1991.",
    "Python is known for its simple and readable syntax.",
    "Python has a large ecosystem including Django, Flask, and NumPy."
]
vectorstore = Chroma.from_texts(docs, OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

llm = ChatOpenAI(model="gpt-4")

# Contextualize question (rewrite based on chat history)
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages([
    ("system", contextualize_q_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}")
])
history_aware_retriever = create_history_aware_retriever(
    llm, retriever, contextualize_q_prompt
)

# Answer question
qa_system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know.\n\n"
    "{context}"
)
qa_prompt = ChatPromptTemplate.from_messages([
    ("system", qa_system_prompt),
    MessagesPlaceholder("chat_history"),
    ("human", "{input}")
])
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

# Complete RAG chain
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

# Conversation
chat_history = []

# First question
result = rag_chain.invoke({
    "input": "Who created Python?",
    "chat_history": chat_history
})
print(f"Q: Who created Python?")
print(f"A: {result['answer']}\n")

chat_history.extend([
    HumanMessage(content="Who created Python?"),
    AIMessage(content=result["answer"])
])

# Follow-up question (uses chat history!)
result = rag_chain.invoke({
    "input": "What year?",  # Refers to creation year from previous question
    "chat_history": chat_history
})
print(f"Q: What year?")
print(f"A: {result['answer']}")

---

## Advanced Pattern: Multi-Query RAG

Generate multiple queries for better retrieval:

In [None]:
from langchain.retrievers.multi_query import MultiQueryRetriever
from langchain_openai import ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Create vector store
docs = [
    "Machine learning is a subset of AI focused on learning from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Supervised learning uses labeled data for training.",
    "Unsupervised learning finds patterns without labels."
]
vectorstore = Chroma.from_texts(docs, OpenAIEmbeddings())
base_retriever = vectorstore.as_retriever()

# Multi-query retriever (generates alternative questions)
llm = ChatOpenAI(model="gpt-4", temperature=0)
retriever = MultiQueryRetriever.from_llm(
    retriever=base_retriever,
    llm=llm
)

# Single query retrieves using multiple generated questions
results = retriever.invoke("What is ML?")  # get_relevant_documents is deprecated
print("Retrieved documents:")
for i, doc in enumerate(results, 1):
    print(f"{i}. {doc.page_content}")
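
The merging step behind multi-query retrieval can be sketched in a few lines: run every query variant, then keep the unique union of the results. In this sketch the query variants are written by hand (the real retriever asks the LLM for them), and `keyword_search` is a hypothetical word-overlap search standing in for the vector store.

```python
import re

DOCS = [
    "Machine learning is a subset of AI focused on learning from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Supervised learning uses labeled data for training.",
    "Unsupervised learning finds patterns without labels.",
]


def keyword_search(query, k):
    # Stand-in for vector search: rank documents by shared words.
    q_words = set(re.findall(r"[a-z]+", query.lower()))
    scored = [(len(q_words & set(re.findall(r"[a-z]+", d.lower()))), d) for d in DOCS]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def multi_query_retrieve(queries, search, k=2):
    """Run each query and return the unique union of results, in first-seen order."""
    seen = set()
    merged = []
    for query in queries:
        for doc in search(query, k):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hand-written variants of one user question.
variants = ["What is machine learning?", "How does learning from data work?"]
merged = multi_query_retrieve(variants, keyword_search, k=2)
for doc in merged:
    print("-", doc)
```

Each variant alone returns two documents; together they surface three, which is the recall gain this pattern is after.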

---

## Advanced Pattern: Parent Document Retriever

Retrieve small chunks but return larger parent documents:

In [None]:
from langchain.retrievers import ParentDocumentRetriever
from langchain.storage import InMemoryStore
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Sample document
full_doc = """
Python Programming Language.

Python is a high-level, interpreted programming language. It was created by Guido van Rossum and first released in 1991.

Python emphasizes code readability with significant whitespace. It supports multiple programming paradigms including procedural, object-oriented, and functional programming.

Python has a large standard library and ecosystem. Popular frameworks include Django for web development and NumPy for numerical computing.
"""

# Parent splitter (large chunks)
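# chunk_overlap is set explicitly because the default (200) must not exceed chunk_size
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
# Child splitter (small chunks for retrieval)
child_splitter = RecursiveCharacterTextSplitter(chunk_size=50, chunk_overlap=0)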

# Storage for parent documents
store = InMemoryStore()
vectorstore = Chroma(embedding_function=OpenAIEmbeddings())

# Create retriever
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter
)

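# Add documents (add_documents expects Document objects, not dicts)
from langchain_core.documents import Document
retriever.add_documents([Document(page_content=full_doc)])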

# Retrieves based on small chunks but returns large parent
results = retriever.invoke("Who created Python?")  # get_relevant_documents is deprecated
print("Retrieved parent document:")
print(results[0].page_content)
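
The bookkeeping behind this pattern is a child-to-parent lookup: search over small pieces, then return the larger text each piece came from. Here is a dependency-free sketch, where word overlap stands in for vector search and the IDs `p1` and `p2` are invented for the example.

```python
import re


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


# Parent documents: whole paragraphs, keyed by a made-up ID.
parents = {
    "p1": "Python is a high-level, interpreted programming language. "
          "It was created by Guido van Rossum and first released in 1991.",
    "p2": "Python has a large standard library and ecosystem. "
          "Popular frameworks include Django for web development and NumPy for numerical computing.",
}

# Child index: one entry per sentence, each remembering its parent's ID.
child_index = []
for parent_id, text in parents.items():
    for sentence in re.split(r"(?<=\.)\s+", text):
        child_index.append((words(sentence), parent_id))


def retrieve_parent(question):
    """Match the question against child sentences, return the full parent text."""
    q = words(question)
    _, best_parent_id = max(child_index, key=lambda item: len(q & item[0]))
    return parents[best_parent_id]


result = retrieve_parent("When was Python first released?")
print(result)
```

The match happens on the single sentence about the release year, but the caller receives the whole paragraph, including who created the language.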

---

## Best Practices

### ✅ Do

1. **Choose appropriate chunk size** (500-1000 for general text)
2. **Add chunk overlap** (10-20% of chunk size)
3. **Use RecursiveCharacterTextSplitter** (respects document structure)
4. **Include metadata** (source, page number, etc.)
5. **Test retrieval quality** (check if right docs are retrieved)
6. **Use MMR for diversity** (avoid redundant results)
7. **Add citations** (return source documents)
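
Items 4 and 7 work together: metadata attached at load time is what makes citations possible later. A minimal sketch, assuming a `Doc` dataclass as a stand-in for LangChain's `Document` and invented file names:

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    # Hypothetical stand-in for a document with page_content and metadata.
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context_with_sources(docs):
    """Number each chunk and prefix it with its source so answers can cite [n]."""
    lines = []
    for i, doc in enumerate(docs, 1):
        source = doc.metadata.get("source", "unknown")
        page = doc.metadata.get("page")
        label = f"{source}, p. {page}" if page is not None else source
        lines.append(f"[{i}] ({label}) {doc.page_content}")
    return "\n".join(lines)


retrieved = [
    Doc("Python was created by Guido van Rossum.", {"source": "history.pdf", "page": 3}),
    Doc("Python is known for its simplicity.", {"source": "intro.txt"}),
]
context = format_context_with_sources(retrieved)
print(context)
```

Passing this numbered context to the prompt, and asking the model to cite the bracketed numbers, is one simple way to get traceable answers.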

### ❌ Don't

1. **Don't use tiny chunks** (<100 chars loses context)
2. **Don't use huge chunks** (>2000 chars too much noise)
3. **Don't skip overlap** (loses continuity between chunks)
4. **Don't ignore document structure** (split mid-sentence)
5. **Don't retrieve too many docs** (context window limit)
6. **Don't ignore retrieval metrics** (precision/recall)
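
For the last item, precision and recall at k are simple to compute once you have a small set of questions with known relevant documents. A minimal sketch with invented document IDs:

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc_id in top_k if doc_id in relevant) / len(top_k)


def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for doc_id in retrieved[:k] if doc_id in relevant) / len(relevant)


# Retriever output (ranked) and hand-labeled ground truth for one question.
retrieved = ["doc3", "doc1", "doc7", "doc2"]
relevant = {"doc1", "doc2", "doc5"}

for k in (3, 4):
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    print(f"k={k}: precision={p:.2f}, recall={r:.2f}")
```

Averaging these numbers over a set of test questions gives a baseline to compare chunk sizes, embedding models, and retrieval strategies against.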

---

## Common Pitfalls

### ❌ Mistake 1: Chunks Too Large

```python
# Bad - chunks too large, too much irrelevant info
splitter = RecursiveCharacterTextSplitter(chunk_size=5000)
```

**Solution**: Use 500-1000 character chunks.

### ❌ Mistake 2: No Overlap

```python
# Bad - loses context at chunk boundaries
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
```

**Solution**: Add 10-20% overlap.

### ❌ Mistake 3: Not Testing Retrieval

```python
# Bad - assumes retrieval works without testing
retriever = vectorstore.as_retriever()
```

**Solution**: Test with sample queries:
```python
# Test retrieval quality
test_query = "your test question"
docs = retriever.invoke(test_query)  # get_relevant_documents is deprecated
for doc in docs:
    print(doc.page_content)
```

---

## Practice Exercises

In [None]:
# Exercise 1: Build a RAG system for code documentation
# Load Python docstrings, split appropriately, enable Q&A

from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Sample code documentation
code_docs = [
    """def calculate_sum(numbers: List[int]) -> int:
    '''Calculate the sum of a list of numbers.
    
    Args:
        numbers: List of integers to sum
    
    Returns:
        The sum of all numbers
    '''""",
    """def filter_even(numbers: List[int]) -> List[int]:
    '''Filter even numbers from a list.
    
    Args:
        numbers: List of integers
    
    Returns:
        List containing only even numbers
    '''"""
]

# Your code here: Build RAG system for code Q&A
# ...

In [None]:
# Exercise 2: Implement RAG with source citations
# Answer questions and include which documents were used

# Your code here:
# Build a chain that returns both answer and source documents
# ...

In [None]:
# Exercise 3: Build RAG with different retrieval strategies
# Compare similarity search vs MMR for the same query

# Your code here:
# Create two retrievers and compare results
# ...

---

## Key Takeaways

### ✅ What We Learned

1. **RAG Pipeline**: Load → Split → Embed → Store → Retrieve → Generate
2. **Document Loaders**: TextLoader, PyPDFLoader, WebBaseLoader, etc.
3. **Text Splitting**: RecursiveCharacterTextSplitter (recommended)
4. **Embeddings**: OpenAI (hosted, paid), HuggingFace (free, local)
5. **Vector Stores**: Chroma (dev), FAISS (prod), Pinecone (cloud)
6. **Retrieval Strategies**: Similarity, MMR, threshold
7. **create_retrieval_chain**: Built-in RAG helper
8. **Chat History**: Contextualize questions with history
9. **Advanced Patterns**: Multi-query, parent document retrieval

### 📚 Next Steps

- **langchain_agents.ipynb**: Combine RAG with agents
- **langchain_memory.ipynb**: Advanced conversation memory
- Production RAG: Evaluation, monitoring, optimization

---

## Resources

- [RAG Tutorial](https://python.langchain.com/docs/tutorials/rag/)
- [Document Loaders](https://python.langchain.com/docs/integrations/document_loaders/)
- [Text Splitters](https://python.langchain.com/docs/modules/data_connection/document_transformers/)
- [Vector Stores](https://python.langchain.com/docs/integrations/vectorstores/)
- [Retrieval Strategies](https://python.langchain.com/docs/modules/data_connection/retrievers/)

---

**Next Notebook**: `langchain_agents.ipynb` - Build intelligent agents with tools