# Module 8: RAG with LangChain

## Overview

Build complete RAG systems using LangChain's modern LCEL (LangChain Expression Language).

**Learning Objectives:**
1. Use LangChain core components (loaders, splitters, embeddings, vector stores)
2. Build RAG chains with LCEL
3. Add basic conversational memory
4. Save and load vector stores


## Setup

**Required packages:**
```bash
pip install langchain langchain-core langchain-community langchain-openai langchain-text-splitters
pip install faiss-cpu python-dotenv
```

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise ValueError("OPENAI_API_KEY not found in .env file")

print("✅ API key loaded")

✅ API key loaded


## Part 1: LangChain Core Components

### 1.1 Documents

In [2]:
from langchain_core.documents import Document

# Create documents directly (you can also load from files using loaders)
documents = [
    Document(page_content="LangChain is a framework for developing applications powered by language models."),
    Document(page_content="RAG combines retrieval with generation to provide accurate, grounded responses."),
    Document(page_content="Vector stores enable efficient similarity search over embedded documents."),
    Document(page_content="FAISS is a library for efficient similarity search developed by Meta AI."),
    Document(page_content="Text splitters chunk documents into smaller pieces for better retrieval.")
]

print(f"Created {len(documents)} documents")
print(f"First document: {documents[0].page_content}")

Created 5 documents
First document: LangChain is a framework for developing applications powered by language models.
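
The comment in the cell above mentions loaders. As a rough illustration of what a loader produces, the sketch below reads `.txt` files into objects with the same two fields a `Document` carries (`page_content` and `metadata`). `SimpleDoc`, `load_text_files`, and the file names are hypothetical stand-ins written in plain Python; LangChain's own loaders are not used here.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDoc:
    """Hypothetical stand-in mirroring Document's two main fields."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_text_files(folder: Path) -> list[SimpleDoc]:
    """Read every .txt file in `folder` (sorted by name) into a SimpleDoc."""
    docs = []
    for path in sorted(folder.glob("*.txt")):
        text = path.read_text(encoding="utf-8").strip()
        # Keep the file name so answers can later point back to their source
        docs.append(SimpleDoc(page_content=text, metadata={"source": path.name}))
    return docs


# Demo with two throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a_langchain.txt").write_text("LangChain is a framework.", encoding="utf-8")
    (folder / "b_faiss.txt").write_text("FAISS is a similarity search library.", encoding="utf-8")
    loaded_docs = load_text_files(folder)

for doc in loaded_docs:
    print(doc.metadata["source"], "->", doc.page_content)
```

Storing a `source` entry in the metadata is a design choice that pays off later, when you want to show where an answer came from.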


### 1.2 Text Splitters

In [3]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Create splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=100,
    chunk_overlap=20,
    length_function=len
)

# Split documents
chunks = text_splitter.split_documents(documents)

print(f"Split {len(documents)} documents into {len(chunks)} chunks")
for i, chunk in enumerate(chunks[:3]):
    print(f"\nChunk {i+1}: {chunk.page_content}")

Split 5 documents into 5 chunks

Chunk 1: LangChain is a framework for developing applications powered by language models.

Chunk 2: RAG combines retrieval with generation to provide accurate, grounded responses.

Chunk 3: Vector stores enable efficient similarity search over embedded documents.
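
To build intuition for `chunk_size` and `chunk_overlap`, here is a simplified fixed-window splitter in plain Python. It is only a sketch: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs, newlines, and spaces, which this version ignores. `sliding_window_chunks` is a hypothetical helper. It also suggests why each sample document above stayed a single chunk: every one is shorter than 100 characters.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into windows of `chunk_size` that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if len(text) <= chunk_size:
        return [text]  # short texts are left untouched

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


short_text = "RAG combines retrieval with generation to provide accurate, grounded responses."
long_text = "abcdefghij" * 25  # 250 characters of filler

short_chunks = sliding_window_chunks(short_text, chunk_size=100, chunk_overlap=20)
long_chunks = sliding_window_chunks(long_text, chunk_size=100, chunk_overlap=20)

print(f"Short text -> {len(short_chunks)} chunk(s)")
print(f"Long text  -> chunk lengths {[len(c) for c in long_chunks]}")
print(f"Overlap preserved: {long_chunks[0][-20:] == long_chunks[1][:20]}")
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary is less likely to be lost to retrieval.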


### 1.3 Embeddings

In [4]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",
    openai_api_key=api_key
)

# Test embedding
test_embedding = embeddings.embed_query("What is RAG?")
print(f"Embedding dimension: {len(test_embedding)}")
print(f"First 5 values: {test_embedding[:5]}")

Embedding dimension: 1536
First 5 values: [0.0006903278990648687, 0.025743499398231506, 0.007137471344321966, 0.03336307406425476, -0.03193636238574982]
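
An embedding is just a list of floats, and texts with similar meaning tend to get vectors that point in similar directions. One common way to measure that is cosine similarity. The sketch below uses made-up 3-dimensional vectors instead of real 1536-dimensional embeddings, so the numbers are illustrative only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors (not real embeddings)
query_vec = [1.0, 0.0, 1.0]
close_vec = [0.9, 0.1, 0.8]   # points in nearly the same direction as the query
far_vec = [0.0, 1.0, 0.0]     # orthogonal to the query

close_score = cosine_similarity(query_vec, close_vec)
far_score = cosine_similarity(query_vec, far_vec)

print(f"close: {close_score:.3f}")
print(f"far:   {far_score:.3f}")
```

A score near 1 means the vectors point the same way, and a score of 0 means they are orthogonal.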


### 1.4 Vector Store

In [5]:
from langchain_community.vectorstores import FAISS

# Create vector store from documents
vectorstore = FAISS.from_documents(chunks, embeddings)

print(f"✅ Vector store created with {len(chunks)} chunks")

# Test similarity search
query = "What is FAISS?"
results = vectorstore.similarity_search(query, k=2)

print(f"\nQuery: {query}")
for i, doc in enumerate(results):
    print(f"\nResult {i+1}: {doc.page_content}")

✅ Vector store created with 5 chunks

Query: What is FAISS?

Result 1: FAISS is a library for efficient similarity search developed by Meta AI.

Result 2: RAG combines retrieval with generation to provide accurate, grounded responses.
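
Conceptually, a similarity search stores one vector per chunk and returns the `k` chunks whose vectors lie closest to the query vector. The sketch below does this exhaustively with Euclidean distance. `TinyVectorStore` and the 2-dimensional vectors are hypothetical and hand-picked for illustration; FAISS itself offers both exact and approximate indexes and is far more efficient than this loop.

```python
import math


class TinyVectorStore:
    """Hypothetical in-memory store that searches by brute force."""

    def __init__(self):
        self._items = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self._items.append((list(vector), text))

    def search(self, query_vector, k=2):
        """Return the k (distance, text) pairs closest to query_vector."""
        scored = [(math.dist(query_vector, vec), text) for vec, text in self._items]
        scored.sort(key=lambda pair: pair[0])  # smaller distance = more similar
        return scored[:k]


store = TinyVectorStore()
# Hand-picked toy vectors, NOT produced by an embedding model
store.add([0.1, 0.9], "LangChain is a framework for developing applications powered by language models.")
store.add([0.5, 0.5], "RAG combines retrieval with generation to provide accurate, grounded responses.")
store.add([0.9, 0.1], "FAISS is a library for efficient similarity search developed by Meta AI.")

toy_results = store.search([0.8, 0.2], k=2)
for distance, text in toy_results:
    print(f"{distance:.3f}  {text}")
```

Note that `k=2` always returns two results, even when the second one is only loosely related to the query, which is what happened with the RAG sentence in the output above.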


## Part 2: Building RAG with LCEL (LangChain Expression Language)

### 2.1 Simple RAG Chain

In [6]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Create LLM
llm = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0,
    openai_api_key=api_key
)

# Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# Create prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer using ONLY the provided context."),
    ("human", "{question}\n\nContext:\n{context}")
])

# Helper function to format documents
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

# Build RAG chain using LCEL
rag_chain = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | prompt
    | llm
    | StrOutputParser()
)

print("✅ RAG chain created")

✅ RAG chain created


In [7]:
# Query the chain
response = rag_chain.invoke("What is LangChain?")
print(response)

LangChain is a framework for developing applications powered by language models.
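
If the `|` syntax feels opaque, the following plain-Python sketch may help. Each step wraps a function, `|` chains two steps so the output of the left one feeds the right one, and a parallel step sends the same input to several branches and collects the results in a dict. `Step`, `parallel`, the fake retriever, and the fake model are all hypothetical; LangChain's real runnables do much more (batching, streaming, async), so treat this as a mental model only.

```python
class Step:
    """Hypothetical minimal runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b  ->  a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


def parallel(**branches):
    """Run every branch on the same input and return a dict of results."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Stand-ins for the real components (no API calls are made)
fake_retriever = Step(lambda question: [
    "LangChain is a framework for developing applications powered by language models."
])
join_docs = Step(lambda docs: "\n\n".join(docs))
passthrough = Step(lambda value: value)
fill_prompt = Step(lambda d: f"{d['question']}\n\nContext:\n{d['context']}")
# The fake model simply echoes whatever context it was given
fake_llm = Step(lambda prompt_text: prompt_text.split("Context:\n", 1)[1])

inputs_step = parallel(context=fake_retriever | join_docs, question=passthrough)
toy_chain = inputs_step | fill_prompt | fake_llm

toy_inputs = inputs_step.invoke("What is LangChain?")
toy_answer = toy_chain.invoke("What is LangChain?")
print(toy_inputs)
print(toy_answer)
```

Invoking `inputs_step` on its own shows the dict that reaches the prompt, which is a handy way to debug a chain one stage at a time.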


### 2.2 Custom Prompts

In [12]:
# Create a custom prompt with specific instructions
custom_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a precise assistant. If you don't know the answer based on the context, say 'I don't know'."),
    ("human", "Context: {context}\n\nQuestion: {question}")
])

# Build chain with custom prompt
custom_rag = (
    RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
    | custom_prompt
    | llm
    | StrOutputParser()
)

# Query
response = custom_rag.invoke("What is vector search?")
print(response)

Vector search is a type of search that involves finding similar items based on their vector representations in a high-dimensional space. It is commonly used in applications such as recommendation systems, image search, and document retrieval.


## Part 3: Simple Conversational RAG

In [9]:
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.prompts import MessagesPlaceholder

# Store for chat histories
chat_store = {}

def get_session_history(session_id: str):
    if session_id not in chat_store:
        chat_store[session_id] = InMemoryChatMessageHistory()
    return chat_store[session_id]

# Create conversational prompt
conv_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant. Answer using the context provided."),
    MessagesPlaceholder(variable_name="chat_history"),
    ("human", "Context: {context}\n\nQuestion: {question}")
])

# Build base chain
conv_chain_base = (
    RunnableParallel(
        context=lambda x: format_docs(retriever.invoke(x["question"])),
        question=lambda x: x["question"],
        chat_history=lambda x: x.get("chat_history", [])
    )
    | conv_prompt
    | llm
    | StrOutputParser()
)

# Wrap with message history
conv_chain = RunnableWithMessageHistory(
    conv_chain_base,
    get_session_history,
    input_messages_key="question",
    history_messages_key="chat_history"
)

print("✅ Conversational chain created")

✅ Conversational chain created


In [13]:
# First question
response1 = conv_chain.invoke(
    {"question": "What is FAISS?"},
    config={"configurable": {"session_id": "session1"}}
)
print(f"Q1: What is FAISS?")
print(f"A1: {response1}\n")

# Follow-up question (remembers context)
response2 = conv_chain.invoke(
    {"question": "Who developed it?"},
    config={"configurable": {"session_id": "session1"}}
)
print(f"Q2: Who developed it?")
print(f"A2: {response2}")

Q1: What is FAISS?
A1: FAISS is a library developed by Meta AI for efficient similarity search, which can be used in combination with RAG to enhance retrieval capabilities for generating accurate responses.

Q2: Who developed it?
A2: FAISS was developed by Meta AI.
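
The history wrapper does bookkeeping around each call: it looks up the messages stored for the given `session_id`, passes them to the chain, and then records the new question and answer. A plain-Python sketch of that bookkeeping follows. `histories`, `fake_chain`, and `invoke_with_history` are hypothetical names, and the fake chain only reports how many earlier messages it received.

```python
from collections import defaultdict

# session_id -> list of (role, text) tuples
histories = defaultdict(list)


def fake_chain(question, chat_history):
    """Hypothetical stand-in for the RAG chain; no model is called."""
    return f"answer to {question!r} with {len(chat_history)} prior messages"


def invoke_with_history(question, session_id):
    history = histories[session_id]
    # The chain sees only the turns recorded BEFORE this question
    answer = fake_chain(question, list(history))
    # Record the new exchange so the next call in this session can see it
    history.append(("human", question))
    history.append(("ai", answer))
    return answer


first = invoke_with_history("What is FAISS?", "session1")
second = invoke_with_history("Who developed it?", "session1")
other = invoke_with_history("Who developed it?", "session2")

print(first)
print(second)
print(other)
```

Each `session_id` gets its own list, so a question asked in one session never sees the messages of another.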


In [18]:
# View chat history
session = get_session_history("session1")
print("Chat History:")
for msg in session.messages:
    print(f"\n{msg.type}: {msg.content}")

Chat History:

human: What is FAISS?

ai: FAISS is a library developed by Meta AI for efficient similarity search.

human: What is FAISS?

ai: FAISS is a library developed by Meta AI for efficient similarity search, which can be used in combination with RAG to enhance retrieval capabilities for generating accurate responses.

human: Who developed it?

ai: FAISS was developed by Meta AI.


In [14]:
# First question
response1 = conv_chain.invoke(
    {"question": "What is FAISS?"},
    config={"configurable": {"session_id": "session3"}}
)
print(f"Q1: What is FAISS?")
print(f"A1: {response1}\n")

# Follow-up question (remembers context)
response2 = conv_chain.invoke(
    {"question": "Who developed it?"},
    config={"configurable": {"session_id": "session4"}}
)
print(f"Q2: Who developed it?")
print(f"A2: {response2}")

Q1: What is FAISS?
A1: FAISS is a library developed by Meta AI for efficient similarity search.

Q2: Who developed it?
A2: FAISS was developed by Meta AI.


## 🎯 Exercise 1: The Memory Mystery
**Your Task:**

1. **Run the code above** - What happens? Does it work?
2. **Question:** If `session4` has NO memory of `session1`, why does "Who developed it?" give a correct answer about FAISS?
3. **Investigate:** What is REALLY providing the context for the answer? (Hint: It's not memory!)
4. **Prove it:** Design a simple test to confirm your hypothesis.

**Bonus Challenge:** What would happen if you asked "Who developed it?" in a brand new `session5` WITHOUT asking about FAISS first?


## Part 4: Saving and Loading Vector Stores

In [None]:
# Save vector store
vectorstore.save_local("faiss_index")
print("✅ Vector store saved")

# Load vector store
loaded_vectorstore = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)
print("✅ Vector store loaded")

# Test
test_results = loaded_vectorstore.similarity_search("LangChain", k=1)
print(f"\nTest result: {test_results[0].page_content}")
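
A note on `allow_dangerous_deserialization=True`: `save_local` stores part of the vector store as a pickle file (by default `index.pkl`, next to `index.faiss`), and unpickling can execute code chosen by whoever created the file. The harmless sketch below shows the mechanism with a hypothetical `Payload` class. Only load indexes that you saved yourself or that come from a source you trust.

```python
import pickle


class Payload:
    """Hypothetical object that tells pickle to call eval() when loaded."""

    def __reduce__(self):
        # pickle stores "call eval with this argument" instead of the object
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not give back a Payload: it runs eval("6 * 7") and returns the result
loaded = pickle.loads(blob)
print(type(loaded).__name__, loaded)
```

Here the evaluated expression is simple arithmetic, but a malicious file could run any Python code in the same way.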

## Summary

### What You Learned:

1. **Core Components:**
   - Documents and loaders
   - Text splitters for chunking
   - OpenAI embeddings
   - FAISS vector store

2. **Modern RAG with LCEL:**
   - Simple RAG chains using `|` pipe operator
   - Returning source documents
   - Custom prompts
   - Streaming responses

3. **Conversational RAG:**
   - Adding message history
   - Session management

4. **Persistence:**
   - Saving and loading vector stores



## Additional Resources for Self-Study

**Official Documentation:**
- [LangChain Documentation](https://python.langchain.com/)
- [LCEL Conceptual Guide](https://python.langchain.com/docs/expression_language/)
- [Retrieval Conceptual Guide](https://python.langchain.com/docs/modules/data_connection/)

**Key Topics to Explore:**
1. **Document Loaders** - Try loading PDFs, web pages, and structured data
2. **Advanced Text Splitting** - Explore different splitters and optimal chunking strategies
3. **Alternative Vector Stores** - Compare Chroma vs FAISS
4. **Advanced Retrieval Strategies** - Implement MMR, multi-query retrieval, and contextual compression
5. **Prompt Engineering for RAG** - Techniques to reduce hallucination and improve accuracy
6. **Advanced Conversational RAG** - Query reformulation and context window management
7. **Metadata Filtering** - Add and filter documents by metadata for better retrieval


## 🎯 RAG Project 1: Build Your RAG Assistant

### Project Overview
Build a complete RAG (Retrieval-Augmented Generation) system using LangChain that answers questions about documents relevant to YOU. This is a portfolio project - make it meaningful and showcase-worthy!

**Why This Matters:**
- Apply everything you've learned in a real-world scenario
- Create something you'll actually use
- Build a portfolio piece that demonstrates your AI skills
- Lay the foundation for more complex RAG systems

---

### 📋 Project Requirements

#### **What You Must Build:**

1. **Your Document Collection**
   - Choose documents that matter to YOU 
   - Load and process at least 5-10 documents
   - Experiment with different chunk sizes and overlaps

2. **Vector Store with ChromaDB**
   - Create and persist embeddings
   - Ensure your data is saved for future sessions

3. **RAG Chain with LCEL**
   - Build using LangChain Expression Language
   - Custom prompt tailored to your use case
   - Return answers with source citations

4. **Conversational Memory**
   - Add chat history functionality
   - Handle follow-up questions naturally
   - Maintain context across questions

5. **Documentation & Testing**
   - Test with real questions you'd ask your system
   - Document your process and learnings
   - Write clear instructions for running your project

---

### 💡 Make It Portfolio-Worthy

#### **Project Structure**
```
my-rag-project/
│
├── documents/             # Your documents
├── chroma_db/             # Vector store
├── rag_system.ipynb       # Main notebook
├── .env                   # API keys
├── requirements.txt
└── README.md              # Tell your story
```

### ✅ What to Submit

**Your Notebook Should Include:**
- Complete working code
- Comments explaining your decisions
- Test questions and responses
- Observations from your experiments

**Your README Should Tell:**
- What problem you're solving
- What documents you're using (and why)
- How to run your system
- Example interactions
- What you learned
- Future improvements you'd make

**Include:**
- requirements.txt
- Your saved ChromaDB
- Clear instructions

### ⏰ Deadline
**Sunday, December 15, 2025 at 11:59 PM**
Submit your GitHub repository link before the deadline. The submission link will be provided.


**Good luck!** 🚀