# Lab 4: RAG Pipeline Implementation

**Module 4 - Retrieval-Augmented Generation**

| Duration | Difficulty | Framework | Exercises |
|----------|------------|-----------|----------|
| 120 min | Advanced | LangChain + FAISS | 4 |

## Learning Objectives

- Implement document loading and text chunking strategies
- Generate and store embeddings using OpenAI and FAISS
- Build semantic search with similarity scoring
- Create a complete RAG pipeline with context injection

## Setup

In [None]:
# !pip install langchain langchain-community langchain-text-splitters langchain-openai faiss-cpu tiktoken

In [None]:
import os
import numpy as np
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

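# Uses OPENAI_API_KEY from the environment if it is already set;
# otherwise replace the placeholder below with your own key.
os.environ.setdefault("OPENAI_API_KEY", "your-api-key-here")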

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
llm = ChatOpenAI(model="gpt-4", temperature=0)

In [None]:
# Sample documents
documents = [
    """Machine Learning is a subset of artificial intelligence that enables
    systems to learn and improve from experience without being explicitly
    programmed. It focuses on developing computer programs that can access
    data and use it to learn for themselves.

    The process begins with observations or data, such as examples, direct
    experience, or instruction. It looks for patterns in data and makes
    better decisions in the future based on the examples provided.

    There are three main types of machine learning: supervised learning,
    unsupervised learning, and reinforcement learning.""",

    """Deep Learning is part of a broader family of machine learning methods
    based on artificial neural networks. Learning can be supervised,
    semi-supervised or unsupervised.

    Deep learning architectures such as deep neural networks, recurrent
    neural networks, convolutional neural networks and transformers have
    been applied to fields including computer vision, speech recognition,
    natural language processing, and machine translation.

    Neural networks are inspired by biological neural networks, although
    they are not identical."""
]

---

## Exercise 1: Document Chunking

Implement a chunking helper, then compare how chunk size and overlap change the number and average length of the resulting chunks.

**Your Task:** Complete the chunking function and analyze different strategies.

In [None]:
def create_chunks(texts: list, chunk_size: int, chunk_overlap: int) -> list:
    """Create chunks from a list of texts."""
    # TODO: Initialize RecursiveCharacterTextSplitter
    splitter = None  # Your code here
    
    # TODO: Split all documents and return chunks as Document objects
    all_chunks = []
    # Your code here
    
    return all_chunks


def analyze_chunking_strategy(texts: list):
    """Compare different chunking strategies."""
    strategies = [
        {"chunk_size": 100, "chunk_overlap": 20},
        {"chunk_size": 200, "chunk_overlap": 40},
        {"chunk_size": 500, "chunk_overlap": 50},
    ]
    
    for strategy in strategies:
        chunks = create_chunks(texts, **strategy)
        print(f"\n--- Strategy: {strategy} ---")
        print(f"Number of chunks: {len(chunks)}")
        if chunks:
            avg_len = sum(len(c.page_content) for c in chunks) / len(chunks)
            print(f"Average chunk length: {avg_len:.0f}")

In [None]:
# Run analysis
# analyze_chunking_strategy(documents)
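
The interaction between `chunk_size` and `chunk_overlap` can be seen in a simplified, standard-library-only sketch. This is not how `RecursiveCharacterTextSplitter` works internally (that splitter also respects separators such as paragraph and line breaks); `sliding_window_chunks` is a hypothetical helper that only illustrates fixed-width windows advancing by `chunk_size - chunk_overlap` characters.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-width windows that share chunk_overlap characters.

    Hypothetical helper for illustration; it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij"
for size, overlap in [(4, 0), (4, 1), (4, 2)]:
    pieces = sliding_window_chunks(sample, size, overlap)
    print(f"size={size} overlap={overlap}: {len(pieces)} chunks -> {pieces}")
```

Larger overlap means more chunks for the same text, so the vector store grows, but each chunk keeps more of its neighbor's context.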

---

## Exercise 2: Embedding Generation and Vector Store

Generate embeddings and create a FAISS vector store.

**Your Task:** Create a vector store and compare cosine similarities between sample texts.

In [None]:
def create_vector_store(documents: list) -> FAISS:
    """Create a FAISS vector store from documents."""
    # TODO: Create chunks from documents
    chunks = None  # Your code here
    
    # TODO: Create FAISS vector store with embeddings
    vector_store = None  # Your code here
    
    return vector_store


def compare_similarities(texts: list):
    """Compare semantic similarities between texts."""
    # TODO: Generate embeddings for all texts
    text_embeddings = None  # Your code here
    
    # TODO: Calculate and display cosine similarities
    print("\nSimilarity Matrix:")
    # Your code here

In [None]:
# Test
test_texts = [
    "Machine learning uses data to make predictions",
    "Deep learning is based on neural networks",
    "The weather today is sunny and warm",
    "AI systems can learn from experience"
]
# compare_similarities(test_texts)
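
The similarity matrix itself can be sketched without any API calls. The vectors below are made-up three-dimensional stand-ins, not real embeddings; in the lab the vectors would presumably come from `embeddings.embed_documents(texts)`. The helper names `cosine_similarity` and `similarity_matrix` are illustrative.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """Pairwise cosine similarities; entry [i][j] compares vectors i and j."""
    return [[cosine_similarity(a, b) for b in vectors] for a in vectors]


# Toy vectors (assumed values, chosen only to make the pattern visible).
toy_vectors = [
    [1.0, 0.0, 0.0],  # stands in for "machine learning" text
    [0.8, 0.6, 0.0],  # stands in for "deep learning" text
    [0.0, 0.0, 1.0],  # stands in for "weather" text
]

for row in similarity_matrix(toy_vectors):
    print("  ".join(f"{value:.2f}" for value in row))
```

The matrix is symmetric with ones on the diagonal, so only the entries above the diagonal carry new information.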

---

## Exercise 3: Semantic Search

Implement semantic search with relevance scoring.

**Your Task:** Build search functions with different retrieval methods.

In [None]:
def semantic_search(vector_store, query: str, k: int = 3) -> list:
    """Perform semantic search and return results with scores."""
    # TODO: Perform similarity search with scores
    # Note: LangChain's FAISS store uses L2 distance by default, so a lower score means a closer match
    results = None  # Your code here
    
    formatted_results = []
    # TODO: Format results with content, metadata, and similarity scores
    
    return formatted_results


def compare_search_methods(vector_store, query: str):
    """Compare different search methods."""
    print(f"\nQuery: {query}")
    print("=" * 60)
    
    # TODO: Implement basic search, search with scores, and MMR search
    pass
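
Maximal marginal relevance (MMR), one of the methods named in the TODO above, can be sketched over toy vectors. This is a minimal, standard-library illustration of the selection rule, not LangChain's implementation; `mmr_select` is a hypothetical helper, the vectors are made up, and the parameter name `lambda_mult` is borrowed from LangChain's MMR search for familiarity.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query_vec, doc_vecs, k: int, lambda_mult: float = 0.5) -> list[int]:
    """Return indices of up to k documents chosen greedily by MMR.

    Each step picks the candidate maximizing
    lambda_mult * relevance - (1 - lambda_mult) * redundancy,
    where redundancy is the highest similarity to an already selected document.
    """
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = candidates[0], -math.inf
        for i in candidates:
            relevance = cosine(query_vec, doc_vecs[i])
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


query = [1.0, 0.2]
docs = [
    [1.0, 0.0],   # close to the query
    [1.0, 0.05],  # near-duplicate of the first document
    [0.5, 1.0],   # less relevant, but different
]

print("relevance only:", mmr_select(query, docs, k=2, lambda_mult=1.0))
print("favor diversity:", mmr_select(query, docs, k=2, lambda_mult=0.3))
```

With `lambda_mult=1.0` the rule reduces to plain similarity ranking; lowering it penalizes near-duplicates of documents that were already chosen.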

---

## Exercise 4: Complete RAG Pipeline

Build a complete RAG pipeline that retrieves context and generates responses.

**Your Task:** Implement the RAGPipeline class.

In [None]:
class RAGPipeline:
    def __init__(self, vector_store, llm, k: int = 3):
        self.vector_store = vector_store
        self.llm = llm
        self.k = k
        
        # TODO: Define the RAG prompt template
        self.prompt = None  # Your code here
    
    def retrieve(self, query: str) -> str:
        """Retrieve relevant documents and format as context."""
        # TODO: Retrieve documents and format them
        pass
    
    def generate(self, question: str, context: str) -> str:
        """Generate response using retrieved context."""
        # TODO: Generate response
        pass
    
    def query(self, question: str) -> dict:
        """Run the full RAG pipeline."""
        # TODO: Implement retrieve and generate
        pass

In [None]:
# Test RAG pipeline
# vector_store = create_vector_store(documents)
# rag = RAGPipeline(vector_store, llm)
# result = rag.query("What are the types of machine learning?")
# print(result)
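
The retrieve-then-generate flow can be sketched with stand-ins for the vector store and the LLM, which keeps the control flow visible without API calls. Everything here is an assumption made for illustration: the prompt wording, the helper names (`format_context`, `run_rag`, `keyword_retrieve`, `echo_generate`), and the keys of the returned dict, which the lab does not specify.

```python
from typing import Callable

# Assumed prompt wording; the lab leaves the template to you.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context is insufficient, say you don't know.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)


def format_context(chunks: list[str]) -> str:
    """Number each retrieved chunk so the answer can refer back to it."""
    return "\n\n".join(f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1))


def run_rag(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    k: int = 3,
) -> dict:
    """Retrieve k chunks, inject them into the prompt, and generate an answer."""
    chunks = retrieve(question, k)
    context = format_context(chunks)
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    return {"question": question, "context": context, "answer": generate(prompt)}


corpus = [
    "Supervised learning trains on labeled examples.",
    "Unsupervised learning finds structure in unlabeled data.",
    "The weather today is sunny and warm.",
]


def keyword_retrieve(question: str, k: int) -> list[str]:
    """Toy retriever: rank passages by the number of words shared with the question."""
    words = set(question.lower().split())
    ranked = sorted(corpus, key=lambda p: len(words & set(p.lower().split())), reverse=True)
    return ranked[:k]


def echo_generate(prompt: str) -> str:
    """Stub LLM: reports the prompt size instead of calling a model."""
    return f"(stub answer based on a {len(prompt)}-character prompt)"


result = run_rag("How does supervised learning work?", keyword_retrieve, echo_generate, k=2)
print(result["context"])
print(result["answer"])
```

In the `RAGPipeline` class, the stubs would be replaced by the vector store's search and the chat model, while the three steps (retrieve, format, generate) stay the same.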

---

## Checkpoint

You've completed Lab 4! Key concepts:

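- Chunk size and overlap determine what each retrieved passage contains, which affects retrieval quality
- Embeddings capture semantic meaning as vectors, so related texts score as similar
- Semantic search ranks chunks by their similarity to the query
- RAG combines retrieval with generation for grounded responses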

**Next:** Lab 5 - LoRA Fine-tuning