# Retrieval-Augmented Generation (RAG) with DSPy

This notebook demonstrates how to build a RAG system using DSPy:
- Setting up document retrieval
- Creating RAG signatures and modules
- Implementing different RAG strategies
- Evaluating RAG performance

RAG retrieves documents relevant to a question and passes them to a language model as context, so the generated answer is grounded in the retrieved text rather than in the model's internal knowledge alone.

## Setup and Imports

In [None]:
import os
import sys
sys.path.append('../../')

import dspy
import numpy as np
from typing import List, Dict, Any
from utils import setup_default_lm, print_step, print_result, print_error
from utils.datasets import get_sample_rag_documents
from dotenv import load_dotenv

# Load environment variables
load_dotenv('../../.env')

## Configure Language Model

In [None]:
print_step("Configuring Language Model", "Setting up DSPy with OpenAI")

try:
    lm = setup_default_lm(provider="openai", model="gpt-4o", max_tokens=1000)
    dspy.configure(lm=lm)
    print_result("Language model configured successfully!")
except Exception as e:
    print_error(f"Failed to configure language model: {e}")
    print("Make sure you have set your OPENAI_API_KEY in the .env file")

## Simple In-Memory Retriever

Let's create a simple retriever that ranks documents by the cosine similarity between the TF-IDF vector of the query and the TF-IDF vector of each document.

In [None]:
print_step("Creating Document Retriever", "Building a simple TF-IDF based retriever")

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

class SimpleRetriever:
    def __init__(self, documents: List[str]):
        self.documents = documents
        self.vectorizer = TfidfVectorizer(stop_words='english', max_features=1000)
        self.document_vectors = self.vectorizer.fit_transform(documents)
    
    def retrieve(self, query: str, k: int = 3) -> List[str]:
        """Retrieve top-k most relevant documents for the query."""
        query_vector = self.vectorizer.transform([query])
        similarities = cosine_similarity(query_vector, self.document_vectors)[0]
        
        # Get top-k document indices
        top_indices = np.argsort(similarities)[::-1][:k]
        
        return [self.documents[i] for i in top_indices]

# Load sample documents
documents = get_sample_rag_documents()
retriever = SimpleRetriever(documents)

print_result(f"Retriever initialized with {len(documents)} documents")

# Test the retriever
test_query = "What is machine learning?"
retrieved_docs = retriever.retrieve(test_query, k=2)

print(f"\nQuery: {test_query}")
print("\nRetrieved documents:")
for i, doc in enumerate(retrieved_docs, 1):
    print(f"{i}. {doc[:100]}...")
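
To make the ranking step less of a black box, here is a minimal pure-Python sketch of TF-IDF weighting and cosine similarity. It is an approximation for illustration only: scikit-learn's `TfidfVectorizer` uses its own tokenizer, a much larger English stop-word list, and L2-normalized vectors. The helper names (`tokenize`, `idf_table`, `tfidf`, `cosine`), the tiny `STOP_WORDS` set, and the toy corpus are assumptions made for this example, not part of the notebook's utilities.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption; scikit-learn's list is much larger)
STOP_WORDS = {"what", "is", "a", "with", "from"}

def tokenize(text):
    """Lowercase, split on whitespace, trim punctuation, and drop stop words."""
    tokens = (t.strip(".,?!").lower() for t in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]

def idf_table(docs_tokens):
    """Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

toy_docs = [
    "Machine learning builds models from data.",
    "Deep learning uses neural networks with many layers.",
    "Python is a popular programming language.",
]
docs_tokens = [tokenize(d) for d in toy_docs]
idf = idf_table(docs_tokens)
doc_vectors = [tfidf(tokens, idf) for tokens in docs_tokens]

query_vector = tfidf(tokenize("What is machine learning?"), idf)
scores = [cosine(query_vector, v) for v in doc_vectors]

# Indices of documents sorted from most to least similar
ranked = sorted(range(len(toy_docs)), key=lambda i: scores[i], reverse=True)
for i in ranked:
    print(f"{scores[i]:.3f}  {toy_docs[i]}")
```

Note that the third document scores exactly zero yet would still be returned for `k=3`, because top-k selection does not filter on similarity. `SimpleRetriever.retrieve` behaves the same way, so a production retriever may want a minimum-similarity threshold.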

## RAG Signatures

Let's define signatures for our RAG system.

In [None]:
print_step("Defining RAG Signatures", "Creating input/output specifications for RAG")

class GenerateAnswer(dspy.Signature):
    """Answer a question using the provided context documents."""
    context = dspy.InputField(desc="Relevant documents or passages")
    question = dspy.InputField(desc="The question to answer")
    answer = dspy.OutputField(desc="A comprehensive answer based on the context")

class GenerateAnswerWithCitation(dspy.Signature):
    """Answer a question using provided context and include citations."""
    context = dspy.InputField(desc="Relevant documents or passages")
    question = dspy.InputField(desc="The question to answer")
    answer = dspy.OutputField(desc="A comprehensive answer based on the context")
    citations = dspy.OutputField(desc="Citations or references to specific parts of the context")

print_result("RAG signatures defined successfully!")

## Basic RAG Module

Let's create a simple RAG module that retrieves documents and generates answers.

In [None]:
print_step("Creating Basic RAG Module", "Combining retrieval and generation")

class BasicRAG(dspy.Module):
    def __init__(self, retriever, k=3):
        super().__init__()
        self.retriever = retriever
        self.k = k
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
    
    def forward(self, question):
        # Retrieve relevant documents
        context_docs = self.retriever.retrieve(question, k=self.k)
        
        # Combine contexts
        context = "\n\n".join([f"Document {i+1}: {doc}" for i, doc in enumerate(context_docs)])
        
        # Generate answer
        result = self.generate_answer(context=context, question=question)
        
        return dspy.Prediction(
            context=context,
            reasoning=result.reasoning,
            answer=result.answer
        )

# Create and test the basic RAG system
basic_rag = BasicRAG(retriever, k=2)

test_questions = [
    "What is machine learning?",
    "How does deep learning work?",
    "What programming language is mentioned in the documents?"
]

for question in test_questions:
    result = basic_rag(question=question)
    print_result(
        f"Question: {question}\n\n"
        f"Reasoning: {result.reasoning}\n\n"
        f"Answer: {result.answer}",
        "Basic RAG Result"
    )
    print("-" * 80)

## Advanced RAG with Citations

Let's create a more advanced RAG system that includes citations.

In [None]:
print_step("Creating Advanced RAG with Citations", "Adding citation tracking to RAG")

class AdvancedRAG(dspy.Module):
    def __init__(self, retriever, k=3):
        super().__init__()
        self.retriever = retriever
        self.k = k
        self.generate_answer = dspy.ChainOfThought(GenerateAnswerWithCitation)
    
    def forward(self, question):
        # Retrieve relevant documents
        context_docs = self.retriever.retrieve(question, k=self.k)
        
        # Create numbered context with clear document boundaries
        context_parts = []
        for i, doc in enumerate(context_docs, 1):
            context_parts.append(f"[Document {i}]: {doc}")
        
        context = "\n\n".join(context_parts)
        
        # Generate answer with citations
        result = self.generate_answer(context=context, question=question)
        
        return dspy.Prediction(
            context=context,
            reasoning=result.reasoning,
            answer=result.answer,
            citations=result.citations,
            retrieved_docs=context_docs
        )

# Create and test the advanced RAG system
advanced_rag = AdvancedRAG(retriever, k=2)

question = "What is the relationship between machine learning and deep learning?"
result = advanced_rag(question=question)

print_result(
    f"Question: {question}\n\n"
    f"Reasoning: {result.reasoning}\n\n"
    f"Answer: {result.answer}\n\n"
    f"Citations: {result.citations}",
    "Advanced RAG Result"
)

print("\nRetrieved Documents:")
for i, doc in enumerate(result.retrieved_docs, 1):
    print(f"Document {i}: {doc[:100]}...")
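
Because `citations` is a free-text output field, nothing guarantees that the model cites documents that were actually retrieved. The sketch below shows one way to map citation text back to the numbered `[Document N]` labels used in `AdvancedRAG` and to discard out-of-range references. The function name `extract_cited_indices` and the sample citation string are hypothetical; the string was written by hand for illustration and is not real model output.

```python
import re
from typing import List

def extract_cited_indices(citations: str, num_docs: int) -> List[int]:
    """Return sorted, 1-based document numbers mentioned as 'Document N'.

    Numbers outside 1..num_docs are dropped, since they cannot refer to a
    retrieved document.
    """
    found = {
        int(n)
        for n in re.findall(r"Document\s+(\d+)", citations, flags=re.IGNORECASE)
    }
    return sorted(n for n in found if 1 <= n <= num_docs)

# Hypothetical citation text (hand-written, not produced by a model)
sample_citations = (
    "[Document 1] defines machine learning; see also document 2 and [Document 7]."
)
sample_docs = ["First retrieved passage.", "Second retrieved passage."]

cited = extract_cited_indices(sample_citations, num_docs=len(sample_docs))
for n in cited:
    print(f"Document {n}: {sample_docs[n - 1]}")
```

In the notebook, the same check could be applied to `result.citations` with `num_docs=len(result.retrieved_docs)`.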

## Multi-Query RAG

Let's create a RAG system that generates multiple queries to improve retrieval coverage.

In [None]:
print_step("Creating Multi-Query RAG", "Generating multiple queries for better retrieval")

class QueryExpansion(dspy.Signature):
    """Generate multiple related queries to improve document retrieval."""
    original_query = dspy.InputField(desc="The original question")
    expanded_queries = dspy.OutputField(desc="3-5 related queries that could help find relevant information")

class MultiQueryRAG(dspy.Module):
    def __init__(self, retriever, k=2):
        super().__init__()
        self.retriever = retriever
        self.k = k
        self.expand_query = dspy.Predict(QueryExpansion)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
    
    def forward(self, question):
        # Expand the original query
        expansion_result = self.expand_query(original_query=question)
        
        # Parse expanded queries (simple split by newline)
        expanded_queries = [q.strip() for q in expansion_result.expanded_queries.split('\n') if q.strip()]
        all_queries = [question] + expanded_queries[:3]  # Limit to avoid too many queries
        
        # Retrieve documents for each query
        all_docs = []
        for query in all_queries:
            docs = self.retriever.retrieve(query, k=self.k)
            all_docs.extend(docs)
        
        # Remove duplicates while preserving order
        unique_docs = []
        seen = set()
        for doc in all_docs:
            if doc not in seen:
                unique_docs.append(doc)
                seen.add(doc)
        
        # Keep the first four unique documents; order follows the queries,
        # with results for the original question first (no re-ranking is applied)
        final_docs = unique_docs[:4]
        
        # Create context
        context = "\n\n".join([f"Document {i+1}: {doc}" for i, doc in enumerate(final_docs)])
        
        # Generate answer
        result = self.generate_answer(context=context, question=question)
        
        return dspy.Prediction(
            original_question=question,
            expanded_queries=expanded_queries,
            context=context,
            reasoning=result.reasoning,
            answer=result.answer,
            retrieved_docs=final_docs
        )

# Create and test the multi-query RAG system
multi_query_rag = MultiQueryRAG(retriever, k=2)

question = "How can AI help with data analysis?"
result = multi_query_rag(question=question)

print_result(
    f"Original Question: {result.original_question}\n\n"
    f"Expanded Queries: {result.expanded_queries}\n\n"
    f"Reasoning: {result.reasoning}\n\n"
    f"Answer: {result.answer}",
    "Multi-Query RAG Result"
)

print(f"\nTotal Retrieved Documents: {len(result.retrieved_docs)}")
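
Splitting `expanded_queries` on newlines assumes the model returns one bare query per line. In practice, language models often emit numbered or bulleted lists, so the list markers end up inside the queries sent to the retriever. A minimal sketch of a more defensive parser follows; `parse_queries` and the sample `raw_output` string are hypothetical and written for illustration, not taken from a real model response.

```python
import re
from typing import List

def parse_queries(raw: str, limit: int = 3) -> List[str]:
    """Split LM output into queries, removing list markers and duplicates."""
    queries = []
    seen = set()
    for line in raw.splitlines():
        # Drop leading bullets or numbering such as "1.", "2)", "-", "*", "•"
        cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", line).strip()
        key = cleaned.lower()  # compare case-insensitively
        if cleaned and key not in seen:
            seen.add(key)
            queries.append(cleaned)
    return queries[:limit]

# Hand-written example of list-style output (an assumption about the format)
raw_output = """1. How is AI used in data analysis?
2) Which machine learning methods analyze data?
- how is AI used in data analysis?

* What tools automate data analysis?
5. Extra query beyond the limit"""

parsed = parse_queries(raw_output)
print(parsed)
```

One caveat of this heuristic: a query that legitimately begins with a number followed by a period (for example a decimal) would lose its leading digits, so the pattern may need tightening for some domains.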

## RAG Evaluation

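Let's compare our RAG systems by using the language model as a judge: a DSPy predictor receives the question, the (truncated) context, and the answer, and returns a quality score from 1 to 10 with an explanation.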

In [None]:
print_step("RAG Evaluation", "Comparing different RAG approaches")

class AnswerQuality(dspy.Signature):
    """Evaluate the quality of an answer given a question and context."""
    question = dspy.InputField(desc="The original question")
    context = dspy.InputField(desc="The context used to generate the answer")
    answer = dspy.InputField(desc="The generated answer")
    quality_score = dspy.OutputField(desc="Quality score from 1-10 with explanation")

# Create evaluator
evaluator = dspy.Predict(AnswerQuality)

# Test questions for evaluation
test_questions = [
    "What is machine learning?",
    "How does deep learning differ from traditional programming?"
]

print("Comparing RAG Systems:")
print("=" * 50)

for question in test_questions:
    print(f"\nQuestion: {question}")
    print("-" * 30)
    
    # Test basic RAG
    basic_result = basic_rag(question=question)
    basic_eval = evaluator(
        question=question,
        context=basic_result.context[:500] + "...",  # Truncate for evaluation
        answer=basic_result.answer
    )
    
    # Test advanced RAG
    advanced_result = advanced_rag(question=question)
    advanced_eval = evaluator(
        question=question,
        context=advanced_result.context[:500] + "...",
        answer=advanced_result.answer
    )
    
    print(f"Basic RAG Score: {basic_eval.quality_score}")
    print(f"Advanced RAG Score: {advanced_eval.quality_score}")
    print()
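
The `quality_score` field is free text ("Quality score from 1-10 with explanation"), so comparing systems across many questions requires pulling a number out of it first. The sketch below is one heuristic way to do that. `parse_quality_score` is a hypothetical helper, and the sample judge outputs are hand-written placeholders rather than real results.

```python
import re
from typing import List, Optional

def parse_quality_score(text: str) -> Optional[int]:
    """Return the first standalone integer from 1 to 10 in the text, if any."""
    match = re.search(r"\b(10|[1-9])\b", text)
    return int(match.group(1)) if match else None

def mean_score(outputs: List[str]) -> Optional[float]:
    """Average the parseable scores; return None when none can be parsed."""
    scores = [s for s in map(parse_quality_score, outputs) if s is not None]
    return sum(scores) / len(scores) if scores else None

# Hand-written placeholder judge outputs (not real evaluation results)
sample_outputs = {
    "Basic RAG": ["7/10 - accurate but brief", "Score: 8. Well grounded."],
    "Advanced RAG": ["9 out of 10", "no numeric score given"],
}

for name, outputs in sample_outputs.items():
    print(f"{name}: mean score = {mean_score(outputs)}")
```

This takes the first number it finds, so an explanation that mentions another number before the score (for example "covers 2 of 3 points") would be misread. Scores from an LM judge are also noisy, so small differences between systems on two questions should not be over-interpreted.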

## Interactive RAG Demo

Let's wrap all three RAG systems in a helper function so you can pass in your own questions and compare how each system responds.

In [None]:
print_step("Interactive RAG Demo", "Try asking your own questions!")

def demo_rag_systems(question: str):
    """Demonstrate all RAG systems with a given question."""
    print(f"Question: {question}")
    print("=" * 60)
    
    # Basic RAG
    print("\n🔍 Basic RAG:")
    basic_result = basic_rag(question=question)
    print(f"Answer: {basic_result.answer}")
    
    # Advanced RAG with Citations
    print("\n🎯 Advanced RAG with Citations:")
    advanced_result = advanced_rag(question=question)
    print(f"Answer: {advanced_result.answer}")
    if hasattr(advanced_result, 'citations') and advanced_result.citations:
        print(f"Citations: {advanced_result.citations}")
    
    # Multi-Query RAG
    print("\n🚀 Multi-Query RAG:")
    multi_result = multi_query_rag(question=question)
    print(f"Answer: {multi_result.answer}")
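    # expanded_queries is a list, so join it into a string before truncating to 100 characters
    queries_preview = "; ".join(multi_result.expanded_queries)
    print(f"Expanded Queries Used: {queries_preview[:100]}...")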
    
    print("\n" + "=" * 60)

# Demo questions
demo_questions = [
    "What is natural language processing?",
    "How is data science related to machine learning?"
]

for demo_question in demo_questions:
    demo_rag_systems(demo_question)
    print("\n")

# You can also try your own questions by uncommenting and modifying the line below:
# demo_rag_systems("Your question here")

## Summary

In this notebook, we explored various RAG approaches with DSPy:

1. **Simple Retriever**: Built a TF-IDF based document retriever
2. **Basic RAG**: Combined retrieval with generation
3. **Advanced RAG**: Added citation tracking for transparency
4. **Multi-Query RAG**: Used query expansion for better retrieval coverage
5. **Evaluation**: Used an LLM-based judge to score and compare answers from the RAG systems
6. **Interactive Demo**: Tested different approaches with various questions

Key takeaways:
- RAG systems combine retrieval and generation for more accurate, contextual answers
- Different retrieval strategies change which context the model sees, which in turn can affect answer quality
- Citations and transparency features improve trust in AI-generated answers
- Query expansion can help retrieve more diverse and relevant information
- Systematic evaluation helps compare and improve RAG approaches

Next steps could include:
- Using more sophisticated retrievers (e.g., dense embeddings, vector databases)
- Implementing re-ranking mechanisms
- Adding filtering and fact-checking capabilities
- Optimizing the RAG pipeline with DSPy optimizers
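
As a small illustration of the re-ranking idea listed above, here is a minimal sketch of reciprocal rank fusion (RRF), a common way to merge several ranked result lists such as those produced per query in `MultiQueryRAG`. The notebook itself does not use RRF; it keeps the first four unique documents in retrieval order. The function name, the constant `k=60` (a conventional default), and the toy document labels are assumptions for this example.

```python
from collections import defaultdict
from typing import Dict, List

def reciprocal_rank_fusion(ranked_lists: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each document scores the sum of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings, one list per query (labels stand in for document texts)
rankings = [
    ["A", "B", "C"],
    ["B", "A", "D"],
    ["B", "D", "A"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

Documents that appear near the top of several lists rise to the front, which is a more principled cut-off than taking the first few unique documents. In `MultiQueryRAG`, the per-query lists returned by `self.retriever.retrieve` could be fused this way before truncating to the final context size.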