# RAG Assignment (Graded): Building a RAG System

Welcome to your programming assignment on Retrieval Augmented Generation (RAG)! You will build a comprehensive RAG system from scratch.

## Problem Description

- In this assignment, you will build a complete RAG system that combines document retrieval and language model generation to create more accurate and contextual responses.
- The system will implement document chunking, embedding generation, vector storage, similarity search, and response generation.

## Assignment Tasks

**1. Document Processing Implementation**
- Implement the `DocumentProcessor` class for handling various document formats
- Add text extraction and cleaning methods
- Implement document chunking with overlap
- Handle metadata preservation
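
The chunking-with-overlap step can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference solution: the function name `chunk_text`, the character-based window, and the metadata keys (`chunk_id`, `start`, `end`, `source`) are all hypothetical choices made for the example.

```python
from typing import Dict, List


def chunk_text(text: str, chunk_size: int, chunk_overlap: int,
               source: str = "unknown") -> List[Dict]:
    """Split text into character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[Dict] = []
    for start in range(0, len(text), step):
        end = min(start + chunk_size, len(text))
        chunks.append({
            "text": text[start:end],
            # Hypothetical metadata layout: position info plus the source name.
            "metadata": {"chunk_id": len(chunks), "start": start,
                         "end": end, "source": source},
        })
        if end == len(text):  # this window already reached the end of the text
            break
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2, source="demo")
print([c["text"] for c in chunks])  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last `chunk_overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough. Splitting on tokens or sentences instead of characters is an equally valid design.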

**2. Embedding Generation**
- Implement `EmbeddingGenerator` class with OpenAI integration
- Add batch processing for embeddings
- Implement caching mechanism
- Add error handling and retry logic
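
One possible shape for the caching and retry logic is sketched below. It assumes a hypothetical `embed_fn` callable that stands in for the real embedding request, so it runs without an API key. The class name `CachedEmbedder`, the SHA-256 cache key, and the backoff parameters are illustrative assumptions, not part of the assignment's API.

```python
import hashlib
import time
from typing import Callable, Dict, List


class CachedEmbedder:
    """Wrap an embedding function with an in-memory cache and retries."""

    def __init__(self, embed_fn: Callable[[List[str]], List[List[float]]],
                 max_retries: int = 3, base_delay: float = 0.01):
        if max_retries < 1:
            raise ValueError("max_retries must be at least 1")
        self.embed_fn = embed_fn  # hypothetical backend call
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.cache: Dict[str, List[float]] = {}

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def _call_with_retry(self, batch: List[str]) -> List[List[float]]:
        for attempt in range(self.max_retries):
            try:
                return self.embed_fn(batch)
            except Exception:  # a real client would catch specific error types
                if attempt == self.max_retries - 1:
                    raise
                time.sleep(self.base_delay * 2 ** attempt)  # exponential backoff
        raise RuntimeError("retry loop exited unexpectedly")

    def embed(self, texts: List[str]) -> List[List[float]]:
        # Only texts that are not cached yet are sent, deduplicated, in one batch.
        missing = [t for t in dict.fromkeys(texts) if self._key(t) not in self.cache]
        if missing:
            for text, vector in zip(missing, self._call_with_retry(missing)):
                self.cache[self._key(text)] = vector
        return [self.cache[self._key(t)] for t in texts]


calls = {"count": 0}


def flaky_embed(batch: List[str]) -> List[List[float]]:
    """Stand-in backend: fails on the first call, then returns toy vectors."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated transient failure")
    return [[float(len(t)), 1.0] for t in batch]


embedder = CachedEmbedder(flaky_embed)
first = embedder.embed(["hello", "hi", "hello"])  # one retry, then one batch
second = embedder.embed(["hi"])                   # served from the cache
print(first, second, calls["count"])
```

In the assignment itself, `embed_fn` would be replaced by the OpenAI embeddings request, and the `except` clause would ideally be narrowed to the client's rate-limit and connection errors.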

**3. Vector Store Integration**
- Implement `VectorStore` class with FAISS integration
- Add document indexing functionality
- Implement similarity search methods
- Add metadata filtering capabilities
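
To make the search and filtering behaviour concrete, here is a brute-force stand-in written in plain Python. It is not FAISS: it only shows the logic that an exact index performs (for example, `faiss.IndexFlatIP` over normalized vectors computes the same cosine ranking). The class name `ToyVectorStore` and the `where` predicate are assumptions made for this sketch.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Exact (brute-force) nearest-neighbour search over in-memory vectors."""

    def __init__(self) -> None:
        self.vectors: List[Sequence[float]] = []
        self.documents: List[Dict] = []

    def add(self, documents: List[Dict], embeddings: List[Sequence[float]]) -> None:
        if len(documents) != len(embeddings):
            raise ValueError("need exactly one embedding per document")
        self.documents.extend(documents)
        self.vectors.extend(embeddings)

    def search(self, query: Sequence[float], k: int = 5,
               where: Optional[Callable[[Dict], bool]] = None) -> List[Dict]:
        # Apply the metadata filter first, then rank the survivors by similarity.
        scored = [
            (cosine(query, vec), doc)
            for vec, doc in zip(self.vectors, self.documents)
            if where is None or where(doc["metadata"])
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [{**doc, "score": score} for score, doc in scored[:k]]


store = ToyVectorStore()
store.add(
    [
        {"text": "AI in healthcare", "metadata": {"topic": "ai"}},
        {"text": "QASM basics", "metadata": {"topic": "quantum"}},
        {"text": "AI diagnostics", "metadata": {"topic": "ai"}},
    ],
    [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]],  # toy 2-D embeddings
)
top = store.search([1.0, 0.1], k=2)
quantum_only = store.search([1.0, 0.1], k=2,
                            where=lambda meta: meta["topic"] == "quantum")
print([d["text"] for d in top], [d["text"] for d in quantum_only])
```

Filtering before ranking guarantees up to `k` results that satisfy the predicate. With a FAISS index, a common alternative is to over-fetch and filter afterwards, keeping the metadata in a list indexed by the positions FAISS returns.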

**4. Context Builder**
- Implement `ContextBuilder` class
- Add relevance scoring
- Implement context window management
- Add diversity sampling
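
A simple way to combine relevance ordering, a token budget, and a redundancy check is sketched below. The whitespace word count is a crude substitute for a real tokenizer, and the Jaccard-overlap filter is a deliberately simple stand-in for diversity sampling. The function names, the `score` key, and the `max_overlap` threshold are assumptions for illustration.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two strings, in [0, 1]."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def build_context(docs: List[Dict], max_tokens: int,
                  max_overlap: float = 0.8) -> str:
    """Greedily pack the highest-scoring, non-redundant chunks into a budget."""
    selected: List[str] = []
    used = 0
    for doc in sorted(docs, key=lambda d: d["score"], reverse=True):
        text = doc["text"]
        cost = len(text.split())  # crude token estimate: whitespace-separated words
        if used + cost > max_tokens:
            continue  # does not fit; a shorter chunk further down still might
        if any(jaccard(text, kept) > max_overlap for kept in selected):
            continue  # too similar to a chunk that is already selected
        selected.append(text)
        used += cost
    return "\n\n".join(selected)


docs = [
    {"text": "AI reduced readmission rates by 28%", "score": 0.90},
    {"text": "AI reduced readmission rates by 28%", "score": 0.85},  # duplicate
    {"text": "Quantum circuits need error correction", "score": 0.40},
    {"text": "Data privacy remains a challenge for hospitals adopting AI",
     "score": 0.70},
]
context = build_context(docs, max_tokens=12)
print(context)
```

With a budget of 12 words, the duplicate is dropped, the nine-word chunk does not fit after the first six-word chunk, and the lower-scoring five-word chunk fills the remaining space.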

**5. Response Generator**
- Implement `ResponseGenerator` class
- Add prompt engineering
- Implement source attribution
- Add fact verification
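
Prompt construction and source attribution might look like the sketch below. It covers those two items only, not fact verification. The template wording, the `[n]` citation convention, and the function names are assumptions, and the model answer is a hard-coded placeholder rather than real output.

```python
import re
from typing import Dict, List

PROMPT_TEMPLATE = (
    "Answer the question using only the numbered sources below. "
    "Cite sources as [n]. If the sources do not contain the answer, say so.\n\n"
    "Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question: str, docs: List[Dict]) -> str:
    """Number each retrieved chunk so the model can cite it as [n]."""
    sources = "\n".join(f"[{i}] {d['text']}" for i, d in enumerate(docs, start=1))
    return PROMPT_TEMPLATE.format(sources=sources, question=question)


def cited_sources(answer: str, docs: List[Dict]) -> List[Dict]:
    """Map [n] markers in a model answer back to the documents they cite."""
    numbers = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [docs[n - 1] for n in numbers if 1 <= n <= len(docs)]


docs = [
    {"text": "AI cut readmissions by 28%.", "metadata": {"source": "paper1"}},
    {"text": "QASM describes quantum circuits.", "metadata": {"source": "paper2"}},
]
prompt = build_prompt("How did AI affect readmissions?", docs)
answer = "Readmission rates fell by 28% [1]."  # placeholder for a model response
sources = cited_sources(answer, docs)
print(prompt)
print([s["metadata"]["source"] for s in sources])
```

Citation markers that point outside the source list are ignored rather than raising, because a model can emit numbers that do not correspond to any retrieved chunk.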

**6. RAG Pipeline Integration**
- Implement the main `RAGSystem` class
- Add pipeline orchestration
- Implement caching and optimization
- Add evaluation metrics
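
For the evaluation step, one option is to score answers with token-level F1 against the ground truth and to time each query. The sketch below assumes a hypothetical `answer_fn` callable in place of the full pipeline. The choice of metric and the result keys (`mean_f1`, `mean_latency_s`) are illustrative and may differ from what the grader expects.

```python
import time
from collections import Counter
from typing import Callable, Dict, List


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (case-insensitive)."""
    pred, ref = prediction.lower().split(), reference.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(answer_fn: Callable[[str], str], queries: List[str],
             ground_truth: List[str]) -> Dict[str, float]:
    """Average answer quality and latency over a set of test queries."""
    if len(queries) != len(ground_truth):
        raise ValueError("need exactly one ground-truth answer per query")
    scores: List[float] = []
    latencies: List[float] = []
    for query, truth in zip(queries, ground_truth):
        start = time.perf_counter()
        answer = answer_fn(query)  # hypothetical stand-in for the RAG pipeline
        latencies.append(time.perf_counter() - start)
        scores.append(token_f1(answer, truth))
    n = len(queries)
    return {
        "mean_f1": sum(scores) / n if n else 0.0,
        "mean_latency_s": sum(latencies) / n if n else 0.0,
    }


report = evaluate(
    lambda q: "readmissions fell by 28%",  # fixed answer, for demonstration only
    ["How did AI affect readmissions?"],
    ["readmissions fell 28%"],
)
print(report)
```

Token F1 rewards lexical overlap only, so it can undervalue a correct paraphrase. Retrieval-side measures such as hit rate at `k` are a useful complement when relevance labels are available.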

## Instructions

- Write code only where you see one of the following prompts:
    ```
    # YOUR CODE GOES HERE
    # YOUR CODE ENDS HERE
    # TODO
    ```
- Do not modify any other section of the code unless stated otherwise in the comments.

# Code Section

In [None]:
from typing import Dict, List, Optional, Union
import faiss
import numpy as np
import openai
import json
import time
import os
from tests.test_methods import TestRAGSystem

In [None]:
os.environ["OPENAI_API_KEY"] = "FILL_IN_YOUR_API_KEY"

In [None]:
class DocumentProcessor:
    def __init__(self):
        """Initialize the Document Processor"""
        # TODO: Initialize document processing parameters
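        self.chunk_size = None  # TODO: set the chunk size
        self.chunk_overlap = None  # TODO: set the overlap between consecutive chunks
        self.supported_formats = None  # TODO: list the supported document formats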

    def process_document(self, document: Union[str, bytes], format: str) -> List[Dict]:
        """Process document and return chunks with metadata"""
        # TODO: Implement document processing
        # TODO: Extract text based on format
        # TODO: Clean and normalize text
        # TODO: Split into chunks with overlap
        # TODO: Preserve metadata
        pass

In [None]:
class EmbeddingGenerator:
    def __init__(self, api_key: str):
        """Initialize the Embedding Generator"""
        # TODO: Initialize OpenAI client
        # TODO: Set up caching
        pass

    def generate_embeddings(self, texts: List[str]) -> np.ndarray:
        """Generate embeddings for given texts"""
        # TODO: Implement batch embedding generation
        # TODO: Add caching mechanism
        # TODO: Implement retry logic
        # TODO: Handle rate limits
        pass

In [None]:
class VectorStore:
    def __init__(self, dimension: int):
        """Initialize the Vector Store"""
        # TODO: Initialize FAISS index
        # TODO: Set up metadata storage
        pass

    def add_documents(self, documents: List[Dict], embeddings: np.ndarray):
        """Add documents and their embeddings to the store"""
        # TODO: Implement document indexing
        # TODO: Store metadata
        pass

    def similarity_search(self, query_embedding: np.ndarray, k: int = 5) -> List[Dict]:
        """Perform similarity search"""
        # TODO: Implement k-NN search
        # TODO: Return documents with metadata
        pass

In [None]:
class ContextBuilder:
    def __init__(self, max_tokens: int):
        """Initialize the Context Builder"""
        # TODO: Set up context parameters
        pass

    def build_context(self, relevant_docs: List[Dict], query: str) -> str:
        """Build context from relevant documents"""
        # TODO: Implement context selection
        # TODO: Add relevance scoring
        # TODO: Implement context truncation
        pass

In [None]:
class ResponseGenerator:
    def __init__(self, api_key: str):
        """Initialize the Response Generator"""
        # TODO: Initialize OpenAI client
        # TODO: Set up prompt templates
        pass

    def generate_response(self, query: str, context: str) -> Dict:
        """Generate response using context"""
        # TODO: Implement response generation
        # TODO: Add source attribution
        # TODO: Implement fact verification
        pass

In [None]:
class RAGSystem:
    def __init__(self, openai_api_key: str):
        """Initialize the RAG System"""
        # TODO: Initialize all components
        self.doc_processor = DocumentProcessor()
        self.embedding_generator = EmbeddingGenerator(openai_api_key)
        self.vector_store = VectorStore(1536)  # 1536 = output dimension of text-embedding-ada-002 and text-embedding-3-small
        self.context_builder = ContextBuilder(max_tokens=2000)
        self.response_generator = ResponseGenerator(openai_api_key)

    def add_documents(self, documents: List[Union[str, bytes]], formats: List[str]):
        """Process and add documents to the system"""
        # TODO: Implement document processing pipeline
        # TODO: Generate embeddings
        # TODO: Add to vector store
        pass

    def query(self, query: str) -> Dict:
        """Process query and generate response"""
        # TODO: Implement query pipeline
        # TODO: Generate query embedding
        # TODO: Retrieve relevant documents
        # TODO: Build context
        # TODO: Generate response
        pass

    def evaluate(self, test_queries: List[str], ground_truth: List[str]) -> Dict:
        """Evaluate system performance"""
        # TODO: Implement evaluation metrics
        # TODO: Calculate accuracy
        # TODO: Measure response time
        # TODO: Assess relevance
        pass

In [None]:
## DO NOT MODIFY THE BELOW CODE ##
# Driver code for testing
if __name__ == "__main__":
    # Initialize system
    api_key = os.getenv("OPENAI_API_KEY")
    rag_system = RAGSystem(api_key)

    paper1 = """The Impact of Artificial Intelligence on Modern Healthcare Systems

Recent advances in artificial intelligence have revolutionized healthcare delivery and patient outcomes. Machine learning algorithms have demonstrated remarkable accuracy in diagnostic imaging, with success rates exceeding 95% in detecting early-stage cancers. Natural language processing systems are now capable of analyzing millions of medical records to identify patterns and treatment correlations that human researchers might miss.

A particularly promising application is in predictive analytics for patient care. Studies conducted across 50 major hospitals showed that AI-powered systems reduced hospital readmission rates by 28% through early intervention recommendations. These systems analyze vital signs, medical history, and lifestyle factors to predict potential complications before they become severe.

However, challenges remain in implementation. Data privacy concerns, integration with existing healthcare infrastructure, and the need for continuous model updating present significant hurdles. Healthcare providers must also ensure proper training for medical staff to effectively utilize these new tools while maintaining the human element in patient care.
"""

    paper2 = """Quantum Computing: A Developer's Guide to QASM Implementation

The Quantum Assembly Language (QASM) provides a fundamental interface for quantum circuit description and manipulation. This guide covers essential implementation details for quantum programs.

Basic QASM Operations:
1. Qubit initialization and reset procedures
2. Single-qubit gates (X, Y, Z, H)
3. Two-qubit controlled operations (CNOT, CZ)
4. Measurement operations

Error correction is crucial in quantum computing due to decoherence effects. The surface code implementation requires careful consideration of physical qubit layouts and error thresholds. When designing quantum circuits, developers must account for connectivity constraints and gate fidelity metrics.

Performance optimization techniques include:
- Circuit depth reduction through gate cancellation
- Parallel gate execution where possible
- Strategic qubit mapping to minimize SWAP operations
- Measurement-based feed-forward operations

Testing quantum programs requires both classical simulation for small circuits and hardware validation for scaled implementations. Use standard benchmarking protocols to verify gate fidelities and system coherence times.
"""

    # Test documents
    documents = [
        paper1,
        paper2
    ]
    formats = ["text", "text"]

    # Add documents
    rag_system.add_documents(documents, formats)

    # Test query
    query = "What information can you find about topic AI?"
    result = rag_system.query(query)

    # Print results
    print("\nQuery Result:")
    print(json.dumps(result, indent=2))


# Run tests
tester = TestRAGSystem(rag_system)
results = tester.run_all_tests()