[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/your-username/your-repo/blob/main/path/to/notebook.ipynb)


# Document QA System with Agent Routing

## Overview

This project implements an advanced question-answering system that can:

- Load documents from various sources (PDF, DOCX, TXT, CSV)

- Process documents into smaller chunks

- Generate and store embeddings in ChromaDB

- Perform semantic search

- Generate answers using a language model (LLM)

- Use an agent-based architecture for modularity

The system is built using Langchain, OpenAI API, ChromaDB, and other supporting libraries.

## 📓 How These Components Work Together
1️⃣ **Document Loading**
- Reads different file formats (PDF, DOCX, TXT, CSV) and extracts text.

2️⃣ **Text Chunking**
- Splits large documents into small, overlapping segments to improve retrieval accuracy.

3️⃣ **Embedding Generation**
- Converts text chunks into mathematical vector representations using Sentence Transformers.

4️⃣ **Storing Embeddings**
- Saves vector embeddings into ChromaDB, allowing fast similarity search.

5️⃣ **Query Processing**
- Converts user queries into vectors and retrieves the most relevant text chunks.

6️⃣ **Answer Generation**
- Feeds retrieved chunks into GPT (OpenAI LLM) to generate a human-like response.


# 📌 Cell 1 - Required Packages

Installs necessary libraries for document processing, embeddings, vector storage, and AI-powered responses.

- **Langchain:** Manages language model interactions.

- **ChromaDB:** Stores embeddings.

- **Sentence-Transformers:** Converts text to vector embeddings.

- **OpenAI:** Provides the LLM (like GPT-4).

- **Python-docx:** Handles .docx files.

- **PyPDF2:** Reads PDF files.

- **pandas:** Processes CSV files.

In [1]:
# Cell 1 - Install Required Packages
! pip install langchain chromadb sentence-transformers openai python-docx PyPDF2 pandas
! pip install -U langchain-community



# 📌 Cell 2 : Import Dependencies

## 🔹 General Purpose Imports
- os: Provides functions for interacting with the operating system (e.g., - setting API keys, file handling).
- datetime: Handles date and time formatting.
- typing: Defines type hints (List, Dict, Union) for better code readability.
- docx: Allows reading and writing .docx (Microsoft Word) files.
- PyPDF2: Extracts text from .pdf files.
- pandas: Handles .csv file processing.
- io: Provides file handling utilities.

## 🔹 Langchain and AI-Specific Imports
- RecursiveCharacterTextSplitter: Splits large text files into smaller chunks while maintaining context.
- SentenceTransformerEmbeddings: Converts text into vector embeddings for similarity search.
- Chroma: A vector database that efficiently stores and retrieves embeddings.
- OpenAI: Integrates GPT-based AI models for generating responses.
- RetrievalQA: Implements question-answering based on retrieved document chunks.
- Chromadb: Directly interacts with ChromaDB, a high-speed vector database.

In [2]:
# Cell 2 - Imports Dependencies
import os
from typing import List, Dict, Union
from datetime import datetime
import docx
import PyPDF2
import pandas as pd
import io
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import SentenceTransformerEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import OpenAI
from langchain.chains import RetrievalQA
import chromadb

# Setting Up OpenAI API Key

***os.environ["OPENAI_API_KEY"] = "your-api-key"***

This sets the environment variable required to access OpenAI’s GPT models.

In [1]:
# Cell 3 - API Key
import os
os.environ["OPENAI_API_KEY"] = ""

# 📌 Cell 4 - Base Agent Implementation

The BaseAgent class is a foundational class for implementing different agents in your system. Let's break it down:

## Explanation
- **Purpose:** The BaseAgent class serves as a blueprint for all agents. It ensures that all agent classes follow a consistent structure.

- **Attributes:**
  - name: Stores the agent's name for identification.

- **Methods:**
  - __init__: Initializes the agent with a name.
  - execute: A placeholder method that must be implemented by subclasses.

In [22]:
# Cell 4 - Base Agent Implementation
class BaseAgent:
    def __init__(self, name: str):
        self.name = name

    def execute(self, *args, **kwargs):
        raise NotImplementedError

# 📌 Cell 5 -  Document Loading Agent

## **Overview**
The **DocumentLoadingAgent** is responsible for loading and extracting text from various file formats.  
It supports both:
- **File paths** (for reading from disk)
- **Raw file content** (for in-memory processing)

This agent ensures smooth document handling for multiple formats, including:
- **TXT** (Plain text files)
- **PDF** (Extracting text from PDF pages)
- **DOCX** (Microsoft Word documents)
- **CSV** (Comma-separated values)

In [23]:
# Cell 5 - Document Loading Implementation

class DocumentLoadingAgent(BaseAgent):
    """
    An agent responsible for loading document content from various file types.
    It supports reading from file paths or in-memory file content.
    """
    def __init__(self):
        super().__init__("DocumentLoader")  # Initialize with a name for easy identification

    def execute(self, file_path: str = None, file_content: bytes = None, file_type: str = None) -> str:
        """
        Loads a document's content based on the given file type.

        Parameters:
            - file_path (str): Path to the document file (if reading from disk).
            - file_content (bytes): Raw file content (if reading from memory).
            - file_type (str): The type of file (e.g., 'txt', 'pdf', 'docx', 'csv').

        Returns:
            - str: The extracted text content of the document.
        """
        print(f"🤖 {self.name}: Loading document...")  # Logging for tracking execution

        # If a file path is provided, determine the file type if not explicitly given
        if file_path:
            if not file_type:
                file_type = file_path.split('.')[-1].lower()  # Extracts file extension to determine file type

            try:
                # Handling different file types
                if file_type == 'txt':
                    # Reading a plain text file
                    with open(file_path, 'r', encoding='utf-8') as file:
                        return file.read()

                elif file_type == 'pdf':
                    # Extracting text from a PDF file
                    text = ""
                    with open(file_path, 'rb') as file:
                        pdf_reader = PyPDF2.PdfReader(file)
                        for page in pdf_reader.pages:
                            text += page.extract_text() + "\n"  # Extract text from each page
                    return text

                elif file_type == 'docx':
                    # Extracting text from a Word document
                    doc = docx.Document(file_path)
                    return "\n".join([paragraph.text for paragraph in doc.paragraphs])  # Join paragraphs with line breaks

                elif file_type == 'csv':
                    # Reading a CSV file as a string
                    df = pd.read_csv(file_path)
                    return df.to_string()  # Converts the DataFrame to a readable string

            except Exception as e:
                print(f"Error loading document: {str(e)}")  # Print error if file reading fails
                return ""

        # If the file is provided as raw content instead of a path (useful for in-memory processing)
        elif file_content:
            try:
                if file_type == 'txt':
                    return file_content.decode('utf-8')  # Decode raw bytes into a text string

                elif file_type == 'pdf':
                    # Extracting text from an in-memory PDF file
                    pdf_reader = PyPDF2.PdfReader(io.BytesIO(file_content))
                    text = ""
                    for page in pdf_reader.pages:
                        text += page.extract_text() + "\n"
                    return text

                elif file_type == 'docx':
                    # Extracting text from an in-memory Word file
                    doc = docx.Document(io.BytesIO(file_content))
                    return "\n".join([paragraph.text for paragraph in doc.paragraphs])

                elif file_type == 'csv':
                    # Reading an in-memory CSV file
                    df = pd.read_csv(io.BytesIO(file_content))
                    return df.to_string()

            except Exception as e:
                print(f"Error loading document content: {str(e)}")  # Print error if reading fails
                return ""

        return ""  # Return an empty string if no valid input was provided


# 📌 Cell 6 -  Document Processing Agent

## **Overview**
The **DocumentProcessingAgent** is responsible for splitting large documents into smaller, manageable chunks.  
This is crucial for:
- **Efficient text retrieval** in a vector database.
- **Improved semantic search** by preserving context.
- **Better performance** when querying a language model.


### **🔹 Uses RecursiveCharacterTextSplitter, which:**
  - Splits text into chunks of 1000 characters.
  - Maintains 200-character overlap between chunks to preserve context.

The **DocumentProcessingAgent:**
  - ✅ Breaks large documents into smaller, structured pieces
  - ✅ Ensures smooth vector search and retrieval
  - ✅ Preserves contextual information with chunk overlap
  - ✅ Enhances the performance of AI-based Q&A systems



In [24]:
# Cell 6 - Document Processing Agent

class DocumentProcessingAgent(BaseAgent):
    def __init__(self):
        """
        Initializes the DocumentProcessingAgent.
        - Sets the agent name as "DocumentProcessor".
        - Defines the text splitter using RecursiveCharacterTextSplitter.
        """
        super().__init__("DocumentProcessor")  # Call the parent class (BaseAgent) constructor with the name "DocumentProcessor"

        # Create a text splitter that:
        # - Splits the document into chunks of 1000 characters.
        # - Maintains an overlap of 200 characters to preserve context between chunks.
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,   # Each chunk will contain 1000 characters.
            chunk_overlap=200  # Ensures overlapping content for better context retention.
        )

    def execute(self, document: str) -> List[str]:
        """
        Processes the given document by splitting it into smaller chunks.

        Args:
            document (str): The full text content of the document.

        Returns:
            List[str]: A list of text chunks.
        """

        # Log the start of the document processing.
        print(f"🤖 {self.name}: Processing document into chunks...")

        # Perform the text splitting operation.
        chunks = self.text_splitter.split_text(document)

        # Log the number of chunks created.
        print(f"Created {len(chunks)} chunks")

        # Return the list of text chunks.
        return chunks


## 📌 Cell 7 - Embedding Agent Implementation
### 📍 Objective:
- Convert text chunks into numerical embeddings using a pre-trained transformer model.
- Store embeddings in ChromaDB for efficient retrieval and similarity search.
- Ensure embeddings are persistent and can be retrieved later.
- Provide collection statistics for monitoring stored documents.


---


## 🔑 Key Concepts Explained

### 1️⃣ What is an Embedding?  
An **embedding** is a numerical representation of text that captures **semantic meaning**.  
Instead of raw words, embeddings represent sentences as **high-dimensional vectors in space**.  
These vectors allow **similarity searches** (e.g., finding documents with similar meaning).  

**📌 Example:**  
- `"I love programming"` → `[0.23, -0.56, 0.87, …]` (Vector representation).  
- `"Coding is fun"` → `[0.22, -0.55, 0.86, …]` (Very similar vector).  

---

### 2️⃣ What is ChromaDB?  
**ChromaDB** is an **open-source vector database** designed to **store and retrieve embeddings efficiently**.  

- It enables **fast similarity searches** based on numerical vector comparisons.  
- Ideal for **semantic search, recommendation systems, and AI-powered chatbots**.  

**📌 Why use ChromaDB?**  
✅ Stores embeddings efficiently.  
✅ Enables **fast document search** based on meaning, not keywords.  
✅ Supports **persistent storage** (saves embeddings between runs).  

---

### 3️⃣ What is a Sentence Transformer?  
A **Sentence Transformer** is a model that generates **meaningful text embeddings**.  

- We use **all-MiniLM-L6-v2**, a **lightweight yet powerful model**.  
- Converts **sentences into embeddings** that capture **context and meaning**.  

**📌 Example:**  
- `"Artificial Intelligence is fascinating"` → **Embedding:** `[0.45, -0.12, 0.99, …]`  
- `"AI is interesting"` → **Very similar embedding** (because they mean the same).  

---

### 4️⃣ What is a Vector Store?  
A **vector store** is a **database for embeddings**. It allows:  

- **Storing embeddings persistently.**  
- **Fast retrieval** of similar documents.  
- **Optimized AI-powered search.**  

**📌 ChromaDB is our Vector Store.**  


In [25]:
# Cell 7 - Embedding Agent Implementation
# This class is responsible for converting text chunks into numerical embeddings
# and storing them in ChromaDB for efficient retrieval.

class EnhancedEmbeddingAgent(BaseAgent):
    def __init__(self):
        """
        Initializes the EnhancedEmbeddingAgent.
        - Sets the agent name as "EnhancedEmbeddingGenerator".
        - Loads the SentenceTransformer model for text embeddings.
        - Initializes vector store and collection name.
        """
        super().__init__("EnhancedEmbeddingGenerator")  # Initialize the base agent with a name.

        # Load a pre-trained embedding model (MiniLM) for efficient text representation.
        self.embedding_model = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

        # Placeholder for the ChromaDB vector store.
        self.vector_store = None

        # Name of the ChromaDB collection where embeddings will be stored.
        self.collection_name = "document_collection"

    def create_collection(self):
        """
        Creates a new ChromaDB collection or retrieves an existing one.

        Returns:
            collection (Chroma Collection): The ChromaDB collection object.
        """
        client = chromadb.Client()  # Initialize a ChromaDB client.

        try:
            # Attempt to create a new collection with metadata.
            collection = client.create_collection(
                name=self.collection_name,
                metadata={"description": "Document embeddings collection"}
            )
            print(f"Created new collection: {self.collection_name}")
            return collection

        except ValueError:
            # If the collection already exists, retrieve it instead.
            collection = client.get_collection(self.collection_name)
            print(f"Using existing collection: {self.collection_name}")
            return collection

    def execute(self, chunks: List[str]) -> Chroma:
        """
        Generates embeddings for text chunks and stores them in ChromaDB.

        Args:
            chunks (List[str]): List of document text chunks.

        Returns:
            Chroma: The vector store containing document embeddings.
        """

        print(f"🤖 {self.name}: Generating embeddings and storing in ChromaDB...")

        # Create embeddings and store them in ChromaDB.
        self.vector_store = Chroma.from_texts(
            texts=chunks,  # The document chunks to be embedded.
            embedding=self.embedding_model,  # The embedding model used.
            persist_directory="./chroma_db",  # Directory to store persistent embeddings.
            collection_name=self.collection_name,  # Name of the ChromaDB collection.
            collection_metadata={
                "document_count": len(chunks),  # Number of documents being stored.
                "created_at": str(datetime.now()),  # Timestamp for tracking.
                "embedding_model": "all-MiniLM-L6-v2"  # Model used for embedding.
            }
        )

        # Save the embeddings to persistent storage.
        self.vector_store.persist()

        print(f"Stored {len(chunks)} document chunks in ChromaDB")

        # Return the vector store for further processing.
        return self.vector_store

    def get_collection_stats(self):
        """
        Retrieves statistics about the ChromaDB collection.

        Returns:
            dict: A dictionary containing collection statistics.
        """

        # If no vector store exists, return None.
        if self.vector_store is None:
            return None

        # Return key statistics about the ChromaDB collection.
        return {
            "total_documents": len(self.vector_store.get()["ids"]),  # Number of stored documents.
            "embedding_function": str(self.vector_store._embedding_function),  # Embedding model used.
            "persist_directory": self.vector_store._persist_directory  # Directory where embeddings are saved.
        }


# 📌 Cell 8 - Query Processing Agent

## 🎯 Objective:
The **Query Processing Agent** is responsible for transforming user queries into numerical embeddings using a **Sentence Transformer model**. These embeddings allow efficient similarity searches and semantic understanding.

---

## 🔑 Key Components:

### 1️⃣ SentenceTransformerEmbeddings
- Uses the **all-MiniLM-L6-v2** model to generate embeddings.
- Converts text queries into high-dimensional vectors.
- Helps in **semantic search** by representing similar meanings with close vector values.

### 2️⃣ Query Embedding Generation
- The agent takes a **user query** as input.
- Generates an **embedding vector** representing the query’s meaning.
- Returns the query along with its vector representation.

### 3️⃣ `execute()` Method
- **Input:** A text query.
- **Processing:** Converts the query into an embedding.
- **Output:** A dictionary containing:
  - `"query"` → Original input query.
  - `"embedding"` → Generated vector representation.

📌 **Why is this important?**\
✅ Enables **fast** and **meaningful** document retrieval.  
✅ Helps in **semantic similarity** searches.  
✅ Converts user queries into structured numerical representations.



In [26]:
# Cell 8 - Query Processing Implementation

class QueryProcessingAgent(BaseAgent):
    def __init__(self):
        """
        Initializes the QueryProcessingAgent.
        This agent is responsible for processing user queries and generating embeddings.
        """
        super().__init__("QueryProcessor")

        # Load a pre-trained Sentence Transformer model for generating embeddings.
        # We use "all-MiniLM-L6-v2" for efficient and meaningful text representation.
        self.embedding_model = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

    def execute(self, query: str) -> Dict:
        """
        Processes a given query and generates its embedding.

        Args:
        query (str): The input text query.

        Returns:
        Dict: A dictionary containing the original query and its corresponding embedding.
        """
        print(f"🤖 {self.name}: Processing query...")

        # Convert the query into an embedding (vector representation).
        query_embedding = self.embedding_model.embed_query(query)

        # Return the query along with its generated embedding.
        return {"query": query, "embedding": query_embedding}


# 📌 Cell 9 - Enhanced Search Agent

## 🎯 Objective:
The **Enhanced Search Agent** is responsible for performing **semantic search** on document embeddings stored in **ChromaDB**. It retrieves the most relevant results based on similarity scores.

---

## 🔑 Key Components:

### 1️⃣ **Vector Store (ChromaDB)**
- Stores document embeddings for fast retrieval.
- Enables **similarity-based search** instead of keyword matching.

### 2️⃣ **Semantic Search (`execute()`)**
- **Input:** Query embedding (`query_data`).
- **Processing:**  
  - Uses `similarity_search_with_score()` to find **k** most relevant results.
  - Filters results based on a **score threshold** (default: `0.5`).
- **Output:** A list of relevant results containing:
  - `"text"` → Extracted content.
  - `"score"` → Similarity score.
  - `"metadata"` → Additional information (if available).

### 3️⃣ **Maximum Marginal Relevance (MMR) Search (`execute_mmr_search()`)**
- **Input:** Query embedding (`query_data`).
- **Processing:**
  - Uses `max_marginal_relevance_search()` to **diversify results**.
  - Fetches **fetch_k** results and selects **k** most relevant but diverse ones.
  - `lambda_mult` controls the balance between relevance and diversity.
- **Output:** A list of diverse and relevant results.

### 🔄 Reflection Flow: Ensuring Relevant Chunks

**🏗️ Process Flow:**

```text
📢 START Reflection Process
   │
   │-- 🔎 Perform Initial Semantic Search
   │    │
   │    ├── ✅ Got relevant chunks? → Return results 🎉
   │    │
   │    ├── ❌ No good chunks? → Retry Process 🔄
   │         │
   │         ├── 🔄 Attempt 1: Increase `k`
   │         │      ├── ✅ Got relevant chunks? → Return results 🎉
   │         │      ├── ❌ No good chunks? → Retry Again 🔄
   │         │
   │         ├── 🔄 Attempt 2: Lower `score_threshold`
   │         │      ├── ✅ Got relevant chunks? → Return results 🎉
   │         │      ├── ❌ No good chunks? → Retry Again 🔄
   │         │
   │         ├── 🔄 Attempt 3: Switch to MMR search
   │         │      ├── ✅ Got relevant chunks? → Return results 🎉
   │         │      ├── ❌ No relevant chunks even in MMR → FAIL ❌
   │
   └── 📢 Reflection Completed
  ```
## Example Logs for Each Reflection Step
✅ Scenario 1: Successful Search (No Reflection Needed)
```text
📢 ====================== SEARCH EXECUTION STARTED ======================
🔎 Query Received: "What is LangChain?"
🔢 Initial Parameters → k: 3, Score Threshold: 0.5

🔄 **Attempt 1** (k=3, Threshold=0.5)...

📜 **Raw Retrieved Results Before Filtering:**
  🔹 Doc 1: Score=0.76, Content Snippet="LangChain is a framework for AI-powered applications..."
  🔹 Doc 2: Score=0.63, Content Snippet="LangChain integrates with ChromaDB for retrieval..."
  🔹 Doc 3: Score=0.55, Content Snippet="LangChain supports RAG-based processing..."

✅ **Post-Filtering:** 3 results above threshold 0.5
🎯 **Reflection: Retrieved relevant chunks, returning results!**
📢 ====================== SEARCH EXECUTION COMPLETED ======================
```

❌ Scenario 2: First Retry (Increase k)
```text
⚠️ **Reflection: No good chunks found, increasing `k` to fetch more results.**

🔄 **Attempt 2** (k=5, Threshold=0.5)...
📜 **Raw Retrieved Results Before Filtering:**
  🔹 Doc 1: Score=0.75, Content Snippet="LangChain can optimize search efficiency..."
  🔹 Doc 2: Score=0.62, Content Snippet="LangChain leverages vector search..."

✅ **Post-Filtering:** 2 results above threshold 0.5
🎯 **Reflection: Retrieved relevant chunks, returning results!**
📢 ====================== SEARCH EXECUTION COMPLETED ======================
```

❌ Scenario 3: Second Retry (Lower Score Threshold)
```text
⚠️ **Reflection: Still no good chunks, lowering `score_threshold` to be less strict.**

🔄 **Attempt 3** (k=5, Threshold=0.4)...
📜 **Raw Retrieved Results Before Filtering:**
  🔹 Doc 1: Score=0.48, Content Snippet="LangChain helps build AI chatbots..."
  🔹 Doc 2: Score=0.42, Content Snippet="LangChain uses OpenAI's LLM APIs..."

✅ **Post-Filtering:** 2 results above threshold 0.4
🎯 **Reflection: Retrieved relevant chunks, returning results!**
📢 ====================== SEARCH EXECUTION COMPLETED ======================
```

If that fails, expand search (e.g., increase fetch_k).

📌 **Why is this important?**\
✅ Finds **meaningful** results based on semantics, not just keywords.  
✅ Uses a **score threshold** to filter out irrelevant matches.  
✅ Supports **MMR search**, preventing redundant results.  
✅ Helps build **AI-powered search engines** with ChromaDB.



In [27]:
# Cell 9 - Enhanced Search Agent (Detailed Logging Added)

class EnhancedSearchAgent(BaseAgent):
    def __init__(self, vector_store: Chroma):
        """
        🎯 Initializes the Enhanced Search Agent with deep logging for reflection.
        - `vector_store`: The ChromaDB instance containing document embeddings.
        """
        super().__init__("EnhancedSearcher")
        self.vector_store = vector_store

    def execute(self, query_data: Dict, k: int = 3, score_threshold: float = 0.5, retry_attempts: int = 2) -> List[Dict]:
        """
        🔎 Perform a **semantic search** using ChromaDB with reflection and deep logs.

        - **Reflection Logs:**
            1. Logs query input, initial parameters, and changes at each step.
            2. Shows retrieved chunks & scores before filtering.
            3. Explains why a retry is happening (not enough results, low score, etc.).

        - **Input:** `query_data` (contains query text and embedding).
        - **Output:** List of relevant documents with `"text"`, `"score"`, and `"metadata"`.
        """
        print("\n📢 ====================== SEARCH EXECUTION STARTED ======================")
        print(f"🔎 Query Received: {query_data['query']}")
        print(f"🔢 Initial Parameters → k: {k}, Score Threshold: {score_threshold}\n")

        attempt = 0
        while attempt <= retry_attempts:
            print(f"🔄 **Attempt {attempt + 1}** (k={k}, Threshold={score_threshold})...")

            # Retrieve top `k` relevant documents with similarity scores
            results = self.vector_store.similarity_search_with_score(
                query_data["query"],
                k=k
            )

            # Print retrieved results before filtering
            print("\n📜 **Raw Retrieved Results Before Filtering:**")
            for i, (doc, score) in enumerate(results):
                print(f"  🔹 Doc {i+1}: Score={score:.4f}, Content Snippet={doc.page_content[:100]}...")

            # Filter results based on similarity score threshold
            filtered_results = [
                {
                    "text": doc.page_content,
                    "score": score,
                    "metadata": doc.metadata if hasattr(doc, 'metadata') else {}
                }
                for doc, score in results
                if score >= score_threshold  # Only include results above threshold
            ]

            print(f"\n✅ **Post-Filtering:** {len(filtered_results)} results above threshold {score_threshold}")

            # Reflection: If we got enough results, break loop
            if len(filtered_results) > 0:
                print("✅ **Reflection: Retrieved relevant chunks, returning results!**")
                print("📢 ====================== SEARCH EXECUTION COMPLETED ======================\n")
                return filtered_results

            # If no relevant chunks, retry with different parameters
            attempt += 1
            if attempt == 1:
                print("⚠️ **Reflection: No good chunks found, increasing `k` to fetch more results.**")
                k += 2  # Fetch more results
            elif attempt == 2:
                print("⚠️ **Reflection: Still no good chunks, lowering `score_threshold` to be less strict.**")
                score_threshold -= 0.1  # Lower filtering strictness
            elif attempt == 3:
                print("⚠️ **Reflection: No success, switching to MMR search for diverse results.**")
                return self.execute_mmr_search(query_data, k=5, fetch_k=15, lambda_mult=0.7)

        print("❌ **Reflection: All retries failed. No relevant results found.**")
        print("📢 ====================== SEARCH EXECUTION COMPLETED ======================\n")
        return []

    def execute_mmr_search(self, query_data: Dict, k: int = 3, fetch_k: int = 10, lambda_mult: float = 0.5) -> List[Dict]:
        """
        🔄 **Maximum Marginal Relevance (MMR) Search** with reflection.

        - **Reflection:** Used as a backup strategy when regular search fails.
        - **Diversity Trade-Off:** `lambda_mult` controls balance between **relevance vs. diversity**.
        """
        print(f"\n🤖 {self.name}: Performing MMR search...")

        # Retrieve diverse and relevant documents using MMR
        results = self.vector_store.max_marginal_relevance_search(
            query_data["query"],
            k=k,
            fetch_k=fetch_k,
            lambda_mult=lambda_mult
        )

        print(f"📌 MMR search retrieved {len(results)} results.\n")

        # Print retrieved MMR results
        for i, doc in enumerate(results):
            print(f"  🔹 MMR Doc {i+1}: Content Snippet={doc.page_content[:100]}...")

        return [{"text": doc.page_content, "metadata": doc.metadata} for doc in results]


## 📌 Cell 10 -  LLM Agent Implementation

**🎯 Objective:**

The LLM Agent is responsible for generating human-like responses based on a given query and relevant document context using OpenAI's language model.


---



**🔐 Key Components**

1️⃣ Language Model (LLM):

- Uses OpenAI(temperature=0.7) for response generation.

- A higher temperature (0.7) introduces controlled creativity in responses.

2️⃣ Contextual Query Processing:

- Retrieves relevant context from document embeddings.

- Constructs a dynamic prompt to guide the model's response.

3️⃣ Prompt Engineering:

- Merges query and context into a structured prompt.

- Enhances accuracy and relevance of generated responses.

In [28]:
# 📌 Cell 10 - LLM Agent
class LLMAgent(BaseAgent):
    def __init__(self):
        """
        🎯 Initializes the LLM Agent.
        - Uses OpenAI's language model with a temperature of 0.7 for response generation.
        """
        super().__init__("LLMResponder")
        self.llm = OpenAI(temperature=0.7)  # Initializes OpenAI model with moderate creativity

    def execute(self, query: str, context: List[Dict]) -> str:
        """
        🤖 **Generates a response based on retrieved document context.**

        - **Input:**
          - `query`: The user's input question.
          - `context`: A list of retrieved document chunks relevant to the query.

        - **Processing:**
          1. Extracts **text** from the retrieved context.
          2. Constructs a **structured prompt** for LLM using the query and context.
          3. Calls OpenAI's model to generate a response.

        - **Output:** A human-like response relevant to the query.
        """
        print(f"🤖 {self.name}: Generating response...")

        # Format retrieved context into readable text
        context_text = "\n".join([f"Context {i+1}: {doc['text']}" for i, doc in enumerate(context)])

        # 🔹 Structured prompt for LLM
        prompt = f"""
        Based on the following context, please answer the question.

        Context:
        {context_text}

        Question: {query}

        Answer:
        """

        # 🔹 Generate response using OpenAI's language model
        response = self.llm.generate([prompt])

        # Extract the generated text response
        return response.generations[0][0].text


# 📌 Cell 11 - Orchestrator Implementation

## 🔍 Objective:
The **Orchestrator** class is responsible for managing the entire document processing pipeline, from loading and embedding documents to answering queries using a Language Model (LLM).

## 🛠️ Key Components:

1️⃣ **Document Loading Agent (`doc_loader`)**  
   - Loads documents from various formats (TXT, PDF, DOCX, CSV).  

2️⃣ **Document Processing Agent (`doc_processor`)**  
   - Splits the document into smaller, manageable chunks for embedding.  

3️⃣ **Enhanced Embedding Agent (`embedding_agent`)**  
   - Converts text chunks into embeddings and stores them in **ChromaDB** for efficient retrieval.  

4️⃣ **Query Processing Agent (`query_processor`)**  
   - Converts user queries into embeddings for semantic search.  

5️⃣ **LLM Agent (`llm_agent`)**  
   - Uses an **OpenAI-powered** model to generate meaningful responses based on retrieved document context.  

6️⃣ **Search Agent (`search_agent`)**  
   - Performs **semantic search** or **Maximum Marginal Relevance (MMR) search** to find the most relevant document chunks.  

---

## ⚙️ Methods:

### 🔹 `process_document(file_path, file_content, file_type)`
📌 **Function:** Loads, processes, and indexes a document into **ChromaDB**.  
✅ **Steps:**
- Loads document content.
- Splits text into **chunks**.
- Generates **embeddings** and stores them in **ChromaDB**.
- Initializes the **search agent** for future queries.

🔄 **Returns:** `"Document processed and indexed successfully!"` or `"Failed to load document!"`  

---

### 🔹 `answer_question(query, use_mmr=False)`
📌 **Function:** Answers a user query using the indexed document.  
✅ **Steps:**
- Converts the query into an **embedding**.
- Retrieves **relevant document chunks** using **semantic search** or **MMR search**.
- Feeds retrieved context into the **LLM model** to generate a response.

🔄 **Returns:** A generated response based on retrieved document context.  
⚠️ **Note:** If no document is processed first, it returns `"Please process a document first!"`  

---

## 📖 Example Usage:

```python
orchestrator = Orchestrator()

# Process a document
orchestrator.process_document(file_path="sample.pdf")

# Ask a question based on the document
response = orchestrator.answer_question("What is the main topic of the document?")
print(response)


In [29]:
# Cell 11 - Orchestrator Implementation
class Orchestrator:
    def __init__(self):
        self.doc_loader = DocumentLoadingAgent()
        self.doc_processor = DocumentProcessingAgent()
        self.embedding_agent = EnhancedEmbeddingAgent()
        self.query_processor = QueryProcessingAgent()
        self.llm_agent = LLMAgent()
        self.search_agent = None

    def process_document(self, file_path: str = None, file_content: bytes = None, file_type: str = None):
        # Load document
        document = self.doc_loader.execute(file_path, file_content, file_type)
        if not document:
            return "Failed to load document!"

        # Process document
        chunks = self.doc_processor.execute(document)
        vector_store = self.embedding_agent.execute(chunks)
        self.search_agent = EnhancedSearchAgent(vector_store)
        return "Document processed and indexed successfully!"

    def answer_question(self, query: str, use_mmr: bool = False) -> str:
        if not self.search_agent:
            return "Please process a document first!"

        query_data = self.query_processor.execute(query)

        if use_mmr:
            search_results = self.search_agent.execute_mmr_search(query_data)
        else:
            search_results = self.search_agent.execute(query_data)

        response = self.llm_agent.execute(query, search_results)
        return response


# 📌 Cell 12 - Utility Implementation

## 🔍 Objective:
The `handle_upload()` function is responsible for handling **file uploads** in a **Google Colab environment**.  
It processes the uploaded files and indexes them using the **Orchestrator**.

---

## 🛠️ Key Components:

1️⃣ **File Upload Handling (`files.upload()`)**  
   - Uses Google Colab's `files.upload()` method to allow users to upload documents.  
   - Supports multiple file uploads.  

2️⃣ **Orchestrator Initialization (`orchestrator = Orchestrator()`)**  
   - Creates an instance of the **Orchestrator** to process documents.  

3️⃣ **File Processing (`orchestrator.process_document()`)**  
   - Extracts file content and determines the file type.  
   - Passes the content to the **Orchestrator** for document processing and embedding.  

---

## ⚙️ Function Definition:

### 🔹 `handle_upload()`
📌 **Function:** Handles file uploads and processes them using the **Orchestrator**.  

✅ **Steps:**
- Uploads files using **Google Colab's** `files.upload()`.
- Iterates over the uploaded files.
- Extracts **file content** and **file type**.
- Calls `orchestrator.process_document()` to process and index the document.
- Prints the result of each file's processing.

🔄 **Returns:**  
- The initialized `Orchestrator` instance with processed documents.  

---

## 📖 Example Usage:

```python
orchestrator = handle_upload()

# Now, you can ask queries based on the uploaded documents:
response = orchestrator.answer_question("What is the summary of the document?")
print(response)


In [30]:
# Cell 12 - Utility Implementation
from google.colab import files  # Import the 'files' object from google.colab

def handle_upload():
    """Handle file upload through Colab"""
    uploaded = files.upload()
    orchestrator = Orchestrator()

    for filename, content in uploaded.items():
        file_type = filename.split('.')[-1].lower()
        result = orchestrator.process_document(
            file_content=content,
            file_type=file_type
        )
        print(f"Processing {filename}: {result}")

    return orchestrator

# 📌 Cell 13 - Upload Your File

## 🔍 Objective:
This cell prompts the user to **upload a document**, which is then processed by the `handle_upload()` function.

---

## 🛠️ Key Components:

1️⃣ **User Prompt (`print("Upload a document:")`)**  
   - Displays a message instructing the user to upload a file.  

2️⃣ **File Upload & Processing (`handle_upload()`)**  
   - Calls the `handle_upload()` function, which:
     - Handles file uploads.
     - Processes and indexes the uploaded document.
     - Returns an instance of the `Orchestrator`.

3️⃣ **Orchestrator Initialization (`orchestrator = handle_upload()`)**  
   - Stores the returned `Orchestrator` instance, enabling **query-based searches** on the uploaded document.

---

## ⚙️ Code Implementation:

```python
# Display upload prompt
print("Upload a document:")

# Handle file upload and initialize the orchestrator
orchestrator = handle_upload()


In [31]:
# Cell 13 - Upload Your File
print("Upload a document:")
orchestrator = handle_upload()

Upload a document:


Saving CONTRACTOR INSURANCE REQUIREMENTS.docx to CONTRACTOR INSURANCE REQUIREMENTS (1).docx
🤖 DocumentLoader: Loading document...
🤖 DocumentProcessor: Processing document into chunks...
Created 9 chunks
🤖 EnhancedEmbeddingGenerator: Generating embeddings and storing in ChromaDB...
Stored 9 document chunks in ChromaDB
Processing CONTRACTOR INSURANCE REQUIREMENTS (1).docx: Document processed and indexed successfully!


# 📌 Cell 14 - Query Execution

## 🔍 Objective:
This snippet **queries** the document processed by the `Orchestrator` to obtain an AI-generated response.

---

## 🛠️ Key Components:

1️⃣ **Define a Query (`question = "What is the main topic of the document?"`)**  
   - Sets the **user query** to extract relevant information from the document.  

2️⃣ **Get AI-Generated Answer (`answer = orchestrator.answer_question(question)`)**  
   - Calls the `answer_question()` method of the `Orchestrator`, which:
     - Converts the query into an embedding.
     - Searches for relevant document sections.
     - Uses an LLM to generate a response based on context.

3️⃣ **Display Results (`print(...)`)**  
   - Prints both the query and the AI-generated response.

---

## ⚙️ Code Implementation:

```python
# Define the query
question = "What is the main topic of the document?"

# Get AI-generated answer
answer = orchestrator.answer_question(question)

# Display the question and answer
print("\nQuestion:", question)
print("Answer:", answer)


In [32]:
# Cell 14 - Query Execution
question = "What is the main topic of the document?"
answer = orchestrator.answer_question(question)
print("\nQuestion:", question)
print("Answer:", answer)


🤖 QueryProcessor: Processing query...

🔎 Query Received: What is the main topic of the document?
🔢 Initial Parameters → k: 3, Score Threshold: 0.5

🔄 **Attempt 1** (k=3, Threshold=0.5)...

📜 **Raw Retrieved Results Before Filtering:**
  🔹 Doc 1: Score=1.7171, Content Snippet=Bodily Injury by accident:

Bodily Injury by disease:
Bodily Injury by disease:

$1,000,000 each acc...
  🔹 Doc 2: Score=1.7171, Content Snippet=Bodily Injury by accident:

Bodily Injury by disease:
Bodily Injury by disease:

$1,000,000 each acc...
  🔹 Doc 3: Score=1.7664, Content Snippet=ADDITIONAL INSUREDS TO THE EXTENT OF CONTRACTOR’S LIABILITIES AND INDEMNITIES UNDER THE AGREEMENT, W...

✅ **Post-Filtering:** 3 results above threshold 0.5
🎯 **Reflection: Retrieved relevant chunks, returning results!**

🤖 LLMResponder: Generating response...

Question: What is the main topic of the document?
Answer: 
The main topic of the document is insurance coverage for bodily injury and liability in relation to work performe

# 📌 Cell 15 - Query Execution with Maximum Marginal Relevance (MMR)

## 🔍 Objective:
This snippet **queries** the document while leveraging **Maximum Marginal Relevance (MMR)** to ensure **diverse** and **contextually relevant** results.

---

## 🛠️ Key Components:

1️⃣ **Define a Query (`question = "What is the main topic of the document?"`)**  
   - The user **asks a question** about the uploaded document.  

2️⃣ **Run MMR Search (`orchestrator.answer_question(question, use_mmr=True)`)**  
   - Calls the `answer_question()` method with `use_mmr=True`, which:
     - Performs **Maximum Marginal Relevance (MMR) search**.
     - Ensures **results are not just similar**, but **diverse** to cover multiple perspectives.

3️⃣ **Display the MMR-enhanced Answer (`print(...)`)**  
   - Prints the AI-generated **diverse response**.

---

## ⚙️ Code Implementation:

```python
# Define the query
question = "What is the main topic of the document?"

# Get AI-generated diverse answer using MMR
diverse_answer = orchestrator.answer_question(question, use_mmr=True)

# Display the result
print("\nDiverse Answer (using MMR):", diverse_answer)


In [19]:
# Cell 15 - Try MMR search
diverse_answer = orchestrator.answer_question(question, use_mmr=True)
print("\nDiverse Answer (using MMR):", diverse_answer)



🤖 QueryProcessor: Processing query...
🤖 EnhancedSearcher: Performing MMR search...
📌 MMR search retrieved 3 results.
🤖 LLMResponder: Generating response...

Diverse Answer (using MMR): 
The main topic of the document is contractor insurance requirements for a master service agreement.
