# Lesson 3: Integrating Components for a Complete RAG Chatbot

Welcome to the third unit of our course on building a RAG-powered chatbot! In the previous units, we've built two essential components: a document processor that handles the retrieval of relevant information and a chat engine that manages conversations with users. Now, it's time to bring these components together to create a complete Retrieval-Augmented Generation (RAG) system.

In this lesson, we'll integrate our document processor and chat engine into a unified `RAGChatbot` class. This integration will create a seamless experience where users can upload documents, ask questions about them, and receive informed responses based on the document content. By the end of this lesson, you'll have a fully functional RAG chatbot that can answer questions about any documents you provide. This represents the culmination of our work so far, bringing together retrieval and generation in a practical, user-friendly system.

Let's start building our integrated RAG chatbot!

---

## Creating the RAGChatbot Class

The first step in our integration is to create a new class that will serve as the main interface for our RAG chatbot. This class will coordinate between the document processor and chat engine components we've already built.

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine

class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
```

This initialization is straightforward but powerful. We're creating instances of both our `DocumentProcessor` and `ChatEngine` classes, which we developed in the previous lessons. This class will serve as the coordinator between these components, handling the flow of information from document processing to context retrieval to conversation management.

### Responsibilities:

* **DocumentProcessor**: handles document loading, chunking, embedding, and retrieval
* **ChatEngine**: manages conversation flow and language model interactions
* **RAGChatbot**: coordinates these components and provides a unified interface

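This modular architecture keeps the system maintainable and extensible: each component can be changed or tested on its own as long as the methods `RAGChatbot` calls stay the same.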
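
To make the division of responsibilities concrete, here is a minimal sketch of the coordinator pattern. `InMemoryProcessor` and `EchoEngine` are hypothetical stand-ins, not the classes from the earlier lessons, and the optional constructor arguments are an assumption added for illustration; the lesson's `RAGChatbot` always builds its own components.

```python
class InMemoryProcessor:
    """Hypothetical stand-in for DocumentProcessor: only records file paths."""

    def __init__(self):
        self.processed = []

    def process_document(self, file_path):
        self.processed.append(file_path)


class EchoEngine:
    """Hypothetical stand-in for ChatEngine: echoes instead of calling a model."""

    def send_message(self, message, context=""):
        return f"echo: {message}"


class RAGChatbot:
    """Coordinator. The optional constructor arguments are an assumption
    added for this sketch, not part of the lesson's class."""

    def __init__(self, document_processor=None, chat_engine=None):
        if document_processor is None:
            document_processor = InMemoryProcessor()
        if chat_engine is None:
            chat_engine = EchoEngine()
        self.document_processor = document_processor
        self.chat_engine = chat_engine


# Swap in a processor we can inspect; the chat engine falls back to the default.
processor = InMemoryProcessor()
chatbot = RAGChatbot(document_processor=processor)
chatbot.document_processor.process_document("data/example.pdf")  # hypothetical path

print(processor.processed)                     # ['data/example.pdf']
print(chatbot.chat_engine.send_message("hi"))  # echo: hi
```

Passing components in like this is one possible way to test the coordinator without loading documents or calling a language model.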

---

## Implementing Document Management

Let's add a method to upload and process documents:

```python
def upload_document(self, file_path):
    """Upload and process a document"""
    try:
        self.document_processor.process_document(file_path)
        return "Document successfully processed."
    except ValueError as e:
        return f"Error: {str(e)}"
```

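This method wraps `process_document` in a `try`/`except` block. If `process_document` raises a `ValueError` (for example, for an unsupported file format), the method returns a readable error message instead of letting the exception propagate.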

Also, add a method to reset the document knowledge:

```python
def reset_documents(self):
    """Reset the document processor"""
    self.document_processor.reset()
    return "Document knowledge has been reset."
```
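
The two document-management methods can be exercised without real files. The sketch below uses `StubDocumentProcessor`, a hypothetical stand-in that accepts only `.pdf` paths and never opens them; the file names and the check on the extension are assumptions for illustration, not the behavior of the real `DocumentProcessor`.

```python
import os


class StubDocumentProcessor:
    """Hypothetical stand-in: accepts only .pdf paths and never reads the file."""

    def __init__(self):
        self.processed = []

    def process_document(self, file_path):
        extension = os.path.splitext(file_path)[1].lower()
        if extension != ".pdf":
            raise ValueError("Unsupported file format")
        self.processed.append(file_path)

    def reset(self):
        self.processed = []


class RAGChatbot:
    def __init__(self):
        self.document_processor = StubDocumentProcessor()

    def upload_document(self, file_path):
        """Same shape as the lesson's method: turn ValueError into a message."""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {e}"

    def reset_documents(self):
        self.document_processor.reset()
        return "Document knowledge has been reset."


chatbot = RAGChatbot()
print(chatbot.upload_document("report.pdf"))  # Document successfully processed.
print(chatbot.upload_document("notes.txt"))   # Error: Unsupported file format
print(chatbot.reset_documents())              # Document knowledge has been reset.
```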

---

## Building the Message Processing Pipeline

Here's the core method that processes user messages and generates responses:

```python
def send_message(self, message):
    """Send a message to the chatbot and get a response"""
    # Retrieve relevant document chunks based on the user's query
    relevant_docs = self.document_processor.retrieve_relevant_context(message)

    # Build the context string from relevant docs
    context = ""
    for doc in relevant_docs:
        source = doc.metadata.get('source', 'unknown')
        content = doc.page_content
        context += f"Source: {source}\n{content}\n\n"

    # Send message and context to chat engine
    return self.chat_engine.send_message(message, context)
```

This method implements the core Retrieval-Augmented Generation (RAG) workflow:

* Takes a user message
* Retrieves relevant document chunks from the document processor
* Builds a context string including document sources
* Sends the message and context to the chat engine for response
* Returns the chat engine’s reply

If no relevant documents are found, the context is an empty string, and the chat engine replies that it lacks enough information, as the reset example later in this lesson shows.
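
The whole flow, including the empty-context case, can be traced with stand-ins. In this sketch `StubDocumentProcessor` matches chunks by shared words and `StubChatEngine` returns the context it received; both are hypothetical simplifications (the real components use embeddings and a language model), and the sample chunk and file name are made up.

```python
from types import SimpleNamespace


class StubDocumentProcessor:
    """Hypothetical retriever: returns chunks sharing at least one word with the query."""

    def __init__(self, chunks):
        self.chunks = chunks

    def retrieve_relevant_context(self, query):
        words = set(query.lower().split())
        return [c for c in self.chunks if words & set(c.page_content.lower().split())]


class StubChatEngine:
    """Hypothetical engine: shows the context it received instead of calling a model."""

    def send_message(self, message, context=""):
        if not context:
            return "I don't have enough information in the provided context to answer this question."
        return f"[context used]\n{context}"


class RAGChatbot:
    def __init__(self, chunks):
        self.document_processor = StubDocumentProcessor(chunks)
        self.chat_engine = StubChatEngine()

    def send_message(self, message):
        # Same steps as the lesson's method: retrieve, build context, delegate.
        relevant_docs = self.document_processor.retrieve_relevant_context(message)
        context = ""
        for doc in relevant_docs:
            source = doc.metadata.get("source", "unknown")
            context += f"Source: {source}\n{doc.page_content}\n\n"
        return self.chat_engine.send_message(message, context)


chunk = SimpleNamespace(
    page_content="Holmes receives a masked visitor.",
    metadata={"source": "story.pdf"},
)
chatbot = RAGChatbot([chunk])

print(chatbot.send_message("Tell me about Holmes"))  # context contains the chunk
print(chatbot.send_message("What is the weather"))   # no match, so the fallback reply
```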

---

## Adding System Management Features

Reset the conversation history:

```python
def reset_conversation(self):
    """Reset the conversation history"""
    self.chat_engine.reset_conversation()
    return "Conversation history has been reset."
```

Reset both conversation and documents:

```python
def reset_all(self):
    """Reset both conversation and documents"""
    self.reset_conversation()
    self.reset_documents()
    return "Both conversation history and document knowledge have been reset."
```
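
To see what the reset methods do to each component's state, here is a small sketch with stand-ins that start out pre-filled. The `chunks` and `history` attributes are hypothetical; the real `DocumentProcessor` and `ChatEngine` may store their state differently.

```python
class StubDocumentProcessor:
    """Hypothetical stand-in holding a list of chunks as its knowledge."""

    def __init__(self):
        self.chunks = ["chunk one", "chunk two"]

    def reset(self):
        self.chunks = []


class StubChatEngine:
    """Hypothetical stand-in holding the conversation as a list of turns."""

    def __init__(self):
        self.history = [("user", "hello"), ("assistant", "hi there")]

    def reset_conversation(self):
        self.history = []


class RAGChatbot:
    def __init__(self):
        self.document_processor = StubDocumentProcessor()
        self.chat_engine = StubChatEngine()

    def reset_conversation(self):
        self.chat_engine.reset_conversation()
        return "Conversation history has been reset."

    def reset_documents(self):
        self.document_processor.reset()
        return "Document knowledge has been reset."

    def reset_all(self):
        self.reset_conversation()
        self.reset_documents()
        return "Both conversation history and document knowledge have been reset."


chatbot = RAGChatbot()
before = (len(chatbot.document_processor.chunks), len(chatbot.chat_engine.history))
message = chatbot.reset_all()
after = (len(chatbot.document_processor.chunks), len(chatbot.chat_engine.history))

print(before, after)  # (2, 2) (0, 0)
print(message)        # Both conversation history and document knowledge have been reset.
```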

---

## Uploading a Document and Sending a Message

Test the chatbot with document upload and query:

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Ask a question about the document
query = "What is the main mystery in the story?"
response = chatbot.send_message(query)
print(f"\nQuestion: {query}")
print(f"Answer: {response}")
```

Example output (the exact wording of the answer will vary from run to run):

```
Document successfully processed.

Question: What is the main mystery in the story?
Answer: The main mystery in the story is the identity and intentions of the gentleman who is set to visit the character at a quarter to eight o'clock.
```

---

## Resetting Everything and Sending a Message

Test reset functionality:

```python
# Reset everything
result = chatbot.reset_all()
print(result)

# Ask about Sherlock Holmes after reset
final_query = "Tell me about Sherlock Holmes."
response = chatbot.send_message(final_query)
print(f"\nQuestion: {final_query}")
print(f"Answer: {response}")
```

Expected output:

```
Both conversation history and document knowledge have been reset.

Question: Tell me about Sherlock Holmes.
Answer: I don't have enough information in the provided context to answer this question.
```

---

## Summary and Practice Preview

In this lesson, we've successfully integrated our document processor and chat engine into a complete RAG chatbot system. The `RAGChatbot` class:

* Uploads and processes documents
* Retrieves relevant context for queries
* Generates context-aware answers
* Maintains conversation history
* Supports resetting conversation and document knowledge

This marks the culmination of our work: a modular, functional RAG system ready for real-world usage.

Prepare to apply your knowledge in practice exercises and take your RAG chatbot to the next level!


## Implementing Document Upload and Error Handling

Well done on understanding how to integrate our document processor and chat engine into a `RAGChatbot` class. Now, let's begin by implementing and testing the `upload_document` method within this class.

Here's what you need to do:

* Implement a method to handle document uploads by calling the `process_document` method from the `DocumentProcessor` class.
* Ensure the method catches any `ValueError` exceptions and returns a user-friendly error message.
* Test the method by simulating document uploads with:
  * A valid PDF file path.
  * A valid TXT file path, and expect a user-friendly error message indicating that the format is unsupported.
* Print the results of each upload attempt to verify that your implementation works correctly.

This exercise will solidify your understanding of error handling and component integration. You're almost there, so keep up the great work!

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    # TODO: Implement a method to upload a document
        # TODO: Call the process_document method from DocumentProcessor
        # TODO: Return a success message if the document is processed without errors
        # TODO: Return a user-friendly error message if a ValueError is caught

# main.py
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Define the file paths
pdf_file = "data/a_scandal_in_bohemia.pdf"
txt_file = "data/alice_in_wonderland.txt"

# TODO: Upload the PDF file and print the result

# TODO: Upload the TXT file and print the result


```

A completed version of the class and the test script:

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    def upload_document(self, file_path):
        """Upload and process a document, handling errors gracefully."""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {str(e)}"


# ----------------- Testing upload_document ------------------

from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Define the file paths
pdf_file = "data/a_scandal_in_bohemia.pdf"
txt_file = "data/alice_in_wonderland.txt"

# Upload the PDF file and print the result
result_pdf = chatbot.upload_document(pdf_file)
print(f"Uploading PDF: {result_pdf}")

# Upload the TXT file and print the result (expect error message)
result_txt = chatbot.upload_document(txt_file)
print(f"Uploading TXT: {result_txt}")
```

---

### Explanation:

* The `upload_document` method calls `process_document` on the document processor.
* If the file is processed without issue, it returns a success message.
* If a `ValueError` occurs (for example, unsupported file format like TXT), it catches the error and returns a friendly error message.
* The test code tries to upload a PDF (expected to succeed) and a TXT (expected to fail), then prints results for both.

---

### Expected output example:

```
Uploading PDF: Document successfully processed.
Uploading TXT: Error: Unsupported file format
```

This confirms your error handling and integration between the RAGChatbot and DocumentProcessor are working as intended.
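
One detail worth noticing is that `except ValueError` catches only `ValueError` and its subclasses. The sketch below uses `MissingFileProcessor`, a hypothetical stand-in that raises `FileNotFoundError`, to show that other exception types still reach the caller. Whether the real `DocumentProcessor` raises `FileNotFoundError` for a missing path is an assumption, and the path is made up.

```python
class MissingFileProcessor:
    """Hypothetical stand-in that fails the way open() does for a missing path."""

    def process_document(self, file_path):
        raise FileNotFoundError(f"No such file: {file_path}")


def upload_document(processor, file_path):
    """Same try/except shape as the lesson's method, written as a plain function."""
    try:
        processor.process_document(file_path)
        return "Document successfully processed."
    except ValueError as e:
        return f"Error: {e}"


try:
    upload_document(MissingFileProcessor(), "data/missing.pdf")
except FileNotFoundError as e:
    # FileNotFoundError is not a ValueError, so the handler above does not catch it.
    outcome = f"Propagated to caller: {e}"
else:
    outcome = "No exception reached the caller"

print(outcome)  # Propagated to caller: No such file: data/missing.pdf
```

If you want a friendly message for such cases too, one option is to add a second `except` clause for the exception types you expect.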


## Handling User Messages and Retrieval

You've done a fantastic job integrating the document processor and chat engine into the `RAGChatbot` class. Now, let's focus on implementing a method that handles context retrieval and communication with our chat engine.

Here's what this method should do:

* Retrieve relevant document chunks using the `retrieve_relevant_context` method from the `DocumentProcessor`.
* Initialize an empty string for the context.
* Loop through each relevant document chunk and concatenate only the `page_content` to the context string with proper formatting.
* Pass the user's message and the finalized context string to the `ChatEngine` and return its response.

To test your implementation, use the `main.py` file to send a message to the chatbot, then print the response to verify the functionality.

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    def upload_document(self, file_path):
        """Upload and process a document"""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {str(e)}"
        
    # TODO: Implement a method to handle message processing
        # TODO: Retrieve relevant document chunks based on the user's query
        # TODO: Initialize an empty string for the context

        # TODO: Loop through each relevant document
            # TODO: Extract the content of the document
            # TODO: Append the content to the context string

        # TODO: Send the user's message along with the context to the chat engine and return the response


# main.py
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Define a user query
query = "What is the main mystery in the story?"

# TODO: Send the message and print the response

```

Here is one way to implement the method, named `process_message` in this solution. The `ChatEngine` method it calls is `send_message`, not `generate_response` or `get_response`. The lesson's own version of this method is called `send_message`; whichever name you choose, `main.py` must call the same one.

```python
def process_message(self, message):
    """
    Process a user message by retrieving relevant context and generating a response
    
    Args:
        message (str): The user's query
        
    Returns:
        str: The chatbot's response
    """
    # Retrieve relevant document chunks based on the user's query
    relevant_chunks = self.document_processor.retrieve_relevant_context(message)
    
    # Initialize an empty string for the context
    context = ""
    
    # Loop through each relevant document
    for chunk in relevant_chunks:
        # Extract the content of the document
        chunk_content = chunk.page_content
        
        # Append the content to the context string with proper formatting
        context += f"{chunk_content}\n\n"
    
    # Send the user's message along with the context to the chat engine and return the response
    response = self.chat_engine.send_message(message, context)
    
    return response
```

The `send_message` method of the `ChatEngine` class takes two arguments: the user's message and the context retrieved from the document processor.

The completed `main.py` sends the query and prints the response:

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Define a user query
query = "What is the main mystery in the story?"

# Send the message and print the response
response = chatbot.process_message(query)
print("\nUser Query:", query)
print("\nChatbot Response:", response)
```

Because the context here contains only `page_content`, the model sees the retrieved text but not where it came from. The next exercise adds that source information.
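
One way to check this wiring without a document or a language model is to replace both components with `unittest.mock.Mock` objects and assert on the calls. In this sketch the constructor takes the components as arguments, which is an assumption made so the mocks can be passed in; the chunk contents are made up.

```python
from types import SimpleNamespace
from unittest.mock import Mock


class RAGChatbot:
    """Trimmed copy for the sketch; passing components in is an assumption."""

    def __init__(self, document_processor, chat_engine):
        self.document_processor = document_processor
        self.chat_engine = chat_engine

    def process_message(self, message):
        relevant_chunks = self.document_processor.retrieve_relevant_context(message)
        context = ""
        for chunk in relevant_chunks:
            context += f"{chunk.page_content}\n\n"
        return self.chat_engine.send_message(message, context)


# Mock retriever that always returns two fake chunks.
processor = Mock()
processor.retrieve_relevant_context.return_value = [
    SimpleNamespace(page_content="Chunk A"),
    SimpleNamespace(page_content="Chunk B"),
]

# Mock engine that returns a fixed reply.
engine = Mock()
engine.send_message.return_value = "stubbed reply"

chatbot = RAGChatbot(processor, engine)
reply = chatbot.process_message("What happened?")

# The query goes to the retriever unchanged, and only page_content reaches the engine.
processor.retrieve_relevant_context.assert_called_once_with("What happened?")
engine.send_message.assert_called_once_with("What happened?", "Chunk A\n\nChunk B\n\n")
print(reply)  # stubbed reply
```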

## Enhancing Chatbot Context with Sources

Now, let's enhance the `send_message` method to make our chatbot even more informative.

Here's what you need to do:

* Modify the `send_message` method to extract the `'source'` from each document chunk's metadata.
* Append both the `'source'` and the `page_content` to the context string with clear formatting.

Once you've completed these steps, run the code to ensure the chatbot correctly includes the source information in its responses. This will help you understand how to enhance the chatbot's transparency and reliability. Keep pushing forward!

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    def upload_document(self, file_path):
        """Upload and process a document"""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {str(e)}"
        
    def send_message(self, message):
        """Send a message to the chatbot and get a response"""
        # Retrieve relevant document chunks based on the user's query
        relevant_docs = self.document_processor.retrieve_relevant_context(message)
        # Initialize an empty string for the context
        context = ""

        # Loop through each relevant document
        for doc in relevant_docs:
            # Extract the page content
            content = doc.page_content
            
            # TODO: Extract the source of retrieved document or default to unknown
            
            # TODO: Append both source and the content to the context string
            context += f"{content}\n\n"

        # Send the user's message along with the context to the chat engine
        return self.chat_engine.send_message(message, context)

# main.py
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Define a user query
query = "What is the main mystery and from what story?"

# Send the message and print the response
response = chatbot.send_message(query)
print(f"Answer: {response}")

```

The enhanced `send_message` method below includes source information from the document metadata, which makes the chatbot's responses more transparent about where the information comes from:

```python
def send_message(self, message):
    """Send a message to the chatbot and get a response"""
    # Retrieve relevant document chunks based on the user's query
    relevant_docs = self.document_processor.retrieve_relevant_context(message)
    # Initialize an empty string for the context
    context = ""

    # Loop through each relevant document
    for doc in relevant_docs:
        # Extract the page content
        content = doc.page_content
        
        # Extract the source of retrieved document or default to unknown
        source = doc.metadata.get('source', 'unknown')
        
        # Append both source and the content to the context string with clear formatting
        context += f"Source: {source}\n{content}\n\n"

    # Send the user's message along with the context to the chat engine
    return self.chat_engine.send_message(message, context)
```

The key changes:

1. Extracted the `'source'` from each document chunk's metadata using the `.get('source', 'unknown')` method, which will return 'unknown' if the source is not available in the metadata.

2. Improved the context formatting by including the source information before each document's content with a clear "Source:" prefix.

This improved implementation will make the RAG chatbot more informative because:

1. It now provides source attribution for each piece of information, enhancing transparency and trustworthiness.

2. The clear formatting with "Source:" labels helps the language model understand where the information is coming from, potentially improving its ability to reference sources in responses.

3. If the user asks about the source of information, the chatbot will have that information in its context and can provide it in the response.

With these changes, when you run the code and ask "What is the main mystery and from what story?", the chatbot will be able to provide an answer that includes or references the source of the information, making the response more credible and informative.
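
To inspect the exact string the model receives, the loop can be pulled into a small helper. `build_context` is a hypothetical name introduced for this sketch, and the chunks are made-up `SimpleNamespace` objects that only mimic the `page_content` and `metadata` attributes used in the lesson.

```python
from types import SimpleNamespace


def build_context(relevant_docs):
    """Mirror the loop in send_message: one 'Source:' line before each chunk."""
    context = ""
    for doc in relevant_docs:
        # .get() falls back to 'unknown' when the metadata has no 'source' key.
        source = doc.metadata.get("source", "unknown")
        context += f"Source: {source}\n{doc.page_content}\n\n"
    return context


chunks = [
    SimpleNamespace(page_content="First chunk.", metadata={"source": "a.pdf"}),
    SimpleNamespace(page_content="Second chunk.", metadata={}),
]

print(repr(build_context(chunks)))
# 'Source: a.pdf\nFirst chunk.\n\nSource: unknown\nSecond chunk.\n\n'
print(repr(build_context([])))
# ''
```

The second chunk has no `source` entry, so it is labeled `unknown` rather than raising a `KeyError`.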

## Mastering Chatbot Reset Functions

Finally, let's enhance our chatbot by implementing reset functionality to manage its state effectively. Here's what you need to implement:

* `reset_conversation` method: Call the `reset_conversation` method from the `ChatEngine` class and return a confirmation message.
* `reset_documents` method: Call the `reset` method from the `DocumentProcessor` class and return a confirmation message.
* `reset_all` method: Call both the `reset_conversation` and `reset_documents` methods, ensuring the chatbot's state is fully reset, and return a message indicating this.

Test your implementation by using the `reset_all` method and sending a query afterward to verify that no previous context or conversation history affects the output.

This exercise will help you practice composing methods and managing system state. Dive in and see how these reset features enhance your chatbot's functionality!

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    def upload_document(self, file_path):
        """Upload and process a document"""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {str(e)}"
        
    def send_message(self, message):
        """Send a message to the chatbot and get a response"""
        relevant_docs = self.document_processor.retrieve_relevant_context(message)
        context = ""

        for doc in relevant_docs:
            source = doc.metadata.get('source', 'unknown')
            content = doc.page_content
            context += f"Source: {source}\n{content}\n\n"

        return self.chat_engine.send_message(message, context)
        
    # TODO: Implement the reset_conversation method
        # TODO: Call the reset_conversation method from ChatEngine
        # TODO: Return a confirmation message

    # TODO: Implement the reset_documents method
        # TODO: Call the reset method from DocumentProcessor
        # TODO: Return a confirmation message

    # TODO: Implement the reset_all method
        # TODO: Call reset_conversation and reset_documents methods
        # TODO: Return a message indicating both have been reset


# main.py
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Send a message and get a response
query = "What is the main mystery in the story?"
response = chatbot.send_message(query)
print(f"\nQuestion: {query}")
print(f"Answer: {response}")

# TODO: Reset everything using reset_all method
# TODO: Print the result of the reset

# TODO: Verify reset worked by sending a new query
# TODO: Print the response to ensure no previous context affects the output
```

Here’s the completed implementation of the reset methods inside your `RAGChatbot` class, plus example testing code at the bottom:

```python
from document_processor import DocumentProcessor
from chat_engine import ChatEngine


class RAGChatbot:
    def __init__(self):
        self.document_processor = DocumentProcessor()
        self.chat_engine = ChatEngine()
        
    def upload_document(self, file_path):
        """Upload and process a document"""
        try:
            self.document_processor.process_document(file_path)
            return "Document successfully processed."
        except ValueError as e:
            return f"Error: {str(e)}"
        
    def send_message(self, message):
        """Send a message to the chatbot and get a response"""
        relevant_docs = self.document_processor.retrieve_relevant_context(message)
        context = ""

        for doc in relevant_docs:
            source = doc.metadata.get('source', 'unknown')
            content = doc.page_content
            context += f"Source: {source}\n{content}\n\n"

        return self.chat_engine.send_message(message, context)
        
    def reset_conversation(self):
        """Reset the chat engine's conversation state"""
        self.chat_engine.reset_conversation()
        return "Conversation has been reset."
    
    def reset_documents(self):
        """Reset the document processor's state"""
        self.document_processor.reset()
        return "Documents have been reset."
    
    def reset_all(self):
        """Reset both conversation and documents"""
        conv_msg = self.reset_conversation()
        doc_msg = self.reset_documents()
        return f"{conv_msg} {doc_msg} Chatbot state fully reset."


if __name__ == "__main__":
    # Initialize the RAG chatbot
    chatbot = RAGChatbot()

    # Upload a document
    result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
    print(result)

    # Send a message and get a response
    query = "What is the main mystery in the story?"
    response = chatbot.send_message(query)
    print(f"\nQuestion: {query}")
    print(f"Answer: {response}")

    # Reset everything using reset_all method
    reset_result = chatbot.reset_all()
    print(f"\nReset result: {reset_result}")

    # Verify reset worked by sending a new query
    new_query = "Can you summarize the story without previous context?"
    new_response = chatbot.send_message(new_query)
    print(f"\nQuestion after reset: {new_query}")
    print(f"Answer after reset: {new_response}")
```

---

### Explanation

* `reset_conversation`: calls `self.chat_engine.reset_conversation()` and returns confirmation.
* `reset_documents`: calls `self.document_processor.reset()` and returns confirmation.
* `reset_all`: calls both reset methods and returns a combined message.
* After resetting, sending a new message verifies that previous conversation or document states no longer influence the chatbot.

This structure keeps your chatbot's state management clean and modular.
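
To confirm that `reset_all` really delegates to both components, one option is to pass in `unittest.mock.Mock` objects and assert on the calls. The constructor arguments in this sketch are an assumption made so the mocks can be injected; the method bodies follow the solution above.

```python
from unittest.mock import Mock


class RAGChatbot:
    """Trimmed copy for the sketch; passing components in is an assumption."""

    def __init__(self, document_processor, chat_engine):
        self.document_processor = document_processor
        self.chat_engine = chat_engine

    def reset_conversation(self):
        self.chat_engine.reset_conversation()
        return "Conversation has been reset."

    def reset_documents(self):
        self.document_processor.reset()
        return "Documents have been reset."

    def reset_all(self):
        conv_msg = self.reset_conversation()
        doc_msg = self.reset_documents()
        return f"{conv_msg} {doc_msg} Chatbot state fully reset."


processor, engine = Mock(), Mock()
chatbot = RAGChatbot(processor, engine)
result = chatbot.reset_all()

# Each component's reset method was called exactly once, with no arguments.
engine.reset_conversation.assert_called_once_with()
processor.reset.assert_called_once_with()
print(result)
# Conversation has been reset. Documents have been reset. Chatbot state fully reset.
```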

Here's the completed `main.py` with the TODOs implemented:

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Upload a document
result = chatbot.upload_document("data/a_scandal_in_bohemia.pdf")
print(result)

# Send a message and get a response
query = "What is the main mystery in the story?"
response = chatbot.send_message(query)
print(f"\nQuestion: {query}")
print(f"Answer: {response}")

# Reset everything using reset_all method
reset_result = chatbot.reset_all()
print(f"\nReset result: {reset_result}")

# Verify reset worked by sending a new query
new_query = "Can you summarize the story without previous context?"
new_response = chatbot.send_message(new_query)
print(f"\nQuestion after reset: {new_query}")
print(f"Answer after reset: {new_response}")
```

This will:

* Reset the entire chatbot state (both conversation and documents)
* Then send a new question to check that no prior context or documents affect the response.


