# Lesson 4: Analyzing Interplanetary Agreements with RAG


Welcome to the final unit of our course on building a RAG-powered chatbot! Throughout this course, you've built a complete Retrieval-Augmented Generation system from the ground up. You've created a document processor for handling document retrieval, developed a chat engine for managing conversations, and integrated these components into a unified RAG chatbot. Now it's time to put your creation to work on a practical application.

In this lesson, we'll explore how to use your RAG chatbot to analyze a collection of fictional interplanetary agreements. This scenario mimics real-world document analysis tasks that professionals often face: reviewing multiple complex documents, extracting specific information, and making comparisons across documents. While our documents are fictional and space-themed, the techniques you'll learn apply directly to real-world use cases like legal document review, policy analysis, or research synthesis.

Our interplanetary agreements dataset consists of three fictional documents:

- **An Interplanetary Trade Agreement**  
- **A Space Exploration Partnership**  
- **A Galactic Environmental Protection Pact**

These documents contain various clauses, terms, and provisions that our RAG chatbot will help us analyze. By the end of this lesson, you'll understand how to apply your RAG chatbot to extract insights from document collections efficiently.

---

## Implementing a Document Analysis Workflow

Before diving into document analysis, let's set up our RAG chatbot and plan our approach. We'll use the `RAGChatbot` class we built in the previous lesson, which integrates our document processor and chat engine components.

First, let's import our chatbot and initialize it:

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()
```

With our chatbot initialized, we need to plan our document analysis workflow. For complex document analysis tasks, it's often helpful to follow a structured approach:

1. Start with single document analysis to understand individual documents.
2. Progress to comparative analysis between documents.
3. Perform comprehensive analysis across all documents.
4. Use targeted analysis for specific inquiries.

This progressive approach helps build a comprehensive understanding of the document collection while allowing for focused analysis when needed. It also makes efficient use of our RAG system's capabilities, as the chatbot can retrieve relevant information from the entire document collection or from specific documents depending on our needs.
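The four stages above can also be written down as data, which makes the plan easy to review and rerun. The sketch below is one possible way to do that and is not part of the course code: `Stage`, `run_workflow`, and `StubChatbot` are hypothetical names, and the stub only stands in for `RAGChatbot` so the example runs on its own. It assumes the chatbot exposes the `upload_document`, `send_message`, and `reset_all` methods used in this lesson.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    """One step of the analysis plan (hypothetical helper, not course code)."""
    name: str
    question: str
    new_documents: list = field(default_factory=list)  # files to upload first
    reset_first: bool = False  # clear the knowledge base before this stage


class StubChatbot:
    """Stand-in for RAGChatbot so the sketch runs without the course code."""

    def __init__(self):
        self.documents = []

    def upload_document(self, path):
        self.documents.append(path)
        return "Document successfully processed."

    def send_message(self, question):
        return f"(answer drawn from {len(self.documents)} document(s))"

    def reset_all(self):
        self.documents.clear()
        return "Both conversation history and document knowledge have been reset."


def run_workflow(chatbot, stages):
    """Run each stage in order and return (stage name, answer) pairs."""
    results = []
    for stage in stages:
        if stage.reset_first:
            chatbot.reset_all()
        for path in stage.new_documents:
            chatbot.upload_document(path)
        results.append((stage.name, chatbot.send_message(stage.question)))
    return results


plan = [
    Stage("single", "How are disputes resolved?",
          ["data/interplanetary_trade_agreement.pdf"]),
    Stage("comparative",
          "How do the dispute resolution mechanisms differ between the "
          "trade agreement and the exploration partnership?",
          ["data/space_exploration_partnership.pdf"]),
    Stage("comprehensive", "What document mentioned fines?",
          ["data/galactic_environmental_protection_pact.pdf"]),
    Stage("targeted", "What penalties exist for emissions violations?",
          ["data/galactic_environmental_protection_pact.pdf"], reset_first=True),
]

for name, answer in run_workflow(StubChatbot(), plan):
    print(f"{name}: {answer}")
```

With the real chatbot, you would pass a `RAGChatbot()` instance in place of `StubChatbot()`. Note how the knowledge base grows through the first three stages and is cleared before the targeted one.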

Let's implement this workflow to analyze our interplanetary agreements.

---

## Single Document Analysis Techniques

Let's begin by uploading a single document and asking specific questions about it. This approach helps us understand the content of individual documents before attempting to make comparisons or draw broader conclusions.

```python
# Step 1: Upload a single document and ask a specific question about it
trade_agreement = "data/interplanetary_trade_agreement.pdf"
result = chatbot.upload_document(trade_agreement)
print(f"Uploaded {trade_agreement}: {result}")

# Ask a specific question about the trade agreement
question = "How are disputes resolved?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```

When you run this code, you'll see output similar to:

```
Uploaded data/interplanetary_trade_agreement.pdf: Document successfully processed.

Question: How are disputes resolved?
Answer: Disputes are resolved through mediation facilitated by the Galactic Trade Council, followed by binding arbitration under the rules established by the Galactic Arbitration Tribunal, and ultimately falling under the exclusive jurisdiction of the Galactic Court of Justice.
```

This example demonstrates how to extract specific information from a single document. The question "How are disputes resolved?" is targeted and specific, allowing our RAG chatbot to retrieve relevant sections of the document and provide a detailed answer.

When formulating questions for single document analysis, it's best to be specific and focused. Broad questions like "What are the key terms?" might not yield the most useful results. Questions that target one aspect of the document, such as "How are disputes resolved?" or "What are the confidentiality obligations?", will typically yield more precise and useful information.
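One lightweight way to apply this advice is to keep a short list of focused questions and ask them one at a time. The helper below is a sketch that assumes only the `send_message` method used above; `ask_focused_questions` and the `EchoChatbot` stand-in are hypothetical names introduced for illustration.

```python
def ask_focused_questions(chatbot, questions):
    """Ask each question in turn and return a {question: answer} mapping."""
    answers = {}
    for question in questions:
        answers[question] = chatbot.send_message(question)
    return answers


class EchoChatbot:
    """Stand-in for RAGChatbot that simply echoes the question it received."""

    def send_message(self, question):
        return f"[answer to: {question}]"


focused_questions = [
    "How are disputes resolved?",
    "What are the confidentiality obligations?",
]

for question, answer in ask_focused_questions(EchoChatbot(), focused_questions).items():
    print(f"Question: {question}")
    print(f"Answer: {answer}\n")
```

Because the chatbot keeps conversation history, earlier questions may influence later answers. If that matters for your analysis, consider resetting between questions.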

---

## Comparative Document Analysis

Once we understand individual documents, we can progress to comparative analysis. This involves uploading multiple documents and asking questions that require the chatbot to compare information across them.

Let's upload a second document and ask a comparative question:

```python
# Step 2: Upload a second document and ask a comparative question
space_partnership = "data/space_exploration_partnership.pdf"
result = chatbot.upload_document(space_partnership)
print(f"Uploaded {space_partnership}: {result}")

# Ask a comparison question between the two documents
question = "What are the liability clauses about?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```

The output will look like:

```
Uploaded data/space_exploration_partnership.pdf: Document successfully processed.

Question: What are the liability clauses about?
Answer: The liability clauses in the provided context state that neither partner shall be held liable for failure to perform obligations due to events beyond reasonable control, such as natural disasters, interstellar disruptions, interstellar conflicts, or systemic technological disruptions.
```

Comparative questions require our RAG system to retrieve relevant information from multiple documents and synthesize a response. This is where the power of RAG really shines: the system can pull context from different documents based on semantic relevance, not just keyword matching.
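To make the retrieval step more concrete, the sketch below ranks chunks from two documents in a single pool by cosine similarity to a query vector. It is an illustration only: the three-dimensional vectors are made-up numbers standing in for real embeddings, the chunk texts are placeholders rather than quotes from the agreements, and `top_k_chunks` is a hypothetical helper, not the retriever built earlier in the course.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, chunks, k=2):
    """Return the k chunks most similar to the query, regardless of source."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk["vector"]),
        reverse=True,
    )
    return ranked[:k]


# Placeholder chunks: the vectors are invented for illustration, not real embeddings.
chunks = [
    {"source": "trade_agreement", "text": "liability placeholder", "vector": [0.9, 0.1, 0.0]},
    {"source": "trade_agreement", "text": "tariff placeholder", "vector": [0.0, 0.2, 0.9]},
    {"source": "space_partnership", "text": "liability placeholder", "vector": [0.8, 0.3, 0.1]},
    {"source": "space_partnership", "text": "mission schedule placeholder", "vector": [0.1, 0.9, 0.2]},
]

query_vector = [1.0, 0.2, 0.0]  # pretend embedding of a question about liability

for chunk in top_k_chunks(query_vector, chunks):
    print(chunk["source"], "-", chunk["text"])
```

The two highest-ranked chunks come from different documents, which is what lets the language model see both agreements side by side when it writes its answer. In a real system, an embedding model produces the vectors, so passages with similar meaning end up close together even when they share few words.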

---

## Comprehensive Multi-Document Analysis

After understanding individual documents and making targeted comparisons, we can perform comprehensive analysis across all documents. This involves uploading all relevant documents and asking questions that require synthesizing information from the entire collection.

Let's add our third document and ask a question that might require information from any of the documents:

```python
# Step 3: Add a third document and ask for a comprehensive analysis
environmental_pact = "data/galactic_environmental_protection_pact.pdf"
result = chatbot.upload_document(environmental_pact)
print(f"Uploaded {environmental_pact}: {result}")

# Ask a question whose answer could come from any of the three documents
question = "What document mentioned fines?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```

The output will look like:

```
Uploaded data/galactic_environmental_protection_pact.pdf: Document successfully processed.

Question: What document mentioned fines?
Answer: The document that mentioned fines is the Galactic Federation Galactic Environmental Protection Pact.
```

This example shows how our RAG system can search across all uploaded documents to find specific information. The question "What document mentioned fines?" requires the system to identify which document contains information about fines, demonstrating the RAG chatbot's ability to search across the entire document collection.
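Because the answer is generated by a language model, it can be worth cross-checking an attribution like this with a plain text search. The sketch below assumes you already have the extracted text of each document in a dictionary; `documents_mentioning` is a hypothetical helper, and the sample strings are invented stand-ins, not excerpts from the agreements.

```python
import re


def documents_mentioning(term, texts_by_document):
    """Return sorted names of documents whose text contains `term` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return sorted(
        name for name, text in texts_by_document.items() if pattern.search(text)
    )


# Invented one-line stand-ins for the extracted text of each agreement.
sample_texts = {
    "trade_agreement": "Placeholder text about tariffs and dispute resolution.",
    "space_partnership": "Placeholder text about joint missions and liability.",
    "environmental_pact": "Placeholder text about penalties, including fines.",
}

print(documents_mentioning("fines", sample_texts))
```

A keyword check like this is much cruder than semantic retrieval, since it misses synonyms such as "monetary penalties". It is only a quick sanity check on which document the chatbot named.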

---

## Strategic Knowledge Base Management

For complex document analysis tasks, it's sometimes helpful to reset your knowledge base and focus on specific documents. This allows for more targeted analysis without interference from other documents in the collection.

Let's demonstrate this by resetting our knowledge base and focusing only on the environmental pact:

```python
# Step 4: Reset everything and focus only on the environmental pact
reset_result = chatbot.reset_all()
print(reset_result)

# Re-upload only the environmental pact
result = chatbot.upload_document(environmental_pact)
print(f"Re-uploaded {environmental_pact}: {result}")

# Ask a complex question specifically about the environmental pact
question = "What penalties exist for emissions violations?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```

The output will look like:

```
Both conversation history and document knowledge have been reset.
Re-uploaded data/galactic_environmental_protection_pact.pdf: Document successfully processed.

Question: What penalties exist for emissions violations?
Answer: Violation of any environmental standards may result in penalties including fines, suspension of operations, or revocation of eco-certification.
```

This example demonstrates how to use the `reset_all()` method to clear both conversation history and document knowledge, allowing you to focus on a specific document without interference from previously uploaded documents. This is particularly useful when you want to perform deep analysis on a single document after exploring the broader collection.

Strategic knowledge base management involves deciding when to keep multiple documents in your knowledge base for comparative analysis and when to reset and focus on specific documents for deeper analysis. This flexibility allows you to tailor your analysis approach to your specific needs.
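If you switch focus often, the reset-and-upload pattern can be wrapped in a small context manager. This is a sketch under the assumption that the chatbot offers the `reset_all()` and `upload_document()` methods shown above; `focused_on` is a hypothetical helper, and `StubChatbot` is a stand-in so the example runs without the course code.

```python
from contextlib import contextmanager


@contextmanager
def focused_on(chatbot, document_path):
    """Reset the chatbot, load a single document, and reset again on exit."""
    chatbot.reset_all()
    chatbot.upload_document(document_path)
    try:
        yield chatbot
    finally:
        # Leave a clean slate so later questions are not affected.
        chatbot.reset_all()


class StubChatbot:
    """Stand-in for RAGChatbot that only tracks which documents are loaded."""

    def __init__(self):
        self.documents = []

    def upload_document(self, path):
        self.documents.append(path)
        return "Document successfully processed."

    def send_message(self, question):
        return f"(answer drawn from {len(self.documents)} document(s))"

    def reset_all(self):
        self.documents.clear()
        return "Both conversation history and document knowledge have been reset."


bot = StubChatbot()
bot.upload_document("data/interplanetary_trade_agreement.pdf")

with focused_on(bot, "data/galactic_environmental_protection_pact.pdf") as focused:
    print(focused.send_message("What penalties exist for emissions violations?"))
```

Note the design choice: the helper does not restore the documents that were loaded before the `with` block. After a focused session you would need to upload the broader collection again before returning to comparative questions.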

---

## Conclusion

Congratulations! You've completed the final lesson in our course on building a RAG-powered chatbot with LangChain and Python. Throughout this course, you've built a complete RAG system from the ground up and learned how to apply it to practical document analysis tasks.

In this lesson, you've learned several key techniques for document analysis with RAG:

* Single document analysis for extracting specific information.
* Comparative analysis for identifying similarities and differences between documents.
* Comprehensive analysis for synthesizing information across multiple documents.
* Strategic knowledge base management for focused analysis.

The RAG architecture you've built is flexible and extensible, allowing you to adapt it to various use cases and document collections. Whether you're analyzing interplanetary agreements, legal contracts, research papers, or any other document collection, the techniques you've learned in this course will help you extract insights efficiently and effectively. Keep exploring, keep building, and keep pushing the boundaries of what's possible with RAG and LangChain!



## Querying a Single Interplanetary Agreement

Ready to test your interstellar document analysis capabilities? Let's put your RAG chatbot to work on decoding an important galactic treaty! 🚀🌌

Here's your mission:

- Upload the Interplanetary Trade Agreement document to your RAG chatbot.
- Query the document with a specific question like: "What are the standard tariff rates for goods under the agreement?"

This exercise will verify your document processing capabilities and demonstrate how your system extracts precise information from a single document, a critical skill for any interstellar diplomat. Launch your analysis and discover what this agreement holds! 🪐✨

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# File path for the "Interplanetary Trade Agreement"
trade_agreement = "data/interplanetary_trade_agreement.pdf"

# TODO: Upload the "Interplanetary Trade Agreement" document

# TODO: Ask a specific question about the tariff rates and print the response
```


```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# File path for the "Interplanetary Trade Agreement"
trade_agreement = "data/interplanetary_trade_agreement.pdf"

# Upload the "Interplanetary Trade Agreement" document
upload_result = chatbot.upload_document(trade_agreement)
print(f"Uploaded {trade_agreement}: {upload_result}")

# Ask a specific question about the tariff rates and print the response
question = "What are the standard tariff rates for goods under the agreement?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```


## Cosmic Treaty Comparison Challenge

Time to level up your cosmic intelligence gathering! Your RAG chatbot must now analyze the diplomatic nuances between two stellar accords. 🔭⚖️

Your mission:

- Upload both the Interplanetary Trade Agreement and Space Exploration Partnership into your RAG system.
- Challenge your AI with a question that bridges both galactic documents, such as: "How do the dispute resolution mechanisms differ between the trade agreement and the exploration partnership?"

This test will reveal if your chatbot can weave together information threads from separate star systems to form a cohesive analysis, which is crucial for navigating the complex web of interstellar relations. Can your creation identify the subtle differences between these cosmic compacts? The Galactic Council awaits your findings! 💫🛸

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# File path for the "Interplanetary Trade Agreement"
trade_agreement = "data/interplanetary_trade_agreement.pdf"

# File path for the "Space Exploration Partnership"
space_partnership = "data/space_exploration_partnership.pdf"

# TODO: Upload the "Interplanetary Trade Agreement" document

# TODO: Upload the "Space Exploration Partnership" document

# TODO: Ask a comparative question about the two documents and print the response

```

```python
from rag_chatbot import RAGChatbot

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# File path for the "Interplanetary Trade Agreement"
trade_agreement = "data/interplanetary_trade_agreement.pdf"

# File path for the "Space Exploration Partnership"
space_partnership = "data/space_exploration_partnership.pdf"

# Upload the "Interplanetary Trade Agreement" document
upload_result_1 = chatbot.upload_document(trade_agreement)
print(f"Uploaded {trade_agreement}: {upload_result_1}")

# Upload the "Space Exploration Partnership" document
upload_result_2 = chatbot.upload_document(space_partnership)
print(f"Uploaded {space_partnership}: {upload_result_2}")

# Ask a comparative question about the two documents and print the response
question = "How do the dispute resolution mechanisms differ between the trade agreement and the exploration partnership?"
response = chatbot.send_message(question)
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```


## Exploring the Document Multiverse

Prepare for your cosmic journey through the document multiverse! You've mastered single and comparative document analysis, but now it's time to traverse the entire constellation of files in your knowledge base. 🔍🌌

Your mission is to:

- Loop through all files in the `data` directory and process each document with your RAG system.
- Pose a stellar question like: "Which agreement contains provisions for certification requirements?"
- Print the response to verify if your chatbot can pinpoint the exact document containing the vital information, illuminating the path through the document cosmos.

Here's a hint on how to loop through files in a directory and process each document:

```python
for filename in os.listdir(directory_path):
```

This task will hone your ability to extract precise insights from a constellation of documents. The universe of knowledge awaits your exploration! 🛰️

```python
from rag_chatbot import RAGChatbot
import os

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Directory containing the documents
data_directory = "data"

# TODO: Iterate over all files in the data directory and upload each document

# TODO: Ask a specific question that requires information from any of the documents

# TODO: Print the response to verify if the system can pinpoint the exact document

```

```python
from rag_chatbot import RAGChatbot
import os

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Directory containing the documents
data_directory = "data"

# Iterate over all files in the data directory and upload each document
for filename in os.listdir(data_directory):
    file_path = os.path.join(data_directory, filename)
    if os.path.isfile(file_path) and filename.lower().endswith(('.pdf', '.txt', '.docx')):
        upload_result = chatbot.upload_document(file_path)
        print(f"Uploaded {filename}: {upload_result}")

# Ask a specific question that requires information from any of the documents
question = "Which agreement contains provisions for certification requirements?"
response = chatbot.send_message(question)

# Print the response to verify if the system can pinpoint the exact document
print(f"\nQuestion: {question}")
print(f"Answer: {response}")
```

## Analyzing Each Document in Isolation


Before returning to Earth with newfound knowledge, let's complete one last mission! You've expertly navigated through document analysis, and now it's time to explore each celestial document in your knowledge base with precision. 🎯🌌

For this mission you'll need to:

- Loop through all the files in the `data` directory.
- For each document:
  - Upload it to your RAG chatbot.
  - Send the defined query to the chatbot.
  - Print the response you receive.
  - Use the `reset_all()` method to clear both the conversation history and document knowledge before moving to the next document.

Complete this final task with precision and prepare to return to Earth, equipped with the skills to manage a knowledge base across the document universe! 🚀🌍

```python
from rag_chatbot import RAGChatbot
import os

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Directory containing the documents
data_directory = "data"

# Define a query to extract information from each document
query = "Identify the key parties involved in this document"

# TODO: Iterate over all files in the data directory
    # TODO: Upload each document
    # TODO: Send the query to the chatbot
    # TODO: Print the response
    # TODO: Reset everything after each document

```

```python
from rag_chatbot import RAGChatbot
import os

# Initialize the RAG chatbot
chatbot = RAGChatbot()

# Directory containing the documents
data_directory = "data"

# Define a query to extract information from each document
query = "Identify the key parties involved in this document"

# Iterate over all files in the data directory
for filename in os.listdir(data_directory):
    file_path = os.path.join(data_directory, filename)
    if os.path.isfile(file_path) and filename.lower().endswith(('.pdf', '.txt', '.docx')):
        # Upload each document
        upload_result = chatbot.upload_document(file_path)
        print(f"Uploaded {filename}: {upload_result}")
        
        # Send the query to the chatbot
        response = chatbot.send_message(query)
        
        # Print the response
        print(f"\nQuestion: {query}")
        print(f"Answer: {response}")
        
        # Reset everything after each document
        reset_result = chatbot.reset_all()
        print(f"Reset result: {reset_result}")
```
