# RAG System Demo

This notebook demonstrates how to use the Retrieval-Augmented Generation (RAG) system to query your documents.

## Setup

First, let's import the necessary functions and dependencies:


```python
import sys
sys.path.append('../')  # Adjust if needed to find the parent directory

from query_data import query_rag
from populate_database import main as populate_db, clear_database
```

## Initialize the Database

If you haven't already built the database, you can do so from the notebook:


```python
# Uncomment the next line if you want to reset the database
# clear_database()

# Build/update the database with documents from the data directory
populate_db()
```
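If you are not sure whether a database has already been built, a quick check of the persistence directory can help. The sketch below assumes the database is stored in a directory named `chroma` relative to the current working directory; the helper name `database_exists` is made up for this example, and the real location depends on how `populate_database` is configured.

```python
import os

def database_exists(path="chroma"):
    """Return True if `path` is a directory that contains at least one entry.

    The default path "chroma" is an assumption; adjust it to wherever
    your database is actually persisted.
    """
    return os.path.isdir(path) and len(os.listdir(path)) > 0

if database_exists():
    print("Database directory found.")
else:
    print("No database directory found; run populate_db() first.")
```

A non-empty directory only suggests that something was persisted there; it does not confirm that the stored documents are up to date.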

## Query Your Documents

Now let's ask some questions about the documents in our knowledge base:


```python
# Question about Monopoly rules
response = query_rag("How much money does each player start with in Monopoly?")
print("Response:", response)
```

Let's try another question:


```python
# Question about Ticket to Ride
response = query_rag("How many train cars do players start with in Ticket to Ride?")
print("Response:", response)
```

## Understanding the Results

Each response includes:
1. The answer generated by the LLM
2. Source references showing which document chunks were used
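
Source references are easier to scan when grouped by document. The sketch below assumes each chunk ID is a string of the form `path:page:chunk`; that format, the helper name `group_sources`, and the example IDs are all assumptions made for illustration, so check what your own database actually stores before relying on it.

```python
from collections import defaultdict

def group_sources(source_ids):
    """Group chunk IDs by document path and collect the page numbers.

    Assumes a hypothetical "path:page:chunk" ID format. IDs that are
    missing or do not match that shape are skipped.
    """
    grouped = defaultdict(set)
    for source_id in source_ids:
        if source_id is None:
            continue
        # Split from the right so colons inside the path stay intact
        parts = source_id.rsplit(":", 2)
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        path, page, _chunk = parts
        grouped[path].add(int(page))
    return {path: sorted(pages) for path, pages in grouped.items()}

# Made-up IDs, for illustration only
example_ids = ["data/monopoly.pdf:6:2", "data/monopoly.pdf:6:3", "data/monopoly.pdf:2:0", None]
print(group_sources(example_ids))
```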

Let's visualize the full pipeline, from document ingestion to the generated response:


```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import numpy as np

def visualize_rag_process():
    # Create a figure and axis
    fig, ax = plt.subplots(figsize=(10, 6))
    
    # Define the components
    components = ['Documents', 'Chunking', 'Embeddings', 'Vector DB', 'Query', 'Context', 'LLM', 'Response']
    y_positions = np.arange(len(components))
    
    # Plot the components
    ax.barh(y_positions, [0.8] * len(components), height=0.6, left=0.1, color='lightgrey', alpha=0.3)
    
    # Add labels
    for i, comp in enumerate(components):
        ax.text(0.5, i, comp, va='center', ha='center', fontsize=12)
    
    # Add flow arrows (annotate draws from xytext to xy, so the head lands on the next component)
    for i in range(len(components)-1):
        if i < 3:  # Document processing flow
            ax.annotate('', xy=(0.1, i+1), xytext=(0.9, i),
                        arrowprops=dict(facecolor='blue', shrink=0.05, width=1.5, headwidth=8))
        elif i == 3:  # Vector DB to Query
            ax.annotate('', xy=(0.1, i+1), xytext=(0.5, i),
                        arrowprops=dict(facecolor='green', shrink=0.05, width=1.5, headwidth=8))
        else:  # Query to Response flow
            ax.annotate('', xy=(0.1, i+1), xytext=(0.9, i),
                        arrowprops=dict(facecolor='red', shrink=0.05, width=1.5, headwidth=8))
    
    # Remove axes
    ax.set_yticks([])
    ax.set_xticks([])
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['bottom'].set_visible(False)
    ax.spines['left'].set_visible(False)
    
    plt.title('RAG System Process Flow', fontsize=14)
    plt.tight_layout()
    plt.show()

visualize_rag_process()
```
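
To make the retrieval step in the diagram more concrete, here is a toy version that runs without any database or model. It is a simplified sketch, not how this system is implemented: word counts stand in for real embeddings, a sorted list stands in for the vector database, and the chunk texts are invented for the example.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, embed(chunk)),
        reverse=True,
    )
    return ranked[:k]

# Invented chunks, for illustration only
chunks = [
    "each player starts with a fixed amount of money",
    "players draw train cards at the start of a turn",
    "the banker hands out money to each player",
]

top = retrieve("how much money does each player start with", chunks, k=2)

# The retrieved chunks are joined into the context that would be passed to the LLM
context = "\n\n---\n\n".join(top)
print(context)
```

The two money-related chunks rank highest because they share the most words with the question. A real embedding model captures meaning rather than exact word overlap, but the ranking step works the same way.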

## Customizing the System

You can customize the RAG system by changing how many chunks are retrieved (`num_chunks`), the prompt template, and the model's sampling `temperature`:


```python
# Example: Custom RAG query with more context chunks
from langchain_community.vectorstores import Chroma
from langchain.prompts import ChatPromptTemplate
from langchain_community.llms.ollama import Ollama
from get_embedding_function import get_embedding_function

CHROMA_PATH = "chroma"

def custom_query_rag(query_text, num_chunks=8, temperature=0.1):
    # Prepare the DB
    embedding_function = get_embedding_function()
    db = Chroma(persist_directory=CHROMA_PATH, embedding_function=embedding_function)
    
    # Search the DB with custom parameters
    results = db.similarity_search_with_score(query_text, k=num_chunks)
    
    # Create context from results
    context_text = "\n\n---\n\n".join([doc.page_content for doc, _score in results])
    
    # Custom prompt
    custom_prompt = """
    You are an expert research assistant. Answer the question accurately based on the provided context.
    If the answer is not in the context, say "I don't have enough information to answer this question."
    
    CONTEXT:
    {context}
    
    QUESTION: {question}
    
    ANSWER:
    """
    
    prompt_template = ChatPromptTemplate.from_template(custom_prompt)
    prompt = prompt_template.format(context=context_text, question=query_text)
    
    # Get response with custom temperature
    model = Ollama(model="mistral", temperature=temperature)
    response_text = model.invoke(prompt)
    
    # Get sources
    sources = [doc.metadata.get("id", None) for doc, _score in results]
    
    return {
        "response": response_text,
        "sources": sources,
        "num_chunks_used": len(results)
    }

# Try the custom function
result = custom_query_rag("What happens when you land on Free Parking?", num_chunks=10)
print(f"Response: {result['response']}\n")
print(f"Number of chunks used: {result['num_chunks_used']}")
print(f"Sources: {result['sources']}")
```
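
Retrieving more chunks can also pull in weakly related text. One option is to drop results whose score is too poor before building the context. The sketch below assumes the score returned alongside each document is a distance, where lower means more similar; verify this for your vector store configuration. `FakeDoc`, `filter_by_distance`, the threshold, and the scores are all made up for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document chunk."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def filter_by_distance(results, max_distance):
    """Keep (doc, score) pairs whose score is at most max_distance.

    Assumes lower scores mean closer matches.
    """
    return [(doc, score) for doc, score in results if score <= max_distance]

# Made-up results in the same (doc, score) shape used above
fake_results = [
    (FakeDoc("first chunk", {"id": "doc-a"}), 0.21),
    (FakeDoc("second chunk", {"id": "doc-b"}), 0.58),
    (FakeDoc("third chunk", {"id": "doc-c"}), 0.44),
]

kept = filter_by_distance(fake_results, max_distance=0.5)
print([doc.metadata["id"] for doc, _score in kept])
```

Inside `custom_query_rag`, a filter like this would sit between the similarity search and the step that joins chunks into `context_text`. A sensible threshold depends on the embedding model and distance metric, so it is worth printing a few real scores before choosing one.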

## Conclusion

This notebook showed how to:
1. Initialize the RAG database
2. Query documents with the standard function
3. Visualize the RAG process
4. Create a custom query function with different parameters

You can extend this system by adding more documents to the `data` directory and rebuilding the database.