# Supervisor-Worker RAG Chatbot Demonstration

This notebook demonstrates a RAG (Retrieval-Augmented Generation) chatbot built with LangGraph using a Supervisor-Worker architecture. The chatbot answers questions about the University of Chicago's MS in Applied Data Science program.

## Setup

First, let's import the required libraries and set up our environment.

In [1]:
import os
import sys
import json

# For notebook display and visualization
from IPython.display import display, HTML, Image
import matplotlib.pyplot as plt

# Add the parent directory to the path so the src package can be imported
sys.path.append(os.path.abspath('..'))

# Import our RAG chatbot implementation
from src.rag_chatbot import RAGChatbot

## Set OpenAI API Key

Make sure to set your OpenAI API key. You can either set it as an environment variable or directly in this notebook.

In [None]:
# Option 1: Set the API key directly (replace the placeholder with your actual key,
# and avoid committing a real key to version control)
os.environ["OPENAI_API_KEY"] = "app keys here"

# Option 2: Load the key from a .env file (requires the python-dotenv package)
# from dotenv import load_dotenv
# load_dotenv()  # Loads environment variables from a .env file if present

# Verify the API key is set (a missing or empty value counts as not set)
if not os.environ.get("OPENAI_API_KEY"):
    print("⚠️ Warning: OPENAI_API_KEY is not set. Please set it before proceeding.")
else:
    print("✓ OPENAI_API_KEY is set!")

✓ OPENAI_API_KEY is set!
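
Because Option 1 writes a placeholder string into the environment, a check for "set and non-empty" reports success even when no real key has been supplied. A minimal sketch of a stricter check follows. The helper name `has_usable_api_key` and the list of placeholder values are assumptions made for illustration; they are not part of the project.

```python
import os

# Hypothetical placeholder values that should not count as a real key
PLACEHOLDERS = {"", "app keys here", "your-api-key-here"}

def has_usable_api_key(env=None, name="OPENAI_API_KEY"):
    """Return True only if the variable holds something other than a placeholder."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    return value.lower() not in PLACEHOLDERS

# Usage with explicit dictionaries, so the real environment is not read or changed
print(has_usable_api_key({"OPENAI_API_KEY": "app keys here"}))
print(has_usable_api_key({"OPENAI_API_KEY": "sk-example"}))
```

The first call treats the placeholder as missing and the second accepts the non-placeholder value. This only guards against forgetting to replace the placeholder; it does not confirm that the key is valid with OpenAI.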


## Initialize the Chatbot

Now let's initialize our RAG chatbot that uses the Supervisor-Worker architecture with LangGraph.

In [3]:
# Initialize the chatbot with GPT-4o
# You can also use 'gpt-4' if you have access to it
chatbot = RAGChatbot(model="gpt-4o")

print("RAG Chatbot initialized!")

RAG Chatbot initialized!
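
The internals of `RAGChatbot` are not shown in this notebook, so the following is only a plain-Python sketch of what a Supervisor-Worker loop can look like: a supervisor inspects shared state and decides which worker runs next. The `State` fields, the worker bodies, and the routing rules are all assumptions for illustration. The hand-offs (Retrieve passes directly to Generate, Summarize ends the run) mirror the workflow paths listed in this notebook's fallback text representation.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """Hypothetical shared state passed between the supervisor and its workers."""
    query: str
    documents: list = field(default_factory=list)
    answer: str = ""
    trace: list = field(default_factory=list)

def retrieve(state):
    # Stand-in for a vector-store lookup
    state.documents = [f"document about: {state.query}"]

def generate(state):
    # Stand-in for an LLM call that uses the retrieved documents
    state.answer = f"Answer based on {len(state.documents)} document(s)."

def summarize(state, max_chars=40):
    # Stand-in for shortening an overly long answer
    state.answer = state.answer[:max_chars]

def supervisor(state, max_answer_chars=40):
    """Pick the next step; these routing rules are illustrative assumptions."""
    if not state.answer:
        return "generate" if state.documents else "retrieve"
    if len(state.answer) > max_answer_chars:
        return "summarize"
    return "done"

def run(query):
    state = State(query=query)
    while True:
        step = supervisor(state)
        state.trace.append(step)
        if step == "retrieve":
            retrieve(state)
            generate(state)  # Retrieve hands off directly to Generate
            state.trace.append("generate")
        elif step == "generate":
            generate(state)
        elif step == "summarize":
            summarize(state)  # Summarize ends the run
            state.trace.append("done")
            return state
        else:
            return state

result = run("What are the core courses?")
print(result.trace)
```

In a LangGraph implementation, each function would typically be a graph node and the supervisor's return value would drive a conditional edge. The sketch keeps that control flow visible without depending on the library.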


## Visualize the LangGraph Workflow

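Next, we try to render the workflow as a Mermaid diagram. If the installed LangGraph version does not provide the helpers imported below, the cell falls back to a plain-text list of the workflow paths.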

In [4]:
try:
    # Attempt to draw the graph if LangGraph supports it
    from langgraph.graph import get_graph_representation, get_mermaid
    
    # Get the graph from our implementation
    graph = chatbot.graph.graph
    
    # Generate mermaid representation
    mermaid_representation = get_mermaid(graph)
    
    # Display as mermaid diagram
    display(HTML(f"""
    <div class="mermaid">
    {mermaid_representation}
    </div>
    <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
    <script>mermaid.initialize({{startOnLoad:true}});</script>
    """))
except Exception as e:
    print(f"Could not generate graph visualization: {e}")
    print("\nFallback text representation:")
    print("Supervisor → Retrieve → Generate → Supervisor")
    print("Supervisor → Generate → Supervisor")
    print("Supervisor → Summarize → Done")
    print("Supervisor → Done")

Could not generate graph visualization: cannot import name 'get_graph_representation' from 'langgraph.graph' (c:\Users\alen.pavlovic\Documents\GitLab\gen-ai-midterm-project\venv\Lib\site-packages\langgraph\graph\__init__.py)

Fallback text representation:
Supervisor → Retrieve → Generate → Supervisor
Supervisor → Generate → Supervisor
Supervisor → Summarize → Done
Supervisor → Done
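
The recorded output shows that the import fails in this environment, so only the text fallback is printed. As a library-independent alternative, the sketch below turns those four fallback paths into Mermaid flowchart source. The function name `paths_to_mermaid` is hypothetical. Separately, recent LangGraph releases expose `get_graph().draw_mermaid()` on a compiled graph, which may be worth trying if `chatbot.graph.graph` is a compiled graph; that is an assumption about this project's internals.

```python
# The four paths listed in the fallback text representation
PATHS = [
    ["Supervisor", "Retrieve", "Generate", "Supervisor"],
    ["Supervisor", "Generate", "Supervisor"],
    ["Supervisor", "Summarize", "Done"],
    ["Supervisor", "Done"],
]

def paths_to_mermaid(paths):
    """Convert node paths into Mermaid flowchart source, skipping duplicate edges."""
    lines = ["graph TD"]
    seen = set()
    for path in paths:
        # Pair each node with its successor to get the edges of the path
        for src, dst in zip(path, path[1:]):
            if (src, dst) not in seen:
                seen.add((src, dst))
                lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)

mermaid_source = paths_to_mermaid(PATHS)
print(mermaid_source)
```

The resulting string can be pasted into any Mermaid renderer, or embedded in the same `<div class="mermaid">` block the cell above uses.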


## Test the RAG Chatbot

Let's test our chatbot with some sample questions about the MS in Applied Data Science program.

In [5]:
import traceback

def formatted_chat(query, stream=False):
    """Send a query to the chatbot, print the exchange, and return the response."""
    print(f"🧑 User: {query}")
    print("\n")

    try:
        if stream:
            # In streaming mode, chat() also returns the intermediate workflow steps
            response, steps = chatbot.chat(query, stream=True)
            print(f"🔍 Steps: {steps}\n")
        else:
            response = chatbot.chat(query)

        # If the response is a non-empty list ending in a message dict, show only its content
        if isinstance(response, list) and response and isinstance(response[-1], dict):
            answer = response[-1].get('content', 'No content')
        else:
            answer = response
        print(f"🤖 Assistant: {answer}\n")
    except Exception as e:
        print(f"Error: {e}")
        traceback.print_exc()
        return None

    print("-" * 80)
    return response

In [6]:
# Example 1: Basic program information
query1 = "What is the MS in Applied Data Science program about?"
formatted_chat(query1)

🧑 User: What is the MS in Applied Data Science program about?


🤖 Assistant: 

--------------------------------------------------------------------------------


''

In [7]:
# Example 2: Course requirements
query2 = "What are the core courses for the program?"
formatted_chat(query2)

🧑 User: What are the core courses for the program?


Retrieved existing collection 'uchicago_ms_applied_ds_header_chunks'
Performing direct course search due to poor retrieval results
Found 16 course-related documents through direct search
🤖 Assistant: The Master’s in Applied Data Science program at the University of Chicago requires six core courses to enhance theoretical and practical skills in data science:

1. **Time Series Analysis and Forecasting** - Predicts future trends from past data.
2. **Statistical Models for Data Science** - Explores linear models and statistical methods.
3. **Machine Learning I** - Introduces machine learning techniques and algorithms.
4. **Machine Learning II** - Focuses on Deep Learning and Generative AI.
5. **Data Engineering Platforms for Analytics or Big Data and Cloud Computing** - Covers data engineering or big data methods.
6. **Leadership and Consulting for Data Science** - Promotes business understanding and project management.

These courses prepare students for diverse data science challenges in various industries.

'The Master’s in Applied Data Science program at the University of Chicago requires six core courses to enhance theoretical and practical skills in data science:\n\n1. **Time Series Analysis and Forecasting** - Predicts future trends from past data.\n2. **Statistical Models for Data Science** - Explores linear models and statistical methods.\n3. **Machine Learning I** - Introduces machine learning techniques and algorithms.\n4. **Machine Learning II** - Focuses on Deep Learning and Generative AI.\n5. **Data Engineering Platforms for Analytics or Big Data and Cloud Computing** - Covers data engineering or big data methods.\n6. **Leadership and Consulting for Data Science** - Promotes business understanding and project management.\n\nThese courses prepare students for diverse data science challenges in various industries.'

In [None]:
# Example 3: Follow-up question (demonstrates conversation memory)
query3 = "How long does it take to complete these courses?"
formatted_chat(query3)

In [None]:
# Example 4: Question requiring summarization (potentially lengthy answer)
query4 = "What career opportunities are available after completing this program? Please provide detailed examples."
formatted_chat(query4)

In [14]:
# Example 5: Question about applying to the joint MBA/MS program
query5 = "How do I apply to the MBA/MS program?"
formatted_chat(query5)

🧑 User: How do I apply to the MBA/MS program?


🤖 Assistant: I'm sorry, but I do not have specific information on how to apply to the MBA/MS program. I recommend visiting the official program website or contacting the admissions office directly to get detailed application instructions.

--------------------------------------------------------------------------------


"I'm sorry, but I do not have specific information on how to apply to the MBA/MS program. I recommend visiting the official program website or contacting the admissions office directly to get detailed application instructions."

## Debugging Mode with Step-by-Step Execution

For debugging purposes, we can run the chatbot in streaming mode to see the step-by-step execution of the LangGraph workflow.

In [None]:
# Stream mode with detailed steps
debug_query = "What are the admission requirements for the program?"
formatted_chat(debug_query, stream=True)
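
The structure of the steps returned in streaming mode is not shown in this notebook. Assuming each step is a dictionary that maps a node name to the state update that node produced (a hypothetical shape, chosen because it is a common way to report graph execution), a small helper like the one below could print a compact trace. The name `format_steps` and the sample data are invented for illustration.

```python
def format_steps(steps, max_chars=60):
    """Render each step as 'index. node: preview', truncating long updates."""
    lines = []
    for index, step in enumerate(steps, start=1):
        for node, update in step.items():
            text = str(update)
            if len(text) > max_chars:
                text = text[:max_chars] + "..."
            lines.append(f"{index}. {node}: {text}")
    return lines

# Hypothetical sample data, invented for illustration only
sample_steps = [
    {"supervisor": {"next": "retrieve"}},
    {"retrieve": {"documents": 3}},
    {"generate": {"answer": "x" * 100}},
]

for line in format_steps(sample_steps):
    print(line)
```

If the real steps have a different shape, only the inner loop needs to change; the truncation keeps long retrieved passages from flooding the output.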

## Reset Conversation

We can reset the conversation history if needed.

In [None]:
# Reset the conversation
chatbot.reset()
print("Conversation has been reset.")

## Multi-turn Conversation Test

Let's test a multi-turn conversation to see how the chatbot maintains context.

In [None]:
# Start a new conversation after reset
formatted_chat("Tell me about the online program options.")

In [None]:
# Follow-up question
formatted_chat("What's the difference between online and on-campus programs?")

In [None]:
# Another follow-up
formatted_chat("Do online students get the same degree?")
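
How `RAGChatbot` keeps context between turns is not shown here. As a rough mental model, conversation memory can be a bounded list of `{"role", "content"}` messages that is replayed into each prompt and cleared on reset. The class below is a hypothetical sketch of that idea, not the project's implementation; `ConversationMemory`, `max_messages`, and `as_prompt` are assumed names.

```python
class ConversationMemory:
    """Hypothetical in-memory history of {'role', 'content'} messages."""

    def __init__(self, max_messages=20):
        self.max_messages = max_messages
        self._messages = []

    def add(self, role, content):
        self._messages.append({"role": role, "content": content})
        # Keep only the most recent messages so the prompt stays bounded
        self._messages = self._messages[-self.max_messages:]

    def get_conversation_history(self):
        # Return copies so callers cannot mutate the stored history
        return [dict(m) for m in self._messages]

    def reset(self):
        self._messages.clear()

    def as_prompt(self):
        """Flatten the history into text that could precede a new question."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in self._messages)

memory = ConversationMemory(max_messages=4)
memory.add("user", "Tell me about the online program options.")
memory.add("assistant", "(answer about online options)")
memory.add("user", "Do online students get the same degree?")
print(memory.as_prompt())
```

Because earlier turns are included in the prompt, a follow-up such as "Do online students get the same degree?" can be interpreted against the preceding exchange. Calling `reset()` drops that context.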

## View Conversation History

We can view the full conversation history.

In [None]:
# Get the conversation history
history = chatbot.get_conversation_history()

# Display it nicely
for i, message in enumerate(history, start=1):
    role = message["role"]
    content = message["content"]

    # Show at most the first 50 characters of each message
    preview = f"{content[:50]}..." if len(content) > 50 else content
    speaker = "🧑 User" if role == "user" else "🤖 Assistant"
    print(f"Message {i} - {speaker}: {preview}")