# Supervisor-Worker RAG Chatbot Demonstration

This notebook demonstrates a RAG (Retrieval-Augmented Generation) chatbot built with LangGraph using a Supervisor-Worker architecture. The chatbot is designed to answer questions about the University of Chicago's MS in Applied Data Science program.

## Setup

First, let's import the required libraries and set up our environment.

In [1]:
import os
import sys
import json

# For visualization
import matplotlib.pyplot as plt
from IPython.display import HTML, Image, display

# Add the project root (the parent directory) to the path so that `src` is importable
sys.path.append(os.path.abspath('..'))

# Import our RAG chatbot implementation
from src.rag_chatbot import RAGChatbot

## Set OpenAI API Key

Set your OpenAI API key before continuing, either as an environment variable or directly in this notebook. If you set it in the notebook, avoid committing the real key to version control.

In [None]:
# Option 1: Set the API key directly (replace the placeholder with your actual key)
os.environ["OPENAI_API_KEY"] = "api_keys here"

# Option 2: Load the key from a .env file (requires the python-dotenv package)
# from dotenv import load_dotenv
# load_dotenv()  # Loads environment variables from a .env file if present

# Verify the API key is set
if "OPENAI_API_KEY" not in os.environ or not os.environ["OPENAI_API_KEY"]:
    print("⚠️ Warning: OPENAI_API_KEY is not set. Please set it before proceeding.")
else:
    print("✓ OPENAI_API_KEY is set!")

✓ OPENAI_API_KEY is set!
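
To keep the key out of the notebook file entirely, one option is to read it from the environment first and prompt for it only when it is missing. The following is a minimal sketch under that assumption; `resolve_api_key` is a hypothetical helper and not part of the project's code.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, name="OPENAI_API_KEY", prompt=getpass):
    """Return the key from `env`, prompting for it only when it is missing."""
    key = env.get(name, "").strip()
    if not key:
        # Nothing usable in the environment: ask interactively (input is hidden).
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} was not provided")
        # Store the key so later cells and libraries can find it.
        env[name] = key
    return key
```

Calling `resolve_api_key()` in a cell would then prompt only when the variable is unset, and the key never appears in the saved notebook.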


## Initialize the Chatbot

Now let's initialize our RAG chatbot that uses the Supervisor-Worker architecture with LangGraph.

In [3]:
# Initialize the chatbot with GPT-4o
# You can also use 'gpt-4' if you have access to it
chatbot = RAGChatbot(model="gpt-4o")

print("RAG Chatbot initialized!")

RAG Chatbot initialized!


## Visualize the LangGraph Workflow

Here we try to render the workflow as a Mermaid diagram. If the rendering helpers cannot be imported, the cell falls back to a plain-text representation of the graph.

In [4]:
try:
    # Attempt to draw the graph if LangGraph supports it
    from langgraph.graph import get_graph_representation, get_mermaid
    
    # Get the graph from our implementation
    graph = chatbot.graph.graph
    
    # Generate mermaid representation
    mermaid_representation = get_mermaid(graph)
    
    # Display as mermaid diagram
    display(HTML(f"""
    <div class="mermaid">
    {mermaid_representation}
    </div>
    <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
    <script>mermaid.initialize({{startOnLoad:true}});</script>
    """))
except Exception as e:
    print(f"Could not generate graph visualization: {e}")
    print("\nFallback text representation:")
    print("Supervisor → Retrieve → Generate → Supervisor")
    print("Supervisor → Generate → Supervisor")
    print("Supervisor → Summarize → Done")
    print("Supervisor → Done")

Could not generate graph visualization: cannot import name 'get_graph_representation' from 'langgraph.graph' (c:\Users\alen.pavlovic\Documents\GitLab\gen-ai-midterm-project\venv\Lib\site-packages\langgraph\graph\__init__.py)

Fallback text representation:
Supervisor → Retrieve → Generate → Supervisor
Supervisor → Generate → Supervisor
Supervisor → Summarize → Done
Supervisor → Done
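
The import fails because `langgraph.graph` does not expose those helper names in the installed version. Recent LangGraph releases offer `get_graph().draw_mermaid()` on compiled graphs, but whether `chatbot.graph` is such an object is not shown in this notebook. As a library-independent alternative, the sketch below builds a Mermaid definition directly from the fallback paths printed above; `PATHS` and `paths_to_mermaid` are hypothetical names introduced for illustration.

```python
# Paths copied from the fallback text representation above.
PATHS = [
    ["Supervisor", "Retrieve", "Generate", "Supervisor"],
    ["Supervisor", "Generate", "Supervisor"],
    ["Supervisor", "Summarize", "Done"],
    ["Supervisor", "Done"],
]


def paths_to_mermaid(paths):
    """Build a Mermaid flowchart definition from node paths, skipping duplicate edges."""
    lines = ["graph TD"]
    seen = set()
    for path in paths:
        # Pair each node with its successor to get the edges of this path.
        for src, dst in zip(path, path[1:]):
            if (src, dst) not in seen:
                seen.add((src, dst))
                lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


print(paths_to_mermaid(PATHS))
```

The resulting string can be placed inside the `<div class="mermaid">` block used in the cell above.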


## Test the RAG Chatbot

Let's test our chatbot with some sample questions about the MS in Applied Data Science program.

In [5]:
def formatted_chat(query, stream=False):
    """Format the chat nicely for display"""
    print(f"🧑 User: {query}")
    print("\n")
    
    # Get the response
    try:
        if stream:
            response, steps = chatbot.chat(query, stream=True)
        else:
            response = chatbot.chat(query)
            
        # Format the response appropriately
        if isinstance(response, list) and len(response) > 0:
            if isinstance(response[-1], dict):
                print(f"🤖 Assistant: {response[-1].get('content', 'No content')}\n")
            else:
                print(f"🤖 Assistant: {response}\n")
        else:
            print(f"🤖 Assistant: {response}\n")
    except Exception as e:
        print(f"Error: {str(e)}")
        import traceback
        traceback.print_exc()
        return None
    
    print("-" * 80)
    return response
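
The branching in `formatted_chat` exists because `chatbot.chat` may return either a plain string or a list of message dicts. A minimal sketch that isolates this normalization into one function is shown here; `extract_text` is a hypothetical helper that mirrors the branches above and is not part of the project.

```python
def extract_text(response):
    """Return the assistant text from the response shapes handled by formatted_chat."""
    if isinstance(response, list) and response:
        last = response[-1]
        if isinstance(last, dict):
            # A list of message dicts: use the content of the most recent message.
            return last.get("content", "No content")
    # A plain string, an empty list, or any other shape: fall back to its string form.
    return str(response)
```

Separating extraction from printing makes the display logic easy to check without calling the model.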

In [6]:
# Example 1: Basic program information
query1 = "What is the MS in Applied Data Science program about?"
formatted_chat(query1)

🧑 User: What is the MS in Applied Data Science program about?


🤖 Assistant: I'm sorry, but I don't have the information about the MS in Applied Data Science program based on the documents available. If you need detailed information about the program, I recommend contacting the university's admissions office or visiting their official website for accurate and up-to-date details.

--------------------------------------------------------------------------------


"I'm sorry, but I don't have the information about the MS in Applied Data Science program based on the documents available. If you need detailed information about the program, I recommend contacting the university's admissions office or visiting their official website for accurate and up-to-date details."

In [7]:
# Example 2: Course requirements
query2 = "What are the core courses for the program?"
formatted_chat(query2)

🧑 User: What are the core courses for the program?


Retrieved existing collection 'uchicago_ms_applied_ds_header_chunks'
Performing direct course search due to poor retrieval results
Found 16 course-related documents through direct search
🤖 Assistant: The core courses for the Master's in Applied Data Science program at the University of Chicago consist of six courses. These courses are designed to build both theoretical knowledge of data science and practical skills to apply this knowledge to solve real-world business problems. The identified core courses in the program are as follows:

1. Time Series Analysis and Forecasting
2. Statistical Models for Data Science
3. Machine Learning I
4. Machine Learning II
5. Either Data Engineering Platforms for Analytics or Big Data and Cloud Computing
6. Leadership and Consulting for Data Science

These courses collectively prepare students for various challenges in the data science field by covering key topics from predictive analytics and stati

"The core courses for the Master's in Applied Data Science program at the University of Chicago consist of six courses. These courses are designed to build both theoretical knowledge of data science and practical skills to apply this knowledge to solve real-world business problems. The identified core courses in the program are as follows:\n\n1. Time Series Analysis and Forecasting\n2. Statistical Models for Data Science\n3. Machine Learning I\n4. Machine Learning II\n5. Either Data Engineering Platforms for Analytics or Big Data and Cloud Computing\n6. Leadership and Consulting for Data Science\n\nThese courses collectively prepare students for various challenges in the data science field by covering key topics from predictive analytics and statistical modeling to practical machine learning applications and data engineering."

In [None]:
# Example 3: Follow-up question (demonstrates conversation memory)
query3 = "How long does it take to complete these courses?"
formatted_chat(query3)

In [None]:
# Example 4: Question requiring summarization (potentially lengthy answer)
query4 = "What career opportunities are available after completing this program? Please provide detailed examples."
formatted_chat(query4)

In [8]:
# Example 5: Tuition cost
query4 = "What is tuition cost for the MS in Applied Data Science?"
formatted_chat(query4)

🧑 User: What is tuition cost for the MS in Applied Data Science?


Performing direct tuition information search
Found 15 tuition-related documents
🤖 Assistant: The tuition cost for the MS in Applied Data Science program at the University of Chicago is $5,967 per course, with a total tuition of $71,604 for the entire program. Additionally, there is a non-refundable program enrollment deposit of $1,500, which is credited toward the first quarter’s tuition balance.

Please note that tuition rates are subject to annual increases of 3-7%. For detailed information on all costs associated with the program, including potential scholarships and financial aid options, it is recommended to visit the university's official resources or directly consult with the admissions office.

--------------------------------------------------------------------------------


"The tuition cost for the MS in Applied Data Science program at the University of Chicago is $5,967 per course, with a total tuition of $71,604 for the entire program. Additionally, there is a non-refundable program enrollment deposit of $1,500, which is credited toward the first quarter’s tuition balance.\n\nPlease note that tuition rates are subject to annual increases of 3-7%. For detailed information on all costs associated with the program, including potential scholarships and financial aid options, it is recommended to visit the university's official resources or directly consult with the admissions office."

In [9]:
query4 = "What scholarships are available for the program?"
formatted_chat(query4)

🧑 User: What scholarships are available for the program?


Performing direct tuition information search
Found 17 tuition-related documents
🤖 Assistant: The MS in Applied Data Science program at the University of Chicago offers several scholarship opportunities. The key scholarships available are:

1. **Data Science Institute Scholarship**: Partial tuition scholarships are awarded to top applicants based on merit. No separate application is required for these scholarships, but early application submission is recommended to increase the chances of securing a scholarship.

2. **MS in Applied Data Science Alumni Scholarship**: Similar to the Data Science Institute Scholarship, it provides partial tuition coverage to deserving applicants.

Additionally, applicants are automatically considered for merit scholarships when they apply to the program. Submitting an application early is encouraged to improve the likelihood of receiving a scholarship.

Beyond these program-specific scholarships, s

"The MS in Applied Data Science program at the University of Chicago offers several scholarship opportunities. The key scholarships available are:\n\n1. **Data Science Institute Scholarship**: Partial tuition scholarships are awarded to top applicants based on merit. No separate application is required for these scholarships, but early application submission is recommended to increase the chances of securing a scholarship.\n\n2. **MS in Applied Data Science Alumni Scholarship**: Similar to the Data Science Institute Scholarship, it provides partial tuition coverage to deserving applicants.\n\nAdditionally, applicants are automatically considered for merit scholarships when they apply to the program. Submitting an application early is encouraged to improve the likelihood of receiving a scholarship.\n\nBeyond these program-specific scholarships, students are advised to explore external scholarships offered by civic and professional organizations, foundations, and state agencies. A useful

In [10]:
query4 = "What are the minimum scores for the TOEFL and IELTS English Language Requirement?"
formatted_chat(query4)

🧑 User: What are the minimum scores for the TOEFL and IELTS English Language Requirement?


Performing direct course search due to poor retrieval results
Found 8 course-related documents through direct search
🤖 Assistant: The minimum English language proficiency scores required for the Master’s in Applied Data Science program at the University of Chicago are as follows:

- **TOEFL**: A minimum overall score of 102 is required, with no subscore requirement.
- **IELTS**: A minimum overall score of 7 is required, also with no subscore requirement.

These scores are necessary for applicants who do not meet the English Language Proficiency criteria and must be submitted as proof of proficiency. Please note that the program requires the IELTS Academic Reading/Writing test, not the General Training one. Additionally, TOEFL or IELTS score reports must be valid and are considered expired if taken more than two years prior to the application submission. Photocopies or PDFs of these score reports

"The minimum English language proficiency scores required for the Master’s in Applied Data Science program at the University of Chicago are as follows:\n\n- **TOEFL**: A minimum overall score of 102 is required, with no subscore requirement.\n- **IELTS**: A minimum overall score of 7 is required, also with no subscore requirement.\n\nThese scores are necessary for applicants who do not meet the English Language Proficiency criteria and must be submitted as proof of proficiency. Please note that the program requires the IELTS Academic Reading/Writing test, not the General Training one. Additionally, TOEFL or IELTS score reports must be valid and are considered expired if taken more than two years prior to the application submission. Photocopies or PDFs of these score reports will not be accepted in lieu of an official report. \n\nFurther details and the waiver policy can be reviewed on the university's [English Language Requirements page](https://grad.uchicago.edu/admissions/apply/engli

In [12]:
query4 = "Is there an application fee waiver?"
formatted_chat(query4)

🧑 User: Is there an application fee waiver?


Performing direct tuition information search
Found 14 tuition-related documents
🤖 Assistant: For specific inquiries regarding an application fee waiver for the Master's in Applied Data Science program, applicants are advised to refer to the Physical Sciences Division fee waiver policy through the provided link: [Physical Sciences Division fee waiver policy](https://physicalsciences.uchicago.edu/academics/admissions/application-requirements/#FeeWaiver). This source will contain the most accurate and detailed information on eligibility and application procedures for a fee waiver.

--------------------------------------------------------------------------------


"For specific inquiries regarding an application fee waiver for the Master's in Applied Data Science program, applicants are advised to refer to the Physical Sciences Division fee waiver policy through the provided link: [Physical Sciences Division fee waiver policy](https://physicalsciences.uchicago.edu/academics/admissions/application-requirements/#FeeWaiver). This source will contain the most accurate and detailed information on eligibility and application procedures for a fee waiver."

In [13]:
query4 = "What are the deadlines for the in-person program?"
formatted_chat(query4)

🧑 User: What are the deadlines for the in-person program?


🤖 Assistant: The retrieved documents do not contain specific information about the deadlines for the in-person program of the Applied Data Science program. Therefore, I cannot provide details based on the available sources. For accurate and reliable information, I recommend checking directly on the university's official website or contacting the admissions office for guidance.

--------------------------------------------------------------------------------


"The retrieved documents do not contain specific information about the deadlines for the in-person program of the Applied Data Science program. Therefore, I cannot provide details based on the available sources. For accurate and reliable information, I recommend checking directly on the university's official website or contacting the admissions office for guidance."

In [14]:
query4 = "How long will it take for me to receive a decision on my application?"
formatted_chat(query4)

🧑 User: How long will it take for me to receive a decision on my application?


🤖 Assistant: The retrieved documents do not contain specific information regarding the timeline for receiving a decision on your application for the Applied Data Science program. Therefore, I cannot provide an answer based on the available sources. For detailed and accurate information, I suggest checking the program's official website or contacting the admissions office directly.

--------------------------------------------------------------------------------


"The retrieved documents do not contain specific information regarding the timeline for receiving a decision on your application for the Applied Data Science program. Therefore, I cannot provide an answer based on the available sources. For detailed and accurate information, I suggest checking the program's official website or contacting the admissions office directly."

In [15]:
query4 = "Can I set up an advising appointment with the enrollment management team?"
formatted_chat(query4)

🧑 User: Can I set up an advising appointment with the enrollment management team?


🤖 Assistant: The retrieved documents do not contain specific information about setting up an advising appointment with the enrollment management team for the Applied Data Science program. Therefore, I cannot provide an answer based on the available sources. I recommend visiting the university's official website or reaching out directly to the program's admissions office or enrollment management team for guidance on scheduling an advising appointment.

--------------------------------------------------------------------------------


"The retrieved documents do not contain specific information about setting up an advising appointment with the enrollment management team for the Applied Data Science program. Therefore, I cannot provide an answer based on the available sources. I recommend visiting the university's official website or reaching out directly to the program's admissions office or enrollment management team for guidance on scheduling an advising appointment."

In [16]:
query4 = "Where can I mail my official transcripts?"
formatted_chat(query4)

🧑 User: Where can I mail my official transcripts?


🤖 Assistant: The retrieved documents do not contain information about where you can mail your official transcripts for the Applied Data Science program. I recommend visiting the program's official website or contacting the admissions office directly for accurate mailing instructions.

--------------------------------------------------------------------------------


"The retrieved documents do not contain information about where you can mail your official transcripts for the Applied Data Science program. I recommend visiting the program's official website or contacting the admissions office directly for accurate mailing instructions."

In [17]:
query4 = "Does the Master’s in Applied Data Science Online program provide visa sponsorship?"
formatted_chat(query4)

🧑 User: Does the Master’s in Applied Data Science Online program provide visa sponsorship?


🤖 Assistant: The Master’s in Applied Data Science Online program does not provide visa sponsorship. Only the full-time, in-person program is eligible for visa sponsorship. If you require visa sponsorship, you are encouraged to apply to the in-person program instead.

--------------------------------------------------------------------------------


'The Master’s in Applied Data Science Online program does not provide visa sponsorship. Only the full-time, in-person program is eligible for visa sponsorship. If you require visa sponsorship, you are encouraged to apply to the in-person program instead.'

In [18]:
query4 = "How do I apply to the MBA/MS program?"
formatted_chat(query4)

🧑 User: How do I apply to the MBA/MS program?


🤖 Assistant: The retrieved documents do not contain information on how to apply to the MBA/MS program. I recommend visiting the university's official website for the Applied Data Science program or contacting the admissions office directly to obtain accurate and detailed application instructions.

--------------------------------------------------------------------------------


"The retrieved documents do not contain information on how to apply to the MBA/MS program. I recommend visiting the university's official website for the Applied Data Science program or contacting the admissions office directly to obtain accurate and detailed application instructions."

In [19]:
query4 = "Is the MS in Applied Data Science program STEM/OPT eligible?"
formatted_chat(query4)

🧑 User: Is the MS in Applied Data Science program STEM/OPT eligible?


🤖 Assistant: The retrieved documents do not contain information regarding whether the MS in Applied Data Science program is STEM/OPT eligible. I recommend checking the university's official website for the program or contacting the admissions office directly for accurate and detailed information about program eligibility for STEM/OPT.

--------------------------------------------------------------------------------


"The retrieved documents do not contain information regarding whether the MS in Applied Data Science program is STEM/OPT eligible. I recommend checking the university's official website for the program or contacting the admissions office directly for accurate and detailed information about program eligibility for STEM/OPT."

In [20]:
query4 = "How many courses must you complete to earn UChicago’s Master’s in Applied Data Science?"
formatted_chat(query4)

🧑 User: How many courses must you complete to earn UChicago’s Master’s in Applied Data Science?


🤖 Assistant: To earn the University of Chicago's Master's in Applied Data Science, you must successfully complete 12 courses. This includes 6 core courses, 4 elective courses, and 2 Capstone courses. Additionally, there is a tailored Career Seminar that is required but does not count as a credit course.

--------------------------------------------------------------------------------


"To earn the University of Chicago's Master's in Applied Data Science, you must successfully complete 12 courses. This includes 6 core courses, 4 elective courses, and 2 Capstone courses. Additionally, there is a tailored Career Seminar that is required but does not count as a credit course."

## Debugging Mode with Step-by-Step Execution

For debugging purposes, we can run the chatbot in streaming mode to see the step-by-step execution of the LangGraph workflow.

In [None]:
# Stream mode with detailed steps
debug_query = "What are the admission requirements for the program?"
formatted_chat(debug_query, stream=True)
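
Note that `formatted_chat` unpacks a `steps` value in streaming mode but never displays it. The sketch below shows one way to print it, assuming (this is not confirmed by the notebook) that `steps` is a list of dicts mapping a node name to that node's state update, which is the shape LangGraph yields in its `updates` stream mode. `summarize_steps` and the demo data are hypothetical.

```python
def summarize_steps(steps):
    """Return one line per node execution, listing the state keys it updated."""
    lines = []
    for i, step in enumerate(steps, start=1):
        for node, update in step.items():
            if isinstance(update, dict):
                keys = ", ".join(sorted(update))
            else:
                # Unknown update shape: report its type instead of guessing at fields.
                keys = type(update).__name__
            lines.append(f"Step {i}: {node} -> {keys}")
    return lines


# Illustrative data only; real step contents depend on the chatbot implementation.
demo_steps = [
    {"supervisor": {"next": "retrieve"}},
    {"retrieve": {"documents": [], "query": "admission requirements"}},
]
for line in summarize_steps(demo_steps):
    print(line)
```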

## Reset Conversation

We can reset the conversation history if needed.

In [None]:
# Reset the conversation
chatbot.reset()
print("Conversation has been reset.")

## Multi-turn Conversation Test

Let's test a multi-turn conversation to see how the chatbot maintains context.

In [None]:
# Start a new conversation after reset
formatted_chat("Tell me about the online program options.")

In [None]:
# Follow-up question
formatted_chat("What's the difference between online and on-campus programs?")

In [None]:
# Another follow-up
formatted_chat("Do online students get the same degree?")

## View Conversation History

We can view the full conversation history.

In [None]:
# Get the conversation history
history = chatbot.get_conversation_history()

# Display each message, truncating content longer than 50 characters
for i, message in enumerate(history, start=1):
    role = message["role"]
    content = message["content"]

    speaker = "🧑 User" if role == "user" else "🤖 Assistant"
    preview = f"{content[:50]}..." if len(content) > 50 else content
    print(f"Message {i} - {speaker}: {preview}")
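
To keep a transcript beyond the notebook session, the history can be serialized with the `json` module imported during setup. This is a minimal sketch, assuming each message is a plain dict with `role` and `content` keys as the loop above suggests; `history_to_json` is a hypothetical helper.

```python
import json


def history_to_json(history):
    """Serialize a list of {"role", "content"} messages to a readable JSON string."""
    # ensure_ascii=False keeps emoji and accented characters readable in the output.
    return json.dumps(history, ensure_ascii=False, indent=2)
```

The returned string can be written to disk, for example with `pathlib.Path("history.json").write_text(..., encoding="utf-8")`.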