# Supervisor-Worker RAG Chatbot Demonstration

This notebook demonstrates a Retrieval-Augmented Generation (RAG) chatbot built with LangGraph using a Supervisor-Worker architecture. The chatbot is designed to answer questions about the University of Chicago's MS in Applied Data Science program.

## Setup

First, let's import the required libraries and set up our environment.

In [1]:
import os
import sys

from IPython.display import HTML, display

# Add the project root (the parent directory) to the path so that `src` is importable
sys.path.append(os.path.abspath('..'))

# Import our RAG chatbot implementation
from src.rag_chatbot import RAGChatbot

## Set OpenAI API Key

Make sure to set your OpenAI API key. You can either set it as an environment variable or directly in this notebook.

In [None]:
# Option 1: Set the API key directly (replace the placeholder with your actual key,
# and avoid committing a real key to version control)
os.environ["OPENAI_API_KEY"] = "api_keys_here"

# Option 2: Load from a .env file or check if already set
# from dotenv import load_dotenv
# load_dotenv()  # Loads environment variables from a .env file if present

# Verify the API key is set and non-empty
if not os.environ.get("OPENAI_API_KEY"):
    print("⚠️ Warning: OPENAI_API_KEY is not set. Please set it before proceeding.")
else:
    print("✓ OPENAI_API_KEY is set!")

✓ OPENAI_API_KEY is set!
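
To keep the key out of the notebook file entirely, one option is to read it from the environment and prompt only when it is missing. The following is a minimal sketch; `ensure_api_key` and its `prompt` and `env` parameters are hypothetical helpers introduced here for illustration and are not part of the chatbot package.

```python
import os
from getpass import getpass


def ensure_api_key(name="OPENAI_API_KEY", prompt=getpass, env=os.environ):
    """Return the key stored under `name`, prompting for it only if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        # getpass hides the typed value, so the key never appears in cell output
        key = prompt(f"Enter {name}: ").strip()
        if not key:
            raise ValueError(f"{name} was not provided")
        env[name] = key
    return key
```

Calling `ensure_api_key()` in place of Option 1 would leave an existing environment variable untouched and ask for the key interactively otherwise.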


## Initialize the Chatbot

Now let's initialize our RAG chatbot that uses the Supervisor-Worker architecture with LangGraph.

In [4]:
# Initialize the chatbot with GPT-4o
# You can also use 'gpt-4' if you have access to it
chatbot = RAGChatbot(model="gpt-4o")

print("RAG Chatbot initialized!")

RAG Chatbot initialized!
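
For readers new to the Supervisor-Worker pattern, the sketch below shows the general idea in plain Python: a supervisor inspects the shared state and routes to a worker, and each worker hands control onward. It is an illustration under stated assumptions, not the implementation inside `RAGChatbot`. The routing rules, the state keys (`needs_context`, `docs`, `max_chars`, `summarized`), and all function names are hypothetical; only the node names mirror the workflow this notebook describes.

```python
def supervisor(state):
    """Pick the next node from the current state (hypothetical routing rules)."""
    if state["answer"] is None:
        # No draft answer yet: retrieve first when the question needs documents
        return "retrieve" if state["needs_context"] and not state["docs"] else "generate"
    if len(state["answer"]) > state["max_chars"] and not state["summarized"]:
        return "summarize"
    return "done"


def retrieve(state):
    state["docs"] = [f"doc about {state['question']}"]  # stand-in for a vector search
    return "generate"  # Retrieve hands off to Generate


def generate(state):
    context = " ".join(state["docs"])
    state["answer"] = f"Answer to '{state['question']}' using [{context}]"
    return "supervisor"  # Generate reports back to the Supervisor


def summarize(state):
    state["answer"] = state["answer"][: state["max_chars"]]  # stand-in for an LLM summary
    state["summarized"] = True
    return "done"  # Summarize ends the run


WORKERS = {"retrieve": retrieve, "generate": generate, "summarize": summarize}


def run(question, needs_context=True, max_chars=200):
    """Walk the graph until a node returns 'done'; return the answer and the node trace."""
    state = {"question": question, "needs_context": needs_context, "docs": [],
             "answer": None, "max_chars": max_chars, "summarized": False}
    node, trace = "supervisor", []
    while node != "done":
        trace.append(node)
        node = supervisor(state) if node == "supervisor" else WORKERS[node](state)
    return state["answer"], trace
```

The returned trace lists the nodes visited in order, which makes it easy to see which path a given question took through the graph.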


## Visualize the LangGraph Workflow

Here we visualize the workflow as a Mermaid diagram, falling back to a text representation if the graph cannot be rendered.

In [5]:
try:
    # Attempt to draw the graph if LangGraph supports it
    from langgraph.graph import get_graph_representation, get_mermaid
    
    # Get the graph from our implementation
    graph = chatbot.graph.graph
    
    # Generate mermaid representation
    mermaid_representation = get_mermaid(graph)
    
    # Display as mermaid diagram
    display(HTML(f"""
    <div class="mermaid">
    {mermaid_representation}
    </div>
    <script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
    <script>mermaid.initialize({{startOnLoad:true}});</script>
    """))
except Exception as e:
    print(f"Could not generate graph visualization: {e}")
    print("\nFallback text representation:")
    print("Supervisor → Retrieve → Generate → Supervisor")
    print("Supervisor → Generate → Supervisor")
    print("Supervisor → Summarize → Done")
    print("Supervisor → Done")

Could not generate graph visualization: cannot import name 'get_graph_representation' from 'langgraph.graph' (c:\Users\alen.pavlovic\Documents\GitLab\gen-ai-midterm-project\venv\Lib\site-packages\langgraph\graph\__init__.py)

Fallback text representation:
Supervisor → Retrieve → Generate → Supervisor
Supervisor → Generate → Supervisor
Supervisor → Summarize → Done
Supervisor → Done
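
The import error above means the diagram was never rendered. Compiled LangGraph graphs generally expose `get_graph().draw_mermaid()`, but whether `chatbot.graph` holds a compiled graph is not confirmed here, so the sketch below takes a library-independent route: it builds Mermaid flowchart source by hand from the edges listed in the fallback text. `EDGES` and `to_mermaid` are hypothetical names used only for this illustration.

```python
# Edges copied from the fallback text representation above
EDGES = [
    ("Supervisor", "Retrieve"),
    ("Retrieve", "Generate"),
    ("Generate", "Supervisor"),
    ("Supervisor", "Generate"),
    ("Supervisor", "Summarize"),
    ("Summarize", "Done"),
    ("Supervisor", "Done"),
]


def to_mermaid(edges):
    """Render (source, target) pairs as Mermaid flowchart source."""
    lines = ["graph TD"]
    seen = set()
    for source, target in edges:
        if (source, target) not in seen:  # skip repeated edges
            seen.add((source, target))
            lines.append(f"    {source} --> {target}")
    return "\n".join(lines)


mermaid_source = to_mermaid(EDGES)
print(mermaid_source)
```

The resulting string can be placed inside the same `<div class="mermaid">` wrapper used in the cell above, or pasted into any Mermaid renderer.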


## Test the RAG Chatbot

Let's test our chatbot with some sample questions about the MS in Applied Data Science program.

In [6]:
def formatted_chat(query, stream=False):
    """Format the chat nicely for display"""
    print(f"🧑 User: {query}")
    print("\n")
    
    # Get the response
    try:
        if stream:
            response, steps = chatbot.chat(query, stream=True)
        else:
            response = chatbot.chat(query)
            
        # Format the response appropriately
        if isinstance(response, list) and len(response) > 0:
            if isinstance(response[-1], dict):
                print(f"🤖 Assistant: {response[-1].get('content', 'No content')}\n")
            else:
                print(f"🤖 Assistant: {response}\n")
        else:
            print(f"🤖 Assistant: {response}\n")
    except Exception as e:
        print(f"Error: {str(e)}")
        import traceback
        traceback.print_exc()
        return None
    
    print("-" * 80)
    return response
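
The response-shape checks inside `formatted_chat` can also be factored into a small helper that makes an empty reply visible instead of printing a blank line. This is a sketch under assumptions: `extract_text` and its `fallback` parameter are hypothetical, and unlike the function above it shows only the last item when the response is a list.

```python
def extract_text(response, fallback="(empty response)"):
    """Return displayable text from a chat response of unknown shape."""
    if isinstance(response, list):
        # Assume the last item is the newest message
        response = response[-1] if response else ""
    if isinstance(response, dict):
        response = response.get("content", "")
    text = str(response).strip() if response is not None else ""
    return text or fallback
```

With a helper like this, the three `print` branches in `formatted_chat` could collapse into a single `print(f"🤖 Assistant: {extract_text(response)}\n")`.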

In [7]:
# Example 1: Basic program information
query1 = "What is the MS in Applied Data Science program about?"
formatted_chat(query1)

🧑 User: What is the MS in Applied Data Science program about?


🤖 Assistant: I'm sorry, but I don't have the necessary information about the MS in Applied Data Science program to answer your question. If you have specific inquiries or need details about the program, I recommend reaching out directly to the university's admissions office or checking their official website for the most accurate and updated information.

--------------------------------------------------------------------------------

In [8]:
# Example 2: Course requirements
query2 = "What are the core courses for the program?"
formatted_chat(query2)

🧑 User: What are the core courses for the program?


Retrieved existing collection 'uchicago_ms_applied_ds_header_chunks'
Performing direct course search due to poor retrieval results
Found 16 course-related documents through direct search
🤖 Assistant: The Master's in Applied Data Science program at the University of Chicago requires students to complete six core courses as part of their degree. These courses are designed to build theoretical data science knowledge and provide practical experience in applying these theories to real-world business problems. 

The six core courses are:

1. **Time Series Analysis and Forecasting**: Focuses on making rational predictions based on previous records, widely used in various business settings.

2. **Statistical Models for Data Science**: Covers traditional linear models and new methods for working with a variety of data distributions and associations.

3. **Machine Learning I**: Introduces students to machine learning with data mining techniques and algorithms, covering exploratory analyses, predictive modeling, and more.

4. **Machine Learning II**: Extends knowledge to deep learning and generative AI, focusing on practical applications and evaluating results.

5. **Data Enginee

In [9]:
# Example 3: Follow-up question (demonstrates conversation memory)
query3 = "How long does it take to complete these courses?"
formatted_chat(query3)

🧑 User: How long does it take to complete these courses?


🤖 Assistant: To complete the Master’s in Applied Data Science at the University of Chicago, students must finish 12 courses and a Career Seminar, broken down into 6 core courses, 4 electives, and a Capstone Project over 2 quarters. 

- **Part-time Students**: Take 2 courses per quarter, with classes usually in the evenings or on Saturdays.
- **Full-time Students**: Take 3 courses per quarter, with some daytime classes.

Optional noncredit courses, like Statistical Concepts, Python, and Advanced Linear Algebra, are offered to help students prepare. The program's duration depends on whether students enroll part-time or full-time.

--------------------------------------------------------------------------------

In [10]:
# Example 4: Question requiring summarization (potentially lengthy answer)
query4 = "What career opportunities are available after completing this program? Please provide detailed examples."
formatted_chat(query4)

🧑 User: What career opportunities are available after completing this program? Please provide detailed examples.


🤖 Assistant: I apologize, but the documents provided do not contain specific information about career opportunities available after completing the Master's in Applied Data Science program at the University of Chicago. Consequently, I'm unable to provide detailed examples of career paths related to this program. For comprehensive and specific career information, I recommend reaching out directly to the university's career services or academic advising office.

--------------------------------------------------------------------------------

In [12]:
query5 = "What is tuition cost for the MS in Applied Data Science?"
formatted_chat(query5)

🧑 User: What is tuition cost for the MS in Applied Data Science?


Performing direct tuition information search
Found 15 tuition-related documents
🤖 Assistant: The tuition cost for the MS in Applied Data Science program is $5,967 per course, which totals $71,604 for the entire program. Please note that this total cost is subject to change as tuition is expected to increase by 3-7% per year. There is also a non-refundable program enrollment deposit of $1,500, which is credited toward the first quarter's tuition balance. Additional costs, which may include student fees, health insurance, and living expenses, are not included in the tuition total provided. For more detailed information on these additional expenses, it is advisable to visit the Graduate Financial Aid Office's Cost of Attendance page.

--------------------------------------------------------------------------------

In [13]:
query6 = "What scholarships are available for the program?"
formatted_chat(query6)

🧑 User: What scholarships are available for the program?


Performing direct tuition information search
Found 17 tuition-related documents
🤖 Assistant: The MS in Applied Data Science program offers partial tuition scholarships to top applicants. These scholarships are merit-based and do not require a separate application; candidates are automatically considered upon applying to the program. However, it is recommended that applicants submit their applications prior to the early deadline to enhance their chances of receiving a scholarship.

In addition to program-specific scholarships, students are encouraged to explore external scholarships offered by various civic and professional organizations, foundations, and state agencies. A useful resource for searching for these opportunities is the financial aid information web page sponsored by the National Association of Student Aid Administration.

For detailed financial aid options, including information about loans, students can refer to the University's Graduate Aid page and the International Student Funding Options page for international students.

In [14]:
query7 = "What are the minimum scores for the TOEFL and IELTS English Language Requirement?"
formatted_chat(query7)

🧑 User: What are the minimum scores for the TOEFL and IELTS English Language Requirement?


Performing direct course search due to poor retrieval results
Found 8 course-related documents through direct search
🤖 Assistant: The minimum scores required for the English Language Requirement for the Master's in Applied Data Science program are as follows: a TOEFL internet-based test (iBT) score of 102 with no subscore requirement, and an IELTS Academic score of 7 with no subscore requirement. It is important to ensure that applicants take the Academic Reading/Writing test within IELTS, as the General Training Reading/Writing test is not accepted.

Please also note that TOEFL or IELTS score reports are valid for two years. Scores will be considered expired if the test was taken more than two years before the application submission.

--------------------------------------------------------------------------------

In [16]:
query8 = "Is there an application fee waiver?"
formatted_chat(query8)

🧑 User: Is there an application fee waiver?


Performing direct tuition information search
Found 14 tuition-related documents
🤖 Assistant: There is a $90 non-refundable application fee for the program. For questions regarding an application fee waiver, it is suggested to refer to the [Physical Sciences Division fee waiver policy](https://physicalsciences.uchicago.edu/academics/admissions/application-requirements/#FeeWaiver).

--------------------------------------------------------------------------------

In [17]:
query9 = "What are the deadlines for the in-person program?"
formatted_chat(query9)

🧑 User: What are the deadlines for the in-person program?


🤖 Assistant: The retrieved documents do not contain information about the deadlines for the in-person program. Therefore, I am unable to provide specifics about application deadlines. I recommend checking the program's official website or contacting the admissions office directly for accurate and up-to-date information regarding application deadlines.

--------------------------------------------------------------------------------

In [19]:
query10 = "How long will it take for me to receive a decision on my application?"
formatted_chat(query10)

🧑 User: How long will it take for me to receive a decision on my application?


🤖 Assistant: The retrieved documents do not contain information about how long it will take to receive a decision on your application for the Applied Data Science program. Therefore, I am unable to provide specifics about the decision timeline. I recommend checking the program's official website or reaching out to the admissions office directly for this information.

--------------------------------------------------------------------------------

In [20]:
query11 = "Can I set up an advising appointment with the enrollment management team?"
formatted_chat(query11)

🧑 User: Can I set up an advising appointment with the enrollment management team?


🤖 Assistant: The retrieved documents do not contain information on how to set up an advising appointment with the enrollment management team for the Applied Data Science program. Therefore, I am unable to provide specifics on this process. I recommend visiting the program's official website or contacting the enrollment management team directly for guidance on setting up an appointment.

--------------------------------------------------------------------------------

In [21]:
query12 = "Where can I mail my official transcripts?"
formatted_chat(query12)

🧑 User: Where can I mail my official transcripts?


🤖 Assistant: The retrieved documents do not contain information on where to mail your official transcripts for the Applied Data Science program. Therefore, I am unable to provide specifics about the mailing address for transcripts. I recommend checking the program's official website or contacting the admissions office directly for this information.

--------------------------------------------------------------------------------

In [22]:
query13 = "Does the Master’s in Applied Data Science Online program provide visa sponsorship?"
formatted_chat(query13)

🧑 User: Does the Master’s in Applied Data Science Online program provide visa sponsorship?


🤖 Assistant: The Master's in Applied Data Science Online program does not provide visa sponsorship. Visa sponsorship is only available for students in the full-time, in-person program. Therefore, if you require visa sponsorship, you should consider applying to the in-person program instead.

--------------------------------------------------------------------------------

In [23]:
query14 = "How do I apply to the MBA/MS program?"
formatted_chat(query14)

🧑 User: How do I apply to the MBA/MS program?


🤖 Assistant: The retrieved documents do not provide information on how to apply to the MBA/MS program. I recommend visiting the official website of the university's business school or contacting their admissions office directly for detailed application instructions. They will be able to provide the most accurate and up-to-date information regarding application procedures and requirements for the MBA/MS program.

--------------------------------------------------------------------------------

In [28]:
query15 = "Is the MS in Applied Data Science program STEM/OPT eligible?"
formatted_chat(query15)

🧑 User: Is the MS in Applied Data Science program STEM/OPT eligible?


🤖 Assistant: The retrieved documents do not contain specific information about whether the MS in Applied Data Science program is STEM/OPT eligible. To obtain accurate details on OPT eligibility, I recommend reaching out directly to UChicago’s admissions office or checking their official program website for the most up-to-date information.

--------------------------------------------------------------------------------

In [27]:
query16 = "How many courses must you complete to earn UChicago’s Master’s in Applied Data Science?"
formatted_chat(query16)

🧑 User: How many courses must you complete to earn UChicago’s Master’s in Applied Data Science?


🤖 Assistant: 

--------------------------------------------------------------------------------


''

## Debugging Mode with Step-by-Step Execution

For debugging purposes, we can run the chatbot in streaming mode to see the step-by-step execution of the LangGraph workflow.

In [None]:
# Stream mode with detailed steps
debug_query = "What are the admission requirements for the program?"
formatted_chat(debug_query, stream=True)
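
In streaming mode, `chatbot.chat` returns a `steps` value alongside the response, but `formatted_chat` does not display it. The sketch below shows one way such a trace could be summarized. The shape of `steps` is an assumption: each step is taken to be a dict mapping a node name to that node's state update, similar to the updates a LangGraph stream yields. `describe_steps` and `example_steps` are hypothetical and use made-up data.

```python
def describe_steps(steps):
    """Summarize each step as 'index. node: keys updated' (assumed step shape)."""
    lines = []
    for index, step in enumerate(steps, start=1):
        # Assumption: each step is a dict mapping a node name to its state update
        for node, update in step.items():
            if isinstance(update, dict):
                keys = ", ".join(sorted(update))
            else:
                keys = type(update).__name__
            lines.append(f"{index}. {node}: {keys}")
    return lines


# Hypothetical steps for illustration only, not recorded chatbot output
example_steps = [
    {"supervisor": {"next": "retrieve"}},
    {"retrieve": {"documents": ["..."]}},
    {"generate": {"messages": ["..."]}},
]
for line in describe_steps(example_steps):
    print(line)
```

If the real `steps` object has a different structure, only the inner loop would need to change.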

## Reset Conversation

We can reset the conversation history if needed.

In [None]:
# Reset the conversation
chatbot.reset()
print("Conversation has been reset.")

## Multi-turn Conversation Test

Let's test a multi-turn conversation to see how the chatbot maintains context.

In [None]:
# Start a new conversation after reset
formatted_chat("Tell me about the online program options.")

In [None]:
# Follow-up question
formatted_chat("What's the difference between online and on-campus programs?")

In [None]:
# Another follow-up
formatted_chat("Do online students get the same degree?")

## View Conversation History

We can view the full conversation history.

In [None]:
# Get the conversation history
history = chatbot.get_conversation_history()

# Display it nicely
for i, message in enumerate(history, start=1):
    role = message["role"]
    content = message["content"]

    # Truncate long messages to a 50-character preview
    preview = f"{content[:50]}..." if len(content) > 50 else content
    label = "🧑 User" if role == "user" else "🤖 Assistant"
    print(f"Message {i} - {label}: {preview}")