# Workshop 3: Building AI Agents

To recap what we did last week:
- Learned how LLMs work at a high level
- Explored a few prompt engineering techniques
- Explored some ways to get structured output from LLMs
- Worked through a few exercises on prompting, structured data extraction, and tool selection

Today:
- Code our way up to AI Agents and to a definition of what an "Agent" is
- Discuss the differences between workflows and agents
- Identify the building blocks for our Analytics agent
- Hopefully add functionality to explore the dataset

In [None]:
# ============================================================================
# SETUP & IMPORTS
# ============================================================================

from openai import OpenAI
import pandas as pd
import os
from datetime import datetime

openai_client = OpenAI()

def generate(prompt, temperature=0):
    """Generate text using OpenAI's API"""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=temperature
    )
    return response.choices[0].message.content.strip()

First, let's look at the following query:

```"What is 1234 * 5678?"```

The correct answer to that is

In [12]:
1234 * 5678

7006652

**Let's see if an LLM can actually answer that accurately.**

In [32]:
num1 = 1234
num2 = 5678

# Ask the LLM to calculate
prompt = f"What is {num1} * {num2}? Please respond in the format number1 * number2 = product"
llm_response = generate(prompt)

print(f"🗒️ Prompt: {prompt}")
print(f"🤖 LLM says: {llm_response}")
print(f"✅ Actual answer: {num1 * num2}")
llm_answer = llm_response.split("=")[1].strip()
print(f"❌ Difference: {abs(int(llm_answer.replace(',', '').replace(' ', '')) - (num1 * num2))}")


🗒️ Prompt: What is 1234 * 5678? Please respond in the format number1 * number2 = product
🤖 LLM says: 1234 * 5678 = 700665
✅ Actual answer: 7006652
❌ Difference: 6305987


### What's going on?
If we recall from the last workshop, LLMs are exactly what their name says: Language Models. They are trained to predict the next token (a word or piece of text).

So when asked `What is 1234 * 5678?`, it's not actually multiplying the numbers - it's predicting which tokens are most likely to follow.

The model has seen millions of examples like "1200 × 5000 = 6000000" and "1300 × 6000 = 7800000" during its training stage, so when it sees "1234 × 5678", it predicts an answer that should "look like" something around 7 million. That's why it guessed "700665": the digits resemble the true answer, but one digit is missing, so the value is off by roughly a factor of ten (actual: 7006652).

**The key point:** LLMs learn what math looks like from their training data, but never learn the multiplication algorithm itself. It's the difference between memorizing "2 × 2 = 4" and understanding that multiplication is repeated addition.
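To make the distinction concrete, here is a minimal sketch of the kind of procedure the model never actually runs: schoolbook long multiplication, which sums one shifted partial product per digit. The name `long_multiply` is ours, purely for illustration, and the sketch assumes non-negative integers.

```python
def long_multiply(a: int, b: int) -> int:
    """Multiply two non-negative integers the schoolbook way (illustrative helper)."""
    total = 0
    place = 1  # 1 for the ones digit of b, 10 for the tens digit, and so on
    while b > 0:
        digit = b % 10               # current digit of b, read right to left
        total += a * digit * place   # partial product, shifted to its place value
        b //= 10
        place *= 10
    return total

print(long_multiply(1234, 5678))  # prints 7006652
```

A deterministic procedure like this gives the same exact answer every time, which is exactly what next-token prediction does not guarantee.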

**Let's explore a few more examples**

In [39]:
# LLMs are good at predicting likely text
print("\nGood at text patterns:")
print(generate("The next word after 'Large Language' is usually..."))


Good at text patterns:
The next word after "Large Language" is often "Model," as in "Large Language Model" (LLM), which refers to a type of artificial intelligence model designed to understand and generate human language.


In [42]:
print(generate("List all the files in the current directory"))

I don't have access to your file system or the ability to interact with your current directory. However, I can provide you with commands to list files in various operating systems.

- **For Windows Command Prompt**:
  ```cmd
  dir
  ```

- **For Windows PowerShell**:
  ```powershell
  Get-ChildItem
  ```

- **For macOS or Linux Terminal**:
  ```bash
  ls
  ```

You can run one of these commands in your terminal or command prompt to see the files in your current directory.


In [43]:
print(generate("What is the exact current time (ET) right now?"))

I’m sorry, but I can't provide real-time information, including the current time. However, you can easily check the current Eastern Time (ET) by looking it up on your device or using a world clock website.


In [44]:
print(generate("What's the current temperature in New York City right now?"))

I'm unable to provide real-time data, including current temperatures. However, you can easily find the current temperature in New York City by checking a weather website or using a weather app on your smartphone.


### KEY INSIGHT:
LLMs can **ONLY** generate text. They cannot:
- Perform actual calculations
- Access files or databases
- Check the current time
- Fetch live data
- Execute code
- Send emails
- Delete files
- ...

We saw in the previous example, `List all the files in the current directory`, that even though the LLM can't access the filesystem and do the task that was asked, it was able to give instructions on exactly what we can do to achieve that goal. Let's see if this holds for other examples as well.

In [47]:
print(generate(f"How would you calculate {1234} * {5678}? Be specific about the method."))

To calculate \( 1234 \times 5678 \), you can use the standard multiplication method (also known as long multiplication). Here’s a step-by-step breakdown of the process:

1. **Write the numbers vertically**: Write 1234 on top and 5678 below it, aligning the digits to the right.

   ```
     1234
   x 5678
   ```

2. **Multiply the bottom number’s last digit (8) by the top number (1234)**:
   - \( 8 \times 1234 = 9872 \)

   Write this result below the line.

   ```
     1234
   x 5678
   _______
     9872   (this is 1234 * 8)
   ```

3. **Multiply the next digit (7) by the top number**:
   - Since 7 is in the tens place, we actually multiply by 70.
   - \( 7 \times 1234 = 8638 \)
   - Write this result one position to the left (as it represents 70).

   ```
     1234
   x 5678
   _______
     9872   (this is 1234 * 8)
   + 8638   (this is 1234 * 70, shifted one position to the left)
   ```

4. **Multiply the next digit (6) by the top number**:
   - Since 6 is in the hundreds place, we m

In [48]:
print(generate("What command or code would you run to list files in the current directory?"))

To list files in the current directory, you can use the following commands depending on your operating system:

### On Linux or macOS:
You can use the `ls` command in the terminal:
```bash
ls
```

### On Windows:
You can use the `dir` command in the Command Prompt:
```cmd
dir
```

If you are using PowerShell on Windows, you can use:
```powershell
Get-ChildItem
```
or simply:
```powershell
ls
```
as `ls` is an alias for `Get-ChildItem` in PowerShell.

These commands will display the files and directories in the current working directory.


In [49]:
print(generate("What Python code would get the current time?"))

In Python, you can get the current time using the `datetime` module. Here is a simple example of how to do this:

```python
from datetime import datetime

# Get the current time
current_time = datetime.now()

# Print the current time
print("Current time:", current_time)
```

If you want to format the output to show only the time, you can do so like this:

```python
from datetime import datetime

# Get the current time
current_time = datetime.now()

# Format the time
formatted_time = current_time.strftime("%H:%M:%S")

# Print the formatted time
print("Current time:", formatted_time)
```

In this example, `strftime` is used to format the time into a more readable format (hours:minutes:seconds). You can adjust the format string to display the time in different ways according to your needs.


### KEY INSIGHT:
LLMs are excellent at understanding, describing, and planning what needs to be done!

### THOUGHT:
What if instead of asking LLMs to **DO** things, we ask them to tell us **WHAT NEEDS TO BE DONE**?

In [53]:
def add_ints(x, y):
    """Add two integers"""
    return x + y

def multiply_ints(x, y):
    """Multiply two integers"""
    return x * y

def divide_ints(x, y):
    """Divide two integers"""
    return x / y

print("Testing the three functions to see if they work:")
print(f"\nadd_ints(5, 3) = {add_ints(5, 3)}")
print(f"multiply_ints(5, 3) = {multiply_ints(5, 3)}")
print(f"divide_ints(5, 3) = {divide_ints(5, 3)}")

Testing the three functions to see if they work:

add_ints(5, 3) = 8
multiply_ints(5, 3) = 15
divide_ints(5, 3) = 1.6666666666666667


In [68]:
# Ask the LLM to generate a function call (as text!)
prompt = f"""You have access to these functions:
- add_ints(x, y): Adds two integers
- multiply_ints(x, y): Multiplies two integers
- divide_ints(x, y): Divides two integers

Please respond with what function needs to be called and with what arguments to fulfill the following query:
Product of {num1} and {num2}

Response format:
Please respond with just the function call needed
"""

function_call_text = generate(prompt)
print(f"\n🤖 LLM response: {function_call_text}")
print(f"📝 Type: {type(function_call_text)}  👈 (still just text!)")


🤖 LLM response: multiply_ints(1234, 5678)
📝 Type: <class 'str'>  👈 (still just text!)


In [71]:
# Execute the function call
result = eval(function_call_text, {"add_ints": add_ints, "multiply_ints": multiply_ints, "divide_ints": divide_ints})
print(f"✅ Execution result: {result}")
print(f"🧮 Correct result: {num1} * {num2} = {num1 * num2}")
print(f"✨ Perfect match! The LLM + Function = Correct Answer")

✅ Execution result: 7006652
🧮 Correct result: 1234 * 5678 = 7006652
✨ Perfect match! The LLM + Function = Correct Answer
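A note on `eval`: it runs whatever text the model returns, and restricting the globals dictionary does not remove Python's builtins, so it is best kept to demos. Below is a hedged sketch of a stricter alternative that parses the text with the standard `ast` module and only runs a single call to an allow-listed function with literal arguments. The names `safe_call` and `ALLOWED_FUNCTIONS` are hypothetical helpers, not part of the workshop code.

```python
import ast

def add_ints(x, y):
    return x + y

def multiply_ints(x, y):
    return x * y

def divide_ints(x, y):
    return x / y

# Only these names may be called (the same three functions as above).
ALLOWED_FUNCTIONS = {
    "add_ints": add_ints,
    "multiply_ints": multiply_ints,
    "divide_ints": divide_ints,
}

def safe_call(call_text, allowed):
    """Parse text like 'multiply_ints(1234, 5678)' and run it without eval (hypothetical helper)."""
    tree = ast.parse(call_text.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("expected a single call to a named function")
    if call.func.id not in allowed:
        raise ValueError(f"function not allowed: {call.func.id}")
    if call.keywords:
        raise ValueError("keyword arguments are not supported in this sketch")
    # literal_eval accepts only literals (numbers, strings, ...), never arbitrary code
    args = [ast.literal_eval(arg) for arg in call.args]
    return allowed[call.func.id](*args)

print(safe_call("multiply_ints(1234, 5678)", ALLOWED_FUNCTIONS))  # prints 7006652
```

Anything that is not a plain call to one of the three functions, such as `__import__('os').system('ls')`, is rejected with a `ValueError` instead of being executed.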


In [85]:
# Complete conversation flow
user_query = f"Multiply {num1} and {num2}?"

print(f"\n👤 User asks: {user_query}")

prompt = f"""You have access to these functions:
- add_ints(x, y): Adds two integers
- multiply_ints(x, y): Multiplies two integers
- divide_ints(x, y): Divides two integers

Please respond with what function needs to be called and with what arguments to fulfill the following query:
{user_query}

Response format:
Please respond with just the function call needed
"""

function_call = generate(prompt)
print(f"🤖 LLM generates: {function_call}")

# Execute the function
result = eval(function_call, {"add_ints": add_ints, "multiply_ints": multiply_ints, "divide_ints": divide_ints})
print(f"⚙️ System executes: {function_call} → {result}")

# LLM formats the response
final_response = generate(f"The user query was \"{user_query}\". The correct answer to that query is {result}. Now that you have the answer, please respond to the user appropriately.")
print(f"🤖 LLM responds: {final_response}")


👤 User asks: Multiply 1234 and 5678?
🤖 LLM generates: multiply_ints(1234, 5678)
⚙️ System executes: multiply_ints(1234, 5678) → 7006652
🤖 LLM responds: The product of 1234 and 5678 is 7,006,652.


### What We Discovered:
1. LLMs are text generators - They output strings, nothing else
2. LLMs can't directly "run" things - No access to files, time, APIs, or calculations
3. LLMs understand tasks - They know WHAT needs to be done (assuming the training data had some similar tasks)
4. Function calling can bridge the gap - LLM writes instructions → System executes → LLM looks at the result → Responds

### Pattern:
User Query → LLM (thinks) → Function Call (text) → Execution (action) → Result
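
The pattern above can be sketched as one small function. This version assumes the model is asked to reply with JSON (building on last week's structured-output exercises) rather than Python call syntax. `fake_llm` is a stand-in that returns a canned reply so the example runs offline, and the JSON keys `function` and `args` are our own convention, not an OpenAI format.

```python
import json

def multiply_ints(x, y):
    return x * y

# Registry of functions the system is willing to run.
TOOLS = {"multiply_ints": multiply_ints}

def fake_llm(prompt):
    """Stand-in for generate(): returns a canned reply so the sketch runs offline."""
    return '{"function": "multiply_ints", "args": [1234, 5678]}'

def answer(user_query, llm, tools):
    # 1. LLM (thinks): ask for a function call, expressed as JSON text
    call_text = llm(f"Pick a function for: {user_query}")
    # 2. Function call (text): parse the text into a name and arguments
    call = json.loads(call_text)
    func = tools[call["function"]]  # KeyError if the model names an unknown tool
    # 3. Execution (action): our code, not the LLM, runs the function
    return func(*call["args"])

print(answer("Multiply 1234 and 5678", fake_llm, TOOLS))  # prints 7006652
```

Swapping `fake_llm` for the real `generate` function would require a prompt that reliably produces this JSON shape, which is not guaranteed.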

<hr>

We've seen that LLMs can tell us which function to call and with what parameters pretty reliably.
But these examples were too simple, and real-world tasks often need multiple steps to be performed to achieve something meaningful.
So, let's look at something a bit more complicated.

Here is a sample scenario:

#### SCENARIO: Emergency Room (ER) Wait Time System

Imagine you're building a system to help people find the fastest ER service.
We have real-time data from multiple hospitals:
- Current number of patients waiting at each hospital
- Historical average treatment time per patient (varies by hospital)
- Distance from user to each hospital

**The challenge**: An LLM alone CANNOT answer these queries because:
1. It doesn't know how many people are CURRENTLY waiting (changes every minute)
2. It doesn't have access to each hospital's specific average treatment times
3. It can't calculate real wait times without this live data

But we have seen that if we have the right functions that do have access to the required data/information, the LLM can understand what the user wants, and generate plausible plans to answer the query.

So let's build a simple application using that idea, and see if we can create something that can answer questions like:
- What's the wait at General Hospital?
- Which ER will see me fastest?
- Should I drive further for a shorter wait?

First, let's define some functions that can:
- Return the current queue length at a given hospital
- Return the average treatment time at a given hospital
- Calculate the wait time given the current queue length and the average treatment time at a given hospital
- Get distance to the given hospital
- Calculate the travel time to the hospital
- Return a list of all nearby hospitals

In [124]:
def get_queue_length(hospital: str):
    """Get number of patients currently waiting
    This is just dummy data for the purposes of illustration, but in practice this would connect to some source that can provide realtime data"""
    
    # Simulated REAL-TIME data
    current_queues = {
        'General Hospital': 12,
        'St. Mary Medical': 5,
        'City Emergency': 18,
        'University Hospital': 8,
        'Riverside Clinic': 3
    }
    return current_queues.get(hospital, 0)

def get_treatment_time(hospital: str):
    """Get average minutes per patient at this hospital
    Again, this is just dummy data for the purposes of illustration, but in practice this would connect to some source that can provide realtime data"""
    
    avg_times = {
        'General Hospital': 25,
        'St. Mary Medical': 45,
        'City Emergency': 35,
        'University Hospital': 40,
        'Riverside Clinic': 20
    }
    return avg_times.get(hospital, 30)

def calculate_wait(queue: int, avg_time: int):
    """Calculate expected wait in minutes"""
    return queue * avg_time

def get_distance(hospital: str):
    """Get miles from current location to hospital (Static, dummy data)"""
    distances = {
        'General Hospital': 2.5,
        'St. Mary Medical': 4.0,
        'City Emergency': 1.2,
        'University Hospital': 6.0,
        'Riverside Clinic': 3.3
    }
    return distances.get(hospital, 0)

def calculate_travel_time(distance: float):
    """Returns the travel time, given the distance (assuming 2 minutes per mile for this example)"""
    return distance * 2

def list_hospitals():
    """Get list of all hospitals nearby"""
    return ['General Hospital', 'St. Mary Medical', 'City Emergency', 
            'University Hospital', 'Riverside Clinic']

We've defined the following functions:
- **get_queue_length(hospital)** - Returns the current queue at a given hospital
- **get_treatment_time(hospital)** - Returns the avg minutes per patient
- **calculate_wait(queue, time)** - Returns the total wait time given the current queue length and the average treatment time
- **get_distance(hospital)** - Returns the distance to the hospital
- **calculate_travel_time(distance)** - Returns the travel time given the distance
- **list_hospitals()** - Returns a list of all nearby hospitals

**Let's test each function to see if they work:**

In [113]:
# Test 1: Check queue at one hospital
hospital = 'General Hospital'
queue = get_queue_length(hospital)
print(f"Queue at {hospital}: {queue} patients waiting")

Queue at General Hospital: 12 patients waiting


In [114]:
# Test 2: Get treatment time
avg_time = get_treatment_time(hospital)
print(f"Average treatment time: {avg_time} minutes per patient")

Average treatment time: 25 minutes per patient


In [115]:
# Test 3: Calculate wait
wait = calculate_wait(queue, avg_time)
print(f"Expected wait: {wait} minutes ({wait/60:.1f} hours)")

Expected wait: 300 minutes (5.0 hours)


In [116]:
# Test 4: Check distance
distance = get_distance(hospital)
print(f"Distance: {distance} miles away")

Distance: 2.5 miles away


In [117]:
# Test 5: Calculate travel time
traveltime = calculate_travel_time(distance)
print(f"Travel time: {traveltime} mins")

Travel time: 5.0 mins


In [118]:
# Test 6: List all hospitals
print(f"\nAll hospitals: {list_hospitals()}")


All hospitals: ['General Hospital', 'St. Mary Medical', 'City Emergency', 'University Hospital', 'Riverside Clinic']


Now let's manually walk through a simple query and what needs to be done to answer it:

#### **Query**: What's the wait time at General Hospital?

**🤔 To answer this, we need to:**
1. Check how many people are waiting now at General Hospital
2. Get the average treatment time per patient at General Hospital
3. Calculate total wait time

In [119]:
hospital = 'General Hospital'

# Step 1: Get current queue
queue = get_queue_length(hospital)
print(f"Step 1: Current queue = {queue} patients")

# Step 2: Get average time
avg_time = get_treatment_time(hospital)
print(f"Step 2: Avg time per patient = {avg_time} minutes")

# Step 3: Calculate total wait
wait_time = calculate_wait(queue, avg_time)
print(f"Step 3: Total wait = {queue} × {avg_time} = {wait_time} minutes")

print(f"\n✅ Wait time at {hospital} is {wait_time} minutes ({wait_time/60:.1f} hours)")

Step 1: Current queue = 12 patients
Step 2: Avg time per patient = 25 minutes
Step 3: Total wait = 12 × 25 = 300 minutes

✅ Wait time at General Hospital is 300 minutes (5.0 hours)


<hr>

Let's look at a different, slightly more complicated query:

#### **Query**: Which hospital will see me the fastest? (ignoring travel time)

**🤔 To answer this question, we need to:**
1. Check ALL hospitals
2. Calculate wait time for EACH
3. Find the minimum

In [122]:
# Query Type 2: Find the best option
query = "Which hospital will see me the fastest? (ignoring travel time)"

print(f"👤 User query: {query}")

# Check all hospitals
results = []

for hospital in list_hospitals():
    queue = get_queue_length(hospital)
    avg_time = get_treatment_time(hospital)
    wait = calculate_wait(queue, avg_time)
    results.append((hospital, wait, queue))
    print(f"{hospital:20} - {queue:2} patients × {avg_time:2} min = {wait:3} min wait")

# Find the best option
best = min(results, key=lambda x: x[1])
print(f"\n✅ Answer: {best[0]} has shortest wait ({best[1]} minutes)")

👤 User query: Which hospital will see me the fastest? (ignoring travel time)
General Hospital     - 12 patients × 25 min = 300 min wait
St. Mary Medical     -  5 patients × 45 min = 225 min wait
City Emergency       - 18 patients × 35 min = 630 min wait
University Hospital  -  8 patients × 40 min = 320 min wait
Riverside Clinic     -  3 patients × 20 min =  60 min wait

✅ Answer: Riverside Clinic has shortest wait (60 minutes)


Next, let's try the same query, but also factor in the travel time:

#### **Query**: Which hospital will treat me soonest, including travel time?

**🤔 To answer this, we need to:**
1. Calculate wait time for each hospital
2. Calculate travel time to each hospital
3. Add travel times to wait times to get total time for each hospital
4. Find optimal total time

In [123]:
# Query Type 3: Most complex - optimize total time
query = "Which hospital will treat me soonest, including travel time?"

print(f"👤 User query: {query}")
print("\n🤔 Even more complex! We need to:")
print("  1. Calculate wait time for each hospital")
print("  2. Add travel time (assume 2 minutes per mile)")
print("  3. Find optimal total time")
print("\nCoordinating even more tools:\n")

# Travel speed assumption
MINUTES_PER_MILE = 2

results = []
print(f"{'Hospital':<20} {'Wait':<8} {'Travel':<8} {'Total':<8}")
print("-" * 50)

for hospital in list_hospitals():
    # Get wait time
    queue = get_queue_length(hospital)
    avg_time = get_treatment_time(hospital)
    wait = calculate_wait(queue, avg_time)
    
    # Get travel time
    distance = get_distance(hospital)
    travel_time = distance * MINUTES_PER_MILE
    
    # Calculate total
    total = wait + travel_time
    
    results.append((hospital, wait, travel_time, total))
    print(f"{hospital:<20} {wait:<8} {travel_time:<8.0f} {total:<8.0f}")

# Find optimal
best = min(results, key=lambda x: x[3])
print(f"\n✅ Answer: {best[0]} is fastest overall")
print(f"   (Wait: {best[1]} min + Travel: {best[2]:.0f} min = {best[3]:.0f} min total)")

👤 User query: Which hospital will treat me soonest, including travel time?

🤔 Even more complex! We need to:
  1. Calculate wait time for each hospital
  2. Add travel time (assume 2 minutes per mile)
  3. Find optimal total time

Coordinating even more tools:

Hospital             Wait     Travel   Total   
--------------------------------------------------
General Hospital     300      5        305     
St. Mary Medical     225      8        233     
City Emergency       630      2        632     
University Hospital  320      12       332     
Riverside Clinic     60       7        67      

✅ Answer: Riverside Clinic is fastest overall
   (Wait: 60 min + Travel: 7 min = 67 min total)
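
The last two queries differ only in whether travel time is added, so they can share one routine. The sketch below is one possible consolidation, not the workshop's official code: it collapses the dummy data into a single dictionary for brevity, and `rank_hospitals` and its `include_travel` flag are names we made up.

```python
# Same dummy data as above, collapsed into one table for brevity:
# hospital -> (patients waiting, avg minutes per patient, miles away)
HOSPITALS = {
    "General Hospital": (12, 25, 2.5),
    "St. Mary Medical": (5, 45, 4.0),
    "City Emergency": (18, 35, 1.2),
    "University Hospital": (8, 40, 6.0),
    "Riverside Clinic": (3, 20, 3.3),
}
MINUTES_PER_MILE = 2  # same travel-speed assumption as the cell above

def rank_hospitals(include_travel=True):
    """Return (hospital, total minutes) pairs, fastest first (hypothetical helper)."""
    rows = []
    for name, (queue, avg_time, miles) in HOSPITALS.items():
        total = queue * avg_time              # expected wait in minutes
        if include_travel:
            total += miles * MINUTES_PER_MILE  # add driving time
        rows.append((name, total))
    return sorted(rows, key=lambda row: row[1])

best_name, best_total = rank_hospitals()[0]
print(f"{best_name}: {best_total:.0f} min total")
```

Even in this form the logic is still hardcoded: every new kind of question needs a new parameter or a new function.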


<hr>

Now, all these steps are hardcoded by us, which provides very little flexibility. Let's see if an LLM can create a plan for any given query, similar to ours:

In [128]:
user_query = "What's the wait at General Hospital?"

prompt = f"""
You have these ER tools:
- get_queue_length(hospital) - Returns the current queue at a given hospital
- get_treatment_time(hospital) - Returns the avg minutes per patient
- calculate_wait(queue, time) - Returns the total wait time given the current queue length and the average treatment time
- get_distance(hospital) - Returns the distance to the hospital
- calculate_travel_time(distance) - Returns the travel time given the distance
- list_hospitals() - Returns a list of all nearby hospitals

User query: {user_query}

What functions should be called and in what order?
Just list the function calls needed.
"""

print("🤖 Testing LLM's understanding:")
print(f"\nQuery: '{user_query}'")
print("\nLLM response:")
response = generate(prompt)
print(response)

🤖 Testing LLM's understanding:

Query: 'What's the wait at General Hospital?'

LLM response:
1. get_queue_length("General Hospital")
2. get_treatment_time("General Hospital")
3. calculate_wait(queue, time)


In [129]:
# Execute what the LLM suggested
print("\n📊 Let's execute that:")
queue = get_queue_length("General Hospital")
avg = get_treatment_time("General Hospital")
wait = calculate_wait(queue, avg)
print(f"Result: {wait} minutes wait at General Hospital")


📊 Let's execute that:
Result: 300 minutes wait at General Hospital
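
Here we translated the plan into code by hand, because the plan's last step refers to `queue` and `time` as placeholders rather than real values. Below is a rough sketch of how a program could run such a plan itself, assuming the plan arrives in a structured form. The step format (`tool`, `args`, `save_as`, and `$name` references) is invented for this illustration and is not what the LLM returned above.

```python
def get_queue_length(hospital):
    return {"General Hospital": 12}.get(hospital, 0)    # dummy data

def get_treatment_time(hospital):
    return {"General Hospital": 25}.get(hospital, 30)   # dummy data

def calculate_wait(queue, avg_time):
    return queue * avg_time

TOOLS = {
    "get_queue_length": get_queue_length,
    "get_treatment_time": get_treatment_time,
    "calculate_wait": calculate_wait,
}

# Hypothetical plan format: each step stores its result under "save_as";
# string arguments starting with "$" refer to an earlier step's result.
plan = [
    {"tool": "get_queue_length", "args": ["General Hospital"], "save_as": "queue"},
    {"tool": "get_treatment_time", "args": ["General Hospital"], "save_as": "time"},
    {"tool": "calculate_wait", "args": ["$queue", "$time"], "save_as": "wait"},
]

def run_plan(plan, tools):
    """Run the steps in order, feeding earlier results into later steps."""
    results = {}
    for step in plan:
        args = [
            results[arg[1:]] if isinstance(arg, str) and arg.startswith("$") else arg
            for arg in step["args"]
        ]
        results[step["save_as"]] = tools[step["tool"]](*args)
    return results

print(run_plan(plan, TOOLS)["wait"])  # prints 300
```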


In [132]:
sample_queries = [
    "How many people are waiting at City Emergency?",
    "Which hospital is closest to me?",
    "What's the average treatment time at Riverside Clinic?",
    "If I go to University Hospital, how long will I wait?",
    "Which ER has the shortest queue right now?",
    "Is General Hospital or St. Mary Medical faster?"
]

In [135]:
"""
SECTION 3: BUILDING OUR FIRST AGENT
Let's start with the simplest possible "agent" - keyword matching
"""

# VERSION 1: Rigid Keyword-Based Agent

def er_agent_v1(query):
    """Our first attempt - simple keyword matching"""
    query_lower = query.lower()
    
    print(f"🎯 Query: {query}")
    
    # Try to understand intent through keywords
    if "wait" in query_lower and any(hosp in query_lower for hosp in ['general', 'mary', 'city', 'university', 'riverside']):
        # Extract hospital name (very fragile!)
        for hospital in list_hospitals():
            if hospital.lower() in query_lower:
                print(f"💭 Intent: Check wait time at {hospital}")
                
                # Execute the steps
                queue = get_queue_length(hospital)
                avg = get_treatment_time(hospital)
                wait = calculate_wait(queue, avg)
                
                result = f"Wait at {hospital}: {wait} minutes ({wait/60:.1f} hours)"
                print(f"📊 Result: {result}")
                return result
                
    elif "fastest" in query_lower or "shortest" in query_lower:
        print(f"💭 Intent: Find fastest hospital")
        
        # Find the best option
        best_hospital = None
        min_wait = float('inf')
        
        for hospital in list_hospitals():
            queue = get_queue_length(hospital)
            avg = get_treatment_time(hospital)
            wait = calculate_wait(queue, avg)
            if wait < min_wait:
                min_wait = wait
                best_hospital = hospital
        
        result = f"Fastest: {best_hospital} ({min_wait} minutes)"
        print(f"📊 Result: {result}")
        return result
        
    else:
        print(f"💭 Intent: Unknown")
        return "❌ I don't understand that request"

# Test our first agent
print("Testing Agent v1 (Keyword Matching):")
print("="*50)
er_agent_v1("What's the wait at General Hospital?")
print("="*50)
er_agent_v1("Which hospital is the fastest to treat patients")
print("="*50)
er_agent_v1("Which hospital has the shortest wait time")

Testing Agent v1 (Keyword Matching):
🎯 Query: What's the wait at General Hospital?
💭 Intent: Check wait time at General Hospital
📊 Result: Wait at General Hospital: 300 minutes (5.0 hours)
🎯 Query: Which hospital is the fastest to treat patients
💭 Intent: Find fastest hospital
📊 Result: Fastest: Riverside Clinic (60 minutes)
🎯 Query: Which hospital has the shortest wait time
💭 Intent: Find fastest hospital
📊 Result: Fastest: Riverside Clinic (60 minutes)


'Fastest: Riverside Clinic (60 minutes)'

In [139]:
print("\n🚨 Let's test a few variations:")
print("="*50)

test_queries = [
    "What's the wait at General Hospital?",     # ✅ Works
    "How long at General Hospital?",            # ❌ Fails - missing "wait"
    "General Hospital waiting time?",           # ✅ Works - "waiting" contains "wait"
    "Which ER is fastest?",                     # ✅ Works
    "Best hospital to visit when in a rush?",   # ❌ Fails - no "fastest"
    "Quickest ER?",                             # ❌ Fails - "quickest" not "fastest"
]

for query in test_queries:
    result = er_agent_v1(query)
    print(f"\nQuery: '{query}'")
    print(f"Result: {result[:50]}...")  # Truncate for readability
    
print("\n💡 Problem: Slight variations break everything!")
print("We need something that understands MEANING, not just keywords...")


🚨 Let's test a few variations:
🎯 Query: What's the wait at General Hospital?
💭 Intent: Check wait time at General Hospital
📊 Result: Wait at General Hospital: 300 minutes (5.0 hours)

Query: 'What's the wait at General Hospital?'
Result: Wait at General Hospital: 300 minutes (5.0 hours)...
🎯 Query: How long at General Hospital?
💭 Intent: Unknown

Query: 'How long at General Hospital?'
Result: ❌ I don't understand that request...
🎯 Query: General Hospital waiting time?
💭 Intent: Check wait time at General Hospital
📊 Result: Wait at General Hospital: 300 minutes (5.0 hours)

Query: 'General Hospital waiting time?'
Result: Wait at General Hospital: 300 minutes (5.0 hours)...
🎯 Query: Which ER is fastest?
💭 Intent: Find fastest hospital
📊 Result: Fastest: Riverside Clinic (60 minutes)

Query: 'Which ER is fastest?'
Result: Fastest: Riverside Clinic (60 minutes)...
🎯 Query: Best hospital to visit when in a rush?
💭 Intent: Unknown

Query: 'Best hospital to visit when in a rush?'
Result: ❌ I

In [144]:
"""
VERSION 2: LLM-Powered Understanding
What if we let an LLM understand what the user wants?
"""

def understand_er_query(query):
    """Use LLM to understand intent"""
    prompt = f"""
    You are helping with ER queries. Classify this query into one of these intents:
    - check_wait: User wants to know wait time at a specific hospital
    - find_fastest: User wants to find the hospital with shortest wait
    - check_distance: User wants to know how far a hospital is
    - unknown: Doesn't match any intent
    
    Query: {query}
    
    Respond with just the intent name.
    """
    
    intent = generate(prompt).strip().lower()
    return intent

def extract_hospital_name(query):
    """Extract hospital name from query using LLM"""
    prompt = f"""
    Extract the hospital name from this query. 
    Available hospitals: General Hospital, St. Mary Medical, City Emergency, University Hospital, Riverside Clinic
    
    Query: {query}
    
    Respond with just the hospital name or "none" if not found.
    """
    
    return generate(prompt).strip()

# Test the understanding
print("Testing LLM Understanding:")
print("="*50)

test_queries = [
    "How long at General Hospital?",
    "Best hospital to visit when in a rush?",
    "Quickest ER?"
]

for query in test_queries:
    intent = understand_er_query(query)
    print(f"Query: '{query}'")
    print(f"Intent: {intent}\n")

print("✅ LLM understands the MEANING, not just keywords!")

Testing LLM Understanding:
Query: 'How long at General Hospital?'
Intent: check_wait

Query: 'Best hospital to visit when in a rush?'
Intent: find_fastest

Query: 'Quickest ER?'
Intent: find_fastest

✅ LLM understands the MEANING, not just keywords!


**Now that we've seen how an LLM can help create a better ER system, let's incorporate it into our simple agent.**

In [145]:
def er_agent_v2(query):
    """Agent with LLM-powered understanding"""
    
    print(f"🎯 Query: {query}")
    
    # Step 1: Understand intent using LLM
    intent = understand_er_query(query)
    print(f"💭 Intent: {intent}")
    
    # Step 2: Execute based on intent
    if intent == "check_wait":
        hospital = extract_hospital_name(query)
        if hospital.lower() != "none":
            print(f"📍 Hospital: {hospital}")
            queue = get_queue_length(hospital)
            avg = get_treatment_time(hospital)
            wait = calculate_wait(queue, avg)
            result = f"Wait at {hospital}: {wait} minutes ({wait/60:.1f} hours)"
        else:
            result = "Please specify which hospital"
            
    elif intent == "find_fastest":
        best_hospital = None
        min_wait = float('inf')
        
        for hospital in list_hospitals():
            queue = get_queue_length(hospital)
            avg = get_treatment_time(hospital)
            wait = calculate_wait(queue, avg)
            if wait < min_wait:
                min_wait = wait
                best_hospital = hospital
        
        result = f"Fastest: {best_hospital} ({min_wait} minutes wait)"
        
    else:
        result = "I couldn't understand that request"
    
    print(f"📊 Result: {result}")
    return result

# Test the improved agent
print("Testing Agent v2 (LLM Understanding):")
print("="*50)

# These all work now!
er_agent_v2("How long at General Hospital?")
print()
er_agent_v2("Best hospital to visit when in a rush?")
print()
er_agent_v2("Quickest ER?")

Testing Agent v2 (LLM Understanding):
🎯 Query: How long at General Hospital?
💭 Intent: check_wait
📍 Hospital: General Hospital
📊 Result: Wait at General Hospital: 300 minutes (5.0 hours)

🎯 Query: Best hospital to visit when in a rush?
💭 Intent: find_fastest
📊 Result: Fastest: Riverside Clinic (60 minutes wait)

🎯 Query: Quickest ER?
💭 Intent: find_fastest
📊 Result: Fastest: Riverside Clinic (60 minutes wait)


'Fastest: Riverside Clinic (60 minutes wait)'
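
One fragile spot in v2 is that the agent trusts the LLM's text exactly: a reply such as `"General Hospital."` or `None` would not match the dictionary keys or the `"none"` check, and an unknown hospital silently falls back to default values. A minimal sketch of a guard is shown below. `normalize_choice` is a hypothetical helper, and the cleanup rules (trimming quotes, backticks, and trailing periods, ignoring case) are assumptions about how replies might vary.

```python
HOSPITALS = ["General Hospital", "St. Mary Medical", "City Emergency",
             "University Hospital", "Riverside Clinic"]
INTENTS = ["check_wait", "find_fastest", "check_distance", "unknown"]

def normalize_choice(raw, allowed):
    """Map free-form LLM text onto one of the allowed values, or None (hypothetical helper)."""
    # Trim whitespace, then surrounding quotes, backticks, and periods; compare case-insensitively
    cleaned = raw.strip().strip("\"'`.").strip().casefold()
    for option in allowed:
        if option.casefold() == cleaned:
            return option
    return None

print(normalize_choice(' "General Hospital". ', HOSPITALS))  # prints General Hospital
print(normalize_choice("None", HOSPITALS))                   # prints None
print(normalize_choice("Check_Wait", INTENTS))               # prints check_wait
```

The agent could then ask the user to clarify whenever the helper returns `None`, instead of looking up a hospital that does not exist.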

Let's look at a slightly more complex query:

**Query**: What's the wait at General Hospital and is it faster than St. Mary?

This needs:
1. Check wait at General Hospital
2. Check wait at St. Mary Medical
3. Compare them

Let's see if our current agent with the LLM can handle this.

In [147]:
complex_query = "What's the wait at General Hospital and is it faster than St. Mary"
result = er_agent_v2(complex_query)
print(f"Result: {result}")

print("\n💡 We need to be able to dynamically PLAN and execute MULTIPLE steps...")

🎯 Query: What's the wait at General Hospital and is it faster than St. Mary
💭 Intent: check_wait
📍 Hospital: General Hospital
📊 Result: Wait at General Hospital: 300 minutes (5.0 hours)
Result: Wait at General Hospital: 300 minutes (5.0 hours)

💡 We need to be able to dynamically PLAN and execute MULTIPLE steps...


From our previous examples and last week's exercises, we know that an LLM can plan which tools to call.

In [148]:
def plan_er_query(query):
    """Let LLM plan which tools to use"""
    
    tools_description = """
    Available tools:
    - get_queue_length(hospital) - Returns the current queue at a given hospital
    - get_treatment_time(hospital) - Returns the avg minutes per patient
    - calculate_wait(queue, time) - Returns the total wait time given the current queue length and the average treatment time
    - get_distance(hospital) - Returns the distance to the hospital
    - calculate_travel_time(distance) - Returns the travel time given the distance
    - list_hospitals() - Returns a list of all nearby hospitals
    """
    
    prompt = f"""
    {tools_description}
    
    User query: {query}
    
    What tool(s) should be called to answer this? 
    List the exact function calls needed, one per line.
    """
    
    plan = generate(prompt).strip()
    return plan

# Test the planning
print("Testing LLM Planning:")
print("="*50)

queries = [
    "What's the wait at General Hospital?",
    "Which hospital is fastest?",
    "Compare wait times at General Hospital and St. Mary Medical"
]

for query in queries:
    print(f"\nQuery: '{query}'")
    print("Plan:")
    plan = plan_er_query(query)
    print(plan)

Testing LLM Planning:

Query: 'What's the wait at General Hospital?'
Plan:
```
get_queue_length("General Hospital")
get_treatment_time("General Hospital")
calculate_wait(queue, time)
```

Query: 'Which hospital is fastest?'
Plan:
```plaintext
list_hospitals()
get_queue_length(hospital)
get_treatment_time(hospital)
get_distance(hospital)
calculate_travel_time(distance)
calculate_wait(queue, time)
```

Query: 'Compare wait times at General Hospital and St. Mary Medical'
Plan:
```
list_hospitals()
get_queue_length("General Hospital")
get_treatment_time("General Hospital")
get_queue_length("St. Mary Medical")
get_treatment_time("St. Mary Medical")
```
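
The plans above come back wrapped in Markdown code fences (one even carries a `plaintext` tag), so they would need cleaning before any code could act on them. Here is a minimal sketch of such a cleanup step. `parse_plan` is a hypothetical helper, not part of the notebook, and it assumes the plan has one call per line as the prompt requests.

```python
FENCE = "`" * 3  # built this way so the example does not contain a literal fence

def parse_plan(plan_text):
    """Return the function-call lines of an LLM plan, dropping fences and blank lines."""
    calls = []
    for line in plan_text.splitlines():
        line = line.strip()
        # Skip empty lines and fence markers (with or without a language tag)
        if not line or line.startswith(FENCE):
            continue
        calls.append(line)
    return calls

# Rebuild the first plan shown above, here with a language tag on the fence
example_plan = "\n".join([
    FENCE + "plaintext",
    'get_queue_length("General Hospital")',
    'get_treatment_time("General Hospital")',
    "calculate_wait(queue, time)",
    FENCE,
])

for call in parse_plan(example_plan):
    print(call)
```

Note that the last call still contains the placeholder names `queue` and `time`, so a cleaned plan is not necessarily an executable one.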


In [172]:
def er_agent_v3(query):
    """Agent where LLM decides which tools to use"""
    
    # Available tools
    tools = {
        "get_queue_length": get_queue_length,
        "get_treatment_time": get_treatment_time,
        "calculate_wait": calculate_wait,
        "get_distance": get_distance,
        "calculate_travel_time": calculate_travel_time,
        "list_hospitals": list_hospitals
    }
    
    tools_description = """
    Available tools:
    - get_queue_length(hospital) - Returns the current queue at a given hospital
    - get_treatment_time(hospital) - Returns the avg minutes per patient
    - calculate_wait(queue, time) - Returns the total wait time given the current queue length and the average treatment time
    - get_distance(hospital) - Returns the distance to the hospital
    - calculate_travel_time(distance) - Returns the travel time given the distance
    - list_hospitals() - Returns a list of all nearby hospitals
    """
    
    print(f"🎯 Query: {query}")
    
    # Think: What tool to use?
    prompt = f"""
    Here are some functions available to use:
    - get_queue_length(hospital: str) - Returns the current queue length at a given hospital
    - get_treatment_time(hospital: str) - Returns the average minutes per patient. Used to get the average treatment time per patient at a given hospital
    - calculate_wait(queue_length: int, treatment_time: float) - Given the current queue length at a hospital and its average treatment time per patient, returns the total wait time expected
    - get_distance(hospital: str) - Returns the distance to the hospital
    - calculate_travel_time(distance: float) - Returns the travel time given the distance. Usually used when travel time needs to be considered when calculating total time (added to wait time)
    - list_hospitals() - Returns a list of all nearby hospitals
    
    User query: {query}
    
    Respond with ONLY the function call needed (e.g., get_queue_length('General Hospital'))
    If multiple calls needed, just give the first one.
    """
    
    tool_call = generate(prompt).strip()
    print(f"💭 Planning: {tool_call}")
    
    # Act: Execute the tool
    # NOTE: eval() runs whatever the LLM returns, so this is only acceptable in a demo
    try:
        result = eval(tool_call, tools)
        print(f"⚡ Executed: {tool_call} → {result}")
    except Exception as e:
        result = f"Error: {e}"
        print(f"❌ Failed: {e}")
    
    # For demonstration, just return the raw result
    # (In reality, we'd interpret it better)
    return f"Result: {result}"

# Test it
print("Testing Agent v3 (LLM Chooses Tools):")
print("="*50)

er_agent_v3("How many people at General Hospital?")
print()
er_agent_v3("What's the average time at Riverside Clinic?")

Testing Agent v3 (LLM Chooses Tools):
🎯 Query: How many people at General Hospital?
💭 Planning: get_queue_length('General Hospital')
⚡ Executed: get_queue_length('General Hospital') → 12

🎯 Query: What's the average time at Riverside Clinic?
💭 Planning: get_treatment_time('Riverside Clinic')
⚡ Executed: get_treatment_time('Riverside Clinic') → 20


'Result: 20'
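
`er_agent_v3` hands the LLM's text straight to `eval`, which is fine for a workshop demo but would execute anything the model returns. One possible alternative, sketched below, is to parse the call with Python's `ast` module and dispatch through the `tools` dictionary. `run_tool_call` is a hypothetical helper, and the stub `get_queue_length` only mirrors the value shown in the output above.

```python
import ast

def run_tool_call(call_text, tools):
    """Run a call such as "get_queue_length('General Hospital')" without eval().

    Accepts only a bare function name found in `tools`, with literal
    positional arguments (strings, numbers, lists and so on).
    """
    call = ast.parse(call_text.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"Not a simple function call: {call_text!r}")
    if call.func.id not in tools:
        raise ValueError(f"Unknown tool: {call.func.id}")
    if call.keywords:
        raise ValueError("Keyword arguments are not supported in this sketch")
    # literal_eval raises ValueError for anything that is not a plain literal
    args = [ast.literal_eval(arg) for arg in call.args]
    return tools[call.func.id](*args)

# Stand-in tool for this example only
def get_queue_length(hospital):
    return {"General Hospital": 12}.get(hospital, 0)

demo_tools = {"get_queue_length": get_queue_length}
print(run_tool_call("get_queue_length('General Hospital')", demo_tools))
```

The trade-off is that this version rejects anything beyond a single plain call, including calls with placeholder variable names as arguments.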

Now let's look deeper into this query:

**Query**: What's the wait at General Hospital?

This actually needs THREE tools:
1. get_queue_length('General Hospital') → 12
2. get_treatment_time('General Hospital') → 25
3. calculate_wait(12, 25) → 300 minutes

<hr>
💡 An LLM can accurately identify which functions to call, but to fill in the correct arguments it sometimes needs the results of previous calls.

In the above example, the LLM thinks that this is what should be executed:
1. get_queue_length('General Hospital') 
2. get_treatment_time('General Hospital')
3. calculate_wait(queue_length, treatment_time)

Steps 1 and 2 can be executed exactly as generated, but step 3 can't: the LLM has no way to predict the values to pass to `calculate_wait`. It needs the outputs of the `get_queue_length` and `get_treatment_time` calls to formulate the call with the right arguments.
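
To make the failure concrete, here is a small sketch using the same `eval` mechanism as `er_agent_v3`. The `calculate_wait` below is a stand-in that assumes the wait is simply queue length times minutes per patient, which matches the 12 × 25 = 300 result shown above.

```python
def calculate_wait(queue_length, treatment_time):
    # Stand-in for the notebook's tool (assumed formula: queue × minutes per patient)
    return queue_length * treatment_time

tools = {"calculate_wait": calculate_wait}

# The planned call still contains placeholder names, so it cannot run
try:
    eval("calculate_wait(queue_length, treatment_time)", dict(tools))
except NameError as e:
    error_message = str(e)
    print(f"Planned call failed: {error_message}")

# Once the earlier results (12 and 25) are known, the call can be written out in full
print(eval("calculate_wait(12, 25)", dict(tools)))
```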

**🤔 What if we let the agent call tools in a LOOP until it's done?**

In [171]:
def er_agent_v4(query, max_steps=30):
    tools = {
        "get_queue_length": get_queue_length,
        "get_treatment_time": get_treatment_time,
        "calculate_wait": calculate_wait,
        "get_distance": get_distance,
        "calculate_travel_time": calculate_travel_time,
        "list_hospitals": list_hospitals
    }
    
    
    print(f"🎯 QUERY: {query}")
    print("─" * 40)
    
    history = []
    
    for step in range(max_steps):
        print(f"\n📍 Step {step + 1}:")
        
        prompt = f"""
        Query: {query}
        
        Previous steps: {history}
        
        Here are some tools that you can use:
        - get_queue_length(hospital: str) - Returns the current queue length at a given hospital
        - get_treatment_time(hospital: str) - Returns the average minutes per patient. Used to get the average treatment time per patient at a given hospital
        - calculate_wait(queue_length: int, treatment_time: float) - Given the current queue length at a hospital and its average treatment time per patient, returns the total wait time expected
        - get_distance(hospital: str) - Returns the distance to the hospital
        - calculate_travel_time(distance: float) - Returns the travel time given the distance. Usually used when travel time needs to be considered when calculating total time (added to wait time)
        - list_hospitals() - Returns a list of all nearby hospitals

        What's the next step? If done, say "DONE".
        Otherwise, give the exact function call. Respond with just the function call without any code blocks.
        """
        
        response = generate(prompt).strip()
        print(f"LLM Response: {response}")
        
        if "DONE" in response.upper():
            break
            
        try:
            result = eval(response, tools)
            print(f"Execution Result: {result}")
            history.append(f"{response} → {result}")
        except Exception as e:
            print(f"❌ Error: {e}")
            # Keep the error text so the LLM can see what went wrong on the next step
            history.append(f"{response} → Error: {e}")
    
    # Generate final answer once done
    final_prompt = f"""
    Question: {query}
    Steps taken: {history}
    
    Give a brief, clear answer to the original question.
    """
    
    answer = generate(final_prompt).strip()
    print(f"\n✅ ANSWER: {answer}")
    return answer

# Test the agent
print("Testing Agent v4:")
print("="*50)

er_agent_v4("What's the wait at General Hospital?")

Testing Agent v4:
🎯 QUERY: What's the wait at General Hospital?
────────────────────────────────────────

📍 Step 1:
LLM Response: get_queue_length("General Hospital")
Execution Result: 12

📍 Step 2:
LLM Response: get_treatment_time("General Hospital")
Execution Result: 25

📍 Step 3:
LLM Response: calculate_wait(12, 25)
Execution Result: 300

📍 Step 4:
LLM Response: DONE

✅ ANSWER: The wait at General Hospital is approximately 300 minutes.


'The wait at General Hospital is approximately 300 minutes.'
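
One detail worth noting in `er_agent_v4`: the loop stops when the letters "DONE" appear anywhere in the response. That substring check could fire on a tool call whose argument happens to contain those letters. A stricter check might look like the sketch below, where `is_done` is a hypothetical helper and "Abandoned Clinic" is an invented hospital name used only to trigger the edge case.

```python
def is_done(response):
    """True only when the whole response is the stop word, ignoring case and trailing punctuation."""
    return response.strip().strip(".!\"'`").upper() == "DONE"

tool_call = "get_queue_length('Abandoned Clinic')"

print(is_done("DONE"))      # stop word on its own
print(is_done("Done."))     # tolerated variation
print(is_done(tool_call))   # a tool call, so the loop should continue

# The substring test from the loop would stop early on this tool call
print("DONE" in tool_call.upper())
```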

In [178]:
print("\nLet's test the previous variations (some of which failed):")
print("="*50)

test_queries = [
    "What's the wait at General Hospital?",
    "General Hospital waiting time?",
    "Which ER is fastest?",
    "Best hospital to visit when in a rush?",
    "Best hospital to visit when in a rush, accounting for wait times and travel, not just based on queue length?"
]

for query in test_queries:
    result = er_agent_v4(query)
    print(f"\nQuery: '{query}'")
    print(f"Result: {result}")
    print("=" * 50)
    


Let's test the previous variations (some of which failed):
🎯 QUERY: What's the wait at General Hospital?
────────────────────────────────────────

📍 Step 1:
LLM Response: get_queue_length("General Hospital")
Execution Result: 12

📍 Step 2:
LLM Response: get_treatment_time("General Hospital")
Execution Result: 25

📍 Step 3:
LLM Response: calculate_wait(12, 25)
Execution Result: 300

📍 Step 4:
LLM Response: DONE

✅ ANSWER: The wait at General Hospital is approximately 300 minutes.

Query: 'What's the wait at General Hospital?'
Result: The wait at General Hospital is approximately 300 minutes.
🎯 QUERY: General Hospital waiting time?
────────────────────────────────────────

📍 Step 1:
LLM Response: list_hospitals()
Execution Result: ['General Hospital', 'St. Mary Medical', 'City Emergency', 'University Hospital', 'Riverside Clinic']

📍 Step 2:
LLM Response: get_queue_length('General Hospital')
Execution Result: 12

📍 Step 3:
LLM Response: get_treatment_time('General Hospital')
Executi

In [166]:
er_agent_v4("Which is faster: General Hospital or St. Mary Medical?")

print("\n" + "="*50)

er_agent_v4("Find the hospital with the shortest wait time")

print("\n" + "="*50)

er_agent_v4("If I go to City Emergency, how long including travel time?")

🎯 QUERY: Which is faster: General Hospital or St. Mary Medical?
────────────────────────────────────────

📍 Step 1:
LLM Response: get_queue_length("General Hospital")
Execution Result: 12

📍 Step 2:
LLM Response: get_queue_length("St. Mary Medical")
Execution Result: 5

📍 Step 3:
LLM Response: get_treatment_time("General Hospital")
Execution Result: 25

📍 Step 4:
LLM Response: calculate_wait(12, 25)
Execution Result: 300

📍 Step 5:
LLM Response: get_treatment_time("St. Mary Medical")
Execution Result: 45

✅ ANSWER: St. Mary Medical is faster. It has a shorter queue (5) and a treatment time of 45 minutes, while General Hospital has a queue of 12 and a treatment time of 25 minutes, resulting in a longer wait time of 300 minutes.

🎯 QUERY: Find the hospital with the shortest wait time
────────────────────────────────────────

📍 Step 1:
LLM Response: list_hospitals()
Execution Result: ['General Hospital', 'St. Mary Medical', 'City Emergency', 'University Hospital', 'Riverside Clinic']

📍 S

'The total time, including travel, is approximately 633.4 minutes (or about 10.6 hours).'

#### **EVOLUTION OF OUR AGENT**
========================

- **Version 1: Keyword Matching**

  ❌ Breaks with slight variations
  
  ❌ Can't understand meaning
  
- **Version 2: LLM Understanding**

  ✅ Understands meaning
  
  ❌ Still follows rigid patterns
  
- **Version 3: LLM Chooses Tools**

  ✅ Flexible tool selection
  
  ❌ Only one tool at a time (as subsequent tool calls may depend on previous call results)
  
- **Version 4: Executing Steps in a Loop**

  ✅ Multiple steps
  
  ✅ Can execute until a goal is met
  
  ✅ No hardcoded pipelines!

#### **PSEUDOCODE FOR THE AGENT LOOP:**
─────────────────
<pre>
done = False
while not done:
    thought = think(query, context)     # What to do?
    action = execute(thought)           # Do it
    observation = observe(action)       # What happened?
    if observation.is_done():
        done = True                     # Stop if done
    context.update(observation)         # Remember it if needed
answer = finalize(context)
return answer
</pre>
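
The pseudocode can be turned into a runnable sketch without calling an LLM at all. In the version below, `run_agent_loop` and `scripted_think` are hypothetical names: `scripted_think` is a hard-coded stand-in for the model, and the lambda tools simply return the General Hospital values seen earlier.

```python
def run_agent_loop(query, think, tools, max_steps=10):
    """Generic think → act → observe loop.

    `think(query, context)` returns ("call", tool_name, args) or ("done",).
    """
    context = []
    for _ in range(max_steps):
        thought = think(query, context)             # What to do?
        if thought[0] == "done":                    # Stop if done
            break
        _, name, args = thought
        observation = tools[name](*args)            # Do it and see what happened
        context.append((name, args, observation))   # Remember it
    return context

# Stand-in tools returning the values shown earlier for General Hospital
tools = {
    "get_queue_length": lambda hospital: 12,
    "get_treatment_time": lambda hospital: 25,
    "calculate_wait": lambda queue, minutes: queue * minutes,
}

def scripted_think(query, context):
    """Hard-coded stand-in for the LLM: choose the next step from what is already known."""
    results = {name: obs for name, _, obs in context}
    if "get_queue_length" not in results:
        return ("call", "get_queue_length", ("General Hospital",))
    if "get_treatment_time" not in results:
        return ("call", "get_treatment_time", ("General Hospital",))
    if "calculate_wait" not in results:
        return ("call", "calculate_wait",
                (results["get_queue_length"], results["get_treatment_time"]))
    return ("done",)

trace = run_agent_loop("What's the wait at General Hospital?", scripted_think, tools)
for name, args, observation in trace:
    print(f"{name}{args} → {observation}")
```

Swapping `scripted_think` for a function that prompts an LLM with the query and the context gives the same structure as `er_agent_v4`.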

**What we just built in v4 has a name in AI research: ReAct (Reasoning and Acting)**

It was introduced in this [paper](https://arxiv.org/abs/2210.03629).

It uses something called ReAct prompting to let an LLM alternate between reasoning and acting until a given goal is met. In the paper the model also writes out its reasoning before each action; our v4 is a simplified version that only emits the actions.
[ReAct Prompting Guide here](https://www.promptingguide.ai/techniques/react)

<pre>
The ReAct Loop:
┌─────────────────────────────────────────┐
│              User Query                  │
└─────────────┬───────────────────────────┘
              ▼
        ┌─────────────┐
        │   THOUGHT   │ ← "What needs to be done?"
        └─────┬───────┘
              ▼
        ┌─────────────┐
        │   ACTION    │ ← "Execute the tool/function"
        └─────┬───────┘
              ▼
        ┌─────────────┐
        │ OBSERVATION │ ← "What does the result mean?"
        └─────┬───────┘
              ▼
    ┌───────────────────┐
    │ Task Complete?    │
    │ No → Loop back    │
    │ Yes → Return      │
    └───────────────────┘

</pre>

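We've been using the term "Agent" for a while in this workshop. Let's look more closely at what it means.

**[This](https://www.anthropic.com/engineering/building-effective-agents)** blog post by Anthropic explains the concept of "Agents" really well. It distinguishes *workflows*, where LLMs and tools are orchestrated through predefined code paths, from *agents*, where the LLM dynamically directs its own process and tool usage. In those terms, our v2 is closer to a workflow and v4 is closer to an agent. Whenever we say "agent" in our project, we mean it in Anthropic's sense.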

<hr>

## Building Our AI-Powered Analytics Assistant

Now that we understand how agents work, let's briefly design our actual project.

### What Users Want

Imagine a user with a sales dataset like this:

```
date       | product  | category    | price | quantity | region
-----------|----------|-------------|-------|----------|--------
2024-01-01 | Laptop   | Electronics | 1200  | 2        | North
2024-01-02 | Mouse    | Accessories | 25    | 10       | South
2024-01-03 | Keyboard | Accessories | 75    | 5        | North
2024-01-04 | Monitor  | Electronics | 300   | 3        | East
2024-01-05 | Laptop   | Electronics | 1200  | 1        | South
...
```

They'll ask things like:
- "Show me the data"
- "Filter for sales above $1000"
- "What's the average price by category?"
- "Plot sales over time"
- "Which region has the fastest growing sales?"

### From Queries to Tools

Looking at these queries, we see patterns:
- **Everything needs data loaded** → `load_csv()`
- **Users want to see data** → `show_data()`, `show_info()`
- **Lots of filtering** → `filter_rows()`
- **Aggregations everywhere** → `group_and_aggregate()`
- **Visual understanding** → `create_plot()`

### Tool Design Philosophy

Instead of one mega-tool or overly specific tools, we usually want **composable operations** (though the right granularity depends on the use case):

```python
# Signatures only; bodies are left out on purpose

# ❌ Too broad
def analyze_data(operation, **params): ...

# ❌ Too specific
def show_sales_by_region(): ...

# ✅ Just right - composable!
def filter_rows(condition): ...
def group_and_aggregate(group_by, column, operation): ...
```

Each tool does ONE thing. The LLM combines them intelligently to answer complex queries.
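
As a rough illustration of what "composable" means here, the sketch below implements two of the tools with pandas and chains them. It is one possible shape, not the project's final design: passing the DataFrame explicitly as `df` is an assumption made for this example (the signatures above leave the data implicit), and the rows are the five sample rows from the table earlier.

```python
import pandas as pd

def filter_rows(df, condition):
    """Keep rows matching a pandas query string, e.g. "category == 'Electronics'"."""
    return df.query(condition)

def group_and_aggregate(df, group_by, column, operation):
    """Group by one column and aggregate another with a named operation such as 'sum' or 'mean'."""
    return df.groupby(group_by)[column].agg(operation)

# The five sample rows from the table above
sales = pd.DataFrame({
    "product":  ["Laptop", "Mouse", "Keyboard", "Monitor", "Laptop"],
    "category": ["Electronics", "Accessories", "Accessories", "Electronics", "Electronics"],
    "price":    [1200, 25, 75, 300, 1200],
    "quantity": [2, 10, 5, 3, 1],
    "region":   ["North", "South", "North", "East", "South"],
})

# Compose the two tools: units of Electronics sold per region
electronics = filter_rows(sales, "category == 'Electronics'")
units_by_region = group_and_aggregate(electronics, "region", "quantity", "sum")
print(units_by_region)
```

Neither tool knows anything about sales or regions, which is what lets the same pair answer many different questions.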

### Our Starting Toolset

We'll keep it simple and start with these capabilities:

```
load_csv(filepath)
show_info()
show_data(n_rows)
filter_rows(condition)
group_and_aggregate(group_by, column, operation)
calculate_statistics(column, stat_type)
create_plot(plot_type, **kwargs)
... and a few more
```

### Simulation

**Query**: "What's the average price by category?"

**Agent breaks it down**:
1. Group by category
2. Calculate mean of price
3. Return formatted results

Just like our ER agent, but now with pandas operations instead of ER queries!
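
The three steps of the breakdown map onto pandas roughly as follows. This is a sketch on the five sample rows shown earlier, written directly in pandas rather than through the agent's tools.

```python
import pandas as pd

# Category and price columns from the five sample rows
sales = pd.DataFrame({
    "category": ["Electronics", "Accessories", "Accessories", "Electronics", "Electronics"],
    "price":    [1200, 25, 75, 300, 1200],
})

# Steps 1 and 2: group by category, then take the mean of price
avg_price = sales.groupby("category")["price"].mean()

# Step 3: format the result for the user
for category, price in avg_price.items():
    print(f"{category}: ${price:,.2f}")
```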

### Before Next Week

Think about:
1. What 3-5 queries would you ask YOUR data?
2. Which tools make sense for your data and domain?
3. How granular should those tools be?

In the next workshop, we'll build one version of the assistant together, using a dataset of my choice.