# Workshop 3: Building AI Agents -- Foundations

Last week we explored how LLMs work and how to guide them with prompts and structured extraction. Today we push further: we discover what LLMs *cannot* do, what they *can* do surprisingly well, and how combining those abilities creates something powerful -- an AI agent.

In [None]:
# Setup and imports
from utils.display import output_box, llm_response, separator
from openai import OpenAI
from pydantic import BaseModel

openai_client = OpenAI()


def generate(prompt, temperature=0):
    """Generate a response from the LLM. Same helper from Workshop 2."""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=temperature
    )
    return response.choices[0].message.content.strip()

## Part 1: What LLMs Can and Cannot Do

Last week's picture of an LLM as a next-token predictor raises a practical question: what happens when we need an LLM to DO things in the real world? The four experiments below ask it for arithmetic, the current time, live weather, and a local file listing.

In [None]:
# Can the LLM do math?
response = generate("What is 1,847 x 293? Give me just the number.")

llm_response(response, label="LLM's Answer")
print(f"Actual answer: {1847 * 293:,}")

The LLM produces an answer, but compare it with the value Python computed: the two may not match. The model generates text that *looks like* arithmetic by predicting plausible tokens. It never runs a multiplication algorithm, so nothing guarantees the digits are right.
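
To make that comparison explicit, one option is to pull the digits out of the model's reply and check them against Python's own arithmetic. This is a minimal sketch that assumes the reply contains a single number, possibly with thousands separators. `extract_int` and the two sample replies are hypothetical, not part of the workshop helpers.

```python
import re


def extract_int(text):
    """Return the first integer found in text, ignoring thousands separators."""
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


expected = 1847 * 293  # computed by Python, not predicted

# Hypothetical replies: one correct, one plausible-looking but wrong
for reply in ["541,171", "The answer is 541,271."]:
    value = extract_int(reply)
    verdict = "correct" if value == expected else "wrong"
    print(f"{reply!r} -> {value} ({verdict})")
```

The wrong reply differs from the right one by a single digit, which is the kind of error that is easy to miss by eye.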

In [None]:
# Can the LLM check the current time?
response = generate("What is the exact current time right now?")

llm_response(response, label="LLM on Current Time")

In [None]:
# Can the LLM check today's weather?
response = generate("What is the weather like in New York City right now?")

llm_response(response, label="LLM on Live Weather")

In [None]:
# Can the LLM read a file on our computer?
response = generate("List all the files in my current directory.")

llm_response(response, label="LLM on Local Files")

> **Key Insight:** On its own, an LLM can only generate text. It cannot reliably perform calculations, and it has no access to files, the clock, or live data. Everything it produces is a prediction of what text should come next.
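
The limitation belongs to the model, not to the program around it. As a minimal sketch, ordinary Python can read the clock and hand the value to the model as text. `build_time_prompt` is a hypothetical helper introduced for illustration, and the prompt wording is an assumption.

```python
from datetime import datetime


def build_time_prompt(question, now=None):
    """Prepend the system clock reading to a question so the LLM can quote it."""
    now = now or datetime.now()
    return (
        f"Current local time: {now.strftime('%Y-%m-%d %H:%M')}\n\n"
        f"Using the time above, answer: {question}"
    )


print(build_time_prompt("What is the exact current time right now?"))
```

Passing the resulting string to `generate` would let the model answer from the supplied text rather than guess.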

## Part 2: But LLMs Can Plan

We just saw that LLMs cannot *do* things. But notice something interesting -- when asked about files, the LLM may well have described *how* to list them, even though it could not run the command itself. Let's explore this ability further.

In [None]:
# Can the LLM plan how to find a restaurant's wait time?
response = generate(
    "If I wanted to know the wait time at a restaurant, "
    "what steps would I need to take?"
)

llm_response(response, label="LLM's Plan for Wait Time")

In [None]:
# Can the LLM plan how to find the cheapest restaurant?
response = generate(
    "If I wanted to find the cheapest restaurant nearby, "
    "what information would I need to gather?"
)

llm_response(response, label="LLM's Plan for Cheapest Restaurant")

In [None]:
output_box(
    "LLMs are excellent at understanding questions and describing what needs "
    "to be done. What if instead of asking LLMs to DO things, we ask them to "
    "PLAN what needs to be done -- and then a system executes the plan?",
    label="KEY INSIGHT",
    style="warning"
)

## Part 3: From Text to Data (Workshop 2 Refresher)

Our scenario today: helping people find restaurants. We have some restaurant info, but it's buried in messy text...

In [None]:
# Unstructured restaurant descriptions (first 3)
RESTAURANT_DESCRIPTIONS = {
    "Olive Garden": (
        "A family-friendly Italian chain known for their unlimited "
        "breadsticks and pasta. Typical dinner runs $15-25 per person. "
        "Vegetarian options available including eggplant parm and "
        "pasta primavera. Located on Main Street, about 5 minutes "
        "from downtown. Open until 10 PM on weekdays."
    ),
    "Sushi Palace": (
        "An upscale Japanese restaurant with an extensive omakase "
        "menu. Expect to spend $35-60 per person for dinner. Limited "
        "vegetarian options, mostly edamame and veggie rolls. Tucked "
        "away in the arts district. Closes at 11 PM most nights."
    ),
    "Burger Barn": (
        "A no-frills burger joint with the best smash burgers in town. "
        "Meals run $8-15 per person. They have a black bean burger "
        "for vegetarians. Right next to the highway exit, very easy "
        "to find. Kitchen closes at 9 PM sharp."
    ),
}

In [None]:
# Unstructured restaurant descriptions (remaining 3)
RESTAURANT_DESCRIPTIONS.update({
    "Taj Mahal": (
        "Authentic Indian cuisine with a wood-fired tandoor oven. "
        "Dinner is typically $18-30 per person. Excellent vegetarian "
        "selection with paneer dishes, dal, and veggie biryani. "
        "Located in the university quarter. Open until 10:30 PM."
    ),
    "Dragon Wok": (
        "A popular Chinese takeout spot with generous portions. Most "
        "dishes are $10-18 per person. A few vegetarian stir-fry "
        "options available. Situated on the east side near the park. "
        "Open until 9:30 PM, later on weekends."
    ),
    "La Piazza": (
        "Fine dining Italian with handmade pasta and an award-winning "
        "wine list. Plan on $45-80 per person for a full dinner. "
        "Vegetarian tasting menu available on request. Located in "
        "the waterfront district with beautiful views. Closes at "
        "11 PM, reservations recommended."
    ),
})

# Show one example
print("Olive Garden description:")
print(RESTAURANT_DESCRIPTIONS["Olive Garden"])

In [None]:
# Define a Pydantic model to extract structured info
# Same structured extraction pattern from Workshop 2

class RestaurantInfo(BaseModel):
    price_per_person_low: int
    price_per_person_high: int
    has_vegetarian: bool
    closing_time: str
    special_notes: str

In [None]:
# Extract structured data from one description
completion = openai_client.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract restaurant information from the description."},
        {"role": "user", "content": RESTAURANT_DESCRIPTIONS["Olive Garden"]}
    ],
    response_format=RestaurantInfo,
)

info = completion.choices[0].message.parsed

print(f"Price range: ${info.price_per_person_low}-${info.price_per_person_high}")
print(f"Vegetarian options: {info.has_vegetarian}")
print(f"Closing time: {info.closing_time}")
print(f"Notes: {info.special_notes}")

> **Key Insight:** This is the same structured extraction from last week -- now used as a building block. We can turn messy text into clean data that tools can work with.
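
To see why clean data matters, here is a small sketch of what ordinary code can do once the fields exist. It makes no API call: a plain dataclass stands in for the Pydantic model, and the values are typed in by hand from the descriptions above as an assumption about what extraction would return. `within_budget` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class RestaurantInfo:
    # Stand-in for the Pydantic model above, with the same fields
    price_per_person_low: int
    price_per_person_high: int
    has_vegetarian: bool
    closing_time: str
    special_notes: str


# Hand-written values read off the descriptions; real values would come from the LLM
extracted = {
    "Olive Garden": RestaurantInfo(15, 25, True, "10 PM", "Unlimited breadsticks"),
    "Sushi Palace": RestaurantInfo(35, 60, True, "11 PM", "Omakase menu"),
    "Burger Barn": RestaurantInfo(8, 15, True, "9 PM", "Smash burgers"),
}


def within_budget(infos, budget):
    """Names of restaurants whose top dinner price fits the budget, cheapest first."""
    matches = [
        name for name, info in infos.items()
        if info.price_per_person_high <= budget
    ]
    return sorted(matches, key=lambda name: infos[name].price_per_person_high)


print(within_budget(extracted, budget=25))
```

A filter like this is a one-liner on structured fields, but it would be fragile to write against the raw descriptions.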

## Part 4: From Planning to Execution

We know LLMs can plan and we know how to structure data. Now let's see whether an LLM can tell a system which function to call, and whether the system can then execute that call.

In [None]:
# Restaurant structured data -- used by agent tools
RESTAURANTS = {
    "Olive Garden": {"cuisine": "Italian", "price_range": "$$", "rating": 4.2, "distance_miles": 2.5},
    "Sushi Palace": {"cuisine": "Japanese", "price_range": "$$$", "rating": 4.7, "distance_miles": 5.0},
    "Burger Barn": {"cuisine": "American", "price_range": "$", "rating": 3.8, "distance_miles": 1.0},
    "Taj Mahal": {"cuisine": "Indian", "price_range": "$$", "rating": 4.5, "distance_miles": 3.2},
    "Dragon Wok": {"cuisine": "Chinese", "price_range": "$", "rating": 4.0, "distance_miles": 4.5},
    "La Piazza": {"cuisine": "Italian", "price_range": "$$$", "rating": 4.8, "distance_miles": 6.0},
}

# Simulated real-time wait times (minutes)
WAIT_TIMES = {
    "Olive Garden": 25,
    "Sushi Palace": 45,
    "Burger Barn": 5,
    "Taj Mahal": 15,
    "Dragon Wok": 30,
    "La Piazza": 60,
}

In [None]:
# Simple tool functions

def get_wait_time(restaurant):
    """Get current wait time in minutes at a restaurant."""
    for name, wait in WAIT_TIMES.items():
        if name.lower() == restaurant.lower():
            return wait
    available = ", ".join(WAIT_TIMES.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


def get_rating(restaurant):
    """Get the rating (1-5 stars) for a restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            return info["rating"]
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"

In [None]:
# Test the tools
print(f"Wait time at Olive Garden: {get_wait_time('Olive Garden')} minutes")
print(f"Rating of Sushi Palace: {get_rating('Sushi Palace')} stars")

In [None]:
# Ask the LLM to generate a function call
prompt = (
    "Given these available functions:\n"
    "- get_wait_time(restaurant) - Get current wait time in minutes\n"
    "- get_rating(restaurant) - Get rating (1-5 stars)\n\n"
    "What function call would answer: "
    "'How long is the wait at Olive Garden?'\n\n"
    "Respond with ONLY the function call, nothing else."
)

function_call_text = generate(prompt)

llm_response(function_call_text, label="LLM Generated Function Call")
print(f"Type: {type(function_call_text).__name__} -- still just text!")
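
The function list in the prompt above is written by hand, so it can drift out of sync with the actual tools. One possible alternative, sketched here under the assumption that every tool has a one-line docstring, is to build the list from the functions themselves with the standard `inspect` module. `describe_tools` and `build_plan_prompt` are hypothetical helpers, and the tool bodies are placeholders so the sketch runs on its own.

```python
import inspect


# Stand-in tools so the sketch is self-contained; the bodies are placeholders
def get_wait_time(restaurant):
    """Get current wait time in minutes at a restaurant."""
    return 25


def get_rating(restaurant):
    """Get the rating (1-5 stars) for a restaurant."""
    return 4.2


def describe_tools(tools):
    """Return one '- name(params) - summary' line per tool, from its docstring."""
    lines = []
    for name, func in tools.items():
        doc = inspect.getdoc(func) or "No description."
        lines.append(f"- {name}{inspect.signature(func)} - {doc.splitlines()[0]}")
    return "\n".join(lines)


def build_plan_prompt(question, tools):
    """Assemble the same kind of prompt as the hand-written one."""
    return (
        "Given these available functions:\n"
        f"{describe_tools(tools)}\n\n"
        f"What function call would answer: '{question}'\n\n"
        "Respond with ONLY the function call, nothing else."
    )


demo_tools = {"get_wait_time": get_wait_time, "get_rating": get_rating}
print(build_plan_prompt("How long is the wait at Olive Garden?", demo_tools))
```

With this approach, adding a tool to the dictionary is enough for it to appear in the prompt.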

In [None]:
# Execute the LLM's function call
tools_dict = {
    "get_wait_time": get_wait_time,
    "get_rating": get_rating,
}

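# Strip any markdown code fences or backticks the LLM may have added
clean_call = function_call_text.strip()
if clean_call.startswith("`"):
    # Drop the backticks, then keep the last non-empty line (this skips a "python" tag line)
    body_lines = [line.strip() for line in clean_call.strip("`").splitlines() if line.strip()]
    clean_call = body_lines[-1] if body_lines else ""

# Demo only: eval with empty builtins is NOT a security sandbox for untrusted text
result = eval(clean_call, {"__builtins__": {}}, tools_dict)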

print(f"Function call: {clean_call}")
print(f"Result: {result} minutes")
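
Because `eval` will run whatever expression the model returns, a stricter option is to parse the text and look the function up by name. The sketch below uses the standard `ast` module and assumes the model returns a single call with literal positional arguments. `parse_call` and `demo_tools` are hypothetical names, not part of the workshop code.

```python
import ast


def parse_call(call_text, tools):
    """Parse text like "get_rating('Sushi Palace')" into (function, args).

    Accepts only a single call to a known tool with literal positional arguments.
    """
    tree = ast.parse(call_text.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"Not a plain function call: {call_text!r}")
    if call.func.id not in tools:
        raise ValueError(f"Unknown tool: {call.func.id!r}")
    if call.keywords:
        raise ValueError("Keyword arguments are not supported in this sketch")
    # literal_eval accepts only literals (strings, numbers, ...), never nested calls
    args = [ast.literal_eval(arg) for arg in call.args]
    return tools[call.func.id], args


# Stand-in tool so the sketch runs on its own
demo_tools = {"get_wait_time": lambda restaurant: 25}

func, args = parse_call("get_wait_time('Olive Garden')", demo_tools)
print(func(*args))
```

Text such as `__import__('os')` is rejected as an unknown tool instead of being executed, and arguments that are not plain literals raise `ValueError`.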

In [None]:
# Complete flow: user question -> LLM plans -> system executes
user_question = "What's the rating of Sushi Palace?"

separator("Step 1: User asks a question")
print(f"User: {user_question}")

separator("Step 2: LLM decides which function to call")
plan_prompt = (
    "Given these available functions:\n"
    "- get_wait_time(restaurant) - Get current wait time in minutes\n"
    "- get_rating(restaurant) - Get rating (1-5 stars)\n\n"
    f"What function call would answer: '{user_question}'\n\n"
    "Respond with ONLY the function call, nothing else."
)
planned_call = generate(plan_prompt)
print(f"LLM says: {planned_call}")

In [None]:
# Execute the planned call and show the result
separator("Step 3: System executes the function")
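clean_call = planned_call.strip()
if clean_call.startswith("`"):
    # Drop the backticks, then keep the last non-empty line (this skips a "python" tag line)
    body_lines = [line.strip() for line in clean_call.strip("`").splitlines() if line.strip()]
    clean_call = body_lines[-1] if body_lines else ""
# Demo only: eval with empty builtins is NOT a security sandbox for untrusted text
result = eval(clean_call, {"__builtins__": {}}, tools_dict)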
print(f"Result: {result}")

separator("Step 4: Result returned to user")
output_box(
    f"Question: {user_question}\n"
    f"Answer: {result}",
    label="Final Answer",
    style="success"
)

> **Key Insight:** The LLM never executed anything itself. It generated a function call as text, and our system ran it. This pattern -- LLM plans, system executes -- is the core idea behind AI agents.

### What We Discovered

In [None]:
output_box(
    "1. User asks a question\n"
    "2. LLM decides which function to call\n"
    "3. System executes the function\n"
    "4. Result comes back to the user\n\n"
    "This is the foundation of how agents work.",
    label="What We Discovered",
    style="success"
)
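
The four steps can also be sketched as a single function. To keep the example runnable without an API key, a keyword-matching `fake_plan` stands in for the LLM. It is a hypothetical placeholder, as are `answer_question` and `demo_tools`, and in the notebook the planning step would be a call to `generate`.

```python
def answer_question(question, plan, tools):
    """Run one plan-then-execute cycle.

    plan:  callable taking the question and returning (tool_name, argument)
    tools: dict mapping tool names to Python functions
    """
    tool_name, argument = plan(question)       # Step 2: planner picks a tool
    if tool_name not in tools:
        return f"No tool named '{tool_name}'"
    result = tools[tool_name](argument)        # Step 3: system executes it
    return f"{tool_name}({argument!r}) -> {result}"  # Step 4: result goes back


def fake_plan(question):
    """Hypothetical stand-in for the LLM: keyword matching, no model call."""
    text = question.lower()
    tool = "get_rating" if "rating" in text else "get_wait_time"
    name = "Sushi Palace" if "sushi" in text else "Olive Garden"
    return tool, name


# Stand-in tools with a subset of the workshop data
demo_tools = {
    "get_wait_time": lambda r: {"Olive Garden": 25, "Sushi Palace": 45}[r],
    "get_rating": lambda r: {"Olive Garden": 4.2, "Sushi Palace": 4.7}[r],
}

# Step 1: the user asks a question
print(answer_question("What's the rating of Sushi Palace?", fake_plan, demo_tools))
```

Only the planner would change when a real model is swapped in. The execution step stays ordinary Python.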