# Workshop 3: Building AI Agents -- Foundations

Last week we explored how LLMs work and how to guide them with prompts and structured extraction. Today we push further -- we'll code our way up to AI agents and discover what makes them work.

In [None]:
# Setup and imports
from utils.display import output_box, llm_response, separator
from openai import OpenAI
from pydantic import BaseModel

openai_client = OpenAI()


def generate(prompt, temperature=0):
    """Generate a response from the LLM. Same helper from Workshop 2."""
    response = openai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt}
        ],
        temperature=temperature
    )
    return response.choices[0].message.content.strip()

## Part 1: Let's Test the LLM

Let's start with a simple question: what happens when we ask an LLM to do something beyond generating text?

In [None]:
# Can the LLM do math?
response = generate("What is 1,847 x 293? Give me just the number.")

llm_response(response, label="LLM's Answer")
print(f"Actual answer: {1847 * 293:,}")

Interesting -- the LLM gave us a number, but is it right? It generates text that *looks like* math -- predicting plausible tokens -- but it never actually runs a multiplication algorithm.
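
To make that comparison concrete, we can parse the model's reply into an integer and check it against Python's own arithmetic. The sketch below is only an illustration: `parse_llm_number` is a hypothetical helper, and `sample_reply` is a made-up string standing in for real model output, so no API call is needed.

```python
import re


def parse_llm_number(text):
    """Pull the first integer out of an LLM reply, ignoring thousands separators.

    Hypothetical helper for illustration; returns None if no digits are found.
    """
    match = re.search(r"-?\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))


# Made-up reply standing in for real model output.
sample_reply = "541,271"
expected = 1847 * 293

claimed = parse_llm_number(sample_reply)
print(f"LLM claimed: {claimed:,}")
print(f"Python says: {expected:,}")
print("Match!" if claimed == expected else f"Off by {abs(claimed - expected):,}")
```

A reply can look entirely plausible and still be wrong by a few digits, which is why the check has to come from code rather than from reading the answer.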

In [None]:
# Can the LLM check the current time?
response = generate("What is the exact current time right now?")

llm_response(response, label="LLM on Current Time")

# Can the LLM check today's weather?
response = generate("What is the weather like in New York City right now?")

llm_response(response, label="LLM on Live Weather")

# Can the LLM read a file on our computer?
response = generate("List all the files in my current directory.")

llm_response(response, label="LLM on Local Files")

> **Key Insight:** Did you notice the pattern? In every case -- math, time, files, weather -- the LLM produced text that *looks like* an answer but never actually performed the action. LLMs can only generate text. They cannot perform calculations, access files, check the time, or fetch live data.
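
For contrast, most of the actions the LLM could only describe are a line or two of ordinary Python on the system side. Here is a minimal sketch using only the standard library. The weather lookup is left out because it would need an external API, which this workshop has not introduced.

```python
import os
from datetime import datetime

# Math: the interpreter actually runs the multiplication.
product = 1847 * 293
print(f"1,847 x 293 = {product:,}")

# Time: read the system clock.
now = datetime.now()
print(f"Current time: {now:%H:%M:%S}")

# Files: ask the operating system for the directory listing.
files = sorted(os.listdir("."))
print(f"{len(files)} entries in the current directory")
```

The gap is not that these tasks are hard. The LLM simply has no way to run them on its own.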

## Part 2: Wait -- Something Interesting Happened

We just saw that LLMs cannot *do* things. But look back at those responses -- when asked about files, the LLM described exactly *how* to list files. When asked about time, it explained *how* to check. Let's explore this ability further.

In [None]:
# Can the LLM plan how to find a restaurant's wait time?
response = generate(
    "If I wanted to know the wait time at a restaurant, "
    "what steps would I need to take?"
)

llm_response(response, label="LLM's Plan for Wait Time")

# Can the LLM plan how to find the cheapest restaurant?
response = generate(
    "If I wanted to find the cheapest restaurant nearby, "
    "what information would I need to gather?"
)

llm_response(response, label="LLM's Plan for Cheapest Restaurant")

In [None]:
output_box(
    "What if instead of asking LLMs to DO things, we ask them to "
    "PLAN what needs to be done -- and then a system executes the plan?",
    label="THE KEY QUESTION",
    style="warning"
)

## Part 3: From Text to Data (W2 Refresher)

To explore this idea, we need data to work with. Our scenario today: helping people find restaurants. We have some restaurant info, but it's buried in messy text...

In [None]:
# Unstructured restaurant descriptions (first 3)
RESTAURANT_DESCRIPTIONS = {
    "Olive Garden": (
        "A family-friendly Italian chain known for their unlimited "
        "breadsticks and pasta. Typical dinner runs $15-25 per person. "
        "Vegetarian options available including eggplant parm and "
        "pasta primavera. Located on Main Street, about 5 minutes "
        "from downtown. Open until 10 PM on weekdays."
    ),
    "Sushi Palace": (
        "An upscale Japanese restaurant with an extensive omakase "
        "menu. Expect to spend $35-60 per person for dinner. Limited "
        "vegetarian options, mostly edamame and veggie rolls. Tucked "
        "away in the arts district. Closes at 11 PM most nights."
    ),
    "Burger Barn": (
        "A no-frills American burger joint with the best smash burgers in town. "
        "Meals run $8-15 per person. They have a black bean burger "
        "for vegetarians. Right next to the highway exit, very easy "
        "to find. Kitchen closes at 9 PM sharp."
    ),
}

In [None]:
# Unstructured restaurant descriptions (remaining 3)
RESTAURANT_DESCRIPTIONS.update({
    "Taj Mahal": (
        "Authentic Indian cuisine with a wood-fired tandoor oven. "
        "Dinner is typically $18-30 per person. Excellent vegetarian "
        "selection with paneer dishes, dal, and veggie biryani. "
        "Located in the university quarter. Open until 10:30 PM."
    ),
    "Dragon Wok": (
        "A popular Chinese takeout spot with generous portions. Most "
        "dishes are $10-18 per person. A few vegetarian stir-fry "
        "options available. Situated on the east side near the park. "
        "Open until 9:30 PM, later on weekends."
    ),
    "La Piazza": (
        "Fine dining Italian with handmade pasta and an award-winning "
        "wine list. Plan on $45-80 per person for a full dinner. "
        "Vegetarian tasting menu available on request. Located in "
        "the waterfront district with beautiful views. Closes at "
        "11 PM, reservations recommended."
    ),
})

# Show one example
print("Olive Garden description:")
print(RESTAURANT_DESCRIPTIONS["Olive Garden"])

In [None]:
# Define a Pydantic model to extract structured info
# Same structured extraction pattern from Workshop 2

class RestaurantInfo(BaseModel):
    cuisine: str
    price_per_person_low: int
    price_per_person_high: int
    has_vegetarian: bool
    closing_time: str
    notes: str

In [None]:
# Extract structured data from one description
completion = openai_client.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract restaurant information from the description."},
        {"role": "user", "content": RESTAURANT_DESCRIPTIONS["Olive Garden"]}
    ],
    response_format=RestaurantInfo,
)

info = completion.choices[0].message.parsed

print(f"Cuisine: {info.cuisine}")
print(f"Price range: ${info.price_per_person_low}-${info.price_per_person_high}")
print(f"Vegetarian options: {info.has_vegetarian}")
print(f"Closing time: {info.closing_time}")
print(f"Notes: {info.notes}")

In [None]:
# Now extract from ALL restaurant descriptions
extracted = {}
for name, description in RESTAURANT_DESCRIPTIONS.items():
    completion = openai_client.chat.completions.parse(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Extract restaurant information from the description."},
            {"role": "user", "content": description}
        ],
        response_format=RestaurantInfo,
    )
    extracted[name] = completion.choices[0].message.parsed
    print(f"{name}: {extracted[name].cuisine}, ${extracted[name].price_per_person_low}-${extracted[name].price_per_person_high}")

print(f"\nExtracted structured data from {len(extracted)} restaurants.")

> **Key Insight:** This is the same structured extraction from last week -- now used as a building block. We can turn messy text into clean data that tools can work with.
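
One caveat: "clean" does not always mean "ready to compute with". `closing_time` is still a string, so a tool that wants to compare times has to parse it first. The sketch below assumes the model returns values shaped like "10 PM" or "10:30 PM"; the real format depends on the model, and `parse_closing_time` is a hypothetical helper, not part of the workshop code.

```python
from datetime import datetime, time


def parse_closing_time(text):
    """Parse strings such as '10 PM' or '10:30 PM' into a datetime.time.

    Hypothetical helper; returns None for formats it does not recognize.
    """
    cleaned = text.strip().upper()
    for fmt in ("%I:%M %p", "%I %p"):
        try:
            return datetime.strptime(cleaned, fmt).time()
        except ValueError:
            continue
    return None


closing = parse_closing_time("10 PM")
print(closing)
print(parse_closing_time("10:30 PM"))
print(parse_closing_time("late"))

# With a real time object, comparisons become possible.
print(closing > time(21, 45))
```

Returning `None` for unrecognized input keeps the failure visible instead of guessing at a time.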

## Part 4: Can the LLM Actually Use a Tool?

We know LLMs can describe what needs to be done and we know how to structure data. Now let's test something: can an LLM tell a system which function to call -- and can the system actually execute it?

In [None]:
# Build restaurant data from extraction results + other data sources
RESTAURANTS = {}
for name, info in extracted.items():
    RESTAURANTS[name] = {
        "cuisine": info.cuisine,
        "price_per_person_low": info.price_per_person_low,
        "price_per_person_high": info.price_per_person_high,
        "has_vegetarian": info.has_vegetarian,
        "closing_time": info.closing_time,
        "notes": info.notes,
    }

# Ratings from review aggregator (separate data source)
RATINGS = {
    "Olive Garden": 4.2, "Sushi Palace": 4.7, "Burger Barn": 3.8,
    "Taj Mahal": 4.5, "Dragon Wok": 4.0, "La Piazza": 4.8,
}

# Distance depends on the user's location (not a restaurant property)
DISTANCES = {
    "Olive Garden": 2.5, "Sushi Palace": 5.0, "Burger Barn": 1.0,
    "Taj Mahal": 3.2, "Dragon Wok": 4.5, "La Piazza": 6.0,
}

for name in RESTAURANTS:
    RESTAURANTS[name]["rating"] = RATINGS[name]
    RESTAURANTS[name]["distance_miles"] = DISTANCES[name]

# Simulated real-time wait times (changes every minute in theory)
WAIT_TIMES = {
    "Olive Garden": 25, "Sushi Palace": 45, "Burger Barn": 5,
    "Taj Mahal": 15, "Dragon Wok": 30, "La Piazza": 60,
}

# Show what our tools will work with
print("Restaurant data (extraction + live sources):\n")
for name, data in RESTAURANTS.items():
    print(f"  {name}:")
    print(f"    Cuisine: {data['cuisine']}")
    print(f"    Price: ${data['price_per_person_low']}-${data['price_per_person_high']}/person")
    print(f"    Vegetarian: {'Yes' if data['has_vegetarian'] else 'No'}")
    print(f"    Rating: {data['rating']} stars, {data['distance_miles']} mi away")
    print()

In [None]:
# Simple tool functions

def get_wait_time(restaurant):
    """Get current wait time in minutes at a restaurant."""
    for name, wait in WAIT_TIMES.items():
        if name.lower() == restaurant.lower():
            return wait
    available = ", ".join(WAIT_TIMES.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


def get_rating(restaurant):
    """Get the rating (1-5 stars) for a restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            return info["rating"]
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"

In [None]:
# Test the tools
print(f"Wait time at Olive Garden: {get_wait_time('Olive Garden')} minutes")
print(f"Rating of Sushi Palace: {get_rating('Sushi Palace')} stars")

In [None]:
# Ask the LLM to generate a function call
prompt = (
    "Given these available functions:\n"
    "- get_wait_time(restaurant) - Get current wait time in minutes\n"
    "- get_rating(restaurant) - Get rating (1-5 stars)\n\n"
    "What function call would answer: "
    "'How long is the wait at Olive Garden?'\n\n"
    "Respond with ONLY the function call, nothing else."
)

function_call_text = generate(prompt)

llm_response(function_call_text, label="LLM Generated Function Call")
print(f"Type: {type(function_call_text).__name__} -- still just text!")

In [None]:
# Execute the LLM's function call
tools_dict = {
    "get_wait_time": get_wait_time,
    "get_rating": get_rating,
}

# Strip any markdown code fences the LLM may have added
clean_call = function_call_text.strip()
if clean_call.startswith("```"):
    clean_call = clean_call.split("\n")[1]
    if clean_call.endswith("```"):
        clean_call = clean_call[:-3]

result = eval(clean_call, {"__builtins__": {}}, tools_dict)

print(f"Function call: {clean_call}")
print(f"Result: {result} minutes")

In [None]:
# Complete flow: user question -> LLM plans -> system executes
user_question = "What's the rating of Sushi Palace?"

separator("Step 1: User asks a question")
print(f"User: {user_question}")

separator("Step 2: LLM decides which function to call")
plan_prompt = (
    "Given these available functions:\n"
    "- get_wait_time(restaurant) - Get current wait time in minutes\n"
    "- get_rating(restaurant) - Get rating (1-5 stars)\n\n"
    f"What function call would answer: '{user_question}'\n\n"
    "Respond with ONLY the function call, nothing else."
)
planned_call = generate(plan_prompt)
print(f"LLM says: {planned_call}")

# Execute the planned call and show the result
separator("Step 3: System executes the function")
clean_call = planned_call.strip()
if clean_call.startswith("```"):
    clean_call = clean_call.split("\n")[1]
    if clean_call.endswith("```"):
        clean_call = clean_call[:-3]
result = eval(clean_call, {"__builtins__": {}}, tools_dict)
print(f"Result: {result}")

separator("Step 4: Result returned to user")
output_box(
    f"Question: {user_question}\n"
    f"Answer: {result}",
    label="Final Answer",
    style="success"
)

> **Key Insight:** Did you notice what just happened? The LLM never executed anything itself -- it generated a function call as text, and our system ran it. This pattern -- LLM plans, system executes -- is the core idea behind AI agents.
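
The cells above use `eval` to turn the LLM's text into a real call. As an alternative, here is a sketch that parses the text with Python's `ast` module and only runs calls to registered tools with literal arguments. `run_tool_call`, `demo_tools`, and `demo_ratings` are hypothetical names introduced for this example, and the stricter rules are one possible design choice, not the approach used in the rest of the workshop.

```python
import ast


def run_tool_call(call_text, tools):
    """Parse text like "get_rating('Sushi Palace')" and run the matching tool.

    Illustrative alternative to eval(): only a plain call to a registered tool
    with literal positional arguments is accepted; anything else raises ValueError.
    """
    try:
        tree = ast.parse(call_text.strip(), mode="eval")
    except SyntaxError as exc:
        raise ValueError(f"Not a valid expression: {call_text!r}") from exc

    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("Expected a simple function call")
    if call.func.id not in tools:
        raise ValueError(f"Unknown tool: {call.func.id}")
    if call.keywords:
        raise ValueError("Keyword arguments are not supported in this sketch")

    # literal_eval accepts only constants (strings, numbers, lists, ...),
    # so nested calls or attribute access are rejected here.
    try:
        args = [ast.literal_eval(arg) for arg in call.args]
    except ValueError as exc:
        raise ValueError("Arguments must be literals") from exc
    return tools[call.func.id](*args)


# Stand-in tool and data so the example runs on its own.
demo_ratings = {"sushi palace": 4.7}
demo_tools = {"get_rating": lambda name: demo_ratings[name.lower()]}

print(run_tool_call("get_rating('Sushi Palace')", demo_tools))
```

The trade-off is that this version also rejects nested calls such as `calculate_travel_time(get_distance('Burger Barn'))`, which `eval` would accept.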

In [None]:
output_box(
    "1. User asks a question\n"
    "2. LLM decides which function to call\n"
    "3. System executes the function\n"
    "4. Result comes back to the user\n\n"
    "This is the foundation of how agents work.",
    label="What We Discovered",
    style="success"
)

## Part 5: Restaurant Finder -- Building Our Scenario

Now let's build something real to test our idea. Imagine a restaurant finder that can answer questions like "What's the shortest wait near me?" or "Best-rated Italian place?" First, we need tools the agent can call.

In [None]:
# All 7 tool functions for our Restaurant Finder agent

def get_wait_time(restaurant):
    """Get current wait time in minutes at a restaurant."""
    for name, wait in WAIT_TIMES.items():
        if name.lower() == restaurant.lower():
            return wait
    available = ", ".join(WAIT_TIMES.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


def get_rating(restaurant):
    """Get the rating (1-5 stars) for a restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            return info["rating"]
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


def get_distance(restaurant):
    """Get distance in miles to the restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            return info["distance_miles"]
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


def get_cuisine(restaurant):
    """Get the cuisine type of a restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            return info["cuisine"]
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"


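def get_price_range(restaurant):
    """Get the price range ($, $$, $$$) of a restaurant."""
    for name, info in RESTAURANTS.items():
        if name.lower() == restaurant.lower():
            # RESTAURANTS stores low/high prices, not a "price_range" key,
            # so derive the tier here. Thresholds are illustrative assumptions.
            midpoint = (info["price_per_person_low"] + info["price_per_person_high"]) / 2
            if midpoint < 15:
                return "$"
            return "$$" if midpoint < 35 else "$$$"
    available = ", ".join(RESTAURANTS.keys())
    return f"Restaurant '{restaurant}' not found. Available: {available}"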


def calculate_travel_time(distance):
    """Calculate travel time in minutes. Assumes 2 minutes per mile."""
    return round(distance * 2, 1)


def list_restaurants():
    """Get a list of all nearby restaurants."""
    return list(RESTAURANTS.keys())


print("All 7 tool functions defined:")
print("  get_wait_time(restaurant)      - Wait time in minutes")
print("  get_rating(restaurant)          - Rating (1-5 stars)")
print("  get_distance(restaurant)        - Distance in miles")
print("  get_cuisine(restaurant)         - Cuisine type")
print("  get_price_range(restaurant)     - Price range ($, $$, $$$)")
print("  calculate_travel_time(distance) - Travel time in minutes")
print("  list_restaurants()              - All restaurant names")

In [None]:
# Test individual tools
print(f"Wait time at Olive Garden: {get_wait_time('Olive Garden')} minutes")
print(f"Rating of Sushi Palace: {get_rating('Sushi Palace')} stars")
print(f"Distance to Burger Barn: {get_distance('Burger Barn')} miles")

# Test list_restaurants and calculate_travel_time
print("All restaurants:")
for r in list_restaurants():
    print(f"  - {r}")

print(f"\nTravel time for 3.2 miles: {calculate_travel_time(3.2)} minutes")

In [None]:
# Manual walkthrough: "What's the wait at Olive Garden?"
# This is a simple single-tool query

separator("Manual Walkthrough: Simple Query")
print("Question: What's the wait at Olive Garden?")
print("\nTo answer this, we just need one tool call -- straightforward.")
print(f"\nStep 1: Call get_wait_time('Olive Garden')")

wait = get_wait_time("Olive Garden")
print(f"Result: {wait} minutes")

print(f"\nAnswer: The wait at Olive Garden is {wait} minutes.")
print("\nThat was easy -- one question, one tool call, done.")

In [None]:
# Manual walkthrough: "What's the cheapest restaurant within 10 minutes?"

separator("Manual Walkthrough: Complex Query")
print("Question: What's the cheapest restaurant within 10 minutes?")

separator("Step 1: Get all restaurants")
all_restaurants = list_restaurants()
print(f"Restaurants: {all_restaurants}")

separator("Step 2: Check distance for each")
for r in all_restaurants:
    d = get_distance(r)
    print(f"  {r}: {d} miles")

separator("Step 3: Calculate travel times and filter")
within_10 = []
for r in all_restaurants:
    d = get_distance(r)
    t = calculate_travel_time(d)
    status = "YES" if t <= 10 else "no"
    print(f"  {r}: {t} min -- {status}")
    if t <= 10:
        within_10.append(r)

separator("Step 4: Compare prices of nearby restaurants")
for r in within_10:
    p = get_price_range(r)
    print(f"  {r}: {p}")

print("\nThat took 4 steps, and WE had to decide each one ourselves.")
print("Imagine doing this for every user question...")
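
The walkthrough stops once the prices are printed, so here is a compact sketch that carries the same steps through to a final answer. The distances are copied from `DISTANCES`, and the lowest per-person prices are read straight from the restaurant descriptions. The values the LLM extracts in your run may differ, so treat the numbers as illustrative.

```python
# Distances (miles) copied from DISTANCES; lowest per-person prices read from
# the restaurant descriptions. Extracted values from the LLM may differ.
distances = {
    "Olive Garden": 2.5, "Sushi Palace": 5.0, "Burger Barn": 1.0,
    "Taj Mahal": 3.2, "Dragon Wok": 4.5, "La Piazza": 6.0,
}
lowest_price = {
    "Olive Garden": 15, "Sushi Palace": 35, "Burger Barn": 8,
    "Taj Mahal": 18, "Dragon Wok": 10, "La Piazza": 45,
}

MINUTES_PER_MILE = 2  # same assumption as calculate_travel_time

# Keep restaurants whose travel time is at most 10 minutes.
nearby = [name for name, miles in distances.items()
          if miles * MINUTES_PER_MILE <= 10]

# Pick the one with the lowest starting price.
cheapest = min(nearby, key=lambda name: lowest_price[name])

print(f"Within 10 minutes: {nearby}")
print(f"Cheapest: {cheapest} (from ${lowest_price[cheapest]} per person)")
```

At 2 minutes per mile, "within 10 minutes" means 5 miles or less, so only La Piazza (6.0 miles) is filtered out, and Burger Barn has the lowest starting price among the rest.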

In [None]:
# Can the LLM plan how to answer the complex query?
plan_prompt = (
    "Given these available tools:\n"
    "- get_wait_time(restaurant) - Get wait time in minutes\n"
    "- get_rating(restaurant) - Get rating (1-5 stars)\n"
    "- get_distance(restaurant) - Get distance in miles\n"
    "- get_cuisine(restaurant) - Get cuisine type\n"
    "- get_price_range(restaurant) - Get price range ($, $$, $$$)\n"
    "- calculate_travel_time(distance) - Get travel time in minutes\n"
    "- list_restaurants() - Get list of all restaurants\n\n"
    "How would you answer: \'What is the cheapest restaurant within "
    "10 minutes?\'\n"
    "List the steps and which tool you would use at each step."
)

plan = generate(plan_prompt)
llm_response(plan, label="LLM's Plan")

## Part 6: Building the Agent

Let's see if we can automate those manual steps. Our first attempt will be simple...

### Agent v1: Keyword Matching

In [None]:
def restaurant_agent_v1(query):
    """Agent v1: Uses keyword matching to pick a tool."""
    # Try to find a restaurant name in the query
    restaurant = None
    for name in RESTAURANTS:
        if name.lower() in query.lower():
            restaurant = name
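            break

    # Without a matched name, the tool calls below would fail on None.lower()
    if restaurant is None:
        return "Sorry, I couldn't find a restaurant name in that question."

    if "wait" in query.lower():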
        return f"Wait time at {restaurant}: {get_wait_time(restaurant)} minutes"
    elif "rating" in query.lower() or "rated" in query.lower():
        return f"Rating of {restaurant}: {get_rating(restaurant)} stars"
    elif "distance" in query.lower() or "far" in query.lower():
        return f"Distance to {restaurant}: {get_distance(restaurant)} miles"
    elif "cuisine" in query.lower() or "type" in query.lower():
        return f"Cuisine at {restaurant}: {get_cuisine(restaurant)}"
    elif "price" in query.lower() or "cost" in query.lower():
        return f"Price range at {restaurant}: {get_price_range(restaurant)}"
    else:
        return "Sorry, I don't understand that question."

In [None]:
# v1 works for simple keyword matches
result = restaurant_agent_v1("What's the wait at Olive Garden?")
output_box(result, label="v1 Success", style="success")

In [None]:
# v1 breaks on natural language variations
result1 = restaurant_agent_v1(
    "How long will I have to stand in line at Olive Garden?"
)
output_box(result1, label="v1 Failure: No 'wait' keyword", style="error")

result2 = restaurant_agent_v1("Tell me about Sushi Palace")
output_box(result2, label="v1 Failure: No matching keyword", style="error")

> **Key Insight:** See the problem? The agent only understood exact keywords -- 'wait', 'rating', 'distance'. But users don't talk that way. We need something that *understands* the question, not just matches keywords.

### Agent v2: LLM Understanding

In [None]:
def understand_query(query):
    """Ask the LLM to classify what type of information the user wants."""
    prompt = (
        "Classify this restaurant query into exactly one category:\n"
        "- wait_time\n"
        "- rating\n"
        "- distance\n"
        "- cuisine\n"
        "- price_range\n"
        "- travel_time\n\n"
        f"Query: {query}\n\n"
        "Respond with ONLY the category name, nothing else."
    )
    return generate(prompt)


def extract_restaurant(query):
    """Ask the LLM to extract the restaurant name from the query."""
    prompt = (
        f"Extract the restaurant name from this query:\n"
        f"Query: {query}\n\n"
        f"Available restaurants: {list_restaurants()}\n\n"
        "Respond with ONLY the restaurant name, nothing else."
    )
    return generate(prompt)

In [None]:
def restaurant_agent_v2(query):
    """Agent v2: Uses LLM to understand the query and extract info."""
    category = understand_query(query)
    restaurant = extract_restaurant(query)

    print(f"LLM understood: category={category}, restaurant={restaurant}")

    tools = {
        "wait_time": lambda r: f"Wait: {get_wait_time(r)} minutes",
        "rating": lambda r: f"Rating: {get_rating(r)} stars",
        "distance": lambda r: f"Distance: {get_distance(r)} miles",
        "cuisine": lambda r: f"Cuisine: {get_cuisine(r)}",
        "price_range": lambda r: f"Price range: {get_price_range(r)}",
        "travel_time": lambda r: (
            f"Travel time: {calculate_travel_time(get_distance(r))} min"
        ),
    }

    if category in tools:
        return tools[category](restaurant)
    return f"Could not process query (category: {category})"

In [None]:
# v2 handles natural language -- previously failing queries now work!
result = restaurant_agent_v2(
    "How long will I have to stand in line at Olive Garden?"
)
output_box(result, label="v2 Success: Natural Language", style="success")

In [None]:
# v2 fails on multi-step queries that need multiple tools
result = restaurant_agent_v2(
    "What's the cheapest restaurant within 10 minutes?"
)
output_box(result, label="v2 Failure: Multi-step Query", style="error")

> **Key Insight:** LLM understanding solved the language problem, but some questions need multiple steps. V2 can only make one tool call -- it can't chain information from one call into the next.

### Agent v3: LLM Tool Selection

In [None]:
def restaurant_agent_v3(query):
    """Agent v3: LLM picks the tool AND generates the function call."""
    prompt = (
        "Given these available functions:\n"
        "- get_wait_time(restaurant) - Get wait time in minutes\n"
        "- get_rating(restaurant) - Get rating (1-5 stars)\n"
        "- get_distance(restaurant) - Get distance in miles\n"
        "- get_cuisine(restaurant) - Get cuisine type\n"
        "- get_price_range(restaurant) - Get price ($, $$, $$$)\n"
        "- calculate_travel_time(distance) - Travel time (2 min/mile)\n"
        "- list_restaurants() - Get all restaurant names\n\n"
        f"What function call answers: '{query}'\n\n"
        "Respond with ONLY the function call, nothing else."
    )
    response = generate(prompt).strip()
    if response.startswith("```"):
        response = response.split("\n")[1]
        if response.endswith("```"):
            response = response[:-3]
    tools_dict = {
        "get_wait_time": get_wait_time, "get_rating": get_rating,
        "get_distance": get_distance, "get_cuisine": get_cuisine,
        "get_price_range": get_price_range,
        "calculate_travel_time": calculate_travel_time,
        "list_restaurants": list_restaurants,
    }
    print(f"LLM generated: {response}")
    result = eval(response, {"__builtins__": {}}, tools_dict)
    return result

In [None]:
# v3 works for any single-step query -- the LLM picks the right tool
result1 = restaurant_agent_v3("How long is the wait at Taj Mahal?")
output_box(f"Wait query: {result1}", label="v3 Success", style="success")

result2 = restaurant_agent_v3("What kind of food does Dragon Wok serve?")
output_box(f"Cuisine query: {result2}", label="v3 Success", style="success")

V3 lets the LLM choose the tool -- powerful! But it still only makes ONE call. For our complex query ("cheapest within 10 min?"), we need the agent to call one tool, look at the result, decide what to do next, call another tool, and keep going until it has enough information.

> **Key Insight:** Single-call agents can't solve multi-step problems. We need a loop: call a tool, observe the result, decide the next step, repeat.

### Agent v4: The Loop

What if the agent could keep going? Instead of making ONE call, let it loop: think about what to do next, call a tool, look at the result, decide the next step. Keep going until the question is fully answered.

In [None]:
# Tool descriptions for the agent prompt
TOOL_DESCRIPTIONS = (
    "Available tools:\n"
    "- get_wait_time(restaurant) - Wait time in minutes\n"
    "- get_rating(restaurant) - Rating (1-5 stars)\n"
    "- get_distance(restaurant) - Distance in miles\n"
    "- get_cuisine(restaurant) - Cuisine type\n"
    "- get_price_range(restaurant) - Price ($, $$, $$$)\n"
    "- calculate_travel_time(distance) - Travel time\n"
    "- list_restaurants() - All restaurant names"
)

def restaurant_agent_v4(query, max_steps=10):
    """Agent v4: ReAct loop -- reason, act, observe, repeat."""
    tools = {
        "get_wait_time": get_wait_time, "get_rating": get_rating,
        "get_distance": get_distance, "get_cuisine": get_cuisine,
        "get_price_range": get_price_range,
        "calculate_travel_time": calculate_travel_time,
        "list_restaurants": list_restaurants,
    }
    history = []

    for step in range(max_steps):
        prompt = (f"Query: {query}\n\nPrevious steps: {history}\n\n"
                  f"{TOOL_DESCRIPTIONS}\n\n"
                  "If you have enough info to answer, respond: DONE: [answer]\n"
                  "Otherwise respond with ONLY the next function call.")

        # Show the full prompt at steps 3-5 to reveal growing context
        if 2 <= step <= 4:
            separator(f"Full Prompt Sent to LLM (Step {step + 1})")
            print(prompt)
            separator()
            print("Notice: The prompt includes ALL previous results.")
            print("The agent re-reads everything each step.\n")
        elif step > 4:
            print(f"Step {step + 1}: Prompt includes {len(history)} previous steps (not shown)")

        response = generate(prompt).strip()
        if response.upper().startswith("DONE"):
            answer = response[5:].strip().lstrip(":").strip()
            output_box(answer, label="Agent Answer", style="success")
            return answer

        # Clean markdown code fences if present
        if response.startswith("```"):
            lines = response.split("\n")
            response = lines[1] if len(lines) > 1 else response
            if response.endswith("```"): response = response[:-3]
            response = response.strip()

        try:
            result = eval(response, {"__builtins__": {}}, tools)
            print(f"  Step {step + 1}: {response} -> {result}")
            history.append({"step": step + 1, "call": response, "result": str(result)})
        except Exception as e:
            print(f"  Step {step + 1}: {response} -> Error: {e}")
            history.append({"step": step + 1, "call": response, "result": f"Error: {e}"})

    output_box("Agent reached maximum steps without finishing.", label="Warning", style="warning")

In [None]:
# v4 test: simple query (should complete in 1-2 steps)
separator("v4 Test: Simple Query")
restaurant_agent_v4("What's the wait at Olive Garden?")

# v4 test: complex query (requires multiple tools)
# Watch the growing context at step 3+!
separator("v4 Test: Complex Query")
restaurant_agent_v4("What's the cheapest restaurant within 10 minutes of me?")

# v4 test: very complex query (chains many tool calls)
separator("v4 Test: Very Complex Query")
restaurant_agent_v4(
    "What's the best-rated Italian restaurant factoring in wait time and travel time?"
)

> **Key Insight:** Did you notice how the prompt grows with each step? The agent re-reads its ENTIRE history every time it asks the LLM what to do next. Previous tool results become part of the next prompt -- that's how the agent "knows" what happened before.
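
To see that growth in numbers, the sketch below rebuilds the prompt after each step and measures its length. `build_prompt` mirrors the start of the v4 prompt (the tool list and instructions are omitted), and `fake_steps` holds made-up tool results, so nothing here calls the LLM.

```python
def build_prompt(query, history):
    """Assemble a prompt the way restaurant_agent_v4 does (tool list omitted)."""
    return f"Query: {query}\n\nPrevious steps: {history}\n\n"


query = "What's the cheapest restaurant within 10 minutes of me?"
history = []

# Made-up tool results, standing in for real calls.
fake_steps = [
    ("list_restaurants()", "['Olive Garden', 'Burger Barn']"),
    ("get_distance('Olive Garden')", "2.5"),
    ("get_distance('Burger Barn')", "1.0"),
]

lengths = []
for number, (call, result) in enumerate(fake_steps, start=1):
    lengths.append(len(build_prompt(query, history)))
    history.append({"step": number, "call": call, "result": result})
lengths.append(len(build_prompt(query, history)))

for step, size in enumerate(lengths):
    print(f"After {step} tool calls: prompt is {size} characters")
```

Every extra step makes the next prompt longer, so a cap like `max_steps` also bounds how large the prompt can get.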

## Part 7: The ReAct Pattern

### The Agent Evolution

**Agent v1 (Keyword Matching):** Hard-coded rules. Breaks on natural language variations.

**Agent v2 (LLM Understanding):** LLM classifies the question. Single tool call only.

**Agent v3 (LLM Tool Selection):** LLM picks the tool AND generates the call. Still single call.

**Agent v4 (The Loop):** LLM plans, acts, observes, repeats. Handles multi-step questions, up to its max_steps limit.

### The Pattern Has a Name

Think about what our v4 agent does each step:
1. **Reason** about what information it needs
2. **Act** by calling a tool
3. **Observe** the result
4. **Repeat** until the question is answered

This loop we built -- Reason, Act, Observe, Repeat -- has a name in AI research: **ReAct** (Reasoning + Acting).
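
Stripped of printing and error handling, the loop fits in a few lines. The sketch below swaps the real LLM for a scripted stand-in so it runs without an API key. `make_scripted_llm` and `react_loop` are hypothetical names for this example, and the canned replies are made up rather than real model output.

```python
def make_scripted_llm(replies):
    """Return a stand-in 'LLM' that plays back canned replies, one per call."""
    remaining = iter(replies)
    return lambda prompt: next(remaining)


def react_loop(query, llm, tools, max_steps=5):
    """Reason, act, observe, repeat -- until the LLM replies with DONE."""
    history = []
    for _ in range(max_steps):
        # Reason: the LLM sees the query plus every observation so far.
        reply = llm(f"Query: {query}\nPrevious steps: {history}")
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip(), history
        # Act: run the requested call (same eval pattern as the notebook).
        result = eval(reply, {"__builtins__": {}}, tools)
        # Observe: record the result so the next prompt includes it.
        history.append({"call": reply, "result": result})
    return None, history


# Toy tools and a fixed script, so no API key is needed.
tools = {
    "get_distance": lambda restaurant: {"Burger Barn": 1.0}[restaurant],
    "calculate_travel_time": lambda distance: round(distance * 2, 1),
}
llm = make_scripted_llm([
    "get_distance('Burger Barn')",
    "calculate_travel_time(1.0)",
    "DONE: Burger Barn is about 2 minutes away.",
])

answer, history = react_loop("How long to get to Burger Barn?", llm, tools)
print(answer)
for entry in history:
    print(entry)
```

Replacing the scripted stand-in with a call to `generate` gives back the core of v4. The loop itself does not change.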

Anthropic defines an agent as: *"A system that uses an LLM to decide what actions to take and in what order."* That's exactly what our v4 agent does -- the LLM decides the next action, the system executes it, and the result informs the next decision.

In [None]:
output_box(
    "An agent is just a loop: the LLM reasons about what to do, "
    "the system executes the action, and the result feeds back into "
    "the next reasoning step. Everything we built today -- from v1 "
    "to v4 -- was the journey of discovering why this loop is necessary.",
    label="What We Built",
    style="warning"
)

## Part 8: What's Next

Today we built agents with toy tools -- get_wait_time, get_rating, get_distance. These are fun, but they work with hardcoded data in Python dictionaries.

What about real data? Real databases? Real APIs?

Next workshop, we'll design tools that work with actual data -- and discover that tool DESIGN is where the real engineering challenge begins.

### Coming Up: Workshop 4

**Workshop 4: Tool Design for Agents** -- How to turn real-world data sources into tools an agent can use effectively.