# The Agent Loop

This is the most important concept in the entire course. Everything else (tools, handoffs, guardrails) builds on this one idea.

An agent is a **while loop with an LLM as the control flow decision-maker**.

![](2026-02-14-14-06-49.png)

The loop: user message → LLM decides → either **respond** (exit) or **call a tool** → execute tool → feed result back → LLM decides again → repeat.

The key insight: **the model decides when to stop**. That's what makes it an agent, not a chain.
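The loop described above can be sketched without any API calls. In this minimal sketch, `scripted_model` is a hypothetical stand-in for the LLM (it returns canned decisions), and `run_loop` is an illustrative name rather than part of any SDK. Only the control flow is the point.

```python
# A toy agent loop. `scripted_model` is a hypothetical stand-in for an LLM:
# it looks at the history and returns either a tool call or a final answer.

def get_temperature(city: str) -> str:
    """Fake tool with hard-coded data."""
    return {"lisbon": "22°C"}.get(city.lower(), "Unknown city")

TOOLS = {"get_temperature": get_temperature}

def scripted_model(history: list[dict]) -> dict:
    """Pretend LLM: call the tool once, then answer using its result."""
    tool_results = [m for m in history if m["role"] == "tool"]
    if not tool_results:
        return {"type": "tool_call", "name": "get_temperature", "args": {"city": "Lisbon"}}
    return {"type": "respond", "content": f"It is {tool_results[-1]['content']} in Lisbon."}

def run_loop(user_message: str, max_turns: int = 5) -> tuple[str, int]:
    history = [{"role": "user", "content": user_message}]
    for turn in range(1, max_turns + 1):
        decision = scripted_model(history)       # the "LLM decides" step
        if decision["type"] == "respond":        # exit: the model chose to stop
            return decision["content"], turn
        tool = TOOLS[decision["name"]]           # continue: run the tool...
        result = tool(**decision["args"])
        history.append({"role": "tool", "content": result})  # ...and feed it back
    raise RuntimeError("max_turns reached without a final answer")

answer, turns = run_loop("What's the temperature in Lisbon?")
print(answer, f"({turns} turns)")
```

The exit condition lives inside the model's decision, not in the loop itself. The loop only enforces an upper bound.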

In [1]:
# Setup: run once
# pip install openai-agents

import os
# os.environ["OPENAI_API_KEY"] = "sk-..."

## Demo 1: The Simplest Agent - One Iteration

No tools. The loop runs exactly **once**: message in → LLM responds → done.

This is the degenerate case. The agent has nothing to do except think and respond.

In [3]:
import nest_asyncio
nest_asyncio.apply()
from agents import Agent, Runner

agent = Agent(
    name="Simple Agent",
    instructions="You are a helpful assistant. Be concise.",
)

result = Runner.run_sync(agent, "What is the capital of Portugal?")
print(result.final_output)

The capital of Portugal is Lisbon.


That's it: a handful of lines. The `Runner` executed the agent loop, but with no tools available, the LLM had no reason to loop, so it just answered.

Let's peek at what actually happened inside the loop:

In [4]:
# Inspect the raw model responses the Runner collected (one per LLM call)
for item in result.raw_responses:
    print(type(item).__name__, "→", getattr(item, 'output', None))

ModelResponse → [ResponseOutputMessage(id='msg_0dd0b550dbc9d6f300699081b23a388197897d3492f92555b1', content=[ResponseOutputText(annotations=[], text='The capital of Portugal is Lisbon.', type='output_text', logprobs=[])], role='assistant', status='completed', type='message')]


One response. One iteration. The model decided immediately: "I can answer this without any tools."

---

## Demo 2: Add One Tool - Watch the Loop Iterate

Now we give the agent a tool. The loop should run **twice**:
1. LLM decides to call the tool
2. Tool executes, result fed back → LLM decides to respond

We'll add a print inside the tool so you can *see* the loop in action.

In [5]:
from agents import Agent, Runner, function_tool

@function_tool
def get_temperature(city: str) -> str:
    """Get the current temperature for a city."""
    print(f"  🔧 [Tool called] get_temperature('{city}')")
    # Fake data: in production this would hit a weather API
    temperatures = {"lisbon": "22°C", "tokyo": "18°C", "new york": "5°C"}
    return temperatures.get(city.lower(), "Unknown city")

agent = Agent(
    name="Weather Agent",
    instructions="You help users check the weather. Be concise.",
    tools=[get_temperature],
)

print("Sending message to agent...\n")
result = Runner.run_sync(agent, "What's the temperature in Lisbon?")
print(f"\n💬 Final output: {result.final_output}")

Sending message to agent...

  🔧 [Tool called] get_temperature('Lisbon')

💬 Final output: The current temperature in Lisbon is 22°C.


Notice the sequence:
1. We sent the message
2. The LLM **decided** to call `get_temperature` (the model chose this, not us)
3. The tool ran and returned a result
4. The LLM got the result back, **decided** it had enough information, and responded

Two iterations. The model decided both *what to do* and *when to stop*.
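For the model to choose `get_temperature`, it first has to be told the tool exists: its name, a description, and the parameters it accepts. The sketch below approximates that step with the standard library's `inspect` module. `describe_tool` is a hypothetical helper written for illustration. It is not the SDK's implementation, and it assumes every parameter is a required string.

```python
import inspect

def describe_tool(fn) -> dict:
    """Build a JSON-schema-style description from a function's signature.

    Hypothetical helper: assumes all parameters are required strings.
    """
    params = inspect.signature(fn).parameters
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": {name: {"type": "string"} for name in params},
            "required": list(params),
        },
    }

def get_temperature(city: str) -> str:
    """Get the current temperature for a city."""
    return "22°C"

schema = describe_tool(get_temperature)
print(schema["name"], schema["parameters"]["required"])
```

Seen this way, the docstring and parameter names are the raw material for the tool description, so vague ones give the model less to go on.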

---

## Demo 3: Multi-Step Reasoning - The Loop Plans

Give the agent a task that requires **two tools called in sequence**. Now we'll see three iterations, with the model *planning* its approach.

In [6]:
from agents import Agent, Runner, function_tool

@function_tool
def get_temperature_celsius(city: str) -> str:
    """Get the current temperature in Celsius for a city."""
    print(f"  🔧 [Tool called] get_temperature_celsius('{city}')")
    temperatures = {"lisbon": "22", "tokyo": "18", "new york": "5"}
    return temperatures.get(city.lower(), "Unknown city")

@function_tool
def celsius_to_fahrenheit(celsius: str) -> str:
    """Convert a Celsius temperature to Fahrenheit."""
    print(f"  🔧 [Tool called] celsius_to_fahrenheit('{celsius}')")
    f = round(float(celsius) * 9/5 + 32, 1)
    return str(f)

agent = Agent(
    name="Weather Agent v2",
    instructions="You help users check the weather. Be concise.",
    tools=[get_temperature_celsius, celsius_to_fahrenheit],
)

print("Sending message to agent...\n")
result = Runner.run_sync(agent, "What's the temperature in Lisbon in Fahrenheit?")
print(f"\n💬 Final output: {result.final_output}")

Sending message to agent...

  🔧 [Tool called] get_temperature_celsius('Lisbon')
  🔧 [Tool called] celsius_to_fahrenheit('22')

💬 Final output: The temperature in Lisbon is 71.6°F.


Three iterations. The model:
1. Called `get_temperature_celsius` first (it needs the raw number)
2. Took that result, called `celsius_to_fahrenheit` (sequential reasoning)
3. Got both results, synthesized a final answer

Nobody told it the order. **The LLM figured out the dependency chain on its own.** That's the agent loop doing its job.
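One fragile spot in this chain: if the first tool returns `"Unknown city"` and the model passes that string along, `float()` raises a `ValueError`. Below is a minimal sketch of a more defensive conversion, written as a plain function (no decorator) so it runs without the SDK. The name `celsius_to_fahrenheit_safe` and the error message wording are assumptions for illustration.

```python
def celsius_to_fahrenheit_safe(celsius: str) -> str:
    """Convert a Celsius string to Fahrenheit, or return a readable error."""
    try:
        value = float(celsius)
    except ValueError:
        # Return text the model can read and react to instead of raising
        return f"Cannot convert {celsius!r}: expected a number"
    return str(round(value * 9 / 5 + 32, 1))

print(celsius_to_fahrenheit_safe("22"))            # 71.6
print(celsius_to_fahrenheit_safe("Unknown city"))  # error message, no exception
```

Returning an error string keeps the failure inside the conversation, where the model can decide what to do about it.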

---

## Demo 4: Inspecting the Conversation Between Iterations

What does the "conversation" actually look like between loop iterations? Let's inspect `result.to_input_list()`: this is the raw message history the Runner built.

In [7]:
import json

# Show the full message history the loop produced
for i, item in enumerate(result.to_input_list()):
    role = item.get("role", "unknown")
    
    if role == "user":
        print(f"[{i}] 👤 USER: {item['content']}")
    elif role == "assistant":
        content = item.get("content")
        tool_calls = item.get("tool_calls", [])
        if content:
            print(f"[{i}] 🤖 ASSISTANT: {content}")
        if tool_calls:
            for tc in tool_calls:
                print(f"[{i}] 🤖 ASSISTANT → tool_call: {tc}")
    elif role == "tool":
        print(f"[{i}] 🔧 TOOL RESULT: {item.get('content', item)}")
    else:
        print(f"[{i}] {role}: {json.dumps(item, indent=2, default=str)[:200]}")

[0] 👤 USER: What's the temperature in Lisbon in Fahrenheit?
[1] unknown: {
  "arguments": "{\"city\":\"Lisbon\"}",
  "call_id": "call_fYYORL97tJYXOt0RC0UWWf05",
  "name": "get_temperature_celsius",
  "type": "function_call",
  "id": "fc_05a7ac8d8067a05600699081f310fc8190af
[2] unknown: {
  "call_id": "call_fYYORL97tJYXOt0RC0UWWf05",
  "output": "22",
  "type": "function_call_output"
}
[3] unknown: {
  "arguments": "{\"celsius\":\"22\"}",
  "call_id": "call_zjUhXt1AfPOabbSuij5UcJ53",
  "name": "celsius_to_fahrenheit",
  "type": "function_call",
  "id": "fc_05a7ac8d8067a05600699081f494e08190ad024
[4] unknown: {
  "call_id": "call_zjUhXt1AfPOabbSuij5UcJ53",
  "output": "71.6",
  "type": "function_call_output"
}
[5] 🤖 ASSISTANT: [{'annotations': [], 'text': 'The temperature in Lisbon is 71.6°F.', 'type': 'output_text', 'logprobs': []}]


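This is what's actually happening inside the `Runner`. Each iteration appends to this list: tool calls, tool results, and finally the assistant's response. The tool-call and tool-result items print as `unknown` because they carry a `type` field (`function_call`, `function_call_output`) rather than a `role`. The LLM sees the *entire history* each time it's called, which is how it knows what it already did and what's left to do.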
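The `unknown` entries in the output above suggest a formatter that falls back to the `type` field when there is no `role`. A minimal sketch follows, run against hand-written sample items shaped like that output (IDs omitted). `describe_item` is a hypothetical helper, and the sample list is illustrative data rather than a captured run.

```python
def describe_item(item: dict) -> str:
    """Label one history item by `role` if present, otherwise by `type`."""
    role = item.get("role")
    if role == "user":
        return f"USER: {item['content']}"
    if role == "assistant":
        # In the output above, assistant content is a list of text parts
        content = item["content"]
        if isinstance(content, list):
            content = " ".join(part.get("text", "") for part in content)
        return f"ASSISTANT: {content}"
    kind = item.get("type")
    if kind == "function_call":
        return f"TOOL CALL: {item['name']}({item['arguments']})"
    if kind == "function_call_output":
        return f"TOOL RESULT: {item['output']}"
    return f"OTHER: {item}"

# Sample items shaped like the Demo 4 output (IDs omitted)
history = [
    {"role": "user", "content": "What's the temperature in Lisbon in Fahrenheit?"},
    {"type": "function_call", "name": "get_temperature_celsius", "arguments": '{"city":"Lisbon"}'},
    {"type": "function_call_output", "output": "22"},
    {"type": "function_call", "name": "celsius_to_fahrenheit", "arguments": '{"celsius":"22"}'},
    {"type": "function_call_output", "output": "71.6"},
    {"role": "assistant", "content": [{"type": "output_text", "text": "The temperature in Lisbon is 71.6°F."}]},
]

lines = [describe_item(item) for item in history]
for i, line in enumerate(lines):
    print(f"[{i}] {line}")
```

Read top to bottom, the labeled history shows the three iterations from Demo 3: call, result, call, result, answer.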

---

## Demo 5: When the Loop Goes Wrong

The agent loop is a while loop. While loops can spin. Let's see what happens when we give the agent a task it **can't solve** with its available tools, and protect ourselves with `max_turns`.

In [10]:
from agents import Agent, Runner, function_tool

@function_tool
def get_temperature_celsius(city: str) -> str:
    """Get the current temperature in Celsius for a city."""
    print(f"  🔧 [Tool called] get_temperature_celsius('{city}')")
    # This tool only knows about 3 cities
    temperatures = {"lisbon": "22", "tokyo": "18", "new york": "5"}
    return temperatures.get(city.lower(), "Unknown city")

agent = Agent(
    name="Limited Agent",
    instructions="You help users check the weather. Always use your tools to look up temperatures.",
    tools=[get_temperature_celsius],
)

print("Asking about a city the tool doesn't know...\n")
result = Runner.run_sync(
    agent,
    "What's the temperature in Reykjavik?",
    max_turns=5  # Safety net: stop after 5 iterations
)
print(f"\n💬 Final output: {result.final_output}")

Asking about a city the tool doesn't know...

  🔧 [Tool called] get_temperature_celsius('Reykjavik')

💬 Final output: I couldn't retrieve the temperature for Reykjavik. The system seems to have trouble recognizing the city. You might want to try checking a weather website or app.


In this run the model handled it gracefully: one tool call, then an honest "couldn't retrieve" answer. It could just as easily have called the tool repeatedly, hoping for a different result. The behavior depends on the instructions and the model's judgment.

**This is why `max_turns` exists.** It's a circuit breaker for the agent loop. In production, always set it explicitly.
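To see the circuit breaker trip, here is a sketch with a deliberately stubborn fake model that asks for the tool on every turn. Everything in it is hypothetical and written for illustration: `stubborn_model`, `run_with_limit`, and the `TurnLimitReached` exception are not SDK names.

```python
class TurnLimitReached(Exception):
    """Hypothetical exception for this sketch, raised when the loop is cut off."""

def stubborn_model(history: list[str]) -> str:
    """Pretend LLM that never gives up: it always asks for the tool again."""
    return "call_tool"

def run_with_limit(max_turns: int) -> int:
    history: list[str] = []
    turns = 0
    while True:
        if turns >= max_turns:
            raise TurnLimitReached(f"stopped after {turns} turns")
        turns += 1
        decision = stubborn_model(history)
        if decision != "call_tool":
            return turns  # never reached with this model
        history.append("Unknown city")  # same useless tool result every time

try:
    run_with_limit(max_turns=5)
    outcome = "finished"
except TurnLimitReached as exc:
    outcome = str(exc)

print(outcome)
```

In the OpenAI Agents SDK the analogous error is `MaxTurnsExceeded`, which can be caught around the `Runner` call in the same way.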

---

## Demo 6: The Loop Without the SDK

To really understand what `Runner.run_sync` is doing, here's the same logic written as a raw while loop against the Chat Completions API. **This is not how you'd write production code**, but it demystifies the abstraction.

In [11]:
from openai import OpenAI
import json

client = OpenAI()

# The same tool, defined manually
tools = [{
    "type": "function",
    "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }
}]

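def execute_tool(name, args):
    if name == "get_temperature":
        temps = {"lisbon": "22°C", "tokyo": "18°C"}
        return temps.get(args["city"].lower(), "Unknown")
    # Return a string for unrecognized tools so the tool message content is never None
    return f"Unknown tool: {name}"

# The agent loop, written explicitly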
messages = [
    {"role": "system", "content": "You help with weather. Be concise."},
    {"role": "user", "content": "What's the temperature in Lisbon?"}
]

turn = 0
while turn < 10:  # max_turns safety
    turn += 1
    print(f"--- Loop iteration {turn} ---")
    
    response = client.chat.completions.create(
        model="gpt-5-mini",
        messages=messages,
        tools=tools,
    )
    
    message = response.choices[0].message
    messages.append(message)
    
    # EXIT CONDITION: no tool calls → the model decided to respond
    if not message.tool_calls:
        print(f"Model responded: {message.content}")
        break
    
    # CONTINUE: execute tool calls, feed results back
    for tc in message.tool_calls:
        args = json.loads(tc.function.arguments)
        print(f"Model called: {tc.function.name}({args})")
        result = execute_tool(tc.function.name, args)
        print(f"Tool returned: {result}")
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": result
        })

--- Loop iteration 1 ---
Model called: get_temperature({'city': 'Lisbon'})
Tool returned: 22°C
--- Loop iteration 2 ---
Model responded: The temperature in Lisbon is 22°C.


**That's the entire agent loop.** A while loop, an LLM call, and a branch: did the model call a tool or not? The OpenAI Agents SDK wraps this in `Runner` and adds error handling, tracing, guardrails, and handoff support, but the core is exactly this.
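As a small taste of the error handling a wrapper has to add, the tool dispatch step can be sketched as a registry lookup that always returns a string, even when the model names a tool that doesn't exist or sends malformed arguments. `TOOL_REGISTRY` and this version of `execute_tool` (which takes the raw JSON string) are illustrative assumptions, not how the SDK does it.

```python
import json

def get_temperature(city: str) -> str:
    return {"lisbon": "22°C", "tokyo": "18°C"}.get(city.lower(), "Unknown")

# Map tool names to callables; adding a tool means adding one entry
TOOL_REGISTRY = {"get_temperature": get_temperature}

def execute_tool(name: str, raw_arguments: str) -> str:
    """Run a tool by name, always returning a string for the tool message."""
    tool = TOOL_REGISTRY.get(name)
    if tool is None:
        return f"Error: unknown tool {name!r}"
    try:
        args = json.loads(raw_arguments)
        return str(tool(**args))
    except (json.JSONDecodeError, TypeError) as exc:
        # Bad JSON or wrong argument names: report it instead of crashing the loop
        return f"Error: bad arguments for {name}: {exc}"

print(execute_tool("get_temperature", '{"city": "Tokyo"}'))
print(execute_tool("get_wind", "{}"))
print(execute_tool("get_temperature", "not json"))
```

Because errors come back as ordinary tool results, the loop keeps running and the model gets a chance to correct itself on the next turn.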

---

## Takeaways

- An agent is a **loop**, not a single LLM call
- The **model** decides what to do next (call a tool or respond)
- The **model** decides when to stop (no hardcoded exit condition)
- `Runner.run_sync` runs the loop for you, with `max_turns` as a safety net
- The conversation history grows with each iteration, so the model sees everything it already did

Next up: we'll go deeper on **tools**, the things that give the agent actual capabilities.