# Introduction: Building AI Agents from Scratch

AI agents are behind today's most powerful applications, from chat assistants to automated reasoning systems. Instead of relying on black-box frameworks, this tutorial shows you how to **build agents step by step**, so you understand *how* they think, reason, and act.

You'll use Python, Pydantic, and the Outlines library to give a model structure, logic, and tool-use abilities. By coding agents directly, you'll see how large language models interpret instructions, perform calculations, and stay consistent through structured generation.

## Why It Matters

Learning to build agents from scratch helps you:
- **Control your outputs.** Structured schemas ensure that models return predictable, machine-readable data.
- **Add real reasoning.** Multi-step loops let the model "think" before answering.
- **Integrate tools for accuracy.** Combine the creativity of a language model with the precision of code utilities like calculators or converters.  
- **Debug and trust behavior.** Watching an agent reason in steps makes its logic transparent.  

## What You'll Build

1. **Structured Generation:** A reliable dad joke generator that outputs valid JSON.  
2. **Math Dad Joke Agent:** A reasoning-enabled version that checks its math with a calculator tool.
3. **Extension Exercise:** Your own agent: a unit converter joke generator, or even a STEM joke bot.

By the end, you'll know how to design, guide, and test your own reasoning-ready AI agents, all built from fundamentals.

## Part 1: Understanding Structured Generation

One of the biggest challenges when working with large language models (LLMs) is that they typically produce unstructured text: free-form answers that might not follow a consistent or machine-readable format. Structured generation solves this problem by defining a schema (a data structure the model must follow) and constraining the output to match it. This is particularly useful for applications that require reliability and consistency, such as agents, tools, or pipelines that process AI-generated data automatically.
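
To make the problem concrete, here is a minimal sketch of the hand-written checking that unconstrained output forces on you. It uses only the standard library. The function name `parse_joke` and both sample replies are hypothetical, chosen for illustration.

```python
import json

REQUIRED_FIELDS = ("setup", "punchline")

def parse_joke(raw: str) -> dict:
    """Parse a model reply by hand and check it has the expected string fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Reply is JSON but not an object")
    for field in REQUIRED_FIELDS:
        if not isinstance(data.get(field), str):
            raise ValueError(f"Missing or non-string field: {field}")
    return data

# Hypothetical replies: a free-form one fails the check, a structured one passes.
free_form = "Sure! Here's one: why did the chicken cross the road?"
structured = (
    '{"setup": "Why did the chicken cross the road?", '
    '"punchline": "To get to the other side."}'
)

for reply in (free_form, structured):
    try:
        print(parse_joke(reply)["setup"])
    except ValueError as err:
        print(f"Rejected: {err}")
```

A schema-driven approach replaces this kind of per-field checking with a single declared structure.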

In this demo, structured generation is handled through the Pydantic library and the Outlines package, which enforces that the model's output adheres to a specified structure. The demo walks you through an LLM joke generator. Let's start by highlighting some key components of the code.

### Connecting to Model

```python
openai_client = OpenAI(
    api_key="ollama",
    base_url="http://localhost:11434/v1"
)
```

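This block connects to a local Ollama instance through its OpenAI-compatible API. We're using the standard OpenAI SDK but pointing it at a local endpoint; the `api_key` value is a dummy key for local Ollama. The model itself (llama3.2 in these snippets; the full listings below use qwen3:latest) is selected in the next step.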

### Wrap the Model in Outlines 

```python
model = from_openai(openai_client, model_name="llama3.2")
```

The from_openai function from Outlines bridges the OpenAI client and structured output generation. It automatically:

- Prompts the model with JSON constraints.
- Validates the response against the Pydantic schema.
- Returns only schema-compliant data.

This means you don't need hand-written regex hacks or ad hoc cleanup code to extract fields from model responses.

### Use Pydantic to Define a Schema for Joke Generator 

```python 
class DadJoke(BaseModel):
    setup: str = Field(..., description="Joke setup")
    punchline: str = Field(..., description="Punchline")
```

Here, DadJoke defines a simple schema with two string fields: setup and punchline.

- BaseModel ensures data validation and type checking.
- Field adds helpful descriptions for the model.

By using this model, we tell the LLM: "You must return valid JSON that fits this schema."
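
As a rough illustration of what "fits this schema" means, the sketch below inspects the JSON Schema that Pydantic derives from the class and validates two hand-written replies against it. It assumes Pydantic v2 (`model_json_schema`, `model_validate_json`); the sample JSON strings are made up for the example and are not model output.

```python
from pydantic import BaseModel, Field, ValidationError

class DadJoke(BaseModel):
    setup: str = Field(..., description="Joke setup")
    punchline: str = Field(..., description="Punchline")

# The JSON Schema derived from the class: field names, types, descriptions.
schema = DadJoke.model_json_schema()
print(schema["properties"]["setup"])

# A well-formed reply parses into a typed object.
good = '{"setup": "Why was six afraid of seven?", "punchline": "Because seven ate nine."}'
joke = DadJoke.model_validate_json(good)
print(joke.punchline)

# A reply with a missing field is rejected.
bad = '{"setup": "Why was six afraid of seven?"}'
try:
    DadJoke.model_validate_json(bad)
except ValidationError as err:
    print(f"Rejected with {len(err.errors())} error(s)")
```

The field descriptions end up inside the schema, which is why they are worth writing carefully.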

### Generation Function 

```python
def generate_dad_joke(topic: str):
    prompt = f"Tell a perfect dad joke about {topic}."
    joke = model(prompt, DadJoke)
    return joke
```

This function builds a natural-language prompt and runs the model with DadJoke as the schema.
The call model(prompt, DadJoke) requests output that fits the schema; in the full listing below, the result comes back as a JSON string and is parsed with json.loads.
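
Because the function only needs something callable as `model(prompt, schema)`, it can be exercised without a running Ollama server. The sketch below swaps in a hypothetical stand-in, `fake_model`, that returns a canned JSON string. It is not part of Outlines, and it assumes (as the full listing does) that the real call returns a JSON string. The extra `model` parameter is also an addition for testing, not part of the tutorial's function.

```python
import json

def fake_model(prompt: str, schema=None) -> str:
    """Hypothetical stand-in for the Outlines model: ignores the schema, returns canned JSON."""
    return json.dumps({
        "setup": f"Echoing the prompt: {prompt}",
        "punchline": "This is a placeholder punchline.",
    })

def generate_dad_joke(topic: str, model=fake_model) -> dict:
    """Same shape as the tutorial function, with the model passed in for testing."""
    prompt = f"Tell a perfect dad joke about {topic}."
    raw = model(prompt, None)  # the real call passes the DadJoke schema here
    return json.loads(raw)

joke = generate_dad_joke("soccer")
print(joke["setup"])
print(joke["punchline"])
```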

OK, now that we have walked through the major highlights of the code, let's start generating jokes! 

In [3]:
from outlines import from_openai
from pydantic import BaseModel, Field
from openai import OpenAI
import json

# Ollama OpenAI-compatible endpoint
openai_client = OpenAI(
    api_key="ollama",  # Dummy key for local Ollama
    base_url="http://localhost:11434/v1"  # Ollama's OpenAI-compatible API
)

model = from_openai(openai_client, model_name="qwen3:latest")

# Simple Pydantic structure
class DadJoke(BaseModel):
    setup: str = Field(..., description="Joke setup")
    punchline: str = Field(..., description="Punchline")

def generate_dad_joke(topic: str):
    """Generate perfect dad jokes with guaranteed JSON structure"""
    prompt = f"Tell a perfect dad joke about {topic}."
    
    # from_openai handles structured generation automatically
    joke = model(prompt, DadJoke)
    return joke

# Demo
topic = input("Dad joke about? ")
joke = generate_dad_joke(topic)
joke = json.loads(joke)
# print(f"Q: {joke['setup']}")
# print(f"A: {joke['punchline']} 😂")
print("Question: {}".format(joke['setup']))
print("Answer: {}".format(joke['punchline']))

Dad joke about?  soccer


Question: Why did the soccer player get kicked out of the data meeting?
Answer: Because he kept making offside assists - his data was always in the wrong half!


## Part 2: Building a Math-Enabled Dad Joke Agent

In the first example, we created a simple structured generator that could produce dad jokes with a predictable JSON output. That version gave us reliability and clean data, but it didn't *think* about the joke; it just produced one in a single shot.

In this section, we'll take the next step and build a **reasoning-enabled math-joke agent**. This new version can:
- Think through multiple steps before deciding on a punchline.
- Use a **calculator tool** to double-check its math.
- Still produce structured, validated output using the same pattern from before.

This introduces a key concept for agent design: **structured reasoning**. Instead of returning just data, the model communicates its thought process and actions in a controlled schema.  Below we highlight key code differences:

### 1. Expanding the Schema System

To support reasoning and tool use, we introduce two new Pydantic models in addition to our original `DadJoke` structure:

```python
class ToolCall(BaseModel):
    name: str = Field("calculator", description="Use this tool to perform simple arithmetic.")
    x: float = Field(..., description="First number")
    y: float = Field(..., description="Second number")
    operation: str = Field(..., description="Operation: 'add', 'subtract', 'multiply', or 'divide'")

class AgentAction(BaseModel):
    thought: str = Field(..., description="Internal reasoning about what to do next.")
    action: Optional[ToolCall] = Field(None, description="Tool call if calculation is needed.")
    final_joke: Optional[DadJoke] = Field(None, description="Final dad joke when ready.")
```

The ToolCall schema defines how the model can request a calculation, while AgentAction acts as the brain of the agent: it contains what the model thinks, what action (if any) it wants to take, and finally the completed joke. The AgentAction schema follows the ReAct (reason and act) pattern.

This is a powerful extension of structured generation. Instead of controlling the format of a single response, we're now structuring the entire reasoning process. Whether the model is thinking, calculating, or concluding, every step follows a defined schema.
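
The schema as written leaves two things open: `operation` accepts any string, and a step may set both `action` and `final_joke`, or neither. A possible tightening is sketched below, assuming Pydantic v2. The class name `StrictAgentAction` and the validator are hypothetical additions, not part of the tutorial code, and the validator only runs when the reply is parsed in Python; it does not by itself constrain what the model generates.

```python
from typing import Literal, Optional
from pydantic import BaseModel, ValidationError, model_validator

class DadJoke(BaseModel):
    setup: str
    punchline: str

class ToolCall(BaseModel):
    name: str = "calculator"
    x: float
    y: float
    # Literal restricts the field to the four supported operations.
    operation: Literal["add", "subtract", "multiply", "divide"]

class StrictAgentAction(BaseModel):
    thought: str
    action: Optional[ToolCall] = None
    final_joke: Optional[DadJoke] = None

    @model_validator(mode="after")
    def exactly_one_outcome(self):
        # Reject steps that set both fields or neither.
        if (self.action is None) == (self.final_joke is None):
            raise ValueError("Set exactly one of 'action' or 'final_joke'")
        return self

step = StrictAgentAction.model_validate_json(
    '{"thought": "Check 5 x 5", "action": {"x": 5, "y": 5, "operation": "multiply"}}'
)
print(step.action.operation)

try:
    StrictAgentAction.model_validate_json('{"thought": "Nothing to do"}')
except ValidationError:
    print("Rejected: neither action nor final_joke was set")
```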

### 2. Giving the Agent a Calculator

We add a simple tool for performing arithmetic operations:

```python
def calculator_tool(x: float, y: float, operation: str) -> str:
    """Perform simple math operations including division."""
    try:
        if operation == "add":
            return str(x + y)
    ...
```

Even though the model is generating creative text, arithmetic precision comes from the code, not language prediction. When the model "calls" this tool, our code executes the function, captures the result, and feeds it back as an observation for the next reasoning step. This demonstrates how structured agents blend statistical reasoning (language generation) and symbolic computation (exact math).
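
The execute-and-observe step can be sketched on its own, without a model in the loop. The version below is an alternative layout of the calculator using a dispatch table from the standard `operator` module. The helper `run_tool_call` and the plain-dict tool request are hypothetical simplifications of the `ToolCall` object.

```python
import operator

# Map operation names to exact arithmetic functions.
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}

def calculator_tool(x: float, y: float, operation: str) -> str:
    """Return the result as a string, or an error message the model can read."""
    func = OPERATIONS.get(operation)
    if func is None:
        return f"Error: Unknown operation '{operation}'"
    if operation == "divide" and y == 0:
        return "Error: Division by zero"
    return str(func(x, y))

def run_tool_call(call: dict) -> str:
    """Execute a tool request and format the result as an observation."""
    result = calculator_tool(call["x"], call["y"], call["operation"])
    return f"Calculator result: {result}"

print(run_tool_call({"x": 100.0, "y": 4.0, "operation": "divide"}))
print(run_tool_call({"x": 1.0, "y": 0.0, "operation": "divide"}))
```

Returning errors as strings rather than raising keeps the loop running, so the model can read the message and try a different call.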

### 3. Creating an Agent Loop 

Now we build a reasoning loop that allows the model to think and choose to either call the calculator, interpret results, or make the final joke:

```python 
def generate_dad_joke_agent(topic: str, max_steps: int = 3):
    ...
    for step in range(max_steps):
        response = model(prompt, AgentAction)
        action_obj = AgentAction.model_validate_json(response)  # parse the JSON reply into the schema
        if action_obj.action:
            result = calculator_tool(...)
            observations.append(f"Calculator result: {result}")
        elif action_obj.final_joke:
            return action_obj.final_joke

    ...
```

Each iteration represents one reasoning cycle. The model produces a structured AgentAction, which might include a ToolCall. If it does, our program executes that tool and provides the result as context in the next turn.

This is how we move from single-turn generation to multi-step reasoning: the agent uses a feedback loop to verify and improve its reasoning before producing the final answer.
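
To watch the control flow without a live model, the loop can be driven by scripted replies. The sketch below is a simplified stand-in, not the tutorial's implementation: `SCRIPTED_REPLIES`, `run_agent`, and the joke text are made up, and plain dicts replace the Pydantic objects.

```python
import json

# Scripted replies standing in for the model: one tool call, then the final joke.
SCRIPTED_REPLIES = [
    json.dumps({
        "thought": "Check 100 / 4.",
        "action": {"x": 100, "y": 4, "operation": "divide"},
        "final_joke": None,
    }),
    json.dumps({
        "thought": "The result is 25, so the math holds.",
        "action": None,
        "final_joke": {
            "setup": "Why is 100 so generous?",
            "punchline": "Split four ways, everyone still gets 25.",
        },
    }),
]

def calculator_tool(x: float, y: float, operation: str) -> str:
    """Tiny calculator for the demo; only the operations the script needs."""
    if operation == "divide":
        return "Error: Division by zero" if y == 0 else str(x / y)
    if operation == "multiply":
        return str(x * y)
    return f"Error: Unknown operation '{operation}'"

def run_agent(replies, max_steps: int = 3):
    """Walk the think/act/observe cycle over pre-written replies."""
    observations = []
    for step in range(max_steps):
        if step >= len(replies):
            break
        action = json.loads(replies[step])  # the real loop calls the model here
        if action["action"]:
            call = action["action"]
            result = calculator_tool(call["x"], call["y"], call["operation"])
            observations.append(f"Calculator result: {result}")
        elif action["final_joke"]:
            return action["final_joke"], observations
        else:
            break
    return None, observations

joke, observations = run_agent(SCRIPTED_REPLIES)
print(observations)
print(joke["punchline"])
```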

### 4. Prompt Template 

Finally, we need to provide a detailed prompt to the agent instructing it on how to reason and choose appropriate next steps. 

```python
context = f"""
You are a REASONING-ENABLED Dad Joke Agent writing a math-themed joke about the number '{topic}'.
...
"""
```
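
The template is only the first message; on later steps the loop also has to carry tool calls and observations forward. Here is a minimal sketch of one way to assemble that per-step prompt, assuming the same list-of-dicts message format as the full listing. The helper name `build_prompt` and its `window` parameter are hypothetical. The point it illustrates is keeping the instructions pinned while the history grows.

```python
from typing import Dict, List

def build_prompt(messages: List[Dict[str, str]], window: int = 4) -> str:
    """Join the instructions with the most recent turns into one prompt string."""
    if not messages:
        return ""
    instructions, history = messages[0], messages[1:]
    recent = history[-window:] if window > 0 else []
    return "\n".join(f"{m['role']}: {m['content']}" for m in [instructions] + recent)

# Illustrative history: the instructions followed by three tool-call turns.
messages = [{"role": "user", "content": "You are a Dad Joke Agent..."}]
for i in range(3):
    messages.append({"role": "assistant", "content": f"Action {i}"})
    messages.append({"role": "system", "content": f"Calculator result: {i}"})

prompt = build_prompt(messages)
print(prompt)
```

With a window of four, the oldest tool call drops out of the prompt while the instructions remain on the first line.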

OK, now that we have walked through the major code changes, let's look at the full math-joke agent and run it!

In [4]:
from outlines import from_openai
from pydantic import BaseModel, Field
from openai import OpenAI
from typing import Optional, List, Dict
import json

# Dad joke structure
class DadJoke(BaseModel):
    setup: str = Field(..., description="Joke setup or question")
    punchline: str = Field(..., description="Funny punchline or answer")

# Calculator tool schema (explicit fields, now includes division)
class ToolCall(BaseModel):
    name: str = Field(
        "calculator",
        description="Use this tool to perform simple arithmetic operations: add, subtract, multiply, or divide."
    )
    x: float = Field(..., description="First number for the operation")
    y: float = Field(..., description="Second number for the operation")
    operation: str = Field(
        ..., 
        description="The mathematical operation to perform: one of 'add', 'subtract', 'multiply', 'divide'."
    )

# Agent reasoning schema
class AgentAction(BaseModel):
    thought: str = Field(..., description="Internal reasoning about what to do next.")
    action: Optional[ToolCall] = Field(None, description="Tool call if calculation is needed.")
    final_joke: Optional[DadJoke] = Field(None, description="Final dad joke when ready.")

# Ollama/OpenAI-compatible model
openai_client = OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
model = from_openai(openai_client, model_name="qwen3:latest")

# Calculator tool implementation (now supports division)
def calculator_tool(x: float, y: float, operation: str) -> str:
    """Perform simple math operations including division."""
    try:
        if operation == "add":
            return str(x + y)
        elif operation == "subtract":
            return str(x - y)
        elif operation == "multiply":
            return str(x * y)
        elif operation == "divide":
            if y == 0:
                return "Error: Division by zero"
            return str(x / y)
        else:
            return f"Error: Unknown operation '{operation}'"
    except Exception as e:
        return f"Error: {e}"

# Main agent loop
def generate_dad_joke_agent(topic: str, max_steps: int = 3):
    context = f"""
You are a REASONING-ENABLED Dad Joke Agent writing a math-themed joke about the number '{topic}'.

**Your Objective:**
Create a funny dad joke that includes math related to this number.
The math in the joke **must be correct**. You can use the calculator tool to verify or compute results.

**Available Tool:**
- calculator(x, y, operation): Performs 'add', 'subtract', 'multiply', or 'divide'.

**Reasoning Guidelines:**
1. Think through what kind of math fact or operation fits a joke about {topic}.
2. If you need a computation, call the calculator tool with the appropriate parameters.
3. After receiving an observation, USE that information in your next reasoning step; do NOT repeat the same calculation unless a new one is necessary.
4. Once the math is verified or computed, produce the final dad joke using the `final_joke` field.

**Your output each turn must be one of the following:**
- A reasoning step (`thought`) and a tool call (`action`).
- A reasoning step (`thought`) and the final joke (`final_joke`).

Example Thought Flow:
- Thought: "Maybe 8 divided by 2 equals 4 could fit a joke."
- Action → calculator(8, 2, 'divide')
- Observation: Calculator result: 4
- Thought: "Nice, that's correct! I'll use 4 as the punchline number."
- Final_joke → ("Why did 8 break up with 2? Because it couldn't handle the division.")

Be concise, logical, and make the math part funny.
"""

    messages: List[Dict] = [{"role": "user", "content": context}]
    observations = []

    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")
        
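        # Always keep the instructions (messages[0]) and add the most recent turns;
        # slicing messages[-4:] alone would drop the instructions once the history grows.
        recent = [messages[0]] + messages[1:][-4:]
        prompt = "\n".join([f"{m['role']}: {m['content']}" for m in recent])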
        if observations:
            prompt += f"\nLatest Observation: {observations[-1]}"

        # Generate structured response
        try:
            response = model(prompt, AgentAction)
            action_obj = (
                response if isinstance(response, AgentAction)
                else AgentAction.model_validate_json(response)
            )
        except Exception as e:
            print(f"Parse error: {e}")
            return {"error": "Generation failed"}

        print(f"Thought: {action_obj.thought}")

        # Handle tool calls
        if action_obj.action:
            tool = action_obj.action
            print(f"Tool Call: {tool.name}({tool.x}, {tool.y}, operation={tool.operation})")
            result = calculator_tool(tool.x, tool.y, tool.operation)
            
            observation = f"Calculator result: {result}"
            observations.append(observation)
            
            messages.extend([
                {"role": "assistant", "content": f"Action: calculator({tool.x}, {tool.y}, '{tool.operation}')"},
                {"role": "system", "content": observation}
            ])
        
        elif action_obj.final_joke:
            return action_obj.final_joke

        else:
            print("No action or final joke returned.")
            break

    return {"error": f"Max steps ({max_steps}) reached without joke"}


topic = input("Dad joke about the number: ")
result = generate_dad_joke_agent(topic)

if isinstance(result, dict):  # the agent returns a dict only on failure
    print(f"❌ {result['error']}")
else:
    print("\n😂 JOKE:")
    print("Question: {}".format(result.setup))
    print("Answer: {}".format(result.punchline))


Dad joke about the number:  25



--- Step 1 ---
Thought: Maybe 100 divided by 4 equals 25 could fit a joke.
Tool Call: calculator(100.0, 4.0, operation=divide)

--- Step 2 ---
Thought: I need a math fact involving 25. 5 squared equals 25, which could work for a joke about being a 'square'.
Tool Call: calculator(5.0, 5.0, operation=multiply)

--- Step 3 ---
Thought: Both operations (100 ÷ 4 and 5 × 5) correctly yield 25.0. This demonstrates that different arithmetic operations can result in the same value, highlighting the versatility of mathematical calculations.

😂 JOKE:
Question: Why did the calculator get promoted?
Answer: It always gave the right results - whether dividing, multiplying, or even adding up to 25!


### Exercise 1: Convert the math-joke-agent above into a unit-conversion-joke generator.

Choose your preferred unit conversions, add an additional tool call to do the unit conversion, update prompt templates, then see if your agent can create jokes based on both reliable math and unit conversions! 
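
As a possible starting point for the new tool, here is a small converter that follows the calculator's convention of returning strings (including error strings). The name `unit_converter_tool`, the unit labels, and the choice of units are all hypothetical; the conversion factors are the standard definitions relative to the metre and the kilogram.

```python
# Conversion factors to a base unit per quantity (metre, kilogram).
TO_BASE = {
    "length": {"m": 1.0, "km": 1000.0, "cm": 0.01, "inch": 0.0254, "mile": 1609.344},
    "mass": {"kg": 1.0, "g": 0.001, "lb": 0.45359237},
}

def unit_converter_tool(value: float, from_unit: str, to_unit: str) -> str:
    """Convert between units of the same quantity; return an error string otherwise."""
    for units in TO_BASE.values():
        if from_unit in units and to_unit in units:
            return str(value * units[from_unit] / units[to_unit])
    return f"Error: Cannot convert '{from_unit}' to '{to_unit}'"

print(unit_converter_tool(5, "km", "m"))
print(unit_converter_tool(1, "mile", "kg"))
```

To wire it into the agent you would still need a matching tool-call schema and a branch in the loop that routes to it.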

### Extra Credit / Take Home Exercise 

Build a STEM Joke Agent that can generate either math, physics, or computer science jokes depending on user choice.

- Use structured generation to keep output consistent (schema with topic_type, setup, punchline).
- Introduce reasoning steps using the AgentAction pattern.
- Implement at least one tool relevant to each topic (examples: calculator for math, unit_converter for physics, or ascii_tool for programming humor).
- The agent should think, select a tool, verify correctness, and then output a final structured joke.
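
With more than one tool, the loop needs a way to route a request by name. One common pattern is a registry dictionary, sketched below with the standard library only. The names `TOOLS`, `run_tool`, and `ascii_tool` are hypothetical, and the tools themselves are deliberately minimal.

```python
import operator
from typing import Callable, Dict

def calculator(x: float, y: float, operation: str) -> str:
    """Basic arithmetic, returning the result or an error as a string."""
    ops = {"add": operator.add, "subtract": operator.sub,
           "multiply": operator.mul, "divide": operator.truediv}
    if operation not in ops:
        return f"Error: Unknown operation '{operation}'"
    if operation == "divide" and y == 0:
        return "Error: Division by zero"
    return str(ops[operation](x, y))

def ascii_tool(text: str) -> str:
    """Return the code points of a short string, separated by spaces."""
    return " ".join(str(ord(ch)) for ch in text)

# Registry: tool name -> callable. New tools are added here.
TOOLS: Dict[str, Callable[..., str]] = {
    "calculator": calculator,
    "ascii_tool": ascii_tool,
}

def run_tool(name: str, **kwargs) -> str:
    """Look up a tool by name and call it, reporting problems as strings."""
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: Unknown tool '{name}'"
    try:
        return tool(**kwargs)
    except TypeError as exc:
        return f"Error: Bad arguments for '{name}': {exc}"

print(run_tool("ascii_tool", text="Hi"))
print(run_tool("calculator", x=6.0, y=7.0, operation="multiply"))
print(run_tool("telescope", zoom=2))
```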