# Lesson 7: Structured Output

## The Problem: Free-Form Text

In previous lessons, the agent returned **free-form text** — easy for humans to read, but **hard for code to process**.

For example, if you ask the agent to create an article outline, it returns:
```
Here's an outline for the SEO article:
1. Introduction to SEO
2. On-page factors...
```

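**Problem:** How would your code extract just the title, or the list of sections? You would have to parse the string by hand, and that parsing breaks whenever the model changes its wording or numbering.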
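
To see why this is fragile, here is a minimal sketch of hand-parsing such a reply with a regular expression. The reply strings are made up for illustration, and the pattern assumes every section sits on its own line starting with a number and a period.

```python
import re

# Hypothetical free-form reply, modeled on the example above
reply = """Here's an outline for the SEO article:
1. Introduction to SEO
2. On-page factors
3. Technical SEO"""

# Brittle approach: assume each section is a line of the form "N. text"
pattern = r"^\d+\.\s+(.*)$"
sections = re.findall(pattern, reply, flags=re.MULTILINE)
print(sections)

# The same pattern finds nothing if the model switches to bullet points
reply_bullets = "Outline:\n- Introduction to SEO\n- On-page factors"
sections_bullets = re.findall(pattern, reply_bullets, flags=re.MULTILINE)
print(sections_bullets)
```

The first call works only because the reply happens to match the pattern. The second returns an empty list without raising any error, which is the kind of silent failure structured output is meant to avoid.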

**Solution: Structured Output** — force the agent to return data in a **predefined format**, like filling out a form.

```python
# First, let's see what free-form text looks like
from dotenv import load_dotenv
load_dotenv()

from agno.agent import Agent
from agno.models.anthropic import Claude

agent_free = Agent(
    name="Free Text Agent",
    model=Claude(id="claude-sonnet-4-5-20250929"),
    instructions=["Create an outline for an SEO article."],
)

response = agent_free.run("Create an outline for an article about 'On-Page SEO Optimization'")
print("--- Free-form text (hard to process in code) ---")
print(response.content)
print(f"\nData type: {type(response.content)}")
# It's just one long string: you can't easily extract the title or sections!
```

## First: What is JSON?

Before we learn structured output, you need to understand **JSON** — the format agents use to return structured data.

**JSON** (JavaScript Object Notation) is a universal text format for data. It looks almost identical to Python dicts and lists:

```json
{
  "title": "SEO Guide 2026",
  "word_count": 2000,
  "keywords": ["seo", "ranking", "optimization"],
  "published": true
}
```

**JSON vs Python dict — spot the similarities:**

| Feature | Python dict | JSON |
|---------|-------------|------|
| Key-value pairs | `{"name": "Viet"}` | `{"name": "Viet"}` |
| Lists | `["a", "b"]` | `["a", "b"]` |
| Numbers | `20` | `20` |
| Booleans | `True` / `False` | `true` / `false` |
| Null | `None` | `null` |

Why does JSON matter? When an agent returns structured data, the model emits it as JSON text, and that text is then parsed into Python objects you can use in code. That's what `output_schema` sets up: it tells the agent "return your answer as JSON in this exact format."
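
As a quick illustration using only Python's standard `json` module, the sketch below parses the JSON text from above into a dict and converts a dict back into JSON. The second dict (`draft`, `editor`) is a made-up example.

```python
import json

# JSON is just text; json.loads turns it into Python dicts, lists, and values
raw = (
    '{"title": "SEO Guide 2026", "word_count": 2000, '
    '"keywords": ["seo", "ranking", "optimization"], "published": true}'
)
data = json.loads(raw)

print(type(data))           # a Python dict
print(data["keywords"][0])  # list access works as usual
print(data["published"])    # JSON true became Python True

# Going the other way: Python False/None become JSON false/null
as_json = json.dumps({"draft": False, "editor": None})
print(as_json)
```

Note that JSON strings and keys always use double quotes, while Python accepts single or double quotes.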

## Pydantic Models — Creating a "Form Template" for Data

**Pydantic** lets you define what your data should look like. Think of it as creating a **form template**:

```python
from pydantic import BaseModel

class ArticleOutline(BaseModel):
    title: str          # Title field (text)
    sections: list[str] # Sections field (list of text)
    target_keyword: str # Keyword field (text)
```

Each field has:
- A **name** (`title`, `sections`, ...)
- A **type** (`str` = text, `list[str]` = list of text, `int` = whole number)
- An optional **description** (`Field(description="...")`), which helps the agent understand what to fill in
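
A template is more than documentation: Pydantic also checks incoming data against it. The sketch below assumes Pydantic v2 (the version that provides `model_validate_json`) and uses made-up JSON strings to show one input that fits the template and one that does not.

```python
from pydantic import BaseModel, ValidationError

class ArticleOutline(BaseModel):
    title: str
    sections: list[str]
    target_keyword: str

# Made-up JSON that matches the template
good_json = (
    '{"title": "On-Page SEO", "sections": ["Intro", "Title tags"], '
    '"target_keyword": "on-page seo"}'
)
good = ArticleOutline.model_validate_json(good_json)
print(good.sections)

# Made-up JSON that breaks the template:
# "sections" is a string instead of a list, and "target_keyword" is missing
bad_json = '{"title": "On-Page SEO", "sections": "Intro"}'
error_count = 0
try:
    ArticleOutline.model_validate_json(bad_json)
except ValidationError as exc:
    error_count = exc.error_count()
    print(f"Rejected with {error_count} error(s)")
```

Because invalid data raises an error instead of passing through, code that receives an `ArticleOutline` can rely on every field being present with the declared type.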

```python
from pydantic import BaseModel, Field

# Define a "form template" for an article outline
class ArticleOutline(BaseModel):
    title: str = Field(description="Article title")
    sections: list[str] = Field(description="List of main sections")
    target_keyword: str = Field(description="Primary keyword")

# View the model's structure
print("ArticleOutline structure:")
print(ArticleOutline.model_json_schema())
```

## output_schema — Forcing the Agent to Follow the Template

When you add `output_schema=ArticleOutline` to an agent, it will:
1. **Read your template** (ArticleOutline)
2. **Think** and generate content
3. **Fill in the form** following the exact structure
4. Return a **Python object** (an `ArticleOutline` instance) in `response.content`, so you can access `outline.title`, `outline.sections`, etc.

```python
agent = Agent(
    # ...name, model, and instructions stay the same as before...
    output_schema=ArticleOutline,  # Just add this!
)
```
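
If a reply cannot be parsed into the schema, `response.content` may not be an `ArticleOutline`. How a given Agno version reports that case is not covered in this lesson, so treat the following as a sketch under that assumption. `as_outline` is a hypothetical helper (not part of Agno) that accepts either an already-parsed object or raw JSON text and fails loudly otherwise.

```python
from pydantic import BaseModel, ValidationError

class ArticleOutline(BaseModel):
    title: str
    sections: list[str]
    target_keyword: str

def as_outline(content: object) -> ArticleOutline:
    """Return content as an ArticleOutline, or raise TypeError (hypothetical helper)."""
    if isinstance(content, ArticleOutline):
        return content  # already parsed, nothing to do
    if isinstance(content, str):
        try:
            # Assumption: an unparsed reply might still be valid JSON text
            return ArticleOutline.model_validate_json(content)
        except ValidationError as exc:
            raise TypeError("content is text, not a valid outline") from exc
    raise TypeError(f"unexpected content type: {type(content).__name__}")

# Usage with made-up values standing in for response.content
parsed = as_outline(
    '{"title": "On-Page SEO", "sections": ["Intro"], "target_keyword": "seo"}'
)
print(parsed.title)
```

In the notebook you would call it as `outline = as_outline(response.content)`, so that a malformed reply stops the cell immediately instead of failing later on `outline.title`.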

```python
# Agent with structured output
agent = Agent(
    name="Outline Creator",
    model=Claude(id="claude-sonnet-4-5-20250929"),
    output_schema=ArticleOutline,
    instructions=["Create an outline for an SEO article."],
)

response = agent.run("Create an outline for an article about 'On-Page SEO Optimization'")
outline = response.content  # expected to be an ArticleOutline instance, not a string

print(f"Title: {outline.title}")
print(f"Keyword: {outline.target_keyword}")
print("\nSections:")
for i, section in enumerate(outline.sections, 1):
    print(f"  {i}. {section}")
```

## Why Structured Output Matters

Structured data lets you:

1. **Save to a database** — `db.save(outline.title, outline.sections)`
2. **Pass to another agent** — Writer Agent receives the outline from Researcher Agent
3. **Export to files** — Generate CSV, JSON, HTML from the data
4. **Process in code** — Count sections, filter keywords, etc.

This is the foundation for building **automated pipelines** — data flows accurately from one agent to the next.
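
As a small sketch of points 2 and 3, the example below exports the sections to CSV and builds a prompt for a follow-up step, using only the standard library. `outline_data` is a made-up stand-in for what `outline.model_dump()` would return, and the "writer" prompt is hypothetical.

```python
import csv
import io

# Stand-in for outline.model_dump(); the values are made up for illustration
outline_data = {
    "title": "On-Page SEO Optimization",
    "sections": ["Introduction", "Title tags", "Internal links"],
    "target_keyword": "on-page seo",
}

# Export: one CSV row per section (written to memory here instead of a file)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["position", "section"])
for position, section in enumerate(outline_data["sections"], start=1):
    writer.writerow([position, section])
csv_text = buffer.getvalue()
print(csv_text)

# Pass along: build the input for a hypothetical writer agent from exact fields
writer_prompt = (
    f"Write an article titled '{outline_data['title']}' "
    f"targeting '{outline_data['target_keyword']}' "
    f"with these sections: {', '.join(outline_data['sections'])}"
)
print(writer_prompt)
```

Neither step needs to guess where the title ends or how the sections are numbered, because each value is read from a named field.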

```python
# Access structured data easily, just like a regular Python object!
print("--- Accessing structured data ---")
print(f"Title: {outline.title}")
print(f"Keyword: {outline.target_keyword}")
print(f"Number of sections: {len(outline.sections)}")
print(f"First section: {outline.sections[0]}")
print(f"Last section: {outline.sections[-1]}")

# Convert to dict or JSON
print("\n--- As dictionary ---")
print(outline.model_dump())

print("\n--- As JSON ---")
print(outline.model_dump_json(indent=2))
```

## Lesson 7 Summary

What you learned:
- **The problem** with free-form text: hard to process in code
- **Pydantic BaseModel**: create a "form template" for data (name, type, description)
- **output_schema**: force the agent to return data in the exact format
- Access data easily: `outline.title`, `outline.sections[0]`
- Convert data: `.model_dump()` (dict), `.model_dump_json()` (JSON)

**Next lesson:** We'll **chain** multiple agents together — the output of one agent becomes the input for the next!

## Exercise

Modify the `ArticleOutline` schema to add a new field: `word_count_target` (an integer with description "Target word count for the article").

Then create an agent with this updated schema and run it. Check that the agent fills in your new field.

Hints:
- Add `word_count_target: int = Field(description="...")` to the class
- Access it with `outline.word_count_target` after running

```python
# Exercise: Write your code here
```
