# Lesson 4: Structured Outputs

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/harshit-vibes/lyzr-adk-demo/blob/master/notebooks/04_structured_outputs.ipynb)


**🟡 Intermediate · ⏱ 25 min**

---

By default, agents return free-form text. Structured outputs let you define the exact shape of the response with Pydantic models, which gives you type safety, IDE autocomplete, and easy downstream processing.

## What you will learn

- Define Pydantic response models with field descriptions
- Get typed agent responses instead of raw strings
- Work with nested models for hierarchical data
- Handle optional fields gracefully

## Prerequisites

Before starting this lesson, make sure you have completed:

- **Lesson 1:** Getting Started (`Studio`, `create_agent`, `agent.run`)
- **Lesson 2:** Providers and Models
- **Lesson 3:** Agent Lifecycle

`pydantic` is included with `lyzr-adk`, so no separate install is needed. If you are running in a fresh environment, the next cell handles everything.

In [None]:
!pip install "lyzr-adk[jupyter]" -q

In [None]:
import os
from lyzr import Studio
from pydantic import BaseModel, Field
from typing import List, Optional

API_KEY = os.getenv("LYZR_API_KEY", "YOUR_LYZR_API_KEY")
studio = Studio(api_key=API_KEY)
print("Ready!")

## Why Structured Outputs?

When an agent returns free-form text, consuming that output in code requires fragile string parsing:

```python
# Without structured outputs: brittle
response = agent.run("What is the rating of Inception?")
text = response.response  # "I'd rate Inception an 8.7 out of 10..."
# Now you need regex or LLM parsing to extract 8.7
```
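
To make the brittleness concrete, here is a minimal sketch of the regex approach using only the standard library. The `extract_rating` helper and the sample replies are hypothetical and written by hand; they are not real agent output.

```python
import re
from typing import Optional


def extract_rating(text: str) -> Optional[float]:
    """Naive parser: grab a number that is followed by 'out of 10' or '/10'."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*(?:out of 10|/10)", text)
    return float(match.group(1)) if match else None


# Three phrasings of the same answer; the pattern only anticipates the first two.
replies = [
    "I'd rate Inception an 8.7 out of 10.",
    "Inception: 8.7/10, a modern classic.",
    "Inception earns eight and a half stars from me.",
]

ratings = [extract_rating(reply) for reply in replies]
print(ratings)  # [8.7, 8.7, None]
```

Every new phrasing needs another pattern, and a silent `None` is easy to miss downstream.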

Structured outputs solve this by having the agent return a Pydantic model instance directly:

| Feature | Free-form text | Structured output |
|---------|---------------|-------------------|
| Type safety | No | Yes |
| IDE autocomplete | No | Yes |
| Validation | No | Yes (Pydantic) |
| Downstream processing | Fragile parsing | Direct field access |
| Agent guidance | Prompt only | Field names + descriptions |

The field names and `Field(description=...)` values are passed to the agent as guidance, so descriptive field names and clear descriptions lead to more accurate extraction.

## Defining Your First Response Model

Create a Pydantic `BaseModel` subclass. Use `Field(description=...)` to tell the agent exactly what each field should contain. The more specific the description, the more accurate the result tends to be.

In [None]:
class MovieReview(BaseModel):
    title: str = Field(description="The exact title of the movie")
    year: int = Field(description="Release year")
    rating: float = Field(description="Rating out of 10.0")
    summary: str = Field(description="One sentence plot summary")
    pros: List[str] = Field(description="List of 3 things the movie does well")
    cons: List[str] = Field(description="List of 2 weaknesses")
    recommended: bool = Field(description="Whether you recommend watching it")

print("Model defined:", list(MovieReview.model_fields.keys()))
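
One way to see what guidance a model carries is to inspect the JSON schema that Pydantic generates from it. The sketch below uses a trimmed copy of `MovieReview` and Pydantic v2's `model_json_schema()`. Whether `lyzr-adk` sends exactly this schema to the provider is an assumption; the snippet only shows what Pydantic itself exposes.

```python
from typing import List

from pydantic import BaseModel, Field


class MovieReview(BaseModel):
    # Trimmed copy of the lesson's model, kept short for illustration.
    title: str = Field(description="The exact title of the movie")
    rating: float = Field(description="Rating out of 10.0")
    pros: List[str] = Field(description="List of 3 things the movie does well")


schema = MovieReview.model_json_schema()

# Each property carries its JSON type and the description written in Field(...).
for name, spec in schema["properties"].items():
    print(f"{name}: {spec.get('type')} - {spec['description']}")

# Fields without a default are listed as required.
print("Required:", schema["required"])
```

If a description looks vague when read in this form, it is probably vague to the agent as well.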

## Creating a Specialized Agent

The agent's `role`, `goal`, and `instructions` work together with the response model's field descriptions. A focused agent with clear instructions produces more consistent structured responses.

In [None]:
reviewer = studio.create_agent(
    name="Movie Critic",
    provider="openai/gpt-4o",
    role="Professional film critic",
    goal="Provide structured movie reviews",
    instructions="Always fill in all fields completely and accurately.",
    response_model=MovieReview   # pass the Pydantic model at agent creation time
)
print(f"Reviewer agent created: {reviewer.id}")

## Getting a Structured Response

With `response_model` set on the agent, `agent.run()` returns the Pydantic model instance **directly**, not wrapped in an `AgentResponse`.

| | Without `response_model` | With `response_model` |
|---|---|---|
| `run()` returns | `AgentResponse` | Your Pydantic model directly |
| Get text | `response.response` | n/a (the result is already typed) |
| Field access | `str` parsing | `obj.field_name` |

> **Note:** Set `response_model` at `create_agent()` time, not on each `run()` call.

In [None]:
# response_model was set on the agent, so run() returns the Pydantic model directly
review: MovieReview = reviewer.run("Review the movie Inception (2010)")

print(f"Title:       {review.title} ({review.year})")
print(f"Rating:      {review.rating}/10")
print(f"Summary:     {review.summary}")
print(f"Pros:        {', '.join(review.pros)}")
print(f"Cons:        {', '.join(review.cons)}")
print(f"Recommended: {'Yes ✅' if review.recommended else 'No ❌'}")
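
The validation that backs a typed result can be tried locally, without calling an agent. The sketch below uses a hypothetical `StrictReview` model with range and length constraints added, and feeds it hand-written dictionaries in place of agent output. Whether `lyzr-adk` forwards such constraints to the provider is not covered here; this shows Pydantic's behavior only.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class StrictReview(BaseModel):
    # Hypothetical variant of MovieReview with explicit constraints.
    title: str = Field(description="The exact title of the movie")
    rating: float = Field(ge=0, le=10, description="Rating out of 10.0")
    pros: List[str] = Field(min_length=1, description="Things the movie does well")


# A numeric string is coerced to float in Pydantic's default (lax) mode.
good = StrictReview.model_validate(
    {"title": "Inception", "rating": "8.7", "pros": ["Original premise"]}
)
print(good.rating, type(good.rating).__name__)  # 8.7 float

# An out-of-range rating and an empty list are both rejected.
failed_fields = []
try:
    StrictReview.model_validate({"title": "Inception", "rating": 11, "pros": []})
except ValidationError as exc:
    failed_fields = sorted(str(err["loc"][0]) for err in exc.errors())

print("Rejected fields:", failed_fields)
```

Pydantic reports every failing field in a single `ValidationError`, which makes malformed output easier to diagnose than a failed regex match.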

## Nested Models

Pydantic models can contain other models, enabling hierarchical data structures. This is particularly useful for extraction tasks where you need to pull multiple entities from a single piece of text.

```
ExtractedPeople
├── total_count: int
└── people: List[Person]
    ├── name: str
    ├── age: Optional[int]
    └── occupation: str
```

The agent understands the nesting and populates each level correctly.

In [None]:
class Person(BaseModel):
    name: str = Field(description="Full name of the person")
    age: Optional[int] = Field(None, description="Age if mentioned, otherwise None")
    occupation: str = Field(description="Their job or role")

class ExtractedPeople(BaseModel):
    people: List[Person] = Field(description="All people mentioned in the text")
    total_count: int = Field(description="Total number of people found")

extractor = studio.create_agent(
    name="Entity Extractor",
    provider="openai/gpt-4o",
    role="Information extraction specialist",
    goal="Extract structured people data from text",
    instructions="Extract every person mentioned. If age is not stated, leave it as null.",
    response_model=ExtractedPeople   # response_model set at creation time
)

text = "Alice Chen (35, software engineer) and Bob Martinez (designer, mid-30s) co-founded the startup with Dr. Sarah Kim."
data: ExtractedPeople = extractor.run(text)  # returns ExtractedPeople directly

print(f"Found {data.total_count} people:")
for person in data.people:
    # Compare against None explicitly so a legitimate age of 0 is not reported as unknown
    age_str = str(person.age) if person.age is not None else "unknown"
    print(f"  • {person.name}, age {age_str}, {person.occupation}")
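
The nested and optional behavior can also be checked offline. In the sketch below, a hand-written `payload` dictionary stands in for an agent reply (it is not real model output), and the second person omits `age` to show the `None` default.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class Person(BaseModel):
    name: str = Field(description="Full name of the person")
    age: Optional[int] = Field(None, description="Age if mentioned, otherwise None")
    occupation: str = Field(description="Their job or role")


class ExtractedPeople(BaseModel):
    people: List[Person] = Field(description="All people mentioned in the text")
    total_count: int = Field(description="Total number of people found")


# Hand-written stand-in for an agent reply; the second entry has no age.
payload = {
    "people": [
        {"name": "Alice Chen", "age": 35, "occupation": "software engineer"},
        {"name": "Bob Martinez", "occupation": "designer"},
    ],
    "total_count": 2,
}

data = ExtractedPeople.model_validate(payload)

# Nested dicts become Person instances; the missing age falls back to None.
ages = [person.age for person in data.people]
print(ages)  # [35, None]

# A cheap consistency check between the count field and the list itself.
print(data.total_count == len(data.people))
```

Because `total_count` is produced separately from the list, comparing it with `len(data.people)` is a simple sanity check worth keeping in downstream code.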

## Common Mistake: Passing `response_format` to `run()`

The structured output model must be set at **agent creation** via `response_model=`, not passed to each `run()` call as `response_format=`. Passing it to `run()` raises a `TypeError`.

Also: with `response_model` set, `run()` returns the typed object directly, so calling `.response` on it raises an `AttributeError`.

In [None]:
# ❌ Mistake 1: passing response_format to run() instead of create_agent()
no_model_agent = studio.create_agent(
    name="No Model Agent", provider="openai/gpt-4o",
    role="Critic", goal="Review movies", instructions="Review movies."
)
try:
    from pydantic import BaseModel as BM
    class TempModel(BM):
        title: str
    no_model_agent.run("Review Inception", response_format=TempModel)  # ❌
except Exception as e:
    print(f"❌ response_format in run() error: {type(e).__name__}: {e}")

# ❌ Mistake 2: calling .response on a structured result (it's already the model)
matrix: MovieReview = reviewer.run("Review The Matrix (1999)")
try:
    print(matrix.response)   # ❌ MovieReview has no .response attribute
except AttributeError as e:
    print(f"❌ .response on typed result: {e}")

# ✅ Correct: access fields directly, since run() returned the model
print(f"✅ Rating: {matrix.rating}/10")
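
If the same code has to handle results from agents with and without a `response_model`, one option is a small helper that checks the type before touching `.response`. This is a sketch under stated assumptions: `FakeAgentResponse` is a hypothetical stand-in that models only the `.response` attribute described above, and `Verdict` and `to_text` are illustrative names, not part of `lyzr-adk`.

```python
from dataclasses import dataclass

from pydantic import BaseModel


@dataclass
class FakeAgentResponse:
    """Hypothetical stand-in for AgentResponse; only .response is modeled."""
    response: str


class Verdict(BaseModel):
    # Illustrative response model, not part of the lesson's agents.
    title: str
    rating: float


def to_text(result) -> str:
    """Return printable text for either a typed model or a text wrapper."""
    if isinstance(result, BaseModel):
        # Typed result: serialize the fields instead of reading .response.
        return result.model_dump_json()
    return result.response


print(to_text(FakeAgentResponse(response="A mind-bending heist film.")))
print(to_text(Verdict(title="The Matrix", rating=8.7)))
```

In most projects it is simpler to keep typed and untyped agents in separate code paths; the helper is mainly useful in shared logging or debugging utilities.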

## Exercise

Define a `ProductSummary` Pydantic model with the following fields:

| Field | Type | Description |
|-------|------|-------------|
| `name` | `str` | Product name |
| `price_usd` | `float` | Price in US dollars |
| `category` | `str` | Product category |
| `features` | `List[str]` | Key product features |
| `in_stock` | `bool` | Whether the product is currently in stock |

Then create an extraction agent and extract structured data from the product description below.

> **Hint:** Write a descriptive `Field(description=...)` for each field; this directly guides what the agent extracts.

In [None]:
# TODO: Define a ProductSummary Pydantic model
class ProductSummary(BaseModel):
    name: str = Field(description=...)
    price_usd: float = Field(description=...)
    category: str = Field(description=...)
    features: List[str] = Field(description=...)
    in_stock: bool = Field(description=...)

# TODO: Create an agent with response_model=ProductSummary
product_agent = studio.create_agent(
    name=...,
    provider="openai/gpt-4o",
    role=...,
    goal=...,
    instructions=...,
    response_model=ProductSummary   # ← response_model goes here, not in run()
)

# TODO: Extract structured data; run() returns ProductSummary directly
description = """
The TurboBlend Pro 3000 is a high-performance blender priced at $149.99.
It features a 1200W motor, 6 stainless steel blades, 64oz BPA-free container,
and 5 speed settings. Currently in stock in the Kitchen Appliances category.
"""

product: ProductSummary = product_agent.run(description)
print(f"Product: {product.name}")
print(f"Price: ${product.price_usd}")
# ... print other fields

## Summary

### Structured vs Unstructured Output

| | Unstructured | Structured |
|---|---|---|
| Where model is set | n/a | `response_model=MyModel` in `create_agent()` |
| `run()` returns | `AgentResponse` | `MyModel` instance directly |
| Get text | `response.response` | Direct field access |
| Type safety | None | Full Pydantic validation |
| Optional fields | Manual handling | `Optional[T] = None` |
| Nested data | Complex parsing | Nested `BaseModel` |

### Key Takeaways

1. **`response_model` goes on `create_agent()`**, not on each `run()` call.
2. **`run()` returns the model directly**: there is no `.response` wrapper when `response_model` is set.
3. **Use `Field(description=...)`**: descriptions guide the agent, and more specific descriptions lead to more accurate extraction.
4. **Each agent has one response shape**: create separate agents for different output models.
5. **Use `Optional[T] = None`** for fields that may not always be present in the source text.

## Next Steps

**Lesson 5: Memory and Sessions.** Learn how to give agents persistent memory so they can maintain context across multiple turns of a conversation.

---

*Lesson 4 of 10 · lyzr-adk fundamentals series*