# Chapter 3: Structured Responses and Data Validation with Pydantic AI

In this chapter, we'll explore how to enforce structured outputs and validate data using Pydantic AI with the Gemini model. Structured responses ensure that the AI's outputs adhere to a predefined format, enhancing reliability and predictability.

## 1. Introduction to Structured Responses

Structured responses involve defining a schema that the AI's responses must follow. This approach is particularly useful when integrating AI outputs into applications that require consistent data formats. Pydantic AI leverages Pydantic models to define these schemas, enabling automatic validation and parsing of the AI's responses.
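To make the idea concrete, here is a minimal sketch using plain Pydantic (v2 API assumed), with no model call involved. The `WeatherReport` model and its values are hypothetical, invented only for illustration; the point is that raw data is parsed and type-checked against the schema in a single step.

```python
from pydantic import BaseModel

# Hypothetical schema, used only to illustrate the idea
class WeatherReport(BaseModel):
    city: str
    temperature_c: float

# Raw data as it might arrive from an AI response (values are made up)
raw = {"city": "Lisbon", "temperature_c": "21.5"}

# Validation parses the data and coerces compatible types (str -> float here)
report = WeatherReport.model_validate(raw)
print(report.city, report.temperature_c)
```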



Before we start, we need to set up the environment.

In [1]:
import nest_asyncio
nest_asyncio.apply()

from pydantic_ai import Agent
import os
import dotenv

dotenv.load_dotenv()

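# Set your Google API key (fail early with a clear message if it is missing,
# since assigning None to os.environ would raise a TypeError)
api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("GEMINI_API_KEY is not set; add it to your .env file.")
os.environ["GOOGLE_API_KEY"] = api_key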

## 2. Defining a Pydantic Model for Structured Responses

In [2]:
from pydantic import BaseModel
from typing import List

# Define a Pydantic model for a single dictionary tip
class DictionaryTip(BaseModel):
    title: str
    description: str
    code_example: str

# Define a Pydantic model for multiple dictionary tips
class DictionaryTips(BaseModel):
    tips: List[DictionaryTip]

In this example, `DictionaryTip` defines the structure for a single tip, and `DictionaryTips` encapsulates a list of such tips.
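As a rough illustration of what these models provide, the sketch below (Pydantic v2 assumed) validates a hand-written JSON string against `DictionaryTips` and prints the JSON Schema that Pydantic derives from the model. The JSON payload is made up for this example; it is not real model output.

```python
import json
from typing import List

from pydantic import BaseModel

class DictionaryTip(BaseModel):
    title: str
    description: str
    code_example: str

class DictionaryTips(BaseModel):
    tips: List[DictionaryTip]

# A hand-written payload standing in for an AI response
raw_json = json.dumps({
    "tips": [
        {
            "title": "Use get()",
            "description": "Return a default instead of raising KeyError.",
            "code_example": "d.get('missing', 0)",
        }
    ]
})

# Parse and validate the JSON in one step; nested items become DictionaryTip objects
parsed = DictionaryTips.model_validate_json(raw_json)
print(parsed.tips[0].title)

# The JSON Schema describes the shape a valid response must have
schema = DictionaryTips.model_json_schema()
print(json.dumps(schema, indent=2))
```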

## 3. Integrating Structured Responses with Pydantic AI

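To enforce structured outputs, pass the Pydantic model as the `result_type` parameter when initializing the agent. Note that newer Pydantic AI releases rename this parameter to `output_type` and expose the result as `response.output` rather than `response.data`; the code in this chapter uses the older names, so adjust them if your installed version reports a deprecation warning or an error.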

In [9]:
# Initialize the agent with the Gemini model and structured output
agent = Agent(
    'google-gla:gemini-1.5-flash',
    system_prompt='You are a Python expert providing tips on dictionary usage.',
    result_type=DictionaryTips  # Enforcing the structured output
)

# User query
query = 'Provide three tips for using Python dictionaries effectively.'

# Run the agent synchronously
response = agent.run_sync(query)

# Access the structured data
for tip in response.data.tips:
    print(f"Title: {tip.title}")
    print(f"Description: {tip.description}")
    print(f"Code Example:\n{tip.code_example}\n")

Title: Basic Dictionary Operations
Description: This example demonstrates basic dictionary operations: accessing values using keys, adding new key-value pairs, and deleting key-value pairs.
Code Example:
my_dict = {"apple": 1, "banana": 2, "cherry": 3}
print(my_dict["banana"])  # Accessing a value
my_dict["date"] = 4  # Adding a new key-value pair
del my_dict["apple"]  # Deleting a key-value pair

Title: Iterating Through Dictionaries
Description: This example shows how to iterate through a dictionary using a for loop and the items() method.  This is useful when you need to process both keys and values.
Code Example:
for key, value in my_dict.items():
    print(f"Key: {key}, Value: {value}")

Title: Checking for Keys and Safe Value Retrieval
Description: This example demonstrates how to check for the existence of a key using the in operator and how to safely retrieve a value using the get() method with a default value to avoid KeyError exceptions.
Code Example:
my_dict = {"apple": 1, "

In this setup, the agent is instructed to format its response according to the `DictionaryTips` model. If the AI's output validates against that schema, `response.data` is a `DictionaryTips` instance whose fields can be accessed as ordinary Python attributes.
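Because the result is an ordinary Pydantic object, it can be converted to plain Python data or JSON for the rest of an application. The sketch below builds a `DictionaryTips` instance by hand as a stand-in for `response.data` (so it runs without an API key) and assumes Pydantic v2 method names.

```python
from typing import List

from pydantic import BaseModel

class DictionaryTip(BaseModel):
    title: str
    description: str
    code_example: str

class DictionaryTips(BaseModel):
    tips: List[DictionaryTip]

# Hand-built stand-in for response.data (not real model output)
tips = DictionaryTips(
    tips=[
        DictionaryTip(
            title="Iterate with items()",
            description="Loop over keys and values together.",
            code_example="for k, v in d.items(): print(k, v)",
        )
    ]
)

# Convert to a plain dict, e.g. for storing or passing to other code
as_dict = tips.model_dump()

# Convert to a JSON string, e.g. for an HTTP response or a file
as_json = tips.model_dump_json(indent=2)
print(as_json)

# The JSON round-trips back into an equal model instance
restored = DictionaryTips.model_validate_json(as_json)
```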

## 4. Handling Validation Errors

There might be instances where the AI's output doesn't conform to the defined Pydantic model, leading to validation errors. It's essential to handle these exceptions gracefully.

In [12]:
from pydantic import ValidationError
from pydantic_ai.exceptions import UnexpectedModelBehavior

# Initialize the agent with the Gemini model and structured output
agent = Agent(
    'google-gla:gemini-1.5-flash',
    system_prompt='You have to try to output with a false response',
    result_type=DictionaryTips  # Enforcing the structured output
)

# User query
query = 'Provide a response in a list of dictionaries that breaks the structure by having a string instead of a dictionary'

try:
    # Run the agent synchronously
    response = agent.run_sync(query)
    # Access the structured data
    for tip in response.data.tips:
        print(f"Title: {tip.title}")
        print(f"Description: {tip.description}")
        print(f"Code Example:\n{tip.code_example}\n")
except (ValidationError, UnexpectedModelBehavior) as e:
    # Pydantic AI may ask the model to retry on invalid output and raise
    # UnexpectedModelBehavior once retries run out, so catch both.
    print("Validation Error:", e)
    print("The AI's response did not match the expected structure.")

Title: This is a title
Description: This is a description
Code Example:
print(1)

Title: This is another title
Description: This is another description
Code Example:
print(2)



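I couldn't break the structure with the prompt above: the model still returned valid tips. Even so, it is important to handle validation failures so that your application stays robust when the AI's output is unexpected.

By wrapping the agent's execution in a try-except block, you can catch `ValidationError` exceptions and handle them appropriately. Keep in mind that Pydantic AI typically sends validation errors back to the model and asks it to retry, so a persistent failure may surface as `UnexpectedModelBehavior` (from `pydantic_ai.exceptions`) rather than a raw `ValidationError`; it is worth catching both.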
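Since the prompt above did not produce an invalid response, a validation failure can instead be reproduced locally. This sketch (Pydantic v2 assumed) validates a hand-made payload in which one list entry is a string rather than a dictionary. The payload is invented for the demonstration and no model call is involved.

```python
from typing import List

from pydantic import BaseModel, ValidationError

class DictionaryTip(BaseModel):
    title: str
    description: str
    code_example: str

class DictionaryTips(BaseModel):
    tips: List[DictionaryTip]

# Made-up payload: the second entry is a string, not a dictionary
bad_payload = {
    "tips": [
        {
            "title": "Valid tip",
            "description": "This entry matches the schema.",
            "code_example": "d = {}",
        },
        "just a string, not a dictionary",
    ]
}

error_locations = []
try:
    DictionaryTips.model_validate(bad_payload)
except ValidationError as e:
    # e.errors() lists each problem with its location inside the payload
    for err in e.errors():
        error_locations.append(err["loc"])
        print(f"Invalid value at {err['loc']}: {err['msg']}")
```

The reported location pinpoints the offending list entry, which is useful for logging or for deciding whether to retry the request.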

## 5. Benefits of Structured Responses

Implementing structured outputs offers several advantages:

- **Consistency**: Ensures that the AI's responses adhere to a predefined format, making it easier to parse and utilize the data.
- **Reliability**: Reduces the chances of unexpected or malformed outputs, enhancing the robustness of your application.
- **Integration**: Facilitates seamless integration of AI outputs into systems that require specific data structures.

## Conclusion

In this chapter, we've explored how to enforce structured outputs and validate data using Pydantic AI with the Gemini model. By defining Pydantic models and integrating them into your agents, you can ensure that the AI's responses are consistent, reliable, and aligned with your application's requirements.

In the next chapter, we'll delve into extending agent capabilities by incorporating tools and custom functions, enabling agents to perform more complex tasks and provide enriched responses.