# Pydantic — Step‑by‑Step Master Notebook


---

# Lesson 2: Pydantic Basics

In this lesson, you'll learn the fundamentals of Pydantic models for data validation using a customer support system as your example application. You'll see how to define data models, validate user input, and handle validation errors gracefully.

By the end of this lesson, you'll be able to:
- Create Pydantic models to validate user input data
- Handle validation errors with proper error handling
- Use optional fields and field constraints in your models
- Work with JSON data validation methods

---

In [None]:
# Import libraries needed for the lesson
from pydantic import BaseModel, ValidationError, EmailStr
import json

### Define a UserInput Pydantic model and populate it with data

In [None]:
# Create a Pydantic model for validating user input
class UserInput(BaseModel):
    name: str
    email: EmailStr
    query: str

In [None]:
# Create a model instance
user_input = UserInput(
    name="Joe User", 
    email="joe.user@example.com", 
    query="I forgot my password."
)
print(user_input)

### Note: the following cell will produce a validation error. You can correct the error by following along with the video, or just proceed with the rest of the notebook as cells below do not depend on this cell. 

In [None]:
# Attempt to create another model instance with an invalid email
user_input = UserInput(
    name="Joe User", 
    email="not-an-email", 
    query="I forgot my password."
)
print(user_input)

### Define a function for error handling and try different inputs

In [None]:
# Define a function to handle user input validation safely
def validate_user_input(input_data):
    try:
        # Attempt to create a UserInput model instance from user input data
        user_input = UserInput(**input_data)
        print("✅ Valid user input created:")
        print(user_input.model_dump_json(indent=2))
        return user_input
    except ValidationError as e:
        # Capture and display validation errors in a readable format
        print("❌ Validation error occurred:")
        for error in e.errors():
            # Join every part of the location so nested fields stay readable
            location = ".".join(str(part) for part in error["loc"])
            print(f"  - {location}: {error['msg']}")
        return None
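
Each entry returned by `e.errors()` is a dictionary, and the `loc`, `msg`, and `type` keys are the ones you will usually read. The sketch below uses a hypothetical `Order` model and a hypothetical `summarize_errors` helper (neither is part of the lesson) to show what those entries look like when one field has the wrong type and another is missing.

```python
from pydantic import BaseModel, ValidationError


# Hypothetical model, used only to illustrate the error structure
class Order(BaseModel):
    order_id: int
    item: str


def summarize_errors(exc: ValidationError) -> list[str]:
    """Return one 'location: message' line per validation error."""
    lines = []
    for error in exc.errors():
        # loc is a tuple such as ('order_id',); join it to handle nesting
        location = ".".join(str(part) for part in error["loc"])
        lines.append(f"{location}: {error['msg']}")
    return lines


try:
    Order(order_id="not-a-number")  # wrong type, and `item` is missing
except ValidationError as e:
    # Map each failing field to the machine-readable error type
    error_types = {err["loc"][0]: err["type"] for err in e.errors()}
    for line in summarize_errors(e):
        print(line)
```

The `type` value is stable and suited to branching in code, while `msg` is meant for people to read.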

In [None]:
# Create an instance of UserInput using validate_user_input() function
input_data = {
    "name": "Joe User", 
    "email": "joe.user@example.com",
    "query": "I forgot my password."
}

user_input = validate_user_input(input_data)

In [None]:
# Attempt to create an instance of UserInput with missing query field
input_data = {
    "name": "Joe User", 
    "email": "joe.user@example.com"
}

user_input = validate_user_input(input_data)

### Update your UserInput data model with additional fields and experiment with different input data

In [None]:
# Import additional libraries for enhanced validation
from pydantic import Field
from typing import Optional
from datetime import date

# Define a new UserInput model with optional fields
class UserInput(BaseModel):
    name: str
    email: EmailStr
    query: str
    order_id: Optional[int] = Field(
        None,
        description="5-digit order number (cannot start with 0)",
        ge=10000,
        le=99999
    )
    purchase_date: Optional[date] = None
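
To see how the `ge` and `le` bounds behave at their edges, here is a minimal sketch. The `OrderRef` model and the `is_valid` helper are hypothetical and only isolate the `order_id` constraint from the model above.

```python
from typing import Optional
from pydantic import BaseModel, Field, ValidationError


# Hypothetical model that isolates the order_id constraint
class OrderRef(BaseModel):
    order_id: Optional[int] = Field(None, ge=10000, le=99999)


def is_valid(value) -> bool:
    """Return True if `value` is accepted as an order_id."""
    try:
        OrderRef(order_id=value)
        return True
    except ValidationError:
        return False


# None is allowed because the field is Optional; both bounds are inclusive
results = {v: is_valid(v) for v in (None, 9999, 10000, 99999, 100000)}
print(results)
```

Because both bounds are inclusive, `10000` and `99999` pass while `9999` and `100000` fail.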

In [None]:
# Define a dictionary with required fields only
input_data = {
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "I forgot my password."
}

# Validate the user input data
user_input = validate_user_input(input_data)

In [None]:
print(user_input)

In [None]:
# Define a dictionary with all fields including optional ones
input_data = {
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": f"""I bought a laptop carrying case and it turned out to be 
             the wrong size. I need to return it.""",
    "order_id": 12345,
    "purchase_date": date(2025, 12, 31)
}

# Validate the user input data
user_input = validate_user_input(input_data)

In [None]:
# Define a dictionary with all fields plus extra fields the model does not declare
input_data = {
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": f"""I bought a laptop carrying case and it turned out to be 
             the wrong size. I need to return it.""",
    "order_id": 12345,
    "purchase_date": date(2025, 12, 31),
    "system_message": "logging status regarding order processing...",
    "iteration": 1 
}

# Validate the user input data
user_input = validate_user_input(input_data)

In [None]:
print(user_input)
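
By default, Pydantic ignores keys that the model does not declare, which is why the extra fields above do not appear in the printed instance. If you would rather reject them, one option is the `extra="forbid"` setting. The sketch below compares the two behaviours using hypothetical `LenientInput` and `StrictInput` models.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LenientInput(BaseModel):
    # Default behaviour: unknown keys are silently dropped
    name: str


class StrictInput(BaseModel):
    # extra="forbid" turns unknown keys into validation errors
    model_config = ConfigDict(extra="forbid")
    name: str


data = {"name": "Joe User", "iteration": 1}

lenient = LenientInput(**data)
print(lenient.model_dump())

try:
    StrictInput(**data)
    rejected = False
except ValidationError as e:
    rejected = True
    print(e.errors()[0]["type"])
```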

In [None]:
# Create an instance of UserInput with purchase_date given as an ISO-format string
input_data = {
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": f"""I bought a laptop carrying case and it turned out to be 
             the wrong size. I need to return it.""",
    "order_id": 12345,
    "purchase_date": "2025-12-31"
}

user_input = validate_user_input(input_data)

In [None]:
# Define order_id as a string
input_data = {
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": f"""I bought a laptop carrying case and it turned out to be 
             the wrong size. I need to return it.""",
    "order_id": "12345",
    "purchase_date": "2025-12-31"
}

# Validate the user input data
user_input = validate_user_input(input_data)

In [None]:
# Define name field as an integer
input_data = {
    "name": 99999,
    "email": "joe.user@example.com",
    "query": f"""I bought a laptop carrying case and it turned out to be 
             the wrong size. I need to return it.""",
    "order_id": 12345,
    "purchase_date": "2025-12-31"
}

# Validate the user input data
user_input = validate_user_input(input_data)
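
The last two cells show Pydantic's default (lax) conversion rules: a numeric string is converted to `int`, but an `int` is not converted to `str`. The sketch below repeats both cases on a hypothetical `Sample` model and adds `strict=True`, which switches off the string-to-int conversion as well.

```python
from pydantic import BaseModel, ValidationError


class Sample(BaseModel):  # hypothetical model for illustration
    name: str
    order_id: int


# Lax mode (the default): a numeric string is converted to int
lax = Sample.model_validate({"name": "Joe User", "order_id": "12345"})
print(type(lax.order_id))

# An int is not converted to str, even in lax mode
try:
    Sample.model_validate({"name": 99999, "order_id": 12345})
    int_name_accepted = True
except ValidationError:
    int_name_accepted = False

# Strict mode rejects the numeric string as well
try:
    Sample.model_validate(
        {"name": "Joe User", "order_id": "12345"}, strict=True
    )
    strict_accepted = True
except ValidationError:
    strict_accepted = False

print(int_name_accepted, strict_accepted)
```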

### Try starting with JSON data as input

In [None]:
# Define user input as JSON data
json_data = '''
{
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "I bought a keyboard and mouse and was overcharged.",
    "order_id": 12345,
    "purchase_date": "2025-12-31"
}
'''

# Parse the JSON string into a Python dictionary
input_data = json.loads(json_data)
print("Parsed JSON:", input_data)

In [None]:
# Validate the user input data
user_input = validate_user_input(input_data)

In [None]:
# Try different JSON input
json_data = '''
{
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "My account has been locked for some reason.",
    "order_id": "01234",
    "purchase_date": "2025-12-31"
}
'''

# Parse the JSON into a Python dictionary
input_data = json.loads(json_data)
print("Parsed JSON:", input_data)

In [None]:
# Validate the customer support data from JSON with non-standard formats
user_input = validate_user_input(input_data)

### Try the `model_validate_json` method

### Note: the following cell will produce a validation error. You can correct the error by following along with the video. 

In [None]:
# Parse JSON and validate user input data in one step using model_validate_json method
user_input = UserInput.model_validate_json(json_data)
print(user_input.model_dump_json(indent=2))
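
`model_validate_json` combines the `json.loads` step and the validation step. As a minimal sketch on a hypothetical `Note` model, the two routes below produce equal instances, and malformed JSON is reported as a `ValidationError` rather than a `json.JSONDecodeError`.

```python
import json
from pydantic import BaseModel, ValidationError


class Note(BaseModel):  # hypothetical model for illustration
    title: str
    stars: int


raw = '{"title": "Great support", "stars": 5}'

# Two-step route: parse with json, then validate the dictionary
two_step = Note.model_validate(json.loads(raw))
# One-step route: parse and validate together
one_step = Note.model_validate_json(raw)
print(one_step == two_step)

# Malformed JSON surfaces as a ValidationError
try:
    Note.model_validate_json('{"title": "Great support", "stars": ')
    malformed_raised = False
except ValidationError:
    malformed_raised = True

print(malformed_raised)
```

Catching a single exception type for both parsing and validation failures keeps the error handling simple.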

---

## Conclusion

In this lesson, you learned how to use Pydantic models to validate user input for a customer support scenario. By defining clear data models and handling validation errors, you can ensure your code only works with well-formed data. This approach helps you build more robust and reliable applications, and sets the stage for more advanced validation and structured output in future lessons.

# Lesson 3: Prompting for structure and setting up a retry method 

In this notebook, you'll learn how to combine Pydantic with retry strategies to reliably extract structured output from an LLM.

By the end, you'll be able to:
- Define structured data models for LLM responses
- Build robust retry mechanisms for validation errors
- Create reusable functions for LLM interactions

---

### Import packages and initialize the OpenAI client

In [None]:
# Import necessary packages
from pydantic import BaseModel, ValidationError, Field, EmailStr
from typing import List, Literal, Optional
import json
from datetime import date
from dotenv import load_dotenv
import openai

In [None]:
# Load environment variables for API access
load_dotenv()
# Initialize OpenAI client for API calls
client = openai.OpenAI()

### Define some sample input data

In [None]:
# Define a JSON string representing user input
user_input_json = '''
{
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "I forgot my password.",
    "order_id": null,
    "purchase_date": null
}
'''

### Define your UserInput data model

In [None]:
# Define UserInput model
class UserInput(BaseModel):
    name: str
    email: EmailStr
    query: str
    order_id: Optional[int] = Field(
        None,
        description="5-digit order number (cannot start with 0)",
        ge=10000,
        le=99999
    )
    purchase_date: Optional[date] = None

In [None]:
# Create UserInput instance from JSON data
user_input = UserInput.model_validate_json(user_input_json)

### Create a new data model called CustomerQuery

In [None]:
# Define the CustomerQuery model that inherits from UserInput
class CustomerQuery(UserInput):
    priority: str = Field(
        ..., description="Priority level: low, medium, high"
    )
    category: Literal[
        'refund_request', 'information_request', 'other'
    ] = Field(..., description="Query category")
    is_complaint: bool = Field(
        ..., description="Whether this is a complaint"
    )
    tags: List[str] = Field(..., description="Relevant keyword tags")

### Construct a prompt with example output

In [None]:
# Create an example response with generic data to guide the LLM
example_response_structure = f"""{{
    name="Example User",
    email="user@example.com",
    query="I ordered a new computer monitor and it arrived with the screen cracked. I need to exchange it for a new one.",
    order_id=12345,
    purchase_date="2025-12-31",
    priority="medium",
    category="refund_request",
    is_complaint=True,
    tags=["monitor", "support", "exchange"] 
}}"""

In [None]:
# Create prompt with user data and expected JSON structure
prompt = f"""
Please analyze this user query:
{user_input.model_dump_json(indent=2)}

Return your analysis as a JSON object matching this exact structure 
and data types:
{example_response_structure}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.
"""

print(prompt)

### Define a function to call an LLM and try it with your prompt

In [None]:
# Define a function to call the LLM
def call_llm(prompt, model="gpt-4o"):
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

In [None]:
# Get response from LLM
response_content = call_llm(prompt)
print(response_content)

### Validate the LLM output using your CustomerQuery model

### Note: the following cell will produce a validation error. This is expected. You can simply proceed with the rest of the notebook as cells below do not depend on this cell. 

In [None]:
# Attempt to parse the response into CustomerQuery model
valid_data = CustomerQuery.model_validate_json(response_content)
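
The lesson does not say why this cell fails. One common cause, assumed here rather than confirmed by the notebook, is that the model wraps its JSON in a Markdown code fence even when asked not to. The sketch below shows a hypothetical `strip_code_fences` helper that removes such a wrapper before validation. It is one possible pre-processing step, separate from the retry approach this lesson builds.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def strip_code_fences(text: str) -> str:
    """Remove a surrounding Markdown code fence, if there is one."""
    pattern = rf"\s*{FENCE}[a-zA-Z]*\s*\n(.*?)\n?\s*{FENCE}\s*"
    match = re.fullmatch(pattern, text, re.DOTALL)
    # Fall back to the trimmed input when no fence is found
    return match.group(1).strip() if match else text.strip()


# Simulated LLM output wrapped in a json code fence
fenced = f'{FENCE}json\n{{"priority": "low"}}\n{FENCE}'
plain = '  {"priority": "low"}  '

print(json.loads(strip_code_fences(fenced)))
print(json.loads(strip_code_fences(plain)))
```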

### Define a function for error handling

In [None]:
# Define a function to validate an LLM response
def validate_with_model(data_model, llm_response):
    try:
        validated_data = data_model.model_validate_json(llm_response)
        print("data validation successful!")
        print(validated_data.model_dump_json(indent=2))
        return validated_data, None
    except ValidationError as e:
        print(f"error validating data: {e}")
        error_message = (
            f"This response generated a validation error: {e}."
        )
        return None, error_message

In [None]:
# Test your validation function with the LLM response
validated_data, validation_error = validate_with_model(
    CustomerQuery, response_content
)

### Define a function to create a retry prompt including error details

In [None]:
# Define a function to create a retry prompt with error feedback
def create_retry_prompt(
    original_prompt, original_response, error_message
):
    retry_prompt = f"""
This is a request to fix an error in the structure of an llm_response.
Here is the original request:
<original_prompt>
{original_prompt}
</original_prompt>

Here is the original llm_response:
<llm_response>
{original_response}
</llm_response>

This response generated an error: 
<error_message>
{error_message}
</error_message>

Compare the error message and the llm_response and identify what 
needs to be fixed or removed
in the llm_response to resolve this error. 

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON string.
"""
    return retry_prompt

In [None]:
# Create a retry prompt for validation errors
validation_retry_prompt = create_retry_prompt(
    original_prompt=prompt,
    original_response=response_content,
    error_message=validation_error
)

print(validation_retry_prompt)

### Call the LLM with your retry prompt

In [None]:
# Call the LLM with the validation retry prompt
validation_retry_response = call_llm(validation_retry_prompt)
print(validation_retry_response)

In [None]:
# Attempt to validate retry response from LLM
validated_data, validation_error = validate_with_model(
    CustomerQuery, validation_retry_response
)

### Create a second retry prompt

In [None]:
# Create a second retry prompt for validation errors
second_validation_retry_prompt = create_retry_prompt(
    original_prompt=validation_retry_prompt,
    original_response=validation_retry_response,
    error_message=validation_error
)

print(second_validation_retry_prompt)

In [None]:
# Call the LLM with the second validation retry prompt
second_validation_retry_response = call_llm(
    second_validation_retry_prompt
)
print(second_validation_retry_response)

### Define a function to handle multiple retries in a feedback loop

In [None]:
# Define a function to automatically retry an LLM call multiple times
def validate_llm_response(
    prompt, data_model, n_retry=5, model="gpt-4o"
):
    # Initial LLM call
    response_content = call_llm(prompt, model=model)
    current_prompt = prompt

    # Try to validate with the model
    # attempt: 0=initial, 1=first retry, ...
    for attempt in range(n_retry + 1):

        validated_data, validation_error = validate_with_model(
            data_model, response_content
        )

        if validation_error:
            if attempt < n_retry:
                print(f"attempt {attempt + 1} of {n_retry + 1} failed, trying again...")
            else:
                print(f"Max retries reached. Last error: {validation_error}")
                return None, (
                    f"Max retries reached. Last error: {validation_error}"
                )

            validation_retry_prompt = create_retry_prompt(
                original_prompt=current_prompt,
                original_response=response_content,
                error_message=validation_error
            )
            response_content = call_llm(
                validation_retry_prompt, model=model
            )
            current_prompt = validation_retry_prompt
            continue

        # If you get here, both parsing and validation succeeded
        return validated_data, None
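
You can exercise a feedback loop like this without spending API calls by passing in a stub in place of the LLM. The sketch below is a simplified variant, not the function above. `Sentiment`, `validate_with_retries`, and `fake_llm` are hypothetical names, and the stub returns an invalid response once before returning a valid one.

```python
from typing import Callable, Optional, Tuple
from pydantic import BaseModel, ValidationError


class Sentiment(BaseModel):  # hypothetical output model
    label: str
    score: float


def validate_with_retries(
    prompt: str, llm: Callable[[str], str], n_retry: int = 2
) -> Tuple[Optional[Sentiment], Optional[str]]:
    """Call `llm`, validate its output, and re-prompt with the error on failure."""
    current_prompt = prompt
    error = None
    for _ in range(n_retry + 1):  # one initial attempt plus n_retry retries
        response = llm(current_prompt)
        try:
            return Sentiment.model_validate_json(response), None
        except ValidationError as e:
            error = str(e)
            # Feed the failed response and the error back to the model
            current_prompt = (
                f"{prompt}\n\nPrevious response:\n{response}\n\n"
                f"Validation error:\n{error}\n\nRespond ONLY with valid JSON."
            )
    return None, f"Max retries reached. Last error: {error}"


# Stub standing in for a real LLM: fails once, then returns valid JSON
replies = iter(['{"label": "positive"}', '{"label": "positive", "score": 0.9}'])
calls = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return next(replies)


result, err = validate_with_retries("Classify: 'Great service!'", fake_llm)
print(result, err)
```

Passing the LLM in as a callable is a design choice made for this sketch. It keeps the loop testable and independent of any one provider.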

In [None]:
# Test your complete solution with the original prompt
validated_data, error = validate_llm_response(
    prompt, CustomerQuery
)

### Have a look at the JSON schema of your CustomerQuery data model

In [None]:
# Investigate the model_json_schema for CustomerQuery
data_model_schema = json.dumps(
    CustomerQuery.model_json_schema(), indent=2
)
print(data_model_schema)
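
The schema carries the same information you put into the model: field descriptions, allowed `Literal` values, and which fields are required. As a minimal sketch, the hypothetical `Triage` model below is a cut-down stand-in for `CustomerQuery` that makes those parts easier to spot.

```python
from typing import List, Literal
from pydantic import BaseModel, Field


class Triage(BaseModel):  # hypothetical, smaller than CustomerQuery
    category: Literal["refund_request", "other"] = Field(
        ..., description="Query category"
    )
    tags: List[str] = Field(default_factory=list, description="Keyword tags")


schema = Triage.model_json_schema()

# Only fields without a default are listed as required
print(schema["required"])
# Literal values show up as an enum alongside the description
print(schema["properties"]["category"])
print(schema["properties"]["tags"]["type"])
```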

In [None]:
# Print the original prompt from above
print(prompt)

### Construct a new prompt using the JSON schema of your data model

In [None]:
# Create new prompt with user input and model_json_schema
prompt = f"""
Please analyze this user query:
{user_input.model_dump_json(indent=2)}

Return your analysis as a JSON object matching the following schema:
{data_model_schema}

Respond ONLY with valid JSON. Do not include any explanations or 
other text or formatting before or after the JSON object.
"""

In [None]:
# Run your validate_llm_response function with the new prompt
final_analysis, error = validate_llm_response(
    prompt, CustomerQuery
)

---

## Conclusion

In this lesson, you explored how to combine Pydantic models with retry logic to reliably extract structured data from LLM outputs. You practiced building reusable validation functions and prompts, and saw how robust error handling can help you get consistent, usable results from language models. These techniques will help you confidently scale up your LLM-powered workflows.


---

# Lesson 4: Using Pydantic Models for Structured LLM Output

In the previous lesson, you implemented retry mechanisms to handle validation errors, which mimics what some structured output frameworks are doing behind the scenes when they handle validation for you.

In this lesson, you'll experiment with passing your Pydantic model directly in your API call using different frameworks and LLM providers.

By the end of this lesson, you'll be able to:
- Use Pydantic models directly in your API calls to LLMs
- Reliably receive a properly structured response using a variety of frameworks and LLM providers

---

### Import all required libraries and set up your environment

In [None]:
# Import packages
from pydantic import BaseModel, Field, EmailStr
from typing import List, Literal, Optional
from openai import OpenAI
import instructor
import anthropic
from dotenv import load_dotenv
from datetime import date

### Define your Pydantic models for user input and LLM output

In [None]:
# Define the UserInput model for customer support queries
class UserInput(BaseModel):
    name: str
    email: EmailStr
    query: str
    order_id: Optional[int] = Field(
        None,
        description="5-digit order number (cannot start with 0)",
        ge=10000,
        le=99999
    )
    purchase_date: Optional[date] = None

# Define the CustomerQuery model that inherits from UserInput
class CustomerQuery(UserInput):
    priority: str = Field(
        ..., description="Priority level: low, medium, high"
    )
    category: Literal[
        'refund_request', 'information_request', 'other'
    ] = Field(..., description="Query category")
    is_complaint: bool = Field(
        ..., description="Whether this is a complaint"
    )
    tags: List[str] = Field(..., description="Relevant keyword tags")

### Provide sample input and validate it using your model

In [None]:
# Define your input data as a JSON string
user_input_json = '''{
    "name": "Joe User",
    "email": "joe.user@example.com",
    "query": "I ordered a new computer monitor and it arrived with the screen cracked. This is the second time this has happened. I need a replacement ASAP.",
    "order_id": 12345,
    "purchase_date": "2025-12-31"
}'''

In [None]:
# Validate the user_input_json by creating a UserInput instance
user_input = UserInput.model_validate_json(user_input_json)

### Build a prompt and call the Anthropic API with the instructor package for structured output

In [None]:
prompt = (
    f"Analyze the following customer query {user_input} "
    f"and provide a structured response."
)

In [None]:
# Load environment variables
load_dotenv()
# Use Anthropic with Instructor to get structured output
anthropic_client = instructor.from_anthropic(
    anthropic.Anthropic()
)

response = anthropic_client.messages.create(
    model="claude-3-7-sonnet-latest",  
    max_tokens=1024,
    messages=[
        {
            "role": "user", 
            "content": prompt
        }
    ],
    response_model=CustomerQuery  
)

In [None]:
# Inspect the returned structured data
print(type(response))
print(response.model_dump_json(indent=2))

### Use OpenAI's structured output API with your Pydantic schema

In [None]:
# Initialize the OpenAI client and pass CustomerQuery in your API call
openai_client = OpenAI()
response = openai_client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
    response_format=CustomerQuery
)
response_content = response.choices[0].message.content
print(type(response_content))
print(response_content)

### Additional advanced usage and inspection

In [None]:
# Validate the response you got from the LLM
valid_data = CustomerQuery.model_validate_json(
    response_content
)
print(type(valid_data))
print(valid_data.model_dump_json(indent=2))

In [None]:
# Try the responses API from OpenAI
response = openai_client.responses.parse(
    model="gpt-4o",
    input=[{"role": "user", "content": prompt}],
    text_format=CustomerQuery
)

print(type(response))

In [None]:
# Investigate class inheritance structure of the OpenAI response
def print_class_inheritance(llm_response):
    # Walk the method resolution order, from the response class up to object
    for cls in type(llm_response).mro():
        print(f"{cls.__module__}.{cls.__name__}")

print_class_inheritance(response)

In [None]:
# Print the response type and content 
print(type(response.output_parsed))
print(response.output_parsed.model_dump_json(indent=2))

In [None]:
# Try out the Pydantic AI package for defining an agent and getting a structured response
from pydantic_ai import Agent
import nest_asyncio
nest_asyncio.apply()

agent = Agent(
    model="google-gla:gemini-2.0-flash",
    output_type=CustomerQuery,
)

response = agent.run_sync(prompt)

In [None]:
# Print out the response type and content
print(type(response.output))
print(response.output.model_dump_json(indent=2))

---

## Conclusion

In this lesson, you learned how to use Pydantic models to extract structured, validated output directly from LLMs, using the Anthropic API through Instructor, the OpenAI API, and Gemini through Pydantic AI. By defining your expected output schema with Pydantic and passing it directly to the API, you can eliminate manual parsing and validation code and receive reliable, well-formed responses in a single API call. This approach lets you focus on designing clear data models and prompts, making your code more maintainable and robust.


---

# Lesson 5: Tool Calling with Pydantic Models and OpenAI

In this lesson, you'll learn how to use Pydantic models to define tools for OpenAI's tool calling API. You'll see how to reuse your existing models to create robust, validated tool definitions, and how to handle tool calls in your Python code. This lesson builds on your `UserInput` and `CustomerQuery` models from previous lessons.

By the end of this lesson, you'll be able to:
- Use Pydantic models to define tool schemas for OpenAI's tool calling API
- Register your tool with the API using a validated schema
- Handle tool calls and validate arguments with Pydantic
- Integrate LLM-driven workflows with your own Python functions and data sources

---

### Import all required libraries and set up your environment

In [None]:
# Import packages
from pydantic import BaseModel, Field, EmailStr, field_validator
from pydantic_ai import Agent
from typing import Literal, List, Optional
from datetime import datetime, date
import json
from openai import OpenAI
import anthropic
import instructor
from dotenv import load_dotenv
load_dotenv()
import nest_asyncio
nest_asyncio.apply()

### Define your Pydantic models for user input and LLM output

In [None]:
# Define your UserInput model
import re

class UserInput(BaseModel):
    name: str = Field(..., description="User's name")
    email: EmailStr = Field(..., description="User's email address")
    query: str = Field(..., description="User's query")
    order_id: Optional[str] = Field(
        None,
        description="Order ID if available (format: ABC-12345)"
    )
    purchase_date: Optional[date] = None

    # Validate order_id format (e.g., ABC-12345)
    @field_validator("order_id")
    @classmethod
    def validate_order_id(cls, order_id):
        # order_id is optional, so None passes through unchanged
        if order_id is None:
            return order_id
        pattern = r"^[A-Z]{3}-\d{5}$"
        if not re.match(pattern, order_id):
            raise ValueError(
                "order_id must be in format ABC-12345 "
                "(3 uppercase letters, dash, 5 digits)"
            )
        return order_id
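
A `ValueError` raised inside a field validator does not escape as a `ValueError`. Pydantic collects it and reports it as a `ValidationError`, just like its built-in checks. The sketch below shows this on a hypothetical `OrderLookup` model that uses the same `ABC-12345` format.

```python
import re
from typing import Optional
from pydantic import BaseModel, ValidationError, field_validator


class OrderLookup(BaseModel):  # hypothetical model for illustration
    order_id: Optional[str] = None

    @field_validator("order_id")
    @classmethod
    def check_format(cls, value: Optional[str]) -> Optional[str]:
        # None is allowed; anything else must look like ABC-12345
        if value is not None and not re.fullmatch(r"[A-Z]{3}-\d{5}", value):
            raise ValueError("order_id must be in format ABC-12345")
        return value


print(OrderLookup(order_id="ABC-12345"))

try:
    OrderLookup(order_id="abc-123")
    error = None
except ValidationError as e:
    # The custom message is carried inside the reported error
    error = e.errors()[0]
    print(error["type"], "|", error["msg"])
```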



In [None]:
# Define your CustomerQuery model
class CustomerQuery(UserInput):
    priority: str = Field(
        ..., description="Priority level: low, medium, high"
    )
    category: Literal[
        'refund_request', 'information_request', 'other'
    ] = Field(..., description="Query category")
    is_complaint: bool = Field(
        ..., description="Whether this is a complaint"
    )
    tags: List[str] = Field(..., description="Relevant keyword tags")

### Validate user input and create a CustomerQuery instance

In [None]:
# Define a function to validate user input
def validate_user_input(user_json: str):
    """Validate user input from a JSON string and return a UserInput 
    instance if valid."""
    try:
        user_input = (
            UserInput.model_validate_json(user_json)
        )
        print("user input validated...")
        return user_input
    except Exception as e:
        print(f" Unexpected error: {e}")
        return None

In [None]:
# Define a function to call an LLM using Pydantic AI to create an instance of CustomerQuery
def create_customer_query(valid_user_json: str) -> CustomerQuery:
    customer_query_agent = Agent(
        model="google-gla:gemini-2.0-flash",
        output_type=CustomerQuery,
    )
    response = customer_query_agent.run_sync(valid_user_json)
    print("CustomerQuery generated...")
    return response.output

### Try out your validation and query creation with sample input

In [None]:
# Define user input JSON data
user_input_json = '''
{
    "name": "Joe User",
    "email": "joe@example.com",
    "query": "When can I expect delivery of the headphones I ordered?",
    "order_id": "ABC-12345",
    "purchase_date": "2025-12-01"
}
'''
# Validate user input and create a CustomerQuery
valid_data = validate_user_input(user_input_json).model_dump_json()
customer_query = create_customer_query(valid_data)
print(type(customer_query))
print(customer_query.model_dump_json(indent=2))

### Define tool input models for FAQ lookup and order status

In [None]:
# Define FAQ Lookup tool input as a Pydantic model
class FAQLookupArgs(BaseModel):
    query: str = Field(..., description="User's query") 
    tags: List[str] = Field(
        ..., description="Relevant keyword tags from the customer query"
    )

In [None]:
# Define Check Order Status tool input as a Pydantic model
class CheckOrderStatusArgs(BaseModel):
    order_id: str = Field(
        ..., description="Customer's order ID (format: ABC-12345)"
    )
    email: EmailStr = Field(..., description="Customer's email address")

    @field_validator("order_id")
    @classmethod
    def validate_order_id(cls, order_id):
        import re
        pattern = r"^[A-Z]{3}-\d{5}$"
        if not re.match(pattern, order_id):
            raise ValueError(
                "order_id must be in format ABC-12345 "
                "(3 uppercase letters, dash, 5 digits)"
            )
        return order_id

### Create example FAQ and order databases

In [None]:
# Create a fake FAQ database as a list of entries with keywords
faq_db = [
    {
        "question": "How can I reset my password?",
        "answer": "To reset your password, click 'Forgot Password' on the sign-in page and follow the instructions sent to your email.",
        "keywords": ["password", "reset", "account"]
    },
    {
        "question": "How long does shipping take?",
        "answer": "Standard shipping takes 3-5 business days. You can track your order in your account dashboard.",
        "keywords": ["shipping", "delivery", "order", "tracking"]
    },
    {
        "question": "How can I return an item?",
        "answer": "You can return any item within 30 days of purchase. Visit our returns page to start the process.",
        "keywords": ["return", "refund", "exchange"]
    },
    {
        "question": "How can I delete my account?",
        "answer": "To delete your account, go to your account settings tab and select 'delete account'.",
        "keywords": ["delete", "account", "remove"]
    }
]

# Create a fake order database
order_db = {
    "ABC-12345": {
        "status": "shipped", "estimated_delivery": "2025-12-05",
        "purchase_date": "2025-12-01", "email": "joe@example.com"
    },
    "XYZ-23456": {
        "status": "processing", "estimated_delivery": "2025-12-15",
        "purchase_date": "2025-12-10", "email": "sue@example.com"
    },
    "QWE-34567": {
        "status": "delivered", "estimated_delivery": "2025-12-20",
        "purchase_date": "2025-12-18", "email": "bob@example.com"
    }
}

### Implement tool functions for FAQ lookup and order status

In [None]:
# Define your FAQ lookup tool
def lookup_faq_answer(args: FAQLookupArgs) -> str:
    """Look up an FAQ answer by matching tags and words in query 
    to FAQ entry keywords."""
    query_words = set(word.lower() for word in args.query.split())
    tag_set = set(tag.lower() for tag in args.tags)
    best_match = None
    best_score = 0
    for faq in faq_db:
        keywords = set(k.lower() for k in faq["keywords"])
        score = len(keywords & tag_set) + len(keywords & query_words)
        if score > best_score:
            best_score = score
            best_match = faq
    if best_match and best_score > 0:
        return best_match["answer"]
    return "Sorry, I couldn't find an FAQ answer for your question."
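
One thing to be aware of with this scoring: `str.split()` leaves punctuation attached to words, so a query ending in `password?` does not match the keyword `password`. The sketch below shows the difference. The `tokenize` and `overlap_score` helpers are hypothetical and offered as one possible refinement; they are not part of the lesson's tool.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase `text` and split it into alphanumeric words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap_score(query: str, tags: list[str], keywords: list[str]) -> int:
    """Count keyword matches in the tags plus keyword matches in the query."""
    keyword_set = {k.lower() for k in keywords}
    tag_set = {t.lower() for t in tags}
    return len(keyword_set & tag_set) + len(keyword_set & tokenize(query))


keywords = ["password", "reset", "account"]
query = "How do I reset my password?"

# Splitting on whitespace keeps "password?" as a single token
plain_split = {w.lower() for w in query.split()}
plain_matches = len(set(keywords) & plain_split)

# Tokenizing strips the punctuation before matching
score = overlap_score(query, ["account"], keywords)
print(plain_matches, score)
```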

In [None]:
# Define your check order status tool
def check_order_status(args: CheckOrderStatusArgs):
    """Simulate checking the status of a customer's order by 
    order_id and email."""
    order = order_db.get(args.order_id)
    if not order:
        return {
            "order_id": args.order_id,
            "status": "not found",
            "estimated_delivery": None,
            "note": "order_id not found"
        }
    if args.email.lower() != order.get("email", "").lower():
        return {
            "order_id": args.order_id,
            "status": order["status"],
            "estimated_delivery": order["estimated_delivery"],
            "note": "order_id found but email mismatch"
        }
    return {
        "order_id": args.order_id,
        "status": order["status"],
        "estimated_delivery": order["estimated_delivery"],
        "note": "order_id and email match"
    }

### Define tool schemas for OpenAI tool calling

In [None]:
# Define tools for your API call
tool_definitions = [
    {
        "type": "function",
        "function": {
            "name": "lookup_faq_answer",
            "description": "Look up an FAQ answer by matching tags to FAQ entry keywords.",
            "parameters": FAQLookupArgs.model_json_schema()
        }
    },
    {
        "type": "function",
        "function": {
            "name": "check_order_status",
            "description": "Check the status of a customer's order.",
            "parameters": CheckOrderStatusArgs.model_json_schema()
        }
    }
]
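
When the model decides to call a tool, the tool-calling API returns the tool name and its arguments as a JSON string, so the same Pydantic model that produced the schema can validate those arguments before your function runs. The sketch below shows one way to organize that step. `EchoArgs`, `echo`, `TOOLS`, and `run_tool` are hypothetical stand-ins, and no API call is made.

```python
import json
from pydantic import BaseModel, Field, ValidationError


class EchoArgs(BaseModel):  # hypothetical tool input model
    text: str = Field(..., description="Text to echo back")
    times: int = Field(1, ge=1, le=3)


def echo(args: EchoArgs) -> str:
    return " ".join([args.text] * args.times)


# Hypothetical registry: tool name -> (argument model, Python function)
TOOLS = {"echo": (EchoArgs, echo)}


def run_tool(name: str, arguments_json: str) -> str:
    """Validate the JSON arguments for `name`, then call the tool."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    args_model, func = TOOLS[name]
    try:
        args = args_model.model_validate_json(arguments_json)
    except ValidationError as e:
        # Return the error as text so it can be sent back to the model
        return json.dumps({"error": str(e)})
    return json.dumps({"result": func(args)})


print(run_tool("echo", '{"text": "hi", "times": 2}'))
print(run_tool("echo", '{"text": "hi", "times": 9}'))
```

Returning validation errors as a string, instead of raising, leaves room to pass them back to the model in a follow-up message.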

### Define your support ticket output model

In [None]:
# Define your final output Pydantic models
class OrderDetails(BaseModel):
    status: str
    # check_order_status returns None here when the order is not found
    estimated_delivery: Optional[str] = None
    note: str

class SupportTicket(CustomerQuery):
    recommended_next_action: Literal[
        'escalate_to_agent', 'send_faq_response', 
        'send_order_status', 'no_action_needed'
    ] = Field(
        ..., description="LLM's recommended next action for support"
    )
    order_details: Optional[OrderDetails] = Field(
        None, description="Order details if action is send_order_status"
    )
    faq_response: Optional[str] = Field(
        None, description="FAQ response if action is send_faq_response"
    )
    creation_date: datetime = Field(
        ..., description="Date and time the ticket was created"
    )
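
The field descriptions say when `order_details` and `faq_response` should be filled, but the model as written does not enforce that pairing. One way to check it after generation is sketched below on a plain dict, such as the result of `support_ticket.model_dump()`. The helper name `find_ticket_inconsistencies` and the sample dicts are hypothetical.

```python
def find_ticket_inconsistencies(ticket: dict) -> list:
    """Return human-readable problems; an empty list means consistent."""
    # Which optional field each action is expected to populate
    required_by_action = {
        "send_order_status": "order_details",
        "send_faq_response": "faq_response",
    }
    problems = []
    action = ticket.get("recommended_next_action")
    needed = required_by_action.get(action)
    if needed and ticket.get(needed) is None:
        problems.append(f"{action} requires {needed}")
    return problems

# Hypothetical ticket dicts, reduced to the fields being checked
ok_ticket = {
    "recommended_next_action": "send_faq_response",
    "faq_response": "Use the reset link.",
    "order_details": None,
}
bad_ticket = {
    "recommended_next_action": "send_order_status",
    "faq_response": None,
    "order_details": None,
}
print(find_ticket_inconsistencies(ok_ticket))
print(find_ticket_inconsistencies(bad_ticket))
```

The same rule could also live inside the model itself, for example as a Pydantic model validator, so that an inconsistent ticket fails validation instead of being flagged afterwards.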

### Decide on the next support action using OpenAI tool calling

In [None]:
# Initialize OpenAI client
client = OpenAI()

# Define a function to call OpenAI with tools
def decide_next_action_with_tools(customer_query: CustomerQuery):
    
    support_ticket_schema = json.dumps(
        SupportTicket.model_json_schema(), indent=2
    )
    system_prompt = f"""
        You are a helpful customer support agent. Your job is to 
        determine what support action should be taken for the customer, 
        based on the customer query and the expected fields in the 
        SupportTicket schema below. If more information on a particular 
        order_id or FAQ response would be helpful in responding to the 
        user query and can be obtained by calling a tool, call the 
        appropriate tool to get that information. If an order_id is 
        present in the query, always look up the order status to get 
        more information on the order.

        Here is the JSON schema for the SupportTicket model you must 
        use as context for what information is expected:
        {support_ticket_schema}
    """
    messages = [
        {"role": "system", "content": system_prompt},
        # Send the query as JSON rather than a Python dict repr
        {"role": "user", "content": customer_query.model_dump_json()}
    ]
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tool_definitions,
        tool_choice="auto"
    )
    message = response.choices[0].message
    tool_calls = getattr(message, "tool_calls", None)
    return message, tool_calls, messages

### Inspect the LLM's outputs and tool calls

In [None]:
# Call the decide_next_action_with_tools function
message, tool_calls, messages = decide_next_action_with_tools(
    customer_query
)
# Inspect the LLM's outputs before proceeding
print("LLM message:\n", json.dumps(message.model_dump(), indent=2))
# tool_calls is None when the model replies without requesting a tool
print(
    "\nTool calls:\n",
    json.dumps(
        [call.model_dump() for call in (tool_calls or [])], indent=2
    )
)

### Gather tool outputs and prepare for ticket generation

In [None]:
# Define a function to get tool outputs
def get_tool_outputs(tool_calls):
    tool_outputs = []
    if tool_calls:
        for tool_call in tool_calls:
            if tool_call.function.name == "lookup_faq_answer":
                print("Agent requested a call to the Lookup FAQ tool...")
                args = FAQLookupArgs.model_validate_json(
                    tool_call.function.arguments
                )
                result = lookup_faq_answer(args)
                tool_outputs.append({
                    "tool_call_id": tool_call.id, "output": result
                })
                print(f"Lookup FAQ tool returned {result}")
            elif tool_call.function.name == "check_order_status":
                print("Agent requested a call to Check Order Status tool...")
                args = CheckOrderStatusArgs.model_validate_json(
                    tool_call.function.arguments
                )
                result = check_order_status(args)
                tool_outputs.append({
                    "tool_call_id": tool_call.id, "output": result
                })
                print(f"Check Order Status tool returned {result}")
    return tool_outputs

tool_outputs = get_tool_outputs(tool_calls)

# Print tool outputs for inspection
print("Tool outputs:\n", json.dumps(tool_outputs, indent=2))
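
The `if`/`elif` chain in `get_tool_outputs` grows by one branch per tool. A common alternative is a registry that maps each tool name to a handler. The sketch below shows the pattern with the standard library only: `SimpleNamespace` objects imitate the `id`, `function.name`, and `function.arguments` attributes read above, `json.loads` stands in for `model_validate_json`, and the handlers and the `run_tool_calls` name are hypothetical.

```python
import json
from types import SimpleNamespace

# Hypothetical handlers; real ones would validate arguments with Pydantic
def fake_lookup_faq(args: dict) -> str:
    return f"FAQ answer for tags {args['tags']}"

def fake_check_order(args: dict) -> dict:
    return {"order_id": args["order_id"], "status": "shipped"}

TOOL_REGISTRY = {
    "lookup_faq_answer": fake_lookup_faq,
    "check_order_status": fake_check_order,
}

def run_tool_calls(tool_calls) -> list:
    outputs = []
    for call in tool_calls or []:  # tolerate None when no tools were requested
        handler = TOOL_REGISTRY.get(call.function.name)
        if handler is None:
            result = {"error": f"unknown tool: {call.function.name}"}
        else:
            # Arguments arrive as a JSON string, not a dict
            result = handler(json.loads(call.function.arguments))
        outputs.append({"tool_call_id": call.id, "output": result})
    return outputs

# Stand-in for a tool call object returned by the API
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="check_order_status",
        arguments='{"order_id": "ABC-12345", "email": "joe@example.com"}',
    ),
)
print(run_tool_calls([fake_call]))
print(run_tool_calls(None))
```

With a registry, adding a tool means adding one dictionary entry, and an unrecognized tool name produces an explicit error entry instead of being silently skipped.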

### Generate a structured support ticket using Anthropic

In [None]:
# Create the Anthropic client with Instructor
anthropic_client = instructor.from_anthropic(
    anthropic.Anthropic()
)

# Define a function to call Anthropic to generate a support ticket
def generate_structured_support_ticket(
    customer_query: CustomerQuery, message, tool_outputs: list
):
    tool_results_str = "\n".join([
        f"Tool: {out['tool_call_id']} Output: {json.dumps(out['output'])}"
        for out in tool_outputs
    ]) if tool_outputs else "No tool calls were made."
    # Concatenate prompt parts into a single string for Anthropic
    prompt = f"""
        You are a support agent. Use all information below to 
        generate a support ticket as a validated Pydantic model.
        Customer query: {customer_query.model_dump_json(indent=2)}
        LLM message: {str(message.content)}
        Tool results: {tool_results_str}
    """
    # Create the message with structured output
    response = anthropic_client.messages.create(
        model="claude-3-7-sonnet-latest",  
        max_tokens=1024,
        messages=[
            {
                "role": "user", 
                "content": prompt
            }
        ],
        response_model=SupportTicket
    )
    
    support_ticket = response
    support_ticket.creation_date = datetime.now()
    return support_ticket

### Print your final support ticket

In [None]:
# Run the final step of generating a support ticket and print output
support_ticket = generate_structured_support_ticket(
    customer_query, message, tool_outputs
)
print(support_ticket.model_dump_json(indent=2))

### Full workflow: validate, query, decide, call tools, and generate ticket

In [None]:
# Define new user input data
user_json = '''
{
    "name": "Joe User",
    "email": "joe@example.com",
    "query": "I'm really not happy with this product I bought",
    "order_id": "QWE-34567",
    "purchase_date": null
}
'''

In [None]:
# Run the entire pipeline
valid_user_json = validate_user_input(user_json).model_dump_json()
customer_query = create_customer_query(valid_user_json)
message, tool_calls, messages = decide_next_action_with_tools(
    customer_query
)
tool_outputs = get_tool_outputs(tool_calls)
support_ticket = generate_structured_support_ticket(
    customer_query, message, tool_outputs
)
print(support_ticket.model_dump_json(indent=2))

---

## Conclusion

In this lesson, you used Pydantic models to define tools for OpenAI's tool calling API and to build a customer support workflow around them. You reused your existing models to generate tool schemas with `model_json_schema()`, validated the arguments of each tool call with `model_validate_json()` before running the tool, and passed the gathered tool outputs to Anthropic through Instructor to produce a validated `SupportTicket`. The result is a workflow validated end to end: user input, tool arguments, and the final structured output all pass through Pydantic models.