# 1.3 Structured Outputs - Reliable Data from LLMs

This notebook covers **Structured Outputs** - an OpenAI API feature that constrains model responses to a JSON schema you define, which makes it well suited to production applications.

**Key Topics:**
- Why structured outputs matter
- Structured Outputs vs JSON mode
- Using Pydantic for schema definition
- Common use cases (extraction, classification, forms)
- Streaming structured outputs
- Error handling and refusals

**Why this matters:** Getting reliable, validated data from LLMs is critical for production applications. With Structured Outputs, the model's output is constrained to your schema, so a completed, non-refused response always parses into the shape you expect.

<a target="_blank" href="https://colab.research.google.com/github/IT-HUSET/ai-agents-course-2025/blob/main/exercises/1.3-structured-outputs.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Setup

In [None]:
%pip install openai~=2.1 python-dotenv~=1.0 pydantic~=2.0 --upgrade --quiet

In [None]:
import os
import json
from openai import OpenAI
from pydantic import BaseModel
from typing import List, Optional
from enum import Enum

# Check if running in Google Colab
try:
    from google.colab import userdata
    IN_COLAB = True
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
    print("✅ Running in Google Colab - API key loaded from secrets")
except ImportError:
    IN_COLAB = False
    try:
        from dotenv import load_dotenv, find_dotenv
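        # load_dotenv returns True only if it actually loaded variables from a file
        if load_dotenv(find_dotenv()):
            print("✅ Running locally - environment loaded from .env file")
        else:
            print("⚠️ Running locally - no .env file found, relying on existing environment variables")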
    except ImportError:
        print("⚠️ python-dotenv not installed")

api_key = os.getenv("OPENAI_API_KEY")

# Check for the key before creating the client: OpenAI() raises if no key is available
if not api_key:
    print("❌ OPENAI_API_KEY not found!")
    if IN_COLAB:
        print("   → Click the key icon (🔑) in the left sidebar and add 'OPENAI_API_KEY'")
else:
    client = OpenAI(api_key=api_key)
    print("✅ Setup complete")

---

## Part 1: Why Structured Outputs Matter

**The Problem:** LLMs can produce inconsistent JSON, missing fields, or invalid formats.

**The Solution:** Structured Outputs **guarantees** the response matches your JSON schema.

### Without Structured Outputs (Unreliable)

In [None]:
# Regular request - asking for JSON in prompt
response = client.responses.create(
    model="gpt-4o",
    input="""Extract event information as JSON:
    
    "Alice and Bob are meeting for coffee on Friday at 3pm at Starbucks."
    
    Return: {"event_name": ..., "participants": [...], "time": ..., "location": ...}
    """
)

print("Response:")
print(response.output_text)
print("\n⚠️  No guarantee this is valid JSON or matches our schema!")
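
To see why prompt-only JSON is fragile without calling the API, the sketch below runs three hypothetical model replies through `json.loads`. The replies are hand-written for illustration (not real API output) and mimic two common failure modes: JSON wrapped in a Markdown code fence, and JSON preceded by chatty text.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built indirectly so it can sit inside this example

# Hypothetical replies, hand-written to mimic common failure modes.
replies = {
    "clean": '{"event_name": "Coffee", "participants": ["Alice", "Bob"]}',
    "fenced": FENCE + 'json\n{"event_name": "Coffee"}\n' + FENCE,
    "chatty": 'Sure! Here is the JSON: {"event_name": "Coffee"}',
}

def try_parse(text: str):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

results = {name: try_parse(text) for name, text in replies.items()}
for name, parsed in results.items():
    print(f"{name}: {'parsed' if parsed is not None else 'NOT valid JSON'}")
```

Only the `clean` reply parses. Production code that relies on prompting alone needs cleanup logic for the other two cases, which is the work Structured Outputs removes.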

### With Structured Outputs (Guaranteed)

In [None]:
# Define schema with Pydantic
class EventInfo(BaseModel):
    event_name: str
    participants: List[str]
    time: str
    location: str

# Request with structured output
response = client.responses.parse(
    model="gpt-4o",
    input="Extract event information: 'Alice and Bob are meeting for coffee on Friday at 3pm at Starbucks.'",
    text_format=EventInfo
)

# Get parsed object
event = response.output_parsed

print("Parsed Event:")
print(f"Name: {event.event_name}")
print(f"Participants: {event.participants}")
print(f"Time: {event.time}")
print(f"Location: {event.location}")
print(f"\nType: {type(event)}")
print("\n✅ Guaranteed to match EventInfo schema!")

---

## Part 2: Structured Outputs vs JSON Mode

**JSON Mode** (`text={"format": {"type": "json_object"}}`):
- ✅ Guarantees valid JSON
- ❌ Does NOT guarantee schema adherence

**Structured Outputs** (`text_format=YourSchema`):
- ✅ Guarantees valid JSON
- ✅ Guarantees schema adherence
- ✅ Type safety
- ✅ Explicit refusals

### JSON Mode Example

In [None]:
# JSON mode - only guarantees valid JSON
response = client.responses.create(
    model="gpt-4o",
    input="Extract user data as JSON: 'John Doe, 35 years old, engineer'",
    text={"format": {"type": "json_object"}}
)

print("JSON Mode Result:")
result = json.loads(response.output_text)
print(json.dumps(result, indent=2))
print("\n⚠️  Field names and structure may vary! No schema guarantee.")

### Structured Outputs Example

In [None]:
# Define exact schema
class UserProfile(BaseModel):
    full_name: str
    age: int
    occupation: str

# Structured outputs - guarantees schema
response = client.responses.parse(
    model="gpt-4o",
    input="Extract user data: 'John Doe, 35 years old, engineer'",
    text_format=UserProfile
)

user = response.output_parsed

print("Structured Output Result:")
print(f"Name: {user.full_name}")
print(f"Age: {user.age}")
print(f"Occupation: {user.occupation}")
print("\n✅ Always has exactly these fields with correct types!")
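
The gap between the two modes can be reproduced offline with Pydantic alone. In this sketch both strings are hypothetical JSON-mode outputs written for illustration: each is valid JSON, but only one matches `UserProfile`. Local validation reports exactly which fields drifted.

```python
from pydantic import BaseModel, ValidationError

class UserProfile(BaseModel):
    full_name: str
    age: int
    occupation: str

# Hypothetical outputs: both are valid JSON, only the first matches the schema.
matching = '{"full_name": "John Doe", "age": 35, "occupation": "engineer"}'
drifted = '{"name": "John Doe", "age": "thirty-five", "job": "engineer"}'

user = UserProfile.model_validate_json(matching)
print(user)

try:
    UserProfile.model_validate_json(drifted)
    drift_errors = []
except ValidationError as exc:
    # Each error names the offending field and the kind of problem.
    drift_errors = [(err["loc"][0], err["type"]) for err in exc.errors()]
    print(drift_errors)
```

With JSON mode you would need this validation step (and a retry path) after every call. With Structured Outputs the schema is enforced during generation, so the validation error case should not arise for completed responses.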

---

## Part 3: Defining Schemas with Pydantic

Pydantic provides a clean way to define JSON schemas with type validation.

### Basic Schema

In [None]:
class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
    categories: List[str]

response = client.responses.parse(
    model="gpt-4o",
    input="""Extract product info: 
    'iPhone 15 Pro - $999 - Available now - Categories: Electronics, Phones, Apple'""",
    text_format=Product
)

product = response.output_parsed
print(f"Product: {product.name}")
print(f"Price: ${product.price}")
print(f"In Stock: {product.in_stock}")
print(f"Categories: {', '.join(product.categories)}")
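
It can help to look at the JSON schema that Pydantic derives from a model, since that schema is what the model's output is constrained against. The sketch below prints it for `Product` using Pydantic's `model_json_schema()`. Note the assumption: the OpenAI SDK may post-process this schema before sending it, so the exact payload on the wire can differ in detail.

```python
import json
from typing import List

from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    in_stock: bool
    categories: List[str]

# Python type hints are translated into JSON Schema types.
schema = Product.model_json_schema()
print(json.dumps(schema, indent=2))
```

`float` becomes `"number"`, `bool` becomes `"boolean"`, `List[str]` becomes an `"array"` of `"string"` items, and every field without a default is listed under `"required"`.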

### Using Enums for Controlled Values

In [None]:
class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    URGENT = "urgent"

class TaskStatus(str, Enum):
    TODO = "todo"
    IN_PROGRESS = "in_progress"
    DONE = "done"

class Task(BaseModel):
    title: str
    priority: Priority
    status: TaskStatus
    due_date: Optional[str] = None

response = client.responses.parse(
    model="gpt-4o",
    input="Create a task: 'Fix critical bug in production ASAP'",
    text_format=Task
)

task = response.output_parsed
print(f"Task: {task.title}")
print(f"Priority: {task.priority.value}")
print(f"Status: {task.status.value}")
print(f"\n✅ Priority and status are guaranteed to be valid enum values!")
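
The enum guarantee comes from the schema: an `Enum` field is emitted as a fixed list of allowed strings. The following offline sketch (a trimmed-down `Task` with only two fields, for brevity) shows that list and confirms that Pydantic rejects any value outside it.

```python
from enum import Enum

from pydantic import BaseModel, ValidationError

class Priority(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    URGENT = "urgent"

class Task(BaseModel):
    title: str
    priority: Priority

# The enum is emitted as a closed list of allowed values in the schema.
allowed = Task.model_json_schema()["$defs"]["Priority"]["enum"]
print(allowed)

ok = Task.model_validate({"title": "Fix bug", "priority": "urgent"})

try:
    Task.model_validate({"title": "Fix bug", "priority": "asap"})
    rejected = False
except ValidationError:
    rejected = True  # "asap" is not one of the allowed values

print(ok.priority, rejected)
```

Compared with a plain `str` field, this removes the need to normalize variants such as "ASAP", "critical" or "High" after the fact.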

### Nested Objects

In [None]:
class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: str

class Contact(BaseModel):
    name: str
    email: str
    phone: Optional[str] = None
    address: Address

response = client.responses.parse(
    model="gpt-4o",
    input="""Extract contact: 
    'Jane Smith, jane@example.com, lives at 123 Main St, Stockholm, Sweden, 11122'""",
    text_format=Contact
)

contact = response.output_parsed
print(f"Name: {contact.name}")
print(f"Email: {contact.email}")
print(f"Address: {contact.address.street}, {contact.address.city}, {contact.address.country}")

### 🎯 Exercise 1: Define Your Schema

**Task:** Create a schema for extracting information from movie reviews.

**Requirements:**
- Movie title
- Rating (1-5 stars as enum)
- Sentiment (positive/negative/neutral as enum)
- Pros (list of strings)
- Cons (list of strings)
- Reviewer name (optional)

**Test with:** "Great movie! The cinematography was stunning and the plot kept me engaged. A bit long though. 4 stars. - John"

In [None]:
# YOUR CODE HERE

# class Rating(str, Enum):
#     TODO: Define ratings

# class Sentiment(str, Enum):
#     TODO: Define sentiments

# class MovieReview(BaseModel):
#     TODO: Define schema

# Test it
# review_text = "Great movie! The cinematography was stunning and the plot kept me engaged. A bit long though. 4 stars. - John"
# response = client.responses.parse(...)
# review = response.output_parsed
# print(review)

---

## Part 4: Common Use Cases

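Structured Outputs is a good fit for three recurring patterns: extracting data from free text, classifying input into fixed categories, and generating structured definitions such as forms.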

### Use Case 1: Data Extraction

In [None]:
class InvoiceItem(BaseModel):
    description: str
    quantity: int
    unit_price: float
    total: float

class Invoice(BaseModel):
    invoice_number: str
    date: str
    customer_name: str
    items: List[InvoiceItem]
    subtotal: float
    tax: float
    total: float

invoice_text = """
INVOICE #INV-2025-001
Date: 2025-10-06
Customer: Acme Corp

Items:
- Web Development (40 hours @ $150/hr): $6,000
- Design Work (20 hours @ $100/hr): $2,000

Subtotal: $8,000
Tax (25%): $2,000
Total: $10,000
"""

response = client.responses.parse(
    model="gpt-4o",
    input=f"Extract structured data from this invoice:\n\n{invoice_text}",
    text_format=Invoice
)

invoice = response.output_parsed
print(f"Invoice: {invoice.invoice_number}")
print(f"Customer: {invoice.customer_name}")
print(f"\nItems:")
for item in invoice.items:
    print(f"  - {item.description}: ${item.total}")
print(f"\nTotal: ${invoice.total}")
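
Schema adherence guarantees the shape of the data, not that the numbers are right. A cheap safeguard is to re-check the arithmetic after extraction. The helper below is a minimal sketch: `check_invoice` is a hypothetical name, and it assumes a plain dict shaped like `invoice.model_dump()` for the `Invoice` schema above. The sample values are copied from the invoice text.

```python
import math

def check_invoice(invoice: dict, tol: float = 0.01) -> list:
    """Return a list of arithmetic inconsistencies (empty if everything adds up)."""
    problems = []
    for item in invoice["items"]:
        expected = item["quantity"] * item["unit_price"]
        if not math.isclose(item["total"], expected, abs_tol=tol):
            problems.append(f"{item['description']}: total {item['total']} != {expected}")
    items_sum = sum(item["total"] for item in invoice["items"])
    if not math.isclose(invoice["subtotal"], items_sum, abs_tol=tol):
        problems.append(f"subtotal {invoice['subtotal']} != sum of items {items_sum}")
    if not math.isclose(invoice["total"], invoice["subtotal"] + invoice["tax"], abs_tol=tol):
        problems.append(f"total {invoice['total']} != subtotal + tax")
    return problems

# Values taken from the invoice text above.
sample = {
    "items": [
        {"description": "Web Development", "quantity": 40, "unit_price": 150.0, "total": 6000.0},
        {"description": "Design Work", "quantity": 20, "unit_price": 100.0, "total": 2000.0},
    ],
    "subtotal": 8000.0,
    "tax": 2000.0,
    "total": 10000.0,
}

print(check_invoice(sample))                          # consistent invoice
print(check_invoice({**sample, "total": 9000.0}))     # tampered grand total
```

For the sample, 40 × 150 = 6,000 and 20 × 100 = 2,000 add up to the 8,000 subtotal, and 8,000 + 2,000 tax gives the 10,000 total, so the first call reports no problems.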

### Use Case 2: Classification

In [None]:
class Category(str, Enum):
    BUG = "bug"
    FEATURE = "feature"
    QUESTION = "question"
    DOCUMENTATION = "documentation"

class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"

class IssueClassification(BaseModel):
    category: Category
    severity: Severity
    title: str
    summary: str
    requires_immediate_attention: bool

issue_text = """The application crashes when trying to upload files larger than 10MB. 
This is affecting all users and blocking production deployments. 
Error message: 'Memory allocation failed'."""

response = client.responses.parse(
    model="gpt-4o",
    input=f"Classify this issue:\n\n{issue_text}",
    text_format=IssueClassification
)

classification = response.output_parsed
print(f"Category: {classification.category.value}")
print(f"Severity: {classification.severity.value}")
print(f"Title: {classification.title}")
print(f"Immediate attention: {classification.requires_immediate_attention}")
print(f"\nSummary: {classification.summary}")

### Use Case 3: Form Generation

In [None]:
class FieldType(str, Enum):
    TEXT = "text"
    EMAIL = "email"
    NUMBER = "number"
    TEXTAREA = "textarea"
    SELECT = "select"
    CHECKBOX = "checkbox"

class FormField(BaseModel):
    name: str
    label: str
    type: FieldType
    required: bool
    placeholder: Optional[str] = None
    options: Optional[List[str]] = None  # For select fields

class Form(BaseModel):
    title: str
    description: str
    fields: List[FormField]

response = client.responses.parse(
    model="gpt-4o",
    input="""Create a contact form with:
    - Full name (required)
    - Email (required)
    - Phone number (optional)
    - Inquiry type: General, Support, Sales (required dropdown)
    - Message (required, multiline)
    - Subscribe to newsletter (optional checkbox)
    """,
    text_format=Form
)

form = response.output_parsed
print(f"Form: {form.title}")
print(f"Description: {form.description}")
print(f"\nFields:")
for field in form.fields:
    required = "*" if field.required else ""
    print(f"  - {field.label}{required} ({field.type.value})")
    if field.options:
        print(f"    Options: {', '.join(field.options)}")

### 🎯 Exercise 2: Recipe Extractor

**Task:** Build a recipe extractor that converts free-text recipes into structured data.

**Schema requirements:**
- Recipe name
- Prep time (minutes)
- Cook time (minutes)
- Servings
- Difficulty (easy/medium/hard)
- Ingredients (list with name, quantity, unit)
- Instructions (numbered steps)
- Tags (list: vegetarian, vegan, gluten-free, etc.)

**Test with a free-text recipe of your choice**

In [None]:
# YOUR CODE HERE

# class Difficulty(str, Enum):
#     TODO

# class Ingredient(BaseModel):
#     TODO

# class Recipe(BaseModel):
#     TODO

# Test with a recipe
# recipe_text = """..."""
# response = client.responses.parse(...)
# recipe = response.output_parsed

---

## Part 5: Error Handling and Refusals

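Models may refuse to fulfill requests for safety reasons. A refusal cannot conform to your schema, so it arrives as a separate `refusal` content part and `output_parsed` is `None`. Check for it before using the parsed object.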

### Handling Refusals

In [None]:
class PersonalInfo(BaseModel):
    name: str
    address: str
    ssn: str

# This might trigger a refusal
try:
    response = client.responses.parse(
        model="gpt-4o",
        input="Extract personal info: 'John Doe, 123 Main St, SSN: 123-45-6789'",
        text_format=PersonalInfo
    )
    
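    # Check every message content part for a refusal (it is not always output[0].content[0])
    refusal = next(
        (part.refusal
         for item in response.output if item.type == "message"
         for part in item.content if part.type == "refusal"),
        None,
    )
    if refusal:
        print("❌ Model refused:")
        print(refusal)
    else:
        person = response.output_parsed
        print(f"✅ Extracted: {person.name}")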
except Exception as e:
    print(f"Error: {e}")
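
Refusal detection can be factored into a small helper and exercised without an API call. In the sketch below, `find_refusal` is a hypothetical helper name, and the `SimpleNamespace` objects are hand-built stand-ins that mimic only the attributes the helper reads (`type`, `content`, `refusal`). They are not real SDK response objects.

```python
from types import SimpleNamespace
from typing import Optional

def find_refusal(output) -> Optional[str]:
    """Return the first refusal message among a response's output items, else None."""
    for item in output:
        if getattr(item, "type", None) != "message":
            continue  # skip non-message items, e.g. reasoning or tool calls
        for part in item.content:
            if part.type == "refusal":
                return part.refusal
    return None

# Stand-in for a refused response.
refused = [
    SimpleNamespace(
        type="message",
        content=[SimpleNamespace(type="refusal", refusal="I can't help with that.")],
    )
]

# Stand-in for a normal response whose first item is not a message.
answered = [
    SimpleNamespace(type="reasoning"),
    SimpleNamespace(type="message", content=[SimpleNamespace(type="output_text", text="{}")]),
]

print(find_refusal(refused))
print(find_refusal(answered))
```

Scanning all items rather than indexing `output[0].content[0]` keeps the check working when the first output item is something other than a message.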

### Handling Incomplete Responses

In [None]:
# Small max tokens - might be incomplete
class ArticleSummary(BaseModel):
    title: str
    key_points: List[str]
    conclusion: str

try:
    response = client.responses.parse(
        model="gpt-4o",
        input="Summarize this 10-page article about AI...",
        text_format=ArticleSummary,
        max_output_tokens=50  # Intentionally small
    )
    
    # Check completion status
    if response.status == "incomplete":
        print(f"⚠️  Incomplete response: {response.incomplete_details.reason}")
        if response.incomplete_details.reason == "max_output_tokens":
            print("Increase max_output_tokens and retry")
    else:
        summary = response.output_parsed
        print(f"✅ Complete: {summary.title}")
except Exception as e:
    print(f"Error: {e}")

### Proper Error Handling Pattern

In [None]:
def safe_parse(prompt: str, schema: type[BaseModel], max_retries: int = 2):
    """Safely parse with error handling and retries"""
    
    for attempt in range(max_retries):
        try:
            response = client.responses.parse(
                model="gpt-4o",
                input=prompt,
                text_format=schema
            )
            
            # Check for incomplete first: a truncated response may have no usable content
            if response.status == "incomplete":
                return {
                    "success": False,
                    "error": "incomplete",
                    "reason": response.incomplete_details.reason
                }
            
            # Check every message content part for a refusal
            for item in response.output:
                if item.type != "message":
                    continue
                for part in item.content:
                    if part.type == "refusal":
                        return {
                            "success": False,
                            "error": "refusal",
                            "message": part.refusal
                        }
            
            # Success!
            return {
                "success": True,
                "data": response.output_parsed
            }
            
        except Exception as e:
            if attempt < max_retries - 1:
                print(f"Attempt {attempt + 1} failed, retrying...")
                continue
            return {
                "success": False,
                "error": "exception",
                "message": str(e)
            }

# Test it
result = safe_parse(
    "Extract product: 'iPhone 15 - $999'",
    Product
)

if result["success"]:
    print(f"✅ Success: {result['data'].name}")
else:
    print(f"❌ Error ({result['error']}): {result.get('message', result.get('reason'))}")
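
`safe_parse` retries immediately after an exception. For transient failures such as rate limits or network errors, a common refinement is to wait longer between attempts. The sketch below shows exponential backoff with jitter. `retry_with_backoff` and `flaky_call` are hypothetical names, and the API call is replaced by a stand-in that fails twice so the example runs offline.

```python
import random
import time

def retry_with_backoff(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); after an exception wait base_delay * 2**n (plus jitter) and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the last error
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries

# Stand-in for an API call that fails twice, then succeeds.
calls = {"count": 0}

def flaky_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient failure")
    return "ok"

waits = []  # record delays instead of sleeping, so the demo runs instantly
outcome = retry_with_backoff(flaky_call, sleep=waits.append)
print(outcome, calls["count"], [round(w, 2) for w in waits])
```

Refusals and incomplete responses are deliberately not retried here: repeating the same request is unlikely to change a refusal, and an incomplete response needs a larger `max_output_tokens` rather than another identical attempt.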

---

## Part 6: Streaming Structured Outputs

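You can stream structured outputs for better UX. The text deltas are fragments of the JSON document, so they can be displayed as they arrive, but the validated object is only available from the final response once the stream has finished.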

In [None]:
class BlogPost(BaseModel):
    title: str
    sections: List[str]
    conclusion: str

print("Streaming structured output:")
print("=" * 50)

with client.responses.stream(
    model="gpt-4o",
    input="Write a short blog post about AI agents",
    text_format=BlogPost
) as stream:
    for event in stream:
        if event.type == "response.output_text.delta":
            print(event.delta, end="", flush=True)
        elif event.type == "response.completed":
            print("\n" + "=" * 50)
            print("\n✅ Streaming complete")
    
    # Get final parsed object
    final_response = stream.get_final_response()
    blog = final_response.output_parsed

print(f"\nBlog post: {blog.title}")
print(f"Sections: {len(blog.sections)}")
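
The reason the parsed object only exists at the end is that every intermediate buffer is an incomplete JSON document. This offline sketch simulates a stream with hand-split deltas (hypothetical chunk boundaries; real ones are arbitrary) and tries to parse the buffer after each one.

```python
import json

# Hypothetical deltas, split by hand to simulate streaming.
deltas = [
    '{"title": "AI Ag',
    'ents", "sections": ["Intro"',
    ', "Tools"], "conclu',
    'sion": "Done"}',
]

buffer = ""
parse_attempts = []  # True where the accumulated text was valid JSON
for delta in deltas:
    buffer += delta
    try:
        json.loads(buffer)
        parse_attempts.append(True)
    except json.JSONDecodeError:
        parse_attempts.append(False)

print(parse_attempts)
blog_data = json.loads(buffer)
print(blog_data["title"], len(blog_data["sections"]))
```

Only the last attempt succeeds, which is why the streaming cell prints raw deltas for display and reads the structured result from `get_final_response()`.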

---

## Part 7: Advanced Patterns

### Chain-of-Thought with Structured Outputs

In [None]:
class ReasoningStep(BaseModel):
    explanation: str
    output: str

class MathSolution(BaseModel):
    steps: List[ReasoningStep]
    final_answer: str

response = client.responses.parse(
    model="gpt-4o",
    input="Solve step-by-step: What is 15% of 240?",
    text_format=MathSolution
)

solution = response.output_parsed

print("Solution steps:")
for i, step in enumerate(solution.steps, 1):
    print(f"\nStep {i}:")
    print(f"  {step.explanation}")
    print(f"  → {step.output}")

print(f"\n✅ Final Answer: {solution.final_answer}")

### Recursive Schemas

In [None]:
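class UIComponent(BaseModel):
    type: str  # div, button, form, etc.
    label: str
    # Recursive! The string forward reference avoids needing `from __future__ import annotations`.
    # No default value: Structured Outputs requires every field to be present,
    # so leaf components return an empty list.
    children: List["UIComponent"]

# Resolve the forward reference for the recursive model
UIComponent.model_rebuild()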

response = client.responses.parse(
    model="gpt-4o",
    input="Create a simple login form UI structure",
    text_format=UIComponent
)

ui = response.output_parsed

def print_ui(component: UIComponent, indent: int = 0):
    """Recursively print UI structure"""
    print("  " * indent + f"<{component.type}> {component.label}")
    for child in component.children:
        print_ui(child, indent + 1)

print("UI Structure:")
print_ui(ui)
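
A recursive schema can be tested locally before any API call by validating a hand-written tree. The sketch below does that with a made-up login form and adds two hypothetical helpers, `count_components` and `depth`, that walk the result.

```python
from typing import List

from pydantic import BaseModel

class UIComponent(BaseModel):
    type: str
    label: str
    children: List["UIComponent"]  # leaf components use an empty list

UIComponent.model_rebuild()

# Hand-written tree, standing in for a model response.
tree = UIComponent.model_validate({
    "type": "form",
    "label": "Login",
    "children": [
        {"type": "input", "label": "Email", "children": []},
        {"type": "input", "label": "Password", "children": []},
        {"type": "button", "label": "Sign in", "children": []},
    ],
})

def count_components(component: UIComponent) -> int:
    """Count this component plus all of its descendants."""
    return 1 + sum(count_components(child) for child in component.children)

def depth(component: UIComponent) -> int:
    """Return the number of levels in the tree rooted at this component."""
    return 1 + max((depth(child) for child in component.children), default=0)

print(count_components(tree), depth(tree))
```

Nested dicts are validated into nested `UIComponent` instances at every level, so the same traversal code works on `response.output_parsed`.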

### 🎯 Exercise 3: Complete Application

**Task:** Build a customer support ticket analyzer.

**Requirements:**
1. Create schemas for:
   - Ticket classification (category, urgency, sentiment)
   - Extracted entities (customer name, product, issue)
   - Suggested actions (list of next steps)

2. Implement:
   - Safe parsing with error handling
   - Streaming output
   - Pretty printing of results

3. Test with multiple ticket examples

In [None]:
# YOUR CODE HERE

# Define your schemas
# Implement ticket analyzer
# Test with example tickets

ticket_example = """
Subject: Urgent - Payment Failed
From: jane@example.com

Hi, I tried to pay for my premium subscription but the payment keeps failing. 
I've tried 3 different cards and none work. This is really frustrating as I need 
access to the premium features for an important presentation tomorrow.

Can you help?
Jane
"""

---

## Summary

In this notebook, you learned:

✅ **Why Structured Outputs:** Guaranteed schema adherence vs ad-hoc JSON  
✅ **Pydantic schemas:** Types, enums, nested objects, optional fields  
✅ **Common use cases:** Data extraction, classification, form generation  
✅ **Error handling:** Refusals, incomplete responses, retries  
✅ **Streaming:** Real-time structured output generation  
✅ **Advanced patterns:** Chain-of-thought, recursive schemas  

**Key Takeaways:**
- Prefer Structured Outputs over prompt-only JSON whenever code consumes the model's output
- Define clear schemas with Pydantic
- Handle errors gracefully (refusals, incomplete responses)
- Use enums for controlled values
- Stream for better UX

**Next Steps:**
- Notebook 1.4: Prompt Engineering (improve quality of structured outputs)
- Notebook 1.5: Context Management (handle long conversations)
- Notebook 1.6: Agentic Applications (combine with tools)

**Resources:**
- [Structured Outputs Documentation](https://platform.openai.com/docs/guides/structured-outputs)
- [Pydantic Documentation](https://docs.pydantic.dev/)
- [OpenAI Cookbook - Structured Outputs](https://cookbook.openai.com/examples/structured_outputs_intro)