# Lab: Introduction to Pydantic & Structured Outputs

## 📋 Objectives

- Define and use Pydantic models (BaseModel & Field) to enforce data schemas
- Validate Python data structures and parse raw JSON into typed models

In [1]:
from IPython.display import clear_output

def clean_notebook():
    # Clear the cell output (e.g. the pip install logs) once setup has finished
    clear_output(wait=True)
    print("Notebook cleaned.")

!pip install openai

# Install Pydantic
!pip install pydantic

# Clean up the notebook
clean_notebook()

Notebook cleaned.


# Part 1 – Defining Your First Model

In [2]:
from pydantic import BaseModel, Field
from typing import List, Optional

class Book(BaseModel):
    title: str = Field(..., description="Title of the book")
    author: str
    pages: int = Field(..., gt=0, description="Number of pages, must be >0")
    tags: List[str] = Field(default_factory=list, description="Optional labels")
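
The second objective, parsing raw JSON into a typed model, can be sketched with the `Book` model above. This is a minimal example assuming Pydantic v2, where `model_validate_json` parses and validates in one step. The JSON payloads are made up for illustration.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Book(BaseModel):
    title: str = Field(..., description="Title of the book")
    author: str
    pages: int = Field(..., gt=0, description="Number of pages, must be >0")
    tags: List[str] = Field(default_factory=list, description="Optional labels")


# Hypothetical payload, e.g. text returned by an API or an LLM.
raw_json = '{"title": "Example Book", "author": "A. Author", "pages": 320}'

book = Book.model_validate_json(raw_json)  # parse and validate in one step
print(book.pages, book.tags)  # tags falls back to an empty list

# A payload that violates the gt=0 constraint is rejected.
try:
    Book.model_validate_json('{"title": "Empty", "author": "Nobody", "pages": 0}')
except ValidationError as exc:
    print(exc.error_count(), "validation error")
```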


# Pydantic BaseModel and Field Lab Tutorial

## Overview
This lab teaches you how to use Pydantic's `BaseModel` and `Field` for data validation, serialization, and robust data modeling in Python. Pydantic is widely used in modern Python applications, particularly web APIs and data-processing code.

## Prerequisites
- Basic Python knowledge (classes, dictionaries, type hints)
- Python 3.7+ installed
- Pydantic v2 installed: `pip install "pydantic>=2"` (the examples call v2 methods such as `model_dump()`)

## Part 1: Introduction to BaseModel

### What is BaseModel?
BaseModel is Pydantic's core class that provides automatic data validation, serialization, and parsing using Python type annotations.

### Exercise 1.1: Your First BaseModel
Create a simple user model:

```python
from pydantic import BaseModel
from typing import Optional

class User(BaseModel):
    name: str
    age: int
    email: str
    is_active: bool = True  # Default value

# Create an instance
user = User(name="Alice", age=25, email="alice@example.com")
print(user)
print(user.name)  # Access attributes
print(user.model_dump())  # Convert to dictionary
```

**Expected Output:**
```
name='Alice' age=25 email='alice@example.com' is_active=True
Alice
{'name': 'Alice', 'age': 25, 'email': 'alice@example.com', 'is_active': True}
```

### Exercise 1.2: Automatic Validation
Try creating invalid data and see how Pydantic handles it:

```python
from pydantic import ValidationError

# This raises a ValidationError: "not a number" cannot be parsed as an int
try:
    invalid_user = User(name="Bob", age="not a number", email="bob@example.com")
except ValidationError as e:
    print(f"Validation Error: {e}")

# This will work - Pydantic converts string to int
valid_user = User(name="Charlie", age="30", email="charlie@example.com")
print(valid_user.age, type(valid_user.age))  # 30 <class 'int'>
```
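
Beyond printing the exception, you can inspect it programmatically. The sketch below assumes Pydantic v2, where `ValidationError.errors()` returns one dictionary per failing field. The `User` model is repeated so the block runs on its own.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int
    email: str
    is_active: bool = True


try:
    User(name="Bob", age="not a number")  # bad age, and email is missing
except ValidationError as exc:
    errors = exc.errors()
    for err in errors:
        # "loc" is a tuple path to the offending field
        print(err["loc"], err["type"], err["msg"])
```

Each entry carries a machine-readable `type`, which is convenient when you need to turn validation failures into API error responses.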

## Part 2: Introduction to Field

### What is Field?
Field allows you to add validation rules, default values, descriptions, and constraints to your model attributes.

### Exercise 2.1: Basic Field Usage
```python
from pydantic import BaseModel, Field

class Product(BaseModel):
    name: str = Field(..., min_length=1, max_length=100, description="Product name")
    price: float = Field(..., gt=0, description="Price must be positive")
    quantity: int = Field(default=0, ge=0, description="Quantity in stock")
    category: str = Field("general", description="Product category")

# Create a product
product = Product(name="Laptop", price=999.99, quantity=5, category="electronics")
print(product)
```

**Field Parameters:**
- `...` (Ellipsis): Required field
- `default`: Default value
- `gt`, `ge`, `lt`, `le`: Greater than, greater/equal, less than, less/equal
- `min_length`, `max_length`: String length constraints
- `description`: Field description for documentation
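
To see these parameters in action, here is a small sketch (assuming Pydantic v2) that violates several constraints at once. The trimmed-down `Product` model and the input values are illustrative only.

```python
from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: str = Field(..., min_length=1, max_length=100)
    price: float = Field(..., gt=0)
    quantity: int = Field(default=0, ge=0)


ok = Product(name="Pen", price=1.5)
print(ok.quantity)  # the default applies because quantity was omitted

failed_fields = []
try:
    # Empty name, negative price, negative quantity: three violations
    Product(name="", price=-5, quantity=-1)
except ValidationError as exc:
    failed_fields = [err["loc"][0] for err in exc.errors()]
print(failed_fields)
```

Pydantic collects every failure in a single `ValidationError` rather than stopping at the first one.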

### Exercise 2.2: Advanced Field Validation
```python
from pydantic import BaseModel, Field

class Employee(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    # Pydantic v2 uses `pattern`; the v1 `regex` argument was removed
    employee_id: str = Field(..., pattern=r'^EMP\d{4}$')  # Format: EMP1234
    salary: float = Field(..., gt=0, le=1000000)
    department: str = Field(..., min_length=1)
    phone: str = Field(..., pattern=r'^\+?1?\d{9,15}$')

# Test the validation
try:
    emp = Employee(
        name="John Doe",
        employee_id="EMP1234",
        salary=75000,
        department="Engineering",
        phone="+1234567890"
    )
    print("Valid employee created:", emp)
except ValueError as e:
    print(f"Validation Error: {e}")
```

## Part 3: Practical Examples

### Exercise 3.1: API Response Model
Create a model for handling API responses:

```python
from pydantic import BaseModel, Field
from typing import List, Optional
from datetime import datetime

class APIResponse(BaseModel):
    success: bool = Field(default=True)
    message: str = Field(..., max_length=500)
    data: Optional[dict] = None
    timestamp: datetime = Field(default_factory=datetime.now)
    errors: List[str] = Field(default_factory=list)

# Usage
response = APIResponse(
    message="Data retrieved successfully",
    data={"users": [{"id": 1, "name": "Alice"}]}
)
print(response.model_dump())
```
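
API responses usually travel as JSON text, not as Python dictionaries. The following sketch, assuming Pydantic v2, serializes the model with `model_dump_json()` and parses it back with `model_validate_json()`. The model is repeated so the block is self-contained, and the payload values are made up.

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, Field


class APIResponse(BaseModel):
    success: bool = Field(default=True)
    message: str = Field(..., max_length=500)
    data: Optional[dict] = None
    timestamp: datetime = Field(default_factory=datetime.now)
    errors: List[str] = Field(default_factory=list)


response = APIResponse(message="ok", data={"count": 2})

# JSON string; the datetime is written as ISO 8601 text
payload = response.model_dump_json()

# Back to a typed model; the timestamp is a datetime again
restored = APIResponse.model_validate_json(payload)

print(type(restored.timestamp).__name__)
print(restored.timestamp == response.timestamp)
```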

### Exercise 3.2: Configuration Model
Create a configuration model for an application:

```python
from pydantic import BaseModel, Field, field_validator
from typing import List

class DatabaseConfig(BaseModel):
    host: str = Field(..., min_length=1)
    port: int = Field(default=5432, ge=1, le=65535)
    database: str = Field(..., min_length=1)
    username: str = Field(..., min_length=1)
    password: str = Field(..., min_length=8)

    # Pydantic v2: field_validator replaces the deprecated v1 @validator
    @field_validator('password')
    @classmethod
    def validate_password(cls, v):
        if not any(c.isupper() for c in v):
            raise ValueError('Password must contain uppercase letter')
        if not any(c.islower() for c in v):
            raise ValueError('Password must contain lowercase letter')
        if not any(c.isdigit() for c in v):
            raise ValueError('Password must contain digit')
        return v

class AppConfig(BaseModel):
    app_name: str = Field(..., min_length=1)
    debug: bool = Field(default=False)
    allowed_hosts: List[str] = Field(default_factory=list)
    database: DatabaseConfig
    max_connections: int = Field(default=100, ge=1, le=1000)

# Usage
config = AppConfig(
    app_name="MyApp",
    database=DatabaseConfig(
        host="localhost",
        database="myapp_db",
        username="admin",
        password="SecurePass123"
    ),
    allowed_hosts=["localhost", "127.0.0.1"]
)
print(config.model_dump())
```
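
In practice, configuration often arrives as a plain dictionary, for example from a parsed JSON or YAML file. The sketch below assumes Pydantic v2 and uses simplified versions of the models above. It shows `model_validate` building the nested `DatabaseConfig` from a nested dict. The `raw` settings are hypothetical.

```python
from typing import List

from pydantic import BaseModel, Field


class DatabaseConfig(BaseModel):
    host: str = Field(..., min_length=1)
    port: int = Field(default=5432, ge=1, le=65535)
    database: str = Field(..., min_length=1)


class AppConfig(BaseModel):
    app_name: str = Field(..., min_length=1)
    debug: bool = False
    allowed_hosts: List[str] = Field(default_factory=list)
    database: DatabaseConfig


# Hypothetical settings, e.g. the result of json.load() on a config file.
raw = {
    "app_name": "MyApp",
    "database": {"host": "localhost", "port": "5433", "database": "myapp_db"},
}

config = AppConfig.model_validate(raw)

# The nested dict became a DatabaseConfig, and the port string became an int
print(type(config.database).__name__, config.database.port)
```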

## Part 4: Advanced Features

### Exercise 4.1: Custom Validators
```python
from pydantic import BaseModel, Field, field_validator
import re

class User(BaseModel):
    username: str = Field(..., min_length=3, max_length=20)
    email: str = Field(...)
    age: int = Field(..., ge=13, le=120)

    # Pydantic v2: field_validator (stacked on @classmethod) replaces @validator
    @field_validator('username')
    @classmethod
    def validate_username(cls, v):
        if not re.match(r'^[a-zA-Z0-9_]+$', v):
            raise ValueError('Username can only contain letters, numbers, and underscores')
        return v.lower()  # Convert to lowercase

    @field_validator('email')
    @classmethod
    def validate_email(cls, v):
        if '@' not in v:
            raise ValueError('Invalid email format')
        return v.lower()

# Test
user = User(username="John_Doe", email="JOHN@EXAMPLE.COM", age=25)
print(user)  # Notice username and email are lowercase
```
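
It is also worth seeing what happens when a custom validator rejects a value. This sketch assumes Pydantic v2, which spells custom validators as `@field_validator`. The `Account` model is a hypothetical stand-in with only the username rule.

```python
import re

from pydantic import BaseModel, Field, ValidationError, field_validator


class Account(BaseModel):  # hypothetical model for illustration
    username: str = Field(..., min_length=3, max_length=20)

    @field_validator("username")
    @classmethod
    def validate_username(cls, v: str) -> str:
        if not re.match(r"^[a-zA-Z0-9_]+$", v):
            raise ValueError("Username can only contain letters, numbers, and underscores")
        return v.lower()


print(Account(username="John_Doe").username)

try:
    Account(username="john doe!")  # space and "!" are not allowed
except ValidationError as exc:
    first = exc.errors()[0]
    print(first["type"], first["msg"])
```

The `ValueError` raised inside the validator is wrapped in a `ValidationError`, so callers only need to handle one exception type.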

### Exercise 4.2: Model Inheritance
```python
from pydantic import BaseModel, Field
from datetime import datetime

class BaseEntity(BaseModel):
    id: int = Field(..., ge=1)
    created_at: datetime = Field(default_factory=datetime.now)
    updated_at: datetime = Field(default_factory=datetime.now)

class User(BaseEntity):
    name: str = Field(..., min_length=1)
    email: str = Field(...)

class Product(BaseEntity):
    name: str = Field(..., min_length=1)
    price: float = Field(..., gt=0)
    category: str = Field(...)

# Usage
user = User(id=1, name="Alice", email="alice@example.com")
product = Product(id=1, name="Laptop", price=999.99, category="Electronics")
print(user)
print(product)
```
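
A quick way to confirm what a subclass inherits is to look at its `model_fields` mapping. The sketch below assumes Pydantic v2. The `Customer` subclass is hypothetical and the base model is shortened.

```python
from datetime import datetime

from pydantic import BaseModel, Field, ValidationError


class BaseEntity(BaseModel):
    id: int = Field(..., ge=1)
    created_at: datetime = Field(default_factory=datetime.now)


class Customer(BaseEntity):  # hypothetical subclass for illustration
    name: str = Field(..., min_length=1)


# Inherited fields come first, followed by the subclass's own fields.
field_names = list(Customer.model_fields)
print(field_names)

# Constraints declared on the base class still apply to the subclass.
try:
    Customer(id=0, name="Alice")  # id violates ge=1
    inherited_check_ran = False
except ValidationError:
    inherited_check_ran = True
print(inherited_check_ran)
```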

## Part 5: Hands-On Challenges

### Challenge 1: E-commerce Order System
Create models for an e-commerce order system:

```python
from pydantic import BaseModel, Field, field_validator
from typing import List, Optional
from datetime import datetime
from enum import Enum

class OrderStatus(str, Enum):
    PENDING = "pending"
    CONFIRMED = "confirmed"
    SHIPPED = "shipped"
    DELIVERED = "delivered"
    CANCELLED = "cancelled"

class OrderItem(BaseModel):
    product_id: int = Field(..., ge=1)
    product_name: str = Field(..., min_length=1)
    quantity: int = Field(..., ge=1)
    unit_price: float = Field(..., gt=0)

    @property
    def total_price(self) -> float:
        return self.quantity * self.unit_price

class Order(BaseModel):
    # Pydantic v2 names: `pattern` (was `regex`) and `min_length` (was `min_items`)
    order_id: str = Field(..., pattern=r'^ORD\d{6}$')
    customer_email: str = Field(...)
    items: List[OrderItem] = Field(..., min_length=1)
    status: OrderStatus = Field(default=OrderStatus.PENDING)
    order_date: datetime = Field(default_factory=datetime.now)
    shipping_address: str = Field(..., min_length=10)

    @field_validator('customer_email')
    @classmethod
    def validate_email(cls, v):
        if '@' not in v or '.' not in v:
            raise ValueError('Invalid email format')
        return v.lower()
    
    @property
    def total_amount(self) -> float:
        return sum(item.total_price for item in self.items)

# Test your implementation
order = Order(
    order_id="ORD123456",
    customer_email="customer@example.com",
    items=[
        OrderItem(product_id=1, product_name="Laptop", quantity=1, unit_price=999.99),
        OrderItem(product_id=2, product_name="Mouse", quantity=2, unit_price=25.99)
    ],
    shipping_address="123 Main St, City, State 12345"
)

print(f"Order Total: ${order.total_amount:.2f}")
print(order.model_dump())
```
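
Note that a plain `@property` is not a model field, so `total_price` and `total_amount` do not appear in `model_dump()`. If you want a calculated value in the serialized output, one option in Pydantic v2 is the `computed_field` decorator. The sketch below uses a shortened `OrderItem` with illustrative values.

```python
from pydantic import BaseModel, Field, computed_field


class OrderItem(BaseModel):
    product_id: int = Field(..., ge=1)
    quantity: int = Field(..., ge=1)
    unit_price: float = Field(..., gt=0)

    @computed_field  # include the property in model_dump() output
    @property
    def total_price(self) -> float:
        return self.quantity * self.unit_price


item = OrderItem(product_id=2, quantity=2, unit_price=25.0)
dumped = item.model_dump()
print(dumped)
```

Whether to use `@property` or `@computed_field` depends on whether consumers of the serialized data need the derived value.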

### Challenge 2: Student Grade Management
Create a system to manage student grades:

```python
from pydantic import BaseModel, Field, field_validator
from typing import List, Dict, Optional
from datetime import datetime

class Grade(BaseModel):
    subject: str = Field(..., min_length=1)
    score: float = Field(..., ge=0, le=100)
    max_score: float = Field(default=100, gt=0)
    date: datetime = Field(default_factory=datetime.now)

    @property
    def percentage(self) -> float:
        return (self.score / self.max_score) * 100

    @property
    def letter_grade(self) -> str:
        pct = self.percentage
        if pct >= 90: return "A"
        elif pct >= 80: return "B"
        elif pct >= 70: return "C"
        elif pct >= 60: return "D"
        else: return "F"

class Student(BaseModel):
    # Pydantic v2 uses `pattern`; the v1 `regex` argument was removed
    student_id: str = Field(..., pattern=r'^STU\d{6}$')
    name: str = Field(..., min_length=1)
    email: str = Field(...)
    grades: List[Grade] = Field(default_factory=list)

    @field_validator('email')
    @classmethod
    def validate_email(cls, v):
        if '@' not in v:
            raise ValueError('Invalid email format')
        return v.lower()
    
    def add_grade(self, grade: Grade):
        self.grades.append(grade)
    
    @property
    def gpa(self) -> float:
        if not self.grades:
            return 0.0
        
        grade_points = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
        total_points = sum(grade_points[grade.letter_grade] for grade in self.grades)
        return total_points / len(self.grades)

# Test your implementation
student = Student(
    student_id="STU123456",
    name="Alice Johnson",
    email="alice.johnson@school.edu"
)

student.add_grade(Grade(subject="Mathematics", score=85))
student.add_grade(Grade(subject="Science", score=92))
student.add_grade(Grade(subject="English", score=78))

print(f"Student: {student.name}")
print(f"GPA: {student.gpa:.2f}")
for grade in student.grades:
    print(f"{grade.subject}: {grade.score}/100 ({grade.letter_grade})")
```

## Part 6: Best Practices and Tips

### Key Takeaways:
1. **Use Type Hints**: Always provide type annotations for better validation
2. **Leverage Field**: Use Field for validation rules, defaults, and documentation
3. **Custom Validators**: Create custom validation logic when needed
4. **Model Inheritance**: Use inheritance for common fields across models
5. **Property Methods**: Use @property for calculated fields
6. **Error Handling**: Always handle ValidationError exceptions
7. **Documentation**: Use Field descriptions for better API documentation

### Common Patterns:
```python
# Required field with validation
name: str = Field(..., min_length=1, max_length=100)

# Optional field with default
status: str = Field(default="active", pattern=r'^(active|inactive)$')

# Numeric constraints
age: int = Field(..., ge=0, le=150)
price: float = Field(..., gt=0)

# List validation
tags: List[str] = Field(default_factory=list, max_length=10)
```

## Lab Exercises

Complete these exercises to test your understanding:

1. Create a `Book` model with title, author, ISBN, price, and publication date
2. Add validation to ensure ISBN follows the correct format
3. Create a `Library` model that contains a list of books
4. Add methods to add/remove books and calculate total library value
5. Create a `BookReview` model with rating validation (1-5 stars)

## Additional Resources
- [Pydantic Documentation](https://docs.pydantic.dev/)
- [Type Hints Documentation](https://docs.python.org/3/library/typing.html)
- [Python Enum Documentation](https://docs.python.org/3/library/enum.html)

Practice these concepts and you'll master Pydantic's BaseModel and Field for robust data validation in your Python applications!