# Pydantic Fundamentals - 5 Essential Questions

This notebook covers five essential questions for mastering Pydantic fundamentals, from beginner to intermediate level.

**Topics Covered:**
1. Optional Fields & Default Values
2. Type Coercion
3. Handling Validation Errors
4. Model Creation Methods
5. Nested Models & Lists

In [None]:
# Install Pydantic v2 if not already installed (this notebook uses the v2 API)
# !pip install "pydantic>=2"

---
## Question 1: Optional Fields and Default Values

**Key Concept:** In Pydantic, field optionality is controlled by whether you provide a default value and how you use type annotations.

- **Required Fields** - No default value
- **Optional Fields with Defaults** - Have a default value
- **Nullable Fields** - Can be `None`

### 1.1 Required Fields (No Default)

In [None]:
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int          # Required - must be provided
    username: str    # Required - must be provided

# This works
user = User(id=1, username="alice")
print(f"User created: {user}")

In [None]:
# This fails - missing required field 'username'
try:
    user = User(id=1)  # Missing 'username'
except ValidationError as e:
    print("Validation Error:")
    print(e)

### 1.2 Optional Fields with Default Values

In [None]:
class UserWithDefaults(BaseModel):
    id: int
    username: str
    role: str = "user"           # Optional with default
    is_active: bool = True       # Optional with default
    credits: int = 0             # Optional with default

# Only required fields provided - defaults are used
user = UserWithDefaults(id=1, username="alice")
print(f"role: {user.role}")
print(f"is_active: {user.is_active}")
print(f"credits: {user.credits}")

In [None]:
# Override defaults
user = UserWithDefaults(id=2, username="bob", role="admin", credits=100)
print(f"role: {user.role}")
print(f"credits: {user.credits}")

### 1.3 Nullable Fields (Can Be None)

In [None]:
from typing import Optional

class UserNullable(BaseModel):
    id: int
    username: str
    email: Optional[str] = None      # Can be None, defaults to None
    phone: str | None = None         # Python 3.10+ syntax
    bio: Optional[str] = "No bio"    # Can be None, but defaults to string

# All valid examples
user1 = UserNullable(id=1, username="alice")
print(f"user1.email: {user1.email}")

user2 = UserNullable(id=2, username="bob", email="bob@example.com")
print(f"user2.email: {user2.email}")

user3 = UserNullable(id=3, username="charlie", email=None)
print(f"user3.email: {user3.email} (explicitly set to None)")

### 1.4 Important Distinction: Default vs Nullable

In [None]:
class Product(BaseModel):
    name: str
    price: float
    
    # These are DIFFERENT:
    description: str = "No description"        # Always string, never None
    category: Optional[str] = None             # Can be None or string
    tags: list[str] = []                       # Always list, never None
    metadata: Optional[dict] = None            # Can be None or dict
    
product = Product(name="Coffee", price=4.99)
print(f"description: {product.description} (type: {type(product.description).__name__})")
print(f"category: {product.category} (type: {type(product.category).__name__})")
print(f"tags: {product.tags} (type: {type(product.tags).__name__})")
print(f"metadata: {product.metadata} (type: {type(product.metadata).__name__})")
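A third combination is worth seeing next to the ones above: a field that is nullable but still required. The sketch below assumes Pydantic v2, where `Optional[str]` with no default must be supplied even though `None` is an accepted value. `Customer` and its field names are hypothetical, chosen only for illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Customer(BaseModel):  # hypothetical model, not part of the notebook above
    name: str
    nickname: Optional[str]       # nullable but REQUIRED: no default given
    notes: Optional[str] = None   # nullable and optional


# Passing None explicitly satisfies the required nullable field
ok = Customer(name="Alice", nickname=None)
print(ok)

# Omitting the field is reported as "missing", even though None would be accepted
try:
    Customer(name="Bob")
except ValidationError as e:
    missing = [(err["loc"], err["type"]) for err in e.errors()]
    print(missing)
```

In short, the type annotation decides whether `None` is allowed, and the presence of a default decides whether the field may be omitted.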

### 1.5 Using Field() for Complex Defaults

In [None]:
from pydantic import Field

class Config(BaseModel):
    app_name: str
    debug: bool = Field(default=False)
    max_connections: int = Field(default=100, ge=1, le=1000)
    allowed_hosts: list[str] = Field(default_factory=list)  # New list built for each instance

config = Config(app_name="MyApp")
print(f"debug: {config.debug}")
print(f"max_connections: {config.max_connections}")
print(f"allowed_hosts: {config.allowed_hosts}")
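The cell above only reads the defaults back. As a minimal sketch, assuming Pydantic v2, the following shows the two other things `Field()` does here: `default_factory` gives every instance its own list, and the `ge`/`le` bounds are enforced on input. `ServerConfig` is a hypothetical stand-in for the `Config` model.

```python
from pydantic import BaseModel, Field, ValidationError


class ServerConfig(BaseModel):  # hypothetical stand-in for Config
    app_name: str
    max_connections: int = Field(default=100, ge=1, le=1000)
    allowed_hosts: list[str] = Field(default_factory=list)


# Each instance gets its own list, so mutating one does not affect the other
a = ServerConfig(app_name="A")
b = ServerConfig(app_name="B")
a.allowed_hosts.append("localhost")
print(a.allowed_hosts, b.allowed_hosts)

# Values outside the ge/le bounds are rejected
try:
    ServerConfig(app_name="C", max_connections=5000)
except ValidationError as e:
    constraint_error = e.errors()[0]["type"]
    print(constraint_error)
```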

---
## Question 2: Type Coercion

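**Key Concept:** Pydantic doesn't just validate types; in its default (lax) mode it also converts compatible inputs, such as the string `"123"` for an `int` field. Pydantic v2 rejects most lossy conversions, but the conversions it does accept can still hide bugs if you don't know the rules.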

### 2.1 Common Coercion Patterns

In [None]:
class CoercionModel(BaseModel):
    integer: int
    floating: float
    text: str
    flag: bool

# Type coercion in action
m = CoercionModel(
    integer="123",       # str → int
    floating="3.14",     # str → float
    text="456",          # already a str; Pydantic v2 rejects int → str by default
    flag=1               # int → bool
)

print(f"integer: {m.integer} (type: {type(m.integer).__name__})")
print(f"floating: {m.floating} (type: {type(m.floating).__name__})")
print(f"text: {m.text} (type: {type(m.text).__name__})")
print(f"flag: {m.flag} (type: {type(m.flag).__name__})")

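### 2.2 Float to Int - Fractional Values Are Rejected

In [None]:
class Score(BaseModel):
    points: int

# A float with no fractional part is converted
score1 = Score(points=99.0)
print(f"99.0 → {score1.points} (type: {type(score1.points).__name__})")

# Pydantic v2 neither truncates nor rounds: a fractional float is a validation error
try:
    Score(points=99.9)
except ValidationError as e:
    print(f"99.9 → rejected: {e.errors()[0]['msg']}")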
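In Pydantic v2, an `int` field rejects a float that has a fractional part. Where rounding is the intended behaviour, one option is a "before" validator that rounds floats explicitly before the integer check runs. The sketch below is one possible approach, not the only one. `RoundedScore` and `round_floats` are hypothetical names, and it assumes finite inputs.

```python
from pydantic import BaseModel, field_validator


class RoundedScore(BaseModel):  # hypothetical model for illustration
    points: int

    @field_validator("points", mode="before")
    @classmethod
    def round_floats(cls, v):
        # Runs before the int validation; only float inputs are changed.
        # Note: Python's round() rounds halves to the nearest even integer.
        if isinstance(v, float):
            return round(v)
        return v


print(RoundedScore(points=99.9).points)
print(RoundedScore(points=99.1).points)
print(RoundedScore(points="42").points)  # strings still go through normal coercion
```

Making the rounding explicit keeps the decision visible in the model instead of relying on implicit conversion.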

### 2.3 Boolean Coercion - Surprising Behavior

In [None]:
class Settings(BaseModel):
    enabled: bool

# These all become True
print("Values that become True:")
print(f"  1 → {Settings(enabled=1).enabled}")
print(f"  'yes' → {Settings(enabled='yes').enabled}")
print(f"  'true' → {Settings(enabled='true').enabled}")

# These all become False
# (strings outside the recognised set, such as '' or 'maybe', raise ValidationError)
print("\nValues that become False:")
print(f"  0 → {Settings(enabled=0).enabled}")
print(f"  'no' → {Settings(enabled='no').enabled}")
print(f"  'false' → {Settings(enabled='false').enabled}")

### 2.4 When Coercion Causes Problems

In [None]:
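# Financial calculation issue
class Transaction(BaseModel):
    amount: int  # Cents

# Client sends dollars as a float; 20.0 has no fractional part, so it is accepted
transaction = Transaction(amount=20.0)
print(f"Expected 2000 cents, got: {transaction.amount} cents")
print("⚠️ This could be a serious bug in a financial application!")

# A fractional amount such as 19.99 is rejected, so the bug only appears for whole-dollar values
try:
    Transaction(amount=19.99)
except ValidationError as e:
    print(f"19.99 → rejected: {e.errors()[0]['msg']}")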

### 2.5 Solution - Strict Mode

In [None]:
class StrictModel(BaseModel):
    user_id: int = Field(strict=True)
    price: float = Field(strict=True)

# Now coercion is disabled
try:
    model = StrictModel(user_id="123", price="9.99")
except ValidationError as e:
    print("With strict mode, strings are NOT converted:")
    print(e)

In [None]:
# Only exact types work with strict mode
model = StrictModel(user_id=123, price=9.99)
print(f"✓ Works with exact types: user_id={model.user_id}, price={model.price}")
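`Field(strict=True)` works one field at a time. As a sketch, assuming Pydantic v2, strictness can also be switched on for a whole model through `ConfigDict(strict=True)`, or for a single call through the `strict=True` argument of `model_validate()`. `StrictPayment` and `LaxPayment` are hypothetical models.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class StrictPayment(BaseModel):  # hypothetical: strict for every field
    model_config = ConfigDict(strict=True)
    user_id: int
    amount_cents: int


class LaxPayment(BaseModel):  # hypothetical: default (lax) behaviour
    user_id: int
    amount_cents: int


raw = {"user_id": "7", "amount_cents": "1999"}

# Lax mode converts the numeric strings
lax = LaxPayment.model_validate(raw)
print(lax)

# Model-wide strict mode rejects them
try:
    StrictPayment.model_validate(raw)
    strict_model_ok = True
except ValidationError:
    strict_model_ok = False

# Strictness can also be requested for one call on an otherwise lax model
try:
    LaxPayment.model_validate(raw, strict=True)
    strict_call_ok = True
except ValidationError:
    strict_call_ok = False

print(strict_model_ok, strict_call_ok)
```

Per-call strictness is handy when the same model serves both trusted internal callers and untrusted external input.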

---
## Question 3: Handling Validation Errors

**Key Concept:** Pydantic raises `ValidationError` when data doesn't match the model schema. This error contains structured information about all validation failures.

### 3.1 Basic Error Handling

In [None]:
class UserValidation(BaseModel):
    id: int
    username: str
    age: int

# Catch validation errors
try:
    user = UserValidation(id="abc", username=123, age="invalid")
except ValidationError as e:
    print("Validation Error:")
    print(e)

### 3.2 Accessing Error Details Programmatically

In [None]:
try:
    user = UserValidation(id="abc", username=123, age="invalid")
except ValidationError as e:
    # Get error count
    print(f"Found {e.error_count()} errors\n")
    
    # Get list of error dictionaries
    for error in e.errors():
        print(f"Field: {error['loc']}")
        print(f"Message: {error['msg']}")
        print(f"Type: {error['type']}")
        print(f"Input: {error['input']}")
        print("---")

### 3.3 Getting JSON Error Response (Perfect for APIs)

In [None]:
import json

try:
    user = UserValidation(id="abc", username=123, age="invalid")
except ValidationError as e:
    error_json = e.json()
    print("JSON Error Response:")
    print(json.dumps(json.loads(error_json), indent=2))
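An API often wants a smaller, stable error shape rather than Pydantic's full output. The sketch below builds one from `e.errors()`. The `to_error_payload` helper, the `SignupRequest` model, and the `detail`/`field`/`message`/`code` keys are all assumptions made for illustration, not a Pydantic or framework convention.

```python
from pydantic import BaseModel, ValidationError


class SignupRequest(BaseModel):  # hypothetical request model
    id: int
    username: str


def to_error_payload(exc: ValidationError) -> dict:
    """Flatten Pydantic's error list into a compact, client-friendly dict."""
    return {
        "detail": [
            {
                # loc is a tuple such as ("address", "zip_code"); join it into a dotted path
                "field": ".".join(str(part) for part in err["loc"]),
                "message": err["msg"],
                "code": err["type"],
            }
            for err in exc.errors()
        ]
    }


try:
    SignupRequest(id="abc", username="alice")
except ValidationError as e:
    payload = to_error_payload(e)
    print(payload)
```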

### 3.4 Custom Error Messages with Validators

In [None]:
from pydantic import field_validator

class ProductValidation(BaseModel):
    name: str
    price: float
    
    @field_validator('price')
    @classmethod
    def price_must_be_positive(cls, v):
        if v <= 0:
            raise ValueError('Price must be greater than zero')
        return v

try:
    product = ProductValidation(name="Coffee", price=-5.99)
except ValidationError as e:
    for error in e.errors():
        print(f"Custom error message: {error['msg']}")

### 3.5 Nested Model Errors

In [None]:
class AddressValidation(BaseModel):
    street: str
    zip_code: str = Field(pattern=r'^\d{5}$')

class PersonValidation(BaseModel):
    name: str
    address: AddressValidation

try:
    person = PersonValidation(
        name="Alice",
        address={"street": "Main St", "zip_code": "invalid"}
    )
except ValidationError as e:
    for error in e.errors():
        print(f"Location: {error['loc']}")
        print(f"Message: {error['msg']}")

---
## Question 4: Model Creation Methods

**Key Concept:** Pydantic offers multiple ways to create model instances:
- Direct instantiation with keyword arguments
- `model_validate()` for dictionaries
- `model_validate_json()` for JSON strings
- `model_construct()` for skipping validation (dangerous!)

### 4.1 Direct Instantiation (Keyword Arguments)

In [None]:
class UserCreate(BaseModel):
    id: int
    username: str

# Direct instantiation with keyword arguments
user = UserCreate(id=1, username="alice")
print(f"Direct instantiation: {user}")

### 4.2 model_validate() - Parse Dictionary

In [None]:
# When you have a dictionary
data = {"id": 2, "username": "bob"}
user = UserCreate.model_validate(data)
print(f"model_validate(): {user}")

# For a plain dict with string keys, unpacking gives the same result
user = UserCreate(**data)
print(f"Unpacking (**data): {user}")

### 4.3 model_validate_json() - Parse JSON String

In [None]:
# When you have a JSON string
json_string = '{"id": 3, "username": "charlie"}'
user = UserCreate.model_validate_json(json_string)
print(f"model_validate_json(): {user}")

### 4.4 model_construct() - Skip Validation (Use with Caution!)

In [None]:
# Create without validation - use with EXTREME caution
user = UserCreate.model_construct(id="not_an_int", username=12345)
print(f"model_construct(): {user}")
print(f"⚠️ id is actually: '{user.id}' (type: {type(user.id).__name__})")
print("⚠️ No validation happened! This is dangerous with untrusted data.")
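Two further details of `model_construct()` are easy to miss. As a sketch, assuming Pydantic v2: defaults are still filled in for omitted fields, and `model_fields_set` records only what was passed explicitly. `Account` is a hypothetical model.

```python
from pydantic import BaseModel, ValidationError


class Account(BaseModel):  # hypothetical model for illustration
    id: int
    role: str = "user"


# No validation: the wrong type for id is stored as-is
acct = Account.model_construct(id="not_an_int")
print(acct.role)               # default applied even though validation was skipped
print(acct.model_fields_set)   # only the explicitly provided field
print(type(acct.id).__name__)

# Running the same values through normal validation exposes the problem
try:
    Account.model_validate({"id": acct.id, "role": acct.role})
    revalidated_ok = True
except ValidationError:
    revalidated_ok = False
print(revalidated_ok)
```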

### 4.5 When to Use Each Method

In [None]:
from datetime import datetime

class Event(BaseModel):
    name: str
    timestamp: datetime

# 1. Direct instantiation - when you have individual values
event1 = Event(name="Meeting", timestamp=datetime.now())
print(f"1. Direct: {event1}")

# 2. model_validate() - when parsing dict from API/database
api_response = {"name": "Conference", "timestamp": "2024-06-15T10:00:00"}
event2 = Event.model_validate(api_response)
print(f"2. model_validate(): {event2}")

# 3. model_validate_json() - when parsing JSON string
json_data = '{"name":"Workshop","timestamp":"2024-07-01T14:00:00"}'
event3 = Event.model_validate_json(json_data)
print(f"3. model_validate_json(): {event3}")

# 4. model_construct() - when loading trusted data at scale (no validation)
database_row = {"name": "Seminar", "timestamp": datetime(2024, 8, 1)}
event4 = Event.model_construct(**database_row)
print(f"4. model_construct(): {event4}")

### 4.6 Performance Comparison

In [None]:
import time

# Setup
data_dict = {"id": 1, "username": "test"}
iterations = 10000

# Method 1: Direct with unpacking
# perf_counter() is a monotonic, high-resolution clock intended for timing
start = time.perf_counter()
for _ in range(iterations):
    user = UserCreate(**data_dict)
time1 = time.perf_counter() - start

# Method 2: model_validate()
start = time.perf_counter()
for _ in range(iterations):
    user = UserCreate.model_validate(data_dict)
time2 = time.perf_counter() - start

# Method 3: model_construct() (no validation)
start = time.perf_counter()
for _ in range(iterations):
    user = UserCreate.model_construct(**data_dict)
time3 = time.perf_counter() - start

print(f"Performance comparison ({iterations:,} iterations):")
print(f"  Direct (**data):     {time1:.3f}s")
print(f"  model_validate():    {time2:.3f}s")
print(f"  model_construct():   {time3:.3f}s (fastest but no validation!)")

---
## Question 5: Nested Models and Lists

**Key Concept:** Pydantic excels at validating nested, hierarchical data structures. Validation automatically cascades through all levels.

### 5.1 Basic Nested Model

In [None]:
class Address(BaseModel):
    street: str
    city: str
    zip_code: str

class Person(BaseModel):
    name: str
    age: int
    address: Address  # Nested model

# Create with nested dictionary
person = Person(
    name="Alice",
    age=30,
    address={
        "street": "123 Main St",
        "city": "Boston",
        "zip_code": "02101"
    }
)

# Access nested attributes
print(f"Name: {person.name}")
print(f"City: {person.address.city}")
print(f"Zip: {person.address.zip_code}")

### 5.2 Nested Validation Cascades

In [None]:
# Invalid nested data is caught
try:
    person = Person(
        name="Bob",
        age="invalid",  # Error at Person level
        address={
            "street": "456 Oak Ave",
            "city": 999,  # Error at Address level
            "zip_code": "10001"
        }
    )
except ValidationError as e:
    print("Errors at multiple levels:")
    for error in e.errors():
        print(f"  {error['loc']}: {error['msg']}")

### 5.3 List of Primitives

In [None]:
class TodoList(BaseModel):
    title: str
    items: list[str]
    tags: list[str] = []

todo = TodoList(
    title="Shopping",
    items=["Milk", "Bread", "Eggs"],
    tags=["groceries", "urgent"]
)

print(f"Todo: {todo.title}")
print(f"Items: {todo.items}")
print(f"Number of items: {len(todo.items)}")

### 5.4 List of Nested Models

In [None]:
class Item(BaseModel):
    name: str
    price: float
    quantity: int

class Order(BaseModel):
    order_id: int
    items: list[Item]  # List of Item models

# Create with list of dictionaries
order = Order(
    order_id=1001,
    items=[
        {"name": "Coffee", "price": 4.99, "quantity": 2},
        {"name": "Muffin", "price": 3.50, "quantity": 1}
    ]
)

# Access items
print(f"Order #{order.order_id}")
for item in order.items:
    print(f"  {item.name}: ${item.price} x {item.quantity}")

# Calculate total
total = sum(item.price * item.quantity for item in order.items)
print(f"Total: ${total:.2f}")
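When one element of a list fails validation, the error location includes the element's index, which makes it possible to point at the exact offending item. A minimal sketch, assuming Pydantic v2, with hypothetical `LineItem` and `Basket` models:

```python
from pydantic import BaseModel, ValidationError


class LineItem(BaseModel):  # hypothetical, mirrors Item above
    name: str
    quantity: int


class Basket(BaseModel):  # hypothetical, mirrors Order above
    items: list[LineItem]


try:
    Basket(
        items=[
            {"name": "Coffee", "quantity": 2},
            {"name": "Muffin", "quantity": "two"},  # invalid: not an integer
        ]
    )
except ValidationError as e:
    # Each loc is a path: field name, list index, then the nested field name
    locations = [err["loc"] for err in e.errors()]
    print(locations)
```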

### 5.5 Complex Nested Structure

In [None]:
class Contact(BaseModel):
    email: str
    phone: str | None = None

class CompanyAddress(BaseModel):
    street: str
    city: str
    country: str = "USA"

class Company(BaseModel):
    name: str
    employees: list[str]
    address: CompanyAddress
    contact: Contact

class Employee(BaseModel):
    username: str
    company: Company

# Deep nesting with validation at all levels
employee = Employee(
    username="alice",
    company={
        "name": "Tech Corp",
        "employees": ["Bob", "Charlie", "Diana"],
        "address": {
            "street": "100 Tech Blvd",
            "city": "San Francisco"
        },
        "contact": {
            "email": "info@techcorp.com",
            "phone": "+1-555-0100"
        }
    }
)

# Deep attribute access
print(f"Employee: {employee.username}")
print(f"Company: {employee.company.name}")
print(f"City: {employee.company.address.city}")
print(f"Email: {employee.company.contact.email}")
print(f"Number of employees: {len(employee.company.employees)}")

### 5.6 Optional Nested Models

In [None]:
class Profile(BaseModel):
    bio: str
    website: str | None = None

class UserProfile(BaseModel):
    username: str
    profile: Optional[Profile] = None  # Nested model can be None

# User without profile
user1 = UserProfile(username="bob")
print(f"User 1 profile: {user1.profile}")

# User with profile
user2 = UserProfile(
    username="alice",
    profile={"bio": "Software Engineer", "website": "alice.dev"}
)
print(f"User 2 bio: {user2.profile.bio}")

### 5.7 Serialization of Nested Models

In [None]:
class SimpleAddress(BaseModel):
    street: str
    city: str

class SimplePerson(BaseModel):
    name: str
    address: SimpleAddress

person = SimplePerson(
    name="Alice",
    address={"street": "123 Main St", "city": "NYC"}
)

# Convert to dict - maintains nested structure
data = person.model_dump()
print("model_dump():")
print(json.dumps(data, indent=2))

# Convert to JSON
json_str = person.model_dump_json()
print(f"\nmodel_dump_json(): {json_str}")
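Passing `model_dump()` output to `json.dumps` works only while every value is JSON-native. Once a field such as a `datetime` appears, `model_dump(mode="json")` or `model_dump_json()` is the safer route. The sketch below assumes Pydantic v2 and uses hypothetical `Venue` and `Meetup` models. It also shows that a JSON round trip restores an equal model.

```python
from datetime import datetime

from pydantic import BaseModel


class Venue(BaseModel):  # hypothetical nested model
    city: str


class Meetup(BaseModel):  # hypothetical parent model
    title: str
    starts_at: datetime
    venue: Venue


meetup = Meetup(title="PyData", starts_at="2024-06-15T10:00:00", venue={"city": "NYC"})

# Default (python) mode keeps rich Python objects such as datetime
python_dump = meetup.model_dump()
print(type(python_dump["starts_at"]).__name__)

# JSON mode converts every value to a JSON-compatible type
json_dump = meetup.model_dump(mode="json")
print(json_dump["starts_at"])

# Round trip: serialise to a JSON string, then validate it back into a model
restored = Meetup.model_validate_json(meetup.model_dump_json())
print(restored == meetup)
```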

### 5.8 Real-World Example - API Response

In [None]:
from datetime import datetime

class Author(BaseModel):
    id: int
    name: str

class Comment(BaseModel):
    id: int
    text: str
    author: Author
    created_at: datetime

class Post(BaseModel):
    id: int
    title: str
    content: str
    author: Author
    comments: list[Comment]
    tags: list[str] = []

# Parse complex API response
api_response = {
    "id": 1,
    "title": "Learning Pydantic",
    "content": "Pydantic is awesome!",
    "author": {"id": 100, "name": "Alice"},
    "comments": [
        {
            "id": 1,
            "text": "Great post!",
            "author": {"id": 101, "name": "Bob"},
            "created_at": "2024-01-15T10:30:00"
        },
        {
            "id": 2,
            "text": "Very helpful!",
            "author": {"id": 102, "name": "Charlie"},
            "created_at": "2024-01-15T11:00:00"
        }
    ],
    "tags": ["python", "pydantic", "tutorial"]
}

post = Post.model_validate(api_response)

# Easy access to nested data
print(f"Post: '{post.title}' by {post.author.name}")
print(f"Tags: {', '.join(post.tags)}")
print(f"\n{len(post.comments)} comments:")
for comment in post.comments:
    print(f"  - {comment.author.name}: {comment.text}")

---
## Summary: Key Takeaways

| Topic | Key Points |
|-------|------------|
| **Optional Fields & Defaults** | Use `= value` for defaults, `Optional[Type]` for nullable, `Field(default_factory=...)` for mutable defaults |
| **Type Coercion** | Pydantic converts compatible types by default (lax mode) and rejects lossy conversions such as `99.9` to `int`; use `Field(strict=True)` for critical fields |
| **Validation Errors** | Use `e.errors()` for programmatic access, `e.json()` for API responses |
| **Model Creation** | Use `model_validate()` for dicts, `model_validate_json()` for JSON, `model_construct()` only for trusted data |
| **Nested Models** | Validation cascades through all levels; use `model_dump()` for serialization |