# 🎯 Pydantic Fundamentals: 5 Essential Questions

**Beginner to Intermediate Level**

This notebook covers the essential foundations of Pydantic:

1. **Optional Fields & Defaults** - Understanding field requirements and nullability
2. **Type Coercion** - How automatic type conversion works and its pitfalls
3. **Validation Errors** - Catching, inspecting, and communicating errors effectively
4. **Model Creation Methods** - Choosing the right method for different scenarios
5. **Nested Models** - Building and validating complex hierarchical structures

In [None]:
# Setup - Install Pydantic v2 if needed (this notebook uses the v2 API, e.g. model_validate)
# !pip install "pydantic>=2"

---

## Question 1: How do you define optional fields and default values in Pydantic models?

### 💡 Interviewer's Intent:
The interviewer wants to assess your understanding of field requirements in Pydantic. They're checking whether you know the difference between required fields, optional fields with defaults, and nullable fields. This is fundamental for API design where some fields are mandatory while others are optional or have default behaviors.

### Required Fields (No Default)

Fields without default values are required and must be provided.

In [None]:
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int          # Required - must be provided
    username: str    # Required - must be provided

# This works
user = User(id=1, username="alice")
print(f"Created user: {user}")

In [None]:
# This fails - missing required fields
try:
    user = User(id=1)  # Missing 'username'
except ValidationError as e:
    print("❌ ValidationError:")
    print(e)

### Optional Fields with Default Values

Fields with default values are optional - defaults are used if not provided.

In [None]:
class UserWithDefaults(BaseModel):
    id: int
    username: str
    role: str = "user"           # Optional with default
    is_active: bool = True       # Optional with default
    credits: int = 0             # Optional with default

# Only required fields provided - defaults are used
user = UserWithDefaults(id=1, username="alice")
print(f"role: {user.role}")           # "user"
print(f"is_active: {user.is_active}") # True
print(f"credits: {user.credits}")     # 0

In [None]:
# Override defaults
user = UserWithDefaults(id=2, username="bob", role="admin", credits=100)
print(f"role: {user.role}")       # "admin"
print(f"credits: {user.credits}") # 100

### Nullable Fields (Can Be None)

Use `Optional[Type]` or `Type | None` for fields that can accept `None`.

In [None]:
from typing import Optional

class UserNullable(BaseModel):
    id: int
    username: str
    email: Optional[str] = None      # Can be None, defaults to None
    phone: str | None = None          # Python 3.10+ syntax
    bio: Optional[str] = "No bio"     # Can be None, but defaults to string

# All valid
user1 = UserNullable(id=1, username="alice")
print(f"user1.email: {user1.email}")  # None

user2 = UserNullable(id=2, username="bob", email="bob@example.com")
print(f"user2.email: {user2.email}")  # "bob@example.com"

user3 = UserNullable(id=3, username="charlie", email=None)
print(f"user3.email: {user3.email}")  # None (explicitly set)

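### ⚠️ Important Distinction

A field with a non-`None` default is not the same as a nullable field: the first always holds a value of its declared type, while the second may hold `None`.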

In [None]:
class Product(BaseModel):
    name: str
    price: float
    
    # These are DIFFERENT:
    description: str = "No description"        # Always string, never None
    category: Optional[str] = None             # Can be None or string
    tags: list[str] = []                       # Always list, never None
    metadata: Optional[dict] = None            # Can be None or dict
    
product = Product(name="Coffee", price=4.99)
print(f"description: {product.description}")  # "No description" (string)
print(f"category: {product.category}")        # None
print(f"tags: {product.tags}")                # []
print(f"metadata: {product.metadata}")        # None

### Using Field() for Complex Defaults

In [None]:
from pydantic import Field

class Config(BaseModel):
    app_name: str
    debug: bool = Field(default=False)
    max_connections: int = Field(default=100, ge=1, le=1000)
    allowed_hosts: list[str] = Field(default_factory=list)  # Mutable default

config = Config(app_name="MyApp")
print(f"debug: {config.debug}")                      # False
print(f"max_connections: {config.max_connections}")  # 100
print(f"allowed_hosts: {config.allowed_hosts}")      # []

### 🔑 Key Insight

> If a field has no default value, it's required. Use `= value` for simple defaults, `Optional[Type]` when None is acceptable, and `Field(default_factory=...)` for mutable defaults like lists or dicts. This pattern is essential for designing flexible yet safe API models.
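
One case the summary above does not spell out: in Pydantic v2, annotating a field as `Optional[Type]` without a default makes it nullable but still required. The sketch below illustrates this alongside `default_factory`; the `Ticket` model and its field names are hypothetical and exist only for this example.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Ticket(BaseModel):
    # Hypothetical model, used only for illustration
    title: str                                       # required, not nullable
    assignee: Optional[str]                          # required, but None is allowed
    notes: Optional[str] = None                      # optional and nullable
    labels: list[str] = Field(default_factory=list)  # fresh list per instance


# None must be passed explicitly for a required nullable field
ticket = Ticket(title="Fix login", assignee=None)
print(ticket.assignee, ticket.notes, ticket.labels)

# Omitting the required nullable field raises a "missing" error
try:
    Ticket(title="Fix login")
except ValidationError as e:
    missing_error = e.errors()[0]
    print(missing_error["loc"], missing_error["type"])

# Each instance gets its own list from default_factory
a = Ticket(title="A", assignee=None)
b = Ticket(title="B", assignee=None)
a.labels.append("bug")
print(b.labels)
```

If a field should be both nullable and omittable, give it an explicit `= None` default as `notes` does here.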

---

## Question 2: How does type coercion work in Pydantic and when might it cause issues?

### 💡 Interviewer's Intent:
The interviewer is testing whether you understand that Pydantic doesn't just validate types; it also converts them. They want to know if you're aware of which conversions happen silently, which are rejected, and when you should use strict mode to prevent unwanted conversions. This is critical for financial applications or scenarios where type precision matters.

### Common Coercion Patterns

In [None]:
from pydantic import BaseModel

class Model(BaseModel):
    integer: int
    floating: float
    text: str
    flag: bool

# Type coercion in action (default "lax" mode)
m = Model(
    integer="123",       # str → int
    floating="3.14",     # str → float
    text="456",          # must already be a str: Pydantic v2 does not coerce int → str
    flag=1               # int → bool
)

print(f"integer: {m.integer} (type: {type(m.integer).__name__})")
print(f"floating: {m.floating} (type: {type(m.floating).__name__})")
print(f"text: {m.text} (type: {type(m.text).__name__})")
print(f"flag: {m.flag} (type: {type(m.flag).__name__})")

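### ⚠️ Float to Int - Fractional Values Are Rejected

In [None]:
from pydantic import ValidationError

class Score(BaseModel):
    points: int

# Pydantic v2 rejects a float with a fractional part instead of truncating it
# (Pydantic v1 silently turned 99.9 into 99)
try:
    Score(points=99.9)
except ValidationError as e:
    print(f"99.9 → rejected: {e.errors()[0]['msg']}")

# A float with no fractional part is still coerced
score = Score(points=99.0)
print(f"99.0 → {score.points}")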

### Boolean Coercion - Surprising Behavior

In [None]:
class Settings(BaseModel):
    enabled: bool

# These all become True
print("Values that become True:")
print(f"  1 → {Settings(enabled=1).enabled}")
print(f"  'yes' → {Settings(enabled='yes').enabled}")
print(f"  'true' → {Settings(enabled='true').enabled}")

# These all become False
print("\nValues that become False:")
print(f"  0 → {Settings(enabled=0).enabled}")
print(f"  'no' → {Settings(enabled='no').enabled}")
print(f"  'false' → {Settings(enabled='false').enabled}")

# Note: an empty string '' or an int other than 0/1 is rejected, not treated as falsy/truthy

### String Coercion - Stricter Than You Might Expect

In [None]:
class Log(BaseModel):
    message: str

# Pydantic v2 does not coerce other types to str by default
# (numbers can be opted in with ConfigDict(coerce_numbers_to_str=True))
for value in (123, 3.14, True, [1, 2, 3]):
    try:
        Log(message=value)
    except ValidationError as e:
        print(f"{value!r} → rejected: {e.errors()[0]['msg']}")

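### ⚠️ When Coercion Causes Problems

In [None]:
# Financial calculation issue
class Transaction(BaseModel):
    amount: int  # Cents

# Client sends whole dollars as a float; 20.0 has no fractional part,
# so lax mode accepts it as 20 (cents) instead of flagging the wrong type
transaction = Transaction(amount=20.0)
print(f"❌ Got {transaction.amount} cents, not 2000 cents!")

# Configuration parsing issue
class Config(BaseModel):
    max_retries: int

# String from environment variable: "3" is coerced to 3 ...
config = Config(max_retries="3")
print(f"'3' → {config.max_retries}")

# ... but "3.5" is rejected in Pydantic v2 (v1 behaved differently)
try:
    Config(max_retries="3.5")
except ValidationError as e:
    print(f"❌ '3.5' rejected: {e.errors()[0]['msg']}")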

### ✅ Solution - Strict Mode

In [None]:
from pydantic import Field, ValidationError

class StrictModel(BaseModel):
    user_id: int = Field(strict=True)
    price: float = Field(strict=True)

# Now coercion is disabled
try:
    model = StrictModel(user_id="123", price="9.99")
except ValidationError as e:
    print("❌ Strict mode prevents coercion:")
    print(e)

# Only exact types work
model = StrictModel(user_id=123, price=9.99)
print(f"\n✅ Works with exact types: {model}")

### 🔑 Key Insight

**When to Use Strict Mode:**
1. **Financial data** - Prevent precision loss
2. **IDs and keys** - Ensure exact type matching
3. **Critical configuration** - Avoid silent type conversions
4. **Security tokens** - No unexpected transformations

**When Coercion is Helpful:**
1. **API input** - Accept "123" as 123 for convenience
2. **Form data** - All form values are strings
3. **Environment variables** - Always strings, need conversion
4. **CSV/JSON parsing** - Types may not be exact
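
Strictness does not have to be set field by field. As a minimal sketch, assuming Pydantic v2, the example below turns it on for a whole model with `ConfigDict(strict=True)` and for a single call with `model_validate(..., strict=True)`. The `Payment` and `LaxPayment` models are hypothetical.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Payment(BaseModel):
    # Hypothetical model: strict mode applies to every field
    model_config = ConfigDict(strict=True)

    amount_cents: int
    currency: str


class LaxPayment(BaseModel):
    # Same fields, default (lax) behavior
    amount_cents: int
    currency: str


payload = {"amount_cents": "1999", "currency": "USD"}

# Model-wide strict mode rejects the string "1999" for an int field
try:
    Payment(**payload)
    strict_model_rejected = False
except ValidationError:
    strict_model_rejected = True

# The lax model coerces the string by default...
lax = LaxPayment.model_validate(payload)

# ...but strictness can be requested for a single validation call
try:
    LaxPayment.model_validate(payload, strict=True)
    strict_call_rejected = False
except ValidationError:
    strict_call_rejected = True

print(strict_model_rejected, lax.amount_cents, strict_call_rejected)
```

Per-call strictness can be handy when the same model serves both a lenient input path (forms, environment variables) and a path where exact types are expected.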

---

## Question 3: How do you handle validation errors in Pydantic and extract useful error information?

### 💡 Interviewer's Intent:
The interviewer wants to know if you can properly catch, inspect, and communicate validation errors to users or logs. They're testing whether you understand the structure of ValidationError and how to provide helpful error messages in production applications, especially in API contexts where error responses need to be clear and actionable.

### Basic Error Handling

In [None]:
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    id: int
    username: str
    age: int

# Catch validation errors
try:
    user = User(id="abc", username=123, age="invalid")
except ValidationError as e:
    print("❌ ValidationError caught:")
    print(e)

### Accessing Error Details Programmatically

In [None]:
try:
    user = User(id="abc", username=123, age="invalid")
except ValidationError as e:
    # Get error count
    print(f"Found {e.error_count()} errors\n")
    
    # Get list of error dictionaries
    for i, error in enumerate(e.errors(), 1):
        print(f"Error {i}:")
        print(f"  Field: {error['loc']}")
        print(f"  Message: {error['msg']}")
        print(f"  Type: {error['type']}")
        print(f"  Input: {error['input']}")
        print()

### Getting JSON Error Response (Perfect for APIs)

In [None]:
import json

try:
    user = User(id="abc", username=123, age="invalid")
except ValidationError as e:
    error_json = e.json()
    print("JSON formatted errors (perfect for API responses):")
    print(json.dumps(json.loads(error_json), indent=2))

### Custom Error Messages with Field Validators

In [None]:
from pydantic import field_validator

class Product(BaseModel):
    name: str
    price: float
    
    @field_validator('price')
    @classmethod
    def price_must_be_positive(cls, v):
        if v <= 0:
            raise ValueError('Price must be greater than zero')
        return v

try:
    product = Product(name="Coffee", price=-5.99)
except ValidationError as e:
    for error in e.errors():
        print(f"❌ {error['loc'][0]}: {error['msg']}")

### Nested Model Errors

In [None]:
from pydantic import Field

class Address(BaseModel):
    street: str
    zip_code: str = Field(pattern=r'^\d{5}$')

class Person(BaseModel):
    name: str
    address: Address

try:
    person = Person(
        name="Alice",
        address={"street": "Main St", "zip_code": "invalid"}
    )
except ValidationError as e:
    for error in e.errors():
        print(f"Location: {error['loc']}")
        print(f"Message: {error['msg']}")

### 🔑 Key Insight

> ValidationError provides structured, detailed information about what went wrong and where. Use `e.errors()` for programmatic access, `e.json()` for API responses, and always catch ValidationError when parsing untrusted data. This makes debugging easier and provides users with actionable error messages.
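
One way to turn `e.errors()` into something an API client can consume is to group messages by a dotted field path. The sketch below shows one possible shape; `format_errors` is a hypothetical helper, not part of Pydantic, and the models are redefined here so the example runs on its own.

```python
from pydantic import BaseModel, Field, ValidationError


class Address(BaseModel):
    street: str
    zip_code: str = Field(pattern=r"^\d{5}$")


class Person(BaseModel):
    name: str
    age: int
    address: Address


def format_errors(exc: ValidationError) -> dict[str, list[str]]:
    """Group error messages by dotted field path (hypothetical helper)."""
    grouped: dict[str, list[str]] = {}
    for error in exc.errors():
        # loc is a tuple such as ("address", "zip_code"); join it into one key
        path = ".".join(str(part) for part in error["loc"])
        grouped.setdefault(path, []).append(error["msg"])
    return grouped


try:
    Person(name="Alice", age="old", address={"street": "Main St", "zip_code": "abc"})
except ValidationError as e:
    field_errors = format_errors(e)
    print(field_errors)
```

Keying by path keeps nested errors distinguishable, so a front end can attach each message to the right form field.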

---

## Question 4: What is the difference between `model_validate()` and creating an instance directly?

### 💡 Interviewer's Intent:
The interviewer is checking whether you understand the different ways to create Pydantic model instances and when to use each method. They want to know if you're aware of `model_validate()` for parsing dictionaries, `model_validate_json()` for JSON strings, and `model_construct()` for bypassing validation. This is important for performance optimization and API design.

### Direct Instantiation (Keyword Arguments)

In [None]:
from pydantic import BaseModel

class User(BaseModel):
    id: int
    username: str

# Direct instantiation with keyword arguments
user = User(id=1, username="alice")
print(f"Direct: {user}")

### model_validate() - Parse Dictionary

In [None]:
# When you have a dictionary
data = {"id": 2, "username": "bob"}
user = User.model_validate(data)
print(f"model_validate(): {user}")

# Equivalent to unpacking
user = User(**data)  # Same result
print(f"Unpacking (**data): {user}")

### model_validate_json() - Parse JSON String

In [None]:
# When you have a JSON string
json_string = '{"id": 3, "username": "charlie"}'
user = User.model_validate_json(json_string)
print(f"model_validate_json(): {user}")

# Usually more efficient than the manual two-step approach shown below:
import json
data = json.loads(json_string)
user = User.model_validate(data)
print(f"json.loads + model_validate: {user}")

### ⚠️ model_construct() - Skip Validation (Dangerous!)

In [None]:
# Create without validation - use with extreme caution
user = User.model_construct(id="not_an_int", username=12345)
print(f"model_construct(): {user}")
print(f"⚠️ id is: {user.id} (type: {type(user.id).__name__}) - no validation happened!")

### When to Use Each Method

In [None]:
from datetime import datetime

class Event(BaseModel):
    name: str
    timestamp: datetime

# 1. Direct instantiation - when you have individual values
event1 = Event(name="Meeting", timestamp=datetime.now())
print(f"1. Direct instantiation: {event1}")

# 2. model_validate() - when parsing dict from API/database
api_response = {"name": "Conference", "timestamp": "2024-06-15T10:00:00"}
event2 = Event.model_validate(api_response)
print(f"2. model_validate(): {event2}")

# 3. model_validate_json() - when parsing JSON string
json_data = '{"name":"Workshop","timestamp":"2024-07-01T14:00:00"}'
event3 = Event.model_validate_json(json_data)
print(f"3. model_validate_json(): {event3}")

# 4. model_construct() - when loading trusted data at scale
database_row = {"name": "Seminar", "timestamp": datetime(2024, 8, 1)}
event4 = Event.model_construct(**database_row)  # No validation
print(f"4. model_construct(): {event4}")

### Performance Comparison

In [None]:
import time

# Setup
data_dict = {"id": 1, "username": "test"}

# Method 1: Direct with unpacking
# (perf_counter is a monotonic, high-resolution clock suited to timing code)
start = time.perf_counter()
for _ in range(10000):
    user = User(**data_dict)
time1 = time.perf_counter() - start

# Method 2: model_validate()
start = time.perf_counter()
for _ in range(10000):
    user = User.model_validate(data_dict)
time2 = time.perf_counter() - start

# Method 3: model_construct() (no validation)
start = time.perf_counter()
for _ in range(10000):
    user = User.model_construct(**data_dict)
time3 = time.perf_counter() - start

print(f"Direct (**data): {time1:.3f}s")
print(f"model_validate(): {time2:.3f}s")
print(f"model_construct(): {time3:.3f}s (fastest but no validation!)")

### 🔑 Key Insight

> Use direct instantiation or `model_validate()` for external data that needs validation. Use `model_validate_json()` for efficient JSON parsing. Only use `model_construct()` when loading from trusted sources (like databases) where data is already validated and performance is critical. **Never use `model_construct()` with user input.**
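
The contrast between validating and constructing is easiest to see on the same input. In this small sketch (the `Account` model is hypothetical), `model_validate()` coerces the string ID while `model_construct()` stores it untouched, though it still fills in declared defaults.

```python
from pydantic import BaseModel


class Account(BaseModel):
    # Hypothetical model, used only for illustration
    id: int
    plan: str = "free"


raw = {"id": "42"}

validated = Account.model_validate(raw)       # runs validation and coercion
constructed = Account.model_construct(**raw)  # stores the value as given

print(type(validated.id).__name__)    # coerced to the annotated type
print(type(constructed.id).__name__)  # whatever type was passed in
print(constructed.plan)               # defaults are still applied
```

Any code downstream that assumes `id` is an `int` will misbehave with the constructed instance, which is why the source of the data has to be trusted.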

---

## Question 5: How do you work with nested models and lists in Pydantic?

### 💡 Interviewer's Intent:
The interviewer wants to assess your ability to model complex, hierarchical data structures. They're checking whether you understand how validation cascades through nested models, how to work with lists of models, and how to properly structure data for real-world scenarios like API responses with related entities or configuration files with nested sections.

### Basic Nested Model

In [None]:
from pydantic import BaseModel

class Address(BaseModel):
    street: str
    city: str
    zip_code: str

class Person(BaseModel):
    name: str
    age: int
    address: Address  # Nested model

# Create with nested dictionary
person = Person(
    name="Alice",
    age=30,
    address={
        "street": "123 Main St",
        "city": "Boston",
        "zip_code": "02101"
    }
)

# Access nested attributes
print(f"person.address.city: {person.address.city}")
print(f"person.address.zip_code: {person.address.zip_code}")

### Nested Validation Cascades

In [None]:
from pydantic import ValidationError

# Invalid nested data is caught
try:
    person = Person(
        name="Bob",
        age="invalid",  # Error at Person level
        address={
            "street": "456 Oak Ave",
            "city": 999,  # Error at Address level
            "zip_code": "10001"
        }
    )
except ValidationError as e:
    print("Errors at multiple levels:")
    for error in e.errors():
        print(f"  {error['loc']}: {error['msg']}")

### List of Primitives

In [None]:
class TodoList(BaseModel):
    title: str
    items: list[str]
    tags: list[str] = []

todo = TodoList(
    title="Shopping",
    items=["Milk", "Bread", "Eggs"],
    tags=["groceries", "urgent"]
)

print(f"items: {todo.items}")
print(f"len(items): {len(todo.items)}")

### List of Nested Models

In [None]:
class Item(BaseModel):
    name: str
    price: float
    quantity: int

class Order(BaseModel):
    order_id: int
    items: list[Item]  # List of Item models

# Create with list of dictionaries
order = Order(
    order_id=1001,
    items=[
        {"name": "Coffee", "price": 4.99, "quantity": 2},
        {"name": "Muffin", "price": 3.50, "quantity": 1}
    ]
)

# Access items
print("Order items:")
for item in order.items:
    print(f"  {item.name}: ${item.price} x {item.quantity}")

# Calculate total
total = sum(item.price * item.quantity for item in order.items)
print(f"\nTotal: ${total:.2f}")

### Complex Nested Structure

In [None]:
class Contact(BaseModel):
    email: str
    phone: str | None = None

class Address(BaseModel):
    street: str
    city: str
    country: str = "USA"

class Company(BaseModel):
    name: str
    employees: list[str]
    address: Address
    contact: Contact

class User(BaseModel):
    username: str
    company: Company

# Deep nesting with validation at all levels
user = User(
    username="alice",
    company={
        "name": "Tech Corp",
        "employees": ["Bob", "Charlie", "Diana"],
        "address": {
            "street": "100 Tech Blvd",
            "city": "San Francisco"
        },
        "contact": {
            "email": "info@techcorp.com",
            "phone": "+1-555-0100"
        }
    }
)

# Deep attribute access
print(f"user.company.address.city: {user.company.address.city}")
print(f"user.company.contact.email: {user.company.contact.email}")
print(f"Number of employees: {len(user.company.employees)}")

### Optional Nested Models

In [None]:
from typing import Optional

class Profile(BaseModel):
    bio: str
    website: str | None = None

class User(BaseModel):
    username: str
    profile: Optional[Profile] = None  # Nested model can be None

# User without profile
user1 = User(username="bob")
print(f"user1.profile: {user1.profile}")

# User with profile
user2 = User(
    username="alice",
    profile={"bio": "Software Engineer", "website": "alice.dev"}
)
print(f"user2.profile.bio: {user2.profile.bio}")

### Serialization of Nested Models

In [None]:
import json

class Address(BaseModel):
    street: str
    city: str

class Person(BaseModel):
    name: str
    address: Address

person = Person(
    name="Alice",
    address={"street": "123 Main St", "city": "NYC"}
)

# Convert to dict - maintains nested structure
data = person.model_dump()
print("model_dump():")
print(json.dumps(data, indent=2))

# Convert to JSON
json_str = person.model_dump_json()
print(f"\nmodel_dump_json(): {json_str}")

### 🌟 Real-World Example - API Response

In [None]:
from datetime import datetime
from typing import Optional

class Author(BaseModel):
    id: int
    name: str

class Comment(BaseModel):
    id: int
    text: str
    author: Author
    created_at: datetime

class Post(BaseModel):
    id: int
    title: str
    content: str
    author: Author
    comments: list[Comment]
    tags: list[str] = []

# Parse complex API response
api_response = {
    "id": 1,
    "title": "Learning Pydantic",
    "content": "Pydantic is awesome!",
    "author": {"id": 100, "name": "Alice"},
    "comments": [
        {
            "id": 1,
            "text": "Great post!",
            "author": {"id": 101, "name": "Bob"},
            "created_at": "2024-01-15T10:30:00"
        },
        {
            "id": 2,
            "text": "Very helpful!",
            "author": {"id": 102, "name": "Charlie"},
            "created_at": "2024-01-15T11:00:00"
        }
    ],
    "tags": ["python", "pydantic", "tutorial"]
}

post = Post.model_validate(api_response)

# Easy access to nested data
print(f"📝 Post by {post.author.name}")
print(f"📊 {len(post.comments)} comments")
print(f"🏷️ Tags: {', '.join(post.tags)}")
print("\n💬 Comments:")
for comment in post.comments:
    print(f"  - {comment.author.name}: {comment.text}")

### 🔑 Key Insight

> Nested models allow you to build complex, hierarchical data structures with validation at every level. Pydantic automatically validates all nested models and lists, making it perfect for parsing complex API responses, configuration files, or any structured data. Access is intuitive with dot notation, and serialization maintains the nested structure.
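
When validation fails inside a list of nested models, the error's `loc` tuple includes the list index, which pinpoints the offending element. A minimal sketch, with simplified `Item` and `Order` models redefined so it runs on its own:

```python
from pydantic import BaseModel, ValidationError


class Item(BaseModel):
    name: str
    quantity: int


class Order(BaseModel):
    order_id: int
    items: list[Item]


try:
    Order(
        order_id=1,
        items=[
            {"name": "Coffee", "quantity": 2},
            {"name": "Muffin", "quantity": "many"},  # invalid quantity
        ],
    )
except ValidationError as e:
    # Each loc is a path: field name, list index, nested field name
    error_locations = [error["loc"] for error in e.errors()]
    print(error_locations)
```

Only the second item is reported, so a caller can point at the exact list element instead of rejecting the whole order with a generic message.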

---

## 📚 Summary: Beginner to Intermediate Mastery

These five questions cover the essential foundations of Pydantic:

| # | Topic | Key Takeaway |
|---|-------|-------------|
| 1 | **Optional Fields & Defaults** | No default = required; `= value` for defaults; `Optional[Type]` for nullable |
| 2 | **Type Coercion** | Lax mode silently converts compatible input (e.g. `"123"` to `123`); use `strict=True` for critical fields |
| 3 | **Validation Errors** | Use `e.errors()` for programmatic access, `e.json()` for APIs |
| 4 | **Model Creation Methods** | `model_validate()` for dicts, never `model_construct()` with user input |
| 5 | **Nested Models** | Validation cascades through all levels; perfect for complex data |

Mastering these concepts will prepare you for most real-world Pydantic use cases, from simple API validation to complex data processing pipelines. 🚀