# Pydantic Learning Guide - Interactive Notebook

**Pydantic** is a powerful data validation library for Python that uses Python type annotations. It provides runtime type checking, automatic data parsing, clear error messages, and JSON Schema generation.

## What You'll Learn:
1. How to create and use Pydantic models
2. Data validation and type conversion
3. Custom validators for complex logic
4. Nested models for complex structures
5. Field configuration and constraints
6. Real-world examples and best practices

Let's dive in!

## 1. Import Required Libraries

Before working with Pydantic, we need to install and import it along with other useful libraries.

**Installation:**
```bash
pip install pydantic
pip install "pydantic[email]"  # Adds email-validator, needed for EmailStr
```

**Key Imports:**
- `BaseModel`: Base class for all Pydantic models
- `Field`: For field-level configuration
- `field_validator`: Decorator for field validation
- `model_validator`: Decorator for model-level validation
- `ValidationError`: Exception raised during validation

In [1]:
# Import Pydantic components
from pydantic import BaseModel, Field, field_validator, model_validator, ValidationError
from typing import Optional, List, Dict, Union
from datetime import datetime
from enum import Enum
import json

# Verify Pydantic is installed
import pydantic
print(f"Pydantic version: {pydantic.__version__}")
print("All required imports successful!")

## 2. Understanding Pydantic Models

A **Pydantic model** is a class that inherits from `BaseModel` and uses type annotations to define its fields.

### Key Concepts:
- **Type Annotations**: Define what type each field should be
- **Automatic Validation**: Pydantic validates data against the type annotations
- **Type Coercion**: Compatible types are automatically converted
- **Instance Creation**: Create model instances with keyword arguments
- **Data Access**: Access field values as attributes

### Basic Model Structure:
```python
class User(BaseModel):
    id: int           # Required integer field
    name: str         # Required string field
    email: str        # Required string field (format is not checked; EmailStr would validate it)
    age: int          # Required integer field
```

### Creating Instances:
```python
user = User(id=1, name="Aayush", email="aayush@example.com", age=25)
print(user.name)  # Access attribute
```

In [None]:
# Example 1: Basic Pydantic Model
class User(BaseModel):
    id: int
    name: str
    email: str
    age: int

# Create an instance
user1 = User(id=1, name="Aayush", email="aayush@example.com", age=25)
print("User 1:")
print(f"  Name: {user1.name}")
print(f"  Email: {user1.email}")
print(f"  Age: {user1.age}")

# Another instance
user2 = User(id=2, name="Aman", email="aman@example.com", age=28)
print(f"\nUser 2: {user2.name} - {user2.email}")

# Example 2: Optional Fields and Default Values
class Product(BaseModel):
    product_id: int
    name: str
    price: float
    seller: str = "Unknown"        # Default value
    description: Optional[str] = None  # Optional field

product = Product(product_id=1, name="Laptop", price=999.99)
print(f"\nProduct: {product.name}")
print(f"Seller: {product.seller}")
print(f"Description: {product.description}")

## 3. Data Validation with Pydantic

Pydantic automatically validates data during model instantiation:

### Validation Features:
- **Type Checking**: Ensures field values match declared types
- **Type Coercion**: Converts compatible types (e.g., string "25" → int 25)
- **Required Fields**: Fields without defaults must be provided
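- **Optional Fields**: Fields declared as `Optional[T] = None` can be omitted or set to `None`; in Pydantic v2, an `Optional[T]` field without a default must still be provided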
- **Validation Errors**: Clear error messages when validation fails

### Type Coercion Examples:
- String "999.99" → float 999.99
- Integer 25 → float 25.0
- String "5" → integer 5
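
The conversions listed above apply in Pydantic's default (lax) mode. The sketch below, assuming Pydantic v2 and a hypothetical `Order` model that is not part of this guide, shows where coercion stops: lossy conversions are rejected, and passing `strict=True` to `model_validate()` turns off string-to-number conversion for that call.

```python
from pydantic import BaseModel, ValidationError


class Order(BaseModel):  # hypothetical model, for illustration only
    quantity: int
    unit_price: float


# Default (lax) mode: compatible inputs are converted.
order = Order(quantity="5", unit_price=25)
print(order.quantity, order.unit_price)

# A float with a fractional part cannot become an int without losing data,
# so it is rejected even in lax mode.
try:
    Order(quantity=2.5, unit_price=1.0)
    lossy_error = None
except ValidationError as exc:
    lossy_error = exc.errors()[0]["type"]

# Strict mode disables string-to-number coercion for this call.
try:
    Order.model_validate({"quantity": "5", "unit_price": 25.0}, strict=True)
    strict_error = None
except ValidationError as exc:
    strict_error = exc.errors()[0]["type"]

print(lossy_error, strict_error)
```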

### When Validation Fails:
Pydantic raises a `ValidationError` with detailed information about what went wrong, including:
- Field name that failed
- Error message
- Error type
- Input value provided
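
As a minimal sketch of the error details listed above (assuming Pydantic v2, with a hypothetical `Measurement` model), each entry returned by `errors()` is a dictionary whose `loc`, `msg`, `type`, and `input` keys carry that information:

```python
from pydantic import BaseModel, ValidationError


class Measurement(BaseModel):  # hypothetical model, for illustration only
    sensor: str
    value: float


try:
    Measurement(sensor="s1", value="not_a_number")
except ValidationError as exc:
    error_count = exc.error_count()
    first_error = exc.errors()[0]
    print(f"Errors found: {error_count}")
    print(f"  loc:   {first_error['loc']}")    # path to the failing field
    print(f"  msg:   {first_error['msg']}")    # human-readable message
    print(f"  type:  {first_error['type']}")   # machine-readable error code
    print(f"  input: {first_error['input']}")  # the raw value that was rejected
```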

In [None]:
# Example 1: Automatic Type Conversion
class Item(BaseModel):
    name: str
    price: float
    quantity: int

# Pydantic automatically converts compatible types
item = Item(name="Mouse", price="29.99", quantity="5")
print("Automatic Type Conversion:")
print(f"  Price: {item.price} (type: {type(item.price).__name__})")
print(f"  Quantity: {item.quantity} (type: {type(item.quantity).__name__})")

# Example 2: Validation Error - Invalid Type
print("\n\nValidation Error Example:")
try:
    invalid_item = Item(name="Keyboard", price="not_a_number", quantity=10)
except ValidationError as e:
    print("Validation failed!")
    print(f"Errors: {e}")
    print("\nDetailed error information:")
    for error in e.errors():
        print(f"  Field: {error['loc'][0]}")
        print(f"  Message: {error['msg']}")
        print(f"  Type: {error['type']}")

# Example 3: Missing Required Field
print("\n\nMissing Required Field:")
try:
    incomplete = Item(name="Monitor")  # Missing price and quantity
except ValidationError as e:
    print("Error - Missing required fields:")
    for error in e.errors():
        print(f"  {error['loc'][0]}: {error['msg']}")

## 4. Custom Validators

**Custom validators** allow you to add complex validation logic beyond basic type checking.

### Using `@field_validator` Decorator:
- Applied to individual fields
- Can validate a single field or multiple fields
- Receives the value being validated
- Should return the validated value or raise `ValueError`

### Validator Modes:
- **`mode='before'`**: Runs before Pydantic's standard validation
- **`mode='after'`** (default): Runs after Pydantic's standard validation
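
To make the difference between the two modes concrete, here is a small sketch using a hypothetical `Article` model (not part of this guide). The `before` validator reshapes raw input so that type validation can succeed; the `after` validator works on a value that is already a `str`.

```python
from typing import List

from pydantic import BaseModel, field_validator


class Article(BaseModel):  # hypothetical model, for illustration only
    title: str
    tags: List[str]

    @field_validator("tags", mode="before")
    @classmethod
    def split_comma_separated(cls, v):
        # Runs on the raw input, before Pydantic checks that it is a list.
        if isinstance(v, str):
            return [part.strip() for part in v.split(",") if part.strip()]
        return v

    @field_validator("title")  # mode="after" is the default
    @classmethod
    def title_not_blank(cls, v: str) -> str:
        # Runs after type validation, so v is already a str here.
        if not v.strip():
            raise ValueError("Title must not be blank")
        return v.strip()


article = Article(title="  Pydantic Modes ", tags="python, validation, pydantic")
print(article.tags)
print(article.title)
```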

### Common Validation Patterns:
- String length and format checks
- Value range validation
- Email and URL validation
- Cross-field validation
- Custom business logic
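
Cross-field checks can also be written with `@model_validator`, which is imported at the top of this notebook. The following is a sketch under the assumption of Pydantic v2; the `Booking` model and its fields are hypothetical. With `mode="after"`, the validator receives the fully validated instance, so every field is available.

```python
from datetime import date

from pydantic import BaseModel, ValidationError, model_validator


class Booking(BaseModel):  # hypothetical model, for illustration only
    check_in: date
    check_out: date

    @model_validator(mode="after")
    def check_out_after_check_in(self):
        # All fields are validated at this point, so both dates can be compared.
        if self.check_out <= self.check_in:
            raise ValueError("check_out must be after check_in")
        return self


booking = Booking(check_in="2024-01-10", check_out="2024-01-12")
nights = (booking.check_out - booking.check_in).days
print(f"Nights: {nights}")

try:
    Booking(check_in="2024-01-10", check_out="2024-01-09")
    cross_field_error = None
except ValidationError as exc:
    cross_field_error = exc.errors()[0]
    print(cross_field_error["msg"])
```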

In [None]:
# Example 1: Field Validator for Single Field
class UserProfile(BaseModel):
    username: str
    age: int
    email: str
    
    @field_validator('username')
    @classmethod
    def username_must_be_alphanumeric(cls, v):
        if not v.replace('_', '').isalnum():
            raise ValueError('Username must be alphanumeric (underscore allowed)')
        return v.lower()
    
    @field_validator('age')
    @classmethod
    def age_must_be_adult(cls, v):
        if v < 18:
            raise ValueError('User must be at least 18 years old')
        if v > 120:
            raise ValueError('Please enter a valid age')
        return v

# Valid user
try:
    user = UserProfile(username="John_Doe", age=25, email="john@example.com")
    print("Valid User Created:")
    print(f"  Username: {user.username}")
    print(f"  Age: {user.age}")
except ValidationError as e:
    print(f"Error: {e}")

# Invalid user - age too young
print("\n\nInvalid Age Example:")
try:
    invalid_user = UserProfile(username="Jane_Doe", age=16, email="jane@example.com")
except ValidationError as e:
    print(f"Error: {e.errors()[0]['msg']}")

# Example 2: Multiple Field Validators
class PasswordReset(BaseModel):
    password: str
    confirm_password: str
    
    @field_validator('password')
    @classmethod
    def password_must_be_strong(cls, v):
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not any(c.isupper() for c in v):
            raise ValueError('Password must contain uppercase letter')
        if not any(c.isdigit() for c in v):
            raise ValueError('Password must contain digit')
        return v
    
    @field_validator('confirm_password')
    @classmethod
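    def passwords_match(cls, v, info):
        # info.data holds only fields validated earlier, so 'password' must be declared
        # before 'confirm_password'; if 'password' itself failed, this check is skipped.
        if 'password' in info.data and v != info.data['password']:
            raise ValueError('Passwords do not match')
        return v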

# Valid password
try:
    reset = PasswordReset(password="SecurePass123", confirm_password="SecurePass123")
    print("\n\nValid Password Set")
except ValidationError as e:
    print(f"Error: {e}")

# Mismatched passwords
print("\nMismatched Passwords Example:")
try:
    reset = PasswordReset(password="SecurePass123", confirm_password="DifferentPass123")
except ValidationError as e:
    print(f"Error: {e.errors()[0]['msg']}")

## 5. Working with Nested Models

**Nested models** allow you to structure complex data with multiple levels of hierarchy.

### Benefits of Nested Models:
- **Code Organization**: Separate concerns into different model classes
- **Reusability**: Use models in multiple places
- **Type Safety**: Full validation for nested data
- **Readability**: Clear data structure
- **Maintainability**: Easier to update and test

### Creating Nested Models:
1. Define inner model classes
2. Use the inner models as field types in outer models
3. Pydantic automatically validates nested data

### Accessing Nested Data:
```python
user.address.city  # Access nested field
user.skills[0].name  # Access nested list items
```

In [None]:
# Example 1: Basic Nested Models
class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class Person(BaseModel):
    name: str
    email: str
    address: Address

# Create instance with nested data
person = Person(
    name="Aayush Kumar",
    email="aayush@example.com",
    address={
        "street": "123 Main St",
        "city": "Mumbai",
        "state": "Maharashtra",
        "zip_code": "400001"
    }
)

print("Nested Model Example:")
print(f"Name: {person.name}")
print(f"City: {person.address.city}")
print(f"Full Address: {person.address.street}, {person.address.city}, {person.address.state}")

# Example 2: Complex Nested Structures with Lists
class Skill(BaseModel):
    name: str
    level: int  # 1-10
    years_experience: float

class Developer(BaseModel):
    name: str
    email: str
    skills: List[Skill]
    location: Address

# Create complex nested structure
developer = Developer(
    name="Aman Singh",
    email="aman@example.com",
    skills=[
        {"name": "Python", "level": 9, "years_experience": 5.5},
        {"name": "FastAPI", "level": 8, "years_experience": 3.0},
        {"name": "PostgreSQL", "level": 7, "years_experience": 4.0}
    ],
    location={
        "street": "456 Tech Road",
        "city": "Delhi",
        "state": "Delhi",
        "zip_code": "110001"
    }
)

print(f"\n\nDeveloper: {developer.name}")
print(f"Location: {developer.location.city}")
print(f"Skills ({len(developer.skills)}):")
for skill in developer.skills:
    print(f"  - {skill.name}: Level {skill.level} ({skill.years_experience} years)")

# Example 3: Convert nested model to dictionary
print("\n\nNested Model to Dictionary:")
developer_dict = developer.model_dump()
print(json.dumps(developer_dict, indent=2))

## 6. Field Configuration and Constraints

The **`Field()`** function allows detailed configuration of model fields:

### Field Configuration Options:
- **`default`**: Default value if not provided
- **`default_factory`**: Function to generate default value
- **`alias`**: Alternative name for the field
- **`title`**: Human-readable field title
- **`description`**: Field documentation
- **`examples`**: Example values
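
A brief sketch of `alias` and `default_factory` in use, assuming Pydantic v2; the `ApiUser` model and the `userName` key are made up for illustration. By default the alias is the name expected on input, while `model_dump()` uses the field name unless `by_alias=True` is passed.

```python
from typing import List

from pydantic import BaseModel, Field


class ApiUser(BaseModel):  # hypothetical model, for illustration only
    user_name: str = Field(alias="userName", description="Login name")
    roles: List[str] = Field(default_factory=list)  # fresh list per instance


# Input uses the alias; attribute access uses the field name.
api_user = ApiUser(userName="aayush")
print(api_user.user_name)

by_name = api_user.model_dump()
by_alias = api_user.model_dump(by_alias=True)
print(by_name)
print(by_alias)
```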

### Numeric Constraints:
- **`gt`**: Greater than
- **`ge`**: Greater than or equal
- **`lt`**: Less than
- **`le`**: Less than or equal

### String Constraints:
- **`min_length`**: Minimum string length
- **`max_length`**: Maximum string length
- **`pattern`**: Regular expression pattern
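- **`to_lower`**: Convert to lowercase (a `StringConstraints` option used with `Annotated`, not a `Field()` argument in Pydantic v2)
- **`to_upper`**: Convert to uppercase (`StringConstraints` option)
- **`strip_whitespace`**: Remove leading/trailing spaces (`StringConstraints` option)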

### Collection Constraints:
- **`min_length`**: Minimum items in list/dict
- **`max_length`**: Maximum items in list/dict

In [None]:
# Example 1: Numeric and String Constraints
class Product(BaseModel):
    product_id: int = Field(gt=0, description="Product ID must be positive")
    name: str = Field(min_length=3, max_length=50, description="Product name")
    price: float = Field(gt=0, le=999999.99, description="Price in USD")
    stock: int = Field(ge=0, le=10000, description="Stock quantity")
    description: Optional[str] = Field(default=None, max_length=500)

# Valid product
product = Product(
    product_id=1,
    name="Gaming Laptop",
    price=1299.99,
    stock=50,
    description="High-performance laptop for gaming"
)

print("Product with Constraints:")
print(f"  ID: {product.product_id}")
print(f"  Name: {product.name}")
print(f"  Price: ${product.price}")
print(f"  Stock: {product.stock} units")

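# Example 2: String Processing with StringConstraints
# In Pydantic v2, to_lower / to_upper / strip_whitespace are StringConstraints
# options applied through Annotated; Field() does not perform these transformations.
from typing import Annotated
from pydantic import StringConstraints

class UserAccount(BaseModel):
    username: Annotated[str, StringConstraints(min_length=3, max_length=20, to_lower=True)]
    email: Annotated[str, StringConstraints(to_lower=True)]
    code: Annotated[str, StringConstraints(to_upper=True, min_length=6)]
    full_name: Annotated[str, StringConstraints(strip_whitespace=True)]

user_account = UserAccount(
    username="John_Doe_123",
    email="JOHN@EXAMPLE.COM",
    code="abc123",
    full_name="  John Doe  "
)

print("\n\nString Processing with StringConstraints:")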
print(f"  Username: {user_account.username}")  # Lowercased
print(f"  Email: {user_account.email}")  # Lowercased
print(f"  Code: {user_account.code}")  # Uppercased
print(f"  Full Name: '{user_account.full_name}'")  # Stripped

# Example 3: Collection Constraints
class TeamProject(BaseModel):
    project_name: str
    tags: List[str] = Field(min_length=1, max_length=5, description="Project tags")
    team_members: List[str] = Field(min_length=2, description="Team members")

project = TeamProject(
    project_name="E-commerce Platform",
    tags=["python", "fastapi", "database"],
    team_members=["Aayush", "Aman", "Shiv"]
)

print("\n\nCollection Constraints:")
print(f"  Project: {project.project_name}")
print(f"  Tags ({len(project.tags)}): {', '.join(project.tags)}")
print(f"  Team Members ({len(project.team_members)}): {', '.join(project.team_members)}")

# Example 4: Default Values
class BlogPost(BaseModel):
    title: str
    content: str
    author: str
    published: bool = Field(default=False)
    status: str = Field(default="draft")
    created_at: datetime = Field(default_factory=datetime.now)

post = BlogPost(
    title="Pydantic Guide",
    content="Learn Pydantic...",
    author="Ut"
)

print("\n\nDefault Values:")
print(f"  Title: {post.title}")
print(f"  Published: {post.published}")
print(f"  Status: {post.status}")

## 7. Data Serialization

**Serialization** is the process of converting Pydantic models to different formats (dict, JSON).

### Common Serialization Methods:

1. **`model_dump()`**: Convert model to Python dictionary
   - `include`: Specify fields to include
   - `exclude`: Specify fields to exclude
   - `by_alias`: Use field aliases in output

2. **`model_dump_json()`**: Convert model to JSON string
   - `indent`: Pretty print JSON with indentation

3. **`model_validate()`**: Create model from dictionary

4. **`model_validate_json()`**: Create model from JSON string
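
The sketch below, assuming Pydantic v2 and a hypothetical `Event` model, shows how these methods fit together in a round trip. It also uses `model_dump(mode="json")`, which returns a dictionary containing only JSON-compatible values; this matters when a model holds types such as `datetime` that `json.dumps()` cannot handle directly.

```python
from datetime import datetime

from pydantic import BaseModel


class Event(BaseModel):  # hypothetical model, for illustration only
    name: str
    starts_at: datetime


event = Event(name="Launch", starts_at=datetime(2024, 5, 1, 9, 30))

python_dict = event.model_dump()             # keeps the datetime object
json_ready = event.model_dump(mode="json")   # datetime becomes an ISO 8601 string
json_text = event.model_dump_json()          # JSON string

# Round trip: rebuild equivalent models from the JSON string and from the dict.
restored_from_json = Event.model_validate_json(json_text)
restored_from_dict = Event.model_validate(python_dict)

print(type(python_dict["starts_at"]).__name__)
print(json_ready["starts_at"])
```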

### Use Cases:
- API responses
- Database storage
- Data export
- External integrations
- Caching

In [None]:
# Example 1: Converting to Dictionary
class Employee(BaseModel):
    employee_id: int
    name: str
    department: str
    salary: float
    email: str

employee = Employee(
    employee_id=101,
    name="Aayush Kumar",
    department="Engineering",
    salary=75000,
    email="aayush@company.com"
)

# Convert to dictionary
emp_dict = employee.model_dump()
print("Model to Dictionary:")
print(emp_dict)

# Exclude sensitive information
emp_public = employee.model_dump(exclude={'salary', 'email'})
print("\n\nDictionary (excluding salary and email):")
print(emp_public)

# Include only specific fields
emp_basic = employee.model_dump(include={'name', 'department'})
print("\n\nDictionary (only name and department):")
print(emp_basic)

# Example 2: JSON Serialization
print("\n\nJSON Serialization:")
json_string = employee.model_dump_json(indent=2)
print(json_string)

# Example 3: Creating Model from JSON
print("\n\nCreating Model from JSON:")
json_data = '''
{
    "employee_id": 102,
    "name": "Aman Singh",
    "department": "Marketing",
    "salary": 65000,
    "email": "aman@company.com"
}
'''

new_employee = Employee.model_validate_json(json_data)
print(f"Loaded: {new_employee.name} from {new_employee.department}")

# Example 4: Creating Model from Dictionary
print("\n\nCreating Model from Dictionary:")
emp_data = {
    "employee_id": 103,
    "name": "Shiv Patel",
    "department": "HR",
    "salary": 60000,
    "email": "shiv@company.com"
}

another_employee = Employee.model_validate(emp_data)
print(f"Created: {another_employee.name} - {another_employee.employee_id}")

## 8. Real-World Example: Complete Application

Let's build a complete example combining all concepts we've learned:
- Nested models
- Custom validators
- Field constraints
- Serialization

This example simulates a user registration and profile system.

In [None]:
# Real-World Example: User Management System

class ContactInfo(BaseModel):
    phone: str = Field(pattern=r'^\+?1?\d{9,15}$', description="Phone number")
    city: str = Field(min_length=2, max_length=50)
    country: str = Field(default="India")

class UserPreferences(BaseModel):
    notifications_enabled: bool = True
    theme: str = Field(default="light")
    language: str = Field(default="en")

class UserRegistration(BaseModel):
    username: str = Field(min_length=3, max_length=20)
    email: str
    password: str
    age: int = Field(ge=18, le=120)
    contact: ContactInfo
    preferences: UserPreferences = Field(default_factory=UserPreferences)
    created_at: datetime = Field(default_factory=datetime.now)
    
    @field_validator('username')
    @classmethod
    def validate_username(cls, v):
        if not v.replace('_', '').isalnum():
            raise ValueError('Username must be alphanumeric (underscore allowed)')
        return v.lower()
    
    @field_validator('password')
    @classmethod
    def validate_password(cls, v):
        if len(v) < 8:
            raise ValueError('Password must be at least 8 characters')
        if not any(c.isupper() for c in v):
            raise ValueError('Password must contain uppercase')
        if not any(c.isdigit() for c in v):
            raise ValueError('Password must contain digit')
        return v

# Create a valid user
try:
    user_data = {
        "username": "John_Dev",
        "email": "john@example.com",
        "password": "SecurePass123",
        "age": 28,
        "contact": {
            "phone": "+919876543210",
            "city": "Mumbai"
        }
    }
    
    user = UserRegistration(**user_data)
    
    print("✓ User Registered Successfully!")
    print(f"\nUser Details:")
    print(f"  Username: {user.username}")
    print(f"  Email: {user.email}")
    print(f"  Age: {user.age}")
    print(f"  City: {user.contact.city}")
    print(f"  Country: {user.contact.country}")
    print(f"  Theme: {user.preferences.theme}")
    print(f"  Created: {user.created_at.strftime('%Y-%m-%d %H:%M:%S')}")
    
    # Serialize to JSON
    print("\n\nUser Data (JSON):")
    print(user.model_dump_json(indent=2))
    
except ValidationError as e:
    print("✗ Registration Failed!")
    for error in e.errors():
        print(f"  {error['loc'][0]}: {error['msg']}")

# Try invalid registration
print("\n\n" + "="*50)
print("Invalid Registration Attempt:")
print("="*50)

try:
    invalid_user = UserRegistration(
        username="ab",  # Too short
        email="john@example.com",
        password="weak",  # Too weak
        age=16,  # Too young
        contact={
            "phone": "123",  # Invalid format
            "city": "LA"
        }
    )
except ValidationError as e:
    print("✗ Multiple validation errors detected:")
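    for error in e.errors():
        # Join the full location path so nested fields appear as e.g. "contact.phone"
        field = ".".join(str(part) for part in error['loc'])
        print(f"  • {field}: {error['msg']}")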

## 9. Best Practices and Tips

### ✓ Do's:
1. **Always use type hints** - Be explicit about field types
2. **Use Field() for configuration** - Add descriptions, constraints, and examples
3. **Validate early** - Catch errors at the boundary of your application
4. **Document your models** - Use docstrings and descriptions
5. **Test validators** - Ensure validation logic works correctly
6. **Use nested models** - Organize complex data structures
7. **Handle ValidationError** - Provide meaningful error messages to users

### ✗ Don'ts:
1. **Avoid bare mutable defaults** - Prefer `default_factory` over `default=[]`; Pydantic copies mutable defaults per instance, but `default_factory` makes the intent explicit
2. **Don't validate too much** - Keep validators simple and focused
3. **Don't ignore type hints** - They improve IDE support and readability
4. **Avoid circular references** - Use string forward references (with `model_rebuild()` where required) or restructure if needed
5. **Don't mix business logic in validators** - Keep them focused on validation

### Common Patterns:
- Use `Optional[T]` for nullable fields
- Use `List[T]` for collections
- Use `Union[T1, T2]` for multiple types
- Use `Literal["a", "b"]` for fixed choices
- Use `ConfigDict` for model-level configuration
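
These patterns can be combined in a single model. The following is a sketch only, assuming Pydantic v2; the `Ticket` model and its fields are hypothetical, and `extra="forbid"` is just one example of a `ConfigDict` setting.

```python
from typing import Literal, Optional, Union

from pydantic import BaseModel, ConfigDict, ValidationError


class Ticket(BaseModel):  # hypothetical model, for illustration only
    model_config = ConfigDict(extra="forbid")  # reject unknown fields

    ticket_id: Union[int, str]                 # either type is accepted
    priority: Literal["low", "medium", "high"] = "medium"
    assignee: Optional[str] = None             # nullable, may be omitted


ticket = Ticket(ticket_id="T-42", priority="high")
print(ticket)

# A value outside the Literal choices is rejected.
try:
    Ticket(ticket_id=1, priority="urgent")
    literal_error_type = None
except ValidationError as exc:
    literal_error_type = exc.errors()[0]["type"]

# An unknown field is rejected because of extra="forbid".
try:
    Ticket(ticket_id=1, colour="red")
    extra_error_type = None
except ValidationError as exc:
    extra_error_type = exc.errors()[0]["type"]

print(literal_error_type, extra_error_type)
```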

### Performance Tips:
- Define model classes once and reuse them instead of creating new classes repeatedly
- Leave `validate_assignment` at its default (`False`) unless you need values re-validated on assignment
- Consider using `frozen=True` for immutable models
- Profile your validators if they're complex
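
As a rough illustration of the two configuration flags mentioned above (assuming Pydantic v2; `AppSettings` and `Counter` are hypothetical models), `frozen=True` blocks attribute assignment, while `validate_assignment=True` re-runs validation each time a field is set, which is the extra work the tip suggests avoiding when it is not needed.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class AppSettings(BaseModel):  # hypothetical immutable model
    model_config = ConfigDict(frozen=True)
    retries: int = 3


class Counter(BaseModel):  # hypothetical model that validates on assignment
    model_config = ConfigDict(validate_assignment=True)
    count: int = 0


settings = AppSettings()
try:
    settings.retries = 5  # not allowed on a frozen model
    frozen_error_type = None
except ValidationError as exc:
    frozen_error_type = exc.errors()[0]["type"]

counter = Counter()
counter.count = "7"  # validated (and coerced) on assignment
print(frozen_error_type, counter.count)
```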

In [None]:
# Best Practices Examples

# 1. Use Field() for configuration
class WellDocumentedModel(BaseModel):
    user_id: int = Field(
        description="Unique user identifier",
        gt=0,
        examples=[1, 2, 3]
    )
    username: str = Field(
        description="Username for login",
        min_length=3,
        max_length=20,
        examples=["john_dev", "jane_admin"]
    )
    email: str = Field(
        description="User email address"
    )

print("Well-Documented Model Example:")
print(WellDocumentedModel.model_json_schema()['properties']['user_id'])

# 2. Handle ValidationError gracefully
def safe_create_user(data: dict) -> Optional[UserRegistration]:
    """Safely create user with error handling"""
    try:
        return UserRegistration(**data)
    except ValidationError as e:
        print("Validation errors:")
        for error in e.errors():
            field = error['loc'][0]
            message = error['msg']
            print(f"  {field}: {message}")
        return None

# Test safe creation
print("\n\nSafe User Creation:")
result = safe_create_user({
    "username": "valid_user",
    "email": "user@example.com",
    "password": "ValidPass123",
    "age": 25,
    "contact": {"phone": "+919999999999", "city": "Bangalore"}
})

if result:
    print(f"✓ User created: {result.username}")
else:
    print("✗ User creation failed")

# 3. Use default_factory for mutable defaults
from typing import List

class ProjectTeam(BaseModel):
    name: str
    members: List[str] = Field(default_factory=list)  # ✓ Correct
    tags: List[str] = Field(default_factory=list)     # ✓ Correct

team1 = ProjectTeam(name="Team A")
team1.members.append("Aayush")

team2 = ProjectTeam(name="Team B")
team2.members.append("Aman")

print("\n\nMutable Default Example:")
print(f"Team 1 members: {team1.members}")
print(f"Team 2 members: {team2.members}")
print("✓ Teams have separate member lists")

print("\n\n" + "="*50)
print("Pydantic Learning Complete!")
print("="*50)