---
## 2. Installation and Setup

### Installing Pydantic

```bash
pip install pydantic
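pip install "pydantic[email]"  # With email validation (quoted so shells such as zsh do not expand the brackets)
pip install pydantic-settings  # Settings management with .env support (a separate package in v2)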
```

### Version Info
This tutorial covers **Pydantic v2**, a major rewrite of v1 whose validation core (`pydantic-core`) is implemented in Rust.

In [212]:
# Check Pydantic version
import pydantic
print(f"Pydantic Version: {pydantic.__version__}")

# Import key components
from pydantic import BaseModel, Field, field_validator  # `validator` is the deprecated v1 decorator
from typing import List, Optional, Dict, Any
from datetime import datetime
import json

Pydantic Version: 2.10.6


---
## 3. Basic Models

### Creating Basic Pydantic Models

In [213]:
# Simple User Model
class User(BaseModel):
    id: int
    name: str
    email: str
    age: int

# Creating instances
user1 = User(id=1, name="Alice", email="alice@example.com", age=25)
print(f"User: {user1}")
print(f"User Name: {user1.name}")
print(f"User Dict: {user1.model_dump()}")
print(f"User JSON: {user1.model_dump_json()}")

User: id=1 name='Alice' email='alice@example.com' age=25
User Name: Alice
User Dict: {'id': 1, 'name': 'Alice', 'email': 'alice@example.com', 'age': 25}
User JSON: {"id":1,"name":"Alice","email":"alice@example.com","age":25}


In [214]:
# Model with Optional and Default values
class Product(BaseModel):
    name: str
    price: float
    description: Optional[str] = None  # Optional field
    in_stock: bool = True  # Default value
    quantity: int = 0

# With optional fields
product1 = Product(name="Laptop", price=999.99)
print(f"Product: {product1}")

# With all fields
product2 = Product(
    name="Mouse",
    price=29.99,
    description="Wireless mouse",
    in_stock=True,
    quantity=50
)
print(f"\nProduct 2: {product2}")

Product: name='Laptop' price=999.99 description=None in_stock=True quantity=0

Product 2: name='Mouse' price=29.99 description='Wireless mouse' in_stock=True quantity=50


In [215]:
# Type Conversion - Pydantic automatically converts types
user_data = {
    "id": "1",  # String instead of int
    "name": "Bob",
    "email": "bob@example.com",
    "age": "30"  # String instead of int
}

user2 = User(**user_data)
print(f"Converted User: {user2}")
print(f"ID type: {type(user2.id)}")
print(f"Age type: {type(user2.age)}")

Converted User: id=1 name='Bob' email='bob@example.com' age=30
ID type: <class 'int'>
Age type: <class 'int'>
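
The coercion shown above is Pydantic's default "lax" behavior. As a minimal sketch of the alternative, the models below (`LaxUser` and `StrictUser` are hypothetical names, not part of the tutorial) compare it with `strict=True`, which rejects numeric strings for `int` fields instead of converting them.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class LaxUser(BaseModel):
    # Default ("lax") mode: numeric strings are coerced to int
    id: int
    age: int


class StrictUser(BaseModel):
    # strict=True disables that coercion for every field on the model
    model_config = ConfigDict(strict=True)

    id: int
    age: int


lax_user = LaxUser(id="1", age="30")
print(f"Lax: {lax_user}")

try:
    StrictUser(id="1", age="30")
    strict_error_count = 0
except ValidationError as exc:
    # One error is reported per rejected field
    strict_error_count = exc.error_count()
    print(f"Strict mode rejected {strict_error_count} fields")
```

Strict mode is worth considering when silently accepting `"30"` for an integer would hide a bug in the caller.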


---
## 4. Field Validation

### Using Pydantic Field with Constraints

In [216]:
from pydantic import Field, ValidationError

# Model with Field constraints
class Student(BaseModel):
    id: int = Field(..., gt=0, description="Student ID must be positive")
    name: str = Field(..., min_length=1, max_length=100)
    email: str = Field(..., pattern=r"^[^@]+@[^@]+\.[^@]+$")
    gpa: float = Field(..., ge=0.0, le=4.0, description="GPA between 0 and 4")
    age: int = Field(..., ge=15, le=100)

# Valid student
student = Student(
    id=1,
    name="Charlie",
    email="charlie@example.com",
    gpa=3.8,
    age=20
)
print(f"Valid Student: {student}")

# Invalid student - will raise ValidationError
try:
    invalid_student = Student(
        id=-1,  # Negative ID
        name="",  # Empty name
        email="invalid-email",  # Invalid email
        gpa=5.0,  # Out of range
        age=200  # Out of range
    )
except ValidationError as e:
    # Catch ValidationError specifically so unrelated bugs are not swallowed
    print(f"\nValidation Error:\n{e}")

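
When a constraint fails, the raised `ValidationError` carries structured details in addition to its printed message. The sketch below uses a hypothetical `Applicant` model and a hypothetical `collect_errors` helper (neither is part of the tutorial) to show how `errors()` might be turned into a field-to-error-type mapping for user feedback.

```python
from pydantic import BaseModel, Field, ValidationError


class Applicant(BaseModel):  # hypothetical model mirroring two Student fields
    id: int = Field(..., gt=0)
    gpa: float = Field(..., ge=0.0, le=4.0)


def collect_errors(payload: dict) -> dict:
    """Return {field_name: error_type} for a payload, or {} if it is valid."""
    try:
        Applicant(**payload)
    except ValidationError as exc:
        # Each entry is a dict with keys such as 'loc', 'type', 'msg', 'input'
        return {err["loc"][0]: err["type"] for err in exc.errors()}
    return {}


problems = collect_errors({"id": -1, "gpa": 5.0})
print(problems)
```

The `type` strings are stable identifiers, so they tend to be a safer basis for program logic than the human-readable `msg` text.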

In [None]:
# Constrained types
# typing.Annotated only exists on Python 3.9+; typing_extensions (a Pydantic
# dependency) provides it on Python 3.8 as well
from typing_extensions import Annotated

class Account(BaseModel):
    username: Annotated[str, Field(min_length=3, max_length=20)]
    password: Annotated[str, Field(min_length=8)]
    balance: Annotated[float, Field(ge=0.0)]

account = Account(
    username="johndoe",
    password="secure_password_123",
    balance=1000.50
)
print(f"Account: {account}")


---
## 5. Advanced Field Features

### Aliases, Discriminators, and Complex Field Options

In [None]:
# Field with aliases
from pydantic import ConfigDict

class APIUser(BaseModel):
    # v2 replaces the nested `class Config` with a `model_config` attribute
    model_config = ConfigDict(populate_by_name=True)  # Accept both field name and alias

    user_id: int = Field(..., alias="userId")
    full_name: str = Field(..., alias="fullName")
    email_address: str = Field(..., alias="emailAddress")

# Using alias
api_data = {"userId": 1, "fullName": "Alice Smith", "emailAddress": "alice@example.com"}
user = APIUser(**api_data)
print(f"User: {user}")
print(f"User dict with alias: {user.model_dump(by_alias=True)}")
print(f"User dict without alias: {user.model_dump()}")

User: user_id=1 full_name='Alice Smith' email_address='alice@example.com'
User dict with alias: {'userId': 1, 'fullName': 'Alice Smith', 'emailAddress': 'alice@example.com'}
User dict without alias: {'user_id': 1, 'full_name': 'Alice Smith', 'email_address': 'alice@example.com'}
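
Writing an alias on every field gets repetitive when an API uses camelCase throughout. As a sketch, assuming the whole model follows that convention, `alias_generator` can derive the aliases automatically; `CamelUser` is a hypothetical model used only for illustration.

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class CamelUser(BaseModel):  # hypothetical model for illustration
    model_config = ConfigDict(
        alias_generator=to_camel,  # user_id -> userId, full_name -> fullName
        populate_by_name=True,     # still accept the snake_case field names
    )

    user_id: int
    full_name: str


from_api = CamelUser(userId=1, fullName="Alice Smith")
from_code = CamelUser(user_id=2, full_name="Bob Jones")

camel_dump = from_api.model_dump(by_alias=True)
print(camel_dump)
```

An explicit `Field(alias=...)` is still the better fit for the occasional field whose external name does not follow the convention.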


In [None]:
# Field with title, description, and examples
class Article(BaseModel):
    title: str = Field(
        ...,
        title="Article Title",
        description="The title of the article",
        min_length=5,
        max_length=200
    )
    content: str = Field(
        ...,
        title="Article Content",
        description="The main content of the article",
        min_length=50
    )
    tags: List[str] = Field(
        default=[],
        title="Tags",
        description="Article tags for categorization",
        max_length=10
    )
    published: bool = Field(default=False, title="Published Status")

article = Article(
    title="Learning Pydantic",
    content="Pydantic is a powerful validation library..." * 5,
    tags=["python", "validation", "pydantic"]
)
print(f"Article: {article}")
print(f"\nSchema: {Article.model_json_schema()}")

Article: title='Learning Pydantic' content='Pydantic is a powerful validation library...Pydantic is a powerful validation library...Pydantic is a powerful validation library...Pydantic is a powerful validation library...Pydantic is a powerful validation library...' tags=['python', 'validation', 'pydantic'] published=False

Schema: {'properties': {'title': {'description': 'The title of the article', 'maxLength': 200, 'minLength': 5, 'title': 'Article Title', 'type': 'string'}, 'content': {'description': 'The main content of the article', 'minLength': 50, 'title': 'Article Content', 'type': 'string'}, 'tags': {'default': [], 'description': 'Article tags for categorization', 'items': {'type': 'string'}, 'maxItems': 10, 'title': 'Tags', 'type': 'array'}, 'published': {'default': False, 'title': 'Published Status', 'type': 'boolean'}}, 'required': ['title', 'content'], 'title': 'Article', 'type': 'object'}


---
## 6. Custom Validators

### Implementing Custom Validation Logic

In [None]:
from pydantic import field_validator, model_validator

# Field validators
class Person(BaseModel):
    first_name: str
    last_name: str
    email: str
    age: int
    
    @field_validator('first_name', 'last_name')
    @classmethod
    def names_must_be_title_case(cls, v):
        if not v.istitle():
            raise ValueError('Names must be in title case')
        return v
    
    @field_validator('email')
    @classmethod
    def email_must_be_valid(cls, v):
        if '@' not in v or '.' not in v:
            raise ValueError('Invalid email format')
        return v.lower()
    
    @field_validator('age')
    @classmethod
    def age_must_be_reasonable(cls, v):
        if v < 0 or v > 150:
            raise ValueError('Age must be between 0 and 150')
        return v

# Valid person
person = Person(
    first_name="John",
    last_name="Doe",
    email="JOHN@EXAMPLE.COM",
    age=30
)
print(f"Valid Person: {person}")

# Invalid person
try:
    invalid = Person(
        first_name="john",  # Not title case
        last_name="Doe",
        email="invalid",
        age=30
    )
except Exception as e:
    print(f"\nValidation Error: {e}")
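
The validators above run after Pydantic has already converted each value to its declared type. As a sketch of the other option, a `mode="before"` validator sees the raw input first, which helps when the input needs reshaping before type validation can succeed. `TaggedPost` and its comma-splitting rule are assumptions made for this illustration.

```python
from typing import List

from pydantic import BaseModel, field_validator


class TaggedPost(BaseModel):  # hypothetical model for illustration
    tags: List[str]

    @field_validator("tags", mode="before")
    @classmethod
    def split_comma_separated(cls, v):
        # Runs on the raw input, before List[str] validation
        if isinstance(v, str):
            return [part.strip() for part in v.split(",") if part.strip()]
        return v


from_string = TaggedPost(tags="python, validation,pydantic")
from_list = TaggedPost(tags=["python", "pydantic"])
print(from_string.tags)
```

Without the `before` validator, passing a plain string for `tags` would fail because a string is not accepted as a list.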

In [None]:
# Model validators (cross-field validation)
class DateRange(BaseModel):
    start_date: datetime
    end_date: datetime
    
    @model_validator(mode='after')
    def validate_date_range(self):
        if self.start_date > self.end_date:
            raise ValueError('start_date must be before end_date')
        return self

# Valid range
valid_range = DateRange(
    start_date=datetime(2025, 1, 1),
    end_date=datetime(2025, 12, 31)
)
print(f"Valid Range: {valid_range}")

# Invalid range
try:
    invalid_range = DateRange(
        start_date=datetime(2025, 12, 31),
        end_date=datetime(2025, 1, 1)
    )
except Exception as e:
    print(f"\nValidation Error: {e}")

---
## 7. Model Configuration

### Advanced Configuration Options

In [None]:
from pydantic import ConfigDict, field_serializer

# Model with advanced configuration
class ConfiguredModel(BaseModel):
    model_config = ConfigDict(
        str_strip_whitespace=True,  # Strip whitespace from strings
        validate_default=True,  # Validate default values
        validate_assignment=True,  # Validate when setting attributes
        frozen=False,  # Make model mutable (True for immutable)
        use_enum_values=True,  # Store enum member values instead of Enum instances
        populate_by_name=True,  # Accept field name or alias
        json_schema_extra={"example": {"name": "John"}}
    )
    
    name: str
    email: str
    age: int = Field(default=0)

# Whitespace is automatically stripped
model = ConfiguredModel(
    name="  John Doe  ",
    email="  john@example.com  ",
    age=30
)
print(f"Stripped whitespace: {model}")

# Validate on assignment
try:
    model.age = "invalid"  # Will raise validation error
except Exception as e:
    print(f"\nValidation on assignment: {e}")

In [None]:
# Frozen (immutable) models
class ImmutableUser(BaseModel):
    model_config = ConfigDict(frozen=True)
    
    id: int
    name: str

immutable_user = ImmutableUser(id=1, name="Alice")
print(f"Immutable User: {immutable_user}")

try:
    immutable_user.name = "Bob"  # Will raise error
except Exception as e:
    print(f"\nCannot modify frozen model: {e}")
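
Two practical consequences of `frozen=True` are that instances become hashable and that changes have to go through a copy. The following is a minimal sketch using a hypothetical `Point` model; it assumes the fields themselves hold hashable values.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Point(BaseModel):  # hypothetical model for illustration
    model_config = ConfigDict(frozen=True)

    x: int
    y: int


origin = Point(x=0, y=0)

# Frozen models define __hash__, so they can be set members or dict keys
visited = {origin}

# model_copy(update=...) returns a new instance; note that it does not
# re-validate the updated values
moved = origin.model_copy(update={"x": 5})

try:
    origin.x = 1
    assignment_blocked = False
except ValidationError:
    assignment_blocked = True

print(f"origin={origin}, moved={moved}, blocked={assignment_blocked}")
```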

---
## 8. Serialization and Deserialization

### Converting Models to/from Various Formats

In [None]:
class Order(BaseModel):
    order_id: int
    customer_name: str
    items: List[str]
    total_price: float
    created_at: datetime

order = Order(
    order_id=123,
    customer_name="Alice",
    items=["Laptop", "Mouse", "Keyboard"],
    total_price=1100.00,
    created_at=datetime.now()
)

# Serialization options
print("1. model_dump() - Python dict:")
print(order.model_dump())

print("\n2. model_dump_json() - JSON string:")
print(order.model_dump_json())

print("\n3. model_dump_json(indent=2) - Pretty JSON:")
print(order.model_dump_json(indent=2))

print("\n4. model_dump(exclude_unset=True) - Only set fields:")
print(order.model_dump(exclude_unset=True))
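
In the cell above every field is set explicitly, so `exclude_unset=True` returns the same dictionary as a plain `model_dump()`. The sketch below uses a hypothetical `Shipment` model with one defaulted field to make the difference visible, and also shows `mode="json"` and `include`.

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel


class Shipment(BaseModel):  # hypothetical model for illustration
    shipment_id: int
    items: List[str]
    shipped_at: datetime
    note: Optional[str] = None  # left unset below


shipment = Shipment(
    shipment_id=7,
    items=["Laptop"],
    shipped_at=datetime(2025, 1, 15, 10, 30),
)

python_dict = shipment.model_dump()             # keeps datetime objects
json_ready = shipment.model_dump(mode="json")   # converts to JSON-safe types
only_id = shipment.model_dump(include={"shipment_id"})
without_unset = shipment.model_dump(exclude_unset=True)

print(json_ready)
print(without_unset)
```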

In [None]:
# Deserialization
json_data = '''
{
    "order_id": 456,
    "customer_name": "Bob",
    "items": ["Monitor", "Desk"],
    "total_price": 500.00,
    "created_at": "2025-01-15T10:30:00"
}
'''

# From JSON string
order_from_json = Order.model_validate_json(json_data)
print(f"Order from JSON: {order_from_json}")

# From dictionary
dict_data = {
    "order_id": 789,
    "customer_name": "Charlie",
    "items": ["Headphones"],
    "total_price": 150.00,
    "created_at": "2025-01-20T15:45:00"
}
order_from_dict = Order.model_validate(dict_data)
print(f"\nOrder from dict: {order_from_dict}")
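
`model_validate_json` expects a JSON object for a single model. When the payload is a JSON array, one option is `TypeAdapter`, sketched here with a hypothetical `LineItem` model and made-up sample data.

```python
from typing import List

from pydantic import BaseModel, TypeAdapter


class LineItem(BaseModel):  # hypothetical model for illustration
    sku: str
    quantity: int


# Build the adapter once and reuse it for every payload
line_items_adapter = TypeAdapter(List[LineItem])

raw = '[{"sku": "A-1", "quantity": 2}, {"sku": "B-7", "quantity": "1"}]'
items = line_items_adapter.validate_json(raw)

# dump_json returns bytes, not str
round_trip = line_items_adapter.dump_json(items)

print(items)
print(round_trip)
```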

In [None]:
# Custom serialization
from decimal import Decimal

# Named PricedProduct so it does not shadow the earlier Product model,
# which later cells still construct without `in_stock`
class PricedProduct(BaseModel):
    name: str
    price: Decimal
    in_stock: bool

    @field_serializer('price')
    def serialize_price(self, v: Decimal) -> str:
        return f"${float(v):.2f}"

    @field_serializer('in_stock')
    def serialize_stock(self, v: bool) -> str:
        return "Available" if v else "Out of Stock"

product = PricedProduct(name="Laptop", price=Decimal("999.99"), in_stock=True)
print(f"Product: {product}")
print(f"\nSerialized: {product.model_dump()}")

---
## 9. Nested Models

### Working with Complex Hierarchical Data

In [None]:
# Nested models
class Address(BaseModel):
    street: str
    city: str
    state: str
    zip_code: str

class Contact(BaseModel):
    email: str
    phone: str

class Employee(BaseModel):
    id: int
    name: str
    address: Address  # Nested model
    contact: Contact  # Nested model
    salary: float

# Creating nested model
employee = Employee(
    id=1,
    name="Alice",
    address=Address(
        street="123 Main St",
        city="New York",
        state="NY",
        zip_code="10001"
    ),
    contact=Contact(
        email="alice@example.com",
        phone="555-1234"
    ),
    salary=80000.00
)

print(f"Employee: {employee}")
print(f"\nEmployee as JSON:\n{employee.model_dump_json(indent=2)}")

In [None]:
# Nested models from JSON
employee_json = '''
{
    "id": 2,
    "name": "Bob",
    "address": {
        "street": "456 Oak Ave",
        "city": "Boston",
        "state": "MA",
        "zip_code": "02101"
    },
    "contact": {
        "email": "bob@example.com",
        "phone": "555-5678"
    },
    "salary": 90000.00
}
'''

employee2 = Employee.model_validate_json(employee_json)
print(f"Employee from JSON: {employee2}")
print(f"City: {employee2.address.city}")
print(f"Email: {employee2.contact.email}")

In [None]:
# List of nested models
class Company(BaseModel):
    name: str
    employees: List[Employee]

company = Company(
    name="Tech Corp",
    employees=[
        Employee(
            id=1,
            name="Alice",
            address=Address(street="123 Main", city="NYC", state="NY", zip_code="10001"),
            contact=Contact(email="alice@tech.com", phone="555-1111"),
            salary=80000
        ),
        Employee(
            id=2,
            name="Bob",
            address=Address(street="456 Oak", city="Boston", state="MA", zip_code="02101"),
            contact=Contact(email="bob@tech.com", phone="555-2222"),
            salary=85000
        )
    ]
)

print(f"Company: {company.name}")
print(f"Number of employees: {len(company.employees)}")
for emp in company.employees:
    print(f"  - {emp.name}: {emp.contact.email}")
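
When nested data is invalid, each error's `loc` tuple records the path to the offending value, including list indexes. A minimal sketch with hypothetical `Office` and `Team` models:

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Office(BaseModel):  # hypothetical models for illustration
    city: str
    zip_code: str


class Team(BaseModel):
    name: str
    offices: List[Office]


payload = {
    "name": "Platform",
    "offices": [
        {"city": "NYC", "zip_code": "10001"},
        {"city": "Boston"},  # zip_code is missing
    ],
}

try:
    Team.model_validate(payload)
    error_locations = []
except ValidationError as exc:
    # loc is a tuple path: field name, list index, nested field name
    error_locations = [(err["loc"], err["type"]) for err in exc.errors()]

print(error_locations)
```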

---
## 10. Root Models and Generic Models

### Advanced Model Types

In [None]:
from pydantic import RootModel
from typing import TypeVar, Generic

# Root Model - Model with single root value
class Numbers(RootModel[List[int]]):
    """A list of numbers"""
    root: List[int]
    
    def get_total(self) -> int:
        return sum(self.root)
    
    def get_average(self) -> float:
        return self.get_total() / len(self.root) if self.root else 0

numbers = Numbers([1, 2, 3, 4, 5])
print(f"Numbers: {numbers}")
print(f"Total: {numbers.get_total()}")
print(f"Average: {numbers.get_average()}")
print(f"As list: {numbers.root}")
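
A root model serializes to its root value directly, with no wrapping key. The sketch below also adds `__iter__` and `__len__` so the model can be used like the list it wraps; `Scores` is a hypothetical example.

```python
from typing import List

from pydantic import RootModel


class Scores(RootModel[List[int]]):  # hypothetical model for illustration
    def __iter__(self):
        # Delegate iteration to the wrapped list
        return iter(self.root)

    def __len__(self):
        return len(self.root)


scores = Scores.model_validate_json("[90, 85, 77]")

as_python = scores.model_dump()     # a plain list, not {"root": [...]}
as_json = scores.model_dump_json()  # a JSON array

print(as_python, as_json, len(scores))
```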

In [None]:
# Generic Models
T = TypeVar('T')

class Response(BaseModel, Generic[T]):
    status: str
    message: str
    data: T
    timestamp: datetime = Field(default_factory=datetime.now)

# Generic response with different types
user_response = Response[User](
    status="success",
    message="User retrieved",
    data=User(id=1, name="Alice", email="alice@example.com", age=25)
)
print(f"User Response: {user_response}")

product_response = Response[Product](
    status="success",
    message="Product retrieved",
    data=Product(name="Laptop", price=999.99)
)
print(f"\nProduct Response: {product_response}")

---
## 11. Advanced Validation Patterns

### Complex Validation Scenarios

In [None]:
from enum import Enum

# Conditional validation
class Status(str, Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    CANCELLED = "cancelled"

class PaymentInfo(BaseModel):
    card_number: Optional[str] = None
    card_holder: Optional[str] = None
    expiry_date: Optional[str] = None

class Invoice(BaseModel):
    id: int
    amount: float
    status: Status
    payment: Optional[PaymentInfo] = None
    
    @model_validator(mode='after')
    def validate_payment(self):
        if self.status == Status.COMPLETED and not self.payment:
            raise ValueError("Completed invoices must have payment info")
        if self.status in (Status.PENDING, Status.CANCELLED) and self.payment:
            raise ValueError(f"Invoice cannot have payment info when {self.status.value}")
        return self

# Valid invoice
invoice = Invoice(
    id=1,
    amount=100.00,
    status=Status.COMPLETED,
    payment=PaymentInfo(
        card_number="1234-5678-9012-3456",
        card_holder="John Doe",
        expiry_date="12/25"
    )
)
print(f"Valid Invoice: {invoice}")

In [None]:
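# Discriminated unions
from typing import Literal, Union
from typing_extensions import Annotated
from pydantic import Field

# Each model pins `type` to a single literal value, which acts as the tag
class EmailEvent(BaseModel):
    type: Literal["email"] = "email"
    recipient: str
    subject: str

class SMSEvent(BaseModel):
    type: Literal["sms"] = "sms"
    phone: str
    message: str

class PushEvent(BaseModel):
    type: Literal["push"] = "push"
    user_id: int
    title: str
    body: str

# The discriminator tells Pydantic to pick the model from the `type` value
Event = Annotated[
    Union[EmailEvent, SMSEvent, PushEvent],
    Field(discriminator="type"),
]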

# Creating different events
events = [
    EmailEvent(recipient="user@example.com", subject="Welcome"),
    SMSEvent(phone="555-1234", message="Verification code: 1234"),
    PushEvent(user_id=1, title="Alert", body="New message received")
]

for event in events:
    print(f"{event.type}: {event}")
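
The benefit of a discriminated union shows up when validating untyped data: the tag field selects the model to validate against. The following is a self-contained sketch with hypothetical `EmailNotice` and `SMSNotice` models, using `TypeAdapter` to validate a list of plain dictionaries.

```python
from typing import List, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError
from typing_extensions import Annotated, Literal


class EmailNotice(BaseModel):  # hypothetical models for illustration
    type: Literal["email"] = "email"
    recipient: str


class SMSNotice(BaseModel):
    type: Literal["sms"] = "sms"
    phone: str


Notice = Annotated[Union[EmailNotice, SMSNotice], Field(discriminator="type")]
notice_adapter = TypeAdapter(List[Notice])

notices = notice_adapter.validate_python([
    {"type": "email", "recipient": "user@example.com"},
    {"type": "sms", "phone": "555-1234"},
])

try:
    notice_adapter.validate_python([{"type": "fax", "number": "555-0000"}])
    unknown_tag_error = None
except ValidationError as exc:
    # An unrecognized tag yields a single error about the tag itself
    unknown_tag_error = exc.errors()[0]["type"]

print([type(n).__name__ for n in notices], unknown_tag_error)
```

With a plain `Union`, Pydantic would instead try the member models and report errors from each of them when none match.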

---
## 12. Performance and Best Practices

### Optimization Tips and Recommended Patterns

In [None]:
import time
from typing import List

# Performance: Use validation on assignment carefully
class OptimizedModel(BaseModel):
    model_config = ConfigDict(
        validate_assignment=False  # Disable for better performance
    )
    values: List[int]

class UnoptimizedModel(BaseModel):
    model_config = ConfigDict(
        validate_assignment=True  # Slower validation on each assignment
    )
    values: List[int]

# Performance test: validate_assignment only affects attribute assignment,
# so time repeated assignments rather than model construction
optimized = OptimizedModel(values=[1, 2, 3, 4, 5])
unoptimized = UnoptimizedModel(values=[1, 2, 3, 4, 5])

start = time.perf_counter()
for i in range(1000):
    optimized.values = [i, i + 1, i + 2]
optimized_time = time.perf_counter() - start

start = time.perf_counter()
for i in range(1000):
    unoptimized.values = [i, i + 1, i + 2]
unoptimized_time = time.perf_counter() - start

print(f"Optimized (no validate_assignment): {optimized_time:.4f}s")
print(f"Unoptimized (validate_assignment): {unoptimized_time:.4f}s")
print(f"Difference: {unoptimized_time - optimized_time:.4f}s ({(unoptimized_time/optimized_time - 1)*100:.1f}% slower)")
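
Another commonly cited tip from the Pydantic documentation is to parse JSON with `model_validate_json()` rather than `json.loads()` followed by `model_validate()`, since the former avoids building an intermediate Python object. The sketch below only demonstrates that both routes give the same result for a hypothetical `Reading` model; it makes no timing claims.

```python
import json
from typing import List

from pydantic import BaseModel


class Reading(BaseModel):  # hypothetical model for illustration
    sensor: str
    values: List[float]


raw = '{"sensor": "t1", "values": [1.5, 2, 3.25]}'

# One step: parse and validate the JSON text directly
direct = Reading.model_validate_json(raw)

# Two steps: build a Python dict first, then validate it
two_step = Reading.model_validate(json.loads(raw))

print(direct == two_step, direct.values)
```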

In [None]:
# Best Practices
print("""\n=== PYDANTIC BEST PRACTICES ===""")
print("""
1. **Type Hints**: Always use explicit type hints
   - Better validation and IDE support
   - Enables static type checking

2. **Field Validation**: Use Field() for constraints
   - More readable than custom validators
   - Better error messages

3. **Custom Validators**: Only when Field() isn't enough
   - Use field_validator for single fields
   - Use model_validator for cross-field validation

4. **Configuration**: Use ConfigDict for model behavior
   - Control validation, serialization, and assignment
   - Configure globally for consistency

5. **Serialization**: Choose appropriate method
   - model_dump() for Python objects
   - model_dump_json() for APIs
   - Use aliases for external APIs

6. **Error Handling**: Always catch ValidationError
   - Provides detailed error information
   - Use error details for user feedback

7. **Performance**: Consider validation costs
   - Disable validate_assignment if not needed
   - Use frozen=True for immutable objects
   - Cache JSON schemas

8. **Testing**: Validate test data thoroughly
   - Use pytest fixtures with Pydantic models
   - Test edge cases and invalid inputs

9. **API Design**: Use Pydantic models for APIs
   - Request validation with Pydantic
   - Response serialization
   - Automatic OpenAPI schema generation (with FastAPI)

10. **Documentation**: Use Field descriptions
    - Generates better JSON schemas
    - Helps API documentation
""")

In [None]:
# Real-world example: API request/response
from typing import Optional
from enum import Enum

class SortOrder(str, Enum):
    ASC = "asc"
    DESC = "desc"

class UserQuery(BaseModel):
    """User search query"""
    page: int = Field(default=1, ge=1, description="Page number (1-indexed)")
    limit: int = Field(default=10, ge=1, le=100, description="Items per page")
    sort_by: str = Field(default="name", description="Field to sort by")
    sort_order: SortOrder = Field(default=SortOrder.ASC, description="Sort order")
    search: Optional[str] = Field(default=None, description="Search query")
    
    @field_validator('sort_by')
    @classmethod
    def validate_sort_field(cls, v):
        allowed = ['name', 'email', 'created_at']
        if v not in allowed:
            raise ValueError(f'sort_by must be one of {allowed}')
        return v

class UserResponse(BaseModel):
    """User API response"""
    id: int
    name: str
    email: str
    created_at: datetime

class QueryResult(BaseModel, Generic[T]):
    """Generic query result"""
    data: List[T]
    total: int
    page: int
    limit: int
    total_pages: int

# Example usage
query = UserQuery(page=1, limit=20, sort_by="email", sort_order=SortOrder.DESC)
print(f"Query: {query}")
print(f"\nQuery JSON: {query.model_dump_json(indent=2)}")

# Simulated result
result = QueryResult[UserResponse](
    data=[
        UserResponse(id=1, name="Alice", email="alice@example.com", created_at=datetime.now()),
        UserResponse(id=2, name="Bob", email="bob@example.com", created_at=datetime.now()),
    ],
    total=2,
    page=1,
    limit=20,
    total_pages=1
)
print(f"\nResult JSON: {result.model_dump_json(indent=2)}")
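
In the example above `total_pages` is passed in by hand, so it can drift out of sync with `total` and `limit`. One possible alternative, sketched here with a hypothetical `PageInfo` model, is to derive it with `computed_field` so that it is always consistent and still appears in serialized output.

```python
import math

from pydantic import BaseModel, Field, computed_field


class PageInfo(BaseModel):  # hypothetical model for illustration
    total: int = Field(ge=0)
    limit: int = Field(ge=1, le=100)  # ge=1 rules out division by zero

    @computed_field
    @property
    def total_pages(self) -> int:
        # Round up so a partial last page still counts
        return math.ceil(self.total / self.limit)


info = PageInfo(total=45, limit=20)
empty = PageInfo(total=0, limit=10)

print(info.model_dump())
```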

---
## Summary

### Key Concepts

1. **BaseModel**: Foundation for all Pydantic models
2. **Type Hints**: Enable automatic validation
3. **Field**: Advanced field configuration
4. **Validators**: Custom validation logic
5. **ConfigDict**: Control model behavior
6. **Serialization**: Convert to JSON/dict
7. **Nested Models**: Complex hierarchical data
8. **Generics**: Reusable model templates

### Common Use Cases

- ✅ REST API validation (request/response)
- ✅ Configuration management
- ✅ Data pipeline validation
- ✅ Database model validation
- ✅ Type-safe data processing
- ✅ JSON schema generation

### Resources

- Official Documentation: https://docs.pydantic.dev
- GitHub Repository: https://github.com/pydantic/pydantic
- Pydantic with FastAPI: https://fastapi.tiangolo.com

### Next Steps

1. Integrate Pydantic with FastAPI for APIs
2. Learn ORM models with SQLAlchemy
3. Explore Pydantic plugins and extensions
4. Implement complex validation patterns
5. Study performance optimization techniques

---

**Happy Coding! 🐍**