# Part II: Data Validation and Serialization with Pydantic

## Chapter 5: Advanced Data Handling

As your applications grow in complexity, you'll need more sophisticated data handling capabilities. This chapter explores advanced Pydantic features including the choice between `BaseModel` and dataclasses, serialization control, form data handling, file uploads, and configuration management with `pydantic-settings`.

---

### 5.1 `BaseModel` vs `dataclasses`: Choosing the Right Tool

Python's standard library includes `dataclasses`, which provides similar functionality to Pydantic's `BaseModel`. Understanding when to use each is crucial for writing clean, efficient code.

#### Python `dataclasses` Overview

Introduced in Python 3.7, `dataclasses` provides a decorator and functions for automatically adding special methods to classes:

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class User:
    """A simple dataclass for user data."""

    id: int
    username: str
    email: str
    is_active: bool = True
    created_at: datetime = field(default_factory=datetime.now)


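# Creating instances; share one timestamp so both objects hold identical field values
# (with the default factory, each instance would get its own datetime.now())
now = datetime.now()
user1 = User(id=1, username="alice", email="alice@example.com", created_at=now)
user2 = User(id=1, username="alice", email="alice@example.com", created_at=now)

print(user1)  # User(id=1, username='alice', email='alice@example.com', is_active=True, created_at=...)
print(user1 == user2)  # True (dataclasses implement __eq__ by comparing fields)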
```

#### Pydantic `BaseModel` Overview

Pydantic's `BaseModel` provides validation on top of data storage:

```python
from pydantic import BaseModel, ValidationError
from datetime import datetime


class User(BaseModel):
    """A Pydantic model for user data with validation."""

    id: int
    username: str
    email: str
    is_active: bool = True
    created_at: datetime | None = None


# Creating instances with validation
user1 = User(id=1, username="alice", email="alice@example.com")
user2 = User(id=1, username="alice", email="alice@example.com")

print(user1)  # id=1 username='alice' email='alice@example.com' is_active=True created_at=None
print(user1 == user2)  # True

# Type coercion
user3 = User(id="3", username="charlie", email="charlie@example.com")
print(user3.id)  # 3 (string was coerced to int)
print(type(user3.id))  # <class 'int'>

# Validation errors
try:
    user4 = User(id="not_a_number", username="diana", email="diana@example.com")
except ValidationError as e:
    print(e.errors()[0]["msg"])  # Input should be a valid integer...
```

#### Feature Comparison

| Feature | `@dataclass` | Pydantic `BaseModel` |
|---------|--------------|---------------------|
| Type hints | Required | Required |
| Runtime validation | ❌ No | ✅ Yes |
| Type coercion | ❌ No | ✅ Yes |
| JSON serialization | Manual (`asdict` + `json.dumps`) | Built-in (`model_dump_json`) |
| JSON deserialization | Manual | Built-in (`model_validate_json`) |
| Default factories | `field(default_factory=...)` | `Field(default_factory=...)` |
| Field validation | Manual | Built-in with `Field()` |
| Custom validators | Manual | `@field_validator` |
| OpenAPI schema | ❌ No | ✅ Yes |
| Performance | Faster | Slower (due to validation) |
| Immutable models | `frozen=True` | `frozen=True` in config |
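
The last row of the table can be checked directly. The following is a minimal sketch, assuming Pydantic v2; `FrozenPoint`, `FrozenPointModel`, and `try_assign` are hypothetical names invented for this illustration. Both classes reject attribute assignment, but with different exception types:

```python
from dataclasses import FrozenInstanceError, dataclass

from pydantic import BaseModel, ConfigDict, ValidationError


@dataclass(frozen=True)
class FrozenPoint:  # hypothetical example class
    x: int
    y: int


class FrozenPointModel(BaseModel):  # hypothetical example class
    model_config = ConfigDict(frozen=True)

    x: int
    y: int


def try_assign(obj) -> str:
    """Try to mutate obj.x and report the name of the exception raised."""
    try:
        obj.x = 10
    except (FrozenInstanceError, ValidationError) as exc:
        return type(exc).__name__
    return "no error"


dc_result = try_assign(FrozenPoint(x=1, y=2))
model_result = try_assign(FrozenPointModel(x=1, y=2))

print(dc_result)  # FrozenInstanceError
print(model_result)  # ValidationError
```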

#### Pydantic Dataclasses: The Best of Both Worlds

Pydantic provides a `@dataclass` decorator that adds validation to standard dataclasses:

```python
from pydantic import ValidationError
from pydantic.dataclasses import dataclass


@dataclass
class User:
    id: int
    username: str
    email: str
    age: int = 0


# Validation is enabled!
user = User(id=1, username="alice", email="alice@example.com", age="25")
print(user.age)  # 25 (string coerced to int)
print(type(user.age))  # <class 'int'>

# Invalid data raises ValidationError
try:
    User(id="invalid", username="bob", email="bob@example.com")
except ValidationError as e:
    print(e.errors()[0]["type"])  # int_parsing
```
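
A Pydantic dataclass does not gain `model_dump()` or `model_dump_json()` methods. One way to serialize it, sketched here under the assumption of Pydantic v2, is `TypeAdapter`; the standard-library helpers also keep working because the class is still a real dataclass. The `Account` class is a hypothetical example:

```python
import dataclasses

from pydantic import TypeAdapter
from pydantic.dataclasses import dataclass


@dataclass
class Account:  # hypothetical example class
    id: int
    username: str


account = Account(id="7", username="alice")  # "7" is coerced to 7

# Standard-library dataclass helpers accept the instance
is_std_dataclass = dataclasses.is_dataclass(account)
as_dict = dataclasses.asdict(account)

# TypeAdapter supplies the JSON round trip that BaseModel has built in
adapter = TypeAdapter(Account)
as_json = adapter.dump_json(account)  # returns bytes
restored = adapter.validate_json(as_json)

print(is_std_dataclass)  # True
print(as_dict)  # {'id': 7, 'username': 'alice'}
print(as_json)  # b'{"id":7,"username":"alice"}'
print(restored == account)  # True
```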

#### When to Use Each

**Use Python `@dataclass` when:**
- You need simple data containers without validation
- Performance is critical (no validation overhead)
- You want to avoid third-party dependencies (`dataclasses` ships with the standard library)
- You don't need JSON serialization/deserialization

**Use Pydantic `BaseModel` when:**
- You need runtime validation
- You're building APIs with FastAPI
- You need JSON schema generation
- You need complex field validation
- You need serialization with aliases and exclusions

**Use Pydantic `@dataclass` when:**
- You want validation with dataclass behavior
- You need a real dataclass (field-based `__eq__`, `__repr__`, and compatibility with helpers such as `dataclasses.asdict`)
- You prefer dataclass syntax but need Pydantic features

```python
from dataclasses import dataclass as python_dataclass
from pydantic import BaseModel, ValidationError
from pydantic.dataclasses import dataclass as pydantic_dataclass


# Standard Python dataclass - no validation
@python_dataclass
class PythonUser:
    id: int
    name: str


# Pydantic BaseModel - full validation
class PydanticUser(BaseModel):
    id: int
    name: str


# Pydantic dataclass - validation with dataclass features
@pydantic_dataclass
class PydanticDataclassUser:
    id: int
    name: str


# Testing
python_user = PythonUser(id="invalid", name=123)  # No error - types not enforced
print(python_user.id)  # "invalid" (string, not int!)

try:
    pydantic_user = PydanticUser(id="invalid", name=123)  # ValidationError
except ValidationError as e:
    print(f"Pydantic BaseModel error: {e.errors()[0]['type']}")

try:
    pydantic_dc_user = PydanticDataclassUser(id="invalid", name=123)  # ValidationError
except ValidationError as e:
    print(f"Pydantic dataclass error: {e.errors()[0]['type']}")
```

#### Converting Between Types

```python
from dataclasses import dataclass, asdict
from pydantic import BaseModel


@dataclass
class DataclassUser:
    id: int
    name: str
    email: str


class PydanticUser(BaseModel):
    id: int
    name: str
    email: str


# Dataclass to dict (standard library)
dc_user = DataclassUser(id=1, name="Alice", email="alice@example.com")
dc_dict = asdict(dc_user)
print(dc_dict)  # {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# Pydantic to dict (built-in)
p_user = PydanticUser(id=1, name="Alice", email="alice@example.com")
p_dict = p_user.model_dump()
print(p_dict)  # {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}

# Converting dataclass to Pydantic
dc_user = DataclassUser(id=1, name="Alice", email="alice@example.com")
p_user = PydanticUser.model_validate(asdict(dc_user))
print(p_user)  # id=1 name='Alice' email='alice@example.com'

# Pydantic to JSON
json_str = p_user.model_dump_json()
print(json_str)  # '{"id":1,"name":"Alice","email":"alice@example.com"}'

# JSON to Pydantic
p_user2 = PydanticUser.model_validate_json(json_str)
print(p_user2)  # id=1 name='Alice' email='alice@example.com'
```
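
The dataclass-to-Pydantic conversion can also skip the intermediate dictionary. As a sketch, assuming Pydantic v2, `model_validate` accepts `from_attributes=True` and then reads the values from the object's attributes:

```python
from dataclasses import dataclass

from pydantic import BaseModel


@dataclass
class DataclassUser:
    id: int
    name: str
    email: str


class PydanticUser(BaseModel):
    id: int
    name: str
    email: str


dc_user = DataclassUser(id=1, name="Alice", email="alice@example.com")

# Read id, name, and email straight from the dataclass instance
p_user = PydanticUser.model_validate(dc_user, from_attributes=True)
print(p_user)  # id=1 name='Alice' email='alice@example.com'

# Going back is a plain constructor call with the dumped fields
round_trip = DataclassUser(**p_user.model_dump())
print(round_trip == dc_user)  # True
```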

---

> **Industry Standard:** For FastAPI request and response bodies, prefer Pydantic `BaseModel`. Its validation, serialization, and schema generation are what FastAPI builds on. Standard dataclasses remain a good fit for internal data structures that never cross an API boundary.

---

### 5.2 Serialization: Controlling JSON Output, Aliases, and Excluding Fields

Serialization is the process of converting a model instance to a format suitable for storage or transmission (typically JSON or dictionaries). Pydantic provides extensive control over this process.

#### Basic Serialization

```python
from pydantic import BaseModel
from datetime import datetime


class User(BaseModel):
    id: int
    username: str
    email: str
    created_at: datetime


user = User(
    id=1,
    username="alice",
    email="alice@example.com",
    created_at=datetime(2024, 1, 15, 10, 30, 0),
)

# Convert to dictionary
user_dict = user.model_dump()
print(user_dict)
# {'id': 1, 'username': 'alice', 'email': 'alice@example.com', 'created_at': datetime.datetime(2024, 1, 15, 10, 30)}

# Convert to JSON string
user_json = user.model_dump_json()
print(user_json)
# '{"id":1,"username":"alice","email":"alice@example.com","created_at":"2024-01-15T10:30:00"}'
```

#### Excluding Fields

Sometimes you need to exclude sensitive or unnecessary fields from output:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    id: int
    username: str
    email: str
    password: str = Field(exclude=True)  # Always excluded
    api_key: str = Field(exclude=True)   # Always excluded
    is_admin: bool = False


user = User(
    id=1,
    username="alice",
    email="alice@example.com",
    password="secret123",
    api_key="key-abc-123",
    is_admin=True,
)

# password and api_key are automatically excluded
print(user.model_dump())
# {'id': 1, 'username': 'alice', 'email': 'alice@example.com', 'is_admin': True}

# They're also excluded from JSON
print(user.model_dump_json())
# '{"id":1,"username":"alice","email":"alice@example.com","is_admin":true}'
```

#### Dynamic Field Exclusion

For runtime control over which fields to exclude:

```python
from pydantic import BaseModel


class Product(BaseModel):
    id: int
    name: str
    price: float
    cost: float  # Internal cost - may want to exclude
    stock: int
    supplier: str  # May want to exclude for customers


product = Product(
    id=1,
    name="Laptop",
    price=999.99,
    cost=500.00,
    stock=50,
    supplier="TechSupplier Inc.",
)

# Exclude specific fields
customer_view = product.model_dump(exclude={"cost", "supplier"})
print(customer_view)
# {'id': 1, 'name': 'Laptop', 'price': 999.99, 'stock': 50}

# Exclude nested fields (if model had nested structures)
# user.model_dump(exclude={"address": {"street", "city"}})

# Include only specific fields
minimal_view = product.model_dump(include={"id", "name", "price"})
print(minimal_view)
# {'id': 1, 'name': 'Laptop', 'price': 999.99}
```

#### Field Aliases

Aliases allow different names in input vs output, useful for:
- Converting snake_case to camelCase for APIs
- Handling reserved Python keywords
- Working with legacy data formats

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    id: int = Field(alias="userId")
    username: str = Field(alias="userName")
    email_address: str = Field(alias="emailAddress")
    is_active: bool = Field(alias="isActive", default=True)

    model_config = {"populate_by_name": True}  # Allow both alias and field name


# Input uses aliases
user = User(userId=1, userName="alice", emailAddress="alice@example.com")
print(user.username)  # "alice" (accessed by field name)

# Output uses field names by default
print(user.model_dump())
# {'id': 1, 'username': 'alice', 'email_address': 'alice@example.com', 'is_active': True}

# Output uses aliases with by_alias=True
print(user.model_dump(by_alias=True))
# {'userId': 1, 'userName': 'alice', 'emailAddress': 'alice@example.com', 'isActive': True}

# JSON with aliases
print(user.model_dump_json(by_alias=True))
# '{"userId":1,"userName":"alice","emailAddress":"alice@example.com","isActive":true}'
```
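
When every field follows the same snake_case to camelCase rule, writing each alias by hand becomes repetitive. A possible alternative, assuming Pydantic v2, is the `alias_generator` config option together with the `to_camel` helper from `pydantic.alias_generators`. The `CamelUser` model is a hypothetical example:

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class CamelUser(BaseModel):  # hypothetical example model
    # Derive every alias from the field name instead of listing them one by one
    model_config = ConfigDict(alias_generator=to_camel)

    id: int
    user_name: str
    email_address: str
    is_active: bool = True


user = CamelUser.model_validate(
    {"id": 1, "userName": "alice", "emailAddress": "alice@example.com"}
)

camel = user.model_dump(by_alias=True)
print(camel)
# {'id': 1, 'userName': 'alice', 'emailAddress': 'alice@example.com', 'isActive': True}
```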

#### Serialization Aliases

Use `serialization_alias` to have different aliases for input and output:

```python
from pydantic import BaseModel, Field


class Product(BaseModel):
    id: int
    name: str = Field(
        validation_alias="product_name",  # Used for input/validation
        serialization_alias="productName",  # Used for output/serialization
    )
    unit_price: float = Field(
        validation_alias="price",
        serialization_alias="unitPrice",
    )


# Input uses validation_alias
product = Product(id=1, product_name="Laptop", price=999.99)
print(product.name)  # "Laptop"
print(product.unit_price)  # 999.99

# Output uses serialization_alias
print(product.model_dump(by_alias=True))
# {'id': 1, 'productName': 'Laptop', 'unitPrice': 999.99}
```

#### Serialization Mode: `json` vs `python`

Control how values are serialized:

```python
from pydantic import BaseModel
from datetime import datetime


class Event(BaseModel):
    name: str
    occurred_at: datetime


event = Event(name="Launch", occurred_at=datetime(2024, 1, 15, 10, 30, 0))

# Python mode: keeps Python types (datetime stays as datetime)
python_dict = event.model_dump(mode="python")
print(type(python_dict["occurred_at"]))  # <class 'datetime.datetime'>

# JSON mode: converts to JSON-compatible types (datetime becomes ISO string)
json_dict = event.model_dump(mode="json")
print(json_dict["occurred_at"])  # "2024-01-15T10:30:00"
print(type(json_dict["occurred_at"]))  # <class 'str'>
```
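
The difference matters as soon as the dictionary is handed to another encoder. This short sketch uses the standard-library `json` module, which cannot encode `datetime` objects, to show why `mode="json"` exists:

```python
import json
from datetime import datetime

from pydantic import BaseModel


class Event(BaseModel):
    name: str
    occurred_at: datetime


event = Event(name="Launch", occurred_at=datetime(2024, 1, 15, 10, 30, 0))

# Python mode leaves a datetime in the dict, which json.dumps rejects
try:
    json.dumps(event.model_dump(mode="python"))
    python_mode_ok = True
except TypeError:
    python_mode_ok = False

# JSON mode produces only JSON-compatible values
encoded = json.dumps(event.model_dump(mode="json"))

print(python_mode_ok)  # False
print(encoded)  # {"name": "Launch", "occurred_at": "2024-01-15T10:30:00"}
```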

#### Custom Serializers

For complex serialization logic, use custom serializers:

```python
from datetime import datetime

from pydantic import BaseModel, field_serializer


class User(BaseModel):
    id: int
    username: str
    created_at: datetime

    @field_serializer("created_at")
    def serialize_created_at(self, value: datetime) -> str:
        """Custom serializer for created_at field."""
        return value.strftime("%Y-%m-%d %H:%M:%S")


user = User(
    id=1,
    username="alice",
    created_at=datetime(2024, 1, 15, 10, 30, 0),
)

print(user.model_dump())
# {'id': 1, 'username': 'alice', 'created_at': '2024-01-15 10:30:00'}
```
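
If the same format is needed on several fields or models, the serializer can be attached to the type instead of to one field. The following sketch assumes Pydantic v2 and uses `Annotated` with `PlainSerializer`; the `Timestamp` alias and the `AuditRecord` model are hypothetical names:

```python
from datetime import datetime
from typing import Annotated

from pydantic import BaseModel, PlainSerializer

# Hypothetical reusable type: a datetime that serializes as "YYYY-MM-DD HH:MM:SS"
Timestamp = Annotated[
    datetime,
    PlainSerializer(lambda value: value.strftime("%Y-%m-%d %H:%M:%S"), return_type=str),
]


class AuditRecord(BaseModel):  # hypothetical example model
    created_at: Timestamp
    updated_at: Timestamp


record = AuditRecord(
    created_at=datetime(2024, 1, 15, 10, 30, 0),
    updated_at=datetime(2024, 2, 1, 8, 0, 0),
)

dumped = record.model_dump()
print(dumped)
# {'created_at': '2024-01-15 10:30:00', 'updated_at': '2024-02-01 08:00:00'}

# Attribute access is unaffected: the stored value is still a datetime
print(type(record.created_at).__name__)  # datetime
```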

#### Multiple Custom Serializers

```python
from datetime import datetime
from decimal import Decimal

from pydantic import BaseModel, field_serializer


class Product(BaseModel):
    id: int
    name: str
    price: Decimal
    created_at: datetime
    updated_at: datetime | None = None

    @field_serializer("price", when_used="json")
    def serialize_price(self, value: Decimal) -> str:
        """Serialize price as string with 2 decimal places for JSON."""
        return f"${float(value):.2f}"

    @field_serializer("created_at", "updated_at")
    def serialize_datetime(self, value: datetime | None) -> str | None:
        """Serialize datetimes as ISO format."""
        if value is None:
            return None
        return value.isoformat()


product = Product(
    id=1,
    name="Laptop",
    price=Decimal("999.99"),
    created_at=datetime(2024, 1, 15, 10, 30, 0),
)

# JSON mode - price gets formatted
print(product.model_dump(mode="json"))
# {'id': 1, 'name': 'Laptop', 'price': '$999.99', 'created_at': '2024-01-15T10:30:00', 'updated_at': None}

# Python mode - price stays as Decimal, but the datetime serializer still runs
# because it is not restricted with when_used="json"
print(product.model_dump(mode="python"))
# {'id': 1, 'name': 'Laptop', 'price': Decimal('999.99'), 'created_at': '2024-01-15T10:30:00', 'updated_at': None}
```
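
Field serializers change one value at a time. To reshape the whole output, Pydantic v2 also offers `model_serializer`. The sketch below uses `mode="wrap"`, which runs the default serialization first and then lets the method adjust the result; the `LineItem` model and its `subtotal` key are hypothetical:

```python
from decimal import Decimal
from typing import Any

from pydantic import BaseModel, model_serializer


class LineItem(BaseModel):  # hypothetical example model
    name: str
    unit_price: Decimal
    quantity: int

    @model_serializer(mode="wrap")
    def add_subtotal(self, handler) -> dict[str, Any]:
        # handler(self) performs the normal field-by-field serialization
        data = handler(self)
        # Add a derived key that is not stored on the model
        data["subtotal"] = str(self.unit_price * self.quantity)
        return data


item = LineItem(name="Mouse", unit_price=Decimal("29.99"), quantity=3)

dumped = item.model_dump()
print(dumped)
# {'name': 'Mouse', 'unit_price': Decimal('29.99'), 'quantity': 3, 'subtotal': '89.97'}
```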

#### Complete Serialization Example

```python
from datetime import datetime
from decimal import Decimal

from pydantic import BaseModel, Field, field_serializer


class Address(BaseModel):
    street: str = Field(serialization_alias="streetAddress")
    city: str
    state: str = Field(serialization_alias="stateCode")
    zip_code: str = Field(serialization_alias="postalCode")


class Product(BaseModel):
    id: int
    name: str
    description: str | None = Field(default=None, exclude=True)  # Always excluded
    price: Decimal
    cost: Decimal = Field(exclude=True)  # Always excluded
    stock: int
    created_at: datetime

    @field_serializer("price")
    def serialize_price(self, value: Decimal) -> float:
        return float(value)

    @field_serializer("created_at")
    def serialize_created_at(self, value: datetime) -> str:
        return value.strftime("%Y-%m-%d")


class Order(BaseModel):
    id: int = Field(serialization_alias="orderId")
    customer_name: str = Field(serialization_alias="customerName")
    products: list[Product]
    shipping_address: Address = Field(serialization_alias="shippingAddress")
    total: Decimal

    @field_serializer("total")
    def serialize_total(self, value: Decimal) -> str:
        return f"${value:.2f}"


# Create an order
order = Order(
    id=1001,
    customer_name="Alice Johnson",
    products=[
        Product(
            id=1,
            name="Laptop",
            description="A powerful laptop",  # Will be excluded
            price=Decimal("999.99"),
            cost=Decimal("500.00"),  # Will be excluded
            stock=50,
            created_at=datetime(2024, 1, 15),
        ),
        Product(
            id=2,
            name="Mouse",
            description="Wireless mouse",  # Will be excluded
            price=Decimal("29.99"),
            cost=Decimal("10.00"),  # Will be excluded
            stock=200,
            created_at=datetime(2024, 1, 16),
        ),
    ],
    shipping_address=Address(
        street="123 Main St",
        city="San Francisco",
        state="CA",
        zip_code="94105",
    ),
    total=Decimal("1029.98"),
)

# Serialize with aliases
print(order.model_dump(by_alias=True, mode="json"))
# {
#     'orderId': 1001,
#     'customerName': 'Alice Johnson',
#     'products': [
#         {'id': 1, 'name': 'Laptop', 'price': 999.99, 'stock': 50, 'created_at': '2024-01-15'},
#         {'id': 2, 'name': 'Mouse', 'price': 29.99, 'stock': 200, 'created_at': '2024-01-16'}
#     ],
#     'shippingAddress': {'streetAddress': '123 Main St', 'city': 'San Francisco', 'stateCode': 'CA', 'postalCode': '94105'},
#     'total': '$1029.98'
# }
```

---

### 5.3 Parsing & Validation: Handling Form Data and File Uploads

Web APIs often need to handle form data and file uploads. FastAPI provides seamless integration with Pydantic for these use cases.

#### Handling Form Data

Form data is typically sent as `application/x-www-form-urlencoded` or `multipart/form-data`. FastAPI provides the `Form` class to handle this; parsing form bodies requires the `python-multipart` package to be installed:

```python
from fastapi import FastAPI, Form

app = FastAPI()


@app.post("/login")
async def login(
    username: str = Form(...),
    password: str = Form(...),
    remember_me: bool = Form(default=False),
):
    """
    Handle login form submission.

    Form fields are declared with Form() instead of regular parameters.
    """
    return {
        "username": username,
        "remember_me": remember_me,
        # Never return passwords in responses!
    }
```

**Testing with curl:**

```bash
curl -X POST http://localhost:8000/login \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=alice&password=secret&remember_me=true"
```

#### Form Data with Pydantic Models

For complex forms, collect the individual `Form` parameters and pass them to a Pydantic model for validation:

```python
from fastapi import FastAPI, Form, HTTPException
from pydantic import BaseModel, EmailStr

app = FastAPI()


class RegistrationForm(BaseModel):
    username: str
    email: EmailStr
    password: str
    confirm_password: str
    accept_terms: bool


class RegistrationResponse(BaseModel):
    """Response body; deliberately omits the password fields."""

    username: str
    email: EmailStr


@app.post("/register", response_model=RegistrationResponse)
async def register(
    username: str = Form(...),
    # Typing the parameter as EmailStr lets FastAPI answer a malformed
    # address with a 422 before the handler runs
    email: EmailStr = Form(...),
    password: str = Form(...),
    confirm_password: str = Form(...),
    accept_terms: bool = Form(...),
):
    """
    Handle registration form.

    We collect individual form fields and create a Pydantic model.
    """
    form_data = RegistrationForm(
        username=username,
        email=email,
        password=password,
        confirm_password=confirm_password,
        accept_terms=accept_terms,
    )

    # Additional validation
    if form_data.password != form_data.confirm_password:
        raise HTTPException(status_code=400, detail="Passwords do not match")

    if not form_data.accept_terms:
        raise HTTPException(status_code=400, detail="Must accept terms and conditions")

    # Never echo passwords back to the client
    return RegistrationResponse(username=form_data.username, email=form_data.email)
```
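
Cross-field rules such as "the two passwords must match" can also live on the model itself, so every caller gets the same checks. The following is a minimal sketch using Pydantic v2's `model_validator`; `RegistrationData` is a hypothetical variant of the form model, and `email` is left out only to keep the example free of the optional `email-validator` dependency:

```python
from pydantic import BaseModel, ValidationError, model_validator


class RegistrationData(BaseModel):  # hypothetical variant of the form model
    username: str
    password: str
    confirm_password: str
    accept_terms: bool

    @model_validator(mode="after")
    def check_form_rules(self) -> "RegistrationData":
        # Runs after all individual fields have been validated
        if self.password != self.confirm_password:
            raise ValueError("Passwords do not match")
        if not self.accept_terms:
            raise ValueError("Must accept terms and conditions")
        return self


message = None
try:
    RegistrationData(
        username="alice",
        password="secret123",
        confirm_password="typo",
        accept_terms=True,
    )
except ValidationError as e:
    message = e.errors()[0]["msg"]

print(message)  # Value error, Passwords do not match
```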

#### Using `as_form` for Automatic Form Parsing

A small class decorator can attach an `as_form` constructor to a model, so an endpoint builds the model from its collected form fields in one call. Note that this decorator does not declare the form fields for you: each endpoint still lists them with `Form(...)`.

```python
from fastapi import FastAPI, Form
from pydantic import BaseModel, EmailStr

app = FastAPI()


def as_form(cls):
    """
    Decorator to add an as_form class method to a Pydantic model.
    This allows the model to be populated from form data.
    """

    @classmethod
    def get_form(cls, **data):
        return cls(**data)

    cls.as_form = get_form
    return cls


@as_form
class UserForm(BaseModel):
    username: str
    email: EmailStr
    bio: str | None = None


@app.post("/profile")
async def update_profile(
    username: str = Form(...),
    email: EmailStr = Form(...),
    bio: str | None = Form(default=None),
):
    """Update user profile from form data."""
    form = UserForm.as_form(username=username, email=email, bio=bio)
    return {"message": "Profile updated", "user": form.model_dump()}
```

#### File Uploads

FastAPI handles file uploads with the `UploadFile` and `File` classes:

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI()


@app.post("/upload")
async def upload_file(file: UploadFile):
    """
    Upload a single file.

    UploadFile provides:
    - filename: Original filename
    - content_type: MIME type
    - file: Spooled temporary file
    - read(): Read file contents
    - write(): Write to file
    """
    contents = await file.read()

    # Get file info
    return {
        "filename": file.filename,
        "content_type": file.content_type,
        "size": len(contents),
    }


@app.post("/upload-multiple")
async def upload_multiple_files(files: list[UploadFile] = File(...)):
    """Upload multiple files."""
    results = []
    for file in files:
        contents = await file.read()
        results.append(
            {
                "filename": file.filename,
                "content_type": file.content_type,
                "size": len(contents),
            }
        )
    return {"files": results}
```

#### Saving Uploaded Files

```python
from pathlib import Path
from fastapi import FastAPI, UploadFile, File, HTTPException

app = FastAPI()

# Create uploads directory
UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)


@app.post("/upload-save")
async def upload_and_save(file: UploadFile):
    """Upload and save a file to disk."""

    # Validate file type
    allowed_types = ["image/jpeg", "image/png", "image/gif", "application/pdf"]
    if file.content_type not in allowed_types:
        raise HTTPException(
            status_code=400,
            detail=f"File type {file.content_type} not allowed",
        )

    # Validate file size (max 10MB)
    MAX_SIZE = 10 * 1024 * 1024  # 10MB
    contents = await file.read()
    if len(contents) > MAX_SIZE:
        raise HTTPException(status_code=400, detail="File too large (max 10MB)")

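    # Generate safe filename: keep only the final path component so a
    # client-supplied name such as "../../app.py" cannot escape UPLOAD_DIR
    safe_filename = Path(file.filename or "upload").name
    file_path = UPLOAD_DIR / safe_filename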

    # Write file
    with open(file_path, "wb") as f:
        f.write(contents)

    return {
        "filename": file.filename,
        "saved_as": safe_filename,
        "path": str(file_path),
        "size": len(contents),
    }
```

#### Combining Form Data and Files

Handle forms that include both text fields and file uploads:

```python
from fastapi import FastAPI, Form, File, UploadFile

app = FastAPI()


@app.post("/submit-article")
async def submit_article(
    title: str = Form(...),
    author: str = Form(...),
    content: str = Form(...),
    cover_image: UploadFile | None = File(default=None),
    tags: str = Form(default=""),
):
    """
    Submit an article with optional cover image.

    Combines form fields with file upload.
    """
    article = {
        "title": title,
        "author": author,
        "content": content,
        "tags": tags.split(",") if tags else [],
    }

    if cover_image:
        image_contents = await cover_image.read()
        article["cover_image"] = {
            "filename": cover_image.filename,
            "content_type": cover_image.content_type,
            "size": len(image_contents),
        }

    return {"article": article}
```

#### File Upload with Pydantic Validation

Create a comprehensive file upload system with validation:

```python
from pathlib import Path
from typing import Annotated

from fastapi import FastAPI, UploadFile, File, HTTPException
from pydantic import BaseModel

app = FastAPI()

UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)


class FileMetadata(BaseModel):
    filename: str
    content_type: str
    size: int
    path: str


class UploadResponse(BaseModel):
    success: bool
    message: str
    file: FileMetadata | None = None


ALLOWED_TYPES = {
    "image/jpeg": "jpg",
    "image/png": "png",
    "image/gif": "gif",
    "application/pdf": "pdf",
    "text/plain": "txt",
}

MAX_SIZE = 10 * 1024 * 1024  # 10MB


def validate_file(file: UploadFile) -> tuple[bool, str]:
    """Return (True, extension) for an allowed content type, else (False, error message)."""
    if file.content_type not in ALLOWED_TYPES:
        return False, f"File type {file.content_type} not allowed"

    return True, ALLOWED_TYPES[file.content_type]


@app.post("/upload-validated", response_model=UploadResponse)
async def upload_validated(
    file: Annotated[UploadFile, File(description="File to upload")],
):
    """Upload a file with comprehensive validation."""

    # Validate file type
    is_valid, extension = validate_file(file)
    if not is_valid:
        raise HTTPException(status_code=400, detail=extension)

    # Read file contents
    contents = await file.read()

    # Validate file size
    if len(contents) > MAX_SIZE:
        raise HTTPException(status_code=400, detail="File too large (max 10MB)")

    # Generate unique filename
    import uuid

    unique_id = uuid.uuid4().hex[:8]
    original_name = Path(file.filename or "upload").stem
    safe_filename = f"{original_name}_{unique_id}.{extension}"
    file_path = UPLOAD_DIR / safe_filename

    # Save file
    with open(file_path, "wb") as f:
        f.write(contents)

    return UploadResponse(
        success=True,
        message="File uploaded successfully",
        file=FileMetadata(
            filename=file.filename,
            content_type=file.content_type,
            size=len(contents),
            path=str(file_path),
        ),
    )
```

#### Streaming File Uploads

For large files, streaming is more memory-efficient:

```python
from pathlib import Path

from fastapi import FastAPI, UploadFile, File

app = FastAPI()

UPLOAD_DIR = Path("uploads")
UPLOAD_DIR.mkdir(exist_ok=True)


@app.post("/upload-stream")
async def upload_stream(file: UploadFile):
    """
    Upload a large file using streaming.

    This is more memory-efficient for large files
    as it doesn't load the entire file into memory.
    """
    # Keep only the final path component of the client-supplied name
    file_path = UPLOAD_DIR / Path(file.filename or "upload").name

    # Stream the file to disk
    with open(file_path, "wb") as f:
        while chunk := await file.read(1024 * 1024):  # Read in 1MB chunks
            f.write(chunk)

    return {"filename": file.filename, "message": "File uploaded successfully"}
```
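
The chunked loop also makes it possible to enforce a size limit without first reading the whole upload into memory. The following is a framework-free sketch of that idea; `copy_with_limit` and `FileTooLargeError` are hypothetical names, and in-memory `BytesIO` objects stand in for the upload and the destination file:

```python
import io
from typing import BinaryIO


class FileTooLargeError(Exception):
    """Raised when the source holds more than the allowed number of bytes."""


def copy_with_limit(
    src: BinaryIO,
    dst: BinaryIO,
    max_bytes: int,
    chunk_size: int = 1024 * 1024,
) -> int:
    """Copy src to dst chunk by chunk, aborting once max_bytes is exceeded."""
    total = 0
    while chunk := src.read(chunk_size):
        total += len(chunk)
        if total > max_bytes:
            raise FileTooLargeError(f"Upload exceeds {max_bytes} bytes")
        dst.write(chunk)
    return total


# A 2500-byte payload fits under a 4096-byte limit
out = io.BytesIO()
copied = copy_with_limit(io.BytesIO(b"x" * 2500), out, max_bytes=4096, chunk_size=1024)
print(copied)  # 2500

# A 5000-byte payload is rejected part-way through the copy
try:
    copy_with_limit(io.BytesIO(b"x" * 5000), io.BytesIO(), max_bytes=4096, chunk_size=1024)
    rejected = False
except FileTooLargeError:
    rejected = True
print(rejected)  # True
```

In an endpoint, the same counting logic could wrap `await file.read(chunk_size)`, with the exception translated into an `HTTPException` and the partially written file deleted.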

---

### 5.4 Settings Management: Using `pydantic-settings` for Environment Variables and Configuration

Managing configuration across different environments (development, staging, production) is crucial for any application. `pydantic-settings` provides a robust solution for loading settings from environment variables, `.env` files, and other sources.

#### Installing pydantic-settings

```bash
# Using uv
uv add pydantic-settings

# Using pip
pip install pydantic-settings
```

#### Basic Settings Class

```python
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    """Application settings loaded from environment variables."""

    app_name: str = "My API"
    debug: bool = False
    database_url: str = "sqlite:///./app.db"
    secret_key: str = "change-me-in-production"


# Create settings instance
settings = Settings()

print(settings.app_name)      # "My API" (or value from APP_NAME env var)
print(settings.debug)         # False (or value from DEBUG env var)
print(settings.database_url)  # "sqlite:///./app.db" (or from DATABASE_URL)
```

#### Environment Variable Naming Convention

`pydantic-settings` maps each field name to an environment variable of the same name:
- Matching is case-insensitive by default, so the conventional uppercase form works
- Underscores are preserved
- Example: `database_url` → `DATABASE_URL`

```ini
# .env file
APP_NAME=Production API
DEBUG=false
DATABASE_URL=postgresql://user:pass@localhost/db
SECRET_KEY=super-secret-key-for-production
```

```python
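from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # A .env file is only read when env_file is configured
    model_config = SettingsConfigDict(env_file=".env")

    app_name: str
    debug: bool
    database_url: str
    secret_key: str


# Settings loaded from .env (real environment variables take precedence)
settings = Settings()
print(settings.app_name)  # "Production API"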
```

#### Loading from `.env` Files

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",  # Path to .env file
        env_file_encoding="utf-8",
        case_sensitive=False,  # APP_NAME and app_name are equivalent
        extra="ignore",  # Ignore keys in the .env file that the model does not declare
    )

    app_name: str = "My API"
    database_url: str
    secret_key: str


settings = Settings()
```

#### Different Environments

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",  # Default .env
        env_file_encoding="utf-8",
    )

    # Application
    app_name: str = "My API"
    debug: bool = False
    environment: str = "development"

    # Database
    database_url: str
    database_pool_size: int = 5

    # Security
    secret_key: str
    allowed_hosts: list[str] = ["localhost", "127.0.0.1"]


# Load environment-specific .env file
import os

env = os.getenv("ENVIRONMENT", "development")
env_file = f".env.{env}"  # .env.development, .env.production, etc.


class EnvironmentSettings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=env_file,
        env_file_encoding="utf-8",
    )

    app_name: str = "My API"
    debug: bool = False
    database_url: str
    secret_key: str


settings = EnvironmentSettings()
```

#### Nested Settings

For complex configurations, organize settings into nested models:

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class DatabaseSettings(BaseSettings):
    """Database-specific settings, read from DB_HOST, DB_PORT, and so on."""

    # Without a prefix, `user` would pick up the shell's USER variable
    model_config = SettingsConfigDict(env_prefix="DB_")

    host: str = "localhost"
    port: int = 5432
    name: str = "app_db"
    user: str = "postgres"
    password: str = ""

    @property
    def url(self) -> str:
        """Generate database URL."""
        return f"postgresql://{self.user}:{self.password}@{self.host}:{self.port}/{self.name}"


class RedisSettings(BaseSettings):
    """Redis-specific settings, read from REDIS_HOST, REDIS_PORT, and so on."""

    model_config = SettingsConfigDict(env_prefix="REDIS_")

    host: str = "localhost"
    port: int = 6379
    db: int = 0
    password: str | None = None

    @property
    def url(self) -> str:
        """Generate Redis URL."""
        if self.password:
            return f"redis://:{self.password}@{self.host}:{self.port}/{self.db}"
        return f"redis://{self.host}:{self.port}/{self.db}"


class SecuritySettings(BaseSettings):
    """Security-specific settings, read from SECURITY_SECRET_KEY and so on."""

    model_config = SettingsConfigDict(env_prefix="SECURITY_")

    secret_key: str = Field(default="change-me")
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 30


class Settings(BaseSettings):
    """Main application settings."""

    # Application
    app_name: str = "My API"
    debug: bool = False
    environment: str = "development"

    # Nested settings - each class reads its own DB_, REDIS_, or SECURITY_ prefixed variables
    database: DatabaseSettings = Field(default_factory=DatabaseSettings)
    redis: RedisSettings = Field(default_factory=RedisSettings)
    security: SecuritySettings = Field(default_factory=SecuritySettings)


settings = Settings()

print(settings.database.url)
print(settings.redis.url)
print(settings.security.secret_key)
```

#### Required vs Optional Settings

```python
from pydantic import Field, ValidationError
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # Required settings - no default, must be provided
    database_url: str  # Raises error if not set
    secret_key: str  # Raises error if not set

    # Optional settings - have defaults
    app_name: str = "My API"
    debug: bool = False
    max_connections: int = 100

    # Optional with None default
    redis_url: str | None = None


# This will raise ValidationError if DATABASE_URL or SECRET_KEY are not set
try:
    settings = Settings()
except ValidationError as e:
    print("Configuration error:")
    for error in e.errors():
        print(f"  - {error['loc'][0]}: {error['msg']}")
```

#### Validation in Settings

Use the same validators as Pydantic models:

```python
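from typing import Annotated

from pydantic import Field, field_validator
from pydantic_settings import BaseSettings, NoDecode


class Settings(BaseSettings):
    # Database
    database_url: str
    database_pool_size: int = Field(default=5, ge=1, le=100)

    # Security
    secret_key: str = Field(min_length=32)
    access_token_expire_minutes: int = Field(default=30, ge=1)

    # CORS
    # By default pydantic-settings decodes list values from the environment as
    # JSON before validators run. NoDecode (pydantic-settings 2.7+) skips that
    # step so the validator below receives the raw comma-separated string.
    allowed_origins: Annotated[list[str], NoDecode] = Field(
        default_factory=lambda: ["http://localhost:3000"]
    )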

    @field_validator("secret_key")
    @classmethod
    def validate_secret_key(cls, v: str) -> str:
        if v == "change-me" or v == "secret":
            raise ValueError("Secret key must be changed from default value")
        return v

    @field_validator("allowed_origins", mode="before")
    @classmethod
    def parse_origins(cls, v: str | list[str]) -> list[str]:
        """Parse comma-separated origins from environment variable."""
        if isinstance(v, str):
            return [origin.strip() for origin in v.split(",")]
        return v


# Environment variables:
# DATABASE_URL=postgresql://localhost/db
# SECRET_KEY=this-is-a-very-long-secret-key-for-production
# ALLOWED_ORIGINS=http://localhost:3000,https://example.com

settings = Settings()
print(settings.allowed_origins)  # ['http://localhost:3000', 'https://example.com']
```

#### Complete Settings Example for FastAPI

```python
# config.py
from functools import lru_cache
from typing import Annotated

from fastapi import Depends  # required by the SettingsDep alias below
from pydantic import Field, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class DatabaseSettings(BaseSettings):
    """Database configuration."""

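    # Nested classes load independently, so each needs its own env_file;
    # extra="ignore" skips the .env entries that belong to other classes
    model_config = SettingsConfigDict(env_prefix="DB_", env_file=".env", extra="ignore")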

    host: str = "localhost"
    port: int = 5432
    name: str = "app"
    user: str = "postgres"
    password: str = ""

    @property
    def url(self) -> str:
        return f"postgresql://{self.user}:{self.password}@{self.host}:{self.port}/{self.name}"


class RedisSettings(BaseSettings):
    """Redis configuration."""

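    # Nested classes load independently, so each needs its own env_file;
    # extra="ignore" skips the .env entries that belong to other classes
    model_config = SettingsConfigDict(env_prefix="REDIS_", env_file=".env", extra="ignore")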

    host: str = "localhost"
    port: int = 6379
    db: int = 0
    password: str | None = None

    @property
    def url(self) -> str:
        if self.password:
            return f"redis://:{self.password}@{self.host}:{self.port}/{self.db}"
        return f"redis://{self.host}:{self.port}/{self.db}"


class Settings(BaseSettings):
    """Application settings."""

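    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        case_sensitive=False,
        extra="ignore",
        # Skip the default JSON decoding of complex values (pydantic-settings 2.7+)
        # so ALLOWED_ORIGINS can be a comma-separated string parsed by the validator
        enable_decoding=False,
    )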

    # Application
    app_name: str = "FastAPI Application"
    app_version: str = "1.0.0"
    debug: bool = False
    environment: str = "development"

    # Server
    host: str = "0.0.0.0"
    port: int = 8000
    workers: int = 1

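    # Security
    # BaseSettings validates default values, so the placeholder must also
    # satisfy min_length; always override SECRET_KEY outside local development
    secret_key: str = Field(
        default="change-me-in-production-to-a-long-random-string", min_length=32
    )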
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 30

    # CORS
    allowed_origins: list[str] = Field(default_factory=lambda: ["http://localhost:3000"])

    # Rate Limiting
    rate_limit_requests: int = 100
    rate_limit_period_seconds: int = 60

    # Nested settings
    database: DatabaseSettings = Field(default_factory=DatabaseSettings)
    redis: RedisSettings = Field(default_factory=RedisSettings)

    @field_validator("allowed_origins", mode="before")
    @classmethod
    def parse_cors_origins(cls, v: str | list[str]) -> list[str]:
        if isinstance(v, str):
            return [origin.strip() for origin in v.split(",")]
        return v

    @field_validator("environment")
    @classmethod
    def validate_environment(cls, v: str) -> str:
        allowed = {"development", "staging", "production", "testing"}
        if v.lower() not in allowed:
            raise ValueError(f"Environment must be one of: {allowed}")
        return v.lower()


@lru_cache
def get_settings() -> Settings:
    """
    Get settings instance (cached).

    Using lru_cache ensures settings are loaded only once.
    """
    return Settings()


# Type alias for dependency injection
SettingsDep = Annotated[Settings, Depends(get_settings)]
```

```python
# main.py
from fastapi import FastAPI, Depends
from config import Settings, get_settings

app = FastAPI()


@app.get("/info")
async def info(settings: Settings = Depends(get_settings)):
    """Get application information."""
    return {
        "app_name": settings.app_name,
        "version": settings.app_version,
        "environment": settings.environment,
        "debug": settings.debug,
    }


@app.get("/config")
async def config(settings: Settings = Depends(get_settings)):
    """Get current configuration (sanitized)."""
    return {
        "database_host": settings.database.host,
        "database_port": settings.database.port,
        "redis_host": settings.redis.host,
        "allowed_origins": settings.allowed_origins,
    }
```

**Sample `.env` file:**

```env
# Application
APP_NAME=My Production API
APP_VERSION=2.0.0
DEBUG=false
ENVIRONMENT=production

# Server
HOST=0.0.0.0
PORT=8000
WORKERS=4

# Security
SECRET_KEY=your-super-secret-key-that-is-at-least-32-characters-long
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=60

# CORS
ALLOWED_ORIGINS=https://example.com,https://api.example.com

# Rate Limiting
RATE_LIMIT_REQUESTS=100
RATE_LIMIT_PERIOD_SECONDS=60

# Database
DB_HOST=postgres.example.com
DB_PORT=5432
DB_NAME=production_db
DB_USER=app_user
DB_PASSWORD=secure_password

# Redis
REDIS_HOST=redis.example.com
REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=redis_password
```

---

### Summary

In this chapter, you've learned advanced data handling techniques:

1. **BaseModel vs dataclasses**: Understanding when to use each tool and how Pydantic dataclasses bridge the gap.

2. **Serialization Control**: Using aliases, field exclusion, custom serializers, and output modes to control how your models are converted to JSON and dictionaries.

3. **Form Data and File Uploads**: Handling `application/x-www-form-urlencoded` data, file uploads with validation, and combining forms with files.

4. **Settings Management**: Using `pydantic-settings` to manage configuration across environments with automatic environment variable loading, `.env` file support, and nested settings.

---

### Exercises

1. **Serialization Challenge**: Create a `User` model with:
   - Fields: `id`, `username`, `email`, `password`, `is_admin`, `created_at`
   - Password always excluded from output
   - `created_at` serialized as ISO format
   - Aliases for camelCase output (`createdAt`, `isAdmin`)
   - `is_admin` only included for admin users (conditional exclusion)

2. **File Upload API**: Create endpoints for:
   - Uploading a single image with validation (type, size)
   - Uploading multiple files with descriptions
   - A form combining text fields with a file upload

3. **Settings Configuration**: Build a settings system with:
   - Environment-specific `.env` files
   - Nested settings for database, Redis, and OAuth
   - Validation for required settings in production
   - A `/config` endpoint that returns sanitized configuration

4. **Form Handling**: Create a registration form handler that:
   - Accepts username, email, password, confirm_password, and avatar upload
   - Validates passwords match
   - Validates email format
   - Validates avatar is an image under 5MB
   - Returns user data (without password) and avatar metadata

---

### What's Next?

**Chapter 6: Dependency Injection System** will explore one of FastAPI's most powerful features:
- Understanding the `Depends` function and dependency injection
- Creating reusable dependencies for database sessions and authentication
- Dependency trees and nested dependencies
- Yield dependencies for setup and teardown logic
- Overriding dependencies for testing

