# Pydantic Is All You Need - Jason Liu

[![Twitter Handle](https://img.shields.io/badge/Twitter-@gaohongnan-blue?style=social&logo=twitter)](https://twitter.com/gaohongnan)
[![LinkedIn Profile](https://img.shields.io/badge/@gaohongnan-blue?style=social&logo=linkedin)](https://linkedin.com/in/gao-hongnan)
[![GitHub Profile](https://img.shields.io/badge/GitHub-gao--hongnan-lightgrey?style=social&logo=github)](https://github.com/gao-hongnan)
![Tag](https://img.shields.io/badge/Tag-Organized_Chaos-orange)

```{contents}
:local:
```

In [94]:
from datetime import datetime
from typing import Any, Dict, List

from pydantic import BaseModel, Field, ValidationError, ValidationInfo, field_validator
from rich.pretty import pprint
from typing_extensions import Self

In [42]:
class User(BaseModel):
    id: int = Field(..., description="The user id", examples=[1, 2, 3])
    name: str = Field(..., min_length=2, max_length=50)
    email: str = Field(..., description="Email of the user")
    birth_date: datetime
    is_active: bool = True

In [44]:
pprint(User.model_json_schema())
pprint(User.model_fields)

In [54]:
class Users(BaseModel):
    random_attribute: Dict[str, List[int]] = Field(..., description="A random attribute.")
    users: list[User]

In [53]:
pprint(Users.model_json_schema())
pprint(Users.model_fields)

Below is a showcase of how Pydantic coerces, parses, and validates user input.

In [None]:
try:
    user = User(
        id="123",                           # String input but coerced to int
        name="Alice",                       # String input with correct length
        email="alice@example.com",          # String input
        birth_date="1990-01-01T00:00:00",   # String input but parsed to datetime
        is_active="yes"                     # String input but coerced to bool
    )
    pprint(user)

    user_all_input_types_correct = User(
        id=123,
        name="Alice",
        email="alice@example.com",
        birth_date=datetime(1990, 1, 1),
        is_active=True
    )
    pprint(user_all_input_types_correct)
    assert user == user_all_input_types_correct
except ValidationError as exc:
    print("Validation error:\n")
    pprint(exc)

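Below is a failing case where parsing and validation fail, which shows that
actual type checking and data validation are taking place. Note that `email` is
declared as a plain `str`, so `"not_an_email"` is accepted; the other four
fields are the ones that fail.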

In [55]:
try:
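    user = User(
        id="abc",                   # Can't be parsed to int
        name=[1, 2, 3],             # A list is not a str
        email="not_an_email",       # Accepted: the field is a plain str
        birth_date="invalid_date",  # Can't be parsed to datetime
        is_active=None              # None is not a valid bool
    )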
    pprint(user)
except ValidationError as exc:
    print("Validation error:\n")
    pprint(exc)

Validation error:
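
To see exactly which fields were rejected, we can iterate over `exc.errors()`
instead of printing the whole exception. The sketch below redefines a `User`
with the same fields as above so that it runs on its own; the variable name
`failed_fields` is just for illustration.

```python
from datetime import datetime

from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    # Same fields as the User model defined earlier.
    id: int
    name: str = Field(..., min_length=2, max_length=50)
    email: str
    birth_date: datetime
    is_active: bool = True


failed_fields = []
try:
    User(
        id="abc",
        name=[1, 2, 3],
        email="not_an_email",
        birth_date="invalid_date",
        is_active=None,
    )
except ValidationError as exc:
    # Each error is a dict with (among others) "loc", "type" and "msg" keys.
    for err in exc.errors():
        print(err["loc"], err["type"], err["msg"])
        failed_fields.append(err["loc"][0])
```

`email` does not appear in `failed_fields` because a plain `str` field accepts
any string. If real email validation is needed, Pydantic ships an `EmailStr`
type, which relies on the optional `email-validator` package.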



## Field Validators

### Before

In [76]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str


Suppose that at your company every `id` starts with the prefix `ID-`, followed
by a _unique_ integer. Pydantic's internal parser cannot coerce the string
`ID-12345` into an integer. Since the integer following `ID-` is unique,
we can add a `field_validator` to extract the integer part and validate
it. We want a `before` field validator because our custom
validation, parsing, and coercion logic must run _before_ the default
Pydantic parsing logic.

In [77]:
try:
    model = ComplexUser(id="ID-12345", name="Prefixed ID", code="CODE_456", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

To add the `before` validator, we can use the `field_validator` decorator.

In [78]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str


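    @field_validator('id', mode='before')
    @classmethod
    def preprocess_id(cls, v: Any) -> Any:
        # Strip the "ID-" prefix so that the remainder can be treated as an int.
        if isinstance(v, str) and v.startswith('ID-'):
            print(f"Preprocessing ID: {v}")
            return int(v[3:])
        # Anything else is passed through to pydantic's default parsing.
        return v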


In [80]:
try:
    model = ComplexUser(id="ID-12345", name="John Doe", code="CODE_456", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

Preprocessing ID: ID-12345


So we see that when the default Pydantic parsing would fail, we can add
`before` field validators to parse and validate the raw input data first,
before the default Pydantic parsing logic takes over.
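
As a small sketch of that hand-over, the hypothetical `PrefixedId` model below
only strips the prefix in its `before` validator and leaves the integer parsing
to Pydantic. This makes it easy to check that plain integers still pass through
untouched and that a malformed suffix is reported by Pydantic's own `int`
parsing.

```python
from typing import Any

from pydantic import BaseModel, ValidationError, field_validator


class PrefixedId(BaseModel):  # hypothetical model, for illustration only
    id: int

    @field_validator("id", mode="before")
    @classmethod
    def strip_prefix(cls, v: Any) -> Any:
        # Only remove the prefix; pydantic parses the remainder afterwards.
        if isinstance(v, str) and v.startswith("ID-"):
            return v[3:]
        return v


with_prefix = PrefixedId(id="ID-12345")  # prefix stripped, "12345" parsed
without_prefix = PrefixedId(id=42)       # not a prefixed string, passed through

error_type = None
try:
    PrefixedId(id="ID-abc")              # "abc" reaches the default int parser
except ValidationError as exc:
    error_type = exc.errors()[0]["type"]
    print(error_type)
```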

### After

In a similar vein, we can add `after` field validators, which run on the
parsed data once the default Pydantic parsing logic has finished.
An `after` field validator is useful for post-processing or for additional
validation of parsed data. Because it runs after parsing, the value it
receives is guaranteed to be of the declared type, and it is up to you
how to post-process it.

Consider the case where you want to capitalize the `name` field after it
has been parsed. We will use `.title()` because we want to capitalize the
first letter of each word in the string and not just the first letter of
the entire string.
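
The difference between the two string methods is easy to check in isolation.
The snippet below is plain Python and does not involve Pydantic; the last line
shows a well-known quirk of `.title()` that is worth keeping in mind for real
names.

```python
name = "john doe"

titled = name.title()            # capitalizes the first letter of every word
capitalized = name.capitalize()  # capitalizes only the first letter of the string

print(titled)
print(capitalized)

# Caveat: title() treats an apostrophe as a word boundary.
quirk = "they're".title()
print(quirk)
```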

In [87]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str

    @field_validator('id', mode='before')
    @classmethod
    def preprocess_id(cls, v: Any) -> Any:
        # Strip the "ID-" prefix so that the remainder can be treated as an int.
        if isinstance(v, str) and v.startswith('ID-'):
            print(f"Preprocessing ID: {v}")
            return int(v[3:])
        return v

    @field_validator('name', mode="after")
    @classmethod
    def capitalize_name(cls, v: str) -> str:
        # v has already been validated as a str at this point.
        print(f"Capitalizing name: {v}")
        return v.title()

try:
    model = ComplexUser(id="ID-12345", name="john doe", code="CODE_456", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

Preprocessing ID: ID-12345
Capitalizing name: john doe


We see that when the user inputs an all-lowercase string such as `john doe`, the
`after` field validator capitalizes the first letter of each word in
the string.

However, since the validator runs after Pydantic's internal validation, we can
do naughty things like changing the value of the field to something else entirely.
For example, nothing stops us from returning a list of integers from the `after`
field validator `capitalize_name`.

In [88]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str

    @field_validator('id', mode='before')
    @classmethod
    def preprocess_id(cls, v: Any) -> Any:
        if isinstance(v, str) and v.startswith('ID-'):
            print(f"Preprocessing ID: {v}")
            return int(v[3:])
        return v

    @field_validator('name', mode="after")
    @classmethod
    def capitalize_name(cls, v: str) -> str:
        print(f"Capitalizing name: {v}")
        # Deliberately wrong: the annotation says str, but we return a list.
        return [1, 2, 3]

try:
    model = ComplexUser(id="ID-12345", name="john doe", code="CODE_456", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

Preprocessing ID: ID-12345
Capitalizing name: john doe


The code still runs without any errors. So be careful with `after` field
validators: Pydantic does not check the value they return against the field's
declared type, so they can silently change the field to something else.
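
A minimal, self-contained check of this behaviour is sketched below. The model
name `Loose` is made up for the example; the point is only that the attribute
ends up holding whatever the `after` validator returned.

```python
from pydantic import BaseModel, field_validator


class Loose(BaseModel):  # hypothetical model, for illustration only
    name: str

    @field_validator("name", mode="after")
    @classmethod
    def wrong_type(cls, v: str) -> str:
        # The annotation promises a str, but nothing enforces it at runtime.
        return [1, 2, 3]


model = Loose(name="john doe")
print(type(model.name))
```

A static type checker would likely flag the mismatched return value, which is
one reason to keep the annotations on validators honest.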

### Plain

A `plain` validator completely replaces Pydantic's internal validation for the
field and is responsible for all type checking and validation. No other
validators are called after it, which is useful when you need full control over
the validation logic.

In [93]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str

    @field_validator('id', mode='before')
    @classmethod
    def preprocess_id(cls, v: Any) -> Any:
        if isinstance(v, str) and v.startswith('ID-'):
            print(f"Preprocessing ID: {v}")
            return int(v[3:])
        return v

    @field_validator('name', mode="after")
    @classmethod
    def capitalize_name(cls, v: str) -> str:
        print(f"Capitalizing name: {v}")
        return v.title()

    @field_validator('code', mode='plain')
    @classmethod
    def validate_code(cls, v: Any) -> str:
        # No internal str validation runs for this field, so check the type here.
        if not isinstance(v, str) or not v.startswith('CODE_'):
            raise ValueError("Code must be a string starting with 'CODE_'")
        return v

try:
    model = ComplexUser(id="ID-12345", name="john doe", code="AAA", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)


try:
    model = ComplexUser(id="ID-12345", name="john doe", code="CODE_AAA", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

Preprocessing ID: ID-12345
Capitalizing name: john doe


Preprocessing ID: ID-12345
Capitalizing name: john doe
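
One consequence worth seeing in isolation is that a `plain` validator also
switches off type coercion for its field. The two hypothetical models below
differ only in the validator: the first lets Pydantic coerce `"7"` to an `int`,
while the second keeps the raw string because the `plain` validator returns the
input untouched.

```python
from typing import Any

from pydantic import BaseModel, field_validator


class WithDefaultParsing(BaseModel):  # hypothetical model
    qty: int


class WithPlainValidator(BaseModel):  # hypothetical model
    qty: int

    @field_validator("qty", mode="plain")
    @classmethod
    def accept_anything(cls, v: Any) -> Any:
        # No checks at all: the int annotation is not enforced for this field.
        return v


coerced = WithDefaultParsing(qty="7").qty  # default parsing turns "7" into 7
raw = WithPlainValidator(qty="7").qty      # plain validator keeps the string

print(type(coerced), type(raw))
```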


### Wrap

See [discussion here](https://stackoverflow.com/questions/77007885/pydantic-v2-model-validatormode-wrap-how-to-use-modelwrapvalidatorhandl)
for a glimpse of how to use a `wrap` validator.

A `wrap` validator:

- Can run code before and after Pydantic's internal validation
- Receives a handler function to call the inner validator
- Can modify input before validation and output after validation
- Can catch and handle validation errors from inner validators

In [97]:
class ComplexUser(BaseModel):
    id: int
    name: str
    code: str
    status: str

    @field_validator('id', mode='before')
    @classmethod
    def preprocess_id(cls, v: Any) -> Any:
        if isinstance(v, str) and v.startswith('ID-'):
            print(f"Preprocessing ID: {v}")
            return int(v[3:])
        return v

    @field_validator('name', mode="after")
    @classmethod
    def capitalize_name(cls, v: str) -> str:
        print(f"Capitalizing name: {v}")
        return v.title()

    @field_validator('code', mode='plain')
    @classmethod
    def validate_code(cls, v: Any) -> str:
        if not isinstance(v, str) or not v.startswith('CODE_'):
            raise ValueError("Code must be a string starting with 'CODE_'")
        return v

    @field_validator('status', mode='wrap')
    @classmethod
    def validate_status(cls, value: Any, handler: Any, info: ValidationInfo) -> str:
        # pre-processing
        if isinstance(value, str):
            value = value.upper()

        # inner validator
        try:
            validated = handler(value)
            pprint(validated)
        except ValueError as exc:
            raise ValueError(f"Invalid status: {exc}") from exc

        # post-processing
        if validated not in ['ACTIVE', 'INACTIVE']:
            raise ValueError("Status must be either 'ACTIVE' or 'INACTIVE'")

        return validated

In [98]:
try:
    model = ComplexUser(id="ID-12345", name="john doe", code="CODE_AAA", status="inactive")
    pprint(model)
except ValidationError as exc:
    pprint(exc)

Preprocessing ID: ID-12345
Capitalizing name: john doe
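
The last bullet, catching errors from the inner validator, can be sketched on
its own. The hypothetical `Reading` model below calls the handler and falls
back to `0` when Pydantic's `int` parsing fails. Whether swallowing an error
like this is a good idea depends on the use case; it is shown only to
illustrate the mechanism.

```python
from typing import Any

from pydantic import (
    BaseModel,
    ValidationError,
    ValidatorFunctionWrapHandler,
    field_validator,
)


class Reading(BaseModel):  # hypothetical model, for illustration only
    value: int

    @field_validator("value", mode="wrap")
    @classmethod
    def fallback_to_zero(
        cls, v: Any, handler: ValidatorFunctionWrapHandler
    ) -> int:
        try:
            # handler runs pydantic's internal validation for this field.
            return handler(v)
        except ValidationError:
            # The inner validation failed, so substitute a default value.
            return 0


good = Reading(value="41").value   # inner validation succeeds
bad = Reading(value="n/a").value   # inner validation fails, fallback used

print(good, bad)
```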
