# Introduction to pydantic

Pydantic is a Python package for validating data. You can download it from https://pypi.org/project/pydantic/. The documentation is available at https://docs.pydantic.dev/

## Validate types and required fields
Here's a simple example that defines a person using pydantic. The first name and last name are required. The date the person was born on is optional because it has a default value, in this case `None`.

In [22]:
from datetime import date
from pydantic import BaseModel

class Person1(BaseModel):
    first_name: str
    last_name: str
    born_on: date | None = None

Here's how we can create a person:

In [24]:
Person1(first_name='Alice', last_name='Adams')

Person1(first_name='Alice', last_name='Adams', born_on=None)

In [25]:
Person1(first_name="Bob", last_name="Brown", born_on=date(1960, 1, 1))

Person1(first_name='Bob', last_name='Brown', born_on=datetime.date(1960, 1, 1))

Trying to create a person without a last name fails:

In [26]:
Person1(first_name='Alice')

ValidationError: 1 validation error for Person1
last_name
  Field required [type=missing, input_value={'first_name': 'Alice'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/missing

## Validate `dict`s
We can also create a person from a Python `dict`. This is useful because, for example, [json.load()](https://docs.python.org/3/library/json.html), [csv.DictReader](https://docs.python.org/3/library/csv.html#csv.DictReader) and PyYAML's [yaml.safe_load()](https://pyyaml.org/wiki/PyYAMLDocumentation) can all produce `dict`s.

In [27]:
person_dict = {
  "first_name": "Claire",
  "last_name": "Clark",
}
Person1(**person_dict)

Person1(first_name='Claire', last_name='Clark', born_on=None)

## Custom field validation
To apply custom checks to a field, add a field validator that raises a `ValueError` if the value is not acceptable and otherwise returns the (possibly cleaned up) value:

In [28]:
import re
from pydantic import BaseModel, field_validator

class Person2(BaseModel):
    first_name: str
    last_name: str
    born_on: date | None = None

    @field_validator("first_name", "last_name")
    @classmethod
    def validated_name_part(cls, value: str) -> str:
        result = value.strip()
        if not re.match(r"^[a-zA-Z]+$", result):
            raise ValueError(f"name must contain only letters, but got: {result!r}")
        return result

Now we can still define valid persons as before:

In [29]:
Person2(first_name='Alice', last_name='Adams')

Person2(first_name='Alice', last_name='Adams', born_on=None)
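
Because the validator returns the stripped value, the model stores the cleaned-up name rather than the raw input. A small sketch to illustrate this, using a hypothetical `PersonSketch` class that repeats the validator from `Person2`:

```python
import re

from pydantic import BaseModel, field_validator


class PersonSketch(BaseModel):  # hypothetical name, same validator as Person2
    first_name: str
    last_name: str

    @field_validator("first_name", "last_name")
    @classmethod
    def validated_name_part(cls, value: str) -> str:
        # Remove surrounding whitespace before checking the characters.
        result = value.strip()
        if not re.match(r"^[a-zA-Z]+$", result):
            raise ValueError(f"name must contain only letters, but got: {result!r}")
        # The returned value is what ends up stored in the model.
        return result


alice = PersonSketch(first_name="  Alice ", last_name="Adams\n")
print(alice)
```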

## Validation errors
But passing a non-string value or a name with punctuation characters fails:

In [30]:
Person2(first_name=123, last_name='?!&')

ValidationError: 2 validation errors for Person2
first_name
  Input should be a valid string [type=string_type, input_value=123, input_type=int]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
last_name
  Value error, name must contain only letters, but got: '?!&' [type=value_error, input_value='?!&', input_type=str]
    For further information visit https://errors.pydantic.dev/2.9/v/value_error

Notice that pydantic keeps validating the remaining fields after one of them fails and reports all errors together, so you might get multiple errors at once.
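
To handle such errors in code instead of reading the message, you can catch `ValidationError` and inspect its `errors()` list. This is a minimal sketch assuming pydantic 2.x. The class name `PersonSketch` is made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class PersonSketch(BaseModel):  # hypothetical name
    first_name: str
    last_name: str


problems = []
try:
    # Wrong type for first_name and no last_name at all.
    PersonSketch(first_name=123)
except ValidationError as error:
    # Each entry is a dict; "loc" is a tuple naming the field, "type" the kind of error.
    problems = [(item["loc"], item["type"]) for item in error.errors()]

for location, error_type in problems:
    print(location, error_type)
```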

## Advanced validation with annotated fields
Pydantic lets you annotate fields to enable validation beyond type checks, for instance whether an `int` is within a specified range:

In [31]:
from pydantic import BaseModel, Field
from typing_extensions import Annotated

class Restaurant(BaseModel):
    rating: Annotated[int, Field(strict=True, ge=1, le=5)]
    
Restaurant(rating=3)

Restaurant(rating=3)

In [32]:
Restaurant(rating=-999)

ValidationError: 1 validation error for Restaurant
rating
  Input should be greater than or equal to 1 [type=greater_than_equal, input_value=-999, input_type=int]
    For further information visit https://errors.pydantic.dev/2.9/v/greater_than_equal
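
The `strict=True` part of the annotation turns off type conversion for that field. The following sketch, assuming pydantic 2.x, compares a lax and a strict variant. Both class names are made up for illustration.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class LaxRestaurant(BaseModel):  # hypothetical name, no strict=True
    rating: Annotated[int, Field(ge=1, le=5)]


class StrictRestaurant(BaseModel):  # hypothetical name, same as Restaurant
    rating: Annotated[int, Field(strict=True, ge=1, le=5)]


# Without strict mode, the string "3" is converted to the int 3.
lax = LaxRestaurant(rating="3")
print(repr(lax.rating))

# With strict mode, a string is rejected even if it looks like a number.
strict_error_type = None
try:
    StrictRestaurant(rating="3")
except ValidationError as error:
    strict_error_type = error.errors()[0]["type"]
print(strict_error_type)
```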

## Validation across fields
In addition to field validators, model validators can check that multiple fields are consistent with each other. For example, a person can only have died after they were born:

In [34]:
from pydantic import BaseModel, model_validator

class Person3(BaseModel):
    first_name: str
    last_name: str
    born_on: date
    died_on: date | None = None
    
    @model_validator(mode="after")
    def check_is_born_before_died(self) -> "Person3":
        if self.died_on is not None:
            if self.born_on >= self.died_on:
                raise ValueError(
                    "A person must be born before they died."
                )
        return self

Let's try:

In [35]:
Person3(first_name="Alice", last_name="Adams", born_on=date(1970, 1, 1))

Person3(first_name='Alice', last_name='Adams', born_on=datetime.date(1970, 1, 1), died_on=None)

In [36]:
Person3(first_name="Bob", last_name="Smith", born_on=date(1960, 1, 1), died_on=date(2020, 1, 1))

Person3(first_name='Bob', last_name='Smith', born_on=datetime.date(1960, 1, 1), died_on=datetime.date(2020, 1, 1))

But this fails:

In [37]:
Person3(first_name="Claire", last_name="Clark", born_on=date(2020, 1, 1), died_on=date(1950, 1, 1))

ValidationError: 1 validation error for Person3
  Value error, A person must be born before they died. [type=value_error, input_value={'first_name': 'Claire', ...tetime.date(1950, 1, 1)}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/value_error

## Camel case vs snake case
If the input uses camel case field names, as is common with JSON, the Python code can remain snake case by using an `alias`. This keeps the source code conformant with the [PEP8 naming guidelines](https://peps.python.org/pep-0008/).

In [38]:
from pydantic import BaseModel, Field

class Person4(BaseModel):
    first_name: str = Field(alias="firstName")
    last_name: str = Field(alias="lastName")

For example:

In [39]:
person_json = {"firstName": "Alice", "lastName": "Adams"}  # avoid shadowing the json module
Person4(**person_json)

Person4(first_name='Alice', last_name='Adams')
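
With many fields, writing an alias for each one gets repetitive. As a sketch, assuming pydantic 2.x, an `alias_generator` in the model configuration can derive the camel case aliases automatically, and `model_dump(by_alias=True)` writes them back out. The class name `PersonSketch` is made up for illustration.

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class PersonSketch(BaseModel):  # hypothetical name, same fields as Person4
    # Derive aliases such as "firstName" from the snake case field names.
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str


alice = PersonSketch(**{"firstName": "Alice", "lastName": "Adams"})

# Attribute access and the default export use the snake case names.
print(alice.first_name)
plain = alice.model_dump()

# Exporting by alias restores the camel case names.
exported = alice.model_dump(by_alias=True)
print(exported)
```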