# Pydantic

[Official help](https://pydantic-docs.helpmanual.io/)

Pydantic lets you define and validate data structures in Python using native type annotations.


In [1]:
from datetime import datetime
from typing import List, Optional
import pydantic

## Basic Usage

The core functionality of pydantic comes through the `BaseModel` class, which is used much like regular Python `dataclasses` but is a bit more heavy-duty.
Fields can have defaults (from which pydantic can infer the correct type) and can use classes from the `typing` module.

In [2]:
class User(pydantic.BaseModel):
    id: int
    name = 'John Doe'
    signup_ts: Optional[datetime] = None
    friends: List[int] = []

external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)
user

User(id=123, signup_ts=datetime.datetime(2019, 6, 1, 12, 22), friends=[1, 2, 3], name='John Doe')

And then the attributes can be accessed more or less as you'd expect:

In [3]:
user.id

123

In [4]:
repr(user.signup_ts)

'datetime.datetime(2019, 6, 1, 12, 22)'

In [5]:
user.friends

[1, 2, 3]

In [6]:
user.dict()

{'id': 123,
 'signup_ts': datetime.datetime(2019, 6, 1, 12, 22),
 'friends': [1, 2, 3],
 'name': 'John Doe'}

## Errors

Pydantic raises a `ValidationError` if validation fails.
The details can be extracted as a JSON string:

In [7]:
try:
    User(signup_ts='broken', friends=[1, 2, 'not number'])
except pydantic.ValidationError as e:
    print(e.json(indent = 4))

[
    {
        "loc": [
            "id"
        ],
        "msg": "field required",
        "type": "value_error.missing"
    },
    {
        "loc": [
            "signup_ts"
        ],
        "msg": "invalid datetime format",
        "type": "value_error.datetime"
    },
    {
        "loc": [
            "friends",
            2
        ],
        "msg": "value is not a valid integer",
        "type": "type_error.integer"
    }
]
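The same details are also available as Python objects through `ValidationError.errors()`, which can be easier to work with in code than the JSON string. A minimal sketch, using a hypothetical `Account` model that is not part of the notebook above; only `loc` is inspected because the exact `msg` and `type` strings can differ between pydantic versions:

```python
from typing import List

import pydantic


class Account(pydantic.BaseModel):  # hypothetical model for illustration
    id: int
    friends: List[int] = []


try:
    Account(friends=[1, 2, 'not number'])
except pydantic.ValidationError as e:
    # errors() returns a list of dicts, one per failed location
    failed_locations = [error['loc'] for error in e.errors()]

print(failed_locations)
```

Each `loc` is a tuple, so a failure inside a list carries both the field name and the index of the offending item.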


## Field Types

Pydantic supports the standard library and `typing` data types.
The [official documentation](https://pydantic-docs.helpmanual.io/usage/types/) has the full list, but a useful subset is:

| Data Type            | Notes                                         |
|----------------------|-----------------------------------------------|
| `None`               |                                               |
| `bool`               | Common-sense conversion (not just evaluated). |
| `int`                | Will attempt conversion.                      |
| `float`              | Will attempt conversion.                      |
| `str`                | Will attempt conversion.                      |
| `bytes`              |                                               |
| `datetime.date`      |                                               |
| `datetime.time`      |                                               |
| `datetime.datetime`  |                                               |
| `datetime.timedelta` |                                               |
| `typing.Any`         | Can be `None` - optional.                     |
| `typing.Union`       | Can get a bit messy.                          |
| `typing.Dict`        |                                               |
| `typing.Set`         |                                               |
| `typing.List`        |                                               |
| `enum.Enum`          |                                               |
| `decimal.Decimal`    |                                               |
| `pathlib.Path`       |                                               |
| `uuid.UUID`          |                                               |

Pydantic also has its own [additional types](https://pydantic-docs.helpmanual.io/usage/types/#pydantic-types) that cover a lot of common use-cases out of the box.
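A short sketch of conversion and of one additional type, using a hypothetical `Order` model: the `bool` field accepts common string spellings rather than relying on Python truthiness, and `pydantic.PositiveInt` rejects values that are not greater than zero.

```python
import pydantic


class Order(pydantic.BaseModel):  # hypothetical model for illustration
    quantity: pydantic.PositiveInt
    gift_wrapped: bool


order = Order(quantity='3', gift_wrapped='no')
# bool('no') would be True in plain Python, but pydantic parses it as False
print(order.quantity, order.gift_wrapped)

try:
    Order(quantity=0, gift_wrapped=True)
    zero_rejected = False
except pydantic.ValidationError:
    zero_rejected = True  # 0 is not a positive integer
print(zero_rejected)
```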

## Validators

You can also have more complicated validation, for example:

In [8]:
class UserModel(pydantic.BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @pydantic.validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @pydantic.validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

    @pydantic.validator('username')
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v

user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)

This allows more complicated, arbitrary validation between fields:

In [9]:
try:
    UserModel(
        name='samuel',
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn2',
    )
except pydantic.ValidationError as e:
    print(e)

2 validation errors for UserModel
name
  must contain a space (type=value_error)
password2
  passwords do not match (type=value_error)


A few notes:

* Validators are class methods.
* They should either:
    * return the parsed value, or
    * raise a `ValueError`, `TypeError` or `AssertionError` (though `assert` statements may be ignored in some Python environments, e.g. when running with `-O`).
* The second argument is the value to validate (its name doesn't matter).
* They can also take any of several named arguments:
    * `values` - a dictionary with previously validated fields (fields are validated in the order they are defined),
    * `config` - the model config,
    * `field` - the field being validated (useful for more complex field definitions),
    * `**kwargs` - all of the above.
* If validation fails on one field, that field won't be available in `values` for the following fields.
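To illustrate the `values` argument and the final point, here is a minimal sketch with a hypothetical `Booking` model. The `end` validator reads `start` from `values` and guards against `start` being absent because it failed its own validation:

```python
import pydantic


class Booking(pydantic.BaseModel):  # hypothetical model for illustration
    start: int
    end: int

    @pydantic.validator('end')
    def end_after_start(cls, v, values):
        # 'start' is missing from values if it failed validation
        if 'start' in values and v <= values['start']:
            raise ValueError('end must be after start')
        return v


def failed_fields(**data):
    """Return the locations of the fields that failed validation."""
    try:
        Booking(**data)
    except pydantic.ValidationError as e:
        return [error['loc'] for error in e.errors()]
    return []


print(failed_fields(start=1, end=5))
print(failed_fields(start=10, end=5))
print(failed_fields(start='oops', end=5))
```

In the last call only `start` is reported: the `end` validator still runs, but without the `'start' in values` guard it would fail with a `KeyError` instead of a clean validation error.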


### Pre-validation

If `pre=True` is passed to the `validator` decorator, the method runs before pydantic's own validation and type conversion - useful for fixing up raw input on initialisation.
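As a minimal sketch, assuming a hypothetical `Payment` model, a pre-validator can clean up a raw string so that the normal `float` conversion then succeeds:

```python
import pydantic


class Payment(pydantic.BaseModel):  # hypothetical model for illustration
    amount: float

    @pydantic.validator('amount', pre=True)
    def strip_thousands_separator(cls, v):
        # Runs on the raw input, before pydantic converts it to float
        if isinstance(v, str):
            return v.replace(',', '')
        return v


payment = Payment(amount='1,234.50')
print(payment.amount)
```

Without `pre=True` the validator would never see the string, because the conversion to `float` would already have failed.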

## Re-using Validators

Validators can be applied to several fields at once by passing multiple field names, or to all fields by passing `'*'`.
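A minimal sketch of the multiple-name form, using a hypothetical `Person` model; the `age` field is not named in the decorator, so the validator does not touch it:

```python
import pydantic


class Person(pydantic.BaseModel):  # hypothetical model for illustration
    first_name: str
    last_name: str
    age: int

    # One validator registered for two fields by passing both names
    @pydantic.validator('first_name', 'last_name')
    def tidy_name(cls, v):
        return v.strip().title()


person = Person(first_name='  ada ', last_name='LOVELACE', age=36)
print(person.first_name, person.last_name)
```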

## Validating Iterables

Passing `each_item=True` causes the validator to be called on each item of the collection rather than on the collection as a whole.
This works for the values of both `List`s and `Dict`s.
Alternatively, you can iterate over the items inside the validator.

In [10]:
from typing import List
from pydantic import BaseModel, ValidationError, validator


class DemoModel(pydantic.BaseModel):
    square_numbers: List[int] = []
    cube_numbers: List[int] = []

    @pydantic.validator('*', pre=True)
    def split_str(cls, v):
        if isinstance(v, str):
            return v.split('|')
        return v

    @pydantic.validator('cube_numbers', 'square_numbers')
    def check_sum(cls, v):
        if sum(v) > 42:
            raise ValueError('sum of numbers greater than 42')
        return v

    @pydantic.validator('square_numbers', each_item=True)
    def check_squares(cls, v):
        assert v ** 0.5 % 1 == 0, f'{v} is not a square number'
        return v

    @pydantic.validator('cube_numbers', each_item=True)
    def check_cubes(cls, v):
        # 64 ** (1 / 3) == 3.9999999999999996 (!)
        # this is not a good way of checking cubes
        assert v ** (1 / 3) % 1 == 0, f'{v} is not a cubed number'
        return v

DemoModel(square_numbers=[1, 4, 9])

DemoModel(square_numbers=[1, 4, 9], cube_numbers=[])

In [11]:
DemoModel(square_numbers='1|4|16')

DemoModel(square_numbers=[1, 4, 16], cube_numbers=[])
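The alternative mentioned above, iterating inside the validator instead of using `each_item=True`, might look like the sketch below. The `CubeModel` class and the `is_perfect_cube` helper are hypothetical names. The helper rounds the floating-point cube root and confirms it with integer arithmetic, which sidesteps the `64 ** (1 / 3)` problem noted in the comments of `check_cubes`.

```python
from typing import List

import pydantic


def is_perfect_cube(n: int) -> bool:
    # Round the float cube root, then confirm with exact integer arithmetic
    # (adequate for moderately sized integers)
    root = round(abs(n) ** (1 / 3))
    return root ** 3 == abs(n)


class CubeModel(pydantic.BaseModel):  # hypothetical model for illustration
    cube_numbers: List[int] = []

    @pydantic.validator('cube_numbers')
    def check_cubes(cls, v):
        # The validator receives the whole list and loops over it itself
        for number in v:
            if not is_perfect_cube(number):
                raise ValueError(f'{number} is not a cubed number')
        return v


print(CubeModel(cube_numbers=[1, 8, 64]).cube_numbers)
```

One trade-off of this form is that the error is reported against the field as a whole rather than against the index of the offending item.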

## Forcing Validation

Validators aren't run on fields for which no value is supplied, but `always=True` forces the validator to be called (e.g. for dynamic default generation, in conjunction with `pre=True`).

In [12]:
class DemoModel(pydantic.BaseModel):
    ts: datetime = None

    @pydantic.validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()

DemoModel()

DemoModel(ts=datetime.datetime(2021, 3, 29, 20, 46, 41, 194498))