# Pydantic

Pydantic is a data validation library. You define your data models with plain Python type hints, and Pydantic enforces those types at runtime.

|                 | **pydantic** | **dataclass** |
|-----------------|--------------|---------------|
|      Type Hints |       ✅      |       ✅       |
| Data Validation |       ✅      |       ❌       |
|   Serialization |       ✅      |       ⚠️       |
|        Built-In |       ❌      |       ✅       |

Use cases for Pydantic:
- Complex data models
- Heavy serialization and deserialization (serde)
- Frequent interaction with external APIs

Dataclasses will work for everything else.
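
To make the "Data Validation" row of the table concrete, here is a minimal sketch comparing the two. The `PointDC` and `PointModel` names are made up for this illustration.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class PointDC:  # hypothetical dataclass for illustration
    x: int
    y: int


class PointModel(BaseModel):  # hypothetical Pydantic model with the same fields
    x: int
    y: int


# A dataclass stores whatever it is given; the type hints are not checked.
dc = PointDC(x="not a number", y=2)
print(type(dc.x).__name__)

# Pydantic coerces compatible input (the string "1" becomes the int 1)...
coerced = PointModel(x="1", y=2)
print(coerced.x)

# ...and rejects input it cannot convert.
try:
    PointModel(x="not a number", y=2)
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

If you only need a typed container for data you already trust, the dataclass behaviour is usually enough.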

In [1]:
from datetime import datetime
from typing import List, Tuple

from pydantic import BaseModel, Field, computed_field
import uuid

class Delivery(BaseModel):
    id: str = Field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: datetime
    dimensions: Tuple[int, int, int]

    @computed_field
    @property
    def volume(self) -> int:
        return self.dimensions[0] * self.dimensions[1] * self.dimensions[2]

`Field` is a feature-rich function. It lets you attach constraints and validators to a field, declare discriminators, and control immutability. `Field` can be used with regular dataclasses, but it is not recommended.  
See [Concepts -> Fields](https://docs.pydantic.dev/latest/concepts/fields/) for details.
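
A short sketch of a few `Field` options may help here. The `Package` model and its field names are assumptions for illustration; `min_length`, `max_length`, `gt`, `frozen`, `description`, and `default_factory` are regular `Field` arguments.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Package(BaseModel):  # hypothetical model for illustration
    # frozen=True makes this single field immutable after creation
    sku: str = Field(min_length=3, max_length=12, frozen=True)
    # gt=0 rejects zero and negative weights
    weight_kg: float = Field(gt=0, description="Gross weight in kilograms")
    # default_factory builds a fresh list for every instance
    tags: List[str] = Field(default_factory=list)


pkg = Package(sku="ABC-123", weight_kg=2.5)

errors = []
try:
    Package(sku="ABC-123", weight_kg=0)  # violates gt=0
except ValidationError as exc:
    errors.append(exc.errors()[0]["type"])

try:
    pkg.sku = "XYZ-999"  # assignment to a frozen field
except ValidationError as exc:
    errors.append(exc.errors()[0]["type"])

print(errors)
```

Both failures surface as `ValidationError`, so calling code can handle constraint violations and immutability violations in the same way.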

In [2]:
valid_delivery = Delivery(
    timestamp="2020-01-02T03:04:05Z",
    dimensions=[10, 20, 30]
)
print(valid_delivery)
print(valid_delivery.model_dump())

id='38f31e7c9dc14f659e33ed44aa494ca8' timestamp=datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC)) dimensions=(10, 20, 30) volume=6000
{'id': '38f31e7c9dc14f659e33ed44aa494ca8', 'timestamp': datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC)), 'dimensions': (10, 20, 30), 'volume': 6000}


In [3]:
invalid_delivery = Delivery(
    timestamp="July 4th, 1776",
    dimensions=[10, 20, 30]
)

ValidationError: 1 validation error for Delivery
timestamp
  Input should be a valid datetime or date, invalid character in year [type=datetime_from_date_parsing, input_value='July 4th, 1776', input_type=str]
    For further information visit https://errors.pydantic.dev/2.6/v/datetime_from_date_parsing
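
In application code you would normally catch the error rather than let it propagate. The sketch below assumes a hypothetical `Parcel` model and collects one message per failing field from `ValidationError.errors()`.

```python
from datetime import datetime

from pydantic import BaseModel, ValidationError


class Parcel(BaseModel):  # hypothetical model for illustration
    timestamp: datetime
    weight: int


problems = {}
error_count = 0
try:
    Parcel(timestamp="July 4th, 1776", weight="heavy")
except ValidationError as exc:
    # errors() returns one dict per failing field, with keys such as
    # "loc" (where the error occurred) and "msg" (human-readable text)
    problems = {err["loc"][0]: err["msg"] for err in exc.errors()}
    error_count = exc.error_count()

for field, message in problems.items():
    print(f"{field}: {message}")
```

Pydantic reports all failing fields in one exception instead of stopping at the first, which is convenient when returning errors to an API client.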

# Custom Validation

If you want to add an extra validator for a field, Pydantic provides the `field_validator` decorator.

In [None]:
from pydantic import field_validator

class User(BaseModel):
    username: str
    account_id: int
    delivery: Delivery

    @field_validator("account_id")
    @classmethod
    def validate_account_id(cls, value):
        if value <= 0:
            raise ValueError("account_id has to be a positive integer")
        return value

valid_user = User(
    username="johndoe",
    account_id=123,
    delivery=valid_delivery
)

In [None]:
invalid_user = User(
    username="johndoe",
    account_id=-123,
    delivery=valid_delivery
)

ValidationError: 1 validation error for User
account_id
  Value error, account_id has to be a positive integer [type=value_error, input_value=-123, input_type=int]
    For further information visit https://errors.pydantic.dev/2.6/v/value_error
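
A validator does not have to be a pure check: whatever it returns becomes the field's value, so it can also normalize input. A minimal sketch, assuming a hypothetical `Account` model:

```python
from pydantic import BaseModel, field_validator


class Account(BaseModel):  # hypothetical model for illustration
    username: str

    @field_validator("username")
    @classmethod
    def normalize_username(cls, value: str) -> str:
        # Trim whitespace and lowercase before storing the value
        cleaned = value.strip().lower()
        if not cleaned:
            raise ValueError("username must not be empty")
        return cleaned


account = Account(username="  JohnDoe ")
print(account.username)  # johndoe
```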

# JSON Serialization

Pydantic provides built-in support for JSON serialization and deserialization.

In [None]:
# Pydantic v2 names; .json() and .dict() are deprecated v1 aliases
user_json_obj = valid_user.model_dump_json()
print(user_json_obj)

user_dict_obj = valid_user.model_dump()
print(user_dict_obj)

{"username":"johndoe","account_id":123,"delivery":{"id":"034743c53b734a73bdc5d91621774817","timestamp":"2020-01-02T03:04:05Z","dimensions":[10,20,30],"volume":6000}}
{'username': 'johndoe', 'account_id': 123, 'delivery': {'id': '034743c53b734a73bdc5d91621774817', 'timestamp': datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC)), 'dimensions': (10, 20, 30), 'volume': 6000}}


In [None]:
# model_validate_json() is the Pydantic v2 replacement for parse_raw()
user = User.model_validate_json(user_json_obj)
print(user)

username='johndoe' account_id=123 delivery=Delivery(id='034743c53b734a73bdc5d91621774817', timestamp=datetime.datetime(2020, 1, 2, 3, 4, 5, tzinfo=TzInfo(UTC)), dimensions=(10, 20, 30), volume=6000)
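
The round trip can be checked directly. This sketch uses a hypothetical `Crate` model and also shows `model_dump(mode="json")`, which returns a dict containing only JSON-compatible types.

```python
from datetime import datetime
from typing import Tuple

from pydantic import BaseModel


class Crate(BaseModel):  # hypothetical model for illustration
    label: str
    packed_at: datetime
    dimensions: Tuple[int, int, int]


crate = Crate(label="A1", packed_at="2020-01-02T03:04:05Z", dimensions=[1, 2, 3])

# Serialize to a JSON string, then parse and validate it again
as_json = crate.model_dump_json()
restored = Crate.model_validate_json(as_json)
print(restored == crate)

# mode="json" keeps the dict shape but converts values to JSON-friendly types
json_ready = crate.model_dump(mode="json")
print(json_ready)
```

The `mode="json"` form is handy when another library expects a plain dict that it will serialize itself.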


In [None]:
print(User.model_json_schema())  # classmethod, so no instance is needed

{'$defs': {'Delivery': {'properties': {'id': {'title': 'Id', 'type': 'string'}, 'timestamp': {'format': 'date-time', 'title': 'Timestamp', 'type': 'string'}, 'dimensions': {'maxItems': 3, 'minItems': 3, 'prefixItems': [{'type': 'integer'}, {'type': 'integer'}, {'type': 'integer'}], 'title': 'Dimensions', 'type': 'array'}}, 'required': ['timestamp', 'dimensions'], 'title': 'Delivery', 'type': 'object'}}, 'properties': {'username': {'title': 'Username', 'type': 'string'}, 'account_id': {'title': 'Account Id', 'type': 'integer'}, 'delivery': {'$ref': '#/$defs/Delivery'}}, 'required': ['username', 'account_id', 'delivery'], 'title': 'User', 'type': 'object'}
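
Note that the computed `volume` field does not appear in the schema above. By default the schema describes the input accepted during validation, and computed fields only exist on output. A small sketch, assuming a hypothetical `Box` model, that requests the serialization schema instead:

```python
from pydantic import BaseModel, computed_field


class Box(BaseModel):  # hypothetical model for illustration
    width: int
    height: int

    @computed_field
    @property
    def area(self) -> int:
        return self.width * self.height


# mode="validation" is the default and describes accepted input
validation_schema = Box.model_json_schema()
# mode="serialization" describes what model_dump() / model_dump_json() emit
serialization_schema = Box.model_json_schema(mode="serialization")

print(sorted(validation_schema["properties"]))
print(sorted(serialization_schema["properties"]))
```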


# Pydantic Settings

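Pydantic Settings (the separate `pydantic-settings` package) handles settings management. Values are read from environment variables first; the `.env` file is consulted only for settings that have no matching environment variable.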

In [1]:
import os
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

os.environ['AUTH_KEY'] = 'auth_key!'
os.environ['API_KEY'] = 'api_key!!!'

class Settings(BaseSettings):
    # Add extra="allow" to SettingsConfigDict to accept entries in the .env file that are not declared as fields
    model_config = SettingsConfigDict(env_file='.env', env_file_encoding='utf-8')  # only needed if you want to read a .env file

    auth_key: str = Field(alias='AUTH_KEY')  # alias is case-insensitive
    api_key: str = Field()  # no alias needed

settings = Settings()
print(settings.model_dump())

{'auth_key': 'auth_key!', 'api_key': 'api_key!!!'}
