## Imports

In [None]:
from pydantic import BaseModel


## Pydantic `BaseModel`

- A pydantic model is basically a class that inherits from the `BaseModel` class.
- To define fields, you simply declare annotated class attributes.

In [1]:
from pydantic import BaseModel

class Person(BaseModel):
    first_name: str
    last_name: str
    age: int

In [2]:
p = Person(first_name="Isac", last_name="Newton", age=84)
p

Person(first_name='Isac', last_name='Newton', age=84)

- What happens if we pass values of a different datatype to the attributes?
- Pydantic tries to `cast` each value to the declared type. If the conversion is possible it is done silently; otherwise a `ValidationError` is raised.
- This guarantees that every field on a model instance holds the declared type.

In [3]:
Person(first_name=100, last_name=200, age="3")

Person(first_name='100', last_name='200', age=3)

In [4]:
Person(first_name=100, last_name=200, age="x")

ValidationError: 1 validation error for Person
age
  value is not a valid integer (type=type_error.integer)
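
The casting behaviour described above can also be checked programmatically. The following is a minimal sketch, not part of the original notebook: the helper `build_person` is a hypothetical name, and it relies only on `ValidationError.errors()`, which returns one dictionary per failing field.

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    first_name: str
    last_name: str
    age: int


def build_person(**data):
    """Return (person, errors); on success the error list is empty."""
    try:
        return Person(**data), []
    except ValidationError as exc:
        # exc.errors() is a list of dicts with "loc", "msg" and "type" keys
        return None, exc.errors()


ok, ok_errors = build_person(first_name="Isac", last_name="Newton", age="3")
bad, bad_errors = build_person(first_name="Isac", last_name="Newton", age="x")

print(ok.age, type(ok.age).__name__)       # the string "3" was cast to an int
print([err["loc"] for err in bad_errors])  # location of each failing field
```

Returning the error list instead of re-raising is only one possible design; it keeps the calling code free of `try`/`except` blocks.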

- In the same way, if we try to create an incomplete instance of the class, a `ValidationError` is raised.

In [5]:
from pydantic import ValidationError

try: 
    Person(first_name="Isac")
except ValidationError as e:
    print(e)

2 validation errors for Person
last_name
  field required (type=value_error.missing)
age
  field required (type=value_error.missing)


- The exception can be rendered as JSON, which makes it easy to handle the errors programmatically at runtime.

In [8]:
from pydantic import ValidationError

try: 
    Person(first_name="Isac")
except ValidationError as e:
    print(e.json())

[
  {
    "loc": [
      "last_name"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "age"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]
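
As an illustration of handling that JSON at runtime, the sketch below decodes it with the standard `json` module and extracts the names of the failing fields. The helper name `failing_fields` is an assumption made for this example.

```python
import json

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    first_name: str
    last_name: str
    age: int


def failing_fields(data: dict) -> list:
    """Return the names of the fields that failed validation (empty if none)."""
    try:
        Person(**data)
    except ValidationError as exc:
        # exc.json() is a JSON array; each entry has a "loc" list naming the field
        return [entry["loc"][0] for entry in json.loads(exc.json())]
    return []


print(failing_fields({"first_name": "Isac"}))
print(failing_fields({"first_name": "Isac", "last_name": "Newton", "age": 84}))
```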


## Working with optional fields

In [11]:
from pydantic import BaseModel
from typing import Optional # for python <3.10

class Person(BaseModel):
    first_name: str
    last_name: str
    age: int | None # for python >=3.10
    # age: Optional[int] # for python <3.10

In [12]:
Person(first_name="Isac", last_name="Newton")

Person(first_name='Isac', last_name='Newton', age=None)
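
A variant worth considering is to spell the default out. This sketch assumes nothing beyond `typing.Optional`; writing `= None` explicitly keeps the intent visible and does not depend on how a bare `int | None` annotation is interpreted by the installed pydantic version.

```python
from typing import Optional

from pydantic import BaseModel


class Person(BaseModel):
    first_name: str
    last_name: str
    # explicit default: the field may be omitted and may hold None
    age: Optional[int] = None


unknown_age = Person(first_name="Isac", last_name="Newton")
known_age = Person(first_name="Isac", last_name="Newton", age="84")

print(unknown_age.age)
print(known_age.age)
```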

## Specifying default values

In [13]:
from pydantic import BaseModel

# specifying a default value of None has the same effect as declaring an optional field.
class Person(BaseModel):
    first_name: str = None
    last_name: str
    age: int = None 

In [14]:
Person(last_name="Newton")

Person(first_name=None, last_name='Newton', age=None)

In [16]:
try:
    Person(first_name="Isac", age=84)
except ValidationError as e:
    print(e.json())

[
  {
    "loc": [
      "last_name"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]


## Serializing the model to a dictionary or directly to JSON

In [17]:
from pydantic import BaseModel

class Person(BaseModel):
    first_name: str = "Isaac"
    last_name: str
    age: int = 84 

In [18]:
isaac = Person(last_name="Newton")
isaac.dict()

{'first_name': 'Isaac', 'last_name': 'Newton', 'age': 84}

In [20]:
# creating a json string
isaac.json()

'{"first_name": "Isaac", "last_name": "Newton", "age": 84}'

In [23]:
# to exclude/include keys from the dictionary, pass them as a set.

isaac.dict(exclude={"first_name", "age"})

{'last_name': 'Newton'}

In [26]:
# JSON works the same way as before.
# It uses the `json.dumps` function, so any of its parameters (such as `indent`) can be passed through.

isaac.json(include={"last_name", "age"}, indent=4)

'{\n    "last_name": "Newton",\n    "age": 84\n}'

In [27]:
print(isaac.json(include={"last_name", "age"}, indent=4))

{
    "last_name": "Newton",
    "age": 84
}
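
To tie the serialization calls together, here is a small round-trip sketch: a model is dumped to a dictionary, rebuilt from it, and a partial view is passed to the standard `json` module. The variable names `clone` and `partial` are illustrative only.

```python
import json

from pydantic import BaseModel


class Person(BaseModel):
    first_name: str = "Isaac"
    last_name: str
    age: int = 84


isaac = Person(last_name="Newton")

# dump to a plain dict, then rebuild an equal model from it
as_dict = isaac.dict()
clone = Person(**as_dict)

# a partial view can be handed to the standard json module as well
partial = json.dumps(isaac.dict(include={"last_name", "age"}), sort_keys=True)

print(clone == isaac)
print(partial)
```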


### Deserialize
- Take a JSON string or a dictionary and create an instance of the model.
- It will also `cast` the data appropriately.
- It removes the need to call the `json` library yourself for serialization and deserialization.

In [28]:
from datetime import date
from pydantic import BaseModel

class Person(BaseModel):
    first_name: str = "Isaac"
    last_name: str
    birth: date

In [31]:
isaac_dict = {
    "first_name": "Isaac",
    "last_name": "Newton",
    "birth": date(1643, 1, 4)
}

isaac = Person.parse_obj(isaac_dict)
isaac

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))

In [32]:
isaac_dict = {
    "first_name": "Isaac",
    "last_name": "Newton",
    "birth": "1643-01-04" # pydantic automatically casts the datatype when it is possible.
}

isaac = Person.parse_obj(isaac_dict)
isaac

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))

In [37]:
# it works the same way with json string.

isaac_json = '''
{
    "first_name": "Isaac",
    "last_name": "Newton",
    "birth": "1643-01-04"
}
'''

# the method changes to 'parse_raw'
isaac = Person.parse_raw(isaac_json)
isaac

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))
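
The two entry points can be compared side by side. This sketch assumes the same `Person` model as above and simply checks that parsing the raw JSON string and parsing the decoded dictionary produce equal instances.

```python
import json
from datetime import date

from pydantic import BaseModel


class Person(BaseModel):
    first_name: str = "Isaac"
    last_name: str
    birth: date


payload = '{"first_name": "Isaac", "last_name": "Newton", "birth": "1643-01-04"}'

from_raw = Person.parse_raw(payload)              # JSON string in
from_obj = Person.parse_obj(json.loads(payload))  # already-decoded dict in

print(from_raw == from_obj)
print(from_raw.birth.year)
```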

## Field aliases
- Converting between JSON `camelCase` and Python `snake_case` names.
- By convention, field names in `JSON` documents are written in `camelCase`, e.g. `firstName`.

In [38]:
from pydantic import Field
from datetime import date
from pydantic import BaseModel

class Person(BaseModel):
    first_name: str = Field(default="Isaac", alias="firstName")
    last_name: str = Field(alias="lastName")
    birth: date = None


In [77]:
isaac_dict_snake = {
    "first_name": "Isaac",
    "last_name": "Newton",
    "birth": "1643-01-04"
}

try:
    isaac = Person.parse_obj(isaac_dict_snake)
except ValidationError as e:
    print(e.json())
    
# parsing failed above, so `isaac` here is the instance left behind by an earlier cell
isaac

[
  {
    "loc": [
      "lastName"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]


Person(first_name='Isaac', last_name='Newton', birth=None)

In [44]:
# to deserialize data when aliases are defined, the input must use the alias names (camelCase).

isaac_dict = {
    "firstName": "Isaac",
    "lastName": "Newton",
    "birth": "1643-01-04"
}

try:
    isaac = Person.parse_obj(isaac_dict)
except ValidationError as e:
    print(e.json())
    
isaac

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))

In [48]:
# when we serialize, the field name (snake_case) is used rather than the alias, for both dict and JSON

isaac.dict()

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'birth': datetime.date(1643, 1, 4)}

In [47]:
isaac.json()

'{"first_name": "Isaac", "last_name": "Newton", "birth": "1643-01-04"}'

In [50]:
# when a field has an alias, the alias (camelCase) must also be used to create an instance of the model

Person(last_name="Newton")

ValidationError: 1 validation error for Person
lastName
  field required (type=value_error.missing)

In [51]:
Person(lastName="Newton")

Person(first_name='Isaac', last_name='Newton', birth=None)
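
The alias behaviour shown so far can be summarised as a round trip: a camelCase payload comes in, attributes are accessed in snake_case, and the aliases are restored on output only when requested. A minimal sketch, using a reduced `Person` model with two fields:

```python
from pydantic import BaseModel, Field


class Person(BaseModel):
    first_name: str = Field(default="Isaac", alias="firstName")
    last_name: str = Field(alias="lastName")


incoming = {"firstName": "Isaac", "lastName": "Newton"}  # camelCase payload
person = Person(**incoming)

print(person.last_name)            # attributes use the snake_case field names
print(person.dict())               # field names by default
print(person.dict(by_alias=True))  # aliases on request
```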

- To be able to populate a field by either its name or its alias, a configuration option must be set.

In [82]:
from pydantic import Field
from datetime import date
from pydantic import BaseModel

class Person(BaseModel):
    first_name: str = Field(default=None, alias="firstName")
    last_name: str = Field(alias="lastName")
    birth: date = None

    class Config:
        allow_population_by_field_name = True

In [83]:
isaac = Person(first_name="Isaac", last_name="Newton")
isaac

Person(first_name='Isaac', last_name='Newton', birth=None)

In [84]:
isaac_dict_cammel = {
    "firstName": "Isaac",
    "lastName": "Newton",
    "birth": "1643-01-04"
}

Person.parse_obj(isaac_dict_cammel)

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))

In [85]:
isaac = Person(first_name="Isaac", last_name="Newton", birth=None)
isaac

Person(first_name='Isaac', last_name='Newton', birth=None)

In [86]:
isaac.dict(by_alias=True)

{'firstName': 'Isaac', 'lastName': 'Newton', 'birth': None}

In [87]:
isaac.json(by_alias=True)

'{"firstName": "Isaac", "lastName": "Newton", "birth": null}'

In [88]:
isaac.dict()

{'first_name': 'Isaac', 'last_name': 'Newton', 'birth': None}

In [89]:
data_junk = {**isaac_dict_snake, "junk": "extra field"}
data_junk

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'birth': '1643-01-04',
 'junk': 'extra field'}

In [91]:
# By default it ignores the extra field

p = Person.parse_obj(data_junk)
p

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4))

In [92]:
hasattr(p, "first_name")

True

In [93]:
hasattr(p, "junk")

False

### Configure how pydantic deals with extra fields
- Ignore (default).
- Allow: add the extra value as a field on the model instance.
- Forbid: raise a `ValidationError`.

`from pydantic import Extra` - `Extra` is an `Enum`.

In [94]:
from datetime import date
from pydantic import BaseModel, Field, Extra

class Person(BaseModel):
    first_name: str = Field(default=None, alias="firstName")
    last_name: str = Field(alias="lastName")
    birth: date = None

    class Config:
        allow_population_by_field_name = True
        extra = Extra.allow


In [95]:
p = Person(**data_junk) # same as Person.parse_obj(data_junk)
p

Person(first_name='Isaac', last_name='Newton', birth=datetime.date(1643, 1, 4), junk='extra field')

In [96]:
p.dict()

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'birth': datetime.date(1643, 1, 4),
 'junk': 'extra field'}

In [97]:
from datetime import date
from pydantic import BaseModel, Field, Extra

class Person(BaseModel):
    first_name: str = Field(default=None, alias="firstName")
    last_name: str = Field(alias="lastName")
    birth: date = None

    class Config:
        allow_population_by_field_name = True
        extra = Extra.forbid 
        # it is best practice to use forbid: this way you are told about any extra field, which either should not be there or should be added to the model definition.

In [99]:
try:
    p = Person.parse_obj(data_junk)
    p
except ValidationError as e:
    print(e.json())

[
  {
    "loc": [
      "junk"
    ],
    "msg": "extra fields not permitted",
    "type": "value_error.extra"
  }
]
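
The difference between the modes is easiest to see on the same input. The sketch below uses two hypothetical models, `StrictPerson` and `OpenPerson`, that differ only in their `extra` setting.

```python
from pydantic import BaseModel, Extra, ValidationError


class StrictPerson(BaseModel):
    last_name: str

    class Config:
        extra = Extra.forbid  # unknown keys are rejected


class OpenPerson(BaseModel):
    last_name: str

    class Config:
        extra = Extra.allow  # unknown keys are kept on the instance


data = {"last_name": "Newton", "junk": "extra field"}

try:
    StrictPerson(**data)
    rejected = []
except ValidationError as exc:
    # the "loc" of each error names the offending key
    rejected = [err["loc"][0] for err in exc.errors()]

kept = OpenPerson(**data).dict()

print(rejected)
print(kept)
```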


## Easier way to define aliases
- Uses a customized `BaseModel` class.
- Useful when you have lots of fields with aliases.

In [106]:
def snake_to_json_camel(value: str) -> str:
    if not isinstance(value, str):
        raise ValueError("Value must be a string.")
    
    words = value.split("_")
    value = "".join(word.title() for word in words if word)
    return f"{value[0].lower()}{value[1:]}"

In [107]:
snake_to_json_camel("__string___test__")

'stringTest'
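
One edge case is worth noting: for an empty string, or a string made only of underscores, the function above ends up indexing `value[0]` on an empty string and raises `IndexError`. The following is a possible hardened variant, offered as a sketch rather than a drop-in replacement. The name `snake_to_camel` is hypothetical, and it uses `str.capitalize` instead of `str.title`, which behaves slightly differently for words containing digits.

```python
def snake_to_camel(value: str) -> str:
    """Convert snake_case to camelCase, tolerating stray underscores."""
    words = [word for word in value.split("_") if word]
    if not words:
        # nothing to convert (empty string or underscores only): return unchanged
        return value
    head, *tail = words
    return head.lower() + "".join(word.capitalize() for word in tail)


for name in ("first_name", "__string___test__", "age", "", "___"):
    print(repr(name), "->", repr(snake_to_camel(name)))
```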

In [108]:
class CustomBaseModel(BaseModel):
    class Config:
        alias_generator = snake_to_json_camel
        extra = Extra.forbid
        allow_population_by_field_name = True

In [109]:
class Person(CustomBaseModel):
    first_name: str = None
    last_name: str
    birth: date = None

In [110]:
p = Person(first_name="Isaac", lastName="Newton")
p

Person(first_name='Isaac', last_name='Newton', birth=None)

In [111]:
p = Person(first_name="Isaac", lastName="Newton", snn=1234)
p

ValidationError: 1 validation error for Person
snn
  extra fields not permitted (type=value_error.extra)

## Validating fields

In [114]:
from pydantic import conint # constrained integer

class Test(CustomBaseModel):
    age: conint(gt=0, le=150) # age must be greater than 0 and less than or equal to 150

In [115]:
Test(age=10)

Test(age=10)

In [116]:
Test(age=-225)

ValidationError: 1 validation error for Test
age
  ensure this value is greater than 0 (type=value_error.number.not_gt; limit_value=0)

In [117]:
from pydantic import conint, constr # constrained string

class Test(CustomBaseModel):
    age: conint(gt=0, le=150) # age must be greater than 0 and less than or equal to 150
    first_name: str = None
    # strip_whitespace -> removes leading and trailing whitespace
    # strict -> does not cast the value, even if it could be cast; a string must be passed.
    # curtail_length -> if the value is longer than the limit (25), only the first 25 chars are kept; no exception is raised.
    last_name: constr(strip_whitespace=True, strict=True, min_length=2, curtail_length=25)

In [118]:
Test(age=84, last_name="    Newton  ")

Test(age=84, first_name=None, last_name='Newton')

In [119]:
Test(age=84, last_name=1236)

ValidationError: 1 validation error for Test
lastName
  str type expected (type=type_error.str)

In [122]:
Test(age=84, last_name="     A       ")

ValidationError: 1 validation error for Test
lastName
  ensure this value has at least 2 characters (type=value_error.any_str.min_length; limit_value=2)

In [121]:
Test(age=84, last_name="Newton--Newton--Newton--Newton--Newton--Newton--")

Test(age=84, first_name=None, last_name='Newton--Newton--Newton--N')

In [123]:
from pydantic import validator

class Test(CustomBaseModel):
    hash_tag: str

    @validator("hash_tag")
    def validate_hash_tag(cls, value):
        if not value.startswith("#"):
            raise ValueError("Hash tag must start with a '#'")
        return value.lower()

In [124]:
t = Test(hash_tag="#TEST")
t

Test(hash_tag='#test')

In [125]:
t = Test(hash_tag="TEST")
t

ValidationError: 1 validation error for Test
hashTag
  Hash tag must start with a '#' (type=value_error)

In [127]:
from pydantic import conint, constr
from pydantic import validator

class Test(CustomBaseModel):
    hash_tag: constr(min_length=5, strip_whitespace=True)

    @validator("hash_tag")
    def validate_hash_tag(cls, value: str):
        if not value.startswith("#"):
            return f"#{value.lower()}"
        return value.lower()

In [129]:
t = Test(hash_tag="TEST5")
t

Test(hash_tag='#test5')
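
The two validator styles above (rejecting versus repairing a value) can be combined in one function. The sketch below uses a hypothetical `Tag` model: it normalizes the value, adds a missing `#`, and still rejects input that is empty once whitespace and `#` are removed.

```python
from pydantic import BaseModel, ValidationError, validator


class Tag(BaseModel):
    name: str

    @validator("name")
    def normalize_name(cls, value):
        # reject empty tags, otherwise return the normalized value
        cleaned = value.strip().lower()
        if not cleaned.lstrip("#"):
            raise ValueError("tag must contain at least one character")
        return cleaned if cleaned.startswith("#") else f"#{cleaned}"


print(Tag(name="  Python ").name)
print(Tag(name="#PyDantic").name)

try:
    Tag(name=" # ")
except ValidationError as exc:
    print(len(exc.errors()), "error")
```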

In [137]:
from enum import Enum
from typing import List, Tuple, Union

class PolygonType(Enum):
    triangle = 3
    tetragon = 4
    pentagon = 5
    hexagon = 6

In [138]:
p = PolygonType.triangle
p

<PolygonType.triangle: 3>

In [139]:
display(p.name)
display(p.value)

'triangle'

3

In [149]:
from pydantic import validator

class PolygonModel(CustomBaseModel):
    polygon_type: PolygonType
    vertices: list[tuple[int | float, int | float]]

    # pydantic matches the optional `values` argument by name; it holds the
    # fields that were already validated (here, `polygon_type`)
    @validator("vertices")
    def validate_vertices(cls, value, values):
        polygon_type = values.get("polygon_type")
        
        if polygon_type:
            num_vertices_required = polygon_type.value
            print(num_vertices_required)

            if len(value) != num_vertices_required:
                raise ValueError(
                    f"For a {polygon_type.name}, exactly {polygon_type.value} vertices are required"
                )
        
        return value

In [150]:
PolygonModel(polygon_type=PolygonType.triangle, vertices=[(1, 1), (2, 2), (3, 3)])

3


PolygonModel(polygon_type=<PolygonType.triangle: 3>, vertices=[(1, 1), (2, 2), (3, 3)])

In [151]:
PolygonModel(polygon_type=PolygonType.triangle, vertices=[(1, 1), (2, 2), (3, 3+2j)])

ValidationError: 2 validation errors for PolygonModel
vertices -> 2 -> 1
  value is not a valid integer (type=type_error.integer)
vertices -> 2 -> 1
  value is not a valid float (type=type_error.float)

In [152]:
PolygonModel(polygon_type=PolygonType.pentagon, vertices=[(1, 1), (2, 2), (3, 3)])

5


ValidationError: 1 validation error for PolygonModel
vertices
  For a pentagon, exactly 5 vertices are required (type=value_error)
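
The same cross-field pattern applies whenever one field depends on another that is declared earlier. A minimal sketch with a hypothetical `Interval` model; note the membership check on `values`, which guards against the case where the earlier field itself failed validation.

```python
from pydantic import BaseModel, ValidationError, validator


class Interval(BaseModel):
    start: int  # declared first, so it is validated first
    end: int

    @validator("end")
    def end_after_start(cls, value, values):
        # `values` only holds fields that already validated successfully
        if "start" in values and value <= values["start"]:
            raise ValueError("end must be greater than start")
        return value


print(Interval(start=1, end=5))

try:
    Interval(start=5, end=1)
except ValidationError as exc:
    print(exc.errors()[0]["loc"])
```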

## Nested Models

- Post
    - byline (one or more authors)
        - author:
            - first_name (required, min 2 chars, max 20 chars)
            - last_name (required, min 1 chars, max 20 chars)
            - display_name (optional, default to first name, initial of last name, min 1 char, max 20 chars)
    - title (required, at least 10 chars, max 50 chars, force title case)
    - sub_title (optional, min 20 chars, max 100 chars)
    - body (required, min 100 chars, max None)
    - links (0 or more)
        - link:
            - name (required, min 5 chars, max 25 chars)
            - url (required, valid url, must include scheme [http/https])

In [187]:
from pydantic import BaseModel, Extra, conlist, constr
from pydantic import validator, AnyHttpUrl

class CustomBaseModel(BaseModel):
    class Config:
        extra = Extra.forbid
        allow_population_by_field_name = True

class Author(CustomBaseModel):
    first_name: constr(min_length=2, max_length=20, strip_whitespace=True)
    last_name: constr(min_length=1, max_length=20, strip_whitespace=True)
    display_name: constr(min_length=1, max_length=25) = None

    # always = True forces the validator to run, even if display_name is None, this is
    # how we can set a dynamic default value
    @validator("display_name", always=True)
    def validate_display_name(cls, value, values):
        # the validator runs even if previous fields did not validate properly,
        # so we only build the default when both prior fields validated OK
        if not value and "first_name" in values and "last_name" in values:
            first_name = values["first_name"]
            last_name = values["last_name"]
            return f"{first_name} {(last_name[0]).upper()}"
        return value

class Link(CustomBaseModel):
    name: constr(min_length=5, max_length=25)
    url: AnyHttpUrl

class Post(CustomBaseModel):
    byline: conlist(item_type=Author, min_items=1)
    title: constr(min_length=10, max_length=50, strip_whitespace=True)
    sub_title: constr(min_length=20, max_length=100, strip_whitespace=True) = None
    body: constr(min_length=100, strip_whitespace=True)
    links: list[Link] = []

    @validator("title")
    def validate_title(cls, value):
        return value and value.title()

In [188]:
Link(name="links", url="https://link")

Link(name='links', url=AnyHttpUrl('https://link', ))

In [189]:
Author(first_name="John", last_name="Do")

Author(first_name='John', last_name='Do', display_name='John D')

In [195]:
post = Post(
    byline=[
        Author(first_name="John", last_name="Do"),
        Author(first_name="Mary", last_name="Do", display_name="Little Mary")
    ],
    title="the long Hair of a bold man",
    body="test, " * 20,
    links=[
        Link(name="links", url="https://link"),
        Link(name="links2", url="https://link2")
    ]
)

print(post.json(indent=4))

{
    "byline": [
        {
            "first_name": "John",
            "last_name": "Do",
            "display_name": "John D"
        },
        {
            "first_name": "Mary",
            "last_name": "Do",
            "display_name": "Little Mary"
        }
    ],
    "title": "The Long Hair Of A Bold Man",
    "sub_title": null,
    "body": "test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test,",
    "links": [
        {
            "name": "links",
            "url": "https://link"
        },
        {
            "name": "links2",
            "url": "https://link2"
        }
    ]
}
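
Nested models do not have to be instantiated by hand: plain dictionaries are accepted wherever a nested model is expected. The sketch below uses deliberately simplified `Author` and `Post` models, without the constraints defined above, to show this.

```python
from typing import List

from pydantic import BaseModel


class Author(BaseModel):
    first_name: str
    last_name: str


class Post(BaseModel):
    title: str
    byline: List[Author]


raw = {
    "title": "The Long Hair Of A Bold Man",
    "byline": [
        {"first_name": "John", "last_name": "Do"},
        {"first_name": "Mary", "last_name": "Do"},
    ],
}

post = Post(**raw)

print(type(post.byline[0]).__name__)  # nested dicts became Author instances
print(post.dict() == raw)             # and dump back to plain dicts
```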


## Working with a pydantic model object

In [196]:
post_2 = Post(
    byline=[
        Author(first_name="John", last_name="Do"),
        Author(first_name="Mary", last_name="Do", display_name="Little Mary")
    ],
    title="the long Hair of a bold man",
    body="test, " * 20,
    links=[
        Link(name="links", url="https://link"),
        Link(name="links2", url="https://link2")
    ]
)

In [198]:
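# `append` mutates the list in place, so every run of this cell adds another author
# (the output below reflects the cell having been run twice)
post_2.byline.append(Author(first_name="Felipe", last_name="Vasconcelos"))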
print(post_2.json(indent=4))

{
    "byline": [
        {
            "first_name": "John",
            "last_name": "Do",
            "display_name": "John D"
        },
        {
            "first_name": "Mary",
            "last_name": "Do",
            "display_name": "Little Mary"
        },
        {
            "first_name": "Felipe",
            "last_name": "Vasconcelos",
            "display_name": "Felipe V"
        },
        {
            "first_name": "Felipe",
            "last_name": "Vasconcelos",
            "display_name": "Felipe V"
        }
    ],
    "title": "The Long Hair Of A Bold Man",
    "sub_title": null,
    "body": "test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test,",
    "links": [
        {
            "name": "links",
            "url": "https://link"
        },
        {
            "name": "links2",
            "url": "https://link2"
        }
    ]
}


In [199]:
post_2.byline.append(Author(first_name="Felipe", last_name="Vasconcelos"))
print(post_2.json(by_alias=True, indent=4))

{
    "byline": [
        {
            "first_name": "John",
            "last_name": "Do",
            "display_name": "John D"
        },
        {
            "first_name": "Mary",
            "last_name": "Do",
            "display_name": "Little Mary"
        },
        {
            "first_name": "Felipe",
            "last_name": "Vasconcelos",
            "display_name": "Felipe V"
        },
        {
            "first_name": "Felipe",
            "last_name": "Vasconcelos",
            "display_name": "Felipe V"
        },
        {
            "first_name": "Felipe",
            "last_name": "Vasconcelos",
            "display_name": "Felipe V"
        }
    ],
    "title": "The Long Hair Of A Bold Man",
    "sub_title": null,
    "body": "test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test, test,",
    "links": [
        {
            "name": "links",
            "url": "https://link"
        },
        {
      

In [201]:
print(Post.schema_json(indent=4))

{
    "title": "Post",
    "type": "object",
    "properties": {
        "byline": {
            "title": "Byline",
            "minItems": 1,
            "type": "array",
            "items": {
                "$ref": "#/definitions/Author"
            }
        },
        "title": {
            "title": "Title",
            "minLength": 10,
            "maxLength": 50,
            "type": "string"
        },
        "sub_title": {
            "title": "Sub Title",
            "minLength": 20,
            "maxLength": 100,
            "type": "string"
        },
        "body": {
            "title": "Body",
            "minLength": 100,
            "type": "string"
        },
        "links": {
            "title": "Links",
            "default": [],
            "type": "array",
            "items": {
                "$ref": "#/definitions/Link"
            }
        }
    },
    "required": [
        "byline",
        "title",
        "body"
    ],
    "additionalProperties": false,