### The `pydantic` Library

In this tutorial we're going to take a look at the [pydantic](https://pydantic-docs.helpmanual.io/) library.

This is a great library for creating data models in Python, leveraging Python's built-in type hints.

I first came across this library when I started using [FastAPI](https://fastapi.tiangolo.com/) for API development.

Since then I have been using `pydantic` for far more than just FastAPI - I find myself using it in almost every project I write.

It makes defining data structures (or **models** in Pydantic's terminology) with "strict typing" very easy. Pydantic also validates data (which is what gives us "strict typing", in the sense that fields in the structure are guaranteed to be of a specific type), and makes it easy to serialize and deserialize data (Python `dict` objects and JSON; it even supports pickling).

It also helps with IDE features such as inspection, type checking and auto-completion.

I find myself using Pydantic outside of FastAPI, both for the IDE benefits, as well as easy marshalling of data with NoSQL databases.

You will need to install `pydantic` into your virtual environment - see [here](https://pydantic-docs.helpmanual.io/install/) - although if you use `pipenv`, the `Pipfile` included in this GitHub repo already contains Pydantic.

There are also plugins available for VSCode and PyCharm that can better leverage Pydantic models - see [PyCharm plugin](https://pydantic-docs.helpmanual.io/pycharm_plugin/) and [VSCode plugin](https://pydantic-docs.helpmanual.io/visual_studio_code/).

#### Basics

Let's start by creating a very simple data model.

In [1]:
from pydantic import BaseModel

`BaseModel` is the base class we should inherit from in order to define pydantic models.

In [2]:
class Person(BaseModel):
    first_name: str
    last_name: str
    age: int

We can now create instances of `Person`, and those instances will have properties (or **fields**, to use Pydantic's terminology)  `first_name`, `last_name` and `age`:

In [3]:
p = Person(first_name='Isaac', last_name='Newton', age=84)

In [4]:
p

Person(first_name='Isaac', last_name='Newton', age=84)

Watch what happens when we pass different types to create instances:

In [5]:
Person(first_name=100, last_name=200, age='3')

Person(first_name='100', last_name='200', age=3)

As you can see, pydantic automatically cast the integers `100` and `200` to strings since we specified that `first_name` and `last_name` were strings, while it cast the string `'3'` to the integer `3` since we specified that `age` was an `int`.

By default, the fields we defined are required, and pydantic will raise an exception if we omit one of the values:

In [6]:
from pydantic import ValidationError

try:
    Person(first_name='Isaac')
except ValidationError as ex:
    print(ex)

2 validation errors for Person
last_name
  field required (type=value_error.missing)
age
  field required (type=value_error.missing)


We can even get the exception as a JSON string:

In [7]:
try:
    Person(first_name='Isaac')
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "last_name"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  },
  {
    "loc": [
      "age"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]


Very useful for programmatically dealing with validation exceptions.
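As a rough illustration of what "programmatically dealing with" these errors can look like, here is a stdlib-only sketch that takes the JSON payload shown above and groups the messages by field. The helper name `errors_by_field` is made up for this example; it is not part of Pydantic.

```python
import json

# The JSON payload printed above by `ex.json()`, condensed onto fewer lines.
error_json = '''
[
  {"loc": ["last_name"], "msg": "field required", "type": "value_error.missing"},
  {"loc": ["age"], "msg": "field required", "type": "value_error.missing"}
]
'''

def errors_by_field(payload: str) -> dict:
    """Map each dotted field location to the list of messages reported for it."""
    result = {}
    for error in json.loads(payload):
        # `loc` is a list because an error can point into a nested structure
        field = ".".join(str(part) for part in error["loc"])
        result.setdefault(field, []).append(error["msg"])
    return result

field_errors = errors_by_field(error_json)
print(field_errors)
```

A structure like this is easy to hand back to an API client or a form, one entry per offending field.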

Let's actually make one of the fields optional - simply by using a type hint:

In [8]:
class Person(BaseModel):
    first_name: str
    last_name: str
    age: int | None

In [9]:
Person(first_name='Isaac', last_name='Newton')

Person(first_name='Isaac', last_name='Newton', age=None)

Or, if you are working with Python versions that do not support the `|` operator, we can do it this way as well (and I'll use this from now on, to remain backward compatible with earlier versions of Python):

In [10]:
from typing import Optional

In [11]:
class Person(BaseModel):
    first_name: str
    last_name: str
    age: Optional[int]

In [12]:
Person(first_name='Isaac', last_name='Newton')

Person(first_name='Isaac', last_name='Newton', age=None)

We can even specify a default value (in which case the optional is no longer necessary):

In [13]:
class Person(BaseModel):
    first_name: str = None
    last_name: str
    age: int = None

In [14]:
Person(last_name='Newton')

Person(first_name=None, last_name='Newton', age=None)

We can also serialize the model to a Python dictionary:

In [15]:
p = Person(first_name='Isaac', last_name='Newton', age=84)

In [16]:
p.dict()

{'first_name': 'Isaac', 'last_name': 'Newton', 'age': 84}

As well as directly to JSON:

In [17]:
p.json()

'{"first_name": "Isaac", "last_name": "Newton", "age": 84}'

In this serialization we could also opt to exclude (or include) only certain fields:

In [18]:
p.dict(exclude={'first_name', 'age'})

{'last_name': 'Newton'}

It works the same with JSON. Since Pydantic uses Python's `json.dumps`, we can also pass any arguments that `dumps` would understand, including custom JSON encoders.

In [19]:
print(p.json(include={'last_name', 'age'}, indent=4))

{
    "last_name": "Newton",
    "age": 84
}
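To make the "arguments that `dumps` would understand" point more concrete, here is a stdlib-only sketch (no Pydantic involved) of the kind of keyword arguments `json.dumps` accepts, including a fallback encoder. The `encode_extra` function and the `record` dictionary are assumptions made up for this illustration; how a custom encoder is wired into `p.json()` itself is not shown here, so check the Pydantic documentation for that hook.

```python
import json
from datetime import date

def encode_extra(obj):
    """Fallback encoder: json.dumps calls this only for objects it cannot serialize itself."""
    if isinstance(obj, date):
        # a deliberately non-ISO day/month/year format, to make the custom encoding visible
        return f"{obj.day:02d}/{obj.month:02d}/{obj.year}"
    raise TypeError(f"Cannot serialize {type(obj).__name__}")

record = {"last_name": "Newton", "dob": date(1643, 1, 4)}

# `default`, `indent` and `sort_keys` are all standard json.dumps arguments
encoded = json.dumps(record, default=encode_extra, indent=4, sort_keys=True)
print(encoded)
```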


> Side note: this is where I find the `pyperclip` library, which I did a video on previously, handy for copying these JSON data models to my clipboard, for debugging purposes, pasting somewhere else, etc.

In [20]:
import pyperclip

pyperclip.copy(p.json(indent=4))

print(pyperclip.paste())

{
    "first_name": "Isaac",
    "last_name": "Newton",
    "age": 84
}


Just like pydantic can serialize a model to a Python `dict` or a JSON string, it can also deserialize either - **and** it will cast the data appropriately, thereby avoiding a lot of the legwork that may be required for JSON serialization/deserialization.

Let's see this when we add a `date` to our model for example:

In [21]:
from datetime import date

class Person(BaseModel):
    first_name: str = None
    last_name: str
    dob: date

In [22]:
data = {
    "first_name": "Isaac",
    "last_name": "Newton",
    "dob": date(1643, 1, 4)
}

Let's load this dictionary into our `Person` model, using the `parse_obj` method:

In [23]:
p = Person.parse_obj(data)
p

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

Since our dictionary object contained a `date` object, we should not be too surprised to see that the Pydantic model's `dob` field is a `date` object too.

But, we actually get the same behavior with JSON too:

In [24]:
json = '''
{
    "first_name": "Isaac",
    "last_name": "Newton",
    "dob": "1643-01-04"
}
'''

And we can deserialize this JSON string, using the `parse_raw()` method:

In [25]:
p = Person.parse_raw(json)
p

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

As you can see Pydantic was able to parse the `dob` attribute value from a string to an actual date object (you could of course override how to deserialize values in your JSON objects by deserializing the JSON string yourself first, using a custom decoder, and then using Pydantic's `parse_obj` method.)
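For reference, the "decode it yourself first" route mentioned above might look something like the sketch below. It uses only the standard library; the day/month/year date format and the `decode_dob` hook are assumptions chosen for illustration, and the resulting dictionary is what you would then pass to `Person.parse_obj`.

```python
import json
from datetime import datetime

raw = '''
{
    "first_name": "Isaac",
    "last_name": "Newton",
    "dob": "04/01/1643"
}
'''

def decode_dob(obj: dict) -> dict:
    """object_hook: called for every decoded JSON object; converts a day/month/year `dob` string."""
    if isinstance(obj.get("dob"), str):
        obj["dob"] = datetime.strptime(obj["dob"], "%d/%m/%Y").date()
    return obj

decoded = json.loads(raw, object_hook=decode_dob)
print(decoded)
# `decoded` now holds a real date object and could be handed to Person.parse_obj(decoded)
```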

But we'll come back to a simpler way of baking that right into the Pydantic model itself.

#### Field Aliases

One thing you may have noticed is that the Pydantic model uses **snake case** for field names, since that's the convention for Python code - but in JSON we typically use **camel case**. Pydantic supports this!

In [26]:
from pydantic import Field

We're going to expand how we defined a field in our class, by using Pydantic's `Field` class, which gives us more control than just a type hint by itself - in particular, we'll use it to specify a field **alias**:

In [27]:
class Person(BaseModel):
    first_name: str = Field(alias='firstName', default=None)
    last_name: str = Field(alias='lastName')
    dob: date = None

Now, we have our `data` dictionary already defined, but it uses the Python variable naming convention:

In [28]:
data

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'dob': datetime.date(1643, 1, 4)}

And we cannot load that dictionary as-is anymore, since we specified the field aliases, and those get used when deserializing:

In [29]:
try:
    Person.parse_obj(data)
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "lastName"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]


We can change our dictionary:

In [30]:
data2 = {
    'firstName': 'Isaac',
    'lastName': 'Newton',
    'dob': date(1643, 1, 4)
}

In [31]:
p = Person.parse_obj(data2)
p

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

And when we serialize our object we get this:

In [32]:
p.dict()

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'dob': datetime.date(1643, 1, 4)}

In [33]:
print(p.json(indent=2))

{
  "first_name": "Isaac",
  "last_name": "Newton",
  "dob": "1643-01-04"
}


As you can see, the serialization uses the field names, not the aliases. We'll come back to that in a minute.

We can no longer use the field names to create objects using the constructor either:

In [34]:
try:
    Person(last_name='Newton')
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "lastName"
    ],
    "msg": "field required",
    "type": "value_error.missing"
  }
]


However, we can instruct Pydantic to allow us to use these field names, as well as the aliases. 

To do so, we need to _configure_ our Pydantic model (and there are lots of configurations that can be done!).

In [35]:
class Person(BaseModel):
    first_name: str = Field(alias='firstName', default=None)
    last_name: str = Field(alias='lastName')
    dob: date = None
        
    class Config:
        allow_population_by_field_name = True

And now, we can use either the field names or the aliases (though in practice I usually do not use this feature and always use the aliases - but that's up to you).

In [36]:
p = Person(first_name = 'Isaac', lastName='Newton')
p

Person(first_name='Isaac', last_name='Newton', dob=None)

In [37]:
Person.parse_obj(data)

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

In [38]:
Person.parse_obj(data2)

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

Just as before, serializing the model will use the field names, not the aliases:

In [39]:
p.dict()

{'first_name': 'Isaac', 'last_name': 'Newton', 'dob': None}

In [40]:
p = Person.parse_obj(data2)
print(p.json(indent=2))

{
  "first_name": "Isaac",
  "last_name": "Newton",
  "dob": "1643-01-04"
}


We can choose to use the aliases instead:

In [41]:
print(p.json(by_alias=True, indent=2))

{
  "firstName": "Isaac",
  "lastName": "Newton",
  "dob": "1643-01-04"
}


And same thing with serializing to a Python dict:

In [42]:
p.dict(by_alias=True)

{'firstName': 'Isaac', 'lastName': 'Newton', 'dob': datetime.date(1643, 1, 4)}

#### Extra Fields Behavior

So far we've been pretty good about just passing field names (or aliases) that we know are valid for the model - but what happens if we do not?

In [43]:
data_junk = {**data, "junk": "extraneous field"}
data_junk

{'first_name': 'Isaac',
 'last_name': 'Newton',
 'dob': datetime.date(1643, 1, 4),
 'junk': 'extraneous field'}

In [44]:
Person(**data_junk)

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4))

So Pydantic did not complain about this, and basically ignored the `junk` field altogether:

In [45]:
p = Person.parse_obj(data_junk)

In [46]:
hasattr(p, 'first_name')

True

In [47]:
hasattr(p, 'junk')

False

We can actually handle extra fields in multiple ways:
- ignore them (the default)
- allow them, and add them as attributes to the instance
- forbid them, which will raise a validation exception

Let's see how to do that:

In [48]:
from pydantic import Extra

In [49]:
class Person(BaseModel):
    first_name: str = Field(alias='firstName', default=None)
    last_name: str = Field(alias='lastName')
    dob: date = None
        
    class Config:
        allow_population_by_field_name = True
        extra = Extra.allow

In [50]:
p = Person(**data_junk)
p

Person(first_name='Isaac', last_name='Newton', dob=datetime.date(1643, 1, 4), junk='extraneous field')

In [51]:
hasattr(p, 'junk')

True

Or, we can forbid extra fields altogether:

In [52]:
class Person(BaseModel):
    first_name: str = Field(alias='firstName', default=None)
    last_name: str = Field(alias='lastName')
    dob: date = None
        
    class Config:
        allow_population_by_field_name = True
        extra = Extra.forbid

In [53]:
try:
    Person(**data_junk)
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "junk"
    ],
    "msg": "extra fields not permitted",
    "type": "value_error.extra"
  }
]


#### Using an Alias Generator

Using aliases for proper JSON naming conventions vs proper Python naming conventions is so common, that there is a much easier way to define field aliases than specifying them for each field. Also, I often use the same configuration for all my Pydantic models, such as forbidding extra fields. 

To make life easier, I end up creating a customized `BaseModel` class that I can then reuse throughout my Pydantic models.

First I define a function to convert from snake case to camel case:

In [54]:
def snake_to_camel_case(value: str) -> str:
    if not isinstance(value, str):
        raise ValueError("Value must be a string.")
    words = value.split('_')
    value = "".join(word.title() for word in words if word)
    if not value:
        # empty string, or nothing but underscores: nothing to convert
        return value
    return f"{value[0].lower()}{value[1:]}"

In [55]:
snake_to_camel_case("first_name")

'firstName'

In [56]:
snake_to_camel_case("__first_name")

'firstName'

There are also plenty of existing libraries that provide this same functionality (probably using regex, and probably more robust than the above, but that works for me).
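For comparison, a regex-based version could look roughly like this. It is only a sketch, not taken from any particular library, and the name `snake_to_camel_regex` is made up for this example.

```python
import re

def snake_to_camel_regex(value: str) -> str:
    """Convert snake_case to camelCase, ignoring leading and trailing underscores."""
    stripped = value.strip('_')
    # Replace each run of underscores plus the character after it with that character, upper-cased
    camel = re.sub(r'_+(.)', lambda match: match.group(1).upper(), stripped)
    # Make sure the very first character is lower case (slicing keeps this safe for empty strings)
    return camel[:1].lower() + camel[1:]

print(snake_to_camel_regex("first_name"))
print(snake_to_camel_regex("__first_name"))
print(snake_to_camel_regex("dob"))
```

Unlike the `title()`-based function, this one leaves the case of all other characters untouched, so which one is preferable depends on what your field names look like.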

Now we can configure our custom base model to specify that aliases should automatically be generated using that function, and I'll add in the other configurations I want for that custom base model as well:

In [57]:
class CustomBaseModel(BaseModel):
    class Config:
        alias_generator = snake_to_camel_case
        extra = Extra.forbid
        allow_population_by_field_name = True        

Let's now use this custom class to redefine our model:

In [58]:
class Person(CustomBaseModel):
    first_name: str = None
    last_name: str
    dob: date = None

In [59]:
p = Person(first_name='Isaac', lastName='Newton')
p

Person(first_name='Isaac', last_name='Newton', dob=None)

In [60]:
try:
    Person(last_name='Newton', junk='junk')
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "junk"
    ],
    "msg": "extra fields not permitted",
    "type": "value_error.extra"
  }
]


There's a whole lot more to Pydantic in terms of defining all these behaviors, so be sure to check their documentation!

#### Field Validations

Now, let's turn our attention to a few ways in which we can fine-tune the validation that occurs for various fields, as well as create more complex field types, using both more advanced Python type hinting and composition of Pydantic models.

You should read [this](https://pydantic-docs.helpmanual.io/usage/types/) in Pydantic's documentation for a full rundown of the various field types and validations that can be performed.

Let's look at a few.

##### Constrained Numerics

For starters, we can specify that numbers should be in a certain range.

In [61]:
from pydantic import conint

class Test(CustomBaseModel):
    age: conint(gt=0, le=150)

Here we get a constraint that `age` must be greater than `0`, and less than (or equal to) `150`.

In [62]:
Test(age=10)

Test(age=10)

In [63]:
try:
    Test(age=-1)
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "age"
    ],
    "msg": "ensure this value is greater than 0",
    "type": "value_error.number.not_gt",
    "ctx": {
      "limit_value": 0
    }
  }
]


In [64]:
try:
    Test(age=200)
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "age"
    ],
    "msg": "ensure this value is less than or equal to 150",
    "type": "value_error.number.not_le",
    "ctx": {
      "limit_value": 150
    }
  }
]


##### Constrained Strings

We can also use `constr` to refine our string types:

In [65]:
from pydantic import constr

In [66]:
class Test(CustomBaseModel):
    first_name: str = None
    last_name: constr(strip_whitespace=True, strict=True, min_length=2, curtail_length=25)

- `strip_whitespace`: removes leading and trailing whitespace
- `strict`: will not allow anything other than a string type (e.g. passing an `int` will not be cast to a `str`)
- `min_length`: specifies the minimum string length
- `curtail_length`: will truncate the string to this length if it is longer

In [67]:
t = Test(last_name="   Newton   ")
t

Test(first_name=None, last_name='Newton')

In [68]:
try:
    Test(last_name="   x    ")
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "lastName"
    ],
    "msg": "ensure this value has at least 2 characters",
    "type": "value_error.any_str.min_length",
    "ctx": {
      "limit_value": 2
    }
  }
]


In [69]:
try:
    Test(last_name = 100)
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "lastName"
    ],
    "msg": "str type expected",
    "type": "type_error.str"
  }
]


In [70]:
t = Test(last_name = "*" * 100)
t

Test(first_name=None, last_name='*************************')

As you can see, Pydantic provides a slew of options - read more at the link I gave above.

##### Custom Validators

If those built-in validators are not sufficient, we can also define our own custom validators!

In [71]:
from pydantic import validator

In [72]:
class Test(CustomBaseModel):
    hash_tag: str
        
    @validator('hash_tag')
    def validate_name(cls, value):  # yes, this is a class method, not an instance method
        if not value.startswith('#'):
            raise ValueError("Hash tag must start with a #")
        return value  # if validation passed, return the value
    

In [73]:
t = Test(hash_tag='#test')
t

Test(hash_tag='#test')

In [74]:
try:
    Test(hash_tag='test')
except ValidationError as ex:
    print(ex.json())

[
  {
    "loc": [
      "hashTag"
    ],
    "msg": "Hash tag must start with a #",
    "type": "value_error"
  }
]


##### Modifying Field Values with Custom Validators

We could actually use the validator to do some extra things, like add the `#` if it's not present. While we're at it, we'll also add a constraint that the string needs to be at least `5` characters, and should have any whitespace stripped.

In [75]:
class Test(CustomBaseModel):
    hash_tag: constr(min_length=5, strip_whitespace=True)
        
    @validator('hash_tag')
    def validate_name(cls, value):  # yes, this is a class method, not an instance method
        if not value.startswith('#'):
            return f"#{value}"
        return value

In [76]:
Test(hash_tag="#python")

Test(hash_tag='#python')

In [77]:
Test(hash_tag="python")

Test(hash_tag='#python')

In [78]:
try:
    Test(hash_tag="T")
except ValidationError as ex:
    print(ex)

1 validation error for Test
hashTag
  ensure this value has at least 5 characters (type=value_error.any_str.min_length; limit_value=5)


##### Multi-Field Custom Validators

In fact, we can define validators that validate a field based on the contents of fields that are defined **above** it in the class, **and** that passed validation.

Let's try this example, where we define a polygon, with a polygon type (based on an enumeration), and a series of points that defines each vertex, and we want to validate the number of vertices provided based on the polygon type.

In [79]:
from enum import Enum
from typing import List, Tuple, Union

In [80]:
class PolygonType(Enum):
    triangle = 3
    tetragon = 4
    pentagon = 5
    hexagon = 6    

In [81]:
t = PolygonType.triangle
t.name

'triangle'

In [82]:
class Polygon(CustomBaseModel):
    polygon_type: PolygonType
    vertices: List[Tuple[Union[int, float], Union[int, float]]]
        
    @validator('vertices')
    def validate_vertices(cls, value, values):
        # values has access to fields defined "above" itself in the class, and only as
        # long as that field also passed its validation
        polygon_type = values.get('polygon_type')
        if polygon_type:
            num_vertices_required = polygon_type.value
            if len(value) != num_vertices_required:
                raise ValueError(
                    f"For a {polygon_type.name}, exactly {polygon_type.value} "
                    "vertices are required."
                )
        return value

Now let's try to see how this model behaves:

In [83]:
Polygon(polygon_type=PolygonType.triangle, vertices = [(1, 1), (2, 2), (3, 3)])

Polygon(polygon_type=<PolygonType.triangle: 3>, vertices=[(1, 1), (2, 2), (3, 3)])

In [84]:
try:
    Polygon(polygon_type=PolygonType.triangle, vertices=[(1, 1), (2, 2+2j), (3, 3)])
except ValidationError as ex:
    print(ex)

2 validation errors for Polygon
vertices -> 1 -> 1
  value is not a valid integer (type=type_error.integer)
vertices -> 1 -> 1
  value is not a valid float (type=type_error.float)


In [85]:
try:
    Polygon(polygon_type=PolygonType.pentagon, vertices=[(1, 1), (2, 2), (3, 3)])
except ValidationError as ex:
    print(ex)

1 validation error for Polygon
vertices
  For a pentagon, exactly 5 vertices are required. (type=value_error)


In [86]:
try:
    Polygon(polygon_type=10, vertices = [(1, 1)])
except ValidationError as ex:
    print(ex)

1 validation error for Polygon
polygonType
  value is not a valid enumeration member; permitted: 3, 4, 5, 6 (type=type_error.enum; enum_values=[<PolygonType.triangle: 3>, <PolygonType.tetragon: 4>, <PolygonType.pentagon: 5>, <PolygonType.hexagon: 6>])
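The "permitted: 3, 4, 5, 6" part of that message refers to the enumeration *values*, not the member names. The plain-Python sketch below (no Pydantic) shows the equivalent by-value lookup on the enum itself. That Pydantic's check behaves like this lookup is an assumption on my part, although the error message above is consistent with it.

```python
from enum import Enum

class PolygonType(Enum):
    triangle = 3
    tetragon = 4
    pentagon = 5
    hexagon = 6

# Looking a member up by value succeeds for 3, 4, 5 and 6 ...
print(PolygonType(3))

# ... and raises ValueError for anything else, such as 10
try:
    PolygonType(10)
    lookup_failed = False
except ValueError as ex:
    lookup_failed = True
    print(ex)
```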


There's quite a bit more to validators, so check out the docs [here](https://pydantic-docs.helpmanual.io/usage/validators/).

#### Model Composition

Finally, we can create models that are composed of other models, which makes it easy to build complex nested models - with full serialization/deserialization capabilities too.

For example, let's say we want to model a simple blog post, using this model:

- post
    - byline (one or more authors)
        - author:
            - first_name (required, min 1 char, max 20 chars)
            - last_name (required, min 1 char, max 20 chars)
            - display_name (optional, default to first name, initial of last name, min 1 char, max 25 chars)
    - title (required, at least 10 characters, no more than 50, force title case)
    - sub title (optional, if present at least 20 characters, max 100)
    - body (required, at least 100 characters, no upper limit)
    - links (0 or more)
        - link:
            - name (required, min 5 characters, max 25 characters)
            - url (required, valid url, that must include scheme (http/https))

Let's start by defining models for the parts that make up the overall post.

The `Author` model first:

In [87]:
class Author(CustomBaseModel):
    first_name: constr(min_length=1, max_length=20, strip_whitespace=True)
    last_name: constr(min_length=1, max_length=20, strip_whitespace=True)
    display_name: constr(min_length=1, max_length=25) = None

    # always = True forces the validator to run, even if display_name is None, this
    # is how we can set a dynamic default value
    @validator("display_name", always=True)  
    def validate_display_name(cls, value, values):
        # validator runs, even if previous fields did not validate properly - so 
        # we will need to run our code only if prior fields validated OK.
        if not value and 'first_name' in values and 'last_name' in values:
            first_name = values['first_name']
            last_name = values['last_name']
            return f"{first_name} {(last_name[0]).upper()}"
        return value

Let's try this class out:

In [88]:
Author(first_name="Gottfried", last_name="Leibniz")

Author(first_name='Gottfried', last_name='Leibniz', display_name='Gottfried L')

In [89]:
Author(first_name='John', last_name="von Neumann", display_name="Johnny")    

Author(first_name='John', last_name='von Neumann', display_name='Johnny')

In [90]:
try:
    Author(first_name="X", last_name="X" * 50)
except ValidationError as ex:
    print(ex)

1 validation error for Author
lastName
  ensure this value has at most 20 characters (type=value_error.any_str.max_length; limit_value=20)


Next, let's create a model for the links:

In [91]:
from pydantic import AnyHttpUrl

In [92]:
class Link(CustomBaseModel):
    name: constr(min_length=5, max_length=25)
    url: AnyHttpUrl

And let's try it out.

In [93]:
Link(name="google", url="https://www.google.com")

Link(name='google', url=AnyHttpUrl('https://www.google.com', ))

In [94]:
try:
    Link(name="google", url="www.google.com")
except ValidationError as ex:
    print(ex)

1 validation error for Link
url
  invalid or missing URL scheme (type=value_error.url.scheme)


In [95]:
try:
    Link(name="google", url="https://not a valid url")
except ValidationError as ex:
    print(ex)

1 validation error for Link
url
  URL invalid, extra characters found after valid URL: ' a valid url' (type=value_error.url.extra; extra= a valid url)


Now, it's time to create our main model.

We start by adding the byline and title.

In [96]:
from pydantic import conlist

In [97]:
class Post(CustomBaseModel):
    byline: conlist(item_type=Author, min_items=1)
    title: constr(min_length=10, max_length=50, strip_whitespace=True)
    
    @validator('title')
    def validate_title(cls, value):
        return value and value.title()

In [98]:
Post(
    byline=[
        Author(first_name="Isaac", last_name="Newton")
    ], 
    title="some title"
)

Post(byline=[Author(first_name='Isaac', last_name='Newton', display_name='Isaac N')], title='Some Title')
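One thing to keep in mind about forcing title case with `str.title()`: it capitalizes the first letter after *any* non-letter character, which may or may not be what you want for titles containing apostrophes. The stdlib-only comparison below uses a made-up title to show the difference with `string.capwords`, which is one possible alternative and not something the model above uses.

```python
import string

title = "newton's laws of motion"

# str.title() upper-cases the first letter after any non-letter, apostrophes included
titled = title.title()
print(titled)

# string.capwords() splits on whitespace only, so the apostrophe is left alone
capworded = string.capwords(title)
print(capworded)
```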

Next we add the sub title, and the body:

In [99]:
class Post(CustomBaseModel):
    byline: conlist(item_type=Author, min_items=1)
    title: constr(min_length=10, max_length=50, strip_whitespace=True)
    sub_title: constr(min_length=20, max_length=100, strip_whitespace=True) = None
    body: constr(min_length=100)
        
    @validator('title')
    def validate_title(cls, value):
        return value and value.title()

In [100]:
Post(
    byline=[
        Author(first_name="Isaac", last_name="Newton")
    ], 
    title="some title",
    body="x" * 100
)

Post(byline=[Author(first_name='Isaac', last_name='Newton', display_name='Isaac N')], title='Some Title', sub_title=None, body='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')

In [101]:
Post(
    byline=[
        Author(first_name="Isaac", last_name="Newton")
    ], 
    title="some title",
    sub_title="A slightly longer subtitle",
    body="*" * 100
)

Post(byline=[Author(first_name='Isaac', last_name='Newton', display_name='Isaac N')], title='Some Title', sub_title='A slightly longer subtitle', body='****************************************************************************************************')

Finally we can add the links:

In [102]:
class Post(CustomBaseModel):
    byline: conlist(item_type=Author, min_items=1)
    title: constr(min_length=10, max_length=50, strip_whitespace=True)
    sub_title: constr(min_length=20, max_length=100, strip_whitespace=True) = None
    body: constr(min_length=100)
    links: List[Link] = []

    @validator('title')
    def validate_title(cls, value):
        return value and value.title()

Let's just recap all our models here:

In [103]:
class Author(CustomBaseModel):
    first_name: constr(min_length=1, max_length=20, strip_whitespace=True)
    last_name: constr(min_length=1, max_length=20, strip_whitespace=True)
    display_name: constr(min_length=1, max_length=25) = None

    @validator("display_name", always=True)  
    def validate_display_name(cls, value, values):
        if not value and 'first_name' in values and 'last_name' in values:
            first_name = values['first_name']
            last_name = values['last_name']
            return f"{first_name} {(last_name[0]).upper()}"
        return value
    
class Link(CustomBaseModel):
    name: constr(min_length=5, max_length=25)
    url: AnyHttpUrl
        
class Post(CustomBaseModel):
    byline: conlist(item_type=Author, min_items=1)
    title: constr(min_length=10, max_length=50, strip_whitespace=True)
    sub_title: constr(min_length=20, max_length=100, strip_whitespace=True) = None
    body: constr(min_length=100)
    links: List[Link] = []

    @validator('title')
    def validate_title(cls, value):
        return value and value.title()

And let's create a full post:

In [104]:
post = Post(
    byline=[
        Author(first_name="John", last_name="von Neumann", display_name="Johnny V"),
        Author(first_name="Oskar", last_name="Morgenstern")
    ],
    title="Theory of Games and Economic Behavior",
    sub_title="A non-mathematical overview",
    body="Lorem ipsum sit dolor amet." * 20,
    links=[
        Link(name="Original Book", url="https://archive.org/details/in.ernet.dli.2015.215284"),
        Link(name="Review", url = "https://www.ams.org/journals/bull/1945-51-07/S0002-9904-1945-08391-8/S0002-9904-1945-08391-8.pdf")
    ]
)

Now we can easily serialize this to JSON for example:

In [105]:
print(post.json(indent=2))

{
  "byline": [
    {
      "first_name": "John",
      "last_name": "von Neumann",
      "display_name": "Johnny V"
    },
    {
      "first_name": "Oskar",
      "last_name": "Morgenstern",
      "display_name": "Oskar M"
    }
  ],
  "title": "Theory Of Games And Economic Behavior",
  "sub_title": "A non-mathematical overview",
  "body": "Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.",
  "links": [
    {
      "name": "Original Book",
      "url": "https://archive.org/details/in.ernet.dli.2015.21

And of course, we can deserialize a JSON string back to an instance of our model:

In [106]:
json_str = '''
{
  "byline": [
    {
      "first_name": "John",
      "last_name": "von Neumann",
      "display_name": "John V"
    },
    {
      "first_name": "Oskar",
      "last_name": "Morgenstern",
      "display_name": null
    }
  ],
  "title": "Theory Of Games And Economic Behavior",
  "sub_title": "A non-mathematical overview",
  "body": "Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.",
  "links": [
    {
      "name": "Original Book",
      "url": "https://archive.org/details/in.ernet.dli.2015.215284"
    },
    {
      "name": "Review",
      "url": "https://www.ams.org/journals/bull/1945-51-07/S0002-9904-1945-08391-8/S0002-9904-1945-08391-8.pdf"
    }
  ]
}
'''

In [107]:
p = Post.parse_raw(json_str)

In [108]:
p

Post(byline=[Author(first_name='John', last_name='von Neumann', display_name='John V'), Author(first_name='Oskar', last_name='Morgenstern', display_name='Oskar M')], title='Theory Of Games And Economic Behavior', sub_title='A non-mathematical overview', body='Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.', links=[Link(name='Original Book', url=AnyHttpUrl('https://archive.org/details/in.ernet.dli.2015.215284', )), Link(name='Review', url=AnyHttpUrl('https://www.ams.org/journals/bull/1945-51-07/S0002-9

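We can technically create our model instance piecemeal, but we have to watch out for fields that are not optional, since those must be supplied up front. Also note that, by default, values assigned after construction are not validated (the `validate_assignment` config option changes that):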

In [109]:
p = Post(
    byline=[
        Author(first_name="John", last_name="von Neumann", display_name="Johnny V"),
    ],
    title="Theory of Games and Economic Behavior",
    body="Lorem ipsum sit dolor amet." * 20,
)

In [110]:
p.byline.append(Author(first_name="Oskar", last_name="Morgenstern"))

In [111]:
p.sub_title = "A non-mathematical overview"

In [112]:
p.links = [
    Link(name="Original Book", url="https://archive.org/details/in.ernet.dli.2015.215284")
]

In [113]:
print(p.json(by_alias=True, indent=2))

{
  "byline": [
    {
      "firstName": "John",
      "lastName": "von Neumann",
      "displayName": "Johnny V"
    },
    {
      "firstName": "Oskar",
      "lastName": "Morgenstern",
      "displayName": "Oskar M"
    }
  ],
  "title": "Theory Of Games And Economic Behavior",
  "subTitle": "A non-mathematical overview",
  "body": "Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.Lorem ipsum sit dolor amet.",
  "links": [
    {
      "name": "Original Book",
      "url": "https://archive.org/details/in.ernet.dli.2015.215284"
 

#### Generating a JSON Schema

Finally, we can generate the JSON schema for our model this way:

In [114]:
print(Post.schema_json(indent=2))

{
  "title": "Post",
  "type": "object",
  "properties": {
    "byline": {
      "title": "Byline",
      "minItems": 1,
      "type": "array",
      "items": {
        "$ref": "#/definitions/Author"
      }
    },
    "title": {
      "title": "Title",
      "minLength": 10,
      "maxLength": 50,
      "type": "string"
    },
    "subTitle": {
      "title": "Subtitle",
      "minLength": 20,
      "maxLength": 100,
      "type": "string"
    },
    "body": {
      "title": "Body",
      "minLength": 100,
      "type": "string"
    },
    "links": {
      "title": "Links",
      "default": [],
      "type": "array",
      "items": {
        "$ref": "#/definitions/Link"
      }
    }
  },
  "required": [
    "byline",
    "title",
    "body"
  ],
  "additionalProperties": false,
  "definitions": {
    "Author": {
      "title": "Author",
      "type": "object",
      "properties": {
        "firstName": {
          "title": "Firstname",
          "minLength": 1,
          "maxLength":

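To give a feel for how a consumer might use such a schema, here is a small hand-rolled sketch that checks only the `required` and `additionalProperties` entries from the output above. The `check_keys` helper and the sample payloads are made up for illustration; a real consumer would more likely rely on a full JSON Schema validator.

```python
# Fragment of the schema printed above, reduced to the entries this sketch uses
schema = {
    "properties": {"byline": {}, "title": {}, "subTitle": {}, "body": {}, "links": {}},
    "required": ["byline", "title", "body"],
    "additionalProperties": False,
}

def check_keys(payload: dict, schema: dict) -> list:
    """Return a list of problems with the payload's top-level keys (values are not checked)."""
    problems = [
        f"missing required key: {key}"
        for key in schema["required"]
        if key not in payload
    ]
    if schema.get("additionalProperties") is False:
        problems += [
            f"unexpected key: {key}"
            for key in payload
            if key not in schema["properties"]
        ]
    return problems

print(check_keys({"title": "Some Title", "junk": 1}, schema))
print(check_keys({"byline": [], "title": "Some Title", "body": "..."}, schema))
```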
Although we covered a lot of functionality in this video, Pydantic has a **lot** more going on than what we looked at here. It's a fantastic library, and I encourage you to read more about it in their (great) [documentation](https://pydantic-docs.helpmanual.io/).