# Welcome to the Pydantic Tutorial!


Pre-requisites:
1. Python
2. Basic OOP knowledge

🔍 Pydantic is a data validation and parsing library that enforces type hints at runtime using Python’s built-in type annotations.

In simpler terms, Pydantic gives us:

1. 🤖 Automatic Checking - checks that the data we receive matches the types and constraints we define in our model
2. 🔁 Conversion - it tries to convert data to the correct type whenever possible
3. 🧱 Structured Data Handling - lets us define complex, nested data structures using models, and ensures all the data inside is validated and accessible in a structured way

... without having to write a lot of if-else conditional statements.


When writing in Python, we often work with APIs, databases, forms, JSON, and all sorts of external data. This data can be messy, inconsistent, or wrongly typed, which can result in a lot of errors.

Pydantic makes sure the data you're working with is correct, clean, and well-structured.



⏰ **TLDR:** You define the shape of your data once, and Pydantic ensures it stays that way.

In [22]:
!pip install pydantic
!pip install pydantic[email]






In [23]:
import pydantic
print(pydantic.__version__)  # Must be 2.0 or higher

2.11.7


# Basic Models

Let's create our first Pydantic model!

In [24]:
from pydantic import BaseModel # Importing BaseModel - to enforce type hints at runtime

class User(BaseModel):
    name: str  # Type hints are required
    age: int
    is_active: bool

# Create an instance
user1 = User(name="Bob", age=30, is_active=True)
print(user1)
print(user1.model_dump())  # model_dump() returns the data as a plain dict

name='Bob' age=30 is_active=True
{'name': 'Bob', 'age': 30, 'is_active': True}


In [26]:
# Fields can have default values
class Student(BaseModel):
    name: str
    age: int = 18  # Used when no age is passed in
    subject: str

# Passing age is optional; if we do pass it, it overrides the default
student1 = Student(name="Jeff", age=16, subject="Mathematics")
print(student1)

name='Jeff' age=16 subject='Mathematics'
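
The cell above passes `age` explicitly. As a minimal sketch of the other case, the snippet below leaves `age` out so the default applies. The name `student2` and its values are made up for illustration; `model_fields_set` is the Pydantic v2 attribute that records which fields were explicitly provided.

```python
from pydantic import BaseModel

class Student(BaseModel):
    name: str
    age: int = 18  # Default used when age is not supplied
    subject: str

# age is omitted here, so the default of 18 is used
student2 = Student(name="Maya", subject="Physics")
print(student2.age)

# Only the fields we actually passed in are recorded as "set"
print(student2.model_fields_set)
```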


In [28]:
# Type Validation

user2 = User(name="Bob", age="25", is_active=True)  # String age gets converted to int - AUTOMATIC!
print(user2)  # name='Bob' age=25 is_active=True
print(type(user2.age))

# Now let's try an invalid case
try:
    User(name="Charlie", age="twenty", is_active=False)
except ValueError as e:
    print(e)

name='Bob' age=25 is_active=True
<class 'int'>
1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='twenty', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing
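
Automatic conversion is convenient, but sometimes we may want `"25"` to be rejected rather than converted. A minimal sketch, assuming Pydantic v2's `model_validate()` method and its `strict` argument (neither is used elsewhere in this tutorial):

```python
from pydantic import BaseModel, ValidationError

class User(BaseModel):
    name: str
    age: int
    is_active: bool

raw = {"name": "Bob", "age": "25", "is_active": True}

# Default (lax) mode: the string "25" is converted to the int 25
lax_user = User.model_validate(raw)
print(type(lax_user.age))

# Strict mode: no conversion, so the string is rejected
try:
    User.model_validate(raw, strict=True)
    strict_error_types = []
except ValidationError as e:
    strict_error_types = [err["type"] for err in e.errors()]

print(strict_error_types)
```

Lax mode suits messy external data such as form input. Strict mode is useful when a wrong type should be treated as a bug rather than silently fixed.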


## Fields

In Pydantic, `Field()` lets us add metadata, validation rules, and custom behavior to a model field.

For further information, you can use the documentation: https://docs.pydantic.dev/latest/concepts/fields/

In [29]:
# Basic example using Field()
from pydantic import BaseModel, Field

class FieldUser(BaseModel):
    name: str = Field(description="The user's full name")  # description is metadata only; it does not set a default


field_user1 = FieldUser(name="Vaibhav")
print(field_user1)

name='Vaibhav'


In [30]:
# If we want a default, we pass the default parameter
class FieldUserDefault(BaseModel):
    name: str = Field(default="John", description="The user's full name")


field_user2 = FieldUserDefault()  # Nothing passed in this time!
print(field_user2)

name='John'


Now we are going to explore a problem: ⚠️

Pydantic does NOT validate the default value by default. For example:

In [31]:
class FlawUser(BaseModel):
    age: int = Field(default="twelve")

user = FlawUser()
print(user.age)  # Prints "twelve": a str stored in an int field, which is wrong!
print(type(user.age))

twelve
<class 'str'>


This happens because, by default, Pydantic validates only the values passed in when the model is created, not the defaults declared on the class.

However, we can force validation like this:

In [33]:
from pydantic import BaseModel, Field, ValidationError

class ProperUser(BaseModel):
    age: int = Field(default="twelve", validate_default=True)  # Force validation

try:
    user = ProperUser()
except ValidationError as e:
    print(e)

1 validation error for ProperUser
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='twelve', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing


### Aliasing 

An alias is a different name that our model can use to:
1. Accept input (validation)
2. Give output (serialization)

This is helpful when the external name ≠ internal variable name.

Let's explore a real-world example together:

Suppose we are working with a logistics API:

**JSON from the API:**

```json
{
  "pkg_weight_kg": 4.5,
  "pkg_dest": "Singapore",
  "pkg_is_fragile": true
}
```

But in our Python backend code, we would prefer:
- weight
- destination
- is_fragile

In [36]:
class Package(BaseModel):
    weight: float = Field(alias="pkg_weight_kg")
    destination: str = Field(alias="pkg_dest")
    is_fragile: bool = Field(alias="pkg_is_fragile")

# Order of data doesn't matter, as long as they are named correctly!
data = {
    "pkg_weight_kg": 4.5,
    "pkg_is_fragile": True,
    "pkg_dest": "Singapore"
}

package = Package(**data)  # Unpacks the dictionary: each key-value pair is passed as a keyword argument

# We can access the values using our own field names
print(package.weight)
# print(package.pkg_weight_kg)  # AttributeError: the alias is not an attribute
print(package.destination)
print(package.is_fragile)

# Exporting with original alias names
print(package.model_dump(by_alias=True))

4.5
Singapore
True
{'pkg_weight_kg': 4.5, 'pkg_dest': 'Singapore', 'pkg_is_fragile': True}
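
One consequence of `alias` worth seeing in code: by default, the model accepts input only under the alias, not under the field name. The sketch below uses two hypothetical models (`AliasOnlyPackage` and `FlexiblePackage`) and assumes the `populate_by_name` config option from Pydantic v2. Recent releases (2.11+) also offer `validate_by_name` for the same purpose.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError

class AliasOnlyPackage(BaseModel):
    weight: float = Field(alias="pkg_weight_kg")

class FlexiblePackage(BaseModel):
    # Assumption: populate_by_name lets us use either the alias or the field name
    model_config = ConfigDict(populate_by_name=True)
    weight: float = Field(alias="pkg_weight_kg")

# Default behaviour: the field name is not accepted as input
try:
    AliasOnlyPackage(weight=4.5)
    field_name_accepted = True
except ValidationError:
    field_name_accepted = False
print(field_name_accepted)

# With populate_by_name, both spellings work
by_name = FlexiblePackage(weight=4.5)
by_alias = FlexiblePackage(pkg_weight_kg=4.5)
print(by_name.weight, by_alias.weight)
```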


In [37]:
# Example I
class Student(BaseModel):
    # email is the internal variable
    email: str = Field(
        validation_alias="student_email",       # Accepts this as input
        serialization_alias="studentEmail"      # Outputs this name
    )

# Incoming data
incoming_data = {
    "student_email": "hi@gmail.com"
}

student = Student(**incoming_data)
print(student.email) 

print(student.model_dump())  # {'email': 'hi@gmail.com'}
print(student.model_dump(by_alias=True))  # {'studentEmail': 'hi@gmail.com'} - uses serialization alias!

hi@gmail.com
{'email': 'hi@gmail.com'}
{'studentEmail': 'hi@gmail.com'}


In [38]:
# Example II

class Book(BaseModel):
    title: str = Field(
        validation_alias="book_title",          # Input will use this
        serialization_alias="bookTitle"         # Output will use this
    )
    author: str = Field(
        validation_alias="author_name",         # Input will use this
        serialization_alias="authorName"        # Output will use this
    )


backend_data = {
    "book_title": "Pydantic Guide",
    "author_name": "DataCamp"
}

book = Book(**backend_data)

print(book.title)
print(book.author)  
print(book.model_dump())  
print(book.model_dump(by_alias=True))

Pydantic Guide
DataCamp
{'title': 'Pydantic Guide', 'author': 'DataCamp'}
{'bookTitle': 'Pydantic Guide', 'authorName': 'DataCamp'}
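
The examples above start from a Python dictionary, but API data usually arrives as a JSON string. A minimal sketch of the same `Book` model working directly with JSON, assuming Pydantic v2's `model_validate_json()` and `model_dump_json()` methods (the `raw_json` string is made up for illustration):

```python
import json
from pydantic import BaseModel, Field

class Book(BaseModel):
    title: str = Field(validation_alias="book_title", serialization_alias="bookTitle")
    author: str = Field(validation_alias="author_name", serialization_alias="authorName")

# Raw JSON text, as it might arrive from an API
raw_json = '{"book_title": "Pydantic Guide", "author_name": "DataCamp"}'

# Parse and validate in one step (validation aliases are used for input)
book = Book.model_validate_json(raw_json)
print(book.title)

# Serialize back to a JSON string (serialization aliases are used for output)
json_out = book.model_dump_json(by_alias=True)
print(json_out)
```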


In [41]:
# Length and numeric limits - part of Validation, which we will cover soon

class Product(BaseModel):
    name: str = Field(min_length=1, max_length=50)
    price: float = Field(gt=0)
    description: str | None = Field(default=None, max_length=300)

# Example usage
valid_product = Product(name="Laptop", price=999.99, description="Very cool laptop")
print(valid_product)

name='Laptop' price=999.99 description='Very cool laptop'


In [None]:
# Now let's try to create an invalid product
invalid_product = Product(name="", price=-10)  # This will trigger 2 validation errors

#### Exercise

We need to create a Book model which has:
1. Required title (string, 1-100 chars)
2. Required author (string)
3. Optional isbn
4. price (positive float ≤1000)
5. in_stock (boolean, default True)

In [43]:
from pydantic import BaseModel, Field

class Book(BaseModel):
    title: str = Field(min_length=1, max_length=100)
    author: str
    isbn: str | None = Field(default=None)  # Optional: may be a string or None
    price: float = Field(gt=0, le=1000)
    in_stock: bool = Field(default=True)

valid_book = Book(
    title="The Pragmatic Programmer",
    author="Andrew Hunt",
    price=29.99
)

Note: A modern ISBN (International Standard Book Number) is exactly 13 digits long, so to enforce that in our class we would need a regular expression (regex).

A regex constrains which strings are accepted. In Pydantic v2 it is passed to `Field()` through the `pattern` parameter (the v1 name, `regex`, was removed). In this example, we would have to define isbn as:


```python
isbn: str | None = Field(default=None, pattern=r"^\d{13}$")
```

However, this is beyond the scope of this tutorial, since it requires prior knowledge of regular expressions.

There is another way to do this though 👀

## Validation

Validation is one of the biggest reasons we use Pydantic. We will explore it in this section.

`@model_validator` is used when we want to:

1. Validate multiple fields together
2. Perform logic that involves the whole model
3. Run code before or after normal field validation

In [45]:
from pydantic import BaseModel, model_validator

class Event(BaseModel):
    name: str
    start_hour: int
    end_hour: int

    @model_validator(mode='after')  # Decorator: runs after all fields have been validated
    def check_time(self):
        if self.end_hour <= self.start_hour:
            raise ValueError("end_hour must be greater than start_hour. Please fix this!")

        return self


event1 = Event(name="Hackathon", start_hour=10, end_hour=9)  # Fails: the event ends before it starts (24-hour clock)

ValidationError: 1 validation error for Event
  Value error, end_hour must be greater than start_hour. Please fix this! [type=value_error, input_value={'name': 'Hackathon', 'st...our': 10, 'end_hour': 9}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error

You might have noticed that `@model_validator` takes a `mode` parameter. The two modes used in this tutorial are:

1. `before` - runs before Pydantic's own validation, on the raw input
2. `after` - runs after validation, on the validated model


Let's take a look at an example and a visualization to help you understand this fully:

<div align="center">
  <img src="pydantic_validation_flow.png" alt="Pydantic Validation Flow" width="500"/>
</div>


**Code:**
```python
class User(BaseModel):
    age: int

User(age="25")
```

- `Before validation:` age is "25" → still a string
- `After validation:` age is 25 → now an integer 

In [46]:
# Before Validation

class Delivery(BaseModel):
    pickup: int
    drop: int

    @model_validator(mode='before')
    @classmethod
    def fix_input(cls, data):
        print("Before validator sees raw input:", data)
        # Swap the values if they are reversed (raw input may not be ints yet, hence int())
        if int(data['drop']) < int(data['pickup']):
            data['pickup'], data['drop'] = data['drop'], data['pickup']
        return data


order1 = Delivery(pickup=15, drop=13)
print("After model validation:", order1.model_dump())

Before validator sees raw input: {'pickup': 15, 'drop': 13}
After model validation: {'pickup': 13, 'drop': 15}


`@model_validator` can handle multiple fields, but what if we want to focus on only one?


`@field_validator` is focused on only one field. Let's take a look at this now:

In [48]:
from pydantic import BaseModel, field_validator # importing field_validator

class Product(BaseModel):
    price: float

    @field_validator("price")
    @classmethod  # Field validators are class methods: the first argument is cls
    def must_be_positive(cls, value):
        if value <= 0:
            raise ValueError("Price must be greater than 0")
        return value


product1 = Product(price=-10)  # The same check can be written as Field(gt=0)

ValidationError: 1 validation error for Product
price
  Value error, Price must be greater than 0 [type=value_error, input_value=-10, input_type=int]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error
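
Like `@model_validator`, `@field_validator` accepts a `mode` argument. As a sketch under that assumption, a `before` field validator can clean up raw input before Pydantic converts it. `PricedItem` and the `"$19.99"` input are hypothetical examples.

```python
from pydantic import BaseModel, field_validator

class PricedItem(BaseModel):
    price: float

    @field_validator("price", mode="before")
    @classmethod
    def strip_currency_symbol(cls, value):
        # In "before" mode the value is still the raw input, so it may be a string
        if isinstance(value, str):
            return value.replace("$", "").strip()
        return value

# "$19.99" is cleaned to "19.99", then converted to a float as usual
item = PricedItem(price="$19.99")
print(item.price)
print(type(item.price))
```

Without the validator, `"$19.99"` would fail because it cannot be parsed as a number.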

## Built‑in Types

Pydantic ships with ready-made types for common patterns, so we need no regex or extra validation code. This makes development easier and more efficient!

🛠️ Let's take a look at a built-in example:


In [52]:
from pydantic import BaseModel, EmailStr, HttpUrl, PositiveInt

class Contact(BaseModel):
    email: EmailStr          # Needs pydantic[email]; e.g. if we remove the .com it will not work!
    website: HttpUrl         # Must be a valid http or https URL
    followers: PositiveInt   # Must be > 0

good = Contact(
    email="vaibhav@example.com",
    website="https://example.com",
    followers=10
)
print(good.model_dump())

{'email': 'vaibhav@example.com', 'website': HttpUrl('https://example.com/'), 'followers': 10}
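
To see these types reject bad data, here is a minimal sketch with a hypothetical `Profile` model. It leaves out `EmailStr` so that it runs without the optional email dependency.

```python
from pydantic import BaseModel, HttpUrl, PositiveInt, ValidationError

class Profile(BaseModel):  # Hypothetical model for illustration
    website: HttpUrl
    followers: PositiveInt

try:
    # Neither value is valid: the website is not a URL and followers is negative
    Profile(website="not a url", followers=-5)
    bad_fields = []
except ValidationError as e:
    bad_fields = sorted(err["loc"][0] for err in e.errors())
    for err in e.errors():
        print(err["loc"], err["msg"])

print(bad_fields)
```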


### Nested Models

A **nested model** is a Pydantic model used as the type of a field in another model.

It helps us organize data cleanly, like JSON objects inside objects.

In [53]:
# No additional imports needed!

class Address(BaseModel):
    street: str
    city: str
    postcode: str

class User(BaseModel):
    name: str
    email: str
    address: Address  

data = {
    "name": "Robert",
    "email": "robert@gmail.com",
    "address": {
        "street": "123 UCL Road",
        "city": "London",
        "postcode": "AB1 2CD"  # this postcode is completely made up btw!
    }
}

user = User(**data)
print(user.address.city)


London


In [56]:
# Let's do another, more difficult example together!

# This time we will be creating a list of lessons:
from typing import List  # importing List for the type hint

class Lesson(BaseModel):
    title: str
    duration_minutes: int
    is_free: bool

class Tutorial(BaseModel):
    name: str
    instructor: str
    lessons: List[Lesson]


data = {
    "name": "Learn Pydantic",
    "instructor": "Vaibhav",
    "lessons": [
        {"title": "Basic Models", "duration_minutes": 10, "is_free": True},
        {"title": "Aliases", "duration_minutes": 20, "is_free": False},
        {"title": "Validation", "duration_minutes": 30, "is_free": False}
    ]
}

pydantic_lesson = Tutorial(**data)
print(pydantic_lesson.model_dump())


{'name': 'Learn Pydantic', 'instructor': 'Vaibhav', 'lessons': [{'title': 'Basic Models', 'duration_minutes': 10, 'is_free': True}, {'title': 'Aliases', 'duration_minutes': 20, 'is_free': False}, {'title': 'Validation', 'duration_minutes': 30, 'is_free': False}]}
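
Validation reaches into every nested item. A minimal sketch of what happens when one lesson in the list is wrong. Here `bad_data` is made-up input, and `loc` is the location tuple that Pydantic v2 attaches to each error.

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Lesson(BaseModel):
    title: str
    duration_minutes: int
    is_free: bool

class Tutorial(BaseModel):
    name: str
    instructor: str
    lessons: List[Lesson]

bad_data = {
    "name": "Learn Pydantic",
    "instructor": "Vaibhav",
    "lessons": [
        {"title": "Basic Models", "duration_minutes": 10, "is_free": True},
        {"title": "Aliases", "duration_minutes": "twenty", "is_free": False},
    ],
}

try:
    Tutorial(**bad_data)
    error_path = None
except ValidationError as e:
    # loc is the path to the bad value: field name, list index, nested field name
    error_path = e.errors()[0]["loc"]

print(error_path)  # ('lessons', 1, 'duration_minutes')
```

The path pinpoints the second lesson (index 1), so we know exactly which item to fix.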


### Recursive Models

Similar to Nested Models, we can create Recursive Models: models with a field of their own type, which suits tree-like data.

In [57]:
from pydantic import BaseModel
from typing import List, Optional

class FamilyTree(BaseModel):
    name: str
    children: Optional[List["FamilyTree"]] = None  # Use string for forward reference

FamilyTree.model_rebuild()  # Resolves the "FamilyTree" forward reference; often not required for self-references in Pydantic v2, but harmless

data = {
    "name": "root",
    "children": [
        {
            "name": "child_1",
            "children": [
                {"name": "grandchild_1"},
                {"name": "grandchild_2"}
            ]
        },
        {"name": "child_2",
         "children": [{"name": "grandchild_3"}]}
    ]
}

tree = FamilyTree(**data)
print(tree.model_dump())


{'name': 'root', 'children': [{'name': 'child_1', 'children': [{'name': 'grandchild_1', 'children': None}, {'name': 'grandchild_2', 'children': None}]}, {'name': 'child_2', 'children': [{'name': 'grandchild_3', 'children': None}]}]}
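
The output above repeats `'children': None` for every leaf. One way to get a more compact dictionary, sketched here on a smaller hypothetical tree (`small_tree`), is the `exclude_none` argument of `model_dump()`:

```python
from typing import List, Optional
from pydantic import BaseModel

class FamilyTree(BaseModel):
    name: str
    children: Optional[List["FamilyTree"]] = None

FamilyTree.model_rebuild()

small_tree = FamilyTree(name="root", children=[{"name": "child_1"}])

# exclude_none drops every field whose value is None, at all nesting levels
compact = small_tree.model_dump(exclude_none=True)
print(compact)  # {'name': 'root', 'children': [{'name': 'child_1'}]}
```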


In [58]:
# I wrote this function for visualization purposes
def print_family_tree(node, indent=0):
    print(" " * indent + node.name)
    if node.children:
        for child in node.children:
            print_family_tree(child, indent + 2)

print_family_tree(tree)


root
  child_1
    grandchild_1
    grandchild_2
  child_2
    grandchild_3


# Congratulations! 🎉


We covered:
1. Basic Models
2. Fields
3. Aliasing
4. Validation
5. Built-In Types
6. Nested Models
7. Recursive Models

Where to go from here:

- https://www.datacamp.com/tutorial/pydantic
- https://www.datacamp.com/tutorial/python-user-input
- https://datacamp.com/courses/introduction-to-fastapi
- https://campus.datacamp.com/courses/model-validation-in-python
- https://campus.datacamp.com/courses/intermediate-python
