# Custom Validators using Annotations

We saw earlier in this course how we could attach a `Field` object directly to a field:

In [1]:
from pydantic import BaseModel, Field, ValidationError

class Model(BaseModel):
    number: int = Field(gt=0, lt=5)

Or we could do it using an annotated type:

In [2]:
from typing import Annotated

BoundedInt = Annotated[int, Field(gt=0, lt=5)]

class Model(BaseModel):
    number: BoundedInt
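As a quick illustration of why the annotated form is convenient, the sketch below reuses `BoundedInt` in two models and collects the error type Pydantic reports when a bound is violated. The `Rating` and `Priority` models and their field names are hypothetical, invented only for this example.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError

BoundedInt = Annotated[int, Field(gt=0, lt=5)]


# Hypothetical models, used only to show the same alias being reused.
class Rating(BaseModel):
    stars: BoundedInt


class Priority(BaseModel):
    level: BoundedInt


rating = Rating(stars=4)

# 0 violates gt=0 and 5 violates lt=5; record the reported error types.
error_types = []
for model, field, value in [(Rating, "stars", 0), (Priority, "level", 5)]:
    try:
        model(**{field: value})
    except ValidationError as ex:
        error_types.append(ex.errors()[0]["type"])

print(rating)
print(error_types)
```

The constraint is declared once, and both models enforce it.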

We can do something similar with validators.

First, we define our validation function. Because we are essentially defining this function outside of a class, it is a regular function, not a class method (so we don't need that `cls` argument at all).

Let's do our datetime example, starting with a before validator:

In [3]:
from datetime import datetime
from typing import Any

from dateutil.parser import parse

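def parse_datetime(value: Any):
    # Only strings are parsed here; any other input is passed through unchanged
    # so that Pydantic's own datetime validation can handle it.
    if isinstance(value, str):
        try:
            return parse(value)
        except Exception as ex:
            # Chain the original exception so the root cause is preserved.
            raise ValueError(str(ex)) from ex
    return value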

We now want to attach this, as a before validator, to a datetime type, using an annotation:

In [4]:
from pydantic import BeforeValidator

In [5]:
DateTime = Annotated[datetime, BeforeValidator(parse_datetime)]

Now we can use this annotated type in any model:

In [6]:
class Model(BaseModel):
    dt: DateTime

In [7]:
Model(dt="2020/1/1 3pm")

Model(dt=datetime.datetime(2020, 1, 1, 15, 0))
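It is also worth seeing what happens when the before validator fails. The sketch below is a minimal stand-in, assuming we only care about ISO 8601 strings: it swaps `dateutil` for the standard library's `datetime.fromisoformat` so that it runs without extra dependencies. The names `parse_iso_datetime`, `IsoDateTime` and `Event` are hypothetical.

```python
from datetime import datetime
from typing import Annotated, Any

from pydantic import BaseModel, BeforeValidator, ValidationError


def parse_iso_datetime(value: Any) -> Any:
    # Stand-in for parse_datetime: only ISO 8601 strings are handled here.
    if isinstance(value, str):
        # fromisoformat raises ValueError when the string is not valid
        return datetime.fromisoformat(value)
    return value


IsoDateTime = Annotated[datetime, BeforeValidator(parse_iso_datetime)]


class Event(BaseModel):
    start: IsoDateTime


ok = Event(start="2020-01-01T15:00:00")

try:
    Event(start="not a date")
except ValidationError as ex:
    # The ValueError raised in the validator is reported as a validation error
    error_count = ex.error_count()
    error_type = ex.errors()[0]["type"]
    print(ex)
```

A `ValueError` raised inside the validator function surfaces as a regular `ValidationError` on the model, just as it did with decorator-based validators.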

We can add the after validator we wrote earlier in the same way, using annotations.

In [8]:
from pydantic import AfterValidator

In [9]:
import pytz

def make_utc(dt: datetime) -> datetime:
    if dt.tzinfo is None:
        dt = pytz.utc.localize(dt)
    else:
        dt = dt.astimezone(pytz.utc)
    return dt

In [10]:
DateTimeUTC = Annotated[datetime, BeforeValidator(parse_datetime), AfterValidator(make_utc)]

And we can use this in any model:

In [11]:
class Model(BaseModel):
    dt: DateTimeUTC

In [12]:
Model(dt="2020/1/1 3pm")

Model(dt=datetime.datetime(2020, 1, 1, 15, 0, tzinfo=<UTC>))

In [13]:
eastern = pytz.timezone('US/Eastern')
dt = eastern.localize(datetime(2020, 1, 1, 3, 0, 0))

Model(dt=dt)

Model(dt=datetime.datetime(2020, 1, 1, 8, 0, tzinfo=<UTC>))
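Because the validators travel with the type, an annotated type like this can also be validated on its own, without a model, using Pydantic's `TypeAdapter`. The sketch below is a variation rather than the code above: it assumes the standard library's `datetime.timezone` in place of `pytz`, and the names `make_utc_stdlib` and `UTCDateTime` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Annotated

from pydantic import AfterValidator, TypeAdapter


def make_utc_stdlib(dt: datetime) -> datetime:
    # Same logic as make_utc, using datetime.timezone instead of pytz.
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


UTCDateTime = Annotated[datetime, AfterValidator(make_utc_stdlib)]

# TypeAdapter validates a bare type, so no model class is needed.
adapter = TypeAdapter(UTCDateTime)

# Naive datetime: treated as already being in UTC.
naive = adapter.validate_python(datetime(2020, 1, 1, 15, 0))

# Aware datetime at a fixed UTC-5 offset: converted to UTC.
aware = adapter.validate_python(
    datetime(2020, 1, 1, 3, 0, tzinfo=timezone(timedelta(hours=-5)))
)

print(naive)
print(aware)
```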

We can also specify multiple before and after validators. They are executed in this order:
- before validators, right to left
- Pydantic's own validation for the type
- after validators, left to right

(Left and right refer to the order in which the validators are listed in the annotation.)

In [14]:
def before_validator_1(value):
    print("before_validator_1")
    return value

def before_validator_2(value):
    print("before_validator_2")
    return value
    
def before_validator_3(value):
    print("before_validator_3")
    return value

def after_validator_1(value):
    print("after_validator_1")
    return value

def after_validator_2(value):
    print("after_validator_2")
    return value

def after_validator_3(value):
    print("after_validator_3")
    return value

In [15]:
CustomType = Annotated[
    int, 
    BeforeValidator(before_validator_1),
    AfterValidator(after_validator_1),
    BeforeValidator(before_validator_2),
    AfterValidator(after_validator_2),
    AfterValidator(after_validator_3),
    BeforeValidator(before_validator_3),
]

In [16]:
class Model(BaseModel):
    number: CustomType

In [17]:
Model(number=10)

before_validator_3
before_validator_2
before_validator_1
after_validator_1
after_validator_2
after_validator_3


Model(number=10)
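The same ordering can be checked without relying on printed output. The sketch below records each call in a list, along with the type of the value each validator receives. The `tracker` helper and the names `Tracked` and `OrderDemo` are hypothetical. Passing the string `"10"` shows that before validators see the raw input, while after validators see the value Pydantic has already converted to an `int`.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, BeforeValidator

calls: list[tuple[str, str]] = []


def tracker(name: str):
    # Build a pass-through validator that records its name and input type.
    def validator(value):
        calls.append((name, type(value).__name__))
        return value

    return validator


Tracked = Annotated[
    int,
    BeforeValidator(tracker("before_1")),
    AfterValidator(tracker("after_1")),
    BeforeValidator(tracker("before_2")),
    AfterValidator(tracker("after_2")),
]


class OrderDemo(BaseModel):
    number: Tracked


m = OrderDemo(number="10")

for call in calls:
    print(call)
```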

Let's look at another example of using annotations for validators.

Suppose we want to define a field that is a list, of some type, that only contains unique elements.

We'll want to make it reusable, so we'll implement this using annotations.

First, we'll start with an annotated type for just integers, then we'll use the same `TypeVar` technique I showed you earlier to extend this to arbitrary types.

Let's write our validator first:

In [18]:
def are_elements_unique(values: list[Any]) -> list[Any]:
    unique_elements = []
    for value in values:
        if value in unique_elements:
            raise ValueError("elements must be unique")
        unique_elements.append(value)
    return values

You may be wondering why I did not use a `set` to check if all elements are unique:

In [19]:
len(set([1, 2, 3, 4, 5])) == len([1, 2, 3, 4, 5])

True

In [20]:
len(set([1, 1, 3, 4, 5])) == len([1, 1, 3, 4, 5])

False

The reason is that sets can only contain hashable values, and we may want our lists to hold types that are not hashable. There are probably better ways of doing this, but it will do for our purposes here.
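To make the hashability point concrete, here is a small sketch using a list of lists (lists are not hashable). The sample data is made up for illustration, and the validator is repeated so the block runs on its own.

```python
from typing import Any


def are_elements_unique(values: list[Any]) -> list[Any]:
    unique_elements = []
    for value in values:
        # "in" on a list compares with ==, so no hashing is required
        if value in unique_elements:
            raise ValueError("elements must be unique")
        unique_elements.append(value)
    return values


nested = [[1, 2], [3, 4], [1, 2]]

# The set-based approach cannot even be attempted on unhashable elements.
try:
    set(nested)
    set_error = None
except TypeError as ex:
    set_error = str(ex)

# The list-based approach works and detects the duplicate [1, 2].
try:
    are_elements_unique(nested)
    duplicate_found = False
except ValueError:
    duplicate_found = True

print(set_error)
print(duplicate_found)
```

The trade-off is speed: each membership test scans the list, so the check is quadratic in the length of the list, which is acceptable for the short lists used here.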

Next, let's create the annotated type:

In [21]:
UniqueIntegerList = Annotated[list[int], AfterValidator(are_elements_unique)]

In [22]:
class Model(BaseModel):
    numbers: UniqueIntegerList = []

In [23]:
m = Model(numbers=(1, 2, 3, 4, 5))
m

Model(numbers=[1, 2, 3, 4, 5])

In [24]:
try:
    Model(numbers=[1, 1, 2, 3])
except ValidationError as ex:
    print(ex)

1 validation error for Model
numbers
  Value error, elements must be unique [type=value_error, input_value=[1, 1, 2, 3], input_type=list]
    For further information visit https://errors.pydantic.dev/2.5/v/value_error


Now, let's extend this a bit further, so we can handle homogeneous lists of any particular type:

In [25]:
from typing import TypeVar

In [26]:
T = TypeVar('T')

In [27]:
UniqueList = Annotated[list[T], AfterValidator(are_elements_unique)]

In [28]:
class Model(BaseModel):
    numbers: UniqueList[int] = []
    strings: UniqueList[str] = []

In [29]:
Model(numbers=[1, 2, 3], strings=["pyt", "hon"])

Model(numbers=[1, 2, 3], strings=['pyt', 'hon'])

In [30]:
try:
    Model(numbers=[1, 1, 2])
except ValidationError as ex:
    print(ex)

1 validation error for Model
numbers
  Value error, elements must be unique [type=value_error, input_value=[1, 1, 2], input_type=list]
    For further information visit https://errors.pydantic.dev/2.5/v/value_error


In [31]:
try:
    Model(numbers=["a", 2, 3], strings=[1, "b"])
except ValidationError as ex:
    print(ex)

2 validation errors for Model
numbers.0
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='a', input_type=str]
    For further information visit https://errors.pydantic.dev/2.5/v/int_parsing
strings.0
  Input should be a valid string [type=string_type, input_value=1, input_type=int]
    For further information visit https://errors.pydantic.dev/2.5/v/string_type
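Since the validator does not rely on hashing, the generic type should also work when the element type is itself unhashable. The sketch below tries this with `UniqueList[list[int]]`. The `Grid` model is hypothetical, and the definitions are repeated so the block is self-contained.

```python
from typing import Annotated, Any, TypeVar

from pydantic import AfterValidator, BaseModel, ValidationError

T = TypeVar("T")


def are_elements_unique(values: list[Any]) -> list[Any]:
    unique_elements = []
    for value in values:
        if value in unique_elements:
            raise ValueError("elements must be unique")
        unique_elements.append(value)
    return values


UniqueList = Annotated[list[T], AfterValidator(are_elements_unique)]


# Hypothetical model: each row is a list of integers, and rows must be unique.
class Grid(BaseModel):
    rows: UniqueList[list[int]]


grid = Grid(rows=[[1, 2], [3, 4]])

try:
    Grid(rows=[[1, 2], [1, 2]])
    grid_error = None
except ValidationError as ex:
    grid_error = ex.errors()[0]["type"]
    print(ex)
```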


We could take this a step further: besides requiring a homogeneous list with unique elements, we can also bound the size of the list using `Field`:

In [32]:
UniqueList = Annotated[
    list[T], 
    Field(min_length=1, max_length=5), 
    AfterValidator(are_elements_unique)
]

In [33]:
class Model(BaseModel):
    numbers: UniqueList[int] = []
    strings: UniqueList[str] = []

In [34]:
Model(numbers=[1, 2, 3], strings=["a", "b", "c"])

Model(numbers=[1, 2, 3], strings=['a', 'b', 'c'])

In [35]:
try:
    Model(numbers=[], strings=list("python"))
except ValidationError as ex:
    print(ex)

2 validation errors for Model
numbers
  List should have at least 1 item after validation, not 0 [type=too_short, input_value=[], input_type=list]
    For further information visit https://errors.pydantic.dev/2.5/v/too_short
strings
  List should have at most 5 items after validation, not 6 [type=too_long, input_value=['p', 'y', 't', 'h', 'o', 'n'], input_type=list]
    For further information visit https://errors.pydantic.dev/2.5/v/too_long


Hopefully you noticed an issue with the defaults in our model!

We now have a constraint that specifies that the lists cannot be empty, yet we set the defaults to empty lists.

In this case, since I don't know what a suitable default would be, I will remove the defaults and make the fields required:

In [36]:
class Model(BaseModel):
    numbers: UniqueList[int]
    strings: UniqueList[str]
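The empty defaults were easy to miss because Pydantic does not validate default values unless asked to. As a sketch of how such a mismatch could be caught, the example below compares a model with an unvalidated default against one that sets `validate_default=True` in its model configuration. The `Unchecked` and `Checked` model names are hypothetical.

```python
from typing import Annotated, Any, TypeVar

from pydantic import AfterValidator, BaseModel, ConfigDict, Field, ValidationError

T = TypeVar("T")


def are_elements_unique(values: list[Any]) -> list[Any]:
    unique_elements = []
    for value in values:
        if value in unique_elements:
            raise ValueError("elements must be unique")
        unique_elements.append(value)
    return values


UniqueList = Annotated[
    list[T],
    Field(min_length=1, max_length=5),
    AfterValidator(are_elements_unique),
]


class Unchecked(BaseModel):
    # The default is not validated, so the empty list slips through.
    numbers: UniqueList[int] = []


class Checked(BaseModel):
    # validate_default=True makes Pydantic validate defaults as well.
    model_config = ConfigDict(validate_default=True)

    numbers: UniqueList[int] = []


unchecked = Unchecked()

try:
    Checked()
    default_error = None
except ValidationError as ex:
    default_error = ex.errors()[0]["type"]

print(unchecked)
print(default_error)
```

Making the fields required, as done above, sidesteps the question entirely, which is the simpler choice when no sensible default exists.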