#### Custom Data Types

You can also define your own custom data types. There are several ways to do this.

##### Classes with `__get_validators__`

You can use a custom class with a classmethod `__get_validators__`. Pydantic calls it to obtain the validators that parse and validate the input data.

> These validators have the same semantics as in `Validators`: you can declare parameters such as `config`, `field`, etc.

In [1]:
import re
from pydantic import BaseModel, ValidationError
from pydantic.fields import ModelField
from typing import TypeVar, Generic

In [2]:
# https://en.wikipedia.org/wiki/Postcodes_in_the_United_Kingdom#Validation
post_code_regex = re.compile(
    r"(?:"
    r"([A-Z]{1,2}[0-9][A-Z0-9]?|ASCN|STHL|TDCU|BBND|[BFS]IQQ|PCRN|TKCA) ?"
    r"([0-9][A-Z]{2})|"
    r"(BFPO) ?([0-9]{1,4})|"
    r"(KY[0-9]|MSR|VG|AI)[ -]?[0-9]{4}|"
    r"([A-Z]{2}) ?([0-9]{2})|"
    r"(GE) ?(CX)|"
    r"(GIR) ?(0A{2})|"
    r"(SAN) ?(TA1)"
    r")"
)

In [3]:
class PostCode(str):
    """
    Partial UK postcode validation. Note: this is just an example, and is not
    intended for use in production; in particular this does NOT guarantee
    a postcode exists, just that it has a valid format.
    """
    
    @classmethod
    def __get_validators__(cls):
        # one or more validators may be yielded; they are called in order,
        # and each validator receives as input the value returned by the
        # previous one
        yield cls.validate
    
    @classmethod
    def validate(cls, v):
        if not isinstance(v, str):
            raise TypeError("string required")
        m = post_code_regex.fullmatch(v.upper())
        if not m:
            raise ValueError("invalid postcode format")
        # you could also return a string here which would mean model.post_code
        # would be a string, pydantic won't care but you could end up with some
        # confusion since the value's type won't match the type annotation
        # exactly
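        # groups 1 and 2 are only set for standard postcodes; for the special
        # cases (BFPO, overseas territories, ...) keep the matched text as-is
        if m.group(1) is None:
            return cls(m.group(0))
        return cls(f"{m.group(1)} {m.group(2)}")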
    
    def __repr__(self):
        return f"PostCode({super().__repr__()})"

In [4]:
class Model(BaseModel):
    post_code: PostCode

In [5]:
model = Model(post_code="sw8 5el")
print(f"{model = }")
print(f"{model.post_code = }")

model = Model(post_code=PostCode('SW8 5EL'))
model.post_code = PostCode('SW8 5EL')


In [6]:
print(f"{Model.schema() = }")

Model.schema() = {'title': 'Model', 'type': 'object', 'properties': {'post_code': {'title': 'Post Code', 'type': 'string'}}, 'required': ['post_code']}


Similar validation could be achieved using `constr(regex=...)`, with three differences: the value won't be reformatted with a space, the schema would include the full pattern, and the returned value would be a plain `str` rather than a `PostCode`.
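
For comparison, here is a minimal sketch of the `constr(regex=...)` alternative. The names `SIMPLE_POSTCODE` and `ConstrModel` are made up for this illustration, and the pattern is deliberately simpler than `post_code_regex`. The `try`/`except` import is an assumption so the cell also runs where pydantic 2.x is installed, which ships the 1.x API as `pydantic.v1`.

```python
try:
    # pydantic 2.x exposes the 1.x API under pydantic.v1
    from pydantic.v1 import BaseModel, constr
except ImportError:
    # plain pydantic 1.x
    from pydantic import BaseModel, constr

# Hypothetical, simplified pattern: only the common "outward inward" form,
# none of the special cases handled by post_code_regex.
SIMPLE_POSTCODE = r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$"


class ConstrModel(BaseModel):
    post_code: constr(regex=SIMPLE_POSTCODE)


constr_model = ConstrModel(post_code="SW85EL")

# The value is stored exactly as given: no space is inserted
print(f"{constr_model.post_code = }")
# The field holds a plain str, not a custom subclass
print(f"{type(constr_model.post_code) = }")
# The schema entry carries the pattern
print(f"{ConstrModel.schema()['properties']['post_code'] = }")
```

Whether the extra class is worth it depends on whether you need the normalised value and the dedicated type, or only the format check.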

##### Arbitrary Types Allowed

You can allow arbitrary types by setting `arbitrary_types_allowed = True` in the model's `Config` class (see `Model Config`).

In [7]:
class Pet:
    def __init__(self, name: str):
        self.name = name

In [8]:
class PetOwner(BaseModel):
    pet: Pet
    owner: str
    
    class Config:
        arbitrary_types_allowed = True

In [9]:
pet = Pet(name="Hedwig")
m = PetOwner(owner="Harry", pet=pet)
print(f"{m = }")
print(f"{m.pet = }")
print(f"{m.pet.name = }")
print(f"{type(m.pet) = }")

m = PetOwner(pet=<__main__.Pet object at 0x000001E72F222FD0>, owner='Harry')
m.pet = <__main__.Pet object at 0x000001E72F222FD0>
m.pet.name = 'Hedwig'
type(m.pet) = <class '__main__.Pet'>


In [10]:
try:
    m = PetOwner(owner="Harry", pet="Hedwig")
    print(m)
except ValidationError as e:
    print(e)

1 validation error for PetOwner
pet
  instance of Pet expected (type=type_error.arbitrary_type; expected_arbitrary_type=Pet)


In [11]:
pet = Pet(name=42)
m = PetOwner(owner="Harry", pet=pet)
print(f"{m = }")
print(f"{m.pet = }")
print(f"{m.pet.name = }")
print(f"{type(m.pet) = }")

m = PetOwner(pet=<__main__.Pet object at 0x000001E72F24CA30>, owner='Harry')
m.pet = <__main__.Pet object at 0x000001E72F24CA30>
m.pet.name = 42
type(m.pet) = <class '__main__.Pet'>
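
The last cell shows that `arbitrary_types_allowed` only checks that the value is an instance of `Pet`; the attributes of that instance (here `name=42`) are not validated. If that matters, one option is to give the class its own `__get_validators__`, as in the previous section. The following is a minimal sketch: `ValidatedPet`, `ValidatedPetOwner` and the checks inside `validate` are hypothetical, and the `try`/`except` import is an assumption so the cell also runs where pydantic 2.x is installed (it ships the 1.x API as `pydantic.v1`).

```python
try:
    # pydantic 2.x exposes the 1.x API under pydantic.v1
    from pydantic.v1 import BaseModel, ValidationError
except ImportError:
    # plain pydantic 1.x
    from pydantic import BaseModel, ValidationError


class ValidatedPet:
    def __init__(self, name):
        self.name = name

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        # same instance check that arbitrary_types_allowed would perform
        if not isinstance(v, cls):
            raise TypeError("instance of ValidatedPet expected")
        # additional check on the attribute
        if not isinstance(v.name, str):
            raise TypeError("pet name must be a string")
        return v


class ValidatedPetOwner(BaseModel):
    # no arbitrary_types_allowed here: the class supplies its own validator
    pet: ValidatedPet
    owner: str


ok = ValidatedPetOwner(owner="Harry", pet=ValidatedPet(name="Hedwig"))
print(f"{ok.pet.name = }")

try:
    ValidatedPetOwner(owner="Harry", pet=ValidatedPet(name=42))
except ValidationError as e:
    print(e)
```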


##### Generic Classes as Types

> ##### Warning
>
> This is an advanced technique that you might not need in the beginning. In most cases you will probably be fine with standard _pydantic_ models.

You can use `Generic Classes` as field types and perform custom validation based on the "type parameters" (or sub-types) with `__get_validators__`.

If the Generic class that you are using as a sub-type has a classmethod `__get_validators__`, you don't need to set `arbitrary_types_allowed` for it to work.

Because you can declare validators that receive the current `field`, you can extract the `sub_fields` (from the generic class type parameters) and validate data with them.

In [12]:
AgedType = TypeVar("AgedType")
QualityType = TypeVar("QualityType")

In [13]:
class TastingModel(Generic[AgedType, QualityType]):
    def __init__(self, name: str, aged: AgedType, quality: QualityType):
        self.name = name
        self.aged = aged
        self.quality = quality
    
    @classmethod
    def __get_validators__(cls):
        yield cls.validate
    
    @classmethod
    def validate(cls, v, field: ModelField):
        if not isinstance(v, cls):
            raise TypeError("invalid value")
        if not field.sub_fields:
            return v
        
        aged_f = field.sub_fields[0]
        quality_f = field.sub_fields[1]
        errors = []
        
        valid_value, error = aged_f.validate(v.aged, {}, loc="aged")
        if error:
            errors.append(error)
        
        valid_value, error = quality_f.validate(v.quality, {}, loc="quality")
        if error:
            errors.append(error)
        
        if errors:
            raise ValidationError(errors, cls)
        return v

In [14]:
class Tasting(BaseModel):
    # for wine, "aged" is an int with years, "quality" is a float
    wine: TastingModel[int, float]
    # for cheese, "aged" is a bool, "quality" is a str
    cheese: TastingModel[bool, str]
    # for thing, both "aged" and "quality" are Any
    thing: TastingModel

In [15]:
model = Tasting(
    # This wine was aged for 20 years and has a quality of 85.6
    wine=TastingModel(name="Cabernet Sauvignon", aged=20, quality=85.6),
    # This cheese is aged (is mature) and has "Good" quality
    cheese=TastingModel(name="Gouda", aged=True, quality="Good"),
    # This Python thing has aged "Not much" and has a quality "Awesome"
    thing=TastingModel(name="Python", aged="Not much", quality="Awesome"),
)

In [16]:
print(f"{model = }")
print(f"{model.wine.aged = }")
print(f"{model.wine.quality = }")
print(f"{model.cheese.aged = }")
print(f"{model.cheese.quality = }")
print(f"{model.thing.aged = }")

model = Tasting(wine=<__main__.TastingModel object at 0x000001E72F5CB550>, cheese=<__main__.TastingModel object at 0x000001E72F5CB640>, thing=<__main__.TastingModel object at 0x000001E72F5C09A0>)
model.wine.aged = 20
model.wine.quality = 85.6
model.cheese.aged = True
model.cheese.quality = 'Good'
model.thing.aged = 'Not much'


In [17]:
try:
    m = Tasting(
        # For wine, aged should be an int with the years, and quality a float
        wine=TastingModel(name="Merlot", aged=True, quality="Kinda good"),
        # For cheese, aged should be a bool, and quality a str
        cheese=TastingModel(name="Gouda", aged="yeah", quality=5),
        # For thing, no type parameters are declared, so validation of the
        # sub-fields is skipped in the TastingModel.validate() function
        thing=TastingModel(name="Python", aged="Not much", quality="Awesome"),
    )
except ValidationError as e:
    print(e)

2 validation errors for Tasting
wine -> quality
  value is not a valid float (type=type_error.float)
cheese -> aged
  value could not be parsed to a boolean (type=type_error.bool)
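
Note that `TastingModel.validate` discards `valid_value` and returns the original object, so a value that the sub-field would coerce (for example the string `"20"` for an `int` parameter) is stored unchanged. If the coerced values are wanted, one option is to build a new instance from them. This is a sketch under that assumption: `CoercingTasting` and `CoercedTastingModel` are hypothetical names, and the `try`/`except` import is only there so the cell also runs where pydantic 2.x is installed (it ships the 1.x API as `pydantic.v1`).

```python
from typing import Generic, TypeVar

try:
    # pydantic 2.x exposes the 1.x API under pydantic.v1
    from pydantic.v1 import BaseModel, ValidationError
    from pydantic.v1.fields import ModelField
except ImportError:
    # plain pydantic 1.x
    from pydantic import BaseModel, ValidationError
    from pydantic.fields import ModelField

AgedT = TypeVar("AgedT")
QualityT = TypeVar("QualityT")


class CoercingTasting(Generic[AgedT, QualityT]):
    def __init__(self, name: str, aged: AgedT, quality: QualityT):
        self.name = name
        self.aged = aged
        self.quality = quality

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v, field: ModelField):
        if not isinstance(v, cls):
            raise TypeError("invalid value")
        if not field.sub_fields:
            # no type parameters declared: nothing to validate against
            return v

        aged_f, quality_f = field.sub_fields
        aged, aged_err = aged_f.validate(v.aged, {}, loc="aged")
        quality, quality_err = quality_f.validate(v.quality, {}, loc="quality")

        errors = [err for err in (aged_err, quality_err) if err]
        if errors:
            raise ValidationError(errors, cls)
        # keep the coerced values by building a new instance
        return cls(name=v.name, aged=aged, quality=quality)


class CoercedTastingModel(BaseModel):
    wine: CoercingTasting[int, float]


result = CoercedTastingModel(
    wine=CoercingTasting(name="Merlot", aged="20", quality="85.6")
)
print(f"{type(result.wine.aged) = }")
print(f"{type(result.wine.quality) = }")
```

Returning a new instance leaves the caller's object untouched; assigning the validated values back onto `v` would also work, but it mutates the input.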
