# Custom Serializers

So far we have been happy with the way Pydantic serializes field values.

But sometimes, especially with certain data types, we may want to control how fields get serialized.

A typical example is specifying how a date or datetime object gets serialized.

Another example is standardizing the number of decimal places used for floats.

Whatever your need, you can control how field data gets serialized very easily.

We'll need to use a **decorator** provided by Pydantic, called `@field_serializer`, which controls serialization at the field level.

In [1]:
from pydantic import BaseModel, field_serializer

The decorator has several arguments that define which fields the serializer applies to and how the serializer is applied.

One important option is:
- `when_used`: by default the custom serializer is always used, but we have other options available:
    - `always`: the default, serializer is executed when serializing either to a dict or to JSON
    - `unless-none`: serializer is not used if the value is None
    - `json`: serializer is only used when serializing to JSON
    - `json-unless-none`: serializer used when serializing to JSON, unless the value is None

There is also a `mode` option (`plain` vs `wrap`), but this is rarely used, and I won't cover it in this course.

Let's take a look at the `when_used` option and understand the circumstances in which our serializer gets called.

In [2]:
from datetime import datetime

class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="always")
    def serialize_dt(self, value):
        print(f"type = {type(value)}")
        return value

In [3]:
m = Model(dt="2020-01-01T12:00:00")
m

Model(dt=datetime.datetime(2020, 1, 1, 12, 0))

So, the first thing to realize is that the serializer runs after the model has been populated and validated, which means that `value` in our arguments will already be of the validated type (or `None` in our specific example, since we made the field nullable).

In [4]:
m.model_dump()

type = <class 'datetime.datetime'>


{'dt': datetime.datetime(2020, 1, 1, 12, 0)}

As you can see, our serializer was invoked - the return value is what will be used in the serialized output.
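The same mechanism covers the float example mentioned at the start. The following is a minimal sketch, not part of the original walkthrough: the `Price` model and its `amount` field are hypothetical names chosen for illustration, and rounding to two decimal places is an assumed requirement.

```python
from pydantic import BaseModel, field_serializer


class Price(BaseModel):
    # Hypothetical model used only for this illustration
    amount: float

    @field_serializer("amount")
    def round_amount(self, value: float) -> float:
        # The returned value replaces the field's value in the serialized output
        return round(value, 2)


p = Price(amount=3.14159)
print(p.model_dump())       # {'amount': 3.14}
print(p.model_dump_json())  # {"amount":3.14}
print(p.amount)             # 3.14159 (the value stored on the model is unchanged)
```

Note that only the serialized output is rounded; the value held by the model instance keeps its full precision.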

Let's serialize to JSON and see what happens:

In [5]:
m.model_dump_json()

type = <class 'datetime.datetime'>


'{"dt":"2020-01-01T12:00:00"}'

The data was correctly serialized, since Pydantic knows how to serialize a datetime to JSON (it uses the ISO 8601 format).

However, when serializing to JSON we may not want this datetime representation. 

We'll get back to that in a minute.

Let's see what happens if the value of `dt` is None:

In [6]:
m = Model()
m

Model(dt=None)

In [7]:
m.model_dump()

type = <class 'NoneType'>


{'dt': None}

In [8]:
m.model_dump_json()

type = <class 'NoneType'>


'{"dt":null}'

As you can see, our custom serializer was called in both cases.

If we don't want to run our custom serializer when the field value is `None`, we can use one of the other `when_used` options:

In [9]:
from datetime import datetime

class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="unless-none")
    def serialize_dt(self, value):
        print(f"type = {type(value)}")
        return value

In [10]:
m = Model(dt="2020-01-01T12:00:00")
m

Model(dt=datetime.datetime(2020, 1, 1, 12, 0))

In [11]:
m.model_dump()

type = <class 'datetime.datetime'>


{'dt': datetime.datetime(2020, 1, 1, 12, 0)}

In [12]:
m = Model()
m

Model(dt=None)

In [13]:
m.model_dump()

{'dt': None}

In [14]:
m.model_dump_json()

'{"dt":null}'

As you can see, our serializer did not get called when `dt` was `None`.

Let's go back to the case where we only want to change the serialization when serializing to JSON. We might be OK with the dictionary serialization, but for our JSON output we want the datetime formatted like this:

```
2020/1/1 12:00 PM
```

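We can use the `strftime()` method to do this (note that the `%-m` and `%-d` codes, which drop the leading zero, are platform-specific: they work on Linux and macOS, but not on Windows):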

In [15]:
dt = datetime(2020, 1, 1, 12, 0, 0)
dt.isoformat()

'2020-01-01T12:00:00'

In [16]:
dt.strftime("%Y/%-m/%-d %I:%M %p")

'2020/1/1 12:00 PM'
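Because `%-m` and `%-d` are not available in every platform's `strftime()`, a platform-independent alternative is to build the date part from the datetime's integer attributes. This is only a sketch, and the helper name `format_short` is made up for this example.

```python
from datetime import datetime


def format_short(dt: datetime) -> str:
    # year, month and day are plain integers, so no zero padding is applied;
    # %I, %M and %p are standard format codes
    return f"{dt.year}/{dt.month}/{dt.day} {dt.strftime('%I:%M %p')}"


print(format_short(datetime(2020, 1, 1, 12, 0, 0)))  # 2020/1/1 12:00 PM
```

The output of `%p` depends on the current locale, so the `PM` shown above assumes an English (or default C) locale.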

So, let's use this in our serializer, and configure the serializer to apply only to JSON serialization, and only when the value is not `None`:

In [17]:
from datetime import datetime

class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="json-unless-none")
    def serialize_dt(self, value):
        print(f"type = {type(value)}")
        return value.strftime("%Y/%-m/%-d %I:%M %p")

In [18]:
m = Model(dt="2020-01-01T12:00:00")
m

Model(dt=datetime.datetime(2020, 1, 1, 12, 0))

In [19]:
m.model_dump()

{'dt': datetime.datetime(2020, 1, 1, 12, 0)}

As you can see, serializing to a dictionary did not run our serializer.

However, when serializing to JSON:

In [20]:
m.model_dump_json()

type = <class 'datetime.datetime'>


'{"dt":"2020/1/1 12:00 PM"}'

And, because of our configuration, the serializer will not be invoked if the value is `None`:

In [21]:
m = Model()
m

Model(dt=None)

In [22]:
m.model_dump_json()

'{"dt":null}'

Now suppose we want to implement a different serialization depending on whether we are serializing to a dictionary or to JSON.

We need to somehow be able to figure out, inside our serializer, which serialization we are performing and react accordingly.

Pydantic supports an additional argument that we can add to our serializer function: an argument of type `FieldSerializationInfo`. Let's take a look:

In [23]:
from pydantic import FieldSerializationInfo

In [24]:
class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="unless-none")
    def dt_serializer(self, value, info: FieldSerializationInfo):
        print(f"info={info}")
        return value

In [25]:
m = Model(dt=datetime(2020, 1, 1))
m

Model(dt=datetime.datetime(2020, 1, 1, 0, 0))

In [26]:
m.model_dump()

info=SerializationInfo(include=None, exclude=None, mode='python', by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False)


{'dt': datetime.datetime(2020, 1, 1, 0, 0)}

Notice the `mode` value in the `info` object? It is set to `python`.

Now, let's dump to JSON:

In [27]:
m.model_dump_json()

info=SerializationInfo(include=None, exclude=None, mode='json', by_alias=False, exclude_unset=False, exclude_defaults=False, exclude_none=False, round_trip=False)


'{"dt":"2020-01-01T00:00:00"}'

Notice that the `mode` is now set to `json`.

We could test that value directly, but `FieldSerializationInfo` offers a method named `mode_is_json()` that we can use instead.

In [28]:
class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="unless-none")
    def dt_serializer(self, value, info: FieldSerializationInfo):
        print(f"mode_is_json={info.mode_is_json()}")
        return value

In [29]:
m = Model(dt=datetime(2020, 1, 1))

In [30]:
m.model_dump()

mode_is_json=False


{'dt': datetime.datetime(2020, 1, 1, 0, 0)}

In [31]:
m.model_dump_json()

mode_is_json=True


'{"dt":"2020-01-01T00:00:00"}'
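One related detail: `model_dump()` also accepts `mode="json"`, which produces a dictionary of JSON-compatible values, and in that case `mode_is_json()` returns `True` as well. A small sketch, where the `Event` model and the date-only format are assumptions made for illustration:

```python
from datetime import datetime

from pydantic import BaseModel, FieldSerializationInfo, field_serializer


class Event(BaseModel):
    # Hypothetical model used only for this illustration
    dt: datetime

    @field_serializer("dt")
    def dt_serializer(self, value: datetime, info: FieldSerializationInfo):
        if info.mode_is_json():
            # JSON mode: emit a date-only string
            return value.strftime("%Y-%m-%d")
        # Python mode: leave the datetime object untouched
        return value


e = Event(dt=datetime(2020, 1, 1))
print(e.model_dump())             # {'dt': datetime.datetime(2020, 1, 1, 0, 0)}
print(e.model_dump(mode="json"))  # {'dt': '2020-01-01'}
print(e.model_dump_json())        # {"dt":"2020-01-01"}
```

So the check is about the serialization mode, not about whether the final output is a string or a dictionary.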

Let's look at a situation where we might want this flexibility.

Let's say we want our serializer to ensure that datetime objects are always serialized to timezone-aware UTC times. Furthermore, we want the serialized value to use the `Z` notation for UTC times, instead of the `+00:00` that Python's `isoformat()` method returns.

We can easily write Python code to do this, using the `pytz` library.

To complete this example, you'll need to make sure you have `pytz` installed in your virtual environment.

Let's write a simple Python function that will do the following, given a datetime object as an argument:
- if the datetime is naive, make it aware, and assume the naive datetime was already UTC
- if the datetime is aware, change it to be UTC

In [32]:
import pytz

def make_utc(dt: datetime) -> datetime:
    if dt.tzinfo is None:
        dt = pytz.utc.localize(dt)
    else:
        dt = dt.astimezone(pytz.utc)
    return dt

We can use it this way:

In [33]:
dt = make_utc(datetime.now())
dt

datetime.datetime(2023, 12, 2, 11, 15, 30, 559094, tzinfo=<UTC>)

In [34]:
dt.isoformat()

'2023-12-02T11:15:30.559094+00:00'

We need to change the serialized format of this datetime, and since we know it will always be in UTC, this is quite simple:

In [35]:
dt.strftime("%Y-%m-%dT%H:%M:%SZ")

'2023-12-02T11:15:30Z'

Let's make a function for this:

In [36]:
def dt_utc_json_serializer(dt: datetime) -> str:
    dt = make_utc(dt)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
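Note that this format drops any microseconds. If sub-second precision matters, one possible variation is to keep `isoformat()` and swap the offset suffix. This is a sketch using only the standard library; the function name `utc_iso_z` is hypothetical, and it assumes the datetime passed in is already timezone-aware and in UTC.

```python
from datetime import datetime, timezone


def utc_iso_z(dt: datetime) -> str:
    # Assumes dt is timezone-aware and already in UTC, so isoformat() ends in +00:00.
    # timespec="milliseconds" truncates (does not round) to three decimal places.
    return dt.isoformat(timespec="milliseconds").replace("+00:00", "Z")


sample = datetime(2023, 12, 2, 11, 15, 30, 559094, tzinfo=timezone.utc)
print(utc_iso_z(sample))  # 2023-12-02T11:15:30.559Z
```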

And now let's use `make_utc` and `dt_utc_json_serializer` in our custom serializer:

In [37]:
class Model(BaseModel):
    dt: datetime | None = None

    @field_serializer("dt", when_used="unless-none")
    def dt_serializer(self, dt, info: FieldSerializationInfo):
        if info.mode_is_json():
            return dt_utc_json_serializer(dt)
        return make_utc(dt)

In [38]:
m = Model(dt=datetime(2020, 1, 1))
m

Model(dt=datetime.datetime(2020, 1, 1, 0, 0))

In [39]:
m.model_dump()

{'dt': datetime.datetime(2020, 1, 1, 0, 0, tzinfo=<UTC>)}

In [40]:
m.model_dump_json()

'{"dt":"2020-01-01T00:00:00Z"}'

And if we have an aware datetime that is not in UTC already:

In [41]:
eastern = pytz.timezone('US/Eastern')
dt = eastern.localize(datetime(2020, 1, 1))
dt

datetime.datetime(2020, 1, 1, 0, 0, tzinfo=<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>)

Now let's use it in our model:

In [42]:
m = Model(dt=dt)
m

Model(dt=datetime.datetime(2020, 1, 1, 0, 0, tzinfo=<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>))

In [43]:
m.model_dump()

{'dt': datetime.datetime(2020, 1, 1, 5, 0, tzinfo=<UTC>)}

In [44]:
m.model_dump_json()

'{"dt":"2020-01-01T05:00:00Z"}'
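If adding `pytz` as a dependency is not desirable, the same behavior can be approximated with the standard library's `timezone.utc`. This is a sketch rather than a drop-in replacement: the names `make_utc_stdlib` and `to_utc_z` are hypothetical, and a fixed UTC-5 offset stands in for US/Eastern in January.

```python
from datetime import datetime, timedelta, timezone


def make_utc_stdlib(dt: datetime) -> datetime:
    if dt.tzinfo is None:
        # Naive datetime: assume it already represents UTC and attach the timezone
        return dt.replace(tzinfo=timezone.utc)
    # Aware datetime: convert it to UTC
    return dt.astimezone(timezone.utc)


def to_utc_z(dt: datetime) -> str:
    # Same output format as the pytz-based serializer in the text
    return make_utc_stdlib(dt).strftime("%Y-%m-%dT%H:%M:%SZ")


# Fixed offset used here as a stand-in for US/Eastern standard time (assumption)
est = timezone(timedelta(hours=-5))

print(to_utc_z(datetime(2020, 1, 1)))              # 2020-01-01T00:00:00Z
print(to_utc_z(datetime(2020, 1, 1, tzinfo=est)))  # 2020-01-01T05:00:00Z
```

These helpers could be called from the `dt_serializer` method in the same way as the `pytz` versions.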