# Default Factories

We've actually already seen this before, but let's make sure we understand how to deal with mutable defaults, or, more generally, how to create defaults that might require calling a function to define the default.

Let's talk Python first.

When we define default values for a function:

In [1]:
def my_func(a=1, b=2):
    pass

The way this works is that the `def` statement is executed once, when the module containing the function is loaded for the first time (each time your app is run). That is when the function object `my_func` is created.

Functions are objects, and they have state. In particular, they hold the code that will execute when the function is called. But they also store the default values in that state.

This means that the defaults are calculated once, when the function is defined. Thereafter, every time the function is called, the defaults are pulled from the function's state, so they do not get recalculated on each call.

This is really important to understand, as overlooking it can lead to bugs.
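One way to see this for yourself is to inspect the function's `__defaults__` attribute, which is where the defaults of positional parameters are stored. The sketch below also uses a made-up helper, `make_default`, that records each time it runs, to show that the default expression is evaluated only once no matter how often the function is called.

```python
def my_func(a=1, b=2):
    pass

# The defaults live on the function object itself.
print(my_func.__defaults__)  # (1, 2)

evaluations = []

def make_default():
    # Hypothetical helper: records every time the default expression runs.
    evaluations.append("evaluated")
    return 0

def counter(x=make_default()):  # make_default() runs here, when def executes
    return x

counter()
counter()
print(len(evaluations))  # 1
```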

Let's look at this function, where we want to output some string to the console, but want to prefix the string with a provided datetime, or default to the current datetime if none was provided.

You might be tempted to do this:

In [2]:
from datetime import datetime, UTC

def log(text: str, dt: datetime = datetime.now(UTC)):
    print(f"{dt.isoformat()}: {text}")

Now let's try it out:

In [3]:
log("line 1", dt=datetime(2020, 1, 1, 15, 0, 0))

2020-01-01T15:00:00: line 1


And if we don't provide `dt`:

In [4]:
log("line 2")

2023-12-02T18:21:05.316346+00:00: line 2


Let's wait a few seconds, and try that again:

In [5]:
log("line 3")

2023-12-02T18:21:05.316346+00:00: line 3


Notice something odd? That time has **not changed**!

This is because the default was calculated when the function was defined (when I executed the cell containing the `def` statement).

After that, every time the function is called, it retrieves that value from the function's state, hence we see the same value again and again.

This is the same reason why using a mutable default for a function argument can lead to bugs. I won't get into the details, but basically, you should (almost) never do this:

In [6]:
def my_func(items=[]):
    pass

Instead, the solution for both this and the preceding example is to set the default to `None`, and **inside** the function (hence code that runs every time the function is called), create the empty list, or get the current datetime if the argument was `None`.
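As a minimal sketch of that pattern, here is what both functions might look like. A few things here are assumptions made for the illustration: `add_item` is a made-up function name, `log` returns the line in addition to printing it, and `timezone.utc` is used in place of `UTC` so the code also runs on Python versions older than 3.11.

```python
from datetime import datetime, timezone
from typing import Optional

def log(text: str, dt: Optional[datetime] = None) -> str:
    # This runs on every call, so the timestamp is fresh each time.
    if dt is None:
        dt = datetime.now(timezone.utc)
    line = f"{dt.isoformat()}: {text}"
    print(line)
    return line

def add_item(item: int, items: Optional[list] = None) -> list:
    # Hypothetical example: a new list is created on each call that omits `items`.
    if items is None:
        items = []
    items.append(item)
    return items

print(add_item(1))  # [1]
print(add_item(2))  # [2]
```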

The next issue is not exactly the same, but it is related.

When we looked at Pydantic models, remember that we blithely used code such as this:

In [7]:
from pydantic import BaseModel

In [8]:
class Model(BaseModel):
    elements: list[int] = []

In a regular class this would not be a good idea:

In [9]:
class Model:
    elements: list[int] = []

In [10]:
m = Model()
m

<__main__.Model at 0x1149d6510>

In [11]:
m.elements.append(1)
m.elements

[1]

In [12]:
m2 = Model()
m2

<__main__.Model at 0x1149d6150>

In [13]:
m2.elements

[1]

You probably noticed that the difference here is that we are dealing with a class attribute, not an instance attribute. 

Good!
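In a regular class, the usual fix is to create the list in `__init__`, so that it becomes an instance attribute. A minimal sketch (the class name `PlainModel` is made up for this illustration):

```python
from typing import Optional

class PlainModel:
    def __init__(self, elements: Optional[list] = None):
        # A new list per instance, instead of one list shared via the class.
        self.elements = [] if elements is None else elements

m = PlainModel()
m.elements.append(1)
m2 = PlainModel()

print(m.elements)   # [1]
print(m2.elements)  # []
```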

But, we actually have the same issue with dataclasses, where we use class attributes to define instance attributes. In fact, dataclasses will prevent us from even doing this:

In [14]:
from dataclasses import dataclass

In [15]:
try:
    @dataclass
    class Model:
        elements: list[int] = []
except ValueError as ex:
    print(ex)

mutable default <class 'list'> for field elements is not allowed: use default_factory


Note the exception hint: `use default_factory`. We'll get back to that in a second.
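For reference, this is roughly what following that hint looks like with the standard library's `dataclasses.field`. The class name `DataModel` is just for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class DataModel:
    # `list` is called with no arguments each time an instance needs a default.
    elements: list[int] = field(default_factory=list)

d1 = DataModel()
d1.elements.append(1)
d2 = DataModel()

print(d1.elements)  # [1]
print(d2.elements)  # []
```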

Pydantic, however, **does** allow us to set mutable defaults, as we just saw. So why do they work without causing issues?

In [16]:
class Model(BaseModel):
    elements: list[int] = []

The secret is that Pydantic recognizes a mutable default when it sees one and does not share that single object between instances. Instead, every new instance gets its own copy of the default, which has the same effect as calling a function that generates a new empty list every time a new instance of the model is created.
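Pydantic's internals are more involved than this, but the effect, namely that every instance ends up with its own list, can be mimicked in plain Python. The `CopyingBase` class below is a hypothetical stand-in written for this illustration and is not part of Pydantic: it simply copies each class-level default onto the new instance.

```python
import copy

class CopyingBase:
    # Hypothetical stand-in for illustration only; not Pydantic's implementation.
    def __init__(self, **values):
        for name in type(self).__annotations__:
            if name in values:
                value = values[name]
            else:
                # Copy the class-level default so instances never share it.
                value = copy.deepcopy(getattr(type(self), name))
            setattr(self, name, value)

class SketchModel(CopyingBase):
    elements: list = []

s1 = SketchModel()
s1.elements.append(1)
s2 = SketchModel()

print(s1.elements)           # [1]
print(s2.elements)           # []
print(SketchModel.elements)  # []
```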

And so, we come to default factories, and why we need them (whether they are implicitly created via mutable defaults, or explicitly using the `Field` class).

Let's say we want the default for some field to always be the current UTC datetime.

If we were to try this:

In [17]:
class Log(BaseModel):
    dt: datetime = datetime.now(UTC)
    message: str

We'll run into the same issue as with the Python functions - that default datetime is calculated **once**, when the model class is defined.

In [18]:
Log(message="line 1")

Log(dt=datetime.datetime(2023, 12, 2, 18, 21, 5, 409467, tzinfo=datetime.timezone.utc), message='line 1')

In [19]:
Log(message="line 2")

Log(dt=datetime.datetime(2023, 12, 2, 18, 21, 5, 409467, tzinfo=datetime.timezone.utc), message='line 2')

As you can see, we get the exact same datetime both times.

So, instead, we can use the `default_factory` argument in the `Field` class to assign a function that Pydantic needs to call every time an instance is created and a default is needed.

This function does not take any arguments, and should return a value compatible with the field type.
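Any callable that can be invoked with no arguments fits this description, not just a lambda. As a rough illustration in plain Python (no Pydantic involved, and the `factories` dictionary is just a made-up container for this example), here are a few such callables, each producing a new value every time it is called:

```python
from datetime import datetime, timezone
from functools import partial
from uuid import uuid4

factories = {
    "list": list,                                    # builds a new empty list
    "lambda": lambda: datetime.now(timezone.utc),    # current UTC datetime
    "partial": partial(datetime.now, timezone.utc),  # same idea, no lambda
    "uuid": uuid4,                                   # new random UUID
}

for name, factory in factories.items():
    value = factory()  # called with no arguments
    print(name, type(value).__name__)
```

So `list` on its own can serve as a factory for an empty list, and a lambda or `functools.partial` is only needed when the call requires arguments.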

In [20]:
from pydantic import Field

class Log(BaseModel):
    dt: datetime = Field(default_factory=lambda: datetime.now(UTC))
    message: str

And now we have the desired result:

In [21]:
Log(message="line 1")

Log(dt=datetime.datetime(2023, 12, 2, 18, 21, 5, 421543, tzinfo=datetime.timezone.utc), message='line 1')

In [22]:
Log(message="line 2")

Log(dt=datetime.datetime(2023, 12, 2, 18, 21, 5, 424267, tzinfo=datetime.timezone.utc), message='line 2')

As you can see, the default datetime is re-calculated both times.

For mutable defaults, for example an empty list, we could also use a default factory:

In [23]:
class Model(BaseModel):
    elements: list[int] = Field(default_factory=lambda: [])

or, since Pydantic supports it, we can simply use the mutable default directly in the class like we started off with:

In [24]:
class Model(BaseModel):
    elements: list[int] = []