# Partially Validated Pydantic Models

In one of my Udemy classes, someone posed a really interesting question. 

> I wish to ingest JSON API data that sometimes will fail validation (for whatever reason e.g. allowing mistakes to be made on their end) instead of effectively throwing that response away.
>
>I want the model to still be instantiated with just the fields that pass validation, i.e. a partially validated model, but with missing/failed validation fields set to some special sentinel value or perhaps None
>
>My intention is to insert this into Postgres, and have an errors JSONB column which contains the ValidationError.json(). How can I achieve this?


My immediate reaction was that this was unlikely to work: Pydantic V2 does not support it natively (V1 kind of did, in some fashion), and I could not see a way around taking over **all** of the validation myself, which would essentially negate the usefulness of Pydantic's built-in validation.

Undeterred by my response, the person continued working on the problem and eventually came up with a really neat solution to this.

Here is a link to the person's Git repo that presents that solution:

[https://github.com/linktoad/pydantic-partial/tree/main](https://github.com/linktoad/pydantic-partial/tree/main)

as well as the currently open Pydantic issue:

[https://github.com/pydantic/pydantic/issues/7140](https://github.com/pydantic/pydantic/issues/7140)

This is also the first real practical use I have seen for wrap validators in Pydantic, so it's a really good example of how those work and where they can be useful. I'm sure there are other uses for wrap validators, I just have never found a need for one before.

It also shows how model validators can be useful, and how to use them.

Let's go through the solution step by step.

In [1]:
from __future__ import annotations
from typing import Any

from pydantic import (
    BaseModel,
    ValidationError,
    field_validator,
    model_validator,
    computed_field,
    ValidatorFunctionWrapHandler,
    ValidationInfo,
    Field,
)

The `__future__` import is needed because one of the methods uses the class itself as its return type hint, while that class is still being defined. On Python 3.11 and above you could annotate that return type with `typing.Self` instead and drop the import, but this code is compatible with earlier versions of Python as well, hence the use of `__future__` `annotations`.

See [here](https://docs.python.org/3/library/typing.html#typing.Self) for more details.
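
As a small aside on the `typing.Self` link above, here is a minimal sketch of an after model validator annotated with `Self`, which avoids naming the class inside its own body. It assumes Python 3.11+ or the `typing_extensions` backport (a Pydantic dependency). The `Range` model and its fields are hypothetical and not part of the original solution.

```python
from pydantic import BaseModel, model_validator

try:
    from typing import Self  # available from Python 3.11
except ImportError:
    from typing_extensions import Self  # backport for older versions


class Range(BaseModel):
    low: int
    high: int

    @model_validator(mode="after")
    def order_bounds(self) -> Self:
        # Swap the bounds if they were supplied in the wrong order
        if self.low > self.high:
            self.low, self.high = self.high, self.low
        return self


r = Range(low=5, high=1)
print(r.low, r.high)
```

Either approach works for the code in this post; the `__future__` import simply keeps the class name usable as a plain annotation.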

Since the idea is to include some attribute on the Pydantic model that will contain the validation errors, a structure will be needed. This was accomplished using a Pydantic model to "describe" those errors.

In [2]:
class Error(BaseModel):
    field: str
    type: str = "missing"
    msg: str = "Field required"
    input: Any = Field(default=None, exclude=True)

We'll want to identify an invalid field somehow - using `None` is not really an option since `None` could be a valid field value, so we'll need to just have some other object take on that role. This is often called a **sentinel** value, and the usual trick in Python is to define it this way:

In [3]:
SENTINEL = object()
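
To make the difference from `None` concrete, here is a minimal sketch using a plain dictionary. The `describe` helper and the sample `payload` are hypothetical and only serve as illustration. The key point is the identity check with `is`, which only ever matches that one sentinel object.

```python
SENTINEL = object()  # unique marker; no other object is identical to it


def describe(payload: dict, key: str) -> str:
    # Use the sentinel as the default so a missing key and an explicit None differ
    value = payload.get(key, SENTINEL)
    if value is SENTINEL:
        return "missing"
    if value is None:
        return "present, but None"
    return f"present: {value!r}"


payload = {"a": 1, "b": None}
print(describe(payload, "a"))  # present: 1
print(describe(payload, "b"))  # present, but None
print(describe(payload, "c"))  # missing
```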

Now we can get to the actual implementation. This was done by defining a new Pydantic base model that can be used as the base class for your own models.

The author called this class `MissingOrInvalidAsNone`, so I'll stick with that name for the base model.

The first thing we'll need is some "private" variable where we can store the errors (i.e. one that is not part of the model fields).

In Pydantic, this is achieved by prefixing the attribute name with an underscore:

In [4]:
class MissingOrInvalidAsNone(BaseModel):
    _errors: list[Error] = []
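
As a quick illustration of how such an underscore-prefixed attribute behaves, here is a minimal sketch. The `Tracked` model and its `_notes` attribute are hypothetical names, not part of the original solution. It shows that the attribute is not a model field, is left out of serialization, and is not shared between instances.

```python
from pydantic import BaseModel


class Tracked(BaseModel):
    name: str
    _notes: list[str] = []  # private attribute, not a model field


first = Tracked(name="first")
second = Tracked(name="second")
first._notes.append("only on first")

print("_notes" in Tracked.model_fields)  # False
print(first.model_dump())                # {'name': 'first'}
print(second._notes)                     # [] (each instance gets its own copy of the default)
```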

There are now two cases we need to distinguish for invalid fields: the field might be required but missing, or the field is present but fails further validation.

To identify missing fields, we can use a **model validator**. Model validators in Pydantic give you a way to access all the fields at once, and perform validation that can reference any of the model fields. By making it a **before** validator, it runs before the individual field validators, which makes it the perfect place to check for missing required fields.

See Pydantic's [docs](https://docs.pydantic.dev/latest/concepts/validators/#model-validators) for more info on model validators, and specifically on some of the caveats with `before` model validators.

In [5]:
class MissingOrInvalidAsNone(BaseModel):
    _errors: list[Error] = []

    @model_validator(mode="before")
    @classmethod
    def missing_fields_as_sentinels(cls, data: Any) -> Any:
        if not isinstance(data, dict):
            return data
        return data | {
            field: SENTINEL
            for field, field_info in cls.model_fields.items()
            if field_info.is_required() and field not in data
        }

The model validator receives a dictionary containing all the fields with their raw values. The idea is to identify all the required fields that are missing, and set their value to that `SENTINEL` we defined (this will indicate later that the field is missing, and is therefore distinguishable from `None`).

Let's break this down:

```python
{
    field: SENTINEL
    for field, field_info in cls.model_fields.items()
    if field_info.is_required() and field not in data
}
```

Basically this is a dictionary comprehension that loops through all the model fields, and for those fields that are required but not present in the data dictionary, produces a key/value pair with the field name as the key, and the `SENTINEL` object as the value. Other fields are not included in this comprehension.

Then the original data dictionary is unioned with this new dictionary, with the end result being a new data dictionary that has the original data along with the missing required fields set to `SENTINEL`.
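
The same comprehension and union can be tried outside of Pydantic. The following is a minimal sketch using plain dictionaries, where the `required_fields` list is a hypothetical stand-in for the required entries of `cls.model_fields`. The `|` operator for dictionaries needs Python 3.9 or later.

```python
SENTINEL = object()

# Hypothetical stand-in for the names of the required model fields
required_fields = ["a", "b", "c", "d"]
data = {"a": "3", "b": "something", "c": None}

# Only required fields that are absent from the data get a sentinel entry
missing = {field: SENTINEL for field in required_fields if field not in data}

# The union builds a new dictionary; the original data is left untouched
merged = data | missing

print(list(missing))            # ['d']
print(merged["d"] is SENTINEL)  # True
print(merged["c"] is None)      # True ("c" was present, so it keeps its None)
print("d" in data)              # False
```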

The next step is to create a field validator (that will apply to every field).

But what we want here is to "inject" this validator without losing the rest of the model's and Pydantic's validation logic. 

This is where a **wrap** validator comes into play. 

We can insert ourselves into the validation flow, and set the failing fields to an instance of that `Error` class that was created earlier. Later, another model validator (an `after` model validator) will handle fields set to those `Error` instances. But for now, we just want to replace any field that is missing, or fails validation with our custom `Error` instance, and then **let Pydantic continue the validation flow**.

In [6]:
class MissingOrInvalidAsNone(BaseModel):
    _errors: list[Error] = []

    @model_validator(mode="before")
    @classmethod
    def missing_fields_as_sentinels(cls, data: Any) -> Any:
        if not isinstance(data, dict):
            return data
        return data | {
            field: SENTINEL
            for field, field_info in cls.model_fields.items()
            if field_info.is_required() and field not in data
        }

    @field_validator("*", mode="wrap")
    @classmethod
    def gracefully_handle_validation_errors(
        cls, v: Any, handler: ValidatorFunctionWrapHandler, info: ValidationInfo
    ) -> Any:
        if v is SENTINEL:
            return Error(field=info.field_name)
        try:
            return handler(v)
        except ValidationError as ex:
            return Error(field=info.field_name, **ex.errors()[0])

The arguments for a wrap validator include the field value itself, a reference to the "next" validator, and some information on the field itself (in particular we need the field name).

So here, as you can see, if the value is `SENTINEL` we return a new value for the field (set to an `Error` instance), and stop the validation (returning the value in a wrap validator essentially sets the field to whatever return value you provide, and since it does not call the next step in the validation sequence, validation effectively stops there).

If the field was not a missing field, then the code attempts to continue the validation by calling the `handler` argument. This is now the "normal" field validation for a non-missing field.

However, simply letting this run would let Pydantic raise any validation errors, which would cause the whole model validation to fail. We want validation to continue in spite of any validation errors (that was the whole intent to start off with).

So, this call is wrapped in a try/except block. If a validation exception occurs, then instead of letting this exception bubble up, again the field value is replaced with an instance of this `Error` object. So validation "passes", but we now have a field with that special value.
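
The try/except pattern described above can also be seen in isolation. Here is a minimal sketch of a wrap validator on a single field that falls back to a default when the inner validation fails. The `Settings` model, its `retries` field, and the fallback value of `0` are hypothetical and only meant as an illustration.

```python
from typing import Any

from pydantic import (
    BaseModel,
    ValidationError,
    ValidatorFunctionWrapHandler,
    field_validator,
)


class Settings(BaseModel):
    retries: int

    @field_validator("retries", mode="wrap")
    @classmethod
    def default_on_failure(cls, v: Any, handler: ValidatorFunctionWrapHandler) -> int:
        try:
            # Run Pydantic's normal validation for this field
            return handler(v)
        except ValidationError:
            # Swallow the error and substitute a fallback value instead
            return 0


print(Settings(retries="5").retries)     # 5
print(Settings(retries="many").retries)  # 0
```

Note that whatever the wrap validator returns becomes the field value as is, which is what allows the solution in this post to store an `Error` instance in a field typed as `int` or `bool`.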

At this point, we now have a valid Pydantic object, with some fields set to those `Error` values.

Let's try it out and see:

In [7]:
class Model(MissingOrInvalidAsNone):
    a: int
    b: bool
    c: str
    d: float

json_data = """
{
    "a": "3",
    "b": "something",
    "c": null
}
"""

m = Model.model_validate_json(json_data)

In [8]:
for field in m:
    print(field, end="\n\n")

('a', 3)

('b', Error(field='b', type='bool_parsing', msg='Input should be a valid boolean, unable to interpret input', input='something'))

('c', Error(field='c', type='string_type', msg='Input should be a valid string', input=None))

('d', Error(field='d', type='missing', msg='Field required', input=None))



In the final step, we want to replace those `Error` values with some other default. In this case the author decided to use `None`. We also want to populate that `_errors` list with all the validation errors.

To do this, a model validator is once again used, but this time as an **after** validator, so it will run after all the other validators have taken place.

In [9]:
class MissingOrInvalidAsNone(BaseModel):
    _errors: list[Error] = []

    @model_validator(mode="before")
    @classmethod
    def missing_fields_as_sentinels(cls, data: Any) -> Any:
        if not isinstance(data, dict):
            return data
        return data | {
            field: SENTINEL
            for field, field_info in cls.model_fields.items()
            if field_info.is_required() and field not in data
        }

    @field_validator("*", mode="wrap")
    @classmethod
    def gracefully_handle_validation_errors(
        cls, v: Any, handler: ValidatorFunctionWrapHandler, info: ValidationInfo
    ) -> Any:
        if v is SENTINEL:
            return Error(field=info.field_name)
        try:
            return handler(v)
        except ValidationError as ex:
            return Error(field=info.field_name, **ex.errors()[0])

    @model_validator(mode="after")
    def save_errors_and_set_none(self) -> MissingOrInvalidAsNone:
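        # Read model_fields from the class; instance access is deprecated in recent Pydantic releases
        for field in type(self).model_fields: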
            value = getattr(self, field)
            if isinstance(value, Error):
                self._errors.append(value)
                setattr(self, field, None)  # Could set it to anything you want for your app
        return self

As you can see, the after model validator just iterates through all the model fields and, if the field value is an `Error`, adds that error to the `_errors` list and replaces the field value with `None`.

In [10]:
class Model(MissingOrInvalidAsNone):
    a: int
    b: bool
    c: str
    d: float

json_data = """
{
    "a": "3",
    "b": "something",
    "c": null
}
"""

m = Model.model_validate_json(json_data)

In [11]:
m

Model(a=3, b=None, c=None, d=None)

In [12]:
m._errors

[Error(field='b', type='bool_parsing', msg='Input should be a valid boolean, unable to interpret input', input='something'),
 Error(field='c', type='string_type', msg='Input should be a valid string', input=None),
 Error(field='d', type='missing', msg='Field required', input=None)]

The only thing left is to make this "private" `_errors` attribute accessible publicly, by using a computed field:

In [13]:
class MissingOrInvalidAsNone(BaseModel):
    _errors: list[Error] = []

    @model_validator(mode="before")
    @classmethod
    def missing_fields_as_sentinels(cls, data: Any) -> Any:
        if not isinstance(data, dict):
            return data
        return data | {
            field: SENTINEL
            for field, field_info in cls.model_fields.items()
            if field_info.is_required() and field not in data
        }

    @field_validator("*", mode="wrap")
    @classmethod
    def gracefully_handle_validation_errors(
        cls, v: Any, handler: ValidatorFunctionWrapHandler, info: ValidationInfo
    ) -> Any:
        if v is SENTINEL:
            return Error(field=info.field_name)
        try:
            return handler(v)
        except ValidationError as ex:
            return Error(field=info.field_name, **ex.errors()[0])

    @model_validator(mode="after")
    def save_errors_and_set_none(self) -> MissingOrInvalidAsNone:
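        # Read model_fields from the class; instance access is deprecated in recent Pydantic releases
        for field in type(self).model_fields: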
            value = getattr(self, field)
            if isinstance(value, Error):
                self._errors.append(value)
                setattr(self, field, None)  # Could set it to anything you want for your app
        return self

    @computed_field
    @property
    def errors(self) -> list[Error]:
        return self._errors
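
As a standalone aside, here is a minimal sketch of the same idea on a smaller model: a computed field that exposes a private attribute so that it shows up when the model is serialized. The `Job` model and its `_warnings` attribute are hypothetical names used only for illustration.

```python
from pydantic import BaseModel, computed_field


class Job(BaseModel):
    name: str
    _warnings: list[str] = []  # private, so normally excluded from serialization

    @computed_field
    @property
    def warnings(self) -> list[str]:
        # Read-only public view of the private attribute
        return self._warnings


job = Job(name="import")
job._warnings.append("slow response")

print(job.model_dump())  # {'name': 'import', 'warnings': ['slow response']}
```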

Let's try out this final solution:

In [14]:
class Model(MissingOrInvalidAsNone):
    a: int
    b: bool
    c: str
    d: float

json_data = """
{
    "a": "3",
    "b": "something",
    "c": null
}
"""

m = Model.model_validate_json(json_data)

In [15]:
for field in m:
    print(field, end="\n\n")

('a', 3)

('b', None)

('c', None)

('d', None)



In [16]:
m.errors

[Error(field='b', type='bool_parsing', msg='Input should be a valid boolean, unable to interpret input', input='something'),
 Error(field='c', type='string_type', msg='Input should be a valid string', input=None),
 Error(field='d', type='missing', msg='Field required', input=None)]

In [17]:
print(m.model_dump_json(indent=2))

{
  "a": 3,
  "b": null,
  "c": null,
  "d": null,
  "errors": [
    {
      "field": "b",
      "type": "bool_parsing",
      "msg": "Input should be a valid boolean, unable to interpret input"
    },
    {
      "field": "c",
      "type": "string_type",
      "msg": "Input should be a valid string"
    },
    {
      "field": "d",
      "type": "missing",
      "msg": "Field required"
    }
  ]
}
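
Coming back to the original question about an errors JSONB column, one possible way to get just the errors as a JSON string is sketched below. It assumes the same `Error` model as above, and the sample errors are built by hand here rather than taken from a validated model. How the resulting string is passed to Postgres depends on your database driver and is left out.

```python
import json
from typing import Any

from pydantic import BaseModel, Field


class Error(BaseModel):
    field: str
    type: str = "missing"
    msg: str = "Field required"
    input: Any = Field(default=None, exclude=True)


# Hand-built sample errors, standing in for the `errors` list of a validated model
errors = [
    Error(field="b", type="bool_parsing", msg="Input should be a valid boolean", input="something"),
    Error(field="d"),
]

# model_dump() honors exclude=True, so the raw input is not included
errors_json = json.dumps([error.model_dump() for error in errors])
print(errors_json)
```

With a model instance such as `m` above, `m.model_dump_json(include={"errors"})` would be another option, although it wraps the list in an object with a single `errors` key.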


And there you have it, a really interesting and elegant solution to the original problem of "partially validated" Pydantic models.

Big thanks to the author for coming up with this and sharing it on GitHub!