Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for partial model validation #3179

Closed
wants to merge 10 commits into from
Closed

Conversation

lsapan
Copy link

@lsapan lsapan commented Sep 6, 2021

Change Summary

This is a simple change that will allow Pydantic models to be used for PATCH operations. It adds a partial config option which allows required fields to be omitted during validation. There has been various discussion on the topic of using pydantic for partial updates (particularly in the realm of FastAPI), but it seems like the suggested solutions were too complex for the perceived benefit. This change doesn't require adding any new types, it simply doesn't require fields to be present if the model has partial = True configured.

To illustrate why this change is necessary, imagine a scenario where you have a model with required fields, but you want to have an endpoint that can update just a few fields at a time:

from typing import Optional

from pydantic import BaseModel, constr


class User(BaseModel):
    first_name: constr(min_length=1)
    last_name: constr(min_length=1)
    phone: Optional[constr(min_length=1)]

# We normally want the model to fail validation if last_name is left off, or null. i.e.:
User(first_name='John')
User(first_name='John', last_name=None)

# However, in the case of partial updates, we should be able to leave off last_name. We can do that now:
class PartialUser(BaseModel):
    first_name: constr(min_length=1)
    last_name: constr(min_length=1)
    phone: Optional[constr(min_length=1)]

    class Config:
        partial = True

# This no longer fails validation
PartialUser(first_name='John')

# This still fails validation, because last_name is still a required field
PartialUser(first_name='John', last_name=None)

There is an important distinction here. It has been recommended in the past to simply make every field optional, and use dict(exclude_unset=True). However, that would allow setting last_name=None, bypassing the requirement that it always have a value. This change allows us to cleanly leave fields off when we aren't interested in updating them, while still ensuring they have a valid value when they're provided.

Speaking of clean, it's a bit impractical to define the same model twice! This change also introduces a class method to quickly generate a partial version of the model:

PartialUser = User.partial_model()

Related issue number

There are various issues in both this repo and FastAPI that have been seeking this functionality. My apologies for not opening a separate issue first, I just wanted to present the solution at the same time because it is an incredibly small change that adds a lot of flexibility. 👍

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
    (see changes/README.md for details)
  • My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Add a `partial` config option which allows required fields to be omitted during validation. This is ideal for using a pydantic model as part of a PATCH operation.
@lsapan
Copy link
Author

lsapan commented Sep 6, 2021

Thanks for such an awesome library! Please review when you have time 😄

pydantic/main.py Outdated Show resolved Hide resolved
@@ -115,6 +115,9 @@ not be included in the model schemas. **Note**: this means that attributes on th
**`copy_on_model_validation`**
: whether or not inherited models used as fields should be reconstructed (copied) on validation instead of being kept untouched (default: `True`)

**`partial`**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I let you see with the author of the other PR and others for the best wording (total was my suggestion to mimic TypedDict but that's just an idea)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm quite flexible on phrasing. I was basing it off of Django REST Framework as that's where I'm coming to FastAPI / pydantic from.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on "partial", because:

  1. It matches (roughly) TypeScript which is kind of the gold standard for type hints in dynamic languages
  2. It means you can talk about "partial models" which is cleaner than "non-total models"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But can we add a note to the docs explaining that the word "total" could also be used with the opposite meaning.

tests/test_edge_cases.py Outdated Show resolved Hide resolved
pydantic/main.py Outdated Show resolved Hide resolved
@lsapan
Copy link
Author

lsapan commented Sep 6, 2021

@PrettyWood thanks for the review! I changed it from a classmethod to a function in pydantic.tools. I also added additional test coverage to demonstrate that passing null raises a validation error.

Didn't rename it yet, though that can be easily done! It looked like you were waiting to hear from @samuelcolvin in the other PR so I figured I'd just leave it for now until he chimes in.

As for the last bit about setting required = False versus leaving it out of the data altogether, I can see the benefits of both, so I'll definitely defer to whatever you guys land on.

Thanks again! 🎉

@PrettyWood
Copy link
Member

IMO having the right JSON schema is important especially when you recall why people want this (mostly for FastAPI).
Setting only required=False won't make the field allow None as value.

@lsapan
Copy link
Author

lsapan commented Sep 6, 2021

@PrettyWood sure thing! I just changed it to set required = False on all the fields instead. Would you be open to the partial name? A few reasons:

  1. It's easier to talk about the model as a "partial model", rather than a "model with total = False", or "model without totality", etc.
  2. The name of the helper function would also be a bit confusing (for similar reasons): create_model_without_totality ?
  3. Because of the behavior I just changed, it actually deviates from TypedDict's handling of totality:
Python 3.9.4 (default, Apr 28 2021, 16:15:19) 
[Clang 12.0.5 (clang-1205.0.22.9)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from typing import TypedDict
>>> class Movie(TypedDict, total=False):
...     name: str
...     year: int
... 
>>> m: Movie = {}
>>> m2: Movie = {'year': 2015}
>>> m
{}
>>> m2
{'year': 2015}

TypedDict instances with total = False don't show the missing values as None, so it may actually make sense to use a separate name as to avoid confusion.

Just my two cents, happy to change it if desired!

@PrettyWood
Copy link
Member

You can still add

             if field.required:
                 errors.append(ErrorWrapper(MissingError(), loc=field.alias))
                 continue
+            elif config.partial:
+                continue

to support dict/json serialization ;) Best of both worlds.

@PrettyWood
Copy link
Member

PrettyWood commented Sep 6, 2021

Can you add tests with dict serialization and schema? 🙏
Also there are still some questions that need to be answered in the doc:
What happens if you inherit a class from another? Do you expect all the fields to be not required or only the one in your class?


I can't wait to work on v2 and #2353.
Making a clear difference between not required and nullable fields ;)

@lsapan
Copy link
Author

lsapan commented Sep 6, 2021

@PrettyWood

You can still add

             if field.required:
                 errors.append(ErrorWrapper(MissingError(), loc=field.alias))
                 continue
+            elif config.partial:
+                continue

to support dict/json serialization ;) Best of both worlds.

Ah, I had misread what you were asking for earlier! Added this back.

Can you add tests with dict serialization and schema? 🙏
Also there are still some questions that need to be answered in the doc:
What happens if you inherit a class from another? Do you expect all the fields to be not required or only the one in your class?

I can't wait to work on v2 and #2353.
Making a clear difference between not required and nullable fields ;)

Sure thing! I've added a bunch more coverage. Let me know if there's anything else you want to see. I'm also quite excited for v2!

I noticed one interesting thing when testing subclasses of partial models. The subclass inherits partial (as would be expected), so no fields are required, however:

class ModelA(BaseModel):
    a: str = Field(...)

    class Config:
        partial = True


class ModelB(ModelA):
    b: str = Field(...)


class ModelC(ModelA):
    c: str


# No error
ModelB(b=None)

# Validation error
ModelC(c=None)

I'd expect both to produce a validation error. I'm not familiar enough with the magic happening in Field(...) to understand what could possibly be happening here. Any ideas?

@PrettyWood
Copy link
Member

Yup you should probably put an extra condition when inferring the field in the metaclass

-        if value is Required:
+        if config.partial:
+            required = False
+        elif value is Required:

@lsapan
Copy link
Author

lsapan commented Sep 7, 2021

@PrettyWood that did the trick! It's passing with the = Field(...) syntax now. Thanks! 🎉

@lsapan
Copy link
Author

lsapan commented Sep 10, 2021

@PrettyWood do you need anything else from me for approval? 🚀 Not trying to rush you at all - just want to make sure you're not waiting on me. Thanks!

class Config(model.Config): # type: ignore
partial = True

PartialModel.__name__ = model.__name__

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious, why do you make the model name the same as the original one ?

I personally think it's a bit confusing but I don't understand why you want to do it in the first place, so I might be missing a piece here

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christophelec it would show up as PartialModel otherwise 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like

Suggested change
PartialModel.__name__ = model.__name__
PartialModel.__name__ = f'Partial{model.__name__}'

?


# Verify b (a required field) can be omitted on the partial model
b = Bar(a='x', c=None)
assert b.dict() == {'a': 'x', 'c': None}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference as I mentioned it in my PR : having b not appear even though we did not set exclude_unset is a surprising behavior to me, but it might be because I implemented it in a different way which shaped my expectations differently

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christophelec yeah as I mentioned here, it can be easily changed. It just made more sense to me in the context of a "partial" model coming from a Django REST Framework background.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christophelec also just to flesh out the logic behind not showing the fields a bit more:

Given the model:

class Foo(BaseModel):
    a: str
    b: Optional[str]

Looking at this, it's clear that a will always have at least a string value, and b may or may not.

As such, it's a bit misleading to ever assign None to a, if we were to include it:

f = Foo(b='hi')

# It feels strange / risky to assign None to a value that normally could not be None, based on the model's definition
f.dict()  # {'a': None, 'b': 'hi'}

Granted yes, the user should be using exclude_unset=True anyway, but returning None on a non-nullable field still feels like an impractical decision. Just my two cents though! Happy to change it if others disagree!

@camflan
Copy link

camflan commented Nov 10, 2021

Bump, this would be really useful to have built-in!

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. This will be really useful.

A few things to change, then we will need a lot more docs.

@@ -115,6 +115,9 @@ not be included in the model schemas. **Note**: this means that attributes on th
**`copy_on_model_validation`**
: whether or not inherited models used as fields should be reconstructed (copied) on validation instead of being kept untouched (default: `True`)

**`partial`**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on "partial", because:

  1. It matches (roughly) TypeScript which is kind of the gold standard for type hints in dynamic languages
  2. It means you can talk about "partial models" which is cleaner than "non-total models"

@@ -115,6 +115,9 @@ not be included in the model schemas. **Note**: this means that attributes on th
**`copy_on_model_validation`**
: whether or not inherited models used as fields should be reconstructed (copied) on validation instead of being kept untouched (default: `True`)

**`partial`**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But can we add a note to the docs explaining that the word "total" could also be used with the opposite meaning.

@@ -96,7 +97,8 @@ def prepare_field(cls, field: 'ModelField') -> None:
"""
Optional hook to check or modify fields during model creation.
"""
pass
if cls.partial:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be a breaking/confusing change, since the default prepare_field does nothing, people will override it without calling super() and wonder why partial is not being respected.

class Config(model.Config): # type: ignore
partial = True

PartialModel.__name__ = model.__name__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about something like

Suggested change
PartialModel.__name__ = model.__name__
PartialModel.__name__ = f'Partial{model.__name__}'

?

Comment on lines +984 to +985
elif config.partial:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of the cruz of the question about partial and how it should behave.

The problem is that with this (or any implementation of partial/non-total) the runtime instance no-longer matches the definition of the class in code.

I can imagine people legitimating wanting this to work both ways.

What if we added another config property config.partial_omit, then based on that, either omitted the attribute or made it None?

I think it would also be smart if we could change the AttributeError when accessing omitted properties to something like

AttributeError: 'MyModel' object has no attribute 'bar'  - this is a partial model and the field was omitted

But only if we can do this easily and performantly.

assert exc_info.value.errors() == [
{'loc': ('b',), 'msg': 'none is not an allowed value', 'type': 'type_error.none.not_allowed'}
]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split this into multiple tests, even if it involves duplicating model definitions.

@samuelcolvin
Copy link
Member

please update.

@bbatliner
Copy link

One of the original use cases discussed in #1223 was nested partial models, e.g.:

class Foo(BaseModel):
  a: str
  b: str

class Bar(BaseModel):
  foo: Foo
  c: str

See @sborovic's solution or my solution on that issue for a couple of reference implementations.

Correct me if I'm wrong, but this PR only "optionalizes" the top layer of a model, not any nested model fields? Will this work?

PartialFoo = Foo.partial_model()
PartialBar = Bar.partial_model()
PartialBar(foo=PartialFoo(b="abc"))

@macintacos
Copy link

Friendly ping - @lsapan are there any updates expected for this PR, or should someone else take over?

@lsapan
Copy link
Author

lsapan commented May 11, 2022

Hey @macintacos! Sorry for the radio silence on my end - I'm working on a different project that isn't FastAPI based and I haven't had time to sit down and work on this. I'm hoping to in the next few months, but definitely happy to have someone else take over if they have time!

@samuelcolvin
Copy link
Member

Honestly, I think we should probably pause this.

I've never been comfortable with partial models since the data no longer matches the types.

This will be much much more elegant in pydantic v2 when typeddicts will be first class citizens.

I know it's annoying, bit I think the best answer is to wait for v2.

@BrutalSimplicity
Copy link

This is an all or none approach, but is there a simple way to say exclude this single field?

@bhushanatdukaan
Copy link

product_description: Optional[constr(max_length=500)] = "" product_category: Optional[constr(max_length=100)] = "" quantity: conint(ge=1) gst_rate: Optional[condecimal(ge=0, le=100, decimal_places=2, max_digits=5)] = 0

You all can use Optional and give pass an default field check constr or any field validation to add partial model validation. You can also pass default as well.

@samuelcolvin
Copy link
Member

TypedDicts in pydantic-core already support total=False which achieves the same features as this in a type-safe way.

Closing this feature and deferring the feature until V2.

@mrlubos
Copy link

mrlubos commented Sep 21, 2022

@samuelcolvin (im)patiently waiting for v2! 🚀

@samuelcolvin
Copy link
Member

Me too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants