-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Make model dump/load symmetric for aliased fields #160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #160 +/- ##
==========================================
+ Coverage 99.89% 99.89% +<.01%
==========================================
Files 9 9
Lines 992 995 +3
Branches 205 206 +1
==========================================
+ Hits 991 994 +3
Partials 1 1 |
|
I can't accept a change like this as it alters current behaviour too much. Better to add a kwarg to |
|
Understood; I’ll get this updated as an opt-in behavior. Thanks! |
0e0e298 to
5289bc6
Compare
|
WDYT of this approach? I'd rather be able to opt-in on a model-by-model basis, rather than having to hunt down every call site for If this looks reasonable to you, I'll update the documentation as well. Thanks! |
|
#164 looks like the implementation of your initial feedback, I think. I understood what you were suggesting, but I'm not convinced it's cleaner. I'll share a bit more of my use case in hopes it might highlight why I think it might be cleaner to define this in the model config instead. We've implemented common models for a large number of objects that we are ingesting from a third-party API. That API provides a reasonable structure for the data, but in numerous instances, we've needed to alias field names so that we can use these objects in downstream services in a more developer-friendly way. These objects, after being loaded from the third-party JSON and validated, are then serialized and placed on a message queue, where other consumers are able to receive the serialized data and load it back into the exact same pydantic model (with the model distributed as shared code). In many cases, those consumers then re-serialize all or part of the models they've loaded from messages they've received, and they send them to other components. Thus, as I mentioned, we have a growing number of By having the model itself configured to allow aliases, this problem is avoided. The model remains the canonical definition of how it will serialize/deserialize itself--this is already the case for other aspects of serialization/deserialization; in my mind, this is a very close sibling to I hope this provides some useful context for why I added the option to the model config. I'm also curious what the unexpected consequences you have in mind are; I'm probably blind to them, having not seen the library evolve, and I definitely don't want to cause any! Thanks for your consideration. 😄 |
It is, in the end implementing another PR to show what I meant was easier than a comment with loads of code. I've come round to your suggestion, if you still think this is required once you've read through my explanation below I'll merge this. (documentation and history will need to be updated first)
Let's say we have class Foobar(BaseModel):
card_number: str
expiry: date = date(2020, 1, 1)
class Config:
allow_population_by_alias = True
fields = {
'card_number': {
'alias': 'cardNumber'
}
}That model expects something like
You could just have a base model you use everywhere like class OurBaseModel(BaseModel):
def dict(self, *kwargs):
kwargs.setdefault('use_aliases', True)
return super().dict(**kwargs)or
This would be much better accumplished using pickle. Pickling and unpickling models will be much more performant as validation doesn't have to be rerun upon every deserialisation, it also gets round this very problem. I setup pydantic to play well with pickle for this very case, see here. |
|
Thanks for going over this--I appreciate your thorough response. 😄 I agree with what you're saying w.r.t. similar, but ultimately undesired, values being considered valid. I think this concern can/should be strongly stated in the documentation for this new config field, for the reasons you've laid out. I'll have an update to the docs and history shortly. Perhaps the root of my confusion is in my interpretation of "alias;" I'd understood it as setting up an equivalence between the field names, rather than a one-way transformation, and it was that confusion that makes me think having this noted and configurable could improve the DX for others. OTOH, sometimes I can't words well, so it might just be me!
It would definitely be faster, but I don't mind re-validating when the payload is coming off the wire...and I've had some bad experiences with pickle that still sting a bit. Appreciate the suggestion, though; it was a selling point when looking at pydantic that you'd ensured compatibility with pickle. Not every author can be bothered to do so, I think. 😁 |
|
I agree that |
|
It's also worth noting that inheritance should "work" (aka, do something magic and cause massive confusion) between different models. So you could have class FrontendFoobar(BaseModel):
...
class Config:
foobar = 'xxx'
fields = {'card_number': 'cardNumber', 'expiry_date': 'expiryDate'}
class BackendFoobar(FrontendFoobar):
class Config:
fields = {}Here |
|
@bendemaree what's your intention on this? |
|
+1 I could use this. |
|
@samuelcolvin I will have the remaining changes up tomorrow. My apologies; I forgot that I was going on vacation for a few weeks a day or two after committing to this. |
|
@samuelcolvin I've updated the documentation; hopefully it's to your standard. I also took the liberty of opening the changelog for a new patch version. Apologies for the additional delay on this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otherwise LGTM.
docs/index.rst
Outdated
| population by alias disabled (the default), trying to parse an object with only the key ``cardNumber`` will fail. | ||
| However, if you enable population by alias, this previously-invalid object will now populate the ``card_number`` | ||
| field from ``cardNumber``. This may be desired for some use cases, but in others (like the one given here, perhaps), | ||
| relaxing strictness could introduce bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait this is the wrong way around I think, shouldn't it be
As an example, say you have a field named ``card_number`` with the alias ``cardNumber``. With
``allow_population_by_alias = False`` disabled, trying to parse an object with only the key
``card_number`` will fail (the ``cardNumber`` field is required). However, if you enable population
by alias, this previously-invalid object will now populate the ``card_number`` field from
``card_number``. This may be desired or acceptable for some use cases, but in others it could
introduce bugs.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh my...thank you for catching that! 😳
|
Great, thank you very much. |
When using field aliases, I expected to be able to dump a model to a
dictand then load that result back into the same model, but field aliases aren't considered when populating a model. This changes that behavior to allow this; in other words,MyModel(**MyModel(a="foo").dict())now works ifahas an alias.The field alias is considered first to maintain as much of the existing behavior, but this change does have the potential to break user code, as data that previously failed validation could now pass.
Thanks for this library, @samuelcolvin! It's a very nice tool to have in the toolbox. 😄