New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-28678: Pydantic JSON serialization prototype #471
Conversation
83d0dfa
to
8620391
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks fine. Given that we're still constructing dicts a lot internally just to make the pydantic instances, I think it's only better than the previous version without pydantic if:
- we do want schema validation (something I'm not really qualified to answer; web APIs are not an area of expertise for me);
- we can in fact simplify some of the
to_simple
/from_simple
implementations further, maybe by using different pydantic classes for the minimal versions and/or eliminating minimal forms we're not sure we'll use.
If we do want schema validation, I have absolutely no problem with this; it's only a little bit more complicated than the version without pydantic.
@@ -180,7 +215,7 @@ def to_simple(self, minimal: bool = False) -> Dict: | |||
# We can still be a little minimalist with a component | |||
# but we will also need to record the datasetType component | |||
simple["component"] = self.datasetType.component() | |||
return simple | |||
return SerializedDatasetRef(**simple) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder a bit whether we'd gain overall by using different pydantic model types for the minimal forms; I bet it would simplify the to_simple
and from_simple
logic (less need to incrementally build dicts just to pass as kwargs), and it might help simplify the new validation methods, too.
Of course, more types overall is a downside, and if we end up wanting to have a semi-minimal thing for one type that might or might not use the minimal version of some nested thing, that gets ugly so quickly it's probably not worth considering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to use a different model class for the minimal form (which would simplify model validation even further) but that becomes a problem with deserialization since then the JSON reader can't be told directly what class it is getting and would then have to pass it to DatasetRef for inspection before instantiating the pydantic model instance. That seemed worse than declaring one and only one pydantic model per butler class. Of course in the future it would help if we tried to default to using pydantic models natively.
806ed8e
to
56d6c58
Compare
Retain the from_json/to_json and from_simple/to_simple APIs that use universe/registry/minimal parameters but instead make the simple form return a class that is a pydantic model. This allows the JSON to be validated through pydantic. This has required a change to the DatasetType minimal form and DimensionGraph form since pydantic seemingly requires all models are dicts.
No description provided.