Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-28678: Pydantic JSON serialization prototype #471

Merged
merged 5 commits into from Apr 9, 2021
Merged

Conversation

timj
Copy link
Member

@timj timj commented Feb 5, 2021

No description provided.

@timj timj force-pushed the tickets/DM-28678 branch 3 times, most recently from 83d0dfa to 8620391 Compare February 8, 2021 17:18
Copy link
Member

@TallJimbo TallJimbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks fine. Given that we're still constructing dicts a lot internally just to make the pydantic instances, I think it's only better than the previous version without pydantic if:

  • we do want schema validation (something I'm not really qualified to answer; web APIs are not an area of expertise for me);
  • we can in fact simplify some of the to_simple/from_simple implementations further, maybe by using different pydantic classes for the minimal versions and/or eliminating minimal forms we're not sure we'll use.

If we do want schema validation, I have absolutely no problem with this; it's only a little bit more complicated than the version without pydantic.

@@ -180,7 +215,7 @@ def to_simple(self, minimal: bool = False) -> Dict:
# We can still be a little minimalist with a component
# but we will also need to record the datasetType component
simple["component"] = self.datasetType.component()
return simple
return SerializedDatasetRef(**simple)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder a bit whether we'd gain overall by using different pydantic model types for the minimal forms; I bet it would simplify the to_simple and from_simple logic (less need to incrementally build dicts just to pass as kwargs), and it might help simplify the new validation methods, too.

Of course, more types overall is a downside, and if we end up wanting to have a semi-minimal thing for one type that might or might not use the minimal version of some nested thing, that gets ugly so quickly it's probably not worth considering.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to use a different model class for the minimal form (which would simplify model validation even further) but that becomes a problem with deserialization since then the JSON reader can't be told directly what class it is getting and would then have to pass it to DatasetRef for inspection before instantiating the pydantic model instance. That seemed worse than declaring one and only one pydantic model per butler class. Of course in the future it would help if we tried to default to using pydantic models natively.

timj added 3 commits April 7, 2021 12:46
Retain the from_json/to_json and from_simple/to_simple APIs
that use universe/registry/minimal parameters but instead make
the simple form return a class that is a pydantic model.
This allows the JSON to be validated through pydantic.

This has required a change to the DatasetType minimal
form and DimensionGraph form since pydantic seemingly requires
all models are dicts.
@timj timj merged commit b36ab2f into master Apr 9, 2021
@timj timj deleted the tickets/DM-28678 branch April 9, 2021 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants