
Techniques for schema versioning and upgrade paths #1107

Closed
altendky opened this issue Jan 21, 2019 · 6 comments

Comments

@altendky

Schemas will obviously change over time, and you will sometimes need to load old data and bring it up to date. Are there any existing features, tools, or plans/hopes/dreams for supporting these needs in Marshmallow or supporting libraries?

I vaguely imagine that before a schema update you would have to copy the code to a new location, register it into a version registry, add upgrade paths to the registry, and use the registry to deserialize so that it can pick out the proper code for the serialized form it finds.
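A rough sketch of the registry idea described above, using only the standard library. All names here (`LOADERS`, `UPGRADES`, `load_any`, the `version` key, the example fields) are hypothetical; with marshmallow, each registered loader could be the `load` method of a preserved copy of that version's schema.

```python
from typing import Callable, Dict

# Hypothetical registries, keyed by the version found in the serialized data.
LOADERS: Dict[int, Callable[[dict], dict]] = {}
# UPGRADES[n] turns an object loaded as version n into version n + 1.
UPGRADES: Dict[int, Callable[[dict], dict]] = {}


def loader(version: int):
    """Register the decorated function as the loader for `version`."""
    def register(func):
        LOADERS[version] = func
        return func
    return register


def upgrade(from_version: int):
    """Register the decorated function as the upgrade from `from_version`."""
    def register(func):
        UPGRADES[from_version] = func
        return func
    return register


@loader(1)
def load_v1(data: dict) -> dict:
    # Version 1 stored a single "name" field.
    return {"name": data["name"]}


@loader(2)
def load_v2(data: dict) -> dict:
    # Version 2 split the name into two fields.
    return {"first_name": data["first_name"], "last_name": data["last_name"]}


@upgrade(1)
def upgrade_v1_to_v2(obj: dict) -> dict:
    first, _, last = obj["name"].partition(" ")
    return {"first_name": first, "last_name": last}


def load_any(raw: dict) -> dict:
    """Pick the loader matching the data's version, then walk the upgrade path."""
    version = raw.get("version", 1)  # assumption: unversioned data is version 1
    obj = LOADERS[version](raw)
    while version in UPGRADES:
        obj = UPGRADES[version](obj)
        version += 1
    return obj
```

For example, `load_any({"version": 1, "name": "Ada Lovelace"})` picks `load_v1` and then applies `upgrade_v1_to_v2`. The cost of this design is that every old loader has to stay importable for as long as old files may exist.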

@deckar01
Member

deckar01 commented Jan 21, 2019

I can't find any existing issues on the topic. Can you provide more detail about the source of your data?

A common strategy I have seen with APIs is to version the route, maintain a copy of the schema from each major version, and eventually schedule old versions for deprecation. Instead of using a registry to resolve the schema from the data version, the route acts as a static mapping. This strategy is useful for versioning the request structure as well as the response structure.
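As an illustration of the route acting as a static mapping, here is a minimal framework-free sketch. The paths, field names, and the `handle` helper are all made up for the example; in a real application each loader would be a marshmallow schema kept around for that major version.

```python
def load_user_v1(payload: dict) -> dict:
    # Stand-in for the v1 schema's load(): a single "name" field.
    return {"name": payload["name"]}


def load_user_v2(payload: dict) -> dict:
    # Stand-in for the v2 schema's load(): the name is split in two.
    return {
        "first_name": payload["first_name"],
        "last_name": payload["last_name"],
    }


# The route itself selects the schema version; nothing in the payload
# needs to carry a version number.
ROUTES = {
    "/v1/users": load_user_v1,
    "/v2/users": load_user_v2,
}


def handle(path: str, payload: dict) -> dict:
    """Dispatch a request body to the loader registered for its route."""
    try:
        load = ROUTES[path]
    except KeyError:
        raise LookupError(f"no such route: {path}") from None
    return load(payload)
```

Deprecating a version then amounts to removing its entry from `ROUTES`.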

In database migrations, the schema does not seem to get versioned. The data is versioned so that the migrations can update the data to the latest version and the schema loads that.
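The same idea applied to plain dictionaries might look like the sketch below: only the data carries a version, a chain of small migrations brings it to the latest form, and a single current schema loads the result. The `version` key, the field names, and the migration steps are assumptions invented for the example.

```python
LATEST_VERSION = 3


def _v1_to_v2(data: dict) -> dict:
    # Hypothetical change: the "val" field was renamed to "value".
    data = dict(data)
    data["value"] = data.pop("val")
    return data


def _v2_to_v3(data: dict) -> dict:
    # Hypothetical change: a "units" field was added, defaulting to None.
    data = dict(data)
    data.setdefault("units", None)
    return data


# MIGRATIONS[n] upgrades raw data from version n to version n + 1.
MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}


def migrate(data: dict) -> dict:
    """Return a copy of `data` upgraded step by step to LATEST_VERSION."""
    data = dict(data)
    version = data.pop("version", 1)  # assumption: unversioned data is version 1
    if version > LATEST_VERSION:
        raise ValueError(f"data version {version} is newer than {LATEST_VERSION}")
    while version < LATEST_VERSION:
        data = MIGRATIONS[version](data)
        version += 1
    data["version"] = LATEST_VERSION
    return data
```

Because each step only knows about its neighbouring versions, adding a version 4 means writing one new function and bumping `LATEST_VERSION`; the old schema code does not need to be kept.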

@altendky
Author

@deckar01, my present application is a parameter definition and management tool (https://github.com/altendky/pm). I use Marshmallow to [de]serialize data to/from files on disk.

I could define the upgrade paths against the JSON but I use Marshmallow so I can work with my Python objects. :] I guess beyond just the schemas themselves you need the classes and any processing the classes do, at least on 'load'. As always, this is going to be fun.

@deckar01
Member

You might consider implementing this as a custom render_module. It is just a class that defines loads and dumps (the default is json), and it seems like it might be a good place to intercept the raw data on a per-schema basis.

https://marshmallow.readthedocs.io/en/3.0/api_reference.html#marshmallow.Schema.Meta
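A minimal sketch of what such a render module could look like, assuming the data is a single JSON object carrying a hypothetical `version` key. `VersionedJSON`, `migrate`, and the field names are invented for the example; only the `loads`/`dumps` pair and the `render_module` Meta option come from marshmallow.

```python
import json

CURRENT_VERSION = 2


def migrate(data: dict) -> dict:
    """Hypothetical upgrade: version 1 called the field "val", version 2 "value"."""
    data = dict(data)
    version = data.pop("version", 1)  # assumption: unversioned data is version 1
    if version == 1:
        data["value"] = data.pop("val")
    return data


class VersionedJSON:
    """Drop-in for the json module that upgrades raw data right after parsing."""

    @staticmethod
    def loads(s, **kwargs):
        # Parse, then migrate. The version key is removed so that the schema
        # only sees fields it declares.
        return migrate(json.loads(s, **kwargs))

    @staticmethod
    def dumps(obj, *args, **kwargs):
        # Stamp the current version onto everything written out.
        return json.dumps({**obj, "version": CURRENT_VERSION}, *args, **kwargs)


# Attaching it to a schema would then look roughly like this (not executed here):
#
#     class ParameterSchema(Schema):
#         class Meta:
#             render_module = VersionedJSON
```

The render module is only involved in `Schema.loads` and `Schema.dumps`, so data passed to `Schema.load` as an already-parsed dict would bypass the migration. The sketch also does not handle `many=True`, where the parsed value is a list.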

@sloria
Member

sloria commented Feb 1, 2019

Going to close this for now, as we won't be adding versioning functionality in marshmallow core. Feel free to carry on the discussion, though--it's an interesting use case =).

@sloria sloria closed this as completed Feb 1, 2019
@katetsu

katetsu commented Oct 21, 2020

Hi @altendky, did you manage to find a solution to this? What is your schema versioning approach with marshmallow?

@deckar01
Member

Looking at this with fresh eyes, I would recommend migrating the data before deserializing it and not maintaining a schema history. I can't find any tooling for generic data migration. Everything I found seems to be DB-specific, with lots of magic for auto-generating migration scripts.


4 participants