Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Unpickling CachedQueryResults Created By Py2 #3234

Merged
merged 6 commits into from Jan 1, 2021

Conversation

phil-lopreiato
Copy link
Member

@phil-lopreiato phil-lopreiato commented Dec 30, 2020

This is a rabbit hole, as pickling across python versions is a terrible idea.

However, we can hack around a few of the major issues that come up. Start by adding a repro unit test that tries to load+depickle a value I got from the datastore admin tool for something written by the py2 app.

Next, we need to fix:

  • we changed the import path of models, so we'll need to rewrite those on the fly. We can do this with a custom Unpickler. See this answer.
  • bytestrings are handled totally differently between python 2 and 3, so we need to make sure that the py3 reader will unpickle using encoding="bytes". We can do this in our custom unpickler too. Plus there's this whole thing about unpickling datetimes. See this answer.
  • Finally, there's a ndb incompatibility. The legacy app uses a custom pickling format (a serialized protobuf), which it does by defining __getstate__ and __setstate on the base Model class. Since the py3 compatible ndb library does not do this, unpickling them will fail. I filed Can't Unpickle Models Pickled by Legacy Library googleapis/python-ndb#587 in hope of working with upstream to fix this, but if not, we can potentially hack around it ourselves by manually deserializing the protobuf and setting the field ourselves.

WARNING: After this, we should be able to read data written by the legacy app, but we can not yet write data that the legacy app can read. So after this, when we re-enable CachedQueryResult, we'll have to make sure that we don't write anything to the datastore.

@phil-lopreiato phil-lopreiato changed the title Support Unpickling CachedQueryResults Created By Py2 [wip] Support Unpickling CachedQueryResults Created By Py2 Dec 30, 2020
@codecov
Copy link

codecov bot commented Dec 31, 2020

Codecov Report

Merging #3234 (c168a9f) into py3 (cd50970) will increase coverage by 0.06%.
The diff coverage is 92.30%.

Impacted file tree graph

@@            Coverage Diff             @@
##              py3    #3234      +/-   ##
==========================================
+ Coverage   95.41%   95.48%   +0.06%     
==========================================
  Files         361      367       +6     
  Lines       15865    16328     +463     
==========================================
+ Hits        15137    15590     +453     
- Misses        728      738      +10     
Impacted Files Coverage Δ
src/backend/common/models/cached_model.py 88.42% <86.25%> (-11.58%) ⬇️
src/backend/common/models/cached_query_result.py 100.00% <100.00%> (ø)
...es/tests/cached_query_result_compatibility_test.py 100.00% <100.00%> (ø)
src/backend/api/main.py 100.00% <0.00%> (ø)
src/backend/api/handlers/team.py 100.00% <0.00%> (ø)
src/backend/api/handlers/error.py 100.00% <0.00%> (ø)
src/backend/common/queries/types.py 100.00% <0.00%> (ø)
src/backend/common/queries/team_query.py 100.00% <0.00%> (ø)
src/backend/common/queries/award_query.py 100.00% <0.00%> (ø)
src/backend/common/queries/event_query.py 100.00% <0.00%> (ø)
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd50970...c168a9f. Read the comment docs.

@phil-lopreiato phil-lopreiato marked this pull request as ready for review January 1, 2021 01:51
@phil-lopreiato phil-lopreiato changed the title [wip] Support Unpickling CachedQueryResults Created By Py2 Support Unpickling CachedQueryResults Created By Py2 Jan 1, 2021
@phil-lopreiato phil-lopreiato merged commit 10c7a61 into the-blue-alliance:py3 Jan 1, 2021
@phil-lopreiato phil-lopreiato deleted the py2-unpickle branch January 1, 2021 02:01
phil-lopreiato added a commit that referenced this pull request Jan 9, 2021
This is the followup to #3234

It implements the write side, as well as some other bug fixes and cleanup. Also adds tests to assert that we can do a round trip serialize + deserialize and get the original value back.

This includes some of the serialization code from the legacy runtime, which maintains its original license. I made the basic modifications to port it to py3, and left links to where to find the original to make future debugging easier. 

I also verified that I could take pickles generated by these unit tests and be able to deserialize them in the legacy app's console. After fixing imports, of course. Which we'll need to add to the py2 code anyway.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant