feat: add version metadata in ta_cache_rollups #1192
✅ Sentry found no issues in your recent changes ✅
Codecov Report
All modified and coverable lines are covered by tests ✅
✅ All tests successful. No failed tests found.
Additional details and impacted files
@@ Coverage Diff @@
## main #1192 +/- ##
=======================================
Coverage 97.70% 97.70%
=======================================
Files 454 454
Lines 37024 37035 +11
=======================================
+ Hits 36173 36184 +11
Misses 851 851
e628bde to 64643b9
data,
POLARS_SCHEMA,
V1_POLARS_SCHEMA,
How are we going to serialize data for schemas before this V1 in the next 60 days? They won't have the testsuite, right? So wouldn't this lead to line 83 failing?
the schema describes the format of what we're writing (the keys in data), so as long as data includes a value for each field in the schema, the schema will be respected:
schema = ["field1", "field2"]
data = {"field1": value1, "field2": value2} # good: every schema field has a value
data = {"field3": value1} # bad: field1 and field2 are missing
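The rule in the comment above can be checked mechanically. This is a minimal sketch in plain Python, not the code in this PR: `missing_fields` is a hypothetical helper, and the integer values stand in for `value1` and `value2`.

```python
from typing import Any


def missing_fields(schema: list[str], data: dict[str, Any]) -> list[str]:
    """Return the schema fields that have no value in data, in schema order."""
    return [field for field in schema if field not in data]


schema = ["field1", "field2"]
good = {"field1": 1, "field2": 2}  # a value for every schema field
bad = {"field3": 1}                # neither schema field is present

print(missing_fields(schema, good))  # []
print(missing_fields(schema, bad))   # ['field1', 'field2']
```

An empty result means the data can be written under that schema; anything else names the fields the writer still has to supply.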
lgtm. the good thing is that these are "cache" files, which by definition can be regenerated from the source data at any time.
64643b9 to 563a6fa
We want to be able to evolve the schema of the rollups, and we can do that by including a version tag in the GCS object metadata.
However, even though in this case we're making the change in worker first, in the future we should modify the reading code to handle the new schema before modifying the writing code. If we deploy both reader and writer at the same time, it's possible that a rollup written by a new version of the writer is read by an old version of the reader, which doesn't understand the new format.
deps on: codecov/shared#590
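
The reader-first rollout described above could look roughly like the following. This is a sketch under stated assumptions rather than the reader in this repository: the metadata key `"version"`, the version value `"1"`, the parser functions, and the `extra` field are all hypothetical, and the object metadata is modeled as a plain dict instead of a real GCS call.

```python
from typing import Any, Callable, Optional

VERSION_KEY = "version"  # assumed name of the metadata key

Rows = list[dict[str, Any]]


class UnsupportedRollupVersion(Exception):
    """Raised when a rollup was written in a format this reader predates."""


def parse_unversioned(rows: Rows) -> Rows:
    # Rollups written before version tagging: only the original field.
    return [{"name": row["name"]} for row in rows]


def parse_v1(rows: Rows) -> Rows:
    # Hypothetical V1 format: one additional field.
    return [{"name": row["name"], "extra": row["extra"]} for row in rows]


# Deploy a new entry here before any writer starts emitting that version.
PARSERS: dict[Optional[str], Callable[[Rows], Rows]] = {
    None: parse_unversioned,
    "1": parse_v1,
}


def read_rollup(metadata: Optional[dict[str, str]], rows: Rows) -> Rows:
    """Pick a parser from the version tag; a missing tag means unversioned."""
    version = (metadata or {}).get(VERSION_KEY)
    parser = PARSERS.get(version)
    if parser is None:
        raise UnsupportedRollupVersion(version)
    return parser(rows)
```

Because the rollups are cache files, a caller could treat `UnsupportedRollupVersion` as a cache miss and regenerate the rollup from the source data.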