feat: add version metadata in ta_cache_rollups #1192

Merged 1 commit on Apr 8, 2025

joseph-sentry
Contributor

We want to be able to evolve the schema of the rollups, and we can do that by including a version tag in the GCS object metadata.

However, even though in this case we're making the change in worker first, in the future we should modify the reading code to handle the new schema before modifying the write code. If we deploy both reader and writer at the same time, it's possible that a rollup written by a new version of the writer is read by an old version of the reader, which doesn't understand the new format.
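
To illustrate the reader-first idea, here is a minimal sketch, not the code from this PR. It assumes an in-memory dict standing in for the bucket, a metadata key named `version`, and two made-up payload layouts; `StoredObject`, `write_rollup_v1`, `read_rollup`, and `READERS` are all hypothetical names.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoredObject:
    """A blob plus the string-to-string metadata stored alongside it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


# Hypothetical in-memory stand-in for a bucket: path -> StoredObject.
bucket = {}


def write_rollup_v1(path: str, rows: list) -> None:
    """Writer: tag every object with the schema version it was written with."""
    payload = json.dumps({"rows": rows}).encode()
    bucket[path] = StoredObject(payload, {"version": "1"})


def _read_legacy(data: bytes) -> list:
    # Assumed pre-versioning layout: a bare JSON list of rows.
    return json.loads(data)


def _read_v1(data: bytes) -> list:
    # Assumed v1 layout: rows wrapped in a JSON object.
    return json.loads(data)["rows"]


# Objects written before versioning carry no tag, hence the None key.
READERS = {None: _read_legacy, "1": _read_v1}


def read_rollup(path: str) -> Optional[list]:
    """Reader: dispatch on the version tag; unknown versions are a cache miss."""
    obj = bucket.get(path)
    if obj is None:
        return None
    reader = READERS.get(obj.metadata.get("version"))
    if reader is None:
        # Written by a newer writer than this reader understands.
        return None
    return reader(obj.data)
```

With this shape, a new entry is added to `READERS` and deployed first; only afterwards does the writer start emitting the new version. An old reader that meets an unknown tag falls back to regenerating the rollup instead of misparsing it.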

deps on: codecov/shared#590

@joseph-sentry joseph-sentry requested a review from a team March 31, 2025 15:57

✅ Sentry found no issues in your recent changes ✅

codecov-notifications bot commented Mar 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

codecov bot commented Mar 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.70%. Comparing base (078221e) to head (563a6fa).
Report is 4 commits behind head on main.

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1192   +/-   ##
=======================================
  Coverage   97.70%   97.70%           
=======================================
  Files         454      454           
  Lines       37024    37035   +11     
=======================================
+ Hits        36173    36184   +11     
  Misses        851      851           

Flag         Coverage Δ
integration  42.82% <47.05%> (+<0.01%) ⬆️
unit         90.50% <100.00%> (+<0.01%) ⬆️


@joseph-sentry joseph-sentry force-pushed the joseph/version branch 2 times, most recently from e628bde to 64643b9 Compare March 31, 2025 19:55
data,
POLARS_SCHEMA,
V1_POLARS_SCHEMA,
Contributor

How are we going to serialize data for schemas before this V1 in the next 60 days? They won't have the testsuite, right? So wouldn't this lead to line 83 failing?

Contributor Author

The schema describes the format of what we're writing (the keys in data), so as long as data includes a value for each field in the schema, the schema will be respected:

schema = ["field1", "field2"]
data = {"field1": value1, "field2": value2}  # good: every schema field has a value
data = {"field3": value1}  # bad: field1 and field2 are missing, field3 is not in the schema
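
The same rule can be written as a small check. This is only a sketch in plain Python, not how polars validates anything; `check_against_schema` is a hypothetical helper, and the integer values are placeholders for `value1` and `value2`.

```python
def check_against_schema(schema: list, data: dict) -> tuple:
    """Return (missing, unexpected) field names, each in a stable order."""
    # Fields the schema requires but data does not provide.
    missing = [name for name in schema if name not in data]
    # Keys in data that the schema does not know about.
    unexpected = [name for name in data if name not in schema]
    return missing, unexpected


schema = ["field1", "field2"]
good = {"field1": 1, "field2": 2}
bad = {"field3": 1}

print(check_against_schema(schema, good))  # ([], [])
print(check_against_schema(schema, bad))   # (['field1', 'field2'], ['field3'])
```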

Contributor

@Swatinem Swatinem left a comment

lgtm. the good thing is that these are "cache" files, which by definition can be regenerated from the source data at any time.

@joseph-sentry joseph-sentry added this pull request to the merge queue Apr 8, 2025
Merged via the queue into main with commit 7840d9f Apr 8, 2025
26 of 27 checks passed
@joseph-sentry joseph-sentry deleted the joseph/version branch April 8, 2025 18:16