Backfill/quidel covidtest #1660
backfilldata.rename({"timestamp": "time_value",
                     "totalTest_total": "den_total",
                     "positiveTest_total": "num_total",
                     "positiveTest_age_0_4": "num_age_0_4",
                     "totalTest_age_0_4": "den_age_0_4",
                     "positiveTest_age_5_17": "num_age_5_17",
                     "totalTest_age_5_17": "den_age_5_17",
                     "positiveTest_age_18_49": "num_age_18_49",
Have we considered saving each of these age buckets to a separate data file? That way, the numerator/denominator field names can be standardized ("num" and "den") for easier handling in the corrections pipeline. I'm imagining filenames like quidel_covidtest_all_ages_as_of_20200817.parquet, quidel_covidtest_age_0_4_as_of_20200817.parquet, etc.
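A minimal sketch of what the per-bucket layout suggested above could look like, assuming the renamed columns follow the `num_<bucket>` / `den_<bucket>` pattern from the diff. The helper name `bucket_layout` is hypothetical and not part of this PR; it only computes file names and rename maps, and writes nothing to disk.

```python
def bucket_layout(columns, as_of):
    """Map each output file name to the column renames for its age bucket.

    Hypothetical helper: assumes columns are named num_<bucket> or
    den_<bucket>, and that the "total" bucket is stored as "all_ages".
    """
    layout = {}
    for col in columns:
        for prefix in ("num_", "den_"):
            if col.startswith(prefix):
                bucket = col[len(prefix):]  # e.g. "total", "age_0_4"
                label = "all_ages" if bucket == "total" else bucket
                name = f"quidel_covidtest_{label}_as_of_{as_of}.parquet"
                # Standardize the field name to plain "num" or "den".
                layout.setdefault(name, {})[col] = prefix.rstrip("_")
    return layout


layout = bucket_layout(
    ["time_value", "num_total", "den_total", "num_age_0_4", "den_age_0_4"],
    "20200817",
)
```

Each rename map could then be passed to `DataFrame.rename(columns=...)` on the matching column subset before saving, so every file exposes the same `num` and `den` fields.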
I did. But it adds to the storage burden. It might not be a super big deal, though.
Having a consistent format is really nice and will save some work down the line if the backfill system gets more complicated. BUT either way works and this is just the first version (and I already have the backfill correction package to handle quidel's nonstandard numerator/denominator names, so that's not a concern).
Co-authored-by: nmdefries <42820733+nmdefries@users.noreply.github.com>
nmdefries left a comment
👍
krivard left a comment
Great!
Description
Generate and store county-level backfill files locally (on MIDAS) for the Quidel COVID antigen test. The newly added function checks the input file daily and stores a backfill file each day. Every 4 weeks, the daily files are checked and merged into a combined file.
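The merge schedule described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function name `daily_files_to_merge` is hypothetical, the daily files are assumed to end in `_as_of_YYYYMMDD.parquet`, `merge_day` is assumed to be a weekday index (0 = Monday), and a merge is assumed to run once the oldest daily file is at least 4 weeks old.

```python
from datetime import date, datetime, timedelta


def daily_files_to_merge(filenames, today, merge_day, weeks=4):
    """Return the daily files to combine, or [] if no merge is due today.

    Assumptions (not taken from the PR): merge_day is a weekday index
    with 0 = Monday, and file names end in _YYYYMMDD.parquet.
    """
    if today.weekday() != merge_day:
        return []
    dated = []
    for name in filenames:
        stamp = name.rsplit("_", 1)[-1].split(".")[0]  # e.g. "20200817"
        dated.append((datetime.strptime(stamp, "%Y%m%d").date(), name))
    # Merge only once the oldest daily file is at least `weeks` weeks old.
    if not dated or today - min(dated)[0] < timedelta(weeks=weeks):
        return []
    return [name for _, name in sorted(dated)]


files = [
    "quidel_covidtest_as_of_20200817.parquet",
    "quidel_covidtest_as_of_20200720.parquet",
]
due = daily_files_to_merge(files, date(2020, 8, 17), merge_day=0)
```

The returned list is ordered by date, so the selected files could be read and concatenated in chronological order before the combined file is written.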
Changelog
- `run.py`, `backfill.py`: modify the current code to save the county-level intermediate file to `backfill_dir` before generating signals, if Quidel has added new raw input files.
- `setup.py`: add pyarrow as a required package so that the backfill file can be saved in Parquet format.
- `tests/test_run.py`: update the PARAMS in the unit tests.
- `tests/test_backfill.py`: update the unit tests for the backfill-related helper functions.
- `params.json.template`: add `backfill_dir` and `backfill_merge_day`.