Contributor

@jingjtang jingjtang commented Jul 19, 2022

Description

Generate and store county-level backfill files locally (on MIDAS) for the Quidel COVID antigen test. The newly added function checks the input file daily and stores a backfill file each day. Every 4 weeks, the daily files are checked and merged into a single combined file.
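To make the daily-store and 4-week-merge schedule concrete, here is a minimal sketch of the bookkeeping only (no parquet I/O). The file-name patterns, the helper names `daily_filename` and `plan_merge`, and the reading of `backfill_merge_day` as a weekday number (0 = Monday) are assumptions for illustration, not necessarily what backfill.py does.

```python
from datetime import date, datetime
from typing import List, Optional, Tuple

# Hypothetical naming scheme for the daily backfill files.
DAILY_PREFIX = "quidel_covidtest_as_of_"
SUFFIX = ".parquet"


def daily_filename(as_of: date) -> str:
    """Name of the daily backfill file for a given issue date."""
    return f"{DAILY_PREFIX}{as_of:%Y%m%d}{SUFFIX}"


def parse_as_of(filename: str) -> date:
    """Recover the issue date from a daily backfill file name."""
    stamp = filename[len(DAILY_PREFIX):-len(SUFFIX)]
    return datetime.strptime(stamp, "%Y%m%d").date()


def plan_merge(
    filenames: List[str],
    today: date,
    merge_weekday: int,
    min_days: int = 28,
) -> Optional[Tuple[List[str], str]]:
    """Decide whether the daily files should be merged today.

    Returns (files_to_merge, merged_file_name), or None when it is not
    the merge weekday or fewer than `min_days` days are covered.
    """
    daily = sorted(
        f for f in filenames
        if f.startswith(DAILY_PREFIX) and f.endswith(SUFFIX)
    )
    if today.weekday() != merge_weekday or not daily:
        return None
    # YYYYMMDD stamps sort chronologically, so the ends are first/last.
    first, last = parse_as_of(daily[0]), parse_as_of(daily[-1])
    if (last - first).days + 1 < min_days:
        return None
    merged = f"quidel_covidtest_from_{first:%Y%m%d}_to_{last:%Y%m%d}{SUFFIX}"
    return daily, merged
```

In this sketch the caller would read the listed daily files, concatenate them, write the merged file, and only then delete the daily ones.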

Changelog

Itemize code/test/documentation changes and files added/removed.

  • run.py, backfill.py: modify the current code to save the county-level intermediate file to backfill_dir before generating signals, whenever Quidel has added new raw input files.
  • setup.py: add pyarrow as a required package so that the backfill file can be saved in parquet format.
  • tests/test_run.py: update the PARAMS in the unit tests.
  • tests/test_backfill.py: update the unit tests for the backfill-related helper functions.
  • params.json.template: add backfill_dir and backfill_merge_day.
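As a rough illustration of how the two new parameters might be read and sanity-checked, here is a small sketch. It assumes that both keys sit under an "indicator" section of params.json and that backfill_merge_day is a weekday number from 0 (Monday) to 6 (Sunday). Neither detail is stated in this PR description, and `load_backfill_params` is a hypothetical helper.

```python
import json
from typing import Tuple


def load_backfill_params(params_text: str) -> Tuple[str, int]:
    """Extract and validate the backfill settings from params JSON text."""
    params = json.loads(params_text)
    indicator = params["indicator"]  # assumed location of the new keys
    backfill_dir = indicator["backfill_dir"]
    merge_day = int(indicator["backfill_merge_day"])
    # Assumed convention: weekday number as used by datetime.date.weekday().
    if not 0 <= merge_day <= 6:
        raise ValueError(f"backfill_merge_day must be in 0..6, got {merge_day}")
    return backfill_dir, merge_day
```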

@jingjtang jingjtang requested a review from krivard August 4, 2022 06:38
@nmdefries nmdefries self-requested a review August 23, 2022 20:07
Comment on lines +29 to +36
backfilldata.rename({"timestamp": "time_value",
                     "totalTest_total": "den_total",
                     "positiveTest_total": "num_total",
                     "positiveTest_age_0_4": "num_age_0_4",
                     "totalTest_age_0_4": "den_age_0_4",
                     "positiveTest_age_5_17": "num_age_5_17",
                     "totalTest_age_5_17": "den_age_5_17",
                     "positiveTest_age_18_49": "num_age_18_49",
Contributor

Have we considered saving each of these age buckets to a separate data file? That way, the numerator/denominator field names can be standardized ("num" and "den") for easier handling in the corrections pipeline. I'm imagining filenames like quidel_covidtest_all_ages_as_of_20200817.parquet, quidel_covidtest_age_0_4_as_of_20200817.parquet, etc.
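A minimal sketch of what that split could look like, using plain dicts instead of DataFrames so it runs without pandas or pyarrow. Only the age buckets visible in the excerpt above are listed. The `geo_id` key column and the helper name `split_by_bucket` are assumptions, and the sketch only plans the per-file contents rather than writing parquet.

```python
from typing import Dict, List

# Column suffix in the wide file -> label used in the per-bucket file name.
# Only the buckets visible in the diff excerpt are included here.
BUCKETS = {
    "total": "all_ages",
    "age_0_4": "age_0_4",
    "age_5_17": "age_5_17",
    "age_18_49": "age_18_49",
}


def split_by_bucket(
    rows: List[dict],
    as_of: str,
    key_cols=("time_value", "geo_id"),  # "geo_id" is a hypothetical county key
) -> Dict[str, List[dict]]:
    """Map each per-bucket file name to rows with standardized num/den fields."""
    out: Dict[str, List[dict]] = {}
    for suffix, label in BUCKETS.items():
        name = f"quidel_covidtest_{label}_as_of_{as_of}.parquet"
        out[name] = [
            {
                **{k: row[k] for k in key_cols},
                "num": row[f"num_{suffix}"],
                "den": row[f"den_{suffix}"],
            }
            for row in rows
        ]
    return out
```

The key columns are repeated in every per-bucket file, which is where the extra storage would come from.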

Contributor Author

I did, but it adds to the storage burden. It might not be a super big deal, though.

Contributor

Having a consistent format is really nice and will save some work down the line if the backfill system gets more complicated. BUT either way works and this is just the first version (and I already have the backfill correction package to handle Quidel's nonstandard numerator/denominator names, so that's not a concern).

@jingjtang jingjtang requested a review from nmdefries September 1, 2022 01:52
Contributor

@nmdefries nmdefries left a comment

👍

Contributor

@krivard krivard left a comment

Great!

@krivard krivard merged commit e97bbb9 into main Sep 13, 2022
@krivard krivard deleted the backfill/quidel_covidtest branch September 13, 2022 18:32