@krivard krivard commented Nov 12, 2021

Description

This indicator generates COVID-19 test positivity and total test volume signals using the Community Profile Report (CPR) published by the Data Strategy and Execution Workgroup at https://healthdata.gov/Health/COVID-19-Community-Profile-Report/gqxm-d9w9.

The geo handling on this one is a bit different, since the CPR already includes a separate sheet for most of our geographic aggregation levels, including HHS regions, states, MSAs (a subset of CBSAs), and counties. The indicator pulls data frames from each of these sheets rather than aggregating with our GeoMapper, which would risk publishing figures that don't match what was in the report.
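
As a rough illustration of the per-sheet approach, the sketch below pairs each geo level with the worksheet it would be read from. The class name, field names, sheet names, and column labels are placeholders made up for this example; they are not taken from constants.py or from the CPR workbook itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SheetConfig:
    """Describes where one geo level lives in the workbook (all values assumed)."""
    geo_level: str    # geo type used when exporting
    sheet_name: str   # worksheet to read (placeholder name)
    id_column: str    # column holding the geo identifier (placeholder name)


# Hypothetical configuration: one entry per geo level that has its own sheet.
SHEETS = [
    SheetConfig("hhs", "Regions", "region_id"),
    SheetConfig("state", "States", "state_abbr"),
    SheetConfig("msa", "CBSAs", "cbsa_code"),
    SheetConfig("county", "Counties", "fips"),
]

# Lookup by geo level; levels without a dedicated sheet are simply absent.
SHEETS_BY_GEO = {cfg.geo_level: cfg for cfg in SHEETS}
```

With a table like this, any geo level missing from `SHEETS_BY_GEO` (for example `nation`) is one that has to be derived some other way.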

Changelog

Itemize code/test/documentation changes and files added/removed.

  • run.py: thin wrapper that reads params and exports CSVs
  • constants.py: URL formulae, a small configuration object for each of the geo-level sheets, the signals list, and the formula for building a full signal name from the short name
  • pull.py: determines which reports are available, which haven't been downloaded yet, and which to include in the final data frames, then parses all matching reports into a data frame for each (geo, signal) combination
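
The report-selection step described for pull.py can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: the function name `plan_reports` and its arguments are hypothetical, and it assumes each report is identified by its publication date.

```python
from datetime import date


def plan_reports(available, downloaded, export_start, export_end):
    """Decide which reports to fetch and which to parse.

    available:    iterable of publication dates found on the listing page
    downloaded:   set of publication dates already cached locally
    export_start, export_end: inclusive date bounds for this run
    Returns (to_download, to_parse), both sorted by date.
    """
    # Reports to include in the final data frames: everything in range.
    to_parse = sorted(d for d in set(available) if export_start <= d <= export_end)
    # Of those, only fetch the ones that are not cached yet.
    to_download = [d for d in to_parse if d not in downloaded]
    return to_download, to_parse


example_download, example_parse = plan_reports(
    available=[date(2021, 11, 8), date(2021, 11, 9), date(2021, 11, 10)],
    downloaded={date(2021, 11, 9)},
    export_start=date(2021, 11, 9),
    export_end=date(2021, 11, 10),
)
```

Here `example_parse` covers both in-range reports, while `example_download` contains only the one that is not already cached.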

Fixes

@krivard krivard marked this pull request as ready for review November 12, 2021 22:15
@krivard krivard requested a review from nmdefries November 12, 2021 22:15
Comment on lines +47 to +51
GeoMapper _is_ used to generate national figures from
state, due to architectural differences between the starred sheets and the
Overview sheet. If we discover that our nation-level figures differ too much
from those listed in the Overview sheet, we can add dedicated parsing for the
Overview sheet and remove GeoMapper from this indicator altogether.

Have you thought about what the threshold for "differs too much" is? Since we have all the historical data, we can probably decide now whether or not to add dedicated national parsing so we don't have to do any backfilling later on.

krivard and others added 4 commits November 19, 2021 13:04
Co-authored-by: nmdefries <42820733+nmdefries@users.noreply.github.com>
* Use export_start_date / export_end_date
* Correct nation aggregation for positivity as rate
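
The commit message does not spell out the fix. One plausible reading is that a rate such as positivity cannot be summed or averaged directly across states and has to be weighted by test volume instead. The sketch below shows that idea only; the function name and input shape are hypothetical and it does not use GeoMapper.

```python
def national_positivity(states):
    """Combine state positivity rates into one national rate.

    states: iterable of (positivity, total_tests) pairs, with positivity
            expressed as a fraction between 0 and 1.
    Returns the volume-weighted rate, or None if no tests were reported.
    """
    states = list(states)
    total_tests = sum(tests for _, tests in states)
    if total_tests == 0:
        return None
    # Recover estimated positive counts, then divide by the pooled volume.
    positives = sum(rate * tests for rate, tests in states)
    return positives / total_tests


# Two states: 10% of 1,000 tests and 20% of 3,000 tests.
example_rate = national_positivity([(0.10, 1000), (0.20, 3000)])
```

In this example the weighted result is 700 / 4000 = 0.175, whereas a plain mean of the two rates would give 0.15.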

krivard commented Jan 10, 2022

(failing test is unrelated; due to spurious network call made by validator tests and fixed in #1458)

@krivard krivard force-pushed the krivard/community_profile branch from 457da16 to a58388c Compare January 18, 2022 22:45
@krivard krivard requested a review from nmdefries January 18, 2022 22:52
@nmdefries nmdefries left a comment

Small suggested change to the package description but otherwise looks good!

Co-authored-by: nmdefries <42820733+nmdefries@users.noreply.github.com>
@krivard krivard merged commit 03c6f45 into main Jan 19, 2022
@krivard krivard deleted the krivard/community_profile branch January 19, 2022 15:32
Development

Successfully merging this pull request may close these issues.

New indicator: CDC COVID-19 Community Profile Report