# 2024-07-03 CDC ADRIO Demo

_author: Trevor Johnson_

This devlog demonstrates a new 'CDC ADRIO maker', which fetches data from various CDC and HealthData datasets. Six datasets are currently included, each with its own limitations and set of supported attributes. The table below lists every attribute along with its limitations.

| Attribute Name | Dates | Granularity | Dataset | Description |
| --- | --- | --- | --- | --- |
| covid_cases_per_100k | 2/24/2022 - 5/4/2023 | County, State | United States COVID 19 Community Levels by County | Weekly number of COVID\-19 cases per 100k population. |
| covid_hospitalizations_per_100k | 2/24/2022 - 5/4/2023 | County, State | United States COVID 19 Community Levels by County | Weekly number of COVID\-19 hospitalizations per 100k population. |
| covid_hospitalization_avg_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly averages of COVID\-19 hospitalizations reported by facility. |
| covid_hospitalization_sum_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly sums of COVID\-19 hospitalizations reported by facility. |
| influenza_hospitalization_avg_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly averages of influenza hospitalizations reported by facility. |
| influenza_hospitalization_sum_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly sums of influenza hospitalizations reported by facility. |
| covid_hospitalization_avg_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly averages of COVID\-19 hospitalizations reported by state. |
| covid_hospitalization_sum_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly sums of COVID\-19 hospitalizations reported by state. |
| influenza_hospitalization_avg_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly averages of influenza hospitalizations reported by state. |
| influenza_hospitalization_sum_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly sums of influenza hospitalizations reported by state. |
| full_covid_vaccinations | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of individuals who have completed a series of COVID\-19 vaccinations. |
| one_dose_covid_vaccinations | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of individuals who have received at least one dose of COVID\-19 vaccination. |
| covid_booster_doses | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of COVID\-19 booster doses administered. |
| covid_deaths_county | 1/4/2020 - 4/5/2024 | County, State | AH COVID-19 Death Counts by County and Week, 2020-present | Weekly number of COVID\-19 deaths reported by county.  |
| covid_deaths_state | 1/4/2020 - present | State | Provisional COVID-19 Death Counts by Week Ending Date and State | Weekly number of COVID\-19 deaths reported by state. |
| influenza_deaths | 1/4/2020 - present | State | Provisional COVID-19 Death Counts by Week Ending Date and State | Weekly number of influenza deaths reported by state. |
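
Because each attribute is only available for a fixed window, it can help to check a requested date range before building an ADRIO. The sketch below is not part of epymorph: `SUPPORTED_RANGES` and `check_range` are hypothetical names, and the dictionary copies only a few rows from the table above for illustration.

```python
from datetime import date

# Hypothetical lookup copied from a few rows of the table above;
# None as the end date stands for "present".
SUPPORTED_RANGES = {
    "covid_cases_per_100k": (date(2022, 2, 24), date(2023, 5, 4)),
    "covid_hospitalization_avg_facility": (date(2020, 12, 13), date(2023, 5, 10)),
    "covid_deaths_state": (date(2020, 1, 4), None),
}


def check_range(attribute: str, start: date, end: date) -> bool:
    """Return True if [start, end] lies within the attribute's supported window."""
    first, last = SUPPORTED_RANGES[attribute]
    if start > end or start < first:
        return False
    return last is None or end <= last


print(check_range("covid_cases_per_100k", date(2022, 3, 1), date(2023, 1, 1)))
print(check_range("covid_cases_per_100k", date(2021, 1, 1), date(2022, 3, 1)))
```

The first call falls inside the supported window and the second starts before it, so a check like this could catch an unsupported request before any data is fetched.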

## **Demo**
The demo walks through the ADRIO maker one dataset at a time. Each section briefly describes a dataset, then creates and runs one ADRIO for each of its attributes.

In [1]:
from epymorph.geo.adrio.cdc.adrio_cdc import ADRIOMakerCDC
from epymorph.geography.us_census import CountyScope, StateScope

maker = ADRIOMakerCDC()
# '04' and '08' are the state FIPS codes for Arizona and Colorado.
county_scope = CountyScope.in_states(['04', '08'])
state_scope = StateScope.in_states(['04', '08'])

### **United States COVID 19 Community Levels by County**
This dataset is used to fetch data on reported COVID-19 cases and hospitalizations per 100k population.

- Supported attributes: covid_cases_per_100k, covid_hospitalizations_per_100k
- Available date range: 2/24/2022 to 5/4/2023
- Granularity: county, state

https://healthdata.gov/dataset/United-States-COVID-19-Community-Levels-by-County/nn5b-j5u9/about_data
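
The data is weekly, so a request over the full available range should cover a predictable number of time steps. The sketch below enumerates dates in seven-day steps from the first available date. It assumes a fixed weekly cadence anchored on 2/24/2022; `weekly_dates` is a hypothetical helper, not an epymorph function.

```python
from datetime import date, timedelta


def weekly_dates(start: date, end: date) -> list[date]:
    """List dates from start to end (inclusive) in 7-day steps."""
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += timedelta(days=7)
    return dates


weeks = weekly_dates(date(2022, 2, 24), date(2023, 5, 4))
print(weeks[:3])
print(len(weeks))
```

Comparing the length of this list with the number of rows an ADRIO returns is one quick way to spot missing weeks.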

In [2]:
from datetime import date

from epymorph.data_shape import Shapes
from epymorph.geo.spec import DateRange
from epymorph.simulation import geo_attrib

time_period = DateRange(date(2022, 2, 24), date(2023, 5, 4))

cases = maker.make_adrio(geo_attrib("covid_cases_per_100k", int,
                         Shapes.TxN), county_scope, time_period)
hospitalizations = maker.make_adrio(geo_attrib(
    "covid_hospitalizations_per_100k", int, Shapes.TxN), county_scope, time_period)

In [3]:
print(f"COVID cases per 100k:\n {cases.get_value()[:3]}\n")
print(f"COVID hospitalizations per 100k:\n {hospitalizations.get_value()[:3]}")

COVID cases per 100k:
 [('2022-02-24', array([332, 260, 181, 236, 272, 231, 165, 190, 186, 288, 238, 203, 109,
        148, 170, 142, 277, 164, 106, 139, 394, 152, 161, 186, 109,  41,
        353, 180, 395, 118, 198, 170,  48, 193, 177, 224, 209, 436, 231,
        112, 152, 137,   0, 260,   0, 207,   0, 183,  61, 183, 233, 296,
        122, 240, 306,   0, 255, 213, 215, 134, 284,  80,  84, 234, 348,
        221, 219, 268, 363, 179, 278, 549, 207, 266, 148,  86, 203, 183,
        129]))
 ('2022-03-03', array([ 97, 100,  78,  57,  56,  84,  90,  77,  60,  83,  64,  55,  32,
         32,  46,  90, 190, 108,  49, 279,  89, 126,  86, 171,   0,  51,
        219, 180, 329,  78, 105, 104,  97,  93,  70,  78, 144, 399, 169,
        112,  95, 475, 121, 101, 143, 105, 142,  28,  24,  87, 177, 186,
        192, 325, 243,   0, 180, 114, 135,  37, 114,  60,  74,  70, 208,
         98, 157, 253, 257, 187, 175,   0,  85,  88, 174,  86, 122, 149,
         79]))
 ('2022-03-10', array([ 293,  131,  132, 

### **COVID-19 Reported Patient Impact and Hospital Capacity by Facility**
This dataset is used to fetch hospitalization data for COVID-19 and other respiratory illnesses.

- Supported attributes: covid_hospitalization_avg_facility, covid_hospitalization_sum_facility, influenza_hospitalization_avg_facility, influenza_hospitalization_sum_facility
- Available date range: 12/13/2020 to 5/10/2023
- Granularity: county, state

https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data


In [4]:
time_period = DateRange(date(2020, 12, 13), date(2023, 5, 10))

covid_avg = maker.make_adrio(geo_attrib(
    "covid_hospitalization_avg_facility", float, Shapes.TxN), county_scope, time_period)
covid_sum = maker.make_adrio(geo_attrib(
    "covid_hospitalization_sum_facility", int, Shapes.TxN), county_scope, time_period)
flu_avg = maker.make_adrio(geo_attrib(
    "influenza_hospitalization_avg_facility", float, Shapes.TxN), county_scope, time_period)
flu_sum = maker.make_adrio(geo_attrib(
    "influenza_hospitalization_sum_facility", int, Shapes.TxN), county_scope, time_period)

In [5]:
print(f"COVID hospitalization average:\n {covid_avg.get_value()[:3]}\n")
print(f"COVID hospitalization sum:\n {covid_sum.get_value()[:3]}\n")
print(f"Influenza hospitalization average:\n {flu_avg.get_value()[:3]}\n")
print(f"Influenza hospitalization sum:\n {flu_sum.get_value()[:3]}")

COVID hospitalization average:
 [('2020-12-13', array([ 6.40000e+00, -9.99999e+05, -9.99999e+05,  2.88000e+01,
         2.48000e+01,  1.96000e+01, -9.99999e+05, -9.99999e+05,
         7.29000e+01, -9.99999e+05,  1.65600e+02,  8.80000e+00,
         2.21000e+02,  3.23200e+02,  1.60600e+02,  8.90000e+00,
         1.44300e+02, -9.99999e+05, -9.99999e+05,  7.07000e+01,
        -9.99999e+05,  0.00000e+00,  0.00000e+00,  0.00000e+00,
         0.00000e+00,  7.89000e+01,  2.74000e+01, -9.99999e+05,
        -9.99999e+05,  8.10000e+00,  0.00000e+00,  0.00000e+00,
        -9.99999e+05,  0.00000e+00,  3.33000e+01, -9.99999e+05,
         0.00000e+00,  0.00000e+00,  1.60000e+01, -9.99999e+05,
         0.00000e+00,  0.00000e+00,  6.30000e+00,  5.26000e+01,
        -9.99999e+05,  0.00000e+00,  1.04000e+01,  4.90000e+00,
         8.40000e+00,  6.10000e+00,  0.00000e+00, -9.99999e+05,
         1.14000e+01, -9.99999e+05,  0.00000e+00,  0.00000e+00,
         0.00000e+00, -9.99999e+05,  0.00000e+00,  4.2700
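
The value -9.99999e+05 (that is, -999999) recurs throughout the output above. It is far outside any plausible hospitalization count, so it appears to be a placeholder rather than a measurement. Treating it as a marker for suppressed or missing values is an assumption here; the dataset documentation linked above should be consulted to confirm. A minimal NumPy sketch for masking the placeholder before computing statistics, using the first eight values printed above:

```python
import numpy as np

SENTINEL = -999999  # placeholder value seen in the output; its meaning is assumed

# First eight values of the 2020-12-13 row printed above.
row = np.array([6.4, -999999.0, -999999.0, 28.8, 24.8, 19.6, -999999.0, -999999.0])

# Mask the placeholder so it does not distort sums or means.
masked = np.ma.masked_equal(row, SENTINEL)

print(masked.count())  # number of non-placeholder entries
print(masked.mean())   # mean over the remaining entries
```

Without the mask, a single placeholder would dominate any average taken across counties.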

### **Weekly United States Hospitalization Metrics by Jurisdiction**
Like the previous dataset, this dataset is used to fetch hospitalization data for COVID-19 and other respiratory illnesses. Unlike the previous dataset, however, it includes metrics reported voluntarily after the end of the mandatory reporting period, and it is limited to state granularity.

- Supported attributes: covid_hospitalization_avg_state, covid_hospitalization_sum_state, influenza_hospitalization_avg_state, influenza_hospitalization_sum_state
- Available date range: 1/04/2020 to present. Data is reported voluntarily after 5/1/2024.
- Granularity: state

https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-Hospitalization-Metrics-by-Ju/aemt-mg7g/about_data
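
When a requested range extends past 5/1/2024, the demo output in this section includes a warning about voluntary reporting. How epymorph decides to emit it is not shown in this devlog; the sketch below is one plausible way to implement such a check with the standard `warnings` module. `VOLUNTARY_AFTER` and `warn_if_voluntary` are hypothetical names.

```python
import warnings
from datetime import date

# Cutoff taken from the text above: reporting is voluntary after this date.
VOLUNTARY_AFTER = date(2024, 5, 1)


def warn_if_voluntary(end: date) -> bool:
    """Emit a warning when the requested range extends past the cutoff."""
    if end > VOLUNTARY_AFTER:
        warnings.warn("State level hospitalization data is voluntary past 5/1/2024.")
        return True
    return False


warn_if_voluntary(date(2024, 6, 28))  # end date is after the cutoff, so this warns
warn_if_voluntary(date(2023, 1, 1))   # end date is before the cutoff, so this is silent
```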

In [6]:
time_period = DateRange(date(2020, 12, 13), date(2024, 6, 28))

covid_avg = maker.make_adrio(geo_attrib(
    "covid_hospitalization_avg_state", float, Shapes.TxN), state_scope, time_period)
covid_sum = maker.make_adrio(geo_attrib(
    "covid_hospitalization_sum_state", int, Shapes.TxN), state_scope, time_period)
flu_avg = maker.make_adrio(geo_attrib(
    "influenza_hospitalization_avg_state", float, Shapes.TxN), state_scope, time_period)
flu_sum = maker.make_adrio(geo_attrib(
    "influenza_hospitalization_sum_state", int, Shapes.TxN), state_scope, time_period)

In [7]:
print(f"COVID hospitalization average:\n {covid_avg.get_value()[:3]}\n...\n")
print(f"COVID hospitalization sum:\n {covid_sum.get_value()[:3]}\n...\n")
print(f"Influenza hospitalization average:\n {flu_avg.get_value()[:3]}\n...\n")
print(f"Influenza hospitalization sum:\n {flu_sum.get_value()[:3]}\n...")

  warn("State level hospitalization data is voluntary past 5/1/2024.")


COVID hospitalization average:
 [('2020-12-19', array([452., 199.])) ('2020-12-26', array([472., 164.]))
 ('2021-01-02', array([495., 150.]))]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


COVID hospitalization sum:
 [('2020-12-19', array([3164, 1396])) ('2020-12-26', array([3307, 1148]))
 ('2021-01-02', array([3465, 1051]))]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


Influenza hospitalization average:
 [('2020-12-19', array([2., 1.])) ('2020-12-26', array([1., 1.]))
 ('2021-01-02', array([6., 0.]))]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


Influenza hospitalization sum:
 [('2020-12-19', array([11,  5])) ('2020-12-26', array([8, 5]))
 ('2021-01-02', array([44,  0]))]
...
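
In the sample output above, each weekly average matches the corresponding weekly sum divided by 7 and rounded to the nearest whole number. This is an observation from the printed values, not a documented guarantee. A quick NumPy check using the COVID values copied from the output:

```python
import numpy as np

# Values copied from the COVID hospitalization output above (three weeks, two states).
weekly_sum = np.array([[3164, 1396], [3307, 1148], [3465, 1051]])
weekly_avg = np.array([[452.0, 199.0], [472.0, 164.0], [495.0, 150.0]])

# Divide each weekly sum by 7 days and round to the nearest whole number.
derived_avg = np.round(weekly_sum / 7)

print(np.array_equal(derived_avg, weekly_avg))
```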


### **COVID-19 Vaccinations in the United States,County**
This dataset is used to fetch cumulative COVID-19 vaccination data.

- Supported attributes: full_covid_vaccinations, one_dose_covid_vaccinations, covid_booster_doses
- Available date range: 12/13/2020 to 5/10/2024.
- Granularity: county, state

https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-County/8xkx-amqh/about_data
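
Because these attributes are cumulative totals, a model that needs new vaccinations per time step has to difference consecutive rows. A minimal NumPy sketch follows; the numbers are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Made-up cumulative totals: rows are time steps, columns are locations.
cumulative = np.array([[0, 0], [10, 4], [25, 4], [40, 9]])

# Difference between consecutive rows. Prepending 0 treats the total before
# the first time step as zero, so the output keeps one row per time step.
new_per_step = np.diff(cumulative, axis=0, prepend=0)

print(new_per_step)
```

Taking a cumulative sum of `new_per_step` along axis 0 recovers the original totals, which is a handy sanity check.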

In [8]:
time_period = DateRange(date(2020, 12, 13), date(2024, 5, 10))

full = maker.make_adrio(geo_attrib("full_covid_vaccinations",
                        int, Shapes.TxN), county_scope, time_period)
one = maker.make_adrio(geo_attrib("one_dose_covid_vaccinations",
                       int, Shapes.TxN), county_scope, time_period)
booster = maker.make_adrio(geo_attrib("covid_booster_doses",
                           int, Shapes.TxN), county_scope, time_period)

In [9]:
print(f"Full COVID vaccinations:\n {full.get_value()[:3]}\n")
print(f"One dose COVID vaccinations:\n {one.get_value()[:3]}\n")
print(f"COVID booster doses:\n {booster.get_value()[:3]}")

Full COVID vaccinations:
 [('2020-12-13', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
 ('2020-12-14', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
 ('2020-12-15', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))]

One dose COVID vaccinations:
 [('2020-12-13', array([0, 0, 0, 0, 0, 0, 0, 

### **AH COVID-19 Death Counts by County and Week, 2020-present**
This dataset is used to fetch data on COVID-19 deaths.

- Supported attributes: covid_deaths_county
- Available date range: 1/4/2020 to 4/5/2024.
- Granularity: county, state

https://data.cdc.gov/NCHS/AH-COVID-19-Death-Counts-by-County-and-Week-2020-p/ite7-j2w7/about_data
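
The dataset reports by county, yet state granularity is also listed as supported, which suggests county values can be aggregated to the state level. How epymorph performs that aggregation is not shown in this devlog. The sketch below is one plausible approach: group by the two-digit state prefix of each county FIPS code. The counts are made up for illustration.

```python
from collections import defaultdict

# County FIPS codes (the first two digits identify the state) with made-up counts.
county_deaths = {
    "04001": 3,   # Apache County, AZ
    "04013": 42,  # Maricopa County, AZ
    "08001": 7,   # Adams County, CO
    "08031": 11,  # Denver County, CO
}

# Sum county counts into their parent state, keyed by state FIPS code.
state_deaths: dict[str, int] = defaultdict(int)
for fips, count in county_deaths.items():
    state_deaths[fips[:2]] += count

print(dict(state_deaths))
```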

In [10]:
deaths = maker.make_adrio(geo_attrib("covid_deaths_county", int, Shapes.TxN),
                          county_scope, DateRange(date(2020, 1, 4), date(2024, 4, 5)))

In [11]:
print(f"COVID deaths:\n {deaths.get_value()[:3]}\n")

COVID deaths:
 [('2020-01-04', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
 ('2020-01-11', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
 ('2020-01-18', array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))]



### **Provisional COVID-19 Death Counts by Week Ending Date and State**
This dataset is used to fetch data on COVID-19 and influenza deaths. It is continuously updated but available only at state granularity.

- Supported attributes: covid_deaths_state, influenza_deaths
- Available date range: 1/4/2020 to present.
- Granularity: state

https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Week-Ending-D/r8kw-7aab/about_data

In [12]:
time_period = DateRange(date(2020, 1, 4), date(2024, 4, 5))

covid_deaths = maker.make_adrio(geo_attrib(
    "covid_deaths_state", int, Shapes.TxN), state_scope, time_period)
flu_deaths = maker.make_adrio(geo_attrib(
    "influenza_deaths", int, Shapes.TxN), state_scope, time_period)

In [13]:
print(f"COVID deaths:\n {covid_deaths.get_value()[:3]}\n...\n")
print(f"Influenza deaths:\n {flu_deaths.get_value()[:3]}\n...")

COVID deaths:
 [('2020-01-04', array([0, 0])) ('2020-01-11', array([0, 0]))
 ('2020-01-18', array([0, 0]))]
...

Influenza deaths:
 [('2020-01-04', array([0, 0])) ('2020-01-11', array([0, 0]))
 ('2020-01-18', array([11,  0]))]
...
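
Each printed result is a sequence of (date, values) pairs, one per time step, with one value per location. For plotting or further analysis it can be convenient to split these into a date array and a time-by-location matrix. The sketch below mimics that structure with a plain list of tuples copied from the influenza output above. The exact type returned by `get_value()` is not shown in this devlog, so treat the unpacking as an assumption.

```python
import numpy as np

# (date, values) pairs copied from the influenza deaths output above.
result = [
    ("2020-01-04", np.array([0, 0])),
    ("2020-01-11", np.array([0, 0])),
    ("2020-01-18", np.array([11, 0])),
]

# Separate the dates from the values, then stack the values into a T x N matrix.
dates = np.array([d for d, _ in result], dtype="datetime64[D]")
matrix = np.stack([values for _, values in result])

print(matrix.shape)
print(matrix.sum(axis=0))  # total per location over the three weeks
```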
