# 2024-07-03 CDC ADRIO Demo

_author: Trevor Johnson_

This devlog demonstrates the new 'CDC ADRIO maker', which fetches data from various CDC and HealthData datasets. Six datasets are currently included, each with its own limitations and set of supported attributes. The table below lists every attribute along with its limitations.

| Attribute Name | Dates | Granularity | Dataset | Description |
| --- | --- | --- | --- | --- |
| covid_cases_per_100k | 2/24/2022 - 5/4/2023 | County, State | United States COVID 19 Community Levels by County | Weekly number of COVID\-19 cases per 100k population. |
| covid_hospitalizations_per_100k | 2/24/2022 - 5/4/2023 | County, State | United States COVID 19 Community Levels by County | Weekly number of COVID\-19 hospitalizations per 100k population. |
| covid_hospitalization_avg_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly averages of COVID\-19 hospitalizations reported by facility. |
| covid_hospitalization_sum_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly sums of COVID\-19 hospitalizations reported by facility. |
| influenza_hospitalization_avg_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly averages of influenza hospitalizations reported by facility. |
| influenza_hospitalization_sum_facility | 12/13/2020 - 5/10/2023 | County, State | COVID-19 Reported Patient Impact and Hospital Capacity by Facility | Weekly sums of influenza hospitalizations reported by facility. |
| covid_hospitalization_avg_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly averages of COVID\-19 hospitalizations reported by state. |
| covid_hospitalization_sum_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly sums of COVID\-19 hospitalizations reported by state. |
| influenza_hospitalization_avg_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly averages of influenza hospitalizations reported by state. |
| influenza_hospitalization_sum_state | 1/04/2020 - present | State | Weekly United States Hospitalization Metrics by Jurisdiction | Weekly sums of influenza hospitalizations reported by state. |
| full_covid_vaccinations | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of individuals who have completed a series of COVID\-19 vaccinations. |
| one_dose_covid_vaccinations | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of individuals who have received at least one dose of COVID\-19 vaccination. |
| covid_booster_doses | 12/13/2020 - 5/10/2024 | County, State | COVID-19 Vaccinations in the United States,County | Weekly cumulative total of COVID\-19 booster doses administered. |
| covid_deaths_county | 1/4/2020 - 4/5/2024 | County, State | AH COVID-19 Death Counts by County and Week, 2020-present | Weekly number of COVID\-19 deaths reported by county.  |
| covid_deaths_state | 1/4/2020 - present | State | Provisional COVID-19 Death Counts by Week Ending Date and State | Weekly number of COVID\-19 deaths reported by state. |
| influenza_deaths | 1/4/2020 - present | State | Provisional COVID-19 Death Counts by Week Ending Date and State | Weekly number of influenza deaths reported by state. |

## **Demo**
ADRIO maker functionality will be demonstrated one dataset at a time. A brief description of each dataset will be given and one ADRIO for each of its attributes will be created and run.

In [1]:
from unittest.mock import Mock

import numpy as np

from epymorph.data_shape import SimDimensions
from epymorph.geography.us_census import CountyScope, StateScope
from epymorph.simulation import NamespacedAttributeResolver

county_scope = CountyScope.in_states(["04", "08"])
state_scope = StateScope.in_states(["04", "08"])

data = Mock(spec=NamespacedAttributeResolver)
dim = Mock(spec=SimDimensions)
rng = Mock(spec=np.random.Generator)

### **United States COVID 19 Community Levels by County**
This dataset is used to fetch data on reported COVID-19 cases and hospitalizations per 100k population.

- Supported attributes: covid_cases_per_100k, covid_hospitalizations_per_100k
- Available date range: 2/24/2022 to 5/4/2023
- Granularity: county, state

https://healthdata.gov/dataset/United-States-COVID-19-Community-Levels-by-County/nn5b-j5u9/about_data
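
Both attributes in this dataset are rates per 100k population rather than raw counts. As a reminder of what that scaling means, here is a minimal sketch of the arithmetic. The function name `per_100k` and the numbers are made up for illustration; the dataset already provides the rates, so the ADRIOs do not need to compute them this way.

```python
def per_100k(count: float, population: int) -> float:
    """Scale a raw count to a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Illustrative numbers only, not taken from the dataset.
rate = per_100k(count=250, population=500_000)
print(rate)
```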

In [2]:
from epymorph.adrio.cdc import CovidCasesPer100k, CovidHospitalizationsPer100k
from epymorph.simulation import TimeFrame

time_period = TimeFrame.range("2022-02-24", "2023-05-04")

cases = CovidCasesPer100k(time_period)
hospitalizations = CovidHospitalizationsPer100k(time_period)

In [3]:
print(
    f"COVID cases per 100k:\n {cases.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n"
)
print(
    f"COVID hospitalizations per 100k:\n {hospitalizations.evaluate_in_context(data, dim, state_scope, rng)[:3]}"
)

COVID cases per 100k:
 [[('2022-02-24',  3217.93) ('2022-02-24', 12475.51)]
 [('2022-03-03',  1018.81) ('2022-03-03',  8795.74)]
 [('2022-03-10',  1912.86) ('2022-03-10', 12280.08)]]

COVID hospitalizations per 100k:
 [[('2022-02-24', 334.7) ('2022-02-24', 822.5)]
 [('2022-03-03', 232.6) ('2022-03-03', 445.5)]
 [('2022-03-10', 186. ) ('2022-03-10', 332.5)]]
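
The printed results look like NumPy structured arrays in which every cell pairs a date with a value, with one row per week and one column per geographic node. The sketch below rebuilds the first rows of the cases output by hand and splits them into separate date and value arrays. The field names `date` and `value` are assumptions made for illustration and may not match what the ADRIOs actually return.

```python
import numpy as np

# Hypothetical dtype: the field names "date" and "value" are assumed,
# not taken from epymorph.
cell = np.dtype([("date", "datetime64[D]"), ("value", np.float64)])

# First three weeks of the cases output above, one column per state in scope.
cases_sample = np.array(
    [
        [("2022-02-24", 3217.93), ("2022-02-24", 12475.51)],
        [("2022-03-03", 1018.81), ("2022-03-03", 8795.74)],
        [("2022-03-10", 1912.86), ("2022-03-10", 12280.08)],
    ],
    dtype=cell,
)

dates = cases_sample["date"][:, 0]  # one date per weekly row
values = cases_sample["value"]      # plain float array, shape (weeks, nodes)

print(dates)
print(values.shape)
```

Splitting the fields this way gives a plain numeric array that can be passed to ordinary NumPy routines while the dates stay available for labeling.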


### **COVID-19 Reported Patient Impact and Hospital Capacity by Facility**
This dataset is used to fetch hospitalization data for COVID-19 and other respiratory illnesses.

- Supported attributes: covid_hospitalization_avg_facility, covid_hospitalization_sum_facility, influenza_hospitalization_avg_facility, influenza_hospitalization_sum_facility
- Available date range: 12/13/2020 to 5/10/2023
- Granularity: county, state

https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u/about_data


In [13]:
from epymorph.adrio.cdc import (
    CovidHospitalizationAvgFacility,
    CovidHospitalizationSumFacility,
    InfluenzaHospitalizationSumFacility,
    InfluenzaHosptializationAvgFacility,
)

time_period = TimeFrame.range("2020-12-13", "2023-05-10")

# data for these attributes contains many -999,999 "sentinel values" to suppress values < 4
# an additional integer parameter in range 0-3 is required to specify what to replace these with
covid_avg = CovidHospitalizationAvgFacility(time_period, 0)
covid_sum = CovidHospitalizationSumFacility(time_period, 0)
flu_avg = InfluenzaHosptializationAvgFacility(time_period, 0)
flu_sum = InfluenzaHospitalizationSumFacility(time_period, 0)
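
The sentinel handling described in the comments above can be pictured with a small NumPy sketch. It assumes the sentinel is stored as the number -999999 and mirrors the wording of the warning shown in the output below. The function name `replace_sentinel_values` is hypothetical, and this is not epymorph's actual implementation.

```python
import warnings

import numpy as np

SENTINEL = -999999  # assumed numeric encoding of the suppressed-value marker


def replace_sentinel_values(values: np.ndarray, replacement: int) -> np.ndarray:
    """Return a copy of `values` with sentinel entries set to `replacement`."""
    if not 0 <= replacement <= 3:
        raise ValueError("replacement must be an integer in the range 0-3")
    mask = values == SENTINEL
    num_sentinel = int(mask.sum())
    if num_sentinel > 0:
        warnings.warn(
            f"{num_sentinel} values < 4 were replaced with {replacement} in returned data."
        )
    return np.where(mask, replacement, values)


raw = np.array([12.0, SENTINEL, 7.0, SENTINEL])

# Capture the warning so it can be printed alongside the cleaned data.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cleaned = replace_sentinel_values(raw, 0)

print(cleaned)
print(caught[0].message)
```

Restricting the replacement to 0-3 keeps the substituted value inside the range the suppressed counts could actually have taken.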

In [10]:
print(
    f"COVID hospitalization average:\n {covid_avg.evaluate_in_context(data, dim, county_scope, rng)[:3]}\n"
)
print(
    f"COVID hospitalization sum:\n {covid_sum.evaluate_in_context(data, dim, county_scope, rng)[:3]}\n"
)
print(
    f"Influenza hospitalization average:\n {flu_avg.evaluate_in_context(data, dim, county_scope, rng)[:3]}\n"
)
print(
    f"Influenza hospitalization sum:\n {flu_sum.evaluate_in_context(data, dim, county_scope, rng)[:3]}"
)

  warn(f'{num_sentinel} values < 4 were replaced with {replace_sentinel} in returned data.')


COVID hospitalization average:
 [[('2020-12-13', 6.4000e+00) ('2020-12-13', 5.9400e+01)
  ('2020-12-13', 9.3200e+01) ('2020-12-13', 2.8800e+01)
  ('2020-12-13', 2.4800e+01) ('2020-12-13', 1.9600e+01)
  ('2020-12-13', 3.3203e+03) ('2020-12-13', 1.5520e+02)
  ('2020-12-13', 7.2900e+01) ('2020-12-13', 6.6820e+02)
  ('2020-12-13', 1.6560e+02) ('2020-12-13', 8.8000e+00)
  ('2020-12-13', 2.2100e+02) ('2020-12-13', 3.2320e+02)
  ('2020-12-13', 1.6060e+02) ('2020-12-13', 8.9000e+00)
  ('2020-12-13', 1.4430e+02) ('2020-12-13', 0.0000e+00)
  ('2020-12-13', 0.0000e+00) ('2020-12-13', 7.0700e+01)
  ('2020-12-13', 0.0000e+00) ('2020-12-13',        nan)
  ('2020-12-13',        nan) ('2020-12-13',        nan)
  ('2020-12-13',        nan) ('2020-12-13', 7.8900e+01)
  ('2020-12-13', 2.7400e+01) ('2020-12-13', 0.0000e+00)
  ('2020-12-13', 2.0890e+02) ('2020-12-13', 8.1000e+00)
  ('2020-12-13',        nan) ('2020-12-13', 0.0000e+00)
  ('2020-12-13', 0.0000e+00) ('2020-12-13',        nan)
  ('2020-12-13',

  warn(f'{num_sentinel} values < 4 were replaced with {replace_sentinel} in returned data.')


COVID hospitalization sum:
 [[('2020-12-13', 4.5000e+01) ('2020-12-13', 4.4800e+02)
  ('2020-12-13', 6.7800e+02) ('2020-12-13', 2.0100e+02)
  ('2020-12-13', 1.7400e+02) ('2020-12-13', 9.8000e+01)
  ('2020-12-13', 2.3289e+04) ('2020-12-13', 1.1010e+03)
  ('2020-12-13', 5.1000e+02) ('2020-12-13', 4.6850e+03)
  ('2020-12-13', 1.1580e+03) ('2020-12-13', 6.2000e+01)
  ('2020-12-13', 1.5460e+03) ('2020-12-13', 2.2620e+03)
  ('2020-12-13', 1.1240e+03) ('2020-12-13', 6.2000e+01)
  ('2020-12-13', 1.0100e+03) ('2020-12-13', 6.0000e+00)
  ('2020-12-13', 6.0000e+00) ('2020-12-13', 4.9400e+02)
  ('2020-12-13', 0.0000e+00) ('2020-12-13',        nan)
  ('2020-12-13',        nan) ('2020-12-13',        nan)
  ('2020-12-13',        nan) ('2020-12-13', 5.5200e+02)
  ('2020-12-13', 1.9200e+02) ('2020-12-13', 8.0000e+00)
  ('2020-12-13', 1.4790e+03) ('2020-12-13', 5.7000e+01)
  ('2020-12-13',        nan) ('2020-12-13', 0.0000e+00)
  ('2020-12-13', 5.0000e+00) ('2020-12-13',        nan)
  ('2020-12-13', 2.3

  warn(f'{num_sentinel} values < 4 were replaced with {replace_sentinel} in returned data.')


Influenza hospitalization average:
 [[('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', 76.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', nan) ('2020-12-13', nan) ('2020-12-13', nan)
  ('2020-12-13', nan) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', nan) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', nan) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', nan) ('2020-12-13', nan) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13', nan) ('2020-12-13',  0.)
  ('2020-12-13',  0.) ('2020-12-13',  0.) ('2020-12-13',  0.)
  ('2020-12-13', nan) ('2020-12-13

  warn(f'{num_sentinel} values < 4 were replaced with {replace_sentinel} in returned data.')


### **Weekly United States Hospitalization Metrics by Jurisdiction**
Like the previous dataset, this one is used to fetch hospitalization data for COVID-19 and other respiratory illnesses. Unlike the previous dataset, however, it includes metrics reported voluntarily after the end of the mandatory reporting period, and it is limited to state granularity.

- Supported attributes: covid_hospitalization_avg_state, covid_hospitalization_sum_state, influenza_hospitalization_avg_state, influenza_hospitalization_sum_state
- Available date range: 1/04/2020 to present. Data is reported voluntarily after 5/1/2024.
- Granularity: state

https://data.cdc.gov/Public-Health-Surveillance/Weekly-United-States-Hospitalization-Metrics-by-Ju/aemt-mg7g/about_data

In [6]:
from epymorph.adrio.cdc import (
    CovidHospitalizationAvgState,
    CovidHospitalizationSumState,
    InfluenzaHospitalizationAvgState,
    InfluenzaHospitalizationSumState,
)

time_period = TimeFrame.range("2020-12-13", "2024-06-28")

covid_avg = CovidHospitalizationAvgState(time_period)
covid_sum = CovidHospitalizationSumState(time_period)
flu_avg = InfluenzaHospitalizationAvgState(time_period)
flu_sum = InfluenzaHospitalizationSumState(time_period)

In [7]:
print(
    f"COVID hospitalization average:\n {covid_avg.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n...\n"
)
print(
    f"COVID hospitalization sum:\n {covid_sum.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n...\n"
)
print(
    f"Influenza hospitalization average:\n {flu_avg.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n...\n"
)
print(
    f"Influenza hospitalization sum:\n {flu_sum.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n..."
)

  warn("State level hospitalization data is voluntary past 5/1/2024.")


COVID hospitalization average:
 [[('2020-12-19', 452.) ('2020-12-19', 199.)]
 [('2020-12-26', 472.) ('2020-12-26', 164.)]
 [('2021-01-02', 495.) ('2021-01-02', 150.)]]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


COVID hospitalization sum:
 [[('2020-12-19', 3164.) ('2020-12-19', 1396.)]
 [('2020-12-26', 3307.) ('2020-12-26', 1148.)]
 [('2021-01-02', 3465.) ('2021-01-02', 1051.)]]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


Influenza hospitalization average:
 [[('2020-12-19', 2.) ('2020-12-19', 1.)]
 [('2020-12-26', 1.) ('2020-12-26', 1.)]
 [('2021-01-02', 6.) ('2021-01-02', 0.)]]
...



  warn("State level hospitalization data is voluntary past 5/1/2024.")


Influenza hospitalization sum:
 [[('2020-12-19', 11.) ('2020-12-19',  5.)]
 [('2020-12-26',  8.) ('2020-12-26',  5.)]
 [('2021-01-02', 44.) ('2021-01-02',  0.)]]
...
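
In the sample output above, each weekly sum is roughly seven times the corresponding average (for example, 452 times 7 is 3164). That is consistent with the average being a daily mean taken over the week. The sketch below shows the relationship on made-up daily counts. It illustrates the arithmetic only and is not the dataset's documented aggregation method.

```python
from statistics import mean

# Hypothetical daily admissions for one state over one week.
daily = [450, 455, 448, 460, 452, 449, 450]

weekly_sum = sum(daily)   # total for the week
weekly_avg = mean(daily)  # daily mean over the same seven days

print(weekly_sum, weekly_avg)
```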


### **COVID-19 Vaccinations in the United States,County**
This dataset is used to fetch cumulative COVID-19 vaccination data.

- Supported attributes: full_covid_vaccinations, one_dose_covid_vaccinations, covid_booster_doses
- Available date range: 12/13/2020 to 5/10/2024.
- Granularity: county, state

https://data.cdc.gov/Vaccinations/COVID-19-Vaccinations-in-the-United-States-County/8xkx-amqh/about_data
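
Because these attributes are cumulative totals, the number of new vaccinations in a given week is the difference between consecutive values. Here is a minimal sketch of that conversion, assuming the values for one location are already in date order. The helper name `weekly_increments` and the sample totals are illustrative only.

```python
def weekly_increments(cumulative: list[float]) -> list[float]:
    """Difference consecutive cumulative totals to get per-week increases."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Illustrative cumulative totals for one location, oldest first.
totals = [61621.0, 61900.0, 62350.0]
print(weekly_increments(totals))
```

The result has one fewer entry than the input, since the first week has no earlier total to subtract.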

In [8]:
from epymorph.adrio.cdc import (
    CovidBoosterDoses,
    FullCovidVaccinations,
    OneDoseCovidVaccinations,
)

time_period = TimeFrame.range("2021-12-13", "2024-05-10")

full = FullCovidVaccinations(time_period)
one = OneDoseCovidVaccinations(time_period)
booster = CovidBoosterDoses(time_period)

In [9]:
print(
    f"Full COVID vaccinations:\n {full.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n"
)
print(
    f"One dose COVID vaccinations:\n {one.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n"
)
print(
    f"COVID booster doses:\n {booster.evaluate_in_context(data, dim, state_scope, rng)[:3]}"
)

Full COVID vaccinations:
 [[('2021-12-13', 6.162100e+04) ('2021-12-13', 7.317100e+04)
  ('2021-12-13', 1.030980e+05) ('2021-12-13', 2.951500e+04)
  ('2021-12-13', 2.237200e+04) ('2021-12-13', 3.782000e+03)
  ('2021-12-13', 9.534000e+03) ('2021-12-13', 2.357961e+06)
  ('2021-12-13', 8.183000e+04) ('2021-12-13', 7.401400e+04)
  ('2021-12-13', 6.569600e+05) ('2021-12-13', 2.272790e+05)
  ('2021-12-13', 4.503200e+04) ('2021-12-13', 1.025850e+05)
  ('2021-12-13', 1.335930e+05) ('2021-12-13', 3.243510e+05)
  ('2021-12-13', 8.528000e+03) ('2021-12-13', 4.121960e+05)
  ('2021-12-13', 7.695000e+03) ('2021-12-13', 1.361000e+03)
  ('2021-12-13', 1.385000e+03) ('2021-12-13', 2.414870e+05)
  ('2021-12-13', 5.366000e+04) ('2021-12-13', 1.248700e+04)
  ('2021-12-13', 5.700000e+02) ('2021-12-13', 4.526000e+03)
  ('2021-12-13', 3.702000e+03) ('2021-12-13', 2.228000e+03)
  ('2021-12-13', 1.245000e+03) ('2021-12-13', 2.302000e+03)
  ('2021-12-13', 1.400900e+04) ('2021-12-13', 5.161470e+05)
  ('2021-12-13

### **AH COVID-19 Death Counts by County and Week, 2020-present**
This dataset is used to fetch data on COVID-19 deaths.

- Supported attributes: covid_deaths_county
- Available date range: 1/4/2020 to 4/5/2024.
- Granularity: county, state

https://data.cdc.gov/NCHS/AH-COVID-19-Death-Counts-by-County-and-Week-2020-p/ite7-j2w7/about_data
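
This dataset reports by county, yet the demo below evaluates it with a state scope. One way to picture that roll-up is to group counties by the first two digits of their five-digit FIPS code, which identify the state. The sketch below does exactly that. The helper name `sum_by_state` and the sample values are hypothetical, and epymorph may aggregate differently.

```python
from collections import defaultdict


def sum_by_state(county_values: dict[str, float]) -> dict[str, float]:
    """Sum county-level values into totals keyed by two-digit state FIPS."""
    totals: dict[str, float] = defaultdict(float)
    for county_fips, value in county_values.items():
        totals[county_fips[:2]] += value  # first two digits identify the state
    return dict(totals)


# Illustrative values for two Arizona counties and one Colorado county.
sample = {"04013": 10.0, "04019": 5.0, "08031": 3.0}
print(sum_by_state(sample))
```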

In [10]:
from epymorph.adrio.cdc import CovidDeathsCounty

deaths = CovidDeathsCounty(TimeFrame.range("2021-01-04", "2024-04-05"))

In [11]:
print(
    f"COVID deaths:\n {deaths.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n"
)

COVID deaths:
 [[('2021-01-09', 921.) ('2021-01-09', 180.)]
 [('2021-01-16', 960.) ('2021-01-16', 125.)]
 [('2021-01-23', 926.) ('2021-01-23',  97.)]]



### **Provisional COVID-19 Death Counts by Week Ending Date and State**
This dataset is used to fetch data on COVID-19 and influenza deaths. It is continuously updated but only available for state granularity.

- Supported attributes: covid_deaths_state, influenza_deaths
- Available date range: 1/4/2020 to present.
- Granularity: state

https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Week-Ending-D/r8kw-7aab/about_data

In [12]:
from epymorph.adrio.cdc import CovidDeathsState, InfluenzaDeathsState

time_period = TimeFrame.range("2021-01-04", "2024-04-05")

covid_deaths = CovidDeathsState(time_period)
flu_deaths = InfluenzaDeathsState(time_period)

In [13]:
print(
    f"COVID deaths:\n {covid_deaths.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n...\n"
)
print(
    f"Influenza deaths:\n {flu_deaths.evaluate_in_context(data, dim, state_scope, rng)[:3]}\n..."
)

COVID deaths:
 [[('2021-01-09', 942.) ('2021-01-09', 211.)]
 [('2021-01-16', 996.) ('2021-01-16', 165.)]
 [('2021-01-23', 959.) ('2021-01-23', 158.)]]
...

Influenza deaths:
 [[('2021-01-09', 0.) ('2021-01-09', 0.)]
 [('2021-01-16', 0.) ('2021-01-16', 0.)]
 [('2021-01-23', 0.) ('2021-01-23', 0.)]]
...
