# Coronavirus Data

#### A collection of datasets published in articles, academic papers, government websites, and, where available, data aggregators.

To understand coronavirus better, I've aggregated data on confirmed cases, tests, and deaths. Where possible, I use data broken down by age and comorbidity.

The goal is to understand which groups are most at risk, and to predict who would die under different scenarios, such as the one I lean toward: protect the vulnerable, reopen the economy, and attain herd immunity.

What is seroprevalence? An estimate of the percentage of people who have been infected, obtained by testing for coronavirus antibodies. See https://en.wikipedia.org/wiki/Seroprevalence


## The COVID Tracking Project

https://covidtracking.com/

Provides daily CSV data for every state: tests, cases, deaths, hospitalizations, patients in ICU, and patients on ventilators. Some states may not report every field. Includes detailed notes on each state's data, including a data quality grade. Arizona scores an A+.

Covidtracking has an awesome spreadsheet detailing how well states are reporting data:
https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vRL2zG1o-qj9l2sl19d1lj1oHd6WbkJ0ukFwN04a_ms_ANUdgxTMpI7AF-gbQzwOSreJUDx6PEK7Vnq/pubhtml.

See also https://covidtracking.com/about-data. 


## The US Census Bureau


County-level population data by age group can be found on this page: https://www.census.gov/data/tables/time-series/demo/popest/2010s-counties-detail.html.

The CSV link is: https://www2.census.gov/programs-surveys/popest/datasets/2010-2018/counties/asrh/cc-est2018-alldata.csv

The file `data\cc-est2018-alldata.csv` contains 2018 estimates of county population by age band. It is used to estimate infections (i.e. cases) by age band, so that infection fatality rate estimates can be broken down by age band and state.
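A minimal sketch of how the census file might be collapsed into wider age bands per state. The column names (`STNAME`, `YEAR`, `AGEGRP`, `TOT_POP`) and the codes (`YEAR` 11 for the 2018 estimate, `AGEGRP` 0 for the all-ages total, 1 to 18 for five-year bands) are my reading of the file layout and should be checked against the Census documentation. The band boundaries, the function names, and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Assumed codes from the Census file layout: YEAR 11 is the 2018 estimate,
# AGEGRP 0 is the all-ages total, and AGEGRP 1..18 are five-year bands
# (1 = 0-4, 2 = 5-9, ..., 18 = 85 and over).
YEAR_2018 = "11"


def band_label(agegrp: int) -> str:
    """Map a five-year AGEGRP code to a wider band (hypothetical boundaries)."""
    lower = (agegrp - 1) * 5  # lower edge of the five-year band
    if lower < 20:
        return "0-19"
    if lower >= 80:
        return "80+"
    decade = lower // 10 * 10
    return f"{decade}-{decade + 9}"


def population_by_state_and_band(rows):
    """Sum TOT_POP over counties for each (state, age band) pair."""
    totals = defaultdict(int)
    for row in rows:
        agegrp = int(row["AGEGRP"])
        if row["YEAR"] != YEAR_2018 or agegrp == 0:
            continue  # skip other years and the all-ages total row
        totals[(row["STNAME"], band_label(agegrp))] += int(row["TOT_POP"])
    return dict(totals)


# Toy rows for illustration only; these are not real census figures.
SAMPLE = """STNAME,CTYNAME,YEAR,AGEGRP,TOT_POP
Michigan,County A,11,0,300
Michigan,County A,11,1,100
Michigan,County A,11,5,120
Michigan,County A,11,18,80
Michigan,County B,11,6,50
Michigan,County A,10,1,999
"""

populations = population_by_state_and_band(csv.DictReader(io.StringIO(SAMPLE)))
print(populations)
```

For the real file, passing `csv.DictReader(open(path, encoding="latin-1", newline=""))` to the same function should work; the encoding is an assumption on my part.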


## Michigan



The Michigan state landing page is https://www.michigan.gov/coronavirus/.
Data: https://www.michigan.gov/coronavirus/0,9753,7-406-98163_98173---,00.html.

- By long-term care (LTC) facility? Cases and deaths by LTC.
- By age? Cases and deaths by age.
- By comorbidity?
- By county? Confirmed cases and reported deaths by county.


## Massachusetts

- Data Landing Page: https://www.mass.gov/info-details/covid-19-response-reporting
- Raw data is a zip file containing many CSVs
- LTC data
- Age data
- Nice data dashboard (a daily PDF with many charts and tables)

https://www.mass.gov/doc/covid-19-dashboard-may-2-2020/download

Data from the 2020-05-02 report:

<pre>
Age Group,cases,cases-per-100000,hospitalizations,hospitalizations-per-100000,deaths,deaths-per-100000
0-19,2130,133,32,2,0,0
20-29,8159,788,143,14,2,0
30-39,9272,1072,302,33,11,1
40-49,9580,1124,466,55,33,4
50-59,11075,1140,864,89,133,14
60-69,8966,1083,1172,142,384,46
70-79,6337,1320,1371,286,853,178
80+,10021,3438,2026,695,2430,834
</pre>
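To make the age gradient in the table above easier to see, here is a small sketch that computes the crude case fatality rate (deaths divided by confirmed cases) per age group. The function and variable names are hypothetical. Note that this is a ratio over confirmed cases only, not an infection fatality rate, since undetected infections are not in the denominator.

```python
import csv
import io

# Cases and deaths copied from the 2020-05-02 Massachusetts table above.
MA_TABLE = """Age Group,cases,deaths
0-19,2130,0
20-29,8159,2
30-39,9272,11
40-49,9580,33
50-59,11075,133
60-69,8966,384
70-79,6337,853
80+,10021,2430
"""


def crude_case_fatality(table_text):
    """Return deaths / confirmed cases for each age group."""
    reader = csv.DictReader(io.StringIO(table_text))
    return {row["Age Group"]: int(row["deaths"]) / int(row["cases"]) for row in reader}


ma_cfr = crude_case_fatality(MA_TABLE)
for age_group, rate in ma_cfr.items():
    print(f"{age_group:>6}: {rate:.2%}")
```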


## California

### Santa Clara County CA

####  Seroprevalence Studies

Source: https://www.medrxiv.org/content/10.1101/2020.04.14.20062463v2

From the source:

- April 3-4, 2020, we tested county residents for antibodies to SARS-CoV-2 using a lateral flow immunoassay.
- The raw prevalence of antibodies to SARS-CoV-2 in our sample was 1.5% (exact binomial 95CI 1.1-2.0%).
- After weighting for population demographics of Santa Clara County, the prevalence was 2.8% (95CI 1.3-4.7%)
- These prevalence point estimates imply that 54,000 (95CI 25,000 to 91,000 using weighted prevalence; 23,000 with 95CI 14,000-35,000 using unweighted prevalence) people were infected in Santa Clara County by early April, many more than the approximately 1,000 confirmed cases at the time of the survey.


### Los Angeles County CA

####  Seroprevalence Studies

Source: http://publichealth.lacounty.gov/phcommon/public/media/mediapubhpdetail.cfm?prid=2328

From the source:

- Article date: April 20, 2020
- the research team estimates that approximately 4.1% of the county's adult population has antibody to the virus. Adjusting this estimate for statistical margin of error implies about 2.8% to 5.6% of the county's adult population has antibody to the virus
- i.e. approximately 221,000 to 442,000 adults in the county who have had the infection.
- That estimate is 28 to 55 times higher than the 7,994 confirmed cases of COVID-19 reported to the county by the time of the study in early April. The number of COVID-related deaths in the county has now surpassed 600.
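
The "28 to 55 times" figure quoted above can be reproduced from the other numbers in the article. A minimal sketch, with a hypothetical helper name:

```python
CONFIRMED_CASES = 7_994    # confirmed cases reported to the county by early April
INFECTED_LOW = 221_000     # lower estimate of adults who have had the infection
INFECTED_HIGH = 442_000    # upper estimate of adults who have had the infection


def undercount_multiplier(estimated_infections, confirmed_cases):
    """How many estimated infections there are per confirmed case."""
    return estimated_infections / confirmed_cases


la_low = undercount_multiplier(INFECTED_LOW, CONFIRMED_CASES)
la_high = undercount_multiplier(INFECTED_HIGH, CONFIRMED_CASES)
print(f"{la_low:.1f}x to {la_high:.1f}x")
```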

LA county statistics: http://publichealth.lacounty.gov/media/Coronavirus/locations.htm

#### Deaths in residential facilities

http://publichealth.lacounty.gov/media/Coronavirus/locations.htm

#### Cases by age

http://publichealth.lacounty.gov/media/Coronavirus/locations.htm

As of 2020-05-02:
<pre>
Age Group (Los Angeles County cases only; excludes Long Beach and Pasadena),Cases
0 to 17,634
18 to 40,7917
41 to 65,10029
Over 65,5124
Under Investigation,70
</pre>

## Florida

### Miami-Dade County FL
#### Seroprevalence Studies

Source: https://www.miamidade.gov/releases/2020-04-24-sample-testing-results.asp

From the source:

- Article date: April 24, 2020
- County population: 2.75 million residents.
- To date, nearly 1,800 individuals have participated in this program.
- 6% of participants tested positive for COVID-19 antibodies, which equates to 165,000 Miami-Dade County residents. This figure directly contrasts with testing site data, which indicated that there were 10,000 positive cases, suggesting that the actual number of infections is potentially 16.5 times the number of those captured through testing sites and local hospitals alone. Using statistical methods that account for the limitations of the test (sensitivity and specificity), we are 95% certain that the true amount of infection lies between 4.4% and 7.9% of the population, or between 123,000 and 221,000 residents.
- of the individuals who tested positive for the antibodies each week, more than half had NO symptoms in the seven to fourteen days prior to screening.
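
The arithmetic behind the figures above is prevalence times population, then divided by confirmed cases. A minimal sketch using only numbers quoted in the release; the helper name is hypothetical:

```python
POPULATION = 2_750_000    # county population quoted in the release
CONFIRMED_CASES = 10_000  # positive cases from testing sites


def implied_infections(prevalence, population):
    """Scale a sample prevalence (as a fraction) up to the whole population."""
    return prevalence * population


md_point = implied_infections(0.06, POPULATION)   # 6% point estimate
md_low = implied_infections(0.044, POPULATION)    # lower bound, 4.4%
md_high = implied_infections(0.079, POPULATION)   # upper bound, 7.9%
md_multiplier = md_point / CONFIRMED_CASES

print(f"point estimate: {md_point:,.0f} ({md_multiplier:.1f}x confirmed cases)")
print(f"interval: {md_low:,.0f} to {md_high:,.0f}")
```

The point estimate and the multiplier match the release. The interval endpoints computed this way come out slightly below the quoted 123,000 and 221,000, which suggests the release used a somewhat larger population figure for those; that is my inference, not something the source states.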

In late April 2020, several antibody seroprevalence studies were conducted in the United States: one each in Santa Clara County (CA), Los Angeles County (CA), Miami-Dade County (FL), and New York State.

## New York

####  Seroprevalence Studies

Source: https://www.governor.ny.gov/news/amid-ongoing-covid-19-pandemic-governor-cuomo-announces-results-completed-antibody-testing

- Date of press release: 2020-05-02
- 12.3 percent of the population have COVID-19 antibodies.
- The survey developed a baseline infection rate by testing 15,000 people at grocery stores and community centers across the state over the past two weeks.
- Finally, the Governor confirmed 4,663 additional cases of novel coronavirus, bringing the statewide total to 312,977 confirmed cases in New York State.
- Regional breakdown:
<pre>
Region,Percent Positive
Capital District,2.2%
Central NY,1.9%
Finger Lakes,2.6%
Hudson Valley (Without Westchester/Rockland),3%
Long Island,11.4%
Mohawk Valley,2.7%
North Country,1.2%
NYC,19.9%
Southern Tier,2.4%
Westchester/Rockland,13.8%
Western NY,6%
</pre>

#### Age given cases (New York City)

https://www1.nyc.gov/assets/doh/downloads/pdf/imm/covid-19-daily-data-summary-05022020-1.pdf

Data from 2020-05-01:
<pre>
Age Group,Cases,Percent of Cases
0 to 17,3897,2%
18 to 44,61376,37%
45 to 64,61424,37%
65 to 74,20813,12%
75 and over,19045,11%
Unknown,328
</pre>

#### Age given death (New York City)

Data from 2020-05-01:
<pre>
Age Group,Deaths,Percent of Deaths
0 to 17,6,0.05%
18 to 44,534,4.06%
45 to 64,2937,22.32%
65 to 74,3260,24.78%
75 and over,6417,48.78%
Unknown,2,0.02%
</pre>
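
Combining this table with the New York City case counts by age gives a crude case fatality rate per age group. This is a sketch under the assumption that the case and death counts cover the same population and date; the variable names are hypothetical, and the "Unknown" rows are left out.

```python
# Confirmed cases by age, from the New York City case table (2020-05-01).
NYC_CASES = {
    "0 to 17": 3897,
    "18 to 44": 61376,
    "45 to 64": 61424,
    "65 to 74": 20813,
    "75 and over": 19045,
}

# Deaths by age, from the New York City death table (2020-05-01).
NYC_DEATHS = {
    "0 to 17": 6,
    "18 to 44": 534,
    "45 to 64": 2937,
    "65 to 74": 3260,
    "75 and over": 6417,
}

# Crude case fatality rate: deaths divided by confirmed cases in the same band.
nyc_cfr = {age: NYC_DEATHS[age] / NYC_CASES[age] for age in NYC_CASES}

for age, rate in nyc_cfr.items():
    print(f"{age:>12}: {rate:.2%}")
```

As with any ratio over confirmed cases, this overstates the fatality rate among all infections if many infections were never tested.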

#### Underlying condition given death (New York City)

Data from 2020-05-01:
<pre>
Underlying Condition,deaths,percent(excluding unknown)
yes,10025,99.2%
no,81,0.8%
unknown,3050
</pre>
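
The percentages in this table exclude deaths where the underlying-condition status is unknown. A short sketch of that calculation, which also shows how large the excluded group is; the variable names are hypothetical:

```python
# Deaths by underlying-condition status, from the table above (2020-05-01).
uc_deaths = {"yes": 10025, "no": 81, "unknown": 3050}

known = uc_deaths["yes"] + uc_deaths["no"]   # deaths with a known status
total = sum(uc_deaths.values())              # all deaths, including unknown

pct_yes = 100 * uc_deaths["yes"] / known
pct_no = 100 * uc_deaths["no"] / known
pct_unknown = 100 * uc_deaths["unknown"] / total

print(f"with underlying condition:    {pct_yes:.1f}% of known")
print(f"without underlying condition: {pct_no:.1f}% of known")
print(f"status unknown:               {pct_unknown:.1f}% of all deaths")
```

Roughly a quarter of deaths have unknown status, so the 99.2% figure describes only the deaths for which the status was known.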

#### Underlying condition and age (New York City)

Data from 2020-05-01:
<pre>
Age Group,Underlying Conditions,No Underlying Conditions,Underlying Conditions Unknown,Total
0 to 17,6,0,0,6
18 to 44,431,14,89,534
45 to 64,2532,60,345,2937
65 to 74,2470,5,785,3260
75 and over,4586,2,1829,6417
Unknown,0,0,2,2
</pre>
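
A quick consistency check on this table, followed by the share of deaths without an underlying condition among deaths with known status, per age group. This is only a sketch; the names are hypothetical and the choice to exclude unknowns from the denominator mirrors the previous table.

```python
# (underlying, no underlying, unknown, total) per age group, from the table above.
AGE_ROWS = {
    "0 to 17": (6, 0, 0, 6),
    "18 to 44": (431, 14, 89, 534),
    "45 to 64": (2532, 60, 345, 2937),
    "65 to 74": (2470, 5, 785, 3260),
    "75 and over": (4586, 2, 1829, 6417),
    "Unknown": (0, 0, 2, 2),
}

# Each row's three status counts should add up to its stated total.
rows_consistent = all(
    with_uc + without_uc + unknown == total
    for with_uc, without_uc, unknown, total in AGE_ROWS.values()
)

# Share without an underlying condition, among deaths with known status.
# Rows with no known-status deaths are skipped to avoid dividing by zero.
share_without = {
    age: without_uc / (with_uc + without_uc)
    for age, (with_uc, without_uc, _unknown, _total) in AGE_ROWS.items()
    if with_uc + without_uc > 0
}

print("rows consistent:", rows_consistent)
for age, share in share_without.items():
    print(f"{age:>12}: {share:.2%}")
```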


## Georgia

## Colorado