# File names
- ./data/time_series_covid19_confirmed_HOU.csv - COVID-19 Confirmed Cases data
- ./data/time_series_covid19_death_HOU.csv - COVID-19 Deceased Cases data
- ./data/mask_use_HOU.csv - COVID-19 Mask Usage Survey
- ./data/time_series_covid19_hosp_TX.csv - COVID-19 statewide hospitalization data for Texas (CSSE)
- ./data/Texas COVID-19 Hospitalizations over Time by TSA Region.xlsx - COVID-19 hospitalization data by TSA Region
- ./data/Texas Hospital Capacity over Time by TSA Region.xlsx - COVID-19 hospital capacity data by TSA Region
- ./data/UID_ISO_FIPS_LookUp_Table.csv - County FIPS and population data

## COVID-19 confirmed cases data
The confirmed cases data contains cumulative confirmed case counts for 9 counties in Greater Houston between 03/01/2020 and 08/15/2020. Each row also includes the county's latitude, longitude, and FIPS code; the FIPS code can serve as a foreign key for joining the mask survey data.
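
Because the counts are cumulative, daily new cases have to be derived by differencing. The following is a minimal sketch, assuming the wide layout of the confirmed cases file (one row per county, one column per date). The toy frame copies a few values for two counties, and the names `wide`, `long_df`, and `new_cases` are illustrative.

```python
import pandas as pd

# Toy frame mimicking the wide layout: one row per county, one column per date.
# Values are copied from the last three date columns for Austin and Harris.
wide = pd.DataFrame({
    "FIPS": [48015.0, 48201.0],
    "Admin2": ["Austin", "Harris"],
    "8/22/20": [371, 97745],
    "8/23/20": [391, 98506],
    "8/24/20": [398, 99290],
})

id_cols = ["FIPS", "Admin2"]
date_cols = [c for c in wide.columns if c not in id_cols]

# Reshape to long format: one row per (county, date).
long_df = wide.melt(id_vars=id_cols, value_vars=date_cols,
                    var_name="date", value_name="cumulative")
long_df["date"] = pd.to_datetime(long_df["date"], format="%m/%d/%y")
long_df = long_df.sort_values(["FIPS", "date"]).reset_index(drop=True)

# Daily new cases are the first difference of the cumulative count per county.
# The first date of each county has no predecessor, so its value is NaN.
long_df["new_cases"] = long_df.groupby("FIPS")["cumulative"].diff()
print(long_df)
```

The long format is also usually easier to plot with seaborn than the wide one.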

## COVID-19 deceased cases data
The deceased cases data contains cumulative death counts for 9 counties in Greater Houston between 03/01/2020 and 08/15/2020. Each row also includes the county's latitude, longitude, and FIPS code; the FIPS code can serve as a foreign key for joining the mask survey data.
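
One way to use FIPS as a join key is sketched below. It assumes that the time series files store `FIPS` as a float (for example `48015.0`) while the mask survey stores the county code as an integer in `COUNTYFP`, so the key is cast before merging. The toy values are copied from the confirmed cases and mask survey tables; `cases`, `mask`, and `merged` are illustrative names.

```python
import pandas as pd

# Two counties from the time series table (FIPS is read as float).
cases = pd.DataFrame({
    "FIPS": [48015.0, 48201.0],
    "Admin2": ["Austin", "Harris"],
    "8/24/20": [398, 99290],
})

# Matching rows from the mask survey (county code is an integer).
mask = pd.DataFrame({
    "COUNTYFP": [48015, 48201],
    "ALWAYS": [0.692, 0.736],
})

# Cast the float key to integer so both sides share one dtype.
cases["FIPS"] = cases["FIPS"].astype("int64")

# validate="one_to_one" raises if either side has duplicate keys.
merged = cases.merge(mask, left_on="FIPS", right_on="COUNTYFP",
                     how="left", validate="one_to_one")
print(merged[["Admin2", "8/24/20", "ALWAYS"]])
```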

## COVID-19 mask usage survey
The COVID-19 mask usage survey was conducted by The New York Times to estimate mask usage by county in the United States. The data comes from over 250,000 online interviews conducted between 07/02/2020 and 07/14/2020. Each interview asked how often the participant wears a mask in public when they expect to be within six feet of another person.
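
The survey file has one column per answer (`NEVER`, `RARELY`, `SOMETIMES`, `FREQUENTLY`, `ALWAYS`). Assuming each value is the estimated share of respondents in that county giving that answer, the five columns should sum to roughly 1 per row. The sketch below checks this on three rows copied from the file and derives a combined share; `FREQ_OR_ALWAYS` is a made-up column name for illustration.

```python
import numpy as np
import pandas as pd

# Three rows copied from the mask survey file (Austin, Chambers, Fort Bend).
mask = pd.DataFrame({
    "COUNTYFP": [48015, 48071, 48157],
    "NEVER": [0.006, 0.027, 0.024],
    "RARELY": [0.024, 0.094, 0.030],
    "SOMETIMES": [0.054, 0.078, 0.057],
    "FREQUENTLY": [0.224, 0.201, 0.115],
    "ALWAYS": [0.692, 0.600, 0.774],
})

answer_cols = ["NEVER", "RARELY", "SOMETIMES", "FREQUENTLY", "ALWAYS"]

# Sanity check: shares per county should add up to about 1,
# allowing a small tolerance for rounding to three decimals.
row_sums = mask[answer_cols].sum(axis=1)
shares_ok = np.isclose(row_sums, 1.0, atol=0.005).all()
print("Shares sum to ~1:", shares_ok)

# Combined share of respondents who wear a mask frequently or always.
mask["FREQ_OR_ALWAYS"] = mask["FREQUENTLY"] + mask["ALWAYS"]
print(mask[["COUNTYFP", "FREQ_OR_ALWAYS"]])
```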

## COVID-19 hospitalization data
The Texas hospitalization data from CSSE includes the incident rate, estimated active cases, number of people tested, and, where available, the hospitalization rate. Some values are missing, such as the estimated number of people hospitalized and the hospitalization rate after reopening (04/28/2020). However, hospitalization counts and hospital capacity are available from the TSA Region data.
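
The extent of the missing values can be measured before deciding whether to fall back on the TSA Region data. The sketch below is one way to do this, using four rows copied from the CSSE file (two early rows with values, two later rows without); `missing_share` and `last_reported` are illustrative names.

```python
import numpy as np
import pandas as pd

# Four rows copied from the Texas CSSE file: two with hospitalization
# values reported and two where they are missing.
hosp = pd.DataFrame({
    "Last_Update": ["2020-04-12 23:18:15", "2020-04-13 23:07:54",
                    "2020-08-02 04:35:05", "2020-08-03 04:34:47"],
    "People_Hospitalized": [1338.0, 1176.0, np.nan, np.nan],
    "Hospitalization_Rate": [9.782847, 8.238179, np.nan, np.nan],
})
hosp["Last_Update"] = pd.to_datetime(hosp["Last_Update"])

# Fraction of rows with a missing value, per column.
cols = ["People_Hospitalized", "Hospitalization_Rate"]
missing_share = hosp[cols].isna().mean()
print(missing_share)

# Latest timestamp at which a hospitalization count was still reported.
last_reported = hosp.loc[hosp["People_Hospitalized"].notna(), "Last_Update"].max()
print("Last reported:", last_reported)
```

On the full file, the same two statistics show how much of the study period the CSSE hospitalization columns actually cover.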

## County FIPS and population data
The FIPS lookup table is used to look up county codes and populations. It is stored in a single CSV file.
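
A population lookup by FIPS code might look like the sketch below. The lookup table also contains country-level rows with no FIPS code, so those are dropped first. The rows are copied from the head and tail of the file rather than from the Houston counties, and `population_by_fips` and `wanted` are illustrative names.

```python
import numpy as np
import pandas as pd

# One country-level row (no FIPS) and two county rows copied from the table.
fips = pd.DataFrame({
    "FIPS": [np.nan, 56037.0, 56039.0],
    "Admin2": [None, "Sweetwater", "Teton"],
    "Combined_Key": ["Afghanistan", "Sweetwater, Wyoming, US",
                     "Teton, Wyoming, US"],
    "Population": [38928341.0, 42343.0, 23464.0],
})

# Keep only rows that have a FIPS code and store the code as an integer.
counties = fips.dropna(subset=["FIPS"]).copy()
counties["FIPS"] = counties["FIPS"].astype("int64")

# Series indexed by FIPS for direct population lookups.
population_by_fips = counties.set_index("FIPS")["Population"]

# Hypothetical list of county codes of interest.
wanted = [56037, 56039]
print(population_by_fips.loc[wanted])
```

The resulting series can be used to turn case counts into per-capita rates once the counts are keyed by the same integer FIPS code.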

In [14]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")

In [23]:
# File paths
import os

path_parent = os.getcwd()
print(path_parent)

# All input files live in the "Data Files" folder under the working directory.
data_dir = os.path.join(path_parent, 'Data Files')

# CSV inputs
confirmed_cases_path = os.path.join(data_dir, 'time_series_covid19_confirmed_HOU.csv')
death_cases_path = os.path.join(data_dir, 'time_series_covid19_death_HOU.csv')
mask_usage_path = os.path.join(data_dir, 'mask_use_HOU.csv')
hosp_tx_path = os.path.join(data_dir, 'time_series_covid19_hosp_TX.csv')
fips_path = os.path.join(data_dir, 'UID_ISO_FIPS_LookUp_Table.csv')

# Excel inputs (TSA Region data)
hosp_tx_xlsx_path = os.path.join(data_dir, 'Texas COVID-19 Hospitalizations over Time by TSA Region.xlsx')
hosp_capa_path = os.path.join(data_dir, 'Texas Hospital Capacity over Time by TSA Region.xlsx')



/Users/irpan/Desktop/Coding/COVID Datathon


In [16]:
%%time
df_confirmed_cases = pd.read_csv(confirmed_cases_path)
df_confirmed_cases

CPU times: user 34.4 ms, sys: 7.42 ms, total: 41.8 ms
Wall time: 35.9 ms


Unnamed: 0,UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,...,8/15/20,8/16/20,8/17/20,8/18/20,8/19/20,8/20/20,8/21/20,8/22/20,8/23/20,8/24/20
0,84048015,US,USA,840,48015.0,Austin,Texas,US,29.885487,-96.277369,...,328,331,332,335,344,352,368,371,391,398
1,84048039,US,USA,840,48039.0,Brazoria,Texas,US,29.187574,-95.445632,...,8025,8148,8233,8303,8367,8438,8514,8605,8662,8662
2,84048071,US,USA,840,48071.0,Chambers,Texas,US,29.70972,-94.671545,...,1028,1028,1028,1028,1054,1080,1080,1080,1080,1103
3,84048157,US,USA,840,48157.0,Fort Bend,Texas,US,29.527045,-95.772195,...,12228,12228,12623,13034,13605,13688,13794,14034,14034,14640
4,84048167,US,USA,840,48167.0,Galveston,Texas,US,29.401673,-94.904691,...,9937,9937,10033,10135,10212,10283,10308,10350,10350,10400
5,84048201,US,USA,840,48201.0,Harris,Texas,US,29.858649,-95.393395,...,91698,92253,92944,93872,94676,95631,96658,97745,98506,99290
6,84048291,US,USA,840,48291.0,Liberty,Texas,US,30.151527,-94.812056,...,1107,1115,1115,1132,1183,1206,1268,1287,1339,1376
7,84048339,US,USA,840,48339.0,Montgomery,Texas,US,30.300791,-95.505728,...,6957,6957,7160,7327,7533,7743,7915,7915,7915,8122
8,84048473,US,USA,840,48473.0,Waller,Texas,US,30.010584,-95.990118,...,521,526,526,531,550,576,579,608,625,635


In [17]:
%%time
df_death_cases = pd.read_csv(death_cases_path)
df_death_cases

CPU times: user 32.6 ms, sys: 4.82 ms, total: 37.5 ms
Wall time: 71.3 ms


Unnamed: 0,UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,...,8/15/20,8/16/20,8/17/20,8/18/20,8/19/20,8/20/20,8/21/20,8/22/20,8/23/20,8/24/20
0,84048015,US,USA,840,48015.0,Austin,Texas,US,29.885487,-96.277369,...,328,331,332,335,344,352,368,371,391,398
1,84048039,US,USA,840,48039.0,Brazoria,Texas,US,29.187574,-95.445632,...,8025,8148,8233,8303,8367,8438,8514,8605,8662,8662
2,84048071,US,USA,840,48071.0,Chambers,Texas,US,29.70972,-94.671545,...,1028,1028,1028,1028,1054,1080,1080,1080,1080,1103
3,84048157,US,USA,840,48157.0,Fort Bend,Texas,US,29.527045,-95.772195,...,12228,12228,12623,13034,13605,13688,13794,14034,14034,14640
4,84048167,US,USA,840,48167.0,Galveston,Texas,US,29.401673,-94.904691,...,9937,9937,10033,10135,10212,10283,10308,10350,10350,10400
5,84048201,US,USA,840,48201.0,Harris,Texas,US,29.858649,-95.393395,...,91698,92253,92944,93872,94676,95631,96658,97745,98506,99290
6,84048291,US,USA,840,48291.0,Liberty,Texas,US,30.151527,-94.812056,...,1107,1115,1115,1132,1183,1206,1268,1287,1339,1376
7,84048339,US,USA,840,48339.0,Montgomery,Texas,US,30.300791,-95.505728,...,6957,6957,7160,7327,7533,7743,7915,7915,7915,8122
8,84048473,US,USA,840,48473.0,Waller,Texas,US,30.010584,-95.990118,...,521,526,526,531,550,576,579,608,625,635


In [22]:
%%time
df_mask_usage = pd.read_csv(mask_usage_path)
df_mask_usage

CPU times: user 4.97 ms, sys: 2.64 ms, total: 7.61 ms
Wall time: 5.1 ms


Unnamed: 0,COUNTYFP,NEVER,RARELY,SOMETIMES,FREQUENTLY,ALWAYS
0,48015,0.006,0.024,0.054,0.224,0.692
1,48039,0.021,0.042,0.075,0.17,0.691
2,48071,0.027,0.094,0.078,0.201,0.6
3,48157,0.024,0.03,0.057,0.115,0.774
4,48167,0.033,0.037,0.105,0.207,0.617
5,48201,0.019,0.024,0.069,0.152,0.736
6,48291,0.019,0.034,0.114,0.182,0.65
7,48339,0.031,0.073,0.061,0.145,0.69
8,48473,0.016,0.06,0.07,0.236,0.618


In [19]:
%%time
df_hosp_tx = pd.read_csv(hosp_tx_path)
df_hosp_tx

CPU times: user 6.99 ms, sys: 2.11 ms, total: 9.1 ms
Wall time: 93.5 ms


Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,Recovered,Active,FIPS,Incident_Rate,People_Tested,People_Hospitalized,Mortality_Rate,UID,ISO3,Testing_Rate,Hospitalization_Rate
0,Texas,US,2020-04-12 23:18:15,31.0545,-97.5635,13677,283,2014.0,13394.0,48.0,59.505161,124533.0,1338.0,2.069167,84000048.0,USA,541.811523,9.782847
1,Texas,US,2020-04-13 23:07:54,31.0545,-97.5635,14275,305,2269.0,13970.0,48.0,62.106907,133226.0,1176.0,2.136602,84000048.0,USA,579.632563,8.238179
2,Texas,US,2020-04-14 23:33:31,31.0545,-97.5635,15006,342,2580.0,14664.0,48.0,65.287303,146467.0,1409.0,2.279088,84000048.0,USA,637.240798,9.389578
3,Texas,US,2020-04-15 22:56:51,31.0545,-97.5635,15907,375,3150.0,15532.0,48.0,69.207326,151810.0,1538.0,2.357453,84000048.0,USA,660.486837,9.668699
4,Texas,US,2020-04-16 23:30:51,31.0545,-97.5635,16876,414,3677.0,16462.0,48.0,73.423199,158547.0,1459.0,2.453188,84000048.0,USA,689.797817,8.645414
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
111,Texas,US,2020-08-02 04:35:05,31.0545,-97.5635,444738,6865,282604.0,155269.0,48.0,1533.797162,3747779.0,,1.543605,84000048.0,USA,12925.211688,
112,Texas,US,2020-08-03 04:34:47,31.0545,-97.5635,448145,6878,282604.0,158663.0,48.0,1545.547107,3747779.0,,1.534771,84000048.0,USA,12925.211688,
113,Texas,US,2020-08-04 04:42:12,31.0545,-97.5635,456624,7016,297422.0,152186.0,48.0,1574.789192,3834586.0,,1.536494,84000048.0,USA,13224.588692,
114,Texas,US,2020-08-05 04:34:56,31.0545,-97.5635,466032,7271,306262.0,152499.0,48.0,1607.235179,3884848.0,,1.560193,84000048.0,USA,13397.930554,


In [20]:
%%time
df_fips = pd.read_csv(fips_path)
df_fips

CPU times: user 16.5 ms, sys: 4 ms, total: 20.4 ms
Wall time: 135 ms


Unnamed: 0,UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,Combined_Key,Population
0,4,AF,AFG,4.0,,,,Afghanistan,33.939110,67.709953,Afghanistan,38928341.0
1,8,AL,ALB,8.0,,,,Albania,41.153300,20.168300,Albania,2877800.0
2,12,DZ,DZA,12.0,,,,Algeria,28.033900,1.659600,Algeria,43851043.0
3,20,AD,AND,20.0,,,,Andorra,42.506300,1.521800,Andorra,77265.0
4,24,AO,AGO,24.0,,,,Angola,-11.202700,17.873900,Angola,32866268.0
...,...,...,...,...,...,...,...,...,...,...,...,...
4148,84056037,US,USA,840.0,56037.0,Sweetwater,Wyoming,US,41.659439,-108.882788,"Sweetwater, Wyoming, US",42343.0
4149,84056039,US,USA,840.0,56039.0,Teton,Wyoming,US,43.935225,-110.589080,"Teton, Wyoming, US",23464.0
4150,84056041,US,USA,840.0,56041.0,Uinta,Wyoming,US,41.287818,-110.547578,"Uinta, Wyoming, US",20226.0
4151,84056043,US,USA,840.0,56043.0,Washakie,Wyoming,US,43.904516,-107.680187,"Washakie, Wyoming, US",7805.0


In [None]:
hosp_tx_xlsx_path