# <u> Covid-19 Data Exploration </u>

## <u> Objective: </u> 
### Assess the quality of the *case numbers* statistic as a basis for controlling response measures.
Assume that the ultimate goal of response measures is to prevent hospital capacities from being exhausted.\
Thus, investigate whether there were scenarios in which an increase in hospital admissions was not indicated
by an increase in case numbers.
Furthermore, investigate the behaviour of hospital admissions and deaths after a substantial increase in testing.

## <u> Method: </u>
1. Get an overview of the hospital admission data
2. Correlate the case number data with the testing strategy
3. Combine the observations from (1) and (2)
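
To make the objective concrete, the check for "an increase in admissions that was not indicated by an increase in cases" could look roughly like the sketch below. The numbers are invented for illustration and are not ECDC data. The function name `unindicated_increases`, the column names `cases` and `admissions`, and the one-week lag are assumptions, not part of the analysis that follows.

```python
import pandas as pd

# Hypothetical weekly figures, invented purely for illustration (not ECDC data).
weekly = pd.DataFrame({
    'year_week': ['2021-W01', '2021-W02', '2021-W03', '2021-W04', '2021-W05'],
    'cases': [1000, 1200, 1100, 1050, 1500],
    'admissions': [50, 55, 70, 90, 95],
}).set_index('year_week')


def unindicated_increases(df, lag=1):
    """Return weeks in which admissions rose although cases had not risen `lag` weeks earlier."""
    # week-over-week change in cases, shifted so it lines up with the later admission week
    case_change = df['cases'].diff().shift(lag)
    admission_change = df['admissions'].diff()
    mask = (admission_change > 0) & (case_change <= 0)
    return df.index[mask].tolist()


flagged = unindicated_increases(weekly, lag=1)
print(flagged)
```

The lag is a free parameter here. A real analysis would need to justify it, for example from the typical delay between a positive test and hospital admission.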

In [1]:
import os
import wget
import pandas as pd
import matplotlib.pyplot as plt

## Download and Load all Data Sets

In [None]:
data_dir = './data/'
# create the data directory if it doesn't exist yet
os.makedirs(data_dir, exist_ok=True)
# remove old files from the data directory (some sets are updated on a daily basis)
for old_file in os.listdir(data_dir):
    os.remove(os.path.join(data_dir, old_file))


# add the URLs of new data sources here

# From European Centre for Disease Prevention and Control:
cases_deaths_url = 'https://opendata.ecdc.europa.eu/covid19/nationalcasedeath_eueea_daily_ei/csv/data.csv'
hospitalization_url = 'https://opendata.ecdc.europa.eu/covid19/hospitalicuadmissionrates/csv/data.csv'
tests_url = 'https://opendata.ecdc.europa.eu/covid19/testing/csv/data.csv'
variants_url = 'https://opendata.ecdc.europa.eu/covid19/virusvariant/csv/data.csv'
vaccinations_url = 'https://opendata.ecdc.europa.eu/covid19/vaccine_tracker/csv/data.csv'

# add them to the dictionary and specify the desired file name
download_dict = {'cases_deaths.csv': cases_deaths_url, 'hospitalizations.csv': hospitalization_url,
                'tests': tests_url, 'vaccinations': vaccinations_url, 'variants': variants_url}

for filename, url in download_dict.items():
    # download the data file
    wget.download(url, data_dir + filename)
    # load it into a data frame named after the file (without extension)
    df_name = filename.split('.')[0]
    globals()[df_name] = pd.read_csv(data_dir + filename)  # use string as variable name
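
As an aside, the data frames could also be kept in a dictionary rather than registered as global variables. The sketch below illustrates that alternative. It is not what this notebook does. The in-memory CSV snippets in `fake_files` and the name `frames` are hypothetical stand-ins so that the example runs without a download.

```python
import io
import os

import pandas as pd

# Hypothetical in-memory stand-ins for the downloaded files, so the sketch runs offline.
fake_files = {
    'cases_deaths.csv': 'dateRep,cases,deaths\n01/03/2021,10,1\n02/03/2021,12,0\n',
    'tests': 'country,year_week,tests_done\nAustria,2021-W09,5000\n',
}

# Keep the data frames in a dictionary keyed by the file name without its extension.
frames = {}
for filename, content in fake_files.items():
    df_name = os.path.splitext(filename)[0]
    frames[df_name] = pd.read_csv(io.StringIO(content))

print(sorted(frames))
print(frames['cases_deaths']['cases'].sum())
```

A dictionary makes it explicit which data sets were loaded and avoids accidentally overwriting an existing variable. The rest of this notebook would then need `frames['tests']` instead of `tests`.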

In [3]:
print('Loaded: \n')
for filename in download_dict.keys():
    print(filename + ', with variables:' + '\n')
    print(globals()[filename.split('.')[0]].columns.values, '\n \n')

Loaded: 

cases_deaths.csv, with variables:

['dateRep' 'day' 'month' 'year' 'cases' 'deaths' 'countriesAndTerritories'
 'geoId' 'countryterritoryCode' 'popData2020' 'continentExp'] 
 

hospitalizations.csv, with variables:

['country' 'indicator' 'date' 'year_week' 'value' 'source' 'url'] 
 

tests, with variables:

['country' 'country_code' 'year_week' 'level' 'region' 'region_name'
 'new_cases' 'tests_done' 'population' 'testing_rate' 'positivity_rate'
 'testing_data_source'] 
 

vaccinations, with variables:

['YearWeekISO' 'FirstDose' 'FirstDoseRefused' 'SecondDose' 'UnknownDose'
 'NumberDosesReceived' 'Region' 'Population' 'ReportingCountry'
 'TargetGroup' 'Vaccine' 'Denominator'] 
 

variants, with variables:

['country' 'country_code' 'year_week' 'source' 'new_cases'
 'number_sequenced' 'percent_cases_sequenced' 'valid_denominator'
 'variant' 'number_detections_variant' 'percent_variant'] 
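
The column lists above show that `cases_deaths` is reported per day (`dateRep`), while the other sets are keyed by `year_week`. To compare them, the daily data would have to be aggregated to weeks. The following is a minimal sketch of one way to do that, assuming that `dateRep` is formatted as `DD/MM/YYYY` and that `year_week` uses the ISO form `YYYY-Www`. Both formats are assumptions that should be checked against the actual files. The rows in `daily` are invented.

```python
import pandas as pd

# Hypothetical daily rows, invented for illustration (not ECDC data).
daily = pd.DataFrame({
    'dateRep': ['04/01/2021', '05/01/2021', '11/01/2021'],
    'cases': [100, 150, 200],
})

# Parse the dates (assumed DD/MM/YYYY) and derive the ISO calendar year and week.
dates = pd.to_datetime(daily['dateRep'], format='%d/%m/%Y')
iso = dates.dt.isocalendar()

# Build a key in the assumed 'YYYY-Www' form, e.g. '2021-W01'.
daily['year_week'] = iso['year'].astype(str) + '-W' + iso['week'].astype(str).str.zfill(2)

# Sum the daily cases per ISO week.
weekly_cases = daily.groupby('year_week')['cases'].sum()
print(weekly_cases.to_dict())
```

The ISO year is used instead of the calendar year because the first days of January can belong to the last ISO week of the previous year.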
 

