# Collect data

## Data sources

- [Johns Hopkins University - Time Series](https://github.com/CSSEGISandData/COVID-19)
- [Johns Hopkins University - Vaccination](https://github.com/govex/COVID-19/)
- [Government of Mexico - COVID-19](https://datos.gob.mx/busca/dataset/informacion-referente-a-casos-covid-19-en-mexico)

## Load libraries

In [2]:
import requests
import covid_analysis.utils.paths as path

## Utility functions

In [3]:
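def download_csv(url: str, out_file: path.Path) -> None:
    """Download `url` and write the response body to `out_file`."""
    response = requests.get(url, timeout=60)
    # Fail loudly on HTTP errors instead of saving an error page as a CSV.
    response.raise_for_status()

    with open(out_file, "wb") as file_content:
        file_content.write(response.content)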


## Define default output directory

In [4]:
output_dir = path.data_raw_dir()
output_dir.mkdir(parents=True, exist_ok=True)

## Download Johns Hopkins University time series

In [5]:
hopkins_base_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/"

hopkins_filenames = (
    "time_series_covid19_confirmed_global.csv",
    "time_series_covid19_deaths_global.csv"
)

hopkins_time_series_urls = {
    path.data_raw_dir(file_name): f"{hopkins_base_url}{file_name}"
    for file_name in hopkins_filenames
}
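
To make the shape of this mapping concrete, here is a minimal sketch that builds the same kind of dictionary with plain `pathlib`. The `build_download_map` helper, the `raw_dir` argument, and the placeholder URL are hypothetical stand-ins; the real `path.data_raw_dir` helper from `covid_analysis.utils.paths` is not shown in this notebook and may behave differently.

```python
from __future__ import annotations

from pathlib import Path


def build_download_map(
    base_url: str, file_names: tuple[str, ...], raw_dir: Path
) -> dict[Path, str]:
    """Map each local output path to the remote URL it is downloaded from.

    Hypothetical helper: `raw_dir` stands in for `path.data_raw_dir()`.
    """
    return {raw_dir / file_name: f"{base_url}{file_name}" for file_name in file_names}


example_map = build_download_map(
    base_url="https://example.org/data/",  # placeholder URL, not a real source
    file_names=("confirmed.csv", "deaths.csv"),
    raw_dir=Path("data/raw"),
)

# Each key is where a file will be written; each value is where it comes from.
for out_path, url in example_map.items():
    print(f"{url} -> {out_path}")
```

Keying the dictionary by output path keeps each destination unique, so two URLs cannot silently overwrite the same file.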

In [6]:
for out_path, url in hopkins_time_series_urls.items():
    download_csv(url, out_path)

## Download Johns Hopkins University countries metadata

In [7]:
countries_meta_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/UID_ISO_FIPS_LookUp_Table.csv"
countries_meta_filename = output_dir.joinpath("UID_ISO_FIPS_LookUp_Table.csv")

In [8]:
download_csv(countries_meta_url, countries_meta_filename)

## Download Johns Hopkins University vaccination time series

In [9]:
vaccination_url = "https://raw.githubusercontent.com/govex/COVID-19/master/data_tables/vaccine_data/global_data/time_series_covid19_vaccine_global.csv"
vaccination_filename = output_dir.joinpath("time_series_covid19_vaccine_global.csv")

In [10]:
download_csv(vaccination_url, vaccination_filename)

## Download Government of Mexico data

### Data dictionaries

In [23]:
data_dict_mex_url = "http://datosabiertos.salud.gob.mx/gobmx/salud/datos_abiertos/diccionario_datos_covid19.zip"
data_dict_mex_filename = str(output_dir.joinpath("diccionario_datos_covid19.zip"))

In [27]:
!wget -q {data_dict_mex_url} -O {data_dict_mex_filename}
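
The `wget` call depends on a shell tool being installed. As a rough Python-only alternative, the sketch below streams a URL to disk in chunks with the standard library, so a large zip archive is never held fully in memory. `download_file` is a hypothetical name, and swapping `wget` for `urllib` is an assumption of this sketch, not what the notebook does.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, out_file: Path, chunk_size: int = 1 << 20) -> Path:
    """Stream `url` to `out_file` in chunks (hypothetical helper)."""
    out_file.parent.mkdir(parents=True, exist_ok=True)

    # For HTTP URLs, urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(url, timeout=60) as response, open(
        out_file, "wb"
    ) as target:
        # Copy in fixed-size chunks instead of reading the whole body at once.
        shutil.copyfileobj(response, target, length=chunk_size)

    return out_file
```

Under these assumptions, the cell above could be written as `download_file(data_dict_mex_url, output_dir.joinpath("diccionario_datos_covid19.zip"))`.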

### Open COVID-19 data

In [35]:
data_mex_url = "http://datosabiertos.salud.gob.mx/gobmx/salud/datos_abiertos/datos_abiertos_covid19.zip"
data_mex_filename = str(output_dir.joinpath("datos_abiertos_covid19.zip"))


```bash
wget -q {data_mex_url} -O {data_mex_filename}
```

In [43]:
!axel -q -n 8 {data_mex_url} -o {output_dir}

No state file, cannot resume!
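
Both Government of Mexico downloads are zip archives, so a later step will presumably need to unpack them. A minimal sketch using the standard `zipfile` module follows; the `extract_zip` helper and the choice of extraction directory are assumptions, not part of the notebook.

```python
from __future__ import annotations

import zipfile
from pathlib import Path


def extract_zip(zip_path: Path, target_dir: Path) -> list[str]:
    """Extract every member of `zip_path` into `target_dir` (hypothetical helper).

    Returns the member names so the caller can locate the extracted files.
    """
    target_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()
```

With the variables defined earlier, this might be called as `extract_zip(output_dir.joinpath("datos_abiertos_covid19.zip"), output_dir)`.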
