# Visualizing the COVID-19 Crisis Across the World

The data in this notebook falls into two categories:

COVID-19 Data:
 - Global - [European CDC](https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide) as of 3/27/20.
 - US - [New York Times via GitHub](https://github.com/nytimes/covid-19-data) as of 3/25/20.

Population Data: [The World Bank](https://data.worldbank.org/indicator/SP.POP.TOTL) as of 2018.

### Introduction
The COVID-19 crisis is affecting countries all over the world. This notebook looks at several measures of outbreak severity across countries and displays each metric in a global choropleth map. It also sets up reusable code so the exercise can be repeated as the crisis continues and more daily data is collected.

UPDATE: This notebook now includes a United States-specific study at the state level. The same metrics will be examined.

In [None]:
# Import necessary packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import statsmodels as sm
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
# None (rather than the deprecated -1) means "do not truncate column contents"
pd.set_option('display.max_colwidth', None)
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
from plotly.graph_objs import *
init_notebook_mode()

### Population Data

In [None]:
# Load Country Population data into dataframe
global_pop_df = pd.read_csv('Data/Population_Data/Population_data.csv', skiprows = 3)
global_pop_df.head()

In [None]:
# Keep only the 2018 population column (most up to date)
global_pop_df = global_pop_df.filter(items = ['Country Name', 'Country Code', '2018'], axis = 1)
global_pop_df.head()

### Global COVID-19 Data

In [None]:
# Load Global COVID19 data into dataframe
global_df = pd.read_excel('Data/COVID_Data/COVID-19-geographic-disbtribution-worldwide-2020-03-27.xlsx')
global_df.head()

In [None]:
# Drop any dates before 2020
global_df = global_df[global_df.year > 2019]

In [None]:
# Rename columns
global_df.rename({'countriesAndTerritories' : 'Country', 'cases' : 'Cases',
                  'deaths' : 'Deaths', 'dateRep': 'date'}, axis = 1, inplace = True)

In [None]:
# Drop unnecessary columns
global_df.drop(columns = ['day', 'month', 'year', 'geoId', 'countryterritoryCode', 'popData2018'], inplace = True)

In [None]:
# Preview dataframe
global_df.head()

In [None]:
# Fix country names
global_df.Country = global_df.Country.map(lambda x: x.replace('_',' '))
global_df.Country = global_df.Country.map(lambda x: x.title())

In [None]:
# Define alpha3 dictionary for mapping countries
dict_alpha3 = {'Afghanistan': 'AFG',
 'Albania': 'ALB',
 'Algeria': 'DZA',
 'American Samoa': 'ASM',
 'Andorra': 'AND',
 'Angola': 'AGO',
 'Anguilla': 'AIA',
 'Antarctic Fisheries': '@@@',
 'Antigua And Barbuda': 'ATG',
 'Argentina': 'ARG',
 'Armenia': 'ARM',
 'Aruba': 'ABW',
 'Australia': 'AUS',
 'Austria': 'AUT',
 'Azerbaijan': 'AZE',
 'Bahamas': 'BHS',
 'Bahrain': 'BHR',
 'Bangladesh': 'BGD',
 'Barbados': 'BRB',
 'Belarus': 'BLR',
 'Belgium': 'BEL',
 'Belize': 'BLZ',
 'Benin': 'BEN',
 'Bermuda': 'BMU',
 'Bhutan': 'BTN',
 'Bolivia': 'BOL',
 'Bonaire, St Eustatius, Saba': 'BES',
 'Bosnia And Herzegovina': 'BIH',
 'Botswana': 'BWA',
 'Brazil': 'BRA',
 'British Virgin Islands': 'VGB',
 'Brunei Darussalam': 'BRN',
 'Bulgaria': 'BGR',
 'Burkina Faso': 'BFA',
 'Burundi': 'BDI',
 'Cape Verde': 'CPV',
 'Cambodia': 'KHM',
 'Cameroon': 'CMR',
 'Canada': 'CAN',
 'Cayman Islands': 'CYM',
 'Central African Republic': 'CAF',
 'Chad': 'TCD',
 'Chile': 'CHL',
 'China': 'CHN',
 'China, Hong Kong SAR': 'HKG',
 'China, Macao SAR': 'MAC',
 'Colombia': 'COL',
 'Commonwealth of Independent States (CIS)': '@@@',
 'Comoros': 'COM',
 'Congo': 'COG',
 'Cook Islands': 'COK',
 'Costa Rica': 'CRI',
 'Croatia': 'HRV',
 'Cuba': 'CUB',
 'Curaçao': 'CUW',
 'Cyprus': 'CYP',
 'Czech Republic': 'CZE',
 'Czechoslovakia (former)': 'CZE',
 "Cote Divoire": 'CIV',
 'Democratic Republic Of The Congo': 'COD',
 'Denmark': 'DNK',
 'Djibouti': 'DJI',
 'Dominica': 'DMA',
 'Dominican Republic': 'DOM',
 'Ecuador': 'ECU',
 'Egypt': 'EGY',
 'El Salvador': 'SLV',
 'Equatorial Guinea': 'GNQ',
 'Eritrea': 'ERI',
 'Estonia': 'EST',
 'Ethiopia': 'ETH',
 'Ethiopia, incl. Eritrea': 'ETH',
 'Faroe Islands': 'FRO',
 'Falkland Is. (Malvinas)': 'FLK',
 'Fiji': 'FJI',
 'Finland': 'FIN',
 'France': 'FRA',
 'French Guiana': 'GUF',
 'French Polynesia': 'PYF',
 'Gabon': 'GAB',
 'Gambia': 'GMB',
 'Georgia': 'GEO',
 'German Dem. R. (former)': '@@@',
 'Germany': 'DEU',
 'Germany, Fed. R. (former)': '@@@',
 'Ghana': 'GHA',
 'Gibraltar': 'GIB',
 'Greece': 'GRC',
 'Greenland': 'GRL',
 'Grenada': 'GRD',
 'Guadeloupe': 'GLP',
 'Guam': 'GUM',
 'Guatemala': 'GTM',
 'Guernsey': 'GGY',
 'Guinea': 'GIN',
 'Guinea Bissau': 'GNB',
 'Guyana': 'GUY',
 'Haiti': 'HTI',
 'Honduras': 'HND',
 'Hungary': 'HUN',
 'Iceland': 'ISL',
 'India': 'IND',
 'Indonesia': 'IDN',
 'Iran': 'IRN',
 'Iraq': 'IRQ',
 'Ireland': 'IRL',
 'Isle Of Man': 'IMN',
 'Israel': 'ISR',
 'Italy': 'ITA',
 'Jamaica': 'JAM',
 'Japan': 'JPN',
 'Jersey': 'JEY',
 'Jordan': 'JOR',
 'Kazakhstan': 'KAZ',
 'Kenya': 'KEN',
 'Kiribati': 'KIR',
 "Korea, Dem.Ppl's.Rep.": 'PRK',
 'South Korea': 'KOR',
 'Kuwait': 'KWT',
 'Kyrgyzstan': 'KGZ',
 "Laos": 'LAO',
 'Latvia': 'LVA',
 'Lebanon': 'LBN',
 'Lesotho': 'LSO',
 'Liberia': 'LBR',
 'Libya': 'LBY',
 'Liechtenstein': 'LIE',
 'Lithuania': 'LTU',
 'Luxembourg': 'LUX',
 'Madagascar': 'MDG',
 'Malawi': 'MWI',
 'Malaysia': 'MYS',
 'Maldives': 'MDV',
 'Mali': 'MLI',
 'Malta': 'MLT',
 'Marshall Islands': 'MHL',
 'Martinique': 'MTQ',
 'Mauritania': 'MRT',
 'Mauritius': 'MUS',
 'Mayotte': 'MYT',
 'Mexico': 'MEX',
 'Micronesia (Fed. States of)': 'FSM',
 'Mongolia': 'MNG',
 'Montenegro': 'MNE',
 'Montserrat': 'MSR',
 'Morocco': 'MAR',
 'Mozambique': 'MOZ',
 'Myanmar': 'MMR',
 'Namibia': 'NAM',
 'Nauru': 'NRU',
 'Nepal': 'NPL',
 'Netherlands Antilles': 'NLD',
 'Netherlands': 'NLD',
 'New Caledonia': 'NCL',
 'New Zealand': 'NZL',
 'Nicaragua': 'NIC',
 'Niger': 'NER',
 'Nigeria': 'NGA',
 'Niue': 'NIU',
 'Northern Mariana Islands': 'MNP',
 'Norway': 'NOR',
 'Oman': 'OMN',
 'Other Asia': '@@@',
 'Pacific Islands (former)': '@@@',
 'Pakistan': 'PAK',
 'Palau': 'PLW',
 'Panama': 'PAN',
 'Papua New Guinea': 'PNG',
 'Paraguay': 'PRY',
 'Peru': 'PER',
 'Philippines': 'PHL',
 'Poland': 'POL',
 'Portugal': 'PRT',
 'Puerto Rico': 'PRI',
 'Qatar': 'QAT',
 'Moldova': 'MDA',
 'Romania': 'ROU',
 'Russia': 'RUS',
 'Rwanda': 'RWA',
 'Réunion': 'REU',
 'Samoa': 'WSM',
 'Sao Tome and Principe': 'STP',
 'Saudi Arabia': 'SAU',
 'Senegal': 'SEN',
 'Serbia': 'SRB',
 'Serbia and Montenegro': 'SRB',
 'Seychelles': 'SYC',
 'Sierra Leone': 'SLE',
 'Singapore': 'SGP',
 'Sint Maarten': 'SXM',
 'Slovakia': 'SVK',
 'Slovenia': 'SVN',
 'Solomon Islands': 'SLB',
 'Somalia': 'SOM',
 'South Africa': 'ZAF',
 'South Sudan': 'SSD',
 'Spain': 'ESP',
 'Sri Lanka': 'LKA',
 'St. Helena and Depend.': 'SHN',
 'Saint Kitts And Nevis': 'KNA',
 'Saint Lucia': 'LCA',
 'St. Pierre-Miquelon': 'SPM',
 'Saint Vincent And The Grenadines': 'VCT',
 'Palestine': 'PSE',
 'Sudan': 'SDN',
 'Sudan (former)': 'SDN',
 'Suriname': 'SUR',
 'Eswatini': 'SWZ',
 'Sweden': 'SWE',
 'Switzerland': 'CHE',
 'Syria': 'SYR',
 'North Macedonia': 'MKD',
 'Tajikistan': 'TJK',
 'Thailand': 'THA',
 'Timor Leste': 'TLS',
 'Togo': 'TGO',
 'Tonga': 'TON',
 'Trinidad And Tobago': 'TTO',
 'Tunisia': 'TUN',
 'Turkey': 'TUR',
 'Turkmenistan': 'TKM',
 'Turks And Caicos Islands': 'TCA',
 'Tuvalu': 'TUV',
 'USSR (former)': '@@@',
 'Uganda': 'UGA',
 'Ukraine': 'UKR',
 'United Arab Emirates': 'ARE',
 'United Kingdom': 'GBR',
 'United Republic Of Tanzania': 'TZA',
 'United States Of America': 'USA',
 'United States Virgin Islands': 'VIR',
 'Uruguay': 'URY',
 'Uzbekistan': 'UZB',
 'Vanuatu': 'VUT',
 'Venezuela': 'VEN',
 'Vietnam': 'VNM',
 'Wallis and Futuna Is.': 'WLF',
 'Yemen': 'YEM',
 'Yemen Arab Rep. (former)': 'YEM',
 'Yemen, Dem. (former)': '@@@',
 'Yugoslavia, SFR (former)': '@@@',
 'Zambia': 'ZMB',
 'Zimbabwe': 'ZWE'}

In [None]:
# Function to fix country name outliers
def country_fix(country):
    if country == 'Cases On An International Conveyance Japan':
        country = 'Japan'
    if country == 'Holy See':
        country = 'Italy'
    if country == 'Kosovo':
        country = 'Serbia'
    if country == 'Monaco':
        country = 'France'
    if country == 'San Marino':
        country = 'Italy'
    if country == 'Taiwan':
        country = 'China'
    return country

In [None]:
# Apply country_fix function to dataframe
global_df.Country = global_df.Country.map(country_fix)

In [None]:
# Create alpha3 column in dataframe
global_df['alpha3'] = global_df.Country.map(lambda x: dict_alpha3[x])
global_df.head()
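
The dictionary lookup above raises a `KeyError` for any country name that is missing from `dict_alpha3`. A minimal sketch of a pre-check follows; the helper name `find_unmapped` and the toy data are hypothetical stand-ins for `global_df.Country` and `dict_alpha3`, not part of the original notebook.

```python
def find_unmapped(names, mapping):
    """Return the sorted, de-duplicated names that have no key in `mapping`."""
    return sorted({name for name in names if name not in mapping})

# Toy stand-ins for global_df.Country and dict_alpha3 (illustrative only)
sample_mapping = {'Italy': 'ITA', 'Japan': 'JPN'}
sample_names = ['Italy', 'Japan', 'Atlantis', 'Italy', 'Atlantis']

missing = find_unmapped(sample_names, sample_mapping)
print(missing)
```

In this notebook the call would presumably be `find_unmapped(global_df.Country.unique(), dict_alpha3)`, run before the `alpha3` column is created, so that new names in a later data file can be added to `country_fix` or the dictionary.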

In [None]:
# Group by country code and date so that rows folded together by country_fix are summed
# Select only the numeric columns so the text 'Country' column is not aggregated
# Reset index to remove the multi-index
grouped_global_df = global_df.groupby(by = ['alpha3', 'date'])[['Cases', 'Deaths']].sum()
grouped_global_df = grouped_global_df.reset_index()

### Global COVID-19 Data Feature Engineering

In [None]:
# Get aggregate totals over time for cases and deaths
grouped_global_df['Total_Cases'] = grouped_global_df.groupby('alpha3').Cases.cumsum()
grouped_global_df['Total_Deaths'] = grouped_global_df.groupby('alpha3').Deaths.cumsum()
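
To illustrate what the per-country cumulative sum does, here is a small sketch on made-up data. The codes `AAA` and `BBB` and all counts are invented for illustration. Note that `cumsum` follows row order, so the rows must be sorted by date within each country; the grouped frame above should already be ordered by `alpha3` and `date` because `groupby` sorts its keys by default.

```python
import pandas as pd

# Made-up daily counts for two fictional country codes
toy = pd.DataFrame({
    'alpha3': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB'],
    'date': pd.to_datetime(['2020-03-01', '2020-03-02', '2020-03-03',
                            '2020-03-01', '2020-03-02']),
    'Cases': [1, 2, 3, 10, 20],
})

# Sort explicitly so the running total is taken in date order per country
toy = toy.sort_values(['alpha3', 'date'])
toy['Total_Cases'] = toy.groupby('alpha3').Cases.cumsum()

running_totals = toy['Total_Cases'].tolist()
print(running_totals)
```

The running total restarts at each new country code, which is why the cumulative sum is taken on the grouped object rather than on the whole column.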

In [None]:
# Preview dataframe
grouped_global_df.head()

In [None]:
# Rename columns
grouped_global_df.rename({'alpha3' : 'Country', 'Cases' : 'Daily_Cases', 'Deaths' : 'Daily_Deaths',
                          'Total_Cases' : 'Cases', 'Total_Deaths' : 'Deaths'}, axis = 1, inplace = True)

### US COVID-19 Data

In [None]:
# Load US State COVID19 data into dataframe
us_state_df = pd.read_csv('Data/COVID_Data/us-states.csv')
us_state_df.head()

### Merge Population and COVID Tables

In [None]:
# Merge dataframes
final_global_df = pd.merge(grouped_global_df, global_pop_df, how = 'outer',
                           left_on = 'Country', right_on = 'Country Code')
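
An outer merge keeps rows that appear in only one of the two tables, so it can be useful to see which codes failed to match. The sketch below uses toy frames with invented codes and relies on the `indicator=True` option of `pd.merge`, which adds a `_merge` column; it is an illustration, not a step from the original notebook.

```python
import pandas as pd

# Toy stand-ins for grouped_global_df and global_pop_df (invented values)
covid = pd.DataFrame({'Country': ['AAA', 'BBB'], 'Cases': [5, 7]})
population = pd.DataFrame({'Country Code': ['AAA', 'CCC'], '2018': [1000, 2000]})

merged = pd.merge(covid, population, how='outer',
                  left_on='Country', right_on='Country Code', indicator=True)

# Rows present in only one table are flagged 'left_only' or 'right_only'
covid_only = merged.loc[merged['_merge'] == 'left_only', 'Country'].tolist()
population_only = merged.loc[merged['_merge'] == 'right_only', 'Country Code'].tolist()
print(covid_only, population_only)
```

Codes that appear only on the COVID side have no population and will produce missing per-capita values later, while codes that appear only on the population side have no case data.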

In [None]:
# Drop unnecessary columns
final_global_df.drop('Country Code', axis = 1, inplace = True)
final_global_df.head()

In [None]:
# Rename columns
final_global_df.rename({'2018' : 'Population', 'Country' : 'Code', 'Country Name' : 'Country'},
                       axis = 1, inplace = True)

### Final Data Table Feature Engineering

In [None]:
# Add Cases and Deaths per capita, death rate
final_global_df['Deaths_per_Capita'] = final_global_df.Deaths / final_global_df.Population
final_global_df['Cases_per_Capita'] = final_global_df.Cases / final_global_df.Population
final_global_df['Death_rate'] = (final_global_df.Deaths / final_global_df.Cases).fillna(0.0)
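
A death rate computed from only a handful of cases can swing widely. One possible refinement, sketched here under assumptions, is to leave the rate blank below a minimum case count. The helper `stable_death_rate` and the `min_cases` cutoff of 100 are hypothetical choices, not values from the original analysis.

```python
import pandas as pd

def stable_death_rate(df, min_cases=100):
    """Deaths / Cases, set to NaN where Cases is below `min_cases` (hypothetical cutoff)."""
    rate = df['Deaths'] / df['Cases']
    # Series.where keeps values where the condition holds and inserts NaN elsewhere
    return rate.where(df['Cases'] >= min_cases)

# Invented counts: a tiny outbreak, a large one, and a country with no cases
toy = pd.DataFrame({'Cases': [2, 1000, 0], 'Deaths': [1, 20, 0]})
toy['Death_rate_stable'] = stable_death_rate(toy)
print(toy)
```

Unlike the `fillna(0.0)` used above, this variant leaves the rate missing rather than zero when there is too little data, so such rows carry no value for that metric instead of a misleading one.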

In [None]:
# Preview dataframe
final_global_df.head()

### Mapping

In [None]:
# Define a function to create a world choropleth map for one date and one metric
import plotly.express as px

def world_map(df, input_date, input_val):
    
    fig = px.choropleth(df[df.date == input_date], locations = "Code",
                    color = input_val, hover_name = "Country", locationmode = 'ISO-3',
                    color_continuous_scale = 'reds', title = f'2020 COVID-19 {input_val} by Country')
    fig.show()

#### 3/20/20 Results

In [None]:
# Map Global Cases as of 3/20/20
world_map(final_global_df, '2020-03-20', 'Cases')

In [None]:
# Map Global Deaths as of 3/20/20
world_map(final_global_df, '2020-03-20', 'Deaths')

In [None]:
# Map Global Cases per Capita as of 3/20/20
world_map(final_global_df, '2020-03-20', 'Cases_per_Capita')

In [None]:
# Map Global Deaths per Capita as of 3/20/20
world_map(final_global_df, '2020-03-20', 'Deaths_per_Capita')

In [None]:
# Map Global Death rates as of 3/20/20
world_map(final_global_df, '2020-03-20', 'Death_rate')

#### 3/27/20 Results

In [None]:
# Map Global Cases as of 3/27/20
world_map(final_global_df, '2020-03-27', 'Cases')

In [None]:
# Map Global Deaths as of 3/27/20
world_map(final_global_df, '2020-03-27', 'Deaths')

In [None]:
# Map Global Cases per Capita as of 3/27/20
world_map(final_global_df, '2020-03-27', 'Cases_per_Capita')

In [None]:
# Map Global Deaths per Capita as of 3/27/20
world_map(final_global_df, '2020-03-27', 'Deaths_per_Capita')

In [None]:
# Map Global Death rates as of 3/27/20
world_map(final_global_df, '2020-03-27', 'Death_rate')
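
The ten cells above repeat the same call for every date and metric. As more daily data arrives, a loop could generate them instead. The sketch below is one way to do that; the helper `map_all` and the `METRICS` list are hypothetical, and the drawing function is passed in as an argument so the example can run here with a recording stand-in instead of `world_map`.

```python
from itertools import product

# Metric column names as used in final_global_df above
METRICS = ['Cases', 'Deaths', 'Cases_per_Capita', 'Deaths_per_Capita', 'Death_rate']

def map_all(df, dates, draw, metrics=METRICS):
    """Call draw(df, date, metric) for every combination of date and metric."""
    for input_date, input_val in product(dates, metrics):
        draw(df, input_date, input_val)

# Stand-in for world_map that only records what it was asked to draw
recorded = []
map_all(None, ['2020-03-20', '2020-03-27'],
        lambda df, d, m: recorded.append((d, m)))
print(len(recorded))
```

In the notebook, the call would presumably look like `map_all(final_global_df, ['2020-03-20', '2020-03-27'], world_map)`, with new dates appended to the list as new files are loaded.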

### Conclusion
As the various metrics show, some countries are handling the virus better than others. China and the United States have many cases, but relative to their overall populations the number of cases is not that high. European countries such as Iceland, Spain, and Italy have a high number of cases per capita. Unfortunately, places with fewer medical resources, such as Sudan, Zimbabwe, or Guyana, seem to have higher death rates, with the caveat that these rates are based on very low case counts so far. European countries, on the other hand, do not have low death rates either, and they have high numbers of cases.