# COVID-19 Economic Impact - A Survival Guide

The dataset below captures several factors that shape the impact of COVID-19, including **GDP per Capita** and the **Human Development Index**.

These factors will help us figure out which countries have the best chances of getting through COVID-19 with the least impact on their citizens and economies.

## Content Guide

* [Factors Affecting The Spread of COVID-19](#factors)


* [Figures' Overview](#figures)
    1. [Total Figures](#total)
    2. [Monthly Figures](#monthly)


* [World Map Representation](#map)
    1. [Total Figures](#total_world)
    2. [Monthly Figures](#monthly_world)


* [Stringency Index](#strin)
    1. [Total Figures](#total_strin)
    2. [Monthly Figures](#monthly_strin)


* [Ranking the Safest Countries](#rank)

In [None]:
# data processing
import pandas as pd
import numpy as np
import functools
import os
from sklearn.impute import SimpleImputer
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# data visualization
import matplotlib.pyplot as plt 
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

# extras
from datetime import datetime as dt
!pip install pycountry_convert
import pycountry_convert as pc

print("\n Libraries imported successfully")

<a id="factors"></a>
## Factors Affecting The Spread of COVID-19

Let us look at the information at our disposal.

In [None]:
data_filepath = '../input/impact-of-covid19-pandemic-on-the-global-economy/raw_data.csv'
transformed_data_filepath = '../input/impact-of-covid19-pandemic-on-the-global-economy/transformed_data.csv'

prdata = pd.read_csv(data_filepath)
data = pd.read_csv(transformed_data_filepath)

In [None]:
prdata.head(5)

In [None]:
data.head()

Within the information we have, **HDI (Human Development Index)**, **Population**, **GDPCAP (GDP per Capita)** and **Stringency Index** are of particular importance to us. 

Let us reshape the data into a cleaner form for further analysis.

### Replacing Null Values in the Dataset

In [None]:
print('\n', prdata.isna().sum())
print('\n', data.isna().sum())

Null values are of particular concern in this dataset because they appear in the **HDI** and **GDPCAP** columns.

These values cannot sensibly be filled in with a global mean or a `SimpleImputer`, because they differ from country to country.

We will replace them with 0 for the rest of this analysis. Keep in mind that these zeros are included in every sum and average computed later.

In [None]:
data = data.replace(to_replace = np.nan, value = 0) 
prdata = prdata.replace(to_replace = np.nan, value = 0) 
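
Since the concern above is that values differ between countries, a hedged alternative to a blanket 0 is to fill gaps per country. The sketch below is not what this notebook does; it uses a small made-up frame (the name `toy` and all its values are assumptions) and fills each country's missing HDI with that country's own mean. A country with no HDI at all stays missing.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the dataset: values are made up for illustration
toy = pd.DataFrame({
    "COUNTRY": ["A", "A", "A", "B", "B"],
    "HDI": [0.5, np.nan, 0.7, np.nan, np.nan],
})

# Mean HDI of each country, broadcast back to every row of that country
country_mean = toy.groupby("COUNTRY")["HDI"].transform("mean")

# Fill only the gaps; country B has no HDI at all, so it stays NaN
toy["HDI_filled"] = toy["HDI"].fillna(country_mean)
print(toy)
```

Rows that remain missing can then be dropped or flagged explicitly instead of being treated as a real value of 0.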

We have two datasets: `data` is the refined version of the raw `prdata`, but it contains certain columns we can discard.

So let's combine the two datasets and use the best of both.

In [None]:
# Take population and GDP per capita from the raw dataset.
# The assignment aligns on the row index, so both frames must share the same row order.
data["population"] = prdata.population
data["gdp_cap"] = prdata.gdp_per_capita

# Parse the date once, then derive year and month from it
data["DATE"] = pd.to_datetime(data["DATE"], format='%Y-%m-%d')
data["year"] = data["DATE"].dt.year
data["month"] = data["DATE"].dt.month

# Keep only the columns we need and give them readable names
data = data[['CODE', 'COUNTRY', 'year', 'month', 'DATE', 'population', 'gdp_cap', 'HDI', 'TC', 'TD', 'STI']]
data = data.rename(columns={'CODE':'Code', 'COUNTRY':'Country', 'year':'Year', 'month':'Month', 'DATE':'Date', 'population':'Population', 'gdp_cap':'GDP_Cap', 'TC':'Cases', 'TD':'Deaths', 'STI':'Stringency_Index'})
data.head()
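
Copying columns across with plain assignment pairs rows by index label, so it quietly relies on both files listing their rows in the same order. A minimal sketch of a more defensive variant follows, assuming both frames share a country code and a date column to merge on. The toy frames and their values are illustrative assumptions, not taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-ins for the refined and raw frames
refined = pd.DataFrame({
    "CODE": ["AAA", "AAA", "BBB"],
    "DATE": ["2020-01-01", "2020-01-02", "2020-01-01"],
    "TC": [1, 2, 5],
})
raw = pd.DataFrame({
    "CODE": ["BBB", "AAA", "AAA"],  # deliberately in a different row order
    "DATE": ["2020-01-01", "2020-01-01", "2020-01-02"],
    "population": [200, 100, 100],
})

# Merge on explicit keys instead of relying on row position;
# validate raises an error if a key pair is duplicated on either side
merged = refined.merge(raw, on=["CODE", "DATE"], how="left", validate="one_to_one")
print(merged)
```

The `validate` argument makes the merge fail loudly when the keys are not unique, which is easier to debug than a silently misaligned column.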

A first inspection of the data reveals the following columns.

* **Code** - The ISO code assigned to each country. ISO country codes come in two letter-based forms, Alpha-2 and Alpha-3; this column holds the Alpha-3 form. We can use these codes to find the continent a country belongs to.


* **Country** - The name of the country each row refers to.


* **Year** - The data runs from December 2019 to October 2020, so this column is either 2019 or 2020.


* **Month** - The month of each record, from December 2019 to October 2020.


* **Population** - The population of each country. Crucial for our analysis.


* **GDP_Cap** - GDP per capita. GDP (Gross Domestic Product) is the total value of the goods and services a country produces in a given period; GDP per capita divides that figure by the population. The higher the GDP per capita, the more resources a country is likely to have for handling a pandemic-level situation like COVID-19.


* **HDI** - Stands for Human Development Index. It is a summary measure of average achievement in key dimensions of human development: health, education and standard of living. Simply put, the higher the HDI, the better citizens' chances are likely to be during the COVID-19 pandemic.


* **Cases** (`TC` in the source file) - Refers to Total Cases. Recorded daily, this tally gives us the number of people infected during the COVID-19 pandemic.


* **Deaths** (`TD` in the source file) - Refers to Total Deaths. Recorded daily, this tally gives us the number of people who died as a result of contracting the SARS-CoV-2 virus.


* **Stringency_Index** (`STI` in the source file) - Refers to how strict a government's response to the pandemic was, covering measures such as lockdowns and other restrictions. It reflects the rules that were put in place rather than how closely people followed them.

<a id="figures"></a>
## Figures' Overview

An overview of how countries compare on the key factors that could slow or accelerate the spread of COVID-19.

### The datasets we will be working with

In [None]:
# Exclude Kosovo from the analysis
data = data[data.Country != 'Kosovo']

# Group once per country, keeping the order of first appearance,
# so that every list below lines up with `country`
by_country = data.groupby('Country', sort=False)

country = data.Country.unique().tolist()
country_code = by_country['Code'].first().tolist()
pop_world = by_country['Population'].first().tolist()

# Per-country averages; mean() avoids hard-coding the number of rows per country
hdi_world = by_country['HDI'].mean().tolist()
gdp_world = by_country['GDP_Cap'].mean().tolist()
stringency_index = by_country['Stringency_Index'].mean().tolist()

# Per-country sums over all daily rows
cases_country = by_country['Cases'].sum().tolist()
death_country = by_country['Deaths'].sum().tolist()
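
One thing worth checking at this step is whether `Cases` and `Deaths` hold running totals or daily counts, because summing a running total counts earlier cases again on every later day. The toy numbers below are made up purely to illustrate the difference; they say nothing about the actual values in the dataset.

```python
import pandas as pd

# Hypothetical daily new cases for one country
daily_new = pd.Series([1, 2, 3])

# The running total that a "total cases" column would report each day
running_total = daily_new.cumsum()  # 1, 3, 6

print(running_total.sum())  # adds every day's running total together
print(running_total.max())  # the final tally, equal to daily_new.sum()
```

If the columns do turn out to be running totals, taking the maximum (or the last value) per country would give the final tally instead of the sum.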

In [None]:
# Convert each ISO Alpha-3 code to Alpha-2, then look up its continent code
alpha2_code = [pc.country_alpha3_to_country_alpha2(i) for i in country_code]
continent_code = []
for i in alpha2_code:
    try:
        continent_code.append(pc.country_alpha2_to_continent_code(i))
    except KeyError:
        # No continent is known for this code
        continent_code.append('Unknown')

data_agg = pd.DataFrame(list(zip(country_code, country, pop_world, cases_country, death_country, hdi_world, gdp_world, stringency_index, continent_code)), columns =['Code', 'Country', 'Population', 'Cases', 'Deaths', 'HDI', 'GDP_Cap','Stringency_Index', 'Continent']) 
data_agg = data_agg.replace({'AF':'Africa', 'AN':'Antarctica', 'AS':'Asia', 'EU':'Europe', 'NA':'North America', 'OC':'Oceania', 'SA':'South America'})
data_agg = data_agg.round(2)

data_agg.head()
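
The lookup-with-fallback pattern used above can also be written without an explicit loop. The sketch below shows the idea with `Series.map`; the three-entry dictionary is a hand-written, hypothetical stand-in for the library lookup so that the example runs on its own.

```python
import pandas as pd

# Hypothetical, hand-written stand-in for the library lookup (only three codes)
ALPHA3_TO_CONTINENT = {"IND": "Asia", "FRA": "Europe", "BRA": "South America"}

codes = pd.Series(["IND", "FRA", "XXX", "BRA"], name="Code")

# map() leaves unmatched codes as NaN, which fillna() turns into 'Unknown'
continent = codes.map(ALPHA3_TO_CONTINENT).fillna("Unknown")
print(continent.tolist())
```

Building the mapping once per unique code and then mapping it onto the column avoids repeating the lookup for every row.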

<a id="total"></a>
### Total Figures

In [None]:
fig = px.scatter(data_agg, x="HDI", y="GDP_Cap", size="Population", hover_name="Country", color='Continent', template='simple_white', size_max=50)
fig.update_layout(
    height=500,
    title_text="Comparison between a Country's GDP per Capita and HDI"
)
fig.show()
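
To put a number on the relationship the scatter plot shows, a correlation coefficient can be computed alongside it. This is a minimal sketch with made-up values; on the real data one would presumably pass `data_agg['HDI']` and `data_agg['GDP_Cap']` instead of the `toy` columns.

```python
import pandas as pd

# Hypothetical values, chosen only so the example runs on its own
toy = pd.DataFrame({
    "HDI": [0.45, 0.60, 0.75, 0.90],
    "GDP_Cap": [1500.0, 6000.0, 18000.0, 52000.0],
})

# Pearson correlation (the pandas default) between the two columns
r = toy["HDI"].corr(toy["GDP_Cap"])
print(round(r, 2))
```

Pearson correlation only captures linear association, so a curved relationship can score lower than the plot suggests.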

In [None]:
fig = px.scatter(data_agg, x="GDP_Cap", y="Cases", size="Population", hover_name="Country", color='Continent', template='simple_white', size_max=50)
fig.update_layout(
    height=500,
    title_text="COVID-19 Cases vs GDP per Capita (per Country)"
)
fig.show()

In [None]:
fig = px.scatter(data_agg, x='HDI', y='Cases', hover_name='Country', color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs HDI (per Country)'
)
fig.show()

In [None]:
fig = px.bar(data_agg, x='Continent', y='Cases', hover_name='Country', color='Continent', template="simple_white")
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases per Continent'
)
fig.update_xaxes(showticklabels=False)
fig.show()

In [None]:
fig = px.scatter(data_agg, x='GDP_Cap', y='Cases', hover_name='Country', animation_frame="Continent", animation_group="Country", size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs GDP per Capita per Country (divided by Continents)'
)
fig["layout"].pop("updatemenus")
fig.show()

In [None]:
fig = px.scatter(data_agg, x='HDI', y='Cases', hover_name='Country', animation_frame="Continent", animation_group="Country", size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs HDI per Country (divided by Continents)'
)
fig["layout"].pop("updatemenus")
fig.show()

<a id="monthly"></a>
### Monthly Figures

In [None]:
case_month = data[['Code', 'Country', 'Month', 'Population', 'Cases', 'GDP_Cap', 'HDI']].groupby(['Country', 'Month', 'Population', 'GDP_Cap', 'HDI', 'Code'], as_index=False).sum()
# Convert Alpha-3 codes to Alpha-2, then look up the continent for each row
alpha2_code_month = [pc.country_alpha3_to_country_alpha2(i) for i in case_month['Code']]
continent_code_month = []
for j in alpha2_code_month:
    try:
        continent_code_month.append(pc.country_alpha2_to_continent_code(j))
    except KeyError:
        # No continent is known for this code
        continent_code_month.append('Unknown')
        
case_month['Continent'] = continent_code_month
case_month = case_month.replace({'AF':'Africa', 'AN':'Antarctica', 'AS':'Asia', 'EU':'Europe', 'NA':'North America', 'OC':'Oceania', 'SA':'South America'})
case_month['Month'] = case_month['Month'].replace({1:"Jan 2020", 2:"Feb 2020", 3:"Mar 2020", 4:"Apr 2020", 5:"May 2020", 6:"Jun 2020", 7:"Jul 2020", 8:"Aug 2020", 9:"Sep 2020", 10:"Oct 2020", 12:"Dec 2019",})
case_month.head()
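
The month labels are plain strings, so they carry no chronological order of their own, and December 2019 has month number 12 even though it comes first in the timeline. The sketch below shows one way to impose the intended order with an ordered categorical; the toy frame and its values are made up.

```python
import pandas as pd

# Intended chronological order of the labels used in this notebook
MONTH_ORDER = ["Dec 2019", "Jan 2020", "Feb 2020", "Mar 2020", "Apr 2020",
               "May 2020", "Jun 2020", "Jul 2020", "Aug 2020", "Sep 2020", "Oct 2020"]

# Hypothetical rows, deliberately out of order
toy = pd.DataFrame({
    "Country": ["A", "A", "A"],
    "Month": ["Jan 2020", "Oct 2020", "Dec 2019"],
    "Cases": [3, 40, 0],
})

# An ordered categorical sorts by position in MONTH_ORDER, not alphabetically
toy["Month"] = pd.Categorical(toy["Month"], categories=MONTH_ORDER, ordered=True)
toy = toy.sort_values(["Month", "Country"]).reset_index(drop=True)
print(toy)
```

Plotly Express functions also accept a `category_orders` argument, which may be another way to hand the same list to an animated figure.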

In [None]:
fig = px.scatter(case_month, x='GDP_Cap', y='Cases', hover_name='Country', animation_frame="Month", animation_group="Country", color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs GDP per Capita (per Country)'
)
fig.update_yaxes(
    range=(-100, 600),
    constrain='domain'
)
fig.show()

In [None]:
fig = px.scatter(case_month, x='HDI', y='Cases', hover_name='Country', animation_frame="Month", animation_group="Country", color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs HDI (per Country)'
)
fig.update_yaxes(
    range=(-100, 600),
    constrain='domain'
)
fig.show()

<a id="map"></a>
<a id="total_world"></a>
## World Map Representation (Total Figures)

In [None]:
fig = px.choropleth(data_agg, locations="Country", locationmode='country names', color='GDP_Cap', hover_name="Country", hover_data=['Population', 'HDI', 'Continent'], title='GDP per Capita across the world', template='simple_white', color_continuous_scale='Bluered_r')
fig.show()

In [None]:
fig = px.choropleth(data_agg, locations="Country", locationmode='country names', color='HDI', hover_name="Country", hover_data=['Population', 'Cases', 'Continent'], title='Human Development Index (HDI) across the world', template='simple_white',  color_continuous_scale='Inferno')
fig.show()

In [None]:
fig = px.choropleth(data_agg, locations="Country", locationmode='country names', color='Cases', hover_name="Country", hover_data=['Population', 'HDI', 'Continent'], title='Countries with Confirmed Cases', template='simple_white', color_continuous_scale='Viridis')
fig.show()

<a id="monthly_world"></a>
## World Map Representation (Monthly Figures)

In [None]:
fig = px.choropleth(case_month, locations="Country", locationmode='country names', color='Cases', hover_name="Country", hover_data=['Population', 'HDI', 'Continent'], animation_frame="Month", template='simple_white', title='Countries with Confirmed Cases', color_continuous_scale='Viridis')
fig.show()

<a id="strin"></a>
## Stringency Index

The Stringency Index measures how strict a country's response was in terms of protocol, health and safety, medical care and lockdown. A higher score usually means the country is working more actively to curb the spread of the virus.

<a id="total_strin"></a>
### Total Figures

In [None]:
fig = px.scatter(data_agg, x='Stringency_Index', y='GDP_Cap', hover_name='Country', color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='Stringency Index vs GDP Per Capita (per Country)'
)
fig.show()

In [None]:
fig = px.scatter(data_agg, x='Stringency_Index', y='HDI', hover_name='Country', color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='Stringency Index vs HDI (per Country)'
)
fig.show()

In [None]:
fig = px.scatter(data_agg, x='Stringency_Index', y='Cases', hover_name='Country', color='Continent', size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs Stringency Index (per Country)'
)
fig.show()

In [None]:
fig = px.scatter(data_agg, x='Stringency_Index', y='Cases', hover_name='Country', animation_frame="Continent", animation_group="Country", size='Population', template="simple_white", size_max=50)
fig.update_traces(textposition='top center')
fig.update_layout(
    height=500,
    title_text='COVID-19 Cases vs Stringency Index per Country (divided by Continents)'
)
fig["layout"].pop("updatemenus")
fig.show()

In [None]:
fig = px.choropleth(data_agg, locations="Country", locationmode='country names', color='Stringency_Index', hover_name="Country", hover_data=['Population', 'HDI', 'Cases', 'Continent'], template='simple_white', title='Stringency Index across the world', color_continuous_scale=px.colors.diverging.BrBG)
fig.show()

<a id="monthly_strin"></a>
### Monthly Figures

In [None]:
# Sum the cases and average the Stringency Index within each country and month,
# so that every country has a single row per month
stri_month = data.groupby(['Country', 'Month', 'Population', 'GDP_Cap', 'HDI', 'Code'], as_index=False).agg({'Cases': 'sum', 'Stringency_Index': 'mean'})

# Convert Alpha-3 codes to Alpha-2, then look up the continent for each row
alpha2_code_month = [pc.country_alpha3_to_country_alpha2(i) for i in stri_month['Code']]
continent_code_month = []
for j in alpha2_code_month:
    try:
        continent_code_month.append(pc.country_alpha2_to_continent_code(j))
    except KeyError:
        # No continent is known for this code
        continent_code_month.append('Unknown')

stri_month['Continent'] = continent_code_month
stri_month = stri_month.replace({'AF':'Africa', 'AN':'Antarctica', 'AS':'Asia', 'EU':'Europe', 'NA':'North America', 'OC':'Oceania', 'SA':'South America'})
stri_month['Month'] = stri_month['Month'].replace({1:"Jan 2020", 2:"Feb 2020", 3:"Mar 2020", 4:"Apr 2020", 5:"May 2020", 6:"Jun 2020", 7:"Jul 2020", 8:"Aug 2020", 9:"Sep 2020", 10:"Oct 2020", 12:"Dec 2019"})

In [None]:
fig = px.choropleth(stri_month, locations="Country", locationmode='country names', color='Stringency_Index', hover_name="Country", hover_data=['Population', 'HDI', 'Cases', 'Continent'], animation_frame="Month", template='simple_white', title='Stringency Index across the world', color_continuous_scale='ice')
fig.show()

<a id="rank"></a>
## Ranking the Safest Countries during COVID-19

In [None]:
# Work on a copy so the new columns do not leak into data_agg
rankings = data_agg.copy()
rankings['Death_Rate'] = rankings['Deaths'] / rankings['Cases']
rankings['Inf_Rate'] = (rankings['Cases'] / rankings['Population']) * 100
# Each country's GDP per capita as a percentage of the sum over all countries
rankings['GDP'] = (rankings['GDP_Cap'] / rankings['GDP_Cap'].sum()) * 100
rankings = rankings[['Country', 'HDI', 'GDP', 'Population', 'Stringency_Index', 'Death_Rate', 'Inf_Rate', 'Continent']].copy()
# Divisions by zero produce inf or NaN; treat both as 0
rankings = rankings.replace([np.inf, -np.inf], np.nan).fillna(0)
rankings = rankings.round(2)
rankings['Score'] = (rankings['HDI'] + rankings['GDP'] + rankings['Stringency_Index']) - (rankings['Death_Rate'])
rankings = rankings.sort_values('Score', ascending=False)

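We use a relatively simple metric: the score adds the HDI, the GDP share and the Stringency Index, and subtracts the death rate. The components are summed as they are, without rescaling, and the infection rate is computed for reference but does not enter the score.

The higher the score, the better your chances of surviving SARS-CoV-2 in that particular country.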
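
When components sit on different scales, the largest one can dominate a plain sum. The sketch below is an alternative and not the metric used in this notebook: it min-max scales each component to the range 0 to 1 before combining them. The toy values, the helper `min_max` and the equal weighting are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical component values for three made-up countries
toy = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "HDI": [0.5, 0.7, 0.9],
    "Stringency_Index": [20.0, 60.0, 40.0],
    "Death_Rate": [0.05, 0.01, 0.03],
}).set_index("Country")

def min_max(col):
    """Rescale a column to [0, 1]; a constant column maps to 0."""
    span = col.max() - col.min()
    return (col - col.min()) / span if span else col * 0.0

# Apply the rescaling to every column
scaled = toy.apply(min_max)

# Higher HDI and stringency raise the score, a higher death rate lowers it
score = scaled["HDI"] + scaled["Stringency_Index"] - scaled["Death_Rate"]
print(score.sort_values(ascending=False))
```

With every component on the same scale, the score ranges from -1 to 2 in this sketch, and weights could be added if some factors should count more than others.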

In [None]:
fig = px.bar(rankings[0:50], x='Score', y='Country', hover_name='Country', template="simple_white", color_continuous_scale='ice')
fig.update_layout(
    height=1000,
    title_text='TOP 50 SAFEST COUNTRIES TO LIVE IN (DURING COVID-19)'
)
fig.show()