# COVID-19 (Coronavirus) Survey

In 2019, a novel coronavirus, __Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2)__, was first identified in China.
The illness caused by this virus is a new respiratory disease called __Coronavirus Disease 2019 (COVID-19)__.
The World Health Organization has declared it a __pandemic__.

Below is an analysis of how the pandemic has affected nations globally.

In [133]:
import pandas as pd

import numpy as np
import plotly
import chart_studio.plotly as py
import plotly.offline as offline
import plotly.graph_objs as go
from plotly.subplots import make_subplots
import datetime
import plotly.express as px

offline.init_notebook_mode(connected=True)

In [134]:
covid_data_raw = pd.read_csv('covid_19_data.csv')
confirmed_raw = pd.read_csv('time_series_covid_19_confirmed.csv')
deaths_raw = pd.read_csv('time_series_covid_19_deaths.csv')
recovered_raw = pd.read_csv('time_series_covid_19_recovered.csv')

In [135]:
confirmed_raw.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,3/20/20,3/21/20,3/22/20,3/23/20,3/24/20,3/25/20,3/26/20,3/27/20,3/28/20,3/29/20
0,,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,24,24,40,40,74,84,94,110,110,120
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,70,76,89,104,123,146,174,186,197,212
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,90,139,201,230,264,302,367,409,454,511
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,75,88,113,133,164,188,224,267,308,334
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,1,2,2,3,3,3,4,4,5,7


In [136]:
def covid_rename_columns(input_data):
    """Rename the raw location columns to short lowercase names."""
    output_data = input_data.rename(columns={'Province/State':'subregion', 'Country/Region':'country','Lat':'lat', 'Long':'long'})
    return output_data

In [137]:
def covid_melt_data(input_data, value_var_name):
    """Reshape from wide (one column per date) to long (one row per location and date)."""
    output_data = input_data.melt(id_vars=['subregion','country','lat','long'], var_name='date_raw', value_name=value_var_name)
    return output_data

In [138]:
def covid_convert_dates(input_data):
    """Parse the 'date_raw' strings (e.g. '1/22/20') into a datetime 'date' column."""
    output_data = input_data.assign(date=pd.to_datetime(input_data.date_raw, format='%m/%d/%y'))
    return output_data.drop(columns=['date_raw'])

In [139]:
def covid_rearrange_data(input_data, value_var_name):
    """Keep the columns of interest, sort by location and date, and reset the index."""
    output_data = (input_data.filter(['country','subregion','date','lat','long', value_var_name])
                   .sort_values(['country','subregion','date','lat','long'])
                   .reset_index(drop=True))

    return output_data

In [140]:
def covid_get_data(covid_data, value_var_name):
    """Run the full cleaning pipeline: rename, melt, parse dates, rearrange."""
    covid_data = covid_rename_columns(covid_data)
    covid_data = covid_melt_data(covid_data, value_var_name)
    covid_data = covid_convert_dates(covid_data)
    covid_data = covid_rearrange_data(covid_data, value_var_name)

    return covid_data

In [141]:
covid_confirmed = covid_get_data(confirmed_raw, 'confirmed')
covid_deaths  = covid_get_data(deaths_raw, 'deaths')
covid_recovered = covid_get_data(recovered_raw, 'recovered')

In [142]:
covid_data_merged = (covid_confirmed.merge(covid_deaths, on=['country','subregion','lat','long','date'], how='left')
 .merge(covid_recovered, on=['country','subregion','date', 'lat','long'], how='left'))

In [143]:
covid_data_merged.head()

Unnamed: 0,country,subregion,date,lat,long,confirmed,deaths,recovered
0,Afghanistan,,2020-01-22,33.0,65.0,0,0,0.0
1,Afghanistan,,2020-01-23,33.0,65.0,0,0,0.0
2,Afghanistan,,2020-01-24,33.0,65.0,0,0,0.0
3,Afghanistan,,2020-01-25,33.0,65.0,0,0,0.0
4,Afghanistan,,2020-01-26,33.0,65.0,0,0,0.0


In [144]:
covid_data_sum = covid_data_merged.filter(['country','confirmed','deaths','recovered']).groupby('country').agg('max')

In [145]:
def top_ten_sum(df,col):
    """Return the ten rows with the largest values in `col`, in descending order."""
    return df.filter([col]).sort_values(by=col, ascending=False)[:10]

In [146]:
top_10_confirmed = top_ten_sum(covid_data_sum,'confirmed')
top_10_recovered = top_ten_sum(covid_data_sum,'recovered')
top_10_deaths = top_ten_sum(covid_data_sum,'deaths')

In [147]:
fig = px.bar(top_10_confirmed, x=top_10_confirmed.index, y='confirmed', text='confirmed', color='confirmed')
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(uniformtext_minsize=8, uniformtext_mode = 'hide')
fig.show()

In [148]:
fig = px.bar(top_10_recovered, x=top_10_recovered.index, y='recovered', text='recovered', color='recovered')
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(uniformtext_minsize=8, uniformtext_mode = 'hide')
fig.show()

In [149]:
fig = px.bar(top_10_deaths, x=top_10_deaths.index, y='deaths', text='deaths', color='deaths')
fig.update_traces(texttemplate='%{text:.3s}', textposition='outside')
fig.update_layout(uniformtext_minsize=8, uniformtext_mode = 'hide')
fig.show()

- The four nations with the most confirmed cases are, in order: __the United States__, __Italy__, __Spain__, and __China__.
- __China__ has the highest number of __recovered__ cases, followed by __Italy__ and __Spain__.
- __Italy__ has the highest number of __deaths__, followed by __Spain__ and __China__.

- As of __March 30__, the three nations most affected by this pandemic are __Italy__, __China__, and __the United States__.
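
One caveat about the ranking: `covid_data_sum` takes the largest single row per country (`agg('max')`). For a country that is reported by subregion, that is the largest subregion rather than the national total. The sketch below, on made-up toy numbers, contrasts this with summing the subregions per date and then keeping the latest date. It assumes the counts are cumulative, as they are in the time-series files; the names `toy`, `row_max`, and `latest_total` are illustrative only.

```python
import pandas as pd

# Toy data (made-up numbers): country "A" is reported by subregion, "B" is not.
toy = pd.DataFrame({
    "country":   ["A", "A", "A", "A", "B", "B"],
    "subregion": ["x", "x", "y", "y", None, None],
    "date": pd.to_datetime(["2020-03-28", "2020-03-29"] * 3),
    "confirmed": [60, 70, 30, 40, 80, 90],
})

# Largest single row per country: "A" is represented by subregion "x" only.
row_max = toy.groupby("country")["confirmed"].max()

# Sum the subregions for each country and date, then keep the latest date.
per_day = toy.groupby(["country", "date"])["confirmed"].sum()
latest_total = per_day.groupby(level="country").last()

print(row_max)       # A: 70, B: 90
print(latest_total)  # A: 110, B: 90
```

With the toy numbers the order of the two countries flips, so it may be worth checking whether the same happens for countries such as China in the real data.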

Confirmed cases and deaths are generally highly correlated,
so it is interesting that the high number of confirmed cases in __the US__ is not matched by a correspondingly high number of deaths.

To investigate, we look at how the data changed over time for the US and the two other highly affected countries, China and Italy.
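
One way to put a number on this mismatch is the ratio of deaths to confirmed cases per country. The following is a minimal sketch on made-up totals; the column name `deaths_per_100_confirmed` is an assumption for illustration, and in the notebook the same arithmetic could be applied to `covid_data_sum`. The ratio depends on how widely a country tests and on the delay between diagnosis and death, so it is a rough indicator only.

```python
import pandas as pd

# Toy cumulative totals (made-up numbers), indexed by country.
totals = pd.DataFrame(
    {"confirmed": [1000, 500, 200], "deaths": [20, 50, 4]},
    index=["A", "B", "C"],
)

# Deaths per 100 confirmed cases.
totals["deaths_per_100_confirmed"] = 100 * totals["deaths"] / totals["confirmed"]

# Countries with the highest ratio first.
print(totals.sort_values("deaths_per_100_confirmed", ascending=False))
```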

In [150]:
# Sum only the count columns; 'country' is left out so that sum() does not try to add strings.
total_per_day = (covid_data_merged.filter(['date','confirmed','deaths','recovered'])
                 .groupby(['date'])
                 .agg('sum'))

In [152]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=total_per_day.index,
                y=total_per_day.confirmed,
                name="confirmed",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=total_per_day.index,
                y=total_per_day.recovered,
                name="recovered",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=total_per_day.index,
                y=total_per_day.deaths,
                name="deaths",
                mode='lines+markers',
                line_color='red',
                opacity=0.8))

fig.update_layout(
    title='Total cases per day from January, 2020 to March, 2020',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

- The total number of __confirmed__ cases globally is far higher than the number of __recovered__ cases or __deaths__.
- The ideal situation is to __flatten the curve__ in all of the three cases.
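
Because the series plotted above are cumulative, a "flattened" curve corresponds to daily new cases falling towards zero. A minimal sketch of that conversion on a made-up series follows; in the notebook the same `diff()` call could be applied to `total_per_day.confirmed`. The three-day window is an arbitrary choice for illustration.

```python
import pandas as pd

# Toy cumulative counts (made-up numbers), one value per day.
cumulative = pd.Series(
    [0, 2, 5, 11, 20, 26, 29, 30, 30],
    index=pd.date_range("2020-03-01", periods=9, freq="D"),
    name="confirmed",
)

# Day-over-day difference gives new cases per day.
# The first day has no predecessor, so it keeps its own cumulative value.
daily_new = cumulative.diff().fillna(cumulative.iloc[0]).astype(int)

# A short rolling mean smooths out day-to-day reporting noise.
smoothed = daily_new.rolling(window=3).mean()

print(daily_new.tolist())   # [0, 2, 3, 6, 9, 6, 3, 1, 0]
print(daily_new.idxmax())   # the day with the most new cases
```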

In [153]:
def country_per_day(df, country):
    """Return daily totals for one country, summed across its subregions."""
    return (df.query("country == @country")
            .filter(['date','confirmed','deaths','recovered'])
            .groupby(['date']).agg('sum'))

In [154]:
China = country_per_day(covid_data_merged, 'China')
US = country_per_day(covid_data_merged, 'US')
Italy = country_per_day(covid_data_merged, 'Italy')
Nigeria = country_per_day(covid_data_merged, 'Nigeria')

In [155]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=US.index,
                y=US.confirmed,
                name="confirmed (United States)",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=China.index,
                y=China.confirmed,
                name="confirmed (China)",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=Italy.index,
                y=Italy.confirmed,
                name="confirmed (Italy)",
                mode='lines+markers',
                line_color='purple',
                opacity=0.8))

fig.update_layout(
    title='Total confirmed cases per day for the US, China, and Italy',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

The number of __confirmed cases__ in the US grew slowly from __January 22__ until it began to rise sharply around __March 18__.
It surpassed both China and Italy on __March 26__, making the US the country with the [highest number of confirmed cases](https://www.nytimes.com/2020/03/26/health/usa-coronavirus-cases.html) [1].
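
Dates like these can be read off the data rather than the chart. The sketch below uses two made-up series, `a` (fast-growing) and `b` (plateauing), to compute the day-over-day growth factor and the first date on which one series exceeds the other. In the notebook, `US.confirmed` and `China.confirmed` could take their place; all names here are illustrative.

```python
import pandas as pd

dates = pd.date_range("2020-03-20", periods=6, freq="D")
a = pd.Series([10, 20, 40, 80, 160, 320], index=dates)       # fast-growing (toy)
b = pd.Series([100, 105, 108, 110, 111, 111], index=dates)   # plateauing (toy)

# Growth factor: today's cumulative count divided by yesterday's.
growth_factor = a / a.shift(1)

# First date on which `a` exceeds `b`, or None if it never does.
ahead = a > b
first_overtake = ahead.idxmax() if ahead.any() else None

print(growth_factor.dropna().tolist())  # [2.0, 2.0, 2.0, 2.0, 2.0]
print(first_overtake)                   # 2020-03-24 in this toy example
```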

The question is: what triggered such exponential growth in confirmed cases in the US?
Was it flights arriving from Europe and China, where the pandemic had its stronghold? Social distancing not being thoroughly practised by those infected with the virus?
A failure to take the pandemic seriously as it engulfed China? Or a combination of these factors?


In [163]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=US.index,
                y=US.deaths,
                name="deaths (US)",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=China.index,
                y=China.deaths,
                name="deaths (China)",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=Italy.index,
                y=Italy.deaths,
                name="deaths (Italy)",
                mode='lines+markers',
                line_color='purple',
                opacity=0.8))


fig.update_layout(
    title='Total death cases per day for the US, China, and Italy',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

The graph of __deaths__ shows the __US__ count rising towards that of __China__, though __Italy__ still
has the highest number of deaths.

Will the __US__ surpass China and Italy, judging from the exponential growth? If so, when will that be?
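
A rough way to approach the "when" question is to estimate a doubling time from the recent counts. The sketch below fits a straight line to the base-2 logarithm of a made-up series; `doubling_time_days` is a hypothetical helper, not part of the notebook. It assumes strictly positive, steadily growing counts, and it describes the recent trend only; it is not a forecast.

```python
import numpy as np

def doubling_time_days(counts):
    """Estimate the doubling time in days from daily cumulative counts.

    Assumes the counts are positive and growing roughly exponentially.
    """
    counts = np.asarray(counts, dtype=float)
    days = np.arange(len(counts))
    # Fit log2(count) = slope * day + intercept; slope is doublings per day.
    slope, _intercept = np.polyfit(days, np.log2(counts), 1)
    return 1.0 / slope

# Toy series (made-up numbers) that doubles about every two days.
toy_deaths = [100, 141, 200, 283, 400, 566, 800]
print(round(doubling_time_days(toy_deaths), 1))  # close to 2 days
```

In the notebook this could be applied to the last week or two of `US.deaths`, bearing in mind that the estimate changes as the growth rate changes.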

What effective measures should be taken to help infected people recover from the disease in order to flatten the curve in all countries?

In [164]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=US.index,
                y=US.recovered,
                name="recovered (US)",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=China.index,
                y=China.recovered,
                name="recovered (China)",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=Italy.index,
                y=Italy.recovered,
                name="recovered (Italy)",
                mode='lines+markers',
                line_color='purple',
                opacity=0.8))


fig.update_layout(
    title='Total recovered cases per day for the US, China, and Italy',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

__China__ plateaued (__flattened the curve__) in both __confirmed__ cases and __deaths__ around __March 1__, and has a higher rate of __recovery__ than __Italy__ and __the US__.
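
The recovery comparison can be made explicit by dividing recovered cases by confirmed cases for each country. A minimal sketch on a made-up per-country frame follows; it has the same columns as the frames returned by `country_per_day`, and the names `recovered_share` and `active` are illustrative. Dates on which `confirmed` is zero would need to be excluded first, since the division is undefined there.

```python
import pandas as pd

# Toy daily totals for one country (made-up numbers).
dates = pd.date_range("2020-03-27", periods=3, freq="D")
country = pd.DataFrame(
    {"confirmed": [100, 120, 150], "deaths": [2, 3, 3], "recovered": [40, 60, 90]},
    index=dates,
)

# Share of confirmed cases that have recovered so far.
recovered_share = country["recovered"] / country["confirmed"]

# Cases that are neither recovered nor fatal are still active.
active = country["confirmed"] - country["deaths"] - country["recovered"]

print(recovered_share.round(2).tolist())  # [0.4, 0.5, 0.6]
print(active.tolist())                    # [58, 57, 57]
```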

- They achieved this through:
  - Total lockdown
  - Social distancing
  - Efficient healthcare

These measures were strictly enforced by the Chinese government. Other nations should follow suit, especially those without large and efficient healthcare systems, since even nations that have them are struggling to contain the disease, with their healthcare systems overwhelmed by infected patients.

These measures are also being enforced in African countries, one of which is Nigeria. Borders have been closed to prevent entry from other countries, there is a total lockdown in the states, and social distancing is strongly encouraged.

We now compare __Nigeria__ with __China__, one of the most highly affected countries.

In [165]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=Nigeria.index,
                y=Nigeria.confirmed,
                name="confirmed (Nigeria)",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=China.index,
                y=China.confirmed,
                name="confirmed (China)",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.update_layout(
    title='Total confirmed cases per day for China and Nigeria',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

In [166]:
fig = go.Figure()

fig.add_trace(go.Scatter(
                x=Nigeria.index,
                y=Nigeria.deaths,
                name="deaths (Nigeria)",
                mode='lines+markers',
                line_color='deepskyblue',
                opacity=0.8))

fig.add_trace(go.Scatter(
                x=China.index,
                y=China.deaths,
                name="deaths (China)",
                mode='lines+markers',
                line_color='green',
                opacity=0.8))

fig.update_layout(
    title='Total death cases per day for China and Nigeria',
    yaxis = dict(
        showgrid=False,
        showline=False,
        showticklabels=True,
        domain=[0,0.85],
    ),

    xaxis = dict(
        zeroline=False,
        showline=False,
        showticklabels=True,
        showgrid=True,
        domain=[0,0.85],
    ),
    
    legend=dict(x=0.029,y=1.038, font_size=10),
    margin=dict(l=100, r=20, t=70, b=70),
    paper_bgcolor='rgb(248, 248, 255)',
    plot_bgcolor='rgb(248, 248, 255)',

)

fig.show()

- The curve is almost __flattened__ in both __confirmed__ and __death__ cases in Nigeria.
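
A check worth making here: when a small series shares a linear axis with a much larger one, it can look flat simply because of the scale. The sketch below, on made-up numbers, rescales each series by its own latest value so that the shapes can be compared directly. `share_of_latest` and `recent_growth` are hypothetical helpers, and the two-day window is an arbitrary choice.

```python
import pandas as pd

dates = pd.date_range("2020-03-25", periods=5, freq="D")
large = pd.Series([80000, 80500, 80800, 80900, 80950], index=dates)  # toy, plateauing
small = pd.Series([10, 20, 40, 70, 110], index=dates)                # toy, still growing

def share_of_latest(series):
    """Scale a cumulative series so that its most recent value equals 1."""
    return series / series.iloc[-1]

def recent_growth(series, days=2):
    """Fraction of the current total that was added in the last `days` days."""
    return 1 - share_of_latest(series).iloc[-1 - days]

print(round(recent_growth(large), 4))  # small fraction: the curve is flat
print(round(recent_growth(small), 4))  # large fraction: the curve is still rising
```

Applied to `Nigeria.confirmed` and `China.confirmed`, a comparison like this would show whether Nigeria's curve is flat in its own terms or only relative to China's.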

However, we should not count our blessings (or data) too soon: complacency and lax enforcement of lockdown and social distancing could leave us suffering the fate of the US, which is now, _following a series of missteps_, the [__epicenter of the pandemic__](https://www.nytimes.com/2020/03/26/health/usa-coronavirus-cases.html) [1].

### EFFECTIVE MEASURES TO BE TAKEN TO FLATTEN THE CURVE OF THE PANDEMIC

In summary:

- Stay at home.
- Avoid social gatherings.
- Wash your hands regularly.
- Sanitize.
- Avoid touching your face or nostrils with your hands.

If you have symptoms of the disease (fever, cough, or difficulty breathing), contact your country's Center for Disease Control (CDC).
The toll-free number for Nigeria's CDC (NCDC) is __0800 970 000 0010__. Do not self-medicate.

These measures may seem difficult, but the coronavirus pandemic is __real__, and it is killing __thousands__ of people __daily__.
You do not want to experience the alternative, because you might not live to tell the tale.


# References

[1] "The U.S. Now Leads the World in Confirmed Coronavirus Cases," The New York Times. https://www.nytimes.com/2020/03/26/health/usa-coronavirus-cases.html