<h1 align="center"> COVID-19 Case Study - A Beginner Friendly Analysis </h1>

In this Notebook, I have performed a beginner friendly analysis of COVID19. Please UPVOTE & COMMENT if you find it useful. 

<img src="https://www.heywood.org/images/heyBanner/imgBanner/banner2.png" width="2000px" align="center"></img>

<i>Image Credits: <a href="https://nenow.in/north-east-news/manipur/manipurs-covid-19-tally-reaches-385-with-19-new-cases.html">Northeast Now</a></i>

# Content:

- [About COVID-19](#About-COVID-19)
- [Dataset](#Dataset)
- [Imports and Datasets](#Imports-and-Datasets)
- [Preprocessing](#Preprocessing)
- [Global Analysis](#Global-Analysis)
- [Continent wise Analysis](#Continent-wise-Analysis)
- [Country wise Analysis](#Country-wise-Analysis)
- [Feedback](#Feedback)


 # About-COVID-19

<p><b>Coronaviruses</b> are a family of viruses that can cause illness in humans; the disease caused by the newest member of the family is called <b>COVID19</b>. The viruses behind Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV), which the world has already faced, belong to the same family.</p>
<p> SARS-CoV-2 (n-coronavirus) is a new virus of the Coronavirus family that was discovered in 2019 and had not been identified in humans before. It is a contagious virus that was first reported in Wuhan in December 2019, and the outbreak was later declared a pandemic by the WHO because of its rapid spread throughout the world.</p>
<p> In this notebook, I have made a small effort to analyze the data on Covid cases over time. The notebook is divided into 3 main sections - Global Analysis, Continent wise Analysis and Country wise Analysis. </p>

 # Dataset
 
All datasets are imported from the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by **Johns Hopkins CSSE** - https://github.com/CSSEGISandData/COVID-19
- **df_covid19** contains the number of Confirmed cases, Deaths, Recovered, Active, Incident_Rate and Mortality_Rate for each country, updated daily.
- **df_confirmed** and **df_deaths** are both time-series datasets that Johns Hopkins CSSE updates daily from various official sources.
- df_confirmed lists the cumulative number of confirmed cases in different countries on a daily basis.
- df_deaths lists the cumulative number of deaths in different countries on a daily basis.
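
To make the layout of the time-series files concrete, here is a minimal sketch on a made-up frame. It assumes the wide layout of the repository's global time-series CSVs (four descriptive columns followed by one column per date in `m/d/yy` form); the country names and counts are invented for illustration, and `sample` and `long_form` are not names used elsewhere in this notebook.

```python
import pandas as pd

# Made-up data in the wide layout: one row per region, one column per date.
sample = pd.DataFrame({
    "Province/State": [None, None],
    "Country/Region": ["Country A", "Country B"],  # hypothetical names
    "Lat": [10.0, 20.0],
    "Long": [30.0, 40.0],
    "1/22/20": [0, 1],
    "1/23/20": [2, 1],
    "1/24/20": [5, 4],
})

# Every column after the four descriptive ones holds a cumulative count.
date_columns = list(sample.columns[4:])

# Reshape to one row per (country, date), which is easier to filter.
long_form = sample.melt(
    id_vars=["Country/Region"],
    value_vars=date_columns,
    var_name="Date",
    value_name="Confirmed",
)
long_form["Date"] = pd.to_datetime(long_form["Date"], format="%m/%d/%y")
print(long_form)
```

The rest of the notebook works on the wide layout directly; the long form is only shown here to make the structure easier to read.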

# Imports-and-Datasets
- Pandas - for performing basic operations on dataset 
- Numpy - for mathematical calculations
- Matplotlib - for visualizations & plots
- Folium - for visualizing data on Maps
- pycountry_convert - for getting continent name from their country names

In [None]:
!pip install pycountry_convert 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import folium
import pycountry_convert as pc
import warnings
from datetime import datetime, timedelta, date
warnings.filterwarnings('ignore')

%matplotlib inline

In [None]:
# importing covid19 confirmed cases dataset
df_confirmed = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

# renaming some columns
df_confirmed = df_confirmed.rename(columns={"Country/Region": "Country", "Province/State": "State"})
df_confirmed.head()

In [None]:
# importing covid19 deaths dataset
df_deaths = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')

# renaming some columns
df_deaths = df_deaths.rename(columns={"Country/Region": "Country", "Province/State": "State"})
df_deaths.head()

In [None]:
# importing covid19 dataset
df_covid19 = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/web-data/data/cases_country.csv")

# dropping columns that are not required
df_covid19.drop(['Last_Update', 'People_Tested', 'People_Hospitalized', 'UID', 'ISO3'], inplace=True, axis=1)

# changing column name
df_covid19 = df_covid19.rename(columns={"Country_Region": "Country"})
df_covid19.head()

In [None]:
# Brief info about the dataset
df_covid19.info()

# Preprocessing

In [None]:
# Changing the country names as required by the pycountry_convert library
df_confirmed.loc[df_confirmed['Country'] == "US", "Country"] = "USA"
df_deaths.loc[df_deaths['Country'] == "US", "Country"] = "USA"
df_covid19.loc[df_covid19['Country'] == "US", "Country"] = "USA"


df_confirmed.loc[df_confirmed['Country'] == 'Korea, South', "Country"] = 'South Korea'
df_deaths.loc[df_deaths['Country'] == 'Korea, South', "Country"] = 'South Korea'
df_covid19.loc[df_covid19['Country'] == "Korea, South", "Country"] = "South Korea"

df_confirmed.loc[df_confirmed['Country'] == 'Taiwan*', "Country"] = 'Taiwan'
df_deaths.loc[df_deaths['Country'] == 'Taiwan*', "Country"] = 'Taiwan'
df_covid19.loc[df_covid19['Country'] == "Taiwan*", "Country"] = "Taiwan"

df_confirmed.loc[df_confirmed['Country'] == 'Congo (Kinshasa)', "Country"] = 'Democratic Republic of the Congo'
df_deaths.loc[df_deaths['Country'] == 'Congo (Kinshasa)', "Country"] = 'Democratic Republic of the Congo'
df_covid19.loc[df_covid19['Country'] == "Congo (Kinshasa)", "Country"] = "Democratic Republic of the Congo"

df_confirmed.loc[df_confirmed['Country'] == "Cote d'Ivoire", "Country"] = "Côte d'Ivoire"
df_deaths.loc[df_deaths['Country'] == "Cote d'Ivoire", "Country"] = "Côte d'Ivoire"
df_covid19.loc[df_covid19['Country'] == "Cote d'Ivoire", "Country"] = "Côte d'Ivoire"

df_confirmed.loc[df_confirmed['Country'] == "Reunion", "Country"] = "Réunion"
df_deaths.loc[df_deaths['Country'] == "Reunion", "Country"] = "Réunion"
df_covid19.loc[df_covid19['Country'] == "Reunion", "Country"] = "Réunion"

df_confirmed.loc[df_confirmed['Country'] == 'Congo (Brazzaville)', "Country"] = 'Republic of the Congo'
df_deaths.loc[df_deaths['Country'] == 'Congo (Brazzaville)', "Country"] = 'Republic of the Congo'
df_covid19.loc[df_covid19['Country'] == "Congo (Brazzaville)", "Country"] = "Republic of the Congo"

df_confirmed.loc[df_confirmed['Country'] == 'Bahamas, The', "Country"] = 'Bahamas'
df_deaths.loc[df_deaths['Country'] == 'Bahamas, The', "Country"] = 'Bahamas'
df_covid19.loc[df_covid19['Country'] == "Bahamas, The", "Country"] = "Bahamas"

df_confirmed.loc[df_confirmed['Country'] == 'Gambia, The', "Country"] = 'Gambia'
df_deaths.loc[df_deaths['Country'] == 'Gambia, The', "Country"] = 'Gambia'
df_covid19.loc[df_covid19['Country'] == "Gambia, The", "Country"] = "Gambia"
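
The same renaming can also be written as a single mapping. The snippet below is only a sketch on a made-up frame (`frame` and `renames` are illustrative names, not part of this notebook) and relies on `Series.replace` accepting a dictionary of old-to-new values.

```python
import pandas as pd

# Old name -> name expected by pycountry_convert (a subset of the list above).
renames = {
    "US": "USA",
    "Korea, South": "South Korea",
    "Taiwan*": "Taiwan",
}

# Made-up frame standing in for df_confirmed / df_deaths / df_covid19.
frame = pd.DataFrame({"Country": ["US", "Korea, South", "Taiwan*", "India"]})

# Whole-value matches are swapped; names not in the mapping stay as they are.
frame["Country"] = frame["Country"].replace(renames)
print(frame["Country"].tolist())
```

Keeping the names in one dictionary means a new rename is a one-line change that applies to all three datasets when the `replace` call is run in a loop over them.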

In [None]:
continents = {
    'NA': 'North America',
    'SA': 'South America', 
    'AS': 'Asia',
    'OC': 'Australia',
    'AF': 'Africa',
    'EU' : 'Europe',
    'OTH' : 'Others'
}

In [None]:
# function to find the continent of the country supplied
def country_to_continent(country):
    try:
        continent_code = pc.country_alpha2_to_continent_code(pc.country_name_to_country_alpha2(country))
    except Exception:
        continent_code = 'OTH'
    return continents[continent_code]
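
The function above is a lookup with a fallback: if either step fails, the country lands in 'Others'. Here is a small sketch of that pattern that runs without the library. The two dictionaries are hypothetical stand-ins for the `pycountry_convert` lookups and cover only three countries, and `lookup_continent` is an illustrative name.

```python
# Hypothetical stand-ins for the two pycountry_convert lookups.
NAME_TO_ALPHA2 = {"India": "IN", "Brazil": "BR", "France": "FR"}
ALPHA2_TO_CONTINENT = {"IN": "AS", "BR": "SA", "FR": "EU"}
CONTINENT_NAMES = {"AS": "Asia", "SA": "South America", "EU": "Europe", "OTH": "Others"}


def lookup_continent(country):
    """Return the continent name, or 'Others' when the country is unknown."""
    try:
        code = ALPHA2_TO_CONTINENT[NAME_TO_ALPHA2[country]]
    except KeyError:
        # Unknown names (for example cruise ships in the data) fall back here.
        code = "OTH"
    return CONTINENT_NAMES[code]


print(lookup_continent("India"))
print(lookup_continent("Diamond Princess"))
```

Rows that end up in 'Others' are worth a quick look, since a spelling difference in a country name produces the same result as a genuinely unmapped entry.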

In [None]:
# extracting the countries columns from all the 3 datasets 
countries_covid19 = np.asarray(df_covid19["Country"])
countries_confirmed = np.asarray(df_confirmed["Country"])
countries_deaths = np.asarray(df_deaths["Country"])

In [None]:
# applying the above function to all the 3 datasets to find the continents of the respective countries
df_covid19.insert(1,"Continent",  [country_to_continent(country) for country in countries_covid19])
df_confirmed.insert(1,"Continent",  [country_to_continent(country) for country in countries_confirmed])
df_deaths.insert(1,"Continent",  [country_to_continent(country) for country in countries_deaths])

# Global-Analysis

This section analyzes total Confirmed cases, Deaths reported, Recoveries, Active cases and Mortality Rate across the world as of **25th July, 2020**

In [None]:
df_global = df_covid19.drop(['Country', 'Continent', 'Lat', 'Long_', 'Incident_Rate', 'Mortality_Rate'], axis=1)

In [None]:
df_global_cases = pd.DataFrame(pd.to_numeric(df_global.sum()), dtype=np.float64).transpose()
df_global_cases['Mortality_Rate'] = np.round((df_global_cases["Deaths"]/df_global_cases["Confirmed"])*100,2)
df_global_cases
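
The Mortality_Rate column above is deaths as a percentage of confirmed cases, rounded to two decimals. A minimal sketch of the same arithmetic on made-up numbers follows; the zero guard is my own assumption and is not in the cell above.

```python
def mortality_rate(deaths, confirmed):
    """Deaths as a percentage of confirmed cases, rounded to 2 decimals."""
    if confirmed == 0:
        return 0.0  # assumption: report 0 rather than dividing by zero
    return round(deaths * 100 / confirmed, 2)


print(mortality_rate(25, 1000))
print(mortality_rate(1, 3))
print(mortality_rate(0, 0))
```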

### Overview

The following Pie Chart shows the breakdown of Total Confirmed Cases into Active, Recovered and Deaths

In [None]:
labels =  [df_global_cases.columns[i]+ "\n" + str(int(df_global_cases.values[0][i])) for i in range(1,4)]
values = [df_global_cases.values[0][i] for i in range(1,4)]
plt.figure(figsize=(8,8))
plt.pie(values, labels=labels, autopct='%1.2f%%', pctdistance=0.85, labeldistance=1.1, textprops = {'fontsize':12})
my_circle = plt.Circle( (0,0), 0.7, color='white')
p = plt.gcf()
p.gca().add_artist(my_circle)
plt.text(0, 0, "Total\nConfirmed Cases \n"+str(int(df_global_cases.values[0][0])), horizontalalignment='center', verticalalignment='center', size=18)
plt.show()

### Global Spread Analysis of Covid19

The following graph shows the spread of Covid19. It shows how the number of confirmed cases and deaths have increased over time.

In [None]:
confirmed_cases = df_confirmed.drop(['Lat', 'Long', 'State', 'Country', 'Continent'], axis=1)
cases = confirmed_cases.sum().tolist()
cases = np.asarray(cases)

death_cases = df_deaths.drop(['Lat', 'Long', 'State', 'Country', 'Continent'], axis=1)
deaths = death_cases.sum().tolist()
deaths = np.asarray(deaths)

dates = confirmed_cases.columns
d = [datetime.strptime(date,'%m/%d/%y').strftime("%d %b") for date in dates]
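
The last line above turns the CSV column names into short axis labels. Here is a small sketch of that conversion on two sample column names. The variant with the year is an assumption on my part about what might be needed once the data spans more than one calendar year, because labels like "22 Jan" would then repeat.

```python
from datetime import datetime

raw_dates = ["1/22/20", "7/25/20"]  # column names in the CSV's m/d/yy form

# Parse month/day/two-digit-year, then format as day plus month abbreviation.
labels = [datetime.strptime(raw, "%m/%d/%y").strftime("%d %b") for raw in raw_dates]
print(labels)

# Assumed variant: keep the year so labels stay unique across years.
labels_with_year = [datetime.strptime(raw, "%m/%d/%y").strftime("%d %b %y") for raw in raw_dates]
print(labels_with_year)
```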

In [None]:
plt.figure(figsize=(8,8))
marker_style_confirmed = dict(c="darkcyan", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
marker_style_death = dict(c="crimson", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
plt.plot(d, cases, label = 'Confirmed', **marker_style_confirmed)
plt.plot(d, deaths, label = 'Deaths', **marker_style_death)
plt.fill_between(d, cases, color='darkcyan', alpha=0.3)
plt.fill_between(d, deaths, color='crimson', alpha=0.3)
plt.xlabel("Date", fontsize = 15)
plt.ylabel("No. of Cases",fontsize = 15)
plt.title("COVID Cases: WorldWide", fontsize = 18)
plt.legend(loc= "best", fontsize = 15)
plt.grid(alpha=0.8)
plt.xticks(list(np.arange(0,len(d),int(len(d)/5))))
plt.yticks(np.arange(0, max(cases), 10**(len(str(int(max(cases))))-1)))
plt.show()
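
The `plt.yticks` line above picks its step from the number of digits in the largest value. A short sketch of that calculation on its own, with made-up maxima (`tick_step` is an illustrative name, not part of this notebook):

```python
def tick_step(max_value):
    """Largest power of ten that does not exceed max_value (for max_value >= 1)."""
    digits = len(str(int(max_value)))
    return 10 ** (digits - 1)


print(tick_step(15_900_000))  # an 8-digit maximum gives a step of ten million
print(tick_step(999))

# Tick positions for a maximum of 2500, built the same way as in the plot.
print(list(range(0, 2500, tick_step(2500))))
```

With this rule the number of ticks varies between one and nine depending on the leading digit of the maximum, so the y-axis can look sparse when the maximum has just crossed a power of ten.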

### Daily New Cases - Globally

The following graph shows the number of new confirmed cases reported each day across the world

In [None]:
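# Sum only the date columns, so the diff runs over cumulative counts and not over Lat/Long or text columns
daily_cases = np.nan_to_num(df_confirmed.drop(['State', 'Continent', 'Country', 'Lat', 'Long'], axis=1).sum().diff())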
f = plt.figure(figsize=(15,10))
date = np.arange(0,len(daily_cases))
marker_style = dict(linewidth=2, linestyle='-', marker='o',markersize=5)
plt.plot(date, daily_cases/1000,"-.",color="blue",**marker_style)

# Grid Settings
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Confirmed Cases - Worldwide",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily New Cases (Thousand)",fontsize =18)

plt.show()
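
Since the source files store cumulative totals, the daily figures come from subtracting each day's total from the next. A minimal sketch of that step on made-up numbers:

```python
import pandas as pd

# Made-up cumulative totals for five consecutive days.
cumulative = pd.Series(
    [0, 2, 5, 5, 9],
    index=["1/22/20", "1/23/20", "1/24/20", "1/25/20", "1/26/20"],
)

# diff() subtracts the previous day's total; the first day has no
# predecessor, so its NaN is replaced with 0.
daily_new = cumulative.diff().fillna(0)
print(daily_new.tolist())
```

A negative value in such a series would mean the cumulative total was revised downwards on that day.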

### Daily Deaths - Globally

The following graph shows the number of deaths reported each day across the world

In [None]:
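# Sum only the date columns, so the diff runs over cumulative counts and not over Lat/Long or text columns
daily_deaths = np.nan_to_num(df_deaths.drop(['State', 'Continent', 'Country', 'Lat', 'Long'], axis=1).sum().diff())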
f = plt.figure(figsize=(15,10))
date = np.arange(0,len(daily_deaths))
marker_style = dict(linewidth=2, linestyle='-', marker='o',markersize=5)
plt.plot(date, daily_deaths/1000,"-.",color="red",**marker_style)

# Grid Settings
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Deaths - Worldwide",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily Deaths (Thousand)",fontsize =18)

plt.show()

# Continent-wise-Analysis

This section analyzes Continent wise Total Confirmed cases, Deaths reported, Recoveries, Active cases and Mortality Rate as of **25th July, 2020**

In [None]:
df_continents = df_covid19.drop(['Country', 'Lat', 'Long_', 'Incident_Rate', 'Mortality_Rate'], axis=1)

In [None]:
df_continents_cases = df_continents.groupby('Continent').sum()
df_continents_cases['Mortality_Rate'] = np.round((df_continents_cases["Deaths"]/df_continents_cases["Confirmed"])*100,2)
df_continents_cases.drop(['Others'], inplace=True)
df_continents_cases
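
The table above sums the counts per continent first and only then computes the rate. A small sketch on made-up rows shows why the order matters; the countries and numbers are invented for illustration.

```python
import pandas as pd

# Made-up country rows; the continent labels mirror those used in this notebook.
rows = pd.DataFrame({
    "Country": ["A", "B", "C"],  # hypothetical countries
    "Continent": ["Asia", "Asia", "Europe"],
    "Confirmed": [1000, 3000, 2000],
    "Deaths": [10, 50, 100],
})

# Sum the counts per continent, then derive the rate from the totals.
per_continent = rows.groupby("Continent")[["Confirmed", "Deaths"]].sum()
per_continent["Mortality_Rate"] = (
    per_continent["Deaths"] / per_continent["Confirmed"] * 100
).round(2)
print(per_continent)
```

Averaging the two per-country rates for Asia (1.0 and about 1.67) would give roughly 1.33, whereas the rate from the totals is 1.5, because the larger country carries more weight.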

### Overview

The following Pie Chart shows the Continent wise distribution of Covid19 Confirmed Cases

In [None]:
labels = list(df_continents_cases.index)
sizes = df_continents_cases['Confirmed'].values
plt.figure(figsize=(8,8))
plt.pie(sizes, labels=labels, autopct='%1.2f%%', pctdistance=0.85, labeldistance=1.1, textprops = {'fontsize':10.5})
my_circle = plt.Circle( (0,0), 0.7, color='white')
p = plt.gcf()
p.gca().add_artist(my_circle)
plt.text(0, 0, "Continent wise \n Distribution of Cases", horizontalalignment='center', verticalalignment='center', size=18)
plt.show()

### COVID19 Spread Analysis

The following graphs show the spread of Covid19 across different Continents. It shows how the number of confirmed cases and deaths have increased over time.

In [None]:
# numeric_only=True keeps the text columns (State, Country) out of the sums
df_confirmed_continents = df_confirmed.groupby('Continent').sum(numeric_only=True)
df_confirmed_continents = df_confirmed_continents[df_confirmed_continents.index!='Others']
df_confirmed_continents.drop(['Lat', 'Long'], inplace=True, axis=1)

df_deaths_continents = df_deaths.groupby('Continent').sum(numeric_only=True)
df_deaths_continents = df_deaths_continents[df_deaths_continents.index!='Others']
df_deaths_continents.drop(['Lat', 'Long'], inplace=True, axis=1)

dates = df_confirmed_continents.columns
d = [datetime.strptime(date,'%m/%d/%y').strftime("%d %b") for date in dates]

In [None]:
marker_style_confirmed = dict(c="darkcyan", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
marker_style_death = dict(c="crimson", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
fig, axes = plt.subplots(nrows=3, ncols=2, figsize=(15,15))
plt.subplots_adjust(top = 1.2, bottom = 0.1)
i=0
for rows in axes:
    for ax1 in rows:
        ax1.plot(d, df_confirmed_continents.iloc[i], label = 'Confirmed', **marker_style_confirmed)
        ax1.plot(d, df_deaths_continents.iloc[i], label = 'Deaths', **marker_style_death)
        ax1.fill_between(d, df_confirmed_continents.iloc[i], color='darkcyan', alpha=0.3)
        ax1.fill_between(d, df_deaths_continents.iloc[i], color='crimson', alpha=0.3)
        ax1.set_xlabel("Dates", fontsize = 12)
        ax1.set_ylabel("No. of Cases",fontsize = 12)
        ax1.set_title("COVID Cases: "+df_deaths_continents.index[i], fontsize = 15)
        ax1.legend(loc= "best", fontsize = 12)
        ax1.grid(which='major', linewidth = 0.3)
        ax1.set_xticks(list(np.arange(0,len(d),int(len(d)/5))))
        i+=1

### Daily Confirmed Cases - Continents

The following graph shows the number of new confirmed cases reported each day across different Continents

In [None]:
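# Drop Lat/Long before diff so the first date is not differenced against a coordinate sum
daily_cases_continents = df_confirmed.groupby('Continent').sum(numeric_only=True).drop(['Lat', 'Long'], axis=1).diff(axis=1).replace(np.nan,0)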
daily_cases_continents = daily_cases_continents[daily_cases_continents.index!='Others']
f = plt.figure(figsize=(20,12))
ax = f.add_subplot(111)
for i,continent in enumerate(daily_cases_continents.index):
    t = daily_cases_continents.loc[daily_cases_continents.index == continent].values[0]
    t = t[t>=0]
    date = np.arange(0,len(t[:]))
    plt.plot(date,t/1000,'-o',label = continent,linewidth =2, markevery=[-1])

# Grid Settings
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Confirmed Cases in Continents",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily Confirmed Cases (Thousand)",fontsize =18)

# Legend
plt.legend(fontsize=18)

plt.show()

### Daily Deaths - Continents

The following graph shows the number of deaths reported each day across different Continents

In [None]:
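# Drop Lat/Long before diff so the first date is not differenced against a coordinate sum
daily_deaths_continents = df_deaths.groupby('Continent').sum(numeric_only=True).drop(['Lat', 'Long'], axis=1).diff(axis=1).replace(np.nan,0)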
daily_deaths_continents = daily_deaths_continents[daily_deaths_continents.index!='Others']
f = plt.figure(figsize=(20,12))
ax = f.add_subplot(111)
for i,continent in enumerate(daily_deaths_continents.index):
    t = daily_deaths_continents.loc[daily_deaths_continents.index == continent].values[0]
    t = t[t>=0]
    date = np.arange(0,len(t[:]))
    plt.plot(date,t/1000,'-o',label = continent,linewidth =2, markevery=[-1])

# Grid Settings
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Deaths in Continents",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily Deaths (Thousand)",fontsize =18)

# Legend
plt.legend(fontsize=18)

plt.show()

### Visualization on World Map

In [None]:
df_continents_cases['Latitude'] = [6.426117205286786, 44.94789322476297, -25.734968546496344, 44.94789322476297, 56.51520886670177, -31.065922730080157]
df_continents_cases['Longitude'] = [18.2766152761759, 95.7503726784575, 134.489562782425, 28.2490403487619, -92.32043635079269, -60.7921128171538]
df_continents_cases.head()

In [None]:
world_map = folium.Map(location=[10,0], tiles="cartodbpositron", zoom_start=2, max_zoom=6, min_zoom=2)
for i in range(0, len(df_continents_cases)):
    # .iloc[i] reads the i-th row by position; the index holds continent names, not integers
    folium.Circle(
        location=[df_continents_cases.iloc[i]['Latitude'], df_continents_cases.iloc[i]['Longitude']],
        tooltip = "<h5 style='text-align:center;font-weight: bold'>"+df_continents_cases.index[i]+"</h5>"+
                    "<hr style='margin:10px;'>"+
                    "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
        "<li>Active: "+str(df_continents_cases['Active'].iloc[i])+"</li>"+
        "<li>Confirmed: "+str(df_continents_cases['Confirmed'].iloc[i])+"</li>"+
        "<li>Deaths:   "+str(df_continents_cases['Deaths'].iloc[i])+"</li>"+
        "</ul>",
        radius=(int((np.log(df_continents_cases['Confirmed'].iloc[i]+1.00001)))+0.2)*50000,
        color='#ff6600',
        fill_color='#ff8533',
        fill=True).add_to(world_map)
world_map
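
The circle size on the map comes from the `radius` expression, which `folium.Circle` interprets in metres. The sketch below pulls that expression out into a function of its own (`circle_radius` is an illustrative name) and evaluates it for a few made-up counts, using `math.log` in place of `np.log`.

```python
import math


def circle_radius(confirmed):
    """Radius in metres: truncated natural log of the count, plus 0.2, times 50,000."""
    return (int(math.log(confirmed + 1.00001)) + 0.2) * 50000


for count in (0, 1_000, 500_000, 1_000_000):
    print(count, circle_radius(count))
```

Because of the `int()` truncation the radius grows in steps, so for example 500,000 and 1,000,000 confirmed cases are drawn with the same circle size; the exact numbers are in the tooltip.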

# Country-wise-Analysis

This section analyzes Country wise Total Confirmed cases, Deaths reported, Recoveries, Active cases and Mortality Rate as of **25th July, 2020**

In [None]:
df_country = df_covid19.drop(['Continent', 'Lat', 'Long_', 'Incident_Rate', 'Mortality_Rate'], axis=1)
df_country.index = df_country["Country"]
df_country.drop(['Country'], axis=1, inplace=True)
df_country.fillna(0,inplace=True)

In [None]:
df_country['Mortality_Rate'] = np.round((df_country["Deaths"]/df_country["Confirmed"])*100,2)
df_country

There are **188** countries that have reported Covid Cases

In [None]:
# function for plotting horizontal bar plot
def horizontal_barplot(x, y, title, xlabel, ylabel, color):
    fig = plt.figure(figsize = (10,5))
    fig.add_subplot(111)
    plt.axes(axisbelow = True)
    plt.barh(x.index[-10:], y.values[-10:], color = color)
    plt.tick_params(size = 5, labelsize = 13)
    plt.xlabel(xlabel, fontsize = 18)
    plt.ylabel(ylabel,fontsize = 18)
    plt.title(title,fontsize = 20)
    plt.grid(alpha = 0.3)
    plt.show()
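
The helper above expects values sorted in ascending order and keeps the last 10 entries. Here is a small sketch of that selection on a made-up Series, next to `nlargest` as an alternative; the labels and values are invented for illustration.

```python
import pandas as pd

counts = pd.Series({"A": 50, "B": 400, "C": 120, "D": 10})  # made-up values

# Approach used with the helper: sort ascending and keep the tail.
top_by_sort = counts.sort_values().iloc[-2:]
print(top_by_sort.index.tolist())

# Alternative: nlargest picks the same entries, largest first.
top_by_nlargest = counts.nlargest(2)
print(top_by_nlargest.index.tolist())
```

The ascending order suits `plt.barh`, which draws the first entry at the bottom, so the largest bar ends up at the top of the chart.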

### Top 10 Countries - Confirmed Cases

In [None]:
horizontal_barplot(x = df_country.sort_values('Confirmed')["Confirmed"], 
                   y = df_country.sort_values('Confirmed')["Confirmed"], 
                   title = "Top 10 Countries - Confirmed Cases", 
                   xlabel = "Confirmed Cases", 
                   ylabel = "Countries", 
                   color = "blue")

### Top 10 Countries - Active Cases

In [None]:
horizontal_barplot(x = df_country.sort_values('Active')["Active"], 
                   y = df_country.sort_values('Active')["Active"], 
                   title = "Top 10 Countries - Active Cases", 
                   xlabel = "Active Cases", 
                   ylabel = "Countries", 
                   color = "orange")

### Top 10 Countries - Recovered Cases

In [None]:
horizontal_barplot(x = df_country.sort_values('Recovered')["Recovered"], 
                   y = df_country.sort_values('Recovered')["Recovered"], 
                   title = "Top 10 Countries - Recovered Cases", 
                   xlabel = "Recovered", 
                   ylabel = "Countries", 
                   color = "limegreen")

### Top 10 Countries - Deaths

In [None]:
horizontal_barplot(x = df_country.sort_values('Deaths')["Deaths"], 
                   y = df_country.sort_values('Deaths')["Deaths"], 
                   title = "Top 10 Countries - Deaths", 
                   xlabel = "Deaths", 
                   ylabel = "Countries", 
                   color = "red")

### Visualization on World Map

In [None]:
df_countries = df_covid19.drop(['Continent', 'Incident_Rate', 'Mortality_Rate'], axis=1)
df_countries.index = df_countries["Country"]
df_countries.fillna(0,inplace=True)
df_countries.head()

In [None]:
world_map = folium.Map(location=[10,0], tiles="cartodbpositron", zoom_start=2, max_zoom=6, min_zoom=2)
for i in range(0, len(df_countries)):
    # .iloc[i] reads the i-th row by position; the index holds country names, not integers
    folium.Circle(
        location=[df_countries.iloc[i]['Lat'], df_countries.iloc[i]['Long_']],
        tooltip = "<h5 style='text-align:center;font-weight: bold'>"+df_countries.index[i]+"</h5>"+
                    "<hr style='margin:10px;'>"+
                    "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
        "<li>Active: "+str(df_countries['Active'].iloc[i])+"</li>"+
        "<li>Confirmed: "+str(df_countries['Confirmed'].iloc[i])+"</li>"+
        "<li>Deaths:   "+str(df_countries['Deaths'].iloc[i])+"</li>"+
        "</ul>",
        radius=(int((np.log(df_countries['Confirmed'].iloc[i]+1.00001)))+0.2)*50000,
        color='#ff6600',
        fill_color='#ff8533',
        fill=True).add_to(world_map)
world_map

### Number of Countries affected by COVID19 over time 

In [None]:
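# numeric_only=True keeps the text columns (State, Continent) out of the per-country sums
case_nums_country = df_confirmed.groupby("Country").sum(numeric_only=True).drop(['Lat','Long'],axis =1).apply(lambda x: x[x > 0].count(), axis =0)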
d = [datetime.strptime(date,'%m/%d/%y').strftime("%d %b") for date in case_nums_country.index]

f = plt.figure(figsize=(14,8))
f.add_subplot(111)
marker_style = dict(c="crimson",linewidth=6, linestyle='-', marker='o',markersize=6, markerfacecolor='#ffffff')
plt.plot(d, case_nums_country,**marker_style)
plt.tick_params(labelsize = 14)
# Only set tick positions; the labels come from the date strings already plotted at those positions
plt.xticks(list(np.arange(0,len(d),int(len(d)/5))))

plt.xlabel("Dates",fontsize=18)
plt.ylabel("Number of Countries",fontsize=18)
plt.grid(alpha = 0.3)
plt.show()
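
The count plotted above is, for each date, the number of countries whose cumulative total is greater than zero. A minimal sketch of that calculation on a made-up table with three hypothetical countries:

```python
import pandas as pd

# Made-up cumulative counts: rows are countries, columns are dates.
cumulative = pd.DataFrame(
    {"1/22/20": [1, 0, 0], "1/23/20": [3, 0, 2], "1/24/20": [7, 1, 2]},
    index=["A", "B", "C"],
)

# For each date, count the countries that have reported at least one case.
affected = (cumulative > 0).sum(axis=0)
print(affected.tolist())
```

Because the totals are cumulative, a country that has reported at least one case keeps counting as affected on every later date.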

From the above plot it is evident that almost all countries had been affected by Covid19 by 21st April

### COVID19 Spread Analysis

The following graphs show the spread of Covid19 across the Top 20 Countries (by Confirmed Cases). They show how the number of confirmed cases and deaths have increased over time.

In [None]:
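# numeric_only=True keeps the text columns (State, Continent) out of the per-country sums
df_countries_cases = df_confirmed.groupby(["Country"]).sum(numeric_only=True)
df_countries_deaths = df_deaths.groupby(["Country"]).sum(numeric_only=True)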

df_countries_cases.drop(['Lat', 'Long'], inplace=True, axis=1)
df_countries_deaths.drop(['Lat', 'Long'], inplace=True, axis=1)

df_countries_cases = df_countries_cases.sort_values(df_confirmed.columns[-1],ascending = False)[:20]

dates = df_countries_cases.columns
d = [datetime.strptime(date,'%m/%d/%y').strftime("%d %b") for date in dates]

In [None]:
marker_style_confirmed = dict(c="darkcyan", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
marker_style_death = dict(c="crimson", linewidth=6, linestyle='-', marker='o', markersize=6, markerfacecolor='#ffffff')
fig, axes = plt.subplots(nrows=10, ncols=2, figsize=(15,15))
plt.subplots_adjust(top = 4.0)
i=0
for rows in axes:
    for ax1 in rows:
        ax1.plot(d, df_countries_cases.iloc[i], label = 'Confirmed', **marker_style_confirmed)
        ax1.plot(d, df_countries_deaths[df_countries_deaths.index == df_countries_cases.index[i]].values[0], label = 'Deaths', **marker_style_death)
        ax1.fill_between(d, df_countries_cases.iloc[i], color='darkcyan', alpha=0.3)
        ax1.fill_between(d, df_countries_deaths[df_countries_deaths.index == df_countries_cases.index[i]].values[0], color='crimson', alpha=0.3)
        ax1.set_xlabel("Dates", fontsize = 12)
        ax1.set_ylabel("No. of Cases",fontsize = 12)
        ax1.set_title("COVID Cases: "+df_countries_cases.index[i], fontsize = 15)
        ax1.legend(loc= "best", fontsize = 12)
        ax1.set_xticks(list(np.arange(0,len(d),int(len(d)/5))))
        ax1.grid(which='major', linewidth = 0.3)
        i+=1

### Daily Confirmed cases - Top 10 Countries

In [None]:
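# Drop Lat/Long before diff so the first date is not differenced against a coordinate sum
temp = df_confirmed.groupby('Country').sum(numeric_only=True).drop(['Lat', 'Long'], axis=1).diff(axis=1).sort_values(df_confirmed.columns[-1],ascending=False).head(10).replace(np.nan,0)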
f = plt.figure(figsize=(20,12))
ax = f.add_subplot(111)
for i,country in enumerate(temp.index):
    t = temp.loc[temp.index ==country].values[0]
    t = t[t>=0]
    date = np.arange(0,len(t[:]))
    plt.plot(date,t/1000,'-o',label = country,linewidth =2, markevery=[-1])

# Grid Settings
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Confirmed Cases in Top 10 Countries",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily Confirmed Cases (Thousand)",fontsize =18)

# Legend
plt.legend(fontsize=18)

plt.show()

### Daily deaths - Top 10 Countries

In [None]:
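# Drop Lat/Long before diff so the first date is not differenced against a coordinate sum
temp = df_deaths.groupby('Country').sum(numeric_only=True).drop(['Lat', 'Long'], axis=1).diff(axis=1).sort_values(df_deaths.columns[-1],ascending=False).head(10).replace(np.nan,0)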
f = plt.figure(figsize=(20,12))
ax = f.add_subplot(111)
for i,country in enumerate(temp.index):
    t = temp.loc[temp.index ==country].values[0]
    t = t[t>=0]
    date = np.arange(0,len(t[:]))
    plt.plot(date,t/1000,'-o',label = country,linewidth =2, markevery=[-1])

# Grid
plt.grid(lw = 1, ls = '-', c = "0.85", which = 'major')
plt.grid(lw = 1, ls = '-', c = "0.95", which = 'minor')

#Title
plt.title("COVID-19 Daily Deaths in Top 10 Countries",{'fontsize':24})

# Axis Label
plt.xlabel("Days",fontsize =18)
plt.ylabel("Number of Daily Death Cases (Thousand)",fontsize =18)

# Legend
plt.legend(fontsize=18)

plt.show()

# Feedback

- This was my small effort in applying my data analytics skills to analyze the spread of COVID19
- Special mention to the Notebook "COVID-19 Case Study - Analysis, Viz & Comparisons" by Tarun Kumar
- Your FEEDBACK is much APPRECIATED
- Please UPVOTE if you LIKE my EFFORT