<a href="https://colab.research.google.com/github/YashJain24-chief/COVID-19-Visualization-using-Python/blob/main/DV_project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tracking COVID-19 with Data Visualisation Techniques

#### **SoE, Presidency University, Bangalore**

**Team members**
* Syed Farzan(20181COM0156)
* Yash Jain(20181COM0185)
* Sonu Kumar B(20181COM0188)
* Sinchan S Maitri(20181COM0189)

**Section**: 5COM3


## Overview

In a tweet on 11 March 2020, WHO declared COVID-19 (or coronavirus) a pandemic. A pandemic is an outbreak of a disease that has spread across many countries around the world.

<img src='https://student-datasets-bucket.s3.ap-south-1.amazonaws.com/images/who-coronavirus-pandemic.png' width=600>

Here's a link to the tweet: https://twitter.com/WHO/status/1237777021742338049

Coronavirus has claimed the lives of more than **185,000** people globally so far, and the number is still rising. You can look at the live dashboard to see real-time updates.

[COVID-19 Live Dashboard](https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6)

In line with this pandemic and as a coronavirus awareness initiative, we are going to look at the following to understand how quickly the disease spreads:

- How many people are affected by coronavirus every day

- How the number of people affected is distributed across the globe

---

## I. Importing modules and setting up the environment

Here's the link to the data source: a GitHub repository maintained by the CSSE department at Johns Hopkins University, containing a regularly updated and accurate dataset of coronavirus cases across the globe.

[COVID-19 Data Source](https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series)


These are the modules imported:
* numpy for linear algebra
* pandas for accessing and manipulating the dataset
* matplotlib, seaborn for data visualization
* folium for cartograms/maps
* pycountry to get ISO codes for all countries
* plotly to plot an interactive world map

In [None]:
!pip install pycountry



In [None]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns

import folium

import pycountry
import plotly.express as px
%matplotlib inline

import ipywidgets as widgets
from ipywidgets import interact, interact_manual

---

## II. Creating and exploring the DataFrame

Let's create three DataFrames, one each for the confirmed cases, deaths and recoveries reported across the globe.


In [None]:
conf_df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/a54f960a7d5440ded0277afcd9ab5fe0c329dca2/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
death_df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/a54f960a7d5440ded0277afcd9ab5fe0c329dca2/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
recovered_df = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/a54f960a7d5440ded0277afcd9ab5fe0c329dca2/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv')

conf_df.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,2/22/20,2/23/20,2/24/20,2/25/20,2/26/20,...,10/17/20,10/18/20,10/19/20,10/20/20,10/21/20,10/22/20,10/23/20,10/24/20,10/25/20,10/26/20,10/27/20,10/28/20,10/29/20,10/30/20,10/31/20,11/1/20,11/2/20,11/3/20,11/4/20,11/5/20,11/6/20,11/7/20,11/8/20,11/9/20,11/10/20,11/11/20,11/12/20,11/13/20,11/14/20,11/15/20,11/16/20,11/17/20,11/18/20,11/19/20,11/20/20,11/21/20,11/22/20,11/23/20,11/24/20,11/25/20
0,,Afghanistan,33.93911,67.709953,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,...,40141,40200,40287,40357,40510,40626,40687,40768,40833,40937,41032,41145,41268,41334,41425,41501,41633,41728,41814,41935,41975,42033,42092,42297,42463,42609,42795,42969,43035,43240,43403,43628,43851,44228,44443,44503,44706,44988,45280,45490
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,16774,17055,17350,17651,17948,18250,18556,18858,19157,19445,19729,20040,20315,20634,20875,21202,21523,21904,22300,22721,23210,23705,24206,24731,25294,25801,26211,26701,27233,27830,28432,29126,29837,30623,31459,32196,32761,33556,34300,34944
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,...,54203,54402,54616,54829,55081,55357,55630,55880,56143,56419,56706,57026,57332,57651,57942,58272,58574,58979,59527,60169,60800,61381,62051,62693,63446,64257,65108,65975,66819,67679,68589,69591,70629,71652,72755,73774,74862,75867,77000,78025
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,3377,3377,3623,3623,3811,3811,4038,4038,4038,4325,4410,4517,4567,4665,4756,4825,4888,4910,5045,5135,5135,5319,5383,5437,5477,5567,5616,5725,5725,5872,5914,5951,6018,6066,6142,6207,6256,6304,6351,6428
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,7462,7622,7829,8049,8338,8582,8829,9026,9381,9644,9871,10074,10269,10558,10805,11035,11228,11577,11813,12102,12223,12335,12433,12680,12816,12953,13053,13228,13374,13451,13615,13818,13922,14134,14267,14413,14493,14634,14742,14821


In [None]:
conf_df.shape

(271, 313)

-------

Now, let's create a series of the total confirmed cases of coronavirus reported

- Across the globe

- In China

- Outside China

In [None]:
# Total confirmed cases reported across the globe.
total_cases = conf_df.iloc[:, 4:].sum(axis=0)
total_cases[:10]

1/22/20     555
1/23/20     654
1/24/20     941
1/25/20    1434
1/26/20    2118
1/27/20    2927
1/28/20    5578
1/29/20    6167
1/30/20    8235
1/31/20    9927
dtype: int64

In [None]:
# Total confirmed cases in China.
china_cases = conf_df[conf_df['Country/Region'] == 'China'].iloc[:, 4:].sum(axis=0)
china_cases[:10]

1/22/20     548
1/23/20     643
1/24/20     920
1/25/20    1406
1/26/20    2075
1/27/20    2877
1/28/20    5509
1/29/20    6087
1/30/20    8141
1/31/20    9802
dtype: int64

In [None]:
# Total confirmed cases outside China.
non_china_cases = conf_df[conf_df['Country/Region'] != 'China'].iloc[:, 4:].sum(axis=0)
non_china_cases[:12]

1/22/20      7
1/23/20     11
1/24/20     21
1/25/20     28
1/26/20     43
1/27/20     50
1/28/20     69
1/29/20     80
1/30/20     94
1/31/20    125
2/1/20     147
2/2/20     157
dtype: int64
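The series above are cumulative totals, while the overview asks how many people are affected each day. A minimal sketch of how daily new cases could be derived from a cumulative series with `Series.diff()` follows. The three values are copied from the `total_cases` output above, and the names `cumulative` and `daily_new` are illustrative only.

```python
import pandas as pd

# First three cumulative worldwide totals, copied from the `total_cases` output above.
cumulative = pd.Series([555, 654, 941], index=['1/22/20', '1/23/20', '1/24/20'])

# Day-over-day difference. The first day has no predecessor, so it is dropped.
daily_new = cumulative.diff().dropna().astype(int)
print(daily_new.tolist())
```

The same call on the full `total_cases` series would give one value per day from the second date onwards.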

#####  Getting Dates and formatting them

Get the dates from the `conf_df` DataFrame. Since the dataset is a time series, the data needs to be visualised with respect to time.

In [None]:
# Getting the dates from the 'conf_df' DataFrame.
dates = pd.Series(conf_df.columns[4:].values)
dates[:10]

0    1/22/20
1    1/23/20
2    1/24/20
3    1/25/20
4    1/26/20
5    1/27/20
6    1/28/20
7    1/29/20
8    1/30/20
9    1/31/20
dtype: object

In [None]:
# Converting the dates from text to the 'datetime' values.
dates = pd.to_datetime(dates)
dates[:10]

0   2020-01-22
1   2020-01-23
2   2020-01-24
3   2020-01-25
4   2020-01-26
5   2020-01-27
6   2020-01-28
7   2020-01-29
8   2020-01-30
9   2020-01-31
dtype: datetime64[ns]

In [None]:
# User-defined function to format dates in the 'Mon DD' format (e.g. 'Jan 22').
def date_conversion(dates):
  dates_conv = []
  for date in dates:
    dates_conv.append(date.strftime('%b %d')) # use '%b %d, %Y' to include the year
  return dates_conv

dates_conv = date_conversion(dates)
dates_conv[:10]

['Jan 22',
 'Jan 23',
 'Jan 24',
 'Jan 25',
 'Jan 26',
 'Jan 27',
 'Jan 28',
 'Jan 29',
 'Jan 30',
 'Jan 31']
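The parsing and formatting steps above can also be written without an explicit loop. The sketch below assumes the column headers always follow the `month/day/two-digit-year` pattern seen in this dataset. The names `raw`, `parsed` and `labels` are illustrative.

```python
import pandas as pd

# A few headers in the same style as the dataset's date columns.
raw = pd.Series(['1/22/20', '1/23/20', '2/1/20'])

# An explicit format avoids any day/month ambiguity during parsing.
parsed = pd.to_datetime(raw, format='%m/%d/%y')

# Vectorised equivalent of the date_conversion loop.
labels = parsed.dt.strftime('%b %d').tolist()
print(labels)
```

Month abbreviations produced by `%b` depend on the active locale, so the labels are in English only under an English or default locale.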

---

## III. Visualization with line plots

Now, we will create line plots for the total number of confirmed cases reported 

- across the world

- in China

- outside China


In [None]:
current_df = conf_df

In [None]:
print(plt.style.available)
style = plt.style.available

['Solarize_Light2', '_classic_test_patch', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'tableau-colorblind10']
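The list printed above reflects the Matplotlib version used for this notebook. To the best of our knowledge, Matplotlib 3.6 and later rename the seaborn styles with a `seaborn-v0_8-` prefix, so a hard-coded `'seaborn-dark'` may not be found there. The helper below is a hypothetical sketch (`pick_style` is not part of Matplotlib) that picks whichever name is installed.

```python
def pick_style(available, preferred=('seaborn-dark', 'seaborn-v0_8-dark')):
    """Return the first preferred style name found in `available`, else 'default'."""
    for name in preferred:
        if name in available:
            return name
    return 'default'

# Plain lists stand in for plt.style.available so the sketch runs without Matplotlib.
old_styles = ['classic', 'seaborn-dark']
new_styles = ['classic', 'seaborn-v0_8-dark']
print(pick_style(old_styles))
print(pick_style(new_styles))
print(pick_style(['classic']))
```

In the notebook this could be used as `plt.style.use(pick_style(plt.style.available))`.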


In [None]:
colors = [
    '#08F7FE',  # teal/cyan
    '#FE53BB',  # pink
    '#F5D300',  # yellow
    '#00ff41', # matrix green
    '#F9423C',
]

n_lines = 10
diff_linewidth = 1.05
alpha_value = 0.8

def styling():
  plt.style.use("seaborn-dark")
  for param in ['figure.facecolor', 'axes.facecolor', 'savefig.facecolor']:
      plt.rcParams[param] = '#212946'  # bluish dark grey
  for param in ['text.color', 'axes.labelcolor', 'xtick.color', 'ytick.color']:
      plt.rcParams[param] = '0.8'  # very light grey
  plt.rcParams["axes.titlesize"] = '28'
  plt.rcParams["xtick.labelsize"] = '12'
  plt.rcParams["ytick.labelsize"] = '16'
  plt.rcParams["axes.titleweight"] = '800'
  plt.rcParams["axes.titlecolor"] = '1'
  plt.grid(color='#2A4968')
  # bluish dark grey, but slightly lighter than background

In [None]:
# Line plot of the total number of coronavirus cases (confirmed, recovered or deaths) reported across the world over the last `Days` days.

@interact
def change_date(Days=(0,92,1), Column=['Confirmed', 'Recovered', 'Death']):
    plt.figure(figsize=(25, 8))
    styling()
    # Compare strings with '==': 'is' checks object identity, not equality.
    if Column == 'Recovered':
        current = recovered_df
    elif Column == 'Death':
        current = death_df
    else:
        current = conf_df
        
    # Sum every region for each date column.
    total_cases = current.iloc[:, 4:].sum(axis=0)
    plt.plot(dates_conv[len(dates_conv)-Days:], total_cases.values[len(dates_conv)-Days:], 'o-',color=colors[0],linewidth=2+(diff_linewidth),
            alpha=alpha_value,label=Column + " Cases")
    plt.xticks(rotation=90)
    plt.title('\nTotal Coronavirus Cases Reported Across the World\n')
    plt.legend(loc = "best", fontsize = 25,markerfirst =True)
    plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
    plt.show()


########   Need Active Output session for the interactive widgets to work. Please run all the cells to see the widgets in action   #########

interactive(children=(IntSlider(value=46, description='Days', max=92), Dropdown(description='Column', options=…

In [None]:
# Line plot for the total number of coronavirus confirmed cases reported in China 
@interact
def change_date(days=(0,92,1)):
    plt.figure(figsize=(25, 7))
    styling()
    plt.plot(dates_conv[len(dates_conv)-days:], china_cases.values[len(dates_conv)-days:], '-o',label="Confirmed cases in China ",color=colors[1],linewidth=2)
    plt.xticks(rotation=90)
    plt.legend(loc = "best", fontsize = 20,markerfirst =True)
    plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
    plt.title('\nTotal Coronavirus Cases Reported in China\n')
    plt.show()

interactive(children=(IntSlider(value=46, description='days', max=92), Output()), _dom_classes=('widget-intera…

In [None]:
# Line plot for the total number of coronavirus confirmed cases reported outside China over the last `days` days.
@interact
def change_date(days=(0,92,1)):

  plt.figure(figsize=(25, 7))
  styling()
  plt.plot(dates_conv[len(dates_conv)-days:], non_china_cases.values[len(dates_conv)-days:], 'o-',label="Confirmed cases outside China ",color=colors[2],linewidth=2)
  plt.xticks(rotation=90)
  plt.title('\nTotal Coronavirus Cases Reported Outside China\n')
  plt.legend(loc = "best", fontsize = 20,markerfirst =True)
  plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
  plt.show()

interactive(children=(IntSlider(value=46, description='days', max=92), Output()), _dom_classes=('widget-intera…

Here the yellow line represents the confirmed cases outside China, which keep increasing day by day.
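All the plots in this section select the most recent days with `len(dates_conv)-days` rather than the shorter `-days`. A small sketch of why this matters when the slider is at 0 follows. The helper `last_n` and the four sample labels are illustrative only.

```python
labels = ['Jan 22', 'Jan 23', 'Jan 24', 'Jan 25']

def last_n(items, n):
    """Return the last n items, and an empty list when n is 0."""
    return items[len(items) - n:]

print(last_n(labels, 2))   # the two most recent labels
print(last_n(labels, 0))   # empty list: nothing to plot
print(labels[-0:])         # the whole list, because -0 equals 0
```

With `-days`, a slider value of 0 would plot the full history instead of nothing.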


In [None]:
# Line plot to compare the total number of coronavirus confirmed cases reported across the world, and in China 
@interact
def change_date(days=(0,92,1)):

  plt.figure(figsize=(25,8))
  styling()
  
  plt.plot(dates_conv[len(dates_conv)-days:], total_cases.values[len(dates_conv)-days:], 'd-', label='Across Globe', color = colors[3])
  plt.plot(dates_conv[len(dates_conv)-days:], china_cases.values[len(dates_conv)-days:], 'o-', label='In China', color = colors[4])
  
  plt.xticks(rotation=90)
  plt.title('\nCoronavirus Cases Reported: World vs. China\n')
  plt.legend(loc = "best", fontsize = 20,markerfirst =True)
  plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
  plt.show()

interactive(children=(IntSlider(value=46, description='days', max=92), Output()), _dom_classes=('widget-intera…

- The green curve represents the total number of coronavirus cases reported in the world.

- The red curve represents the total number of coronavirus cases reported in China.



Let's have a look at the top 5 countries with the highest total number of confirmed cases.

In [None]:
# Top 5 countries with the highest total number of confirmed cases.
@interact
def change_date(Days=(0,92,1)):
    # Keep only the country name and the date columns, then sum the regions of each country.
    grouped_conf_df = conf_df.drop(columns=['Province/State', 'Lat', 'Long']).groupby(by='Country/Region').sum()
    # Sort by the latest date so the most affected countries come first.
    high_to_low_conf_df = grouped_conf_df.sort_values(by=conf_df.columns[-1], ascending=False)
    highest_conf_regions = high_to_low_conf_df.index[:5] # names of the top 5 countries

    plt.figure(figsize=(25, 9))
    styling()
    
    for i, region in enumerate(highest_conf_regions):
        # Daily cumulative totals for this country (one value per date column).
        conf_series = high_to_low_conf_df.loc[region].values
        plt.plot(dates_conv[len(dates_conv)-Days:], conf_series[len(dates_conv)-Days:], 'o-', label=region,color = colors[i])
        
    plt.xticks(rotation=90)
    plt.title('\nTop 5 Countries With Highest Confirmed Cases\n')
    plt.legend(loc = "best", fontsize = 20,markerfirst =True)
    plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
    plt.grid()
    plt.show()

interactive(children=(IntSlider(value=46, description='Days', max=92), Output()), _dom_classes=('widget-intera…
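The ranking step in the cell above (group by country, sum, sort, take the first five) can be sketched on a toy table with `Series.nlargest`. The country names and counts below are made up purely for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical data: country 'A' is split into two provinces, like China in the real dataset.
toy = pd.DataFrame({
    'Country/Region': ['A', 'A', 'B', 'C'],
    'Province/State': ['A1', 'A2', None, None],
    '11/25/20': [10, 5, 12, 7],
})

# Sum the provinces of each country on the latest date, then keep the two largest totals.
top = toy.groupby('Country/Region')['11/25/20'].sum().nlargest(2)
print(top)
```

Selecting the single date column before summing keeps text columns such as `Province/State` out of the aggregation.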

Let's have a look at the trend of confirmed cases for any country selected from a dropdown.

In [None]:
# Line plot of the total number of confirmed coronavirus cases for a country chosen from a dropdown.
value = conf_df["Country/Region"].unique().tolist()

@interact
def change_date(Days=(0,92,1), Country = value):

    plt.figure(figsize=(25, 8))
    styling()

    # Date columns for every region of the selected country.
    country_df = conf_df.loc[conf_df['Country/Region'] == Country, conf_df.columns[4:]]
        
    # Sum the confirmed cases over all regions of that country for each day.
    total_cases = country_df.sum(axis=0)

    plt.plot(dates_conv[len(dates_conv)-Days:], total_cases.values[len(dates_conv)-Days:], 'o-', label=Country,color="red",linewidth=2)

    plt.xticks(rotation=90)
    plt.title('\nTotal Coronavirus Cases Reported in ' + str(Country) + ' (Select from dropdown above)\n')
    plt.legend(loc = "best", fontsize = 25,markerfirst =True)
    plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
    plt.show()

interactive(children=(IntSlider(value=46, description='Days', max=92), Dropdown(description='Country', options…

In [None]:
# Grouped bar chart of confirmed, recovered and fatal cases for a country chosen from a dropdown.
value = conf_df["Country/Region"].unique().tolist()

@interact
def change_date(Days=(0,92,1), Country = value):
    plt.figure(figsize=(30, 8))
    width=0.25
    x_indexes = np.arange(Days)
    styling()

    # Date columns for every region of the selected country, one frame per case type.
    current_df = conf_df.loc[conf_df['Country/Region'] == Country, conf_df.columns[4:]]
    death_data =  death_df.loc[death_df['Country/Region'] == Country, death_df.columns[4:]]
    recovered_data = recovered_df.loc[recovered_df['Country/Region'] == Country, recovered_df.columns[4:]]
    
    # Daily totals over all regions of the country.
    total_Current_cases = current_df.sum(axis=0).values
    total_death_cases = death_data.sum(axis=0).values
    total_recovery_cases = recovered_data.sum(axis=0).values
    
    start = len(dates_conv) - Days  # index of the first of the last `Days` days
    plt.bar(x_indexes-width, total_death_cases[start:],color="red",width=width,label ="Death Cases - "+ str(Country))
    plt.bar(x_indexes, total_Current_cases[start:],color="lightblue",width=width,label ="Confirmed Cases - "+ str(Country))
    plt.bar(x_indexes+width, total_recovery_cases[start:],color="green",width=width,label ="Recovered Cases - "+ str(Country))

    # Slice the labels the same way as the data so both have `Days` entries, even when Days is 0.
    plt.xticks(ticks = x_indexes, labels = dates_conv[start:], rotation = 90)
    plt.title('Total Confirmed, Recovered and Fatal Cases Reported in ' + str(Country) + ' (Select from dropdown above)\n')
    plt.legend(loc = "best", fontsize = 14)
    plt.gcf().axes[0].yaxis.get_major_formatter().set_scientific(False)
    plt.show()


interactive(children=(IntSlider(value=46, description='Days', max=92), Dropdown(description='Country', options…

---

## IV. Creating cartograms/maps for visualization

In [None]:
conf_china_df = conf_df[conf_df['Country/Region'] == 'China']
print(conf_china_df.head())

last_col_header = conf_df.columns[-1]

   Province/State Country/Region      Lat  ...  11/23/20  11/24/20  11/25/20
58          Anhui          China  31.8257  ...       992       992       992
59        Beijing          China  40.1824  ...       950       950       950
60      Chongqing          China  30.0572  ...       590       590       590
61         Fujian          China  26.0789  ...       478       479       480
62          Gansu          China  35.7518  ...       181       181       181

[5 rows x 313 columns]


Let's create a cartogram to show the distribution of confirmed coronavirus cases in China and mark the affected regions of China with location markers.

The markers will display the name of each region along with the number of confirmed coronavirus cases in that region.

In [None]:

# Map to show the distribution of confirmed coronavirus cases in China (regular markers).
china_map = folium.Map(location=[30.9756, 112.2707], width='100%', height='100%', tiles='Stamen Terrain', zoom_start=4.25, max_bounds=True, min_zoom=3, max_zoom=6)
for i in conf_china_df.index:
  folium.Marker(location=[conf_china_df.loc[i, 'Lat'], conf_china_df.loc[i, 'Long']],
                popup= conf_china_df.loc[i, 'Province/State'] + "\nConfirmed: " + str(conf_china_df.loc[i, last_col_header])).add_to(china_map)
china_map

Let's create a cartogram to show the distribution of confirmed coronavirus cases in China and mark the affected regions of China with **circular** location markers.

The markers will display the name of each region along with the number of confirmed cases and deaths in that region.

In [None]:
# Map to show the distribution of confirmed coronavirus cases in China (circular markers).
china_map = folium.Map(location=[30.9756, 112.2707], width='100%', height='90%', tiles='CartoDB positron', zoom_start=4.5, max_bounds=True, min_zoom=4, max_zoom = 10)
for i in conf_china_df.index:
  test = conf_df.loc[i, 'Province/State'] + ' - Confirmed: ' + str(conf_df.loc[i, last_col_header]) + " | Deaths: " + str(death_df.loc[i, last_col_header])
  popup = folium.Popup(test, parse_html=False, max_width=300)
  folium.Circle(radius=int(conf_china_df.loc[i, last_col_header]) * 2,
                location=[conf_china_df.loc[i, 'Lat'], conf_china_df.loc[i, 'Long']],
                popup= popup,
                tooltip='<strong>Click Here</strong>',
                color='crimson', fill=True, fill_color='crimson').add_to(china_map)
china_map

We know for sure that the Hubei region was affected the most in China, with almost 50 times as many cases as the second most affected region. Hence, the circles for the other regions are very small compared to the circle for the Hubei region.

Let's increase the scale of the radius of the circles to see the variation of people affected in other regions. Also, we will not create a circle for the Hubei region on the map. 

In [None]:
# Map to show the distribution of confirmed coronavirus cases in China excluding Hubei, the most affected region (circular markers).
china_map = folium.Map(location=[30.9756, 112.2707], width='100%', height='100%', tiles='Stamen Toner', zoom_start=4.5, max_bounds=True, min_zoom=4, max_zoom = 10)
for i in conf_china_df.sort_values(by=last_col_header, ascending=False).index[1:]:
  test = conf_df.loc[i, 'Province/State'] + ' - Confirmed: ' + str(conf_df.loc[i, last_col_header]) + " | Deaths: " + str(death_df.loc[i, last_col_header])
  popup = folium.Popup(test, parse_html=False, max_width=300)
  folium.Circle(radius=int(conf_china_df.loc[i, last_col_header]) * 50,
                location=[conf_china_df.loc[i, 'Lat'], conf_china_df.loc[i, 'Long']],
                popup= popup,
                tooltip='<strong>Click Here</strong>',
                color='crimson', fill=True, fill_color='crimson').add_to(china_map)
china_map
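An alternative to dropping the largest region is to scale the radius with the square root of the case count, so that the circle's area (rather than its radius) grows in proportion to the cases. This is only a sketch of that idea and not what the maps above do. The function name `circle_radius` and the factor of 1000 metres are assumptions chosen for illustration.

```python
import math

def circle_radius(cases, metres_per_sqrt_case=1000):
    """Radius in metres such that the circle area grows linearly with the case count."""
    return math.sqrt(max(cases, 0)) * metres_per_sqrt_case

small = circle_radius(100)      # hypothetical region with 100 cases
large = circle_radius(10000)    # hypothetical region with 100 times as many cases
print(small, large, large / small)
```

With this scaling, a region with 100 times the cases gets a circle only 10 times as wide, so small and large regions can share one map. The result could be passed as `radius=` to `folium.Circle`.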

##### - Cartogram/map for the World
Let's create a cartogram to show the distribution of confirmed coronavirus cases across the world and mark the affected regions in the world with **circular** location markers.

The markers will display the name of each country along with the number of confirmed, recovered and fatal coronavirus cases reported there.

In [None]:
print(recovered_df.shape)
print(conf_df.shape)
print(death_df.shape)

print(conf_df.index[conf_df["Lat"].isna()])
conf_df.drop(labels = 52, inplace = True)

(256, 313)
(271, 313)
(271, 313)
Int64Index([52], dtype='int64')
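The shapes printed above show that `recovered_df` has fewer rows (256) than `conf_df` and `death_df` (271), so the same row number does not always refer to the same place in all three frames. A minimal sketch of aligning such tables on their province and country columns with `DataFrame.merge` follows. The toy rows and counts are invented, and empty strings stand in for missing provinces (assuming `fillna('')` has been applied first).

```python
import pandas as pd

keys = ['Province/State', 'Country/Region']

# Hypothetical confirmed counts: one country is reported per province.
conf = pd.DataFrame({'Province/State': ['', 'Alberta', 'Ontario', ''],
                     'Country/Region': ['Brazil', 'Canada', 'Canada', 'Chile'],
                     'Confirmed': [100, 20, 30, 40]})

# Hypothetical recovered counts: the same country is reported as a single row.
rec = pd.DataFrame({'Province/State': ['', '', ''],
                    'Country/Region': ['Brazil', 'Canada', 'Chile'],
                    'Recovered': [90, 35, 25]})

# A left merge keeps every confirmed row and leaves NaN where no recovered row matches.
merged = conf.merge(rec, on=keys, how='left')
print(merged)
```

Rows without a match (the two provinces here) show `NaN` instead of silently picking up another place's number.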


In [None]:
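# Map to show the distribution of confirmed coronavirus cases across the world (circular markers).

world_map = folium.Map(location=[0, 0], width='100%', height='100%', tiles='Stamen Terrain', zoom_start=3.25, max_bounds=True, min_zoom=3, max_zoom = 6)
for i in conf_df.index:
    province, country = conf_df.loc[i, 'Province/State'], conf_df.loc[i, 'Country/Region']
    # recovered_df is not row-aligned with conf_df (256 vs 271 rows), so match on country and province instead of row position.
    same_place = (recovered_df['Country/Region'] == country) & (recovered_df['Province/State'].fillna('') == ('' if pd.isna(province) else province))
    recovered = recovered_df.loc[same_place, last_col_header]
    recovered_text = str(recovered.iloc[0]) if len(recovered) else 'n/a'  # 'n/a' when no matching recovered row exists
    test = country + ' - Confirmed: ' + str(conf_df.loc[i, last_col_header]) + " | Recovered: " + recovered_text + " | Deaths: " + str(death_df.loc[i, last_col_header])
    popup = folium.Popup(test, parse_html=False, max_width=300)
    folium.Circle(location=[conf_df.loc[i, 'Lat'], conf_df.loc[i, 'Long']], 
                  radius=int(conf_df.loc[i, last_col_header]) * 0.1, 
                  popup=popup,
                  tooltip='<strong>Click Here</strong>',
                  color='crimson', fill=True, fill_color='crimson').add_to(world_map)
world_map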

##### - Using Plotly to view confirmed cases around the globe

In [None]:
import pycountry
import plotly.express as px
import pandas as pd

URL_DATASET = r'https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv'
df1 = pd.read_csv(URL_DATASET)

list_countries = df1['Country'].unique().tolist()
#print(list_countries)
d_country_code = {}  # To hold the country names and their ISO
for country in list_countries:
    try:
        country_data = pycountry.countries.search_fuzzy(country)
        # country_data is a list of objects of class pycountry.db.Country
        # object of class Country have an alpha_3 attribute
        country_code = country_data[0].alpha_3
        d_country_code.update({country: country_code})
    except LookupError:  # raised by search_fuzzy when no country matches
        print('could not add ISO 3 code for ->', country)
        # If could not find country, make ISO code ' '
        d_country_code.update({country: ' '})

#print(d_country_code)

# create a new column iso_alpha in the df
# and fill it with the matching ISO 3 code for each country
df1['iso_alpha'] = df1['Country'].map(d_country_code)

# print(df1.head)  To confirm that ISO codes added

@interact
def show_map(cases = ["Confirmed", "Recovered", "Deaths"]):
    fig = px.choropleth(data_frame = df1, 
                        locations= "iso_alpha",
                        color = cases,  
                        hover_name= "Country",
                        color_continuous_scale= 'Fall',
                        animation_frame= "Date")

    fig.show()

could not add ISO 3 code for -> Burma
could not add ISO 3 code for -> Congo (Brazzaville)
could not add ISO 3 code for -> Congo (Kinshasa)
could not add ISO 3 code for -> Diamond Princess
could not add ISO 3 code for -> Korea, South
could not add ISO 3 code for -> Laos
could not add ISO 3 code for -> MS Zaandam
could not add ISO 3 code for -> Taiwan*
could not add ISO 3 code for -> West Bank and Gaza


interactive(children=(Dropdown(description='cases', options=('Confirmed', 'Recovered', 'Deaths'), value='Confi…
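The names listed above get a blank ISO code and therefore do not appear on the choropleth. One possible remedy, sketched below, is a small table of manual overrides that is consulted before the fuzzy search. The helper `resolve_iso3`, the table `ISO3_OVERRIDES` and the stub lookup are hypothetical names introduced here. The codes are the standard ISO 3166-1 alpha-3 codes for those places, and the two cruise ships are left out because they have no country code.

```python
# Manual alpha-3 codes for dataset names that the fuzzy search could not resolve.
ISO3_OVERRIDES = {
    'Burma': 'MMR',
    'Congo (Brazzaville)': 'COG',
    'Congo (Kinshasa)': 'COD',
    'Korea, South': 'KOR',
    'Laos': 'LAO',
    'Taiwan*': 'TWN',
    'West Bank and Gaza': 'PSE',
}

def resolve_iso3(name, fuzzy_lookup):
    """Return a code from the override table, else from fuzzy_lookup, else None."""
    if name in ISO3_OVERRIDES:
        return ISO3_OVERRIDES[name]
    try:
        return fuzzy_lookup(name)
    except LookupError:
        return None

def stub_lookup(name):
    # Stand-in for pycountry so the sketch runs without the package installed.
    known = {'India': 'IND'}
    if name not in known:
        raise LookupError(name)
    return known[name]

print(resolve_iso3('Laos', stub_lookup))
print(resolve_iso3('India', stub_lookup))
print(resolve_iso3('MS Zaandam', stub_lookup))
```

In the notebook, `fuzzy_lookup` could be `lambda n: pycountry.countries.search_fuzzy(n)[0].alpha_3`, mirroring the call already used in the cell above.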

### **Conclusion**
The COVID-19 pandemic has been one of the most significant events of the 21st century. But technology, especially data science, has been one of the strongest defenses against this challenge. It has made tracking the spread and exchanging the information gathered far more efficient. <br><br>
One of the ways to track and learn about the spread is through **data visualisation**. In this notebook, line plots, cartograms and choropleths strip away distractions to highlight the relevant data. <br><br>
Summing up the observations from these plots, we can say that although the number of cases is still increasing, the steep increase has gradually flattened out due to the extensive measures taken by nations across the world, such as:
* Lockdowns
* Hygiene and security measures
* Safety and treatment protocols
* Large scale testing and more

**Technology has been at the forefront of all defense mechanisms, and data science has been among its primary aspects.** [COVID-19 India](https://covid19india.org) is a good example of this.

### <br>**References**
*   COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University - https://github.com/CSSEGISandData/COVID-19
*   COVID-19 dataset - https://github.com/datasets/covid-19
*   Matplotlib 3.3.3 Documentation - https://matplotlib.org/users/pyplot_tutorial.html
*   Folium 0.11.0 Documentation - https://python-visualization.github.io/folium/
*   Plotly Choropleth Documentation - https://plotly.com/python/choropleth-maps/
*   Using Interact in Google Colab - https://colab.research.google.com/github/jupyter-widgets/ipywidgets/blob/master/docs/source/examples/Using%20Interact.ipynb


