# ADR Plot for different countries affected by Covid-19

ADR (Active, Death and Recovered) cases of a country are derived from 3 datasets provided by JHU, namely Confirmed cases, Recovered cases and Deaths of countries due to the Covid-19 (coronavirus) outbreak. Each dataset consists of countries and their date-wise case counts. Active cases are not provided directly, so we calculate them as:

Active = Confirmed - (Recovered + Deaths)
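
As a quick check of the formula, the sketch below applies it to the last two United Kingdom rows that appear in the example output further down this notebook. The variable names are illustrative only and are not part of the script.

```python
import pandas as pd

# Last two United Kingdom rows, copied from the notebook output
dates = pd.to_datetime(["2020-05-29", "2020-05-30"])
confirmed = pd.Series([272607, 274219], index=dates)
recovered = pd.Series([1172, 1187], index=dates)
deaths = pd.Series([38243, 38458], index=dates)

# Active = Confirmed - (Recovered + Deaths), applied element-wise per date
active = confirmed - (recovered + deaths)
print(active.tolist())  # [233192, 234574]
```

The results match the 'Active' column printed for those dates.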

This script fetches the online datasets, extracts the data for the required countries and plots it in a grid with 3 columns of subplots. Each subplot presents one country's data as a stacked bar plot.

Datasets can be found in the official GitHub repository: https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series

Note: Country names used in the code must exactly match the values in the datasets' *'Country/Region'* column.

In [1]:
# Importing libraries

import numpy as np
import pandas as pd
import math
import matplotlib.pyplot as plt
import matplotlib
from datetime import datetime
from matplotlib.dates import DateFormatter
import matplotlib.dates as mdates
import operator

# This line shall help produce an interactive plot within the notebook
%matplotlib notebook

### Import and Load Dataset

#### Confirmed Cases
The link provided above lists several datasets as comma-separated values (csv) files. We import the one for Covid-19 Confirmed cases:
*'time_series_covid19_confirmed_global.csv'*

In [2]:
# Importing the raw csv file

Confirmed_cases = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')
Confirmed_cases.iloc[0:5,:] # Printing the first 5 rows, not using .head() explicitly

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,5/21/20,5/22/20,5/23/20,5/24/20,5/25/20,5/26/20,5/27/20,5/28/20,5/29/20,5/30/20
0,,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,8676,9216,9998,10582,11173,11831,12456,13036,13659,14525
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,969,981,989,998,1004,1029,1050,1076,1099,1122
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,7728,7918,8113,8306,8503,8697,8857,8997,9134,9267
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,762,762,762,762,763,763,763,763,764,764
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,58,60,61,69,70,70,71,74,81,84


#### Deaths
From the same link, we import the csv file for Deaths due to Covid-19:
 *'time_series_covid19_deaths_global.csv'*
 


In [3]:
Death_cases = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')
Death_cases.iloc[0:5,:]

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,5/21/20,5/22/20,5/23/20,5/24/20,5/25/20,5/26/20,5/27/20,5/28/20,5/29/20,5/30/20
0,,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,193,205,216,218,219,220,227,235,246,249
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,31,31,31,32,32,33,33,33,33,33
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,575,582,592,600,609,617,623,630,638,646
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,51,51,51,51,51,51,51,51,51,51
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,3,3,4,4,4,4,4,4,4,4


#### Recovered Cases
From the same link, we import the csv file for Covid-19 Recovered cases:
 *'time_series_covid19_recovered_global.csv'*

In [4]:
Recovered_cases = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv')
Recovered_cases.iloc[0:5,:]

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,5/21/20,5/22/20,5/23/20,5/24/20,5/25/20,5/26/20,5/27/20,5/28/20,5/29/20,5/30/20
0,,Afghanistan,33.0,65.0,0,0,0,0,0,0,...,938,996,1040,1075,1097,1128,1138,1209,1259,1303
1,,Albania,41.1533,20.1683,0,0,0,0,0,0,...,771,777,783,789,795,803,812,823,851,857
2,,Algeria,28.0339,1.6596,0,0,0,0,0,0,...,4062,4256,4426,4784,4747,4918,5129,5277,5422,5549
3,,Andorra,42.5063,1.5218,0,0,0,0,0,0,...,639,652,653,653,663,676,676,681,684,692
4,,Angola,-11.2027,17.8739,0,0,0,0,0,0,...,17,17,18,18,18,18,18,18,18,18


### Setting Duration with Terminal Dates

In all three datasets the date columns start at column index 4, and the first date is 22nd January 2020.


In [5]:
# Storing Start and End Date as strings in 'DD Month YYYY' format (full month name)

BeginDate = datetime.strptime(Confirmed_cases.columns[4],'%m/%d/%y').strftime("%d %B %Y")
EndDate = datetime.strptime(Confirmed_cases.columns[-1],'%m/%d/%y').strftime("%d %B %Y")
print(BeginDate)
print(EndDate)

22 January 2020
30 May 2020
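
The conversion above can be tried in isolation. This minimal sketch uses only the standard library and hard-codes the two header strings seen in the tables above; the variable names are illustrative.

```python
from datetime import datetime

# Header strings as they appear in the CSV files (month/day/two-digit year)
first_col, last_col = "1/22/20", "5/30/20"

first = datetime.strptime(first_col, "%m/%d/%y")
last = datetime.strptime(last_col, "%m/%d/%y")

# Same output format as BeginDate and EndDate
label = first.strftime("%d %B %Y")

# Number of daily entries between the terminal dates, both inclusive
n_days = (last - first).days + 1

print(label)   # 22 January 2020
print(n_days)  # 130
```

The count of 130 days agrees with the '130 rows' reported for each country DataFrame later in the notebook.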


### Function for Parsing Data

The datasets split some countries into several Provinces/States, so our first step is to add up all rows that belong to the same country. For example:

In [6]:
# For example see Canada raw data
Confirmed_cases.iloc[35:46,:]

Unnamed: 0,Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,...,5/21/20,5/22/20,5/23/20,5/24/20,5/25/20,5/26/20,5/27/20,5/28/20,5/29/20,5/30/20
35,Alberta,Canada,53.9333,-116.5765,0,0,0,0,0,0,...,6768,6800,6818,6860,6879,6901,6926,6955,6979,6992
36,British Columbia,Canada,49.2827,-123.1207,0,0,0,0,0,0,...,2479,2507,2517,2517,2530,2541,2550,2558,2562,2573
37,Grand Princess,Canada,37.6489,-122.6655,0,0,0,0,0,0,...,13,13,13,13,13,13,13,13,13,13
38,Manitoba,Canada,53.7609,-98.8139,0,0,0,0,0,0,...,290,292,292,292,292,292,292,294,294,294
39,New Brunswick,Canada,46.5653,-66.4619,0,0,0,0,0,0,...,121,121,121,121,121,122,123,126,128,129
40,Newfoundland and Labrador,Canada,53.1355,-57.6604,0,0,0,0,0,0,...,260,260,260,260,260,260,260,261,261,261
41,Nova Scotia,Canada,44.682,-63.7443,0,0,0,0,0,0,...,1046,1048,1049,1050,1051,1052,1053,1055,1055,1056
42,Ontario,Canada,51.2538,-85.3232,0,0,0,0,1,1,...,25595,26085,26560,26897,27302,27624,27943,28320,28700,29023
43,Prince Edward Island,Canada,46.5107,-63.4168,0,0,0,0,0,0,...,27,27,27,27,27,27,27,27,27,27
44,Quebec,Canada,52.9399,-73.5491,0,0,0,0,0,0,...,45504,46150,46847,47420,47993,48607,49148,49711,50232,50651


#### FetchData Function

*Params*: Country Name, Dataset (one of the 3 datasets can be passed)

*Returns*: A single row (list) of case counts, with one entry per date between the terminal dates

The function handles only the numerical case counts: it adds up all the rows that correspond to the given country. Countries that occupy a single row in the dataset are handled the same way.

In [7]:
# Function returns a row of cases(numerical data) related to the dataset passed as a parameter for a particular country

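def FetchData(CountryName, Dataset):

    # One running total per date column (date columns start at index 4)
    CountryRow = [0]*(Dataset.shape[1] - 4)

    for i in range(Dataset.shape[0]):
        # Column 1 is 'Country/Region'; a country may span several rows (provinces)
        if Dataset.iloc[i, 1] == CountryName:
            for j in range(4,Dataset.shape[1]):
                CountryRow[j-4] += Dataset.iloc[i, j]

    return CountryRow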

# --- Function Ends Here ---
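
The same per-country summation can be written without explicit loops. The sketch below is a hypothetical alternative, not the notebook's own function: `fetch_data_vectorized` and `sample` are made-up names, and the sample holds only two of Canada's provinces plus Angola, with values copied from the tables above.

```python
import pandas as pd

# Small stand-in for one of the JHU datasets (same column layout, two dates only)
sample = pd.DataFrame({
    "Province/State": ["Alberta", "British Columbia", None],
    "Country/Region": ["Canada", "Canada", "Angola"],
    "Lat": [53.9333, 49.2827, -11.2027],
    "Long": [-116.5765, -123.1207, 17.8739],
    "5/29/20": [6979, 2562, 81],
    "5/30/20": [6992, 2573, 84],
})

def fetch_data_vectorized(country_name, dataset):
    """Hypothetical alternative to FetchData: sum the date columns of all matching rows."""
    rows = dataset[dataset["Country/Region"] == country_name]
    # Date columns start at position 4, as in the original function
    return rows.iloc[:, 4:].sum(axis=0).tolist()

canada = fetch_data_vectorized("Canada", sample)
angola = fetch_data_vectorized("Angola", sample)
print(canada)  # [9541, 9565]
print(angola)  # [81, 84]
```

Boolean filtering followed by a column-wise sum lets pandas do the row matching and the addition in one pass, which tends to be much faster than cell-by-cell access on the full dataset.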

### Function for creating Country DataFrame

In the next step of data processing, we build one DataFrame per country that lists all of its cases date-wise.
Here 'all cases' means 4 categories: Confirmed cases, Active cases, Recovered cases and Deaths.

*Params*: Country Name

*Returns*: Country Dataframe with 4 columns, indexed by dates

In [8]:
# Function returns a Dataframe of the country passed with index = Dates, and 4 columns each related to:-
# 1. Confirmed cases
# 2. Active cases
# 3. Recovered cases
# 4. Deaths
# Function also prints the dataframe

def CountryDataframe(CountryName):
    
    CCdata = FetchData(CountryName, Confirmed_cases)
    DCdata = FetchData(CountryName, Death_cases)
    RCdata = FetchData(CountryName, Recovered_cases)
    ACdata = list(map(operator.sub,CCdata,list(map(operator.add, DCdata, RCdata))))
    
    Dateslist = pd.date_range(BeginDate, EndDate)
    df = pd.DataFrame(list(zip(CCdata, ACdata, RCdata, DCdata)),index = Dateslist, columns = ['Confirmed', 'Active','Recovered','Deaths'])
    print(CountryName)
    print(df)
    return df

# --- Function Ends Here ---

The following example shows the function at work for the country 'United Kingdom':

In [9]:
# For example
UKDf = CountryDataframe('United Kingdom')

United Kingdom
            Confirmed  Active  Recovered  Deaths
2020-01-22          0       0          0       0
2020-01-23          0       0          0       0
2020-01-24          0       0          0       0
2020-01-25          0       0          0       0
2020-01-26          0       0          0       0
...               ...     ...        ...     ...
2020-05-26     266599  228308       1161   37130
2020-05-27     268619  229911       1166   37542
2020-05-28     270508  231422       1167   37919
2020-05-29     272607  233192       1172   38243
2020-05-30     274219  234574       1187   38458

[130 rows x 4 columns]


### Function for plotting the graphs

*Params*: List of Countries (Number of countries should be a multiple of 3)

*Returns*: None

The function draws one subplot per country, showing its data as a stacked bar graph (x-axis: dates, y-axis: number of cases). It uses all the functions defined above. The final plot is also saved as a 'png' file ('ADR.png') in the current working directory.
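
The mapping from a country's position in the list to its cell in the 3-column grid can be sketched on its own. This is an illustration only: `grid_positions` is a hypothetical helper, and it uses `divmod` in place of the nested loops and counter used in the plotting function.

```python
import math

def grid_positions(n_items, n_cols=3):
    """Return the number of rows needed and the (row, column) cell of each item."""
    n_rows = math.ceil(n_items / n_cols)
    # divmod(k, n_cols) gives (k // n_cols, k % n_cols), i.e. row-major order
    positions = [divmod(k, n_cols) for k in range(n_items)]
    return n_rows, positions

n_rows, positions = grid_positions(15)
print(n_rows)         # 5
print(positions[0])   # (0, 0)
print(positions[14])  # (4, 2)
```

With 15 countries every cell of the 5 x 3 grid is used. A count that is not a multiple of 3 would leave empty cells in the last row, which is why a multiple of 3 is recommended.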

In [10]:
# Function takes in a list of countries (in multiples of 3) and plots their ADR data
# Each country is plotted in one of the 3 columns of the figure
# The list is looped and then the plotting is done

def PlotCountryADRData(loc):

    noc = len(loc)
    nor = math.ceil(noc/3)
    
    # squeeze = False keeps axes 2-D even when there is only one row of subplots
    fig, axes = plt.subplots(nrows = nor, ncols = 3, figsize = (20,13), squeeze = False)
    fig.suptitle('Active, Death and Recovered Cases in selected countries as of ' + EndDate, fontsize = 18)
    plt.subplots_adjust(hspace = .4)
    
    k = 0
    for i in range(nor):
        for j in range(3):
            if k < noc:
                CountryData = CountryDataframe(loc[k])
                
                axes[i,j].bar(CountryData.index, CountryData['Active'], color = 'indigo', width = 1, edgecolor = 'white', linewidth = 0.1, label = 'Active')
                axes[i,j].bar(CountryData.index, CountryData['Deaths'], color = 'red', width = 1, edgecolor = 'white', linewidth = 0.1, label = 'Deaths', bottom = CountryData['Active'])
                axes[i,j].bar(CountryData.index, CountryData['Recovered'], color = 'green', width = 1, edgecolor = 'white', linewidth = 0.1, label = 'Recovered', bottom = CountryData['Active'] + CountryData['Deaths'])
                
                axes[i,j].set_title(loc[k], y = 1, fontsize = 13, fontweight='bold')
                
                axes[i,j].grid(axis = 'y', alpha = 0.5)
                
                myFmt = DateFormatter("%d-%b")
                axes[i,j].xaxis.set_major_formatter(myFmt)
                
                axes[i,j].spines['top'].set_visible(False)
                axes[i,j].spines['right'].set_visible(False)
                
                if k == noc-1:
                    handles, labels = axes[i,j].get_legend_handles_labels()
                
                k = k + 1
    
    fig.legend(handles, labels, bbox_to_anchor=(0.5, 0.95), loc='upper center', ncol = 3, fontsize = 13, frameon=False)
    plt.savefig('ADR.png')
    
# --- Function Ends Here ---
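
The stacking itself relies on the `bottom` argument of `bar`: each series is drawn starting where the previous ones ended. Below is a minimal single-axes sketch of that idea, using the last three Australia rows from the output in this notebook. It assumes the non-interactive 'Agg' backend so that it runs without a display, and it tracks a running `bottom` instead of writing the sums out by hand as the function above does.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no interactive window needed
import matplotlib.pyplot as plt
import pandas as pd

# Last three Australia rows, copied from the notebook output
dates = pd.date_range("2020-05-28", periods=3)
df = pd.DataFrame({"Active": [486, 476, 475],
                   "Deaths": [103, 103, 103],
                   "Recovered": [6576, 6605, 6614]}, index=dates)

fig, ax = plt.subplots()
bottom = pd.Series(0, index=df.index)  # height already occupied on each date
for column, color in [("Active", "indigo"), ("Deaths", "red"), ("Recovered", "green")]:
    ax.bar(df.index, df[column], bottom=bottom, color=color, width=1, label=column)
    bottom = bottom + df[column]       # the next series starts on top of this one

stack_tops = bottom.tolist()
print(stack_tops)  # [7165, 7184, 7192]
```

The top of each stack equals the Confirmed count for that date, since Confirmed = Active + Deaths + Recovered.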


### Create a List of Countries

When passing country names, take special care to spell them exactly as they appear in the dataset.

In [11]:
list_of_countries = ['India', 'China', 'Japan', 
                     'Italy', 'Iran', 'Korea, South', 
                     'France', 'Turkey', 'United Kingdom', 
                     'Russia', 'Canada', 'US',
                     'Germany', 'Brazil', 'Australia']
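
Because FetchData compares names with `==`, a misspelled country does not raise an error; it simply yields a row of zeros. A quick check before plotting can catch this. The sketch below is an assumption-labelled illustration: `sample` is a tiny stand-in for the real dataset and `requested` is a made-up list containing one deliberately wrong spelling.

```python
import pandas as pd

# Tiny stand-in for the 'Country/Region' column of a JHU dataset
sample = pd.DataFrame({"Country/Region": ["Korea, South", "US", "United Kingdom",
                                          "Canada", "Canada"]})

# Hypothetical request; 'South Korea' is not how the dataset spells it
requested = ["South Korea", "US", "Canada"]

known = set(sample["Country/Region"])
missing = [name for name in requested if name not in known]
print(missing)  # ['South Korea']
```

On the real data, the same check could be run with `Confirmed_cases` in place of `sample` and `list_of_countries` in place of `requested`.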

### Finally the Plot

The plot is interactive; use the scroll bars to see the whole figure. The plot is followed by the DataFrame of each country passed. Also check the image of the plot ('ADR.png') saved on your system.

In [12]:
PlotCountryADRData(list_of_countries)

India
            Confirmed  Active  Recovered  Deaths
2020-01-22          0       0          0       0
2020-01-23          0       0          0       0
2020-01-24          0       0          0       0
2020-01-25          0       0          0       0
2020-01-26          0       0          0       0
...               ...     ...        ...     ...
2020-05-26     150793   82172      64277    4344
2020-05-27     158086   85803      67749    4534
2020-05-28     165386   89755      70920    4711
2020-05-29     173491   85884      82627    4980
2020-05-30     181827   89706      86936    5185

[130 rows x 4 columns]
China
            Confirmed  Active  Recovered  Deaths
2020-01-22        548     503         28      17
2020-01-23        643     595         30      18
2020-01-24        920     858         36      26
2020-01-25       1406    1325         39      42
2020-01-26       2075    1970         49      56
...               ...     ...        ...     ...
2020-05-26      84103     107    

Australia
            Confirmed  Active  Recovered  Deaths
2020-01-22          0       0          0       0
2020-01-23          0       0          0       0
2020-01-24          0       0          0       0
2020-01-25          0       0          0       0
2020-01-26          4       4          0       0
...               ...     ...        ...     ...
2020-05-26       7139     476       6560     103
2020-05-27       7150     468       6579     103
2020-05-28       7165     486       6576     103
2020-05-29       7184     476       6605     103
2020-05-30       7192     475       6614     103

[130 rows x 4 columns]
