<a href="https://colab.research.google.com/github/niltontac/EspAnalise-EngDados/blob/master/Covid_19_Analysis_and_Predictions%20-%20In%20Progress.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#About these datasets

#####The datasets explored in the analyses below are provided by Johns Hopkins University, a renowned institution in the United States that is at the forefront of collecting worldwide data on Covid-19. I also use data from the Kaggle platform, where users from all over the world contribute real data from reliable sources.

#####All datasets explored here carry daily updates on the numbers of confirmed cases, deaths and recoveries from Covid-19. Note that these are time series data, so the number of cases on a given day is a cumulative count.

---

#####Source (Datasets):
#####Johns Hopkins University:
#####https://coronavirus.jhu.edu/

#####Kaggle:
#####https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
#####https://www.kaggle.com/unanimad/corona-virus-brazil

#####All datasets on github:

##### https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_time_series
##### https://github.com/niltontac/EspAnalise-EngDados/tree/master/data/Novel_Corona_Virus_2019_Dataset
##### https://github.com/niltontac/EspAnalise-EngDados/tree/master/data/covid19_brazil_data
---

#####Analyst: Nilton Thiago de Andrade Coura
#####Recife/PE - Brazil
#####niltontac@gmail.com
#####https://github.com/niltontac

# Covid-19 - Exploratory Analysis and Predictions using Machine Learning Algorithms

![alt text](https://i.ibb.co/txCZFvr/3-D-medical-animation-coronavirus-structure.jpg)

In [1]:
# Importing Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly as py
import plotly.express as px

from fbprophet.plot import plot_plotly
from fbprophet import Prophet
from fbprophet.plot import add_changepoints_to_plot


import warnings
warnings.filterwarnings('ignore')

# Loading dataset
# Last dataset update 04/04/2020

covid19confirmed = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv')

covid19deaths = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv')

covid19recovered = pd.read_csv('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv')

covid19 = pd.read_csv('https://raw.githubusercontent.com/niltontac/EspAnalise-EngDados/master/data/Novel_Corona_Virus_2019_Dataset/covid_19_data.csv', parse_dates=['ObservationDate', 'Last Update'])

covid19Brazil = pd.read_csv('https://raw.githubusercontent.com/niltontac/EspAnalise-EngDados/master/data/covid19_brazil_data/brazil_covid19.csv')



Last Update:

In [0]:
last_date_update = '4/4/20'
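
The date above is hard-coded, so it has to be edited whenever the CSV files are refreshed. As a minimal sketch (not part of the original notebook), the most recent date column could instead be picked from the column names, assuming they follow the `M/D/YY` pattern seen in the outputs. The `columns` list is an illustrative subset and `latest_date_column` is a hypothetical helper:

```python
from datetime import datetime

def latest_date_column(columns):
    """Return the column name holding the most recent M/D/YY date."""
    dated = []
    for name in columns:
        try:
            dated.append((datetime.strptime(name, '%m/%d/%y'), name))
        except ValueError:
            # Not a date column (e.g. 'Lat' or 'Country/Region')
            continue
    if not dated:
        raise ValueError('no M/D/YY date columns found')
    return max(dated)[1]

# Illustrative subset of the JHU column names
columns = ['Province/State', 'Country/Region', 'Lat', 'Long',
           '1/22/20', '3/31/20', '4/3/20', '4/4/20']
last_date_update = latest_date_column(columns)
print(last_date_update)
```

In the notebook itself the same helper could be called as `latest_date_column(covid19confirmed.columns)`.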

Checking the last 5 rows of each dataset to confirm when it was last updated:

In [3]:
print('covid19confirmed:')
print(covid19confirmed.tail())
####
print('covid19deaths:')
print(covid19deaths.tail())
####
print('covid19recovered:')
print(covid19recovered.tail())
####
print('covid19:')
print(covid19.tail())
####
print('covid19Brazil:')
print(covid19Brazil.tail())

covid19confirmed:
                        Province/State  Country/Region  ...  4/3/20  4/4/20
254                                NaN         Burundi  ...       3       3
255                                NaN    Sierra Leone  ...       2       4
256   Bonaire, Sint Eustatius and Saba     Netherlands  ...       2       2
257                                NaN          Malawi  ...       3       4
258  Falkland Islands (Islas Malvinas)  United Kingdom  ...       0       1

[5 rows x 78 columns]
covid19deaths:
                        Province/State  Country/Region  ...  4/3/20  4/4/20
254                                NaN         Burundi  ...       0       0
255                                NaN    Sierra Leone  ...       0       0
256   Bonaire, Sint Eustatius and Saba     Netherlands  ...       0       0
257                                NaN          Malawi  ...       0       0
258  Falkland Islands (Islas Malvinas)  United Kingdom  ...       0       0

[5 rows x 78 columns]
covid19re

In [0]:
# Rename column 'ObservationDate' to 'Date'

covid19 = covid19.rename(columns={'ObservationDate' : 'Date'})

Dimensions of the datasets (rows, columns):

In [5]:
print('covid19confirmed:')
print(covid19confirmed.shape)
####
print('covid19deaths:')
print(covid19deaths.shape)
####
print('covid19recovered:')
print(covid19recovered.shape)
####
print('covid19:')
print(covid19.shape)
####
print('covid19Brazil:')
print(covid19Brazil.shape)

covid19confirmed:
(259, 78)
covid19deaths:
(259, 78)
covid19recovered:
(245, 78)
covid19:
(11930, 8)
covid19Brazil:
(1809, 5)
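
The three Johns Hopkins files are in wide format, with a few descriptive columns followed by one column per day, which is why they report 78 columns. A minimal sketch, on a tiny hand-made frame rather than the real files, of how such a table could be reshaped to one row per country and day with `DataFrame.melt`. The two rows reuse the values from the `covid19confirmed` tail shown earlier, `Lat`/`Long` are omitted for brevity, and the names `wide_df` and `long_df` are illustrative:

```python
import pandas as pd

# Toy wide-format frame mimicking the JHU layout
wide_df = pd.DataFrame({
    'Province/State': [None, None],
    'Country/Region': ['Sierra Leone', 'Malawi'],
    '4/3/20': [2, 3],
    '4/4/20': [4, 4],
})

# One row per (country, date) instead of one column per date
long_df = wide_df.melt(id_vars=['Province/State', 'Country/Region'],
                       var_name='Date', value_name='Confirmed')
long_df['Date'] = pd.to_datetime(long_df['Date'], format='%m/%d/%y')
long_df = long_df.sort_values(['Country/Region', 'Date']).reset_index(drop=True)

# Cumulative totals become daily increments with a per-country diff
long_df['New'] = long_df.groupby('Country/Region')['Confirmed'].diff()
print(long_df)
```

The first day of each country has no previous value, so its `New` entry is `NaN`.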


Checking for null or missing values:

In [6]:
print('covid19confirmed:')
print(pd.DataFrame(covid19confirmed.isnull().sum()))
####
print('covid19deaths:')
print(pd.DataFrame(covid19deaths.isnull().sum()))
####
print('covid19recovered:')
print(pd.DataFrame(covid19recovered.isnull().sum()))
####
print('covid19:')
print(pd.DataFrame(covid19.isnull().sum()))
####
print('covid19Brazil:')
print(pd.DataFrame(covid19Brazil.isnull().sum()))

covid19confirmed:
                  0
Province/State  178
Country/Region    0
Lat               0
Long              0
1/22/20           0
...             ...
3/31/20           0
4/1/20            0
4/2/20            0
4/3/20            0
4/4/20            0

[78 rows x 1 columns]
covid19deaths:
                  0
Province/State  178
Country/Region    0
Lat               0
Long              0
1/22/20           0
...             ...
3/31/20           0
4/1/20            0
4/2/20            0
4/3/20            0
4/4/20            0

[78 rows x 1 columns]
covid19recovered:
                  0
Province/State  179
Country/Region    0
Lat               0
Long              0
1/22/20           0
...             ...
3/31/20           0
4/1/20            0
4/2/20            0
4/3/20            0
4/4/20            0

[78 rows x 1 columns]
covid19:
                   0
SNo                0
Date               0
Province/State  5663
Country/Region     0
Last Update        0
Confirmed          0
Deat

Some datasets have missing (null) values in the "Province/State" column.
Let's replace them with 'unknown':

In [0]:
# Replace missing values

covid19confirmed = covid19confirmed.fillna('unknown')
covid19deaths = covid19deaths.fillna('unknown')
covid19recovered = covid19recovered.fillna('unknown')
covid19 = covid19.fillna('unknown')

In [8]:
# Checking for null or missing values again

print('covid19confirmed:')
print(pd.DataFrame(covid19confirmed.isnull().sum()))
####
print('covid19deaths:')
print(pd.DataFrame(covid19deaths.isnull().sum()))
####
print('covid19recovered:')
print(pd.DataFrame(covid19recovered.isnull().sum()))
####
print('covid19:')
print(pd.DataFrame(covid19.isnull().sum()))

covid19confirmed:
                0
Province/State  0
Country/Region  0
Lat             0
Long            0
1/22/20         0
...            ..
3/31/20         0
4/1/20          0
4/2/20          0
4/3/20          0
4/4/20          0

[78 rows x 1 columns]
covid19deaths:
                0
Province/State  0
Country/Region  0
Lat             0
Long            0
1/22/20         0
...            ..
3/31/20         0
4/1/20          0
4/2/20          0
4/3/20          0
4/4/20          0

[78 rows x 1 columns]
covid19recovered:
                0
Province/State  0
Country/Region  0
Lat             0
Long            0
1/22/20         0
...            ..
3/31/20         0
4/1/20          0
4/2/20          0
4/3/20          0
4/4/20          0

[78 rows x 1 columns]
covid19:
                0
SNo             0
Date            0
Province/State  0
Country/Region  0
Last Update     0
Confirmed       0
Deaths          0
Recovered       0
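
Calling `fillna` with a string on a whole DataFrame would also fill any gaps in numeric columns with that string, turning them into `object` columns. A minimal sketch on a toy frame (the values are made up for illustration) of limiting the replacement to the `Province/State` column:

```python
import pandas as pd

toy = pd.DataFrame({
    'Province/State': ['Hubei', None],
    'Country/Region': ['China', 'Brazil'],
    'Confirmed': [10.0, None],   # a numeric gap we do not want to overwrite
})

# Fill only the text column; the numeric gap stays NaN and the dtype stays float
toy['Province/State'] = toy['Province/State'].fillna('unknown')

print(toy.isnull().sum())
print(toy.dtypes)
```

In the datasets used here only `Province/State` has gaps, so both approaches give the same result; the column-wise form is simply safer if a later update introduces numeric gaps.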


##Plotly Visualizations: Exploratory data analysis in the World and Brazil. Predictions using some machine learning algorithms

###Worldwide:

All records including confirmed cases, deaths and recovered:

In [9]:
# all confirmed, deaths and recovered cases

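# Select several columns with a list; indexing a groupby with a bare tuple fails on recent pandas
cases_growth = covid19.groupby('Date')[['Confirmed', 'Deaths', 'Recovered']].sum()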
cases_growth = cases_growth.reset_index()
cases_growth = cases_growth.sort_values('Date', ascending=False)

fig = go.Figure()
fig.update_layout(template='plotly_dark')

fig.add_trace(go.Scatter(x=cases_growth['Date'], 
                        y=cases_growth['Confirmed'], 
                        mode='lines+markers',
                        name='Confirmed',
                        line=dict(color='Yellow', width=2)))

fig.add_trace(go.Scatter(x=cases_growth['Date'], 
                        y=cases_growth['Deaths'], 
                        mode='lines+markers',
                        name='Deaths',
                        line=dict(color='red', width=2)))

fig.add_trace(go.Scatter(x=cases_growth['Date'], 
                        y=cases_growth['Recovered'], 
                        mode='lines+markers',
                        name='Recovered',
                        line=dict(color='green', width=2)))

fig.show()

Death and recovery rates and percentage increase in confirmed cases:

In [10]:
cases_rate = covid19.groupby(['Date']).agg({'Deaths': ['sum'],'Recovered': ['sum'],'Confirmed': ['sum']})
cases_rate.columns = ['Global_Deaths','Global_Recovered','Global_Confirmed']
cases_rate = cases_rate.reset_index()
cases_rate['Increase_cases_per_day']=cases_rate['Global_Confirmed'].diff().shift(-1)

cases_rate['Global_Deaths_rate_%'] = cases_rate.apply(lambda row: ((row.Global_Deaths)/(row.Global_Confirmed))*100 , axis=1)
cases_rate['Global_Recovered_rate_%'] = cases_rate.apply(lambda row: ((row.Global_Recovered)/(row.Global_Confirmed))*100 , axis=1)
cases_rate['Global_Growth_rate_%']=cases_rate.apply(lambda row: row.Increase_cases_per_day/row.Global_Confirmed*100, axis=1)
cases_rate['Global_Growth_rate_%']=cases_rate['Global_Growth_rate_%'].shift(+1)



fig = go.Figure()
fig.update_layout(template='plotly_dark')
fig.add_trace(go.Scatter(x=cases_rate['Date'], 
                         y=cases_rate['Global_Deaths_rate_%'],
                         mode='lines+markers',
                         name='Death rate %',
                         line=dict(color='red', width=2)))

fig.add_trace(go.Scatter(x=cases_rate['Date'], 
                         y=cases_rate['Global_Recovered_rate_%'],
                         mode='lines+markers',
                         name='Recovery rate %',
                         line=dict(color='Green', width=2)))

fig.add_trace(go.Scatter(x=cases_rate['Date'], 
                         y=cases_rate['Global_Growth_rate_%'],
                         mode='lines+markers',
                         name='Growth rate confirmed %',
                         line=dict(color='Yellow', width=2)))

fig.show()

In [11]:
cases_rate.tail()

,Date,Global_Deaths,Global_Recovered,Global_Confirmed,Increase_cases_per_day,Global_Deaths_rate_%,Global_Recovered_rate_%,Global_Growth_rate_%
69,2020-03-31,42107.0,178034.0,857487.0,75118.0,4.910512,20.762297,9.601912
70,2020-04-01,46809.0,193177.0,932605.0,80552.0,5.019167,20.7137,8.760249
71,2020-04-02,52983.0,210263.0,1013157.0,82760.0,5.229496,20.753249,8.637312
72,2020-04-03,58787.0,225796.0,1095917.0,101488.0,5.364184,20.603385,8.168527
73,2020-04-04,64606.0,246152.0,1197405.0,,5.395501,20.557121,9.260555
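
The `diff().shift(-1)` step followed by `shift(+1)` works out to the day-over-day percentage change of the cumulative total. As a cross-check, here is a minimal sketch that recomputes the last four growth rates from the `Global_Confirmed` values in the table with `Series.pct_change`, together with the latest death rate:

```python
import pandas as pd

# Global_Confirmed for 2020-03-31 .. 2020-04-04, copied from the table above
confirmed = pd.Series([857487.0, 932605.0, 1013157.0, 1095917.0, 1197405.0])

# (today - yesterday) / yesterday * 100
growth_rate = confirmed.pct_change() * 100
print(growth_rate.round(4).tolist())

# Deaths as a share of confirmed cases on 2020-04-04
death_rate = 64606.0 / 1197405.0 * 100
print(round(death_rate, 4))
```

The first entry is `NaN` because the sketch has no value for 2020-03-30; the remaining entries should agree with the `Global_Growth_rate_%` column, and the death rate with `Global_Deaths_rate_%`.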


Confirmed cases, Deaths and Recovered in all affected countries around the world:

In [12]:
# Latest cumulative totals per country
cases_temp = covid19confirmed[['Country/Region', last_date_update]]
cases_temp = cases_temp.groupby('Country/Region').sum().sort_values(by=last_date_update, ascending=False)
# Each right-hand side is a Series indexed by country, so pandas aligns it with cases_temp
cases_temp['Recovered'] = covid19recovered.groupby('Country/Region')[last_date_update].sum()
cases_temp['Deaths'] = covid19deaths.groupby('Country/Region')[last_date_update].sum()
cases_temp['Active'] = cases_temp[last_date_update] - cases_temp['Recovered'] - cases_temp['Deaths']
cases_temp = cases_temp.rename(columns={last_date_update: 'Confirmed'})

cases_temp.style.background_gradient(cmap='Reds')

Country/Region,Confirmed,Recovered,Deaths,Active
US,308850,14652,8407,285791
Spain,126168,34219,11947,80002
Italy,124632,20996,15362,88274
Germany,96092,26400,1444,68248
France,90848,15572,7574,67702
China,82543,76946,3330,2267
Iran,55743,19736,3452,32555
United Kingdom,42477,215,4320,37942
Turkey,23934,786,501,22647
Switzerland,20505,6415,666,13424


In [13]:
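# numeric_only=True keeps the sum to the numeric columns (text and date columns are skipped)
data_countries = covid19.groupby(['Country/Region', 'Date']).sum(numeric_only=True).reset_index().sort_values('Date', ascending=False)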
data_countries = data_countries.drop_duplicates(subset = ['Country/Region'])
data_countries_confirmed = data_countries[data_countries["Confirmed"]>0]
data_countries_confirmed

,Country/Region,Date,SNo,Confirmed,Deaths,Recovered
4775,Saint Vincent and the Grenadines,2020-04-04,11753,7.0,0.0,1.0
1363,Czech Republic,2020-04-04,11656,4472.0,59.0,78.0
182,Antigua and Barbuda,2020-04-04,11620,15.0,0.0,0.0
3450,Malawi,2020-04-04,11715,4.0,0.0,0.0
3239,Lithuania,2020-04-04,11711,771.0,11.0,7.0
...,...,...,...,...,...,...
4294,Palestine,2020-03-09,4322,22.0,0.0,0.0
6098,Vatican City,2020-03-09,4507,1.0,0.0,0.0
4570,Republic of Ireland,2020-03-08,4067,21.0,0.0,0.0
4085,North Ireland,2020-02-28,2685,1.0,0.0,0.0
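
The cell above keeps the most recent row per country by sorting newest-first and then dropping duplicate country names, since `drop_duplicates` keeps the first occurrence by default. A minimal sketch of that pattern on a toy frame, reusing the Malawi and Sierra Leone counts from the `covid19confirmed` tail shown earlier:

```python
import pandas as pd

toy = pd.DataFrame({
    'Country/Region': ['Malawi', 'Malawi', 'Sierra Leone', 'Sierra Leone'],
    'Date': pd.to_datetime(['2020-04-03', '2020-04-04', '2020-04-03', '2020-04-04']),
    'Confirmed': [3.0, 4.0, 2.0, 4.0],
})

# Newest rows first, then keep the first (latest) row seen for each country
latest = (toy.sort_values('Date', ascending=False)
             .drop_duplicates(subset=['Country/Region'])
             .reset_index(drop=True))
print(latest)
```

This also explains why some countries in the output carry older dates: their last row in the source data is simply older than 2020-04-04.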


In [0]:
mortality = covid19.copy()

mortality = mortality.groupby(['Date', 'Country/Region']).agg({'Deaths': ['sum'],'Recovered': ['sum'],'Confirmed': ['sum']})
mortality.columns = ['Deaths','Recovered','Confirmed']
mortality = mortality.reset_index()
# Keep only country-days with at least one death and one confirmed case;
# this also rules out division by zero below
mortality = mortality[mortality.Deaths != 0]
mortality = mortality[mortality.Confirmed != 0]

# Mortality rate in percent: deaths as a share of confirmed cases
# (the earlier "+1" on deaths inflated the rate and is not needed after the filter)
mortality['mortality_rate'] = mortality['Deaths'] / mortality['Confirmed'] * 100

In [0]:
floorVar=0
# Assumed saturation level for the logistic model of confirmed cases
# (10 million cases; despite the name, this is not the world population)
worldPop=10000000

#Modelling total confirmed cases 
confirmed_training_dataset = pd.DataFrame(covid19.groupby('Date')['Confirmed'].sum().reset_index()).rename(columns={'Date': 'ds', 'Confirmed': 'y'})
#confirmed_training_dataset.insert(0,'floor',1)
confirmed_training_dataset['floor'] = floorVar
confirmed_training_dataset['cap'] = worldPop

#Modelling mortality rate
mortality_training_dataset = pd.DataFrame(mortality.groupby('Date')['mortality_rate'].mean().reset_index()).rename(columns={'Date': 'ds', 'mortality_rate': 'y'})

#Modelling deaths
death_training_dataset = pd.DataFrame(covid19.groupby('Date')['Deaths'].sum().reset_index()).rename(columns={'Date': 'ds', 'Deaths': 'y'})
death_training_dataset['floor'] = 0
# Assumed saturation level for deaths. It is below the 64,606 deaths already
# recorded on 2020-04-04; a logistic cap should exceed the observed maximum
death_training_dataset['cap'] = 25000

In [16]:
# Total dataframe model 
m = Prophet(
    growth="logistic",
    interval_width=0.98,
   # changepoint_prior_scale=0.05,
    #changepoint_range=0.9,
    yearly_seasonality=False,
    weekly_seasonality=False,
    daily_seasonality=True,
    seasonality_mode='additive'
    )

m.fit(confirmed_training_dataset)
future = m.make_future_dataframe(periods=50)
future['cap']=worldPop
future['floor']=floorVar
confirmed_forecast = m.predict(future)

# Mortality rate model
m_mortality = Prophet()
m_mortality.fit(mortality_training_dataset)
mortality_future = m_mortality.make_future_dataframe(periods=31)
mortality_forecast = m_mortality.predict(mortality_future)

# Deaths model
m2 = Prophet(interval_width=0.95,
            growth="logistic")
m2.fit(death_training_dataset)

future2 = m2.make_future_dataframe(periods=7)
future2['cap']=25000
future2['floor']=0
death_forecast = m2.predict(future2)

INFO:fbprophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:fbprophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
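
With `growth="logistic"`, Prophet's trend saturates at the value in the `cap` column, so the fitted trend cannot rise above it. A simplified sketch of a plain logistic curve shows the effect. Prophet's actual trend also includes changepoints, and the rate `k` and midpoint `m` below are arbitrary illustrative values, not fitted parameters:

```python
import math

def logistic(t, cap, k, m):
    """Plain logistic curve: approaches `cap` as t grows."""
    return cap / (1.0 + math.exp(-k * (t - m)))

cap = 25000        # same cap as the deaths model above
k, m = 0.2, 60     # arbitrary illustrative rate and midpoint (in days)

for day in (0, 60, 120, 365):
    print(day, round(logistic(day, cap, k, m)))
```

Whatever the horizon, the curve never exceeds the cap, so a cap set below the observed totals forces the fitted trend underneath the data.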


In [17]:
fig = plot_plotly(m, confirmed_forecast)
annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.10,
                              xanchor='left', yanchor='bottom',
                              text='Predictions for Total Confirmed cases',
                              font=dict(family='Arial',
                                        size=25,
                                        color='rgb(37,37,37)'),
                              showarrow=False))
fig.update_layout(annotations=annotations)
fig

In [18]:
fig = plot_plotly(m_mortality, mortality_forecast)
annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.10,
                              xanchor='left', yanchor='bottom',
                              text='Predictions for mortality rate',
                              font=dict(family='Arial',
                                        size=25,
                                        color='rgb(37,37,37)'),
                              showarrow=False))
fig.update_layout(annotations=annotations)
fig

In [19]:
fig_death = plot_plotly(m2, death_forecast)  
annotations = []
annotations.append(dict(xref='paper', yref='paper', x=0.0, y=1.10,
                              xanchor='left', yanchor='bottom',
                              text='Predictions for Deaths',
                              font=dict(family='Arial',
                                        size=25,
                                        color='rgb(37,37,37)'),
                              showarrow=False))
fig_death.update_layout(annotations=annotations)
fig_death

In [0]:
#from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
#init_notebook_mode(connected=True) 

#data_map = [ dict(
#        type = 'choropleth',
#        locations = data_countries_confirmed['Country/Region'],
#        locationmode = 'country names',
#        z = data_countries_confirmed['Confirmed'],
#        colorscale=
#            [[0.0, "rgb(251, 237, 235)"],
#            [0.09, "rgb(245, 211, 206)"],
#            [0.12, "rgb(239, 179, 171)"],
#            [0.15, "rgb(236, 148, 136)"],
#            [0.22, "rgb(239, 117, 100)"],
#            [0.35, "rgb(235, 90, 70)"],
#            [0.45, "rgb(207, 81, 61)"],
#            [0.65, "rgb(176, 70, 50)"],
#            [0.85, "rgb(147, 59, 39)"],
#            [1.00, "rgb(110, 47, 26)"]],
#        autocolorscale = False,
#        reversescale = False,
#        marker = dict(
#            line = dict (
#                color = 'rgb(180,180,180)',
#                width = 0.5
#            ) 
#        ),
#        colorbar = dict(
#            autotick = False,
#            tickprefix = '',
#            title = 'Participant'),
#) 
#       ]

#layout = dict(
#    title = "Last Confirmed Cases (Till April 02, 2020)",
#    geo = dict(
#        showframe = False,
#        showcoastlines = True,
#        projection = dict(type = 'Mercator'),
#        width=500,height=400)
#)

#w_map = dict( data_map=data_map, layout=layout)
#iplot( w_map, validate=False)

###Brazil:

In [21]:
cases_Brazil = covid19.loc[covid19['Country/Region']=='Brazil']
cases_Brazil = cases_Brazil.groupby(['Date', 'Country/Region']).agg({'Confirmed':['sum'], 'Deaths':['sum'], 'Recovered':['sum']}).sort_values('Date', ascending = False)
cases_Brazil.columns = ['Confirmed', 'Deaths', 'Recovered']
cases_Brazil = cases_Brazil.reset_index()
# Rows are sorted newest-first, so diff(-1) subtracts the previous day's total (the next row)
cases_Brazil['New_daily_Confirmed_cases'] = cases_Brazil['Confirmed'].diff(-1)
cases_Brazil['New_daily_Deaths_cases'] = cases_Brazil['Deaths'].diff(-1)
cases_Brazil['New_daily_Recovered_cases'] = cases_Brazil['Recovered'].diff(-1)

cases_Brazil_confirmed = cases_Brazil[cases_Brazil['Confirmed']!=0]
cases_Brazil_confirmed

,Date,Country/Region,Confirmed,Deaths,Recovered,New_daily_Confirmed_cases,New_daily_Deaths_cases,New_daily_Recovered_cases
0,2020-04-04,Brazil,10360.0,445.0,127.0,1304.0,86.0,0.0
1,2020-04-03,Brazil,9056.0,359.0,127.0,1012.0,35.0,0.0
2,2020-04-02,Brazil,8044.0,324.0,127.0,1208.0,84.0,0.0
3,2020-04-01,Brazil,6836.0,240.0,127.0,1119.0,39.0,0.0
4,2020-03-31,Brazil,5717.0,201.0,127.0,1138.0,42.0,7.0
5,2020-03-30,Brazil,4579.0,159.0,120.0,323.0,23.0,114.0
6,2020-03-29,Brazil,4256.0,136.0,6.0,352.0,25.0,0.0
7,2020-03-28,Brazil,3904.0,111.0,6.0,487.0,19.0,0.0
8,2020-03-27,Brazil,3417.0,92.0,6.0,432.0,15.0,0.0
9,2020-03-26,Brazil,2985.0,77.0,6.0,...,...,...


In [22]:
fig = go.Figure()
fig.update_layout(template='plotly_dark')

fig.add_trace(go.Scatter(x=cases_Brazil_confirmed['Date'], 
                        y=cases_Brazil_confirmed['Confirmed'], 
                        mode='lines+markers',
                        name='Confirmed',
                        line=dict(color='yellow', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_confirmed['Date'], 
                        y=cases_Brazil_confirmed['Deaths'], 
                        mode='lines+markers',
                        name='Deaths',
                        line=dict(color='red', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_confirmed['Date'], 
                        y=cases_Brazil_confirmed['Recovered'], 
                        mode='lines+markers',
                        name='Recovered',
                        line=dict(color='green', width=2)))

fig.show()

In [23]:
cases_Brazil_rate = covid19.loc[covid19['Country/Region']=='Brazil']
cases_Brazil_rate = cases_Brazil_rate.groupby(['Date']).agg({'Deaths': ['sum'],'Recovered': ['sum'],'Confirmed': ['sum']})
cases_Brazil_rate.columns = ['Brazil_Deaths','Brazil_Recovered','Brazil_Confirmed']
cases_Brazil_rate = cases_Brazil_rate.reset_index()
# Increase from each day to the next; the growth rate is shifted back into place below
cases_Brazil_rate['Increase_cases_per_day_in_Brazil']=cases_Brazil_rate['Brazil_Confirmed'].diff().shift(-1)

# Keep only days with at least one death and one confirmed case (this also prevents division by zero)
cases_Brazil_rate = cases_Brazil_rate[cases_Brazil_rate.Brazil_Deaths != 0]
cases_Brazil_rate = cases_Brazil_rate[cases_Brazil_rate.Brazil_Confirmed != 0]

cases_Brazil_rate['Brazil_Deaths_rate_%'] = cases_Brazil_rate['Brazil_Deaths'] / cases_Brazil_rate['Brazil_Confirmed'] * 100
cases_Brazil_rate['Brazil_Recovered_rate_%'] = cases_Brazil_rate['Brazil_Recovered'] / cases_Brazil_rate['Brazil_Confirmed'] * 100
cases_Brazil_rate['Brazil_Growth_rate_%'] = cases_Brazil_rate['Increase_cases_per_day_in_Brazil'] / cases_Brazil_rate['Brazil_Confirmed'] * 100
cases_Brazil_rate['Brazil_Growth_rate_%'] = cases_Brazil_rate['Brazil_Growth_rate_%'].shift(+1)



fig = go.Figure()
fig.update_layout(template='plotly_dark')
fig.add_trace(go.Scatter(x=cases_Brazil_rate['Date'], 
                         y=cases_Brazil_rate['Brazil_Deaths_rate_%'],
                         mode='lines+markers',
                         name='Death rate %',
                         line=dict(color='red', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_rate['Date'], 
                         y=cases_Brazil_rate['Brazil_Recovered_rate_%'],
                         mode='lines+markers',
                         name='Recovery rate %',
                         line=dict(color='Green', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_rate['Date'], 
                         y=cases_Brazil_rate['Brazil_Growth_rate_%'],
                         mode='lines+markers',
                         name='Growth rate confirmed %',
                         line=dict(color='yellow', width=2)))

fig.show()

Province/Region:

In [24]:
cases_Brazil_region = covid19Brazil.groupby(['region', 'date']).sum().reset_index().sort_values('date', ascending=False)
cases_Brazil_region = cases_Brazil_region.drop_duplicates(subset = ['region'])
cases_Brazil_region_confirmed = cases_Brazil_region[cases_Brazil_region["cases"]>0]
cases_Brazil_region_confirmed

,region,date,cases,deaths
334,Sul,2020-04-05,1213,26
267,Sudeste,2020-04-05,6678,351
133,Nordeste,2020-04-05,1880,78
66,Centro-Oeste,2020-04-05,708,12
200,Norte,2020-04-05,651,19


####State of Pernambuco (I live here)

Confirmed cases and deaths in Pernambuco (absolute numbers):

In [25]:
cases_Brazil_state_Pernambuco = covid19Brazil.loc[covid19Brazil['state']=='Pernambuco']
cases_Brazil_state_Pernambuco = cases_Brazil_state_Pernambuco.groupby(['date']).agg({'cases':['sum'], 'deaths':['sum']}).sort_values('date', ascending = False)
cases_Brazil_state_Pernambuco.columns = ['cases', 'deaths']
cases_Brazil_state_Pernambuco = cases_Brazil_state_Pernambuco.reset_index()

cases_Brazil_state_Pernambuco_confirmed = cases_Brazil_state_Pernambuco[cases_Brazil_state_Pernambuco['cases']!=0]
cases_Brazil_state_Pernambuco_confirmed.style.background_gradient(cmap='Reds')

,date,cases,deaths
0,2020-04-05,201,21
1,2020-04-04,176,14
2,2020-04-03,136,10
3,2020-04-02,106,9
4,2020-04-01,95,8
5,2020-03-31,87,6
6,2020-03-30,78,6
7,2020-03-29,73,5
8,2020-03-28,68,5
9,2020-03-27,56,4


Confirmed cases and deaths in Pernambuco (absolute numbers), as a graph:

In [26]:
fig = go.Figure()
fig.update_layout(template='seaborn', width=1200, height=600)

fig.add_trace(go.Scatter(x=cases_Brazil_state_Pernambuco_confirmed['date'], 
                        y=cases_Brazil_state_Pernambuco_confirmed['cases'], 
                        mode='lines+markers',
                        name='Confirmed',
                        line=dict(color='blue', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_state_Pernambuco_confirmed['date'], 
                        y=cases_Brazil_state_Pernambuco_confirmed['deaths'], 
                        mode='lines+markers',
                        name='Deaths',
                        line=dict(color='red', width=2)))

fig.show()

Death rate and growth rate of confirmed cases in Pernambuco (%):

In [27]:
cases_Brazil_state_Pernambuco_rate = covid19Brazil.loc[covid19Brazil['state']=='Pernambuco']
cases_Brazil_state_Pernambuco_rate = cases_Brazil_state_Pernambuco_rate.groupby(['date']).agg({'deaths': ['sum'],'cases': ['sum']})
cases_Brazil_state_Pernambuco_rate.columns = ['Pernambuco_Deaths','Pernambuco_Cases']
cases_Brazil_state_Pernambuco_rate = cases_Brazil_state_Pernambuco_rate.reset_index()
# Increase from each day to the next; the growth rate is shifted back into place below
cases_Brazil_state_Pernambuco_rate['Increase_cases_per_day_in_Pernambuco']=cases_Brazil_state_Pernambuco_rate['Pernambuco_Cases'].diff().shift(-1)

# Keep only days with at least one death and one confirmed case (this also prevents division by zero)
cases_Brazil_state_Pernambuco_rate = cases_Brazil_state_Pernambuco_rate[cases_Brazil_state_Pernambuco_rate.Pernambuco_Deaths != 0]
cases_Brazil_state_Pernambuco_rate = cases_Brazil_state_Pernambuco_rate[cases_Brazil_state_Pernambuco_rate.Pernambuco_Cases != 0]

cases_Brazil_state_Pernambuco_rate['Pernambuco_Deaths_rate_%'] = cases_Brazil_state_Pernambuco_rate['Pernambuco_Deaths'] / cases_Brazil_state_Pernambuco_rate['Pernambuco_Cases'] * 100
cases_Brazil_state_Pernambuco_rate['Pernambuco_Growth_rate_%'] = cases_Brazil_state_Pernambuco_rate['Increase_cases_per_day_in_Pernambuco'] / cases_Brazil_state_Pernambuco_rate['Pernambuco_Cases'] * 100
cases_Brazil_state_Pernambuco_rate['Pernambuco_Growth_rate_%'] = cases_Brazil_state_Pernambuco_rate['Pernambuco_Growth_rate_%'].shift(+1)



fig = go.Figure()
fig.update_layout(template='seaborn', width=1200, height=600)
fig.add_trace(go.Scatter(x=cases_Brazil_state_Pernambuco_rate['date'], 
                         y=cases_Brazil_state_Pernambuco_rate['Pernambuco_Deaths_rate_%'],
                         mode='lines+markers',
                         name='Death rate %',
                         line=dict(color='red', width=2)))

fig.add_trace(go.Scatter(x=cases_Brazil_state_Pernambuco_rate['date'], 
                         y=cases_Brazil_state_Pernambuco_rate['Pernambuco_Growth_rate_%'],
                         mode='lines+markers',
                         name='Growth rate confirmed %',
                         line=dict(color='blue', width=2)))

fig.show()

Report in progress; more to come over the next few days...