# SARS-CoV-2 transmission rate ($\beta$), recovery rate ($\gamma$) and fatality rate ($\delta$)

In this notebook, we study how the transmission rate behaves over time in each continent and in a few selected countries.
The notebook is organised as follows:

- **Short review on adequate contact rate and incidence**
    - **SIRF Model with standard incidence adapted**
    - **Practical: SIRF approximated**
    - **Estimate** $\beta(t)$, $\gamma(t)$, $\delta(t)$
- **Worldwide SARS-CoV-2**
  - **Define** $\beta(t)$, $\gamma(t)$, $\delta(t)$ **in each continent**
  - **Plot the transmission rate, recovery rate and fatality rate for each continent**
  - **Interpret disease control in Oceania**
      - **Find** $R_0$ **over time**
- **Find** $R_0$ **over time** **for Africa**
- **Case study: China, Australia, Cameroon**
    - **Transmission rate, recovery rate and fatality rate forecasting**

## Adequate contact rate and incidence

**Contact rate $U(N)$** is the number of individuals contacted by an infective per unit of time. Suppose that the probability of infection per contact is $\beta_0$; then the **adequate contact rate** is $\beta_0U(N)$.

The mean adequate contact rate of an infective with susceptibles is $\beta_0U(N)\dfrac{S}{N}$. This rate is called the **infection rate**. The total number of new infections caused per unit of time by all individuals in the infected compartment at time $t$ is then $\left(\beta_0U(N)\dfrac{S}{N}\right)I$, which is called the **incidence** of the disease.

- If $U(N) = kN$, that is, the contact rate is proportional to the total population size, the incidence is $\beta S(t)I(t)$, where $\beta = \beta_0k$ is called the transmission coefficient (transmission rate). This type of incidence is called **bilinear incidence**.
- If $U(N) = k^{'}$, that is, the contact rate is constant, the incidence becomes $\beta I\dfrac{S}{N}$, where $\beta = \beta_0k^{'}$. It is called **standard incidence**.

**Extract from: Zhien Ma, Jia Li, Dynamical Modeling and Analysis of Epidemics, World Scientific Publishing Company (2009)**
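
To make the two incidence forms concrete, here is a minimal sketch. The function names and all numerical values are illustrative assumptions; they come neither from the book nor from the data used in this notebook.

```python
def bilinear_incidence(beta0, k, S, I):
    """New infections per unit of time when U(N) = k*N: beta*S*I with beta = beta0*k."""
    return beta0 * k * S * I


def standard_incidence(beta0, k_prime, S, I, N):
    """New infections per unit of time when U(N) = k': beta*I*S/N with beta = beta0*k'."""
    return beta0 * k_prime * I * S / N


# Illustrative values only (assumptions)
N, S, I = 1000.0, 990.0, 10.0
beta0 = 0.05      # probability of infection per contact
k_prime = 8.0     # constant number of contacts per unit of time
k = k_prime / N   # chosen so that both contact rates coincide at this N

standard = standard_incidence(beta0, k_prime, S, I, N)
bilinear = bilinear_incidence(beta0, k, S, I)
print(standard, bilinear)
```

With $k = k^{'}/N$ the two forms give the same incidence for this population size. They differ as soon as $N$ changes, because the bilinear contact rate grows with $N$ while the standard one stays constant.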

### SIRF Model with standard incidence adapted

**Can we find a model that explains the spread of COVID-19 in the world well?**

COVID-19 involves many important variables, but our data contain only four: **ConfirmedCases(TotalpositiveCases), CurrentConfirmedCases(CurrentpositiveCases), Recovered and Deaths**. How can we obtain the equations of a dynamical system for these variables? To answer this question, we use the SIRF model with standard incidence.

The SIRF model with standard incidence is a classic model in epidemiology. It contains four subpopulations: the susceptibles **S**, the infectives **I**, the recovered individuals **R** and the fatalities **F**:

> Susceptibles

> Infective

> Recovered

> Fatalities

A susceptible can become infective, and an infective can either recover or die; no other transitions are considered.
The population $N = S + I + R + F$ remains constant. The model describes the movement between the classes by the following system of differential equations:

> $\dfrac{dS}{dt} = -\beta I\dfrac{S}{N}$, $\qquad$ $\dfrac{dI}{dt} = \beta I\dfrac{S}{N} -(\gamma +\delta) I$, $\qquad$ $\dfrac{dR}{dt} = \gamma I$, $\qquad$ $\dfrac{dF}{dt} = \delta I$, where $\beta$ is the transmission rate, $\gamma$ is the recovery rate, $\delta$ is the fatality rate and $R_{0}=\dfrac{\beta }{\gamma+\delta}$.
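
The system above can be integrated numerically. The sketch below uses a plain forward Euler scheme. The helper name `simulate_sirf`, the step size and the parameter values are assumptions chosen for illustration; they are not estimates from the data.

```python
def simulate_sirf(beta, gamma, delta, S0, I0, R0_init=0.0, F0=0.0, days=100, dt=0.1):
    """Integrate the SIRF equations with standard incidence by forward Euler."""
    S, I, R, F = float(S0), float(I0), float(R0_init), float(F0)
    N = S + I + R + F  # total population, constant in this model
    history = [(0.0, S, I, R, F)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * I * S / N * dt  # S -> I
        new_recovered = gamma * I * dt          # I -> R
        new_deaths = delta * I * dt             # I -> F
        S -= new_infections
        I += new_infections - new_recovered - new_deaths
        R += new_recovered
        F += new_deaths
        history.append((step * dt, S, I, R, F))
    return history


# Illustrative parameters only (assumptions)
beta, gamma, delta = 0.3, 0.1, 0.01
history = simulate_sirf(beta, gamma, delta, S0=9990, I0=10)
basic_reproduction_number = beta / (gamma + delta)
t_end, S_end, I_end, R_end, F_end = history[-1]
print('R0 =', basic_reproduction_number)
print('state at day', t_end, ':', S_end, I_end, R_end, F_end)
```

Every quantity that leaves one compartment enters another, so $S + I + R + F$ stays equal to $N$ at each step, as the model requires.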

### Practical:  SIRF approximated

In the context of SARS-CoV-2 worldwide, we need to adapt the SIRF model to our data so that we can approximate the behaviour of the disease and define the transmission rate and the other rates. Consider **(N)**, the population living in some fixed area ($km^{2}$) at time $t$. This population splits into confirmed cases and non-confirmed cases.

**population size = totalpositivecases + totalnegativecases** and **totalpositivecases = currentpositivecases + (recovered + death)**

hence,

**population size = totalnegativecases + currentpositivecases + recovered + death**  (1)

From (1) we can make the following identifications:

> population size corresponds to the total population (N)

> totalnegativecases corresponds to the susceptibles (S)

> currentpositivecases corresponds to the infectives (I)

> recovered + death corresponds to the recovered individuals (R) + fatalities (F)

We can rewrite this as:

$S = N - S_c$, hence $\dfrac{S}{N} = 1 - \dfrac{S_c}{N}$, where $S_c = I + R + F$ is the total number of confirmed cases. If $\dfrac{S_c}{N} \ll 1$, we have $S \approx N$ and the SIRF model with standard incidence becomes:

$\dfrac{dI}{dt} = (\beta - \gamma - \delta)I$, $\qquad$ $\dfrac{dR}{dt} = \gamma I$ $\qquad$ $\dfrac{dF}{dt} = \delta I$
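
With constant rates, the first equation has the closed-form solution $I(t) = I(0)\,e^{(\beta - \gamma - \delta)t}$, so the sign of $\beta - \gamma - \delta$ decides whether the number of infectives grows or shrinks. A minimal sketch follows; the function name and the values are illustrative assumptions.

```python
import math


def infected_approx(I0, beta, gamma, delta, t):
    """Closed-form solution of dI/dt = (beta - gamma - delta) * I with constant rates."""
    return I0 * math.exp((beta - gamma - delta) * t)


# Illustrative values only (assumptions)
growing = infected_approx(100, beta=0.30, gamma=0.10, delta=0.01, t=10)
shrinking = infected_approx(100, beta=0.05, gamma=0.10, delta=0.01, t=10)
print(growing, shrinking)
```

The first call has $\beta > \gamma + \delta$ and ends above the initial 100 infectives; the second has $\beta < \gamma + \delta$ and ends below it.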

### Estimate $\beta(t), \gamma(t), \delta(t)$

> $\beta(t) = \dfrac{\text{number of daily current confirmed COVID-19 patients at time } t}{\text{number of accumulated confirmed COVID-19 patients at time } t}$

> $\gamma(t) = \dfrac{\text{number of daily recovered COVID-19 patients at time } t}{\text{number of accumulated confirmed COVID-19 patients at time } t}$

> $\delta(t) = \dfrac{\text{number of daily COVID-19 deaths at time } t}{\text{number of accumulated confirmed COVID-19 patients at time } t}$

**Source: Zhien Ma, Jia Li, Dynamical Modeling and Analysis of Epidemics, World Scientific Publishing Company (2009)**
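
A minimal sketch of these ratios on a toy table. The numbers are invented, and, as an assumption, the counts are read directly from the `Confirmed`, `Recovered`, `Deaths` and `currentCase` columns reported for each day, which is how the function defined later in this notebook applies them.

```python
import pandas as pd

# Toy data (invented values, for illustration only)
toy = pd.DataFrame({
    "ObservationDate": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"]),
    "Confirmed": [0, 100, 200],
    "Recovered": [0, 10, 40],
    "Deaths": [0, 5, 10],
})
toy["currentCase"] = toy["Confirmed"] - toy["Recovered"] - toy["Deaths"]

# Days without any confirmed case give 0/0 = NaN, which is replaced by a rate of 0
confirmed = toy["Confirmed"].astype(float)
toy["beta"] = (toy["currentCase"] / confirmed).fillna(0.0)
toy["gamma"] = (toy["Recovered"] / confirmed).fillna(0.0)
toy["delta"] = (toy["Deaths"] / confirmed).fillna(0.0)
print(toy[["ObservationDate", "beta", "gamma", "delta"]])
```

Because `currentCase` is defined as `Confirmed - Recovered - Deaths`, the three ratios computed this way add up to 1 on every day with at least one confirmed case.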

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
# import package
import matplotlib.pyplot as plt
import seaborn as sns 
import statsmodels as sm
import folium as fl
from pathlib import Path
from sklearn.impute import SimpleImputer
import geopandas as gpd
import mapclassify as mpc
import warnings
from fbprophet import Prophet
from fbprophet.diagnostics import cross_validation
from fbprophet.diagnostics import performance_metrics
from statsmodels.tsa.stattools import grangercausalitytests
from statsmodels.tsa.vector_ar.vecm import coint_johansen
from statsmodels.tsa.vector_ar.var_model import VAR
import plotly.offline as py
import plotly.express as px
import cufflinks as cf

In [None]:
%matplotlib inline
#pd.plotting.register_matplotlib_converters()
sns.set()  # apply the default seaborn plot style
warnings.filterwarnings('ignore')  # silence library warnings in the notebook output

In [None]:
covidfile = '/kaggle/input/novel-corona-virus-2019-dataset/covid_19_data.csv'

In [None]:
covid19 = pd.read_csv(covidfile, parse_dates=True)

In [None]:
covid19.head(3)

**Cleaning data**

In [None]:
covid19.isnull().sum()[covid19.isnull().sum()>0]

In [None]:
covid19.info()

In [None]:
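# parse the observation dates so that max(), sorting and plotting work on real dates
covid19['ObservationDate'] = pd.to_datetime(covid19['ObservationDate'])
# active cases: confirmed cases that have neither recovered nor died
covid19['currentCase'] = covid19['Confirmed'] - covid19['Recovered'] - covid19['Deaths']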

In [None]:
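# country names as spelled in the Natural Earth map (new values)
replace = ['Dem. Rep. Congo', "Côte d'Ivoire", 'Congo', 'United Kingdom', 'China','Central African Rep.',
          'Eq. Guinea','eSwatini','Bosnia and Herz.', 'S. Sudan', 'Dominican Rep.', 'W. Sahara',
          'United States of America']

# matching names as spelled in the COVID-19 dataset (values to replace), in the same order
# note: 'Dominican Republic' (not 'Dominica', a different country) matches 'Dominican Rep.'
name = ['Congo (Kinshasa)', 'Ivory Coast', 'Congo (Brazzaville)', 'UK', 'Mainland China', 
        'Central African Republic', 'Equatorial Guinea', 'Eswatini', 'Bosnia and Herzegovina', 'South Sudan',
       'Dominican Republic', 'Western Sahara','US']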

In [None]:
covid_data = covid19.drop(columns=['Province/State'])
covid_data = covid_data.replace(to_replace=name, value=replace)
# END Cleaning

In [None]:
covid_data.head()

## Worldwide SARS Cov 2 

### COVID-19 Worldwide Map

We use geospatial data to see how COVID-19 spreads across the world.

In [None]:
gb_covid = covid_data.groupby('ObservationDate')[['Confirmed', 'Deaths', 'Recovered', 'currentCase']].agg('sum')
end_date = gb_covid.index.max()

In [None]:
latest = gb_covid.loc[end_date]  # worldwide totals on the last reported date
print('========= COVID-19 Worldwide ==============================')
print("======== Report to date {} ===============\n".format(end_date))
print('1- Number of countries affected by COVID-19: {}'.format(covid_data['Country/Region'].nunique()))
print('2- Total Confirmed: {}'.format(latest['Confirmed']))
print('3- Total Deaths: {}'.format(latest['Deaths']))
print('4- Total Recovered: {}'.format(latest['Recovered']))
print('5- Total CurrentCase: {}'.format(latest['currentCase']))
print('============================================================')

In [None]:
#plot the worldwide covid19
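# load the Natural Earth low-resolution world map bundled with geopandas
# (the gpd.datasets module was removed in geopandas 1.0, so this call needs an older version)
world_path_file = gpd.datasets.get_path('naturalearth_lowres')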
world = gpd.read_file(world_path_file)
world.head(3)

In [None]:
geo_merged = world.merge(covid_data[['ObservationDate','Country/Region','Confirmed','Deaths','Recovered','currentCase']] , 
                     left_on='name', right_on='Country/Region')

In [None]:
geo_merged.head(3)

In [None]:
geo_merged.info()

In [None]:
geo_merged['ObservationDate'] = pd.DataFrame(geo_merged['ObservationDate'])

In [None]:
geo_merged.plot(cmap='cividis_r', column='Confirmed', legend=True, figsize=(15,9), scheme='quantiles', k=6)
plt.title('SARS-CoV-2 confirmed cases worldwide')
plt.xlabel('Longitude')
plt.ylabel('Latitude')

In [None]:
worldwide = geo_merged.groupby(['ObservationDate','continent'])[['Confirmed','Deaths','Recovered','currentCase']].agg('sum').reset_index()

In [None]:
worldwide.head(3)

In [None]:
for c in worldwide.continent.unique():
    surface = worldwide[worldwide.continent==c]
    surface = surface.drop(columns='continent')
    surface.plot(x='ObservationDate',
    title='SARS-CoV-2 confirmed, current, recovered and death cases in {} over time'.format(c),
               figsize=(15,5))
    plt.ylabel('cumulative')

### Determine $\beta(t), \gamma(t), \delta(t)$

Before computing $\beta(t), \gamma(t), \delta(t)$, we need to look at the ratio between the number of confirmed cases in a continent and the population size of that continent at time $t$.

In [None]:
daily_case = geo_merged.loc[geo_merged.ObservationDate.isin([end_date])]

In [None]:
pop_size = daily_case.groupby('continent')['pop_est'].agg('sum')

In [None]:
case_size = daily_case.groupby('continent')['Confirmed'].agg('sum')

In [None]:
print('The number of positive cases per population size in each continent at date: {} is:\n {}'.\
      format(end_date, case_size/pop_size))

**These ratios are much smaller than 1, so the approximated SIRF model can be used**

In [None]:
def determinate_beta_gamma_delta(data=None):
    '''
        Compute the transmission, recovery and fatality rates over time.
        params: data, a DataFrame with the columns Confirmed, Deaths,
                Recovered and currentCase
        return: beta, gamma, delta as numpy arrays (0 wherever Confirmed is 0)
    '''
    confirmed = data.Confirmed.to_numpy(dtype=float)
    nonzero = confirmed != 0

    def ratio(column):
        # divide only where Confirmed is non-zero; the other entries stay at 0
        out = np.zeros_like(confirmed)
        np.divide(column.to_numpy(dtype=float), confirmed, out=out, where=nonzero)
        return out

    # beta = currentCase/Confirmed, gamma = Recovered/Confirmed, delta = Deaths/Confirmed
    return ratio(data.currentCase), ratio(data.Recovered), ratio(data.Deaths)

In [None]:
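# select several columns with a list (double brackets); a bare tuple of names is rejected by recent pandas
geospatial = geo_merged.groupby(['ObservationDate','name','continent'])[['Confirmed','Deaths','Recovered','currentCase']].agg('sum')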

In [None]:
geospa = geospatial.reset_index()

In [None]:
geospa.head()

In [None]:
transmission, recovery, fatality = determinate_beta_gamma_delta(data=geospa)

In [None]:
geospa['beta'] = transmission
geospa['gamma'] = recovery
geospa['delta'] = fatality

In [None]:
geospa.head()

### Transmission rate, recovery rate and fatality rate per continent

In [None]:
rate_map = geospa.groupby(['ObservationDate','continent'])[['beta','gamma','delta']].agg('mean').reset_index()

In [None]:
rate_map.head()

In [None]:
for c in rate_map.continent.unique():
    surface = rate_map[rate_map.continent==c]
    surface = surface.drop(columns='continent')
    surface.plot(x='ObservationDate',
    title='SARS-CoV-2 transmission rate, recovery rate and fatality rate in {} over time'.format(c),
                 figsize=(15,5))
    plt.ylabel('mean rate')

### Interpreting a disease control in Oceania continent

See plotting.

In [None]:
worldwide[worldwide.continent=='Oceania'].plot(x='ObservationDate',  figsize=(15,5),
    title='SARS Cov 2 control in {} continent over time'.format('Oceania'),
                                              )
plt.ylabel('cummulative')

We now see that Oceania is controlling the disease well: the transmission rate decreases while the recovery rate, and also the fatality rate, increase over time.

In [None]:
rate_map[rate_map.continent=='Oceania'].plot(x='ObservationDate', 
title='SARS-CoV-2 transmission rate, recovery rate and fatality rate in {} over time'.format('Oceania'),
                                                                        figsize=(15,5))
plt.ylabel('mean rate')

According to the **approximated SIRF** model, if $\beta(t) - (\gamma(t) + \delta(t)) < 0$ the disease is under control; otherwise the outbreak grows again. So we compute $R_0(t) = \dfrac{\beta(t)}{\gamma(t) + \delta(t)}$ to assess disease control in Oceania.

Judging from the disease control graph, the **approximated SIRF** model can be turned into a **stochastic approximated SIRF** model:

$\dfrac{dI_t}{I_t} = (\beta - \gamma - \delta)dt + dB_t$, $\qquad$ $\dfrac{dR_t}{I_t} = \gamma dt + dB_t$, $\qquad$ $\dfrac{dF_t}{I_t} = \delta dt + dB_t$, where $B_t$ is a **random walk**.
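
A sketch of how the first stochastic equation could be simulated with an Euler-Maruyama scheme. Several things here are assumptions and not part of the notebook: the increments $dB_t$ are drawn as Gaussian with variance $dt$, a noise scale `sigma` is added (`sigma=1` matches the equation as written, `sigma=0` gives the deterministic model), and the parameter values are purely illustrative. Only $I_t$ is simulated.

```python
import numpy as np


def simulate_infected_path(I0, beta, gamma, delta, sigma=1.0, days=30, dt=0.01, seed=0):
    """Euler-Maruyama path of dI/I = (beta - gamma - delta) dt + sigma dB."""
    rng = np.random.default_rng(seed)
    steps = int(round(days / dt))
    drift = beta - gamma - delta
    # assumed noise model: independent Gaussian increments with variance dt
    increments = rng.normal(0.0, np.sqrt(dt), size=steps)
    path = np.empty(steps + 1)
    path[0] = I0
    for n in range(steps):
        # I_{n+1} = I_n + I_n * (drift * dt + sigma * dB_n)
        path[n + 1] = path[n] * (1.0 + drift * dt + sigma * increments[n])
    return path


# Illustrative values only (assumptions); here beta < gamma + delta
deterministic = simulate_infected_path(100.0, 0.05, 0.10, 0.01, sigma=0.0)
noisy = simulate_infected_path(100.0, 0.05, 0.10, 0.01, sigma=0.1)
print(deterministic[-1], noisy[-1])
```

With `sigma=0` the path follows the exponential decay of the deterministic model; with a positive `sigma` it fluctuates around that trend.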

### Find $R_0$ over time 

In [None]:
# work on a copy so that adding the R0 column does not write into a slice of rate_map
oceania = rate_map[rate_map.continent=='Oceania'].copy()

In [None]:
oceania['R0'] = oceania.beta.values/(oceania.gamma.values + oceania.delta.values)

In [None]:
oceania.plot(x='ObservationDate', y='R0',
             title='Reproduction number R0 over time in Oceania',
              figsize=(15,5))
plt.ylabel('ratio')

## Find $R_0$ over time for Africa

In [None]:
# work on a copy so that adding the R0 column does not write into a slice of rate_map
africa = rate_map[rate_map.continent=='Africa'].copy()

In [None]:
africa['R0'] = africa.beta.values/(africa.gamma.values + africa.delta.values)

In [None]:
africa.plot(x='ObservationDate', y='R0', 
             title='Reproduction number R0 over time in Africa',
             figsize=(15,5))
plt.ylabel('ratio')

## Case study: China, Australia, Cameroon
    
  - **Transmission rate, recovered rate and fatalities rate forecasting**

In [None]:
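def find_R0(data=None):
    '''
        Compute R0(t) = beta(t) / (gamma(t) + delta(t)) for each row of data.
        params: data, a DataFrame with the columns beta, gamma and delta
        return: numpy array of R0 values
    '''
    return data.beta.values/(data.gamma.values + data.delta.values)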

### China

I choose China because it is the first country that controlled the disease and mitigated the spread of COVID-19.

In [None]:
# work on a copy so that adding the R0 column does not write into a slice of geospa
china = geospa[geospa.name=='China'].copy()

In [None]:
china.head()

In [None]:
china[['ObservationDate','Confirmed','Deaths','Recovered','currentCase']].plot(x='ObservationDate', 
        title='SARS Cov 2 in China', figsize=(15,5))
plt.ylabel('Cumulative')

In [None]:
china[['ObservationDate','beta','gamma','delta']].plot(x='ObservationDate', 
                                                       title='SARS Cov 2 important parameters',
                                             figsize=(15,5))
plt.ylabel('rate')

In [None]:
# find R0
china['R0'] = find_R0(data=china)

In [None]:
china.plot(x='ObservationDate', y = 'R0', title='Reproduction number R0 in China',
            figsize=(15,5))
plt.ylabel('ratio')

This curve suggests that the approximated SIRF model describes the data well, and it shows $R_0(t)$ decreasing over time.

### Australia

I choose this country because Oceania is the first continent to control the disease and mitigate the spread of SARS-CoV-2.

In [None]:
# work on a copy so that adding the R0 column does not write into a slice of geospa
australia = geospa[geospa.name=='Australia'].copy()

In [None]:
australia.head()

In [None]:
australia[['ObservationDate','Confirmed','Deaths','Recovered','currentCase']].plot(x='ObservationDate', 
        title='SARS Cov 2 in Australia',  figsize=(15,5))
plt.ylabel('Cumulative')

In [None]:
australia[['ObservationDate','beta','gamma','delta']].plot(x='ObservationDate', 
                                                       title='SARS Cov 2 important parameters',
                                                        figsize=(15,5))
plt.ylabel('rate')

In [None]:
#compute R0
australia['R0'] = find_R0(data=australia)

In [None]:
australia.plot(x='ObservationDate', y = 'R0', title='Reproduction number R0 in Australia',
                figsize=(15,5))
plt.ylabel('ratio')

### Cameroon

I choose this country because it already faces serious problems (NOSO crisis, Boko Haram crisis, socio-political crisis ...), and SARS-CoV-2 could add another crisis that the country would not be able to control. That is why it is important to understand how COVID-19 spreads in this country and to mitigate its transmission.

In [None]:
# work on a copy so that adding the R0 column does not write into a slice of geospa
cameroon = geospa[geospa.name=='Cameroon'].copy()

In [None]:
cameroon.head()

In [None]:
cameroon[['ObservationDate','Confirmed','Deaths','Recovered','currentCase']].plot(x='ObservationDate', 
        title='SARS Cov 2 in Cameroon', figsize=(15,5))
plt.ylabel('Cumulative')

We see that the curves show two behaviours. Before 6 Apr they have an exponential shape, but after 6 Apr they jump. Something happens at this point in the graph.

In [None]:
cameroon[['ObservationDate','beta','gamma','delta']].plot(x='ObservationDate', 
                                                       title='SARS Cov 2 important parameters',
                                                       figsize=(15,5))
plt.ylabel('rate')

In [None]:
#Compute R0
cameroon['R0'] = find_R0(data=cameroon)

In [None]:
cameroon.plot(x='ObservationDate', y = 'R0',  title='Reproduction number R0 in Cameroon',
               figsize=(15,5))
plt.ylabel('ratio')

$R_0$ decreases over time, which is not bad.

### Upnext!

### Disclaimer

**This notebook does not claim that the models are exact. It only offers a way to better understand the pandemic and gives approximate answers to help fight it effectively around the world.**