## Guillaume Deside - 39731800

# LGBIO2110 - Introduction to clinical engineering
## Project : Epidemiology modelling of COVID-19

__Date :__ Year 2021-2022

__Professor :__ Philippe Lefèvre and Benoît Delhaye

__Author :__ Benoît Delhaye and Donatien Doumont

__Content :__ At the end of this project, you should master and understand the following :


*   Master and be able to use the mathematical modelling of an epidemic
*   Understand the mechanisms that influence the spread of the disease and their dynamics. Be able to relate the dynamical evolution of the epidemic to events (political decisions, appearance of virus variants, vaccination,...) 
*   Predict the future trend : how severe will the pandemic be ? 
*   Suggest control strategies and evaluate their effects


In [63]:
# Useful libraries
from matplotlib import pyplot as plt
from matplotlib import colors as mcolors
import numpy as np
import urllib.request
import pandas as pd
import ipywidgets as widgets
%config InlineBackend.figure_format = 'retina'

# use NMA plot style
plt.style.use("https://raw.githubusercontent.com/NeuromatchAcademy/course-content/master/nma.mplstyle")
my_layout = widgets.Layout()

<font size=5 color=#009999> Context  </font> <br> <br>
The SIR model of Kermack and McKendrick is expressed using only three *communicating vessels* (compartments):
*   Susceptible S(t)
*   Infected I(t)
*   Removed R(t) caused by isolation/recovery/death

Dynamical equations of the model are the following : 
\begin{align*}
    \dot{S} &= -\beta S I\\
    \dot{I} &= \beta S I - \alpha I\\
    \dot{R} &= \alpha I\\
\end{align*}
Note that the population is assumed constant here, so $\dot{S} + \dot{I} + \dot{R} = 0$ holds.
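
The code in this notebook integrates these equations with an explicit (forward) Euler scheme. As a short worked form of that scheme, with a time step $\Delta t$ (called `dt` in the code) and $S_i = S(i\,\Delta t)$, the updates read:
\begin{align*}
    S_{i+1} &= S_i - \beta S_i I_i \,\Delta t\\
    I_{i+1} &= I_i + (\beta S_i I_i - \alpha I_i)\,\Delta t\\
    R_{i+1} &= R_i + \alpha I_i \,\Delta t\\
\end{align*}
Summing the three lines gives $S_{i+1} + I_{i+1} + R_{i+1} = S_i + I_i + R_i$, so the discrete scheme also keeps the total population constant.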

***
Example : Behavior of the model for a given initial value for the susceptible $S(t)$ and infected $I(t)$ group, and for constant beta $\beta$ and alpha $\alpha$ parameters.
***
Play with the widgets parameters to better interpret their impact on the model.

In [64]:
#SIR model discrete computation 
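def SIRmodel(S0,I0,param):
    """
    Forward-Euler integration of the SIR model.
    :param S0: initial susceptible fraction
    :param I0: initial infected fraction
    :param param: list [beta, alpha, dt]
    :return: S, I, R arrays of shape (N, 1), where N is the global sample size
    defined in the next cell
    """
    beta = param[0]
    alpha = param[1]
    dt = param[2]
    #Initialisation
    S = np.zeros((N,1))
    I = S.copy()
    R = S.copy() 
    S[0] = S0 
    I[0] = I0
    for i in range(0,N-1) : 
        S[i+1] = S[i]+(-beta*S[i]*I[i])*dt
        I[i+1] = I[i]+(beta*S[i]*I[i]-alpha*I[i])*dt
        R[i+1] = R[i]+(alpha*I[i])*dt
    return S, I, R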

#Graphical representation
def plot_SIRmodel(t_vector, S, I, R):
    plt.plot(t_vector, S, label = 'S(t)')
    plt.plot(t_vector, I, label = 'I(t)')
    plt.plot(t_vector, R, label = 'R(t)')
    plt.legend()
    plt.xlabel("Time [days]")
    plt.ylabel("Relative group size")
    plt.xlim((0,t_vector[-1]))
    plt.ylim((0,1))
    

In [65]:
#--------------------------------------------------
tend = 60 #in days
Fs = 1e2 #sample frequency
N = int(tend*Fs) #sample size
time = np.linspace(0,tend,N) #time sequence

#Initial condition of the epidemic
S0 = 0.9
I0 = 0.1

def refresh(beta=0.5,alpha_over_one=10):
    #Dynamical evolution of the model
    alpha = 1/alpha_over_one
    Repro0 = beta*S0/alpha
    S,I,R = SIRmodel(S0,I0,[beta, alpha, 1/Fs])
    plot_SIRmodel(time, S, I, R)
    plt.title("$R_0$ = " + str(Repro0))
    plt.show()
style = {'description_width' : 'initial'}

_ = widgets.interact(refresh,
    beta = widgets.FloatLogSlider(value=0.5, min=-2, max=1, step=0.05, description="Logarithmic slider: beta", style = style),
    alpha_over_one = widgets.IntSlider(value=10, min=1, max=30, step=1, description="Linear slider: 1/alpha(in days):",style=style),
)


<font size=5 color=#009999> Homework  </font> <br> <br>
Instructions : 
1. Collect Belgian data 
2. Adjust the infected population number (being underestimated in the collected dataset)
3. Estimate the time-varying reproduction number $R_t$ with an appropriate simple model and show the effect of government measures on the time-varying $R_t$ 
4. Forecast the evolution of $\beta(t)$ and predict the evolution of the pandemic 
5. Include the effect of vaccine 


The dataset used in this project is restricted to the following time period : from $t_0$ = **March, 2020** to $t_1$ = **March, 2021**. 

You will have to predict the behaviour of the epidemic at least **three months** after $t_1$. 

***
## 1) Importation of dataset
***
Open-access data are imported from the Sciensano website and preprocessed. 

Note : the dataset is stored as CSV files in the `dataset` folder.

In the experiments, we use a COVID-19 dataset for Belgium. Missing data are filled with 0, as explained in the Sciensano codebook.
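
As a small illustration of this zero-filling, the sketch below reindexes a toy daily series on a full date range. The dates and counts are made up for the example and do not come from the Sciensano files.

```python
import pandas as pd

# Toy daily counts with one missing day (2020-03-03); values are illustrative only
toy = pd.DataFrame(
    {"CASES": [19, 19, 53]},
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-04"]),
)

# Full calendar range; reindex inserts the missing date and fills it with 0
full_range = pd.date_range(start="2020-03-01", end="2020-03-04")
toy_filled = toy.reindex(full_range, fill_value=0)
print(toy_filled)
```

Filling with 0 keeps every series on the same calendar, at the cost of treating "not reported" as "no event", which matters mostly for the first days of each file.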

In [66]:
def create_df(l,from_to):
    """
    :param l: name of the dataset, used to complete the path to the file
    :param from_to: date range used to fill in missing dates
    :return: Two dataframes. The first one is the raw data; the second one holds the
    numerical columns summed per day and reindexed on from_to (missing dates filled with 0)
    """
    fnloc = "dataset/COVID19BE_" + l + ".csv"
    df = pd.read_csv(fnloc, sep=",", header='infer', parse_dates=["DATE"])
    if l == "VACC":
        # One column per dose type: COUNT for matching rows, 0 otherwise
        df.loc[:, "DOSEA"] = (df.DOSE == "A") * df.COUNT
        df.loc[:, "DOSEB"] = (df.DOSE == "B") * df.COUNT
        df.loc[:, "DOSEC"] = (df.DOSE == "C") * df.COUNT
        df.loc[:, "DOSEE"] = (df.DOSE == "E") * df.COUNT

    # Keep only numerical quantities in the dataset
    dfnum = df.select_dtypes(include='number')
    dfnum.insert(0, 'DATE', df.DATE, True)

    # Sum values over the same day, then fill missing dates with 0
    dfsum = dfnum.groupby(['DATE'], as_index=True).sum()
    dfsum  = dfsum.reindex(from_to, fill_value=0)
    return df, dfsum

In [67]:
from_date = '3/01/2020' # start data
to_date   = '3/17/2022' # end data collected
range_from_to = pd.date_range(start=from_date, end=to_date).to_list()

In [68]:
fn = ["HOSP", "MORT", "CASES_AGESEX", "tests", "VACC"]

df_hosp,dfsum_hosp = create_df("HOSP",range_from_to)
df_dead,dfsum_dead = create_df("MORT",range_from_to)
df_case,dfsum_case = create_df("CASES_AGESEX",range_from_to)
df_test,dfsum_test = create_df("tests",range_from_to)
df_vacc,dfsum_vacc = create_df("VACC",range_from_to)

In [69]:
display(dfsum_hosp)

DATE,NR_REPORTING,TOTAL_IN,TOTAL_IN_ICU,TOTAL_IN_RESP,TOTAL_IN_ECMO,NEW_IN,NEW_OUT
2020-03-01,0,0,0,0,0,0,0
2020-03-02,0,0,0,0,0,0,0
2020-03-03,0,0,0,0,0,0,0
2020-03-04,0,0,0,0,0,0,0
2020-03-05,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...
2022-03-13,103,1999,175,78,11,102,103
2022-03-14,103,2103,177,75,10,162,117
2022-03-15,103,2218,181,77,9,200,310
2022-03-16,103,2248,177,64,10,190,322


In [70]:
display(dfsum_dead)

DATE,DEATHS
2020-03-01,0
2020-03-02,0
2020-03-03,0
2020-03-04,0
2020-03-05,0
...,...
2022-03-13,12
2022-03-14,23
2022-03-15,14
2022-03-16,15


In [71]:
display(dfsum_case)

DATE,CASES
2020-03-01,19
2020-03-02,19
2020-03-03,34
2020-03-04,53
2020-03-05,81
...,...
2022-03-13,3987
2022-03-14,15602
2022-03-15,12159
2022-03-16,7953


In [72]:
display(dfsum_test)

DATE,TESTS_ALL,TESTS_ALL_POS
2020-03-01,82,0
2020-03-02,317,10
2020-03-03,538,21
2020-03-04,701,37
2020-03-05,773,65
...,...,...
2022-03-13,16027,4017
2022-03-14,42397,11984
2022-03-15,51984,14984
2022-03-16,44898,13066


In [73]:
display(dfsum_vacc)

DATE,COUNT,DOSEA,DOSEB,DOSEC,DOSEE
2020-03-01,0,0,0,0,0
2020-03-02,0,0,0,0,0
2020-03-03,0,0,0,0,0
2020-03-04,0,0,0,0,0
2020-03-05,0,0,0,0,0
...,...,...,...,...,...
2022-03-13,1511,48,1006,0,188
2022-03-14,1512,77,477,0,525
2022-03-15,3854,85,425,0,1522
2022-03-16,11011,386,2813,0,3921


In [74]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

layout = go.Layout(
    autosize=False,
    width=500,
    height=500
)

fig = make_subplots(
    rows=2, cols=2, subplot_titles=("Daily cases", "Hospital IN", "Deaths", "Nb of tests")
)


# Add traces
fig.add_trace(go.Scatter(x=dfsum_case.index, y=dfsum_case["CASES"],name="daily case"), row=1, col=1)
fig.add_trace(go.Scatter(x=dfsum_hosp.index, y=dfsum_hosp["TOTAL_IN"],name="Total In Hospital"), row=1, col=2)
fig.add_trace(go.Scatter(x=dfsum_dead.index, y=dfsum_dead["DEATHS"],name="daily death"), row=2, col=1)
fig.add_trace(go.Scatter(x=dfsum_test.index, y=dfsum_test["TESTS_ALL"],name="daily tests"), row=2, col=2)

# Update title and height
fig.update_layout(title_text="Covid19 data for Belgium", height=700, width=1000)
#fig.write_html("figures/brut_data.html")
fig.write_image("figures/brut_data.jpeg")
fig.show()

### Smooth the data

The Sciensano data vary strongly from one day to the next. This variation follows a
seven-day cycle: depending on the day of the week on which a report falls, some days tend
to be underreported. To reduce this effect, we smooth the data with a trailing 7-day moving
average; the first days are averaged over the values available so far (`min_periods=1`).
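
To make the effect of the window concrete, here is a minimal sketch on a toy series with a weekly dip. The numbers are invented for the illustration; only the `rolling(7, min_periods=1).mean()` call mirrors what the notebook applies to the real columns.

```python
import pandas as pd

# Toy series with a weekly dip (two low "weekend" values); numbers are illustrative only
daily = pd.Series([10, 12, 11, 13, 12, 2, 1, 10, 12, 11, 13, 12, 2, 1])

# Trailing 7-day mean; min_periods=1 averages over the days available so far
smoothed = daily.rolling(7, min_periods=1).mean()

print(smoothed.iloc[0])   # first value: mean of a single day
print(smoothed.iloc[6])   # first full window: mean of days 0 to 6
print(smoothed.iloc[13])  # last full window: mean of days 7 to 13
```

Once the window covers a full week, the weekday dip no longer shows in the smoothed values, which is the property the notebook relies on.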

In [75]:
dfsum_case["CASES"] = dfsum_case["CASES"].rolling(7,min_periods=1).mean()
dfsum_dead["DEATHS"] = dfsum_dead["DEATHS"].rolling(7,min_periods=1).mean()
dfsum_test["TESTS_ALL"] = dfsum_test["TESTS_ALL"].rolling(7,min_periods=1).mean()
dfsum_hosp["TOTAL_IN"] = dfsum_hosp["TOTAL_IN"].rolling(7,min_periods=1).mean()
dfsum_case.CASES = dfsum_case.CASES.round()
dfsum_dead.DEATHS = dfsum_dead.DEATHS.round()
dfsum_test.TESTS_ALL = dfsum_test.TESTS_ALL.round()
dfsum_hosp.TOTAL_IN = dfsum_hosp.TOTAL_IN.round()

In [76]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go

fig = make_subplots(
    rows=2, cols=2, subplot_titles=("Daily cases", "Hospital IN", "Deaths", "Nb of tests")
)

# Add traces
fig.add_trace(go.Scatter(x=dfsum_case.index, y=dfsum_case["CASES"],name="daily case"), row=1, col=1)
fig.add_trace(go.Scatter(x=dfsum_hosp.index, y=dfsum_hosp["TOTAL_IN"],name="Total In Hospital"), row=1, col=2)
fig.add_trace(go.Scatter(x=dfsum_dead.index, y=dfsum_dead["DEATHS"],name="daily death"), row=2, col=1)
fig.add_trace(go.Scatter(x=dfsum_test.index, y=dfsum_test["TESTS_ALL"],name="daily tests"), row=2, col=2)

# Update title and height
fig.update_layout(title_text="smoothed Covid19 data for Belgium", height=700)
fig.write_html("figures/smooth_data.html")
fig.write_image("figures/smooth_data.jpeg")
fig.show()

Variables available over time, named as dataset + column :
- from HOSP :           HOSP_DATE, HOSP_NEW_IN, HOSP_NEW_OUT, HOSP_NR_REPORTING, HOSP_TOTAL_IN, HOSP_TOTAL_IN_ECMO, HOSP_TOTAL_IN_ICU, HOSP_TOTAL_IN_RESP.
- from MORT :           MORT_DATE, MORT_DEATHS.
- from CASES_AGESEX :   CASES_AGESEX_DATE, CASES_AGESEX_CASES.
- from tests :          tests_DATE, tests_TESTS_ALL, tests_TESTS_ALL_POS.
- from VACC :           VACC_DATE, VACC_COUNT, VACC_DOSEA, VACC_DOSEB, VACC_DOSEC, VACC_DOSEE.

Beware that each 
***
## 2) Infected population estimation : 
The goal is to better estimate the infected population, which is underestimated in the dataset.  
***

We use the total number of patients in hospital (`TOTAL_IN`) to adjust the number of infected people:
$$ I(t) = k \cdot \mathrm{Total\_in\_hospital}(t) $$ 
To choose k, we use data from a period when testing was more extensive than at the onset of the pandemic. The period from October 16, 2021 to December 28, 2021 appears suitable: the number of daily tests is high, and it lies just before the start of the Omicron variant, which changed how the virus spreads.

In [77]:
t0_k = '10/16/2021'
t1_k = '12/28/2021'
range_T0_T1_k = pd.date_range(start=t0_k, end=t1_k).to_list()
# Comprehension variable named d so that the global `time` array used by refresh() is not overwritten
lst_range_k = [d.strftime("%Y-%m-%d") for d in range_T0_T1_k]

In [78]:
defsum_case_t0_t1_k = dfsum_case.loc[lst_range_k]
defsum_hosp_t0_t1_k = dfsum_hosp.loc[lst_range_k]

In [79]:
defsum_hosp_t0_t1_k_array = np.array(defsum_hosp_t0_t1_k["TOTAL_IN"])
defsum_case_t0_t1_k_array = np.array(defsum_case_t0_t1_k["CASES"])

In [80]:
print("min of k:",np.min(defsum_case_t0_t1_k_array/defsum_hosp_t0_t1_k_array))
print("max of k:",np.max(defsum_case_t0_t1_k_array/defsum_hosp_t0_t1_k_array))
print("mean of k:",np.mean(defsum_case_t0_t1_k_array/defsum_hosp_t0_t1_k_array))

min of k: 2.5537400145243283
max of k: 5.7788309636650865
mean of k: 4.508676598765015
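
The minimum, maximum and mean above are statistics of the daily ratio between cases and hospital occupancy. As a possible alternative that this notebook does not use, a single k can also be fitted by least squares through the origin, which minimises the squared gap between the cases and k times the hospital curve. The function name `fit_k_least_squares` and the toy arrays are assumptions made for this sketch.

```python
import numpy as np

def fit_k_least_squares(cases, hosp):
    """Return the k minimising sum((cases - k*hosp)**2), a fit through the origin."""
    cases = np.asarray(cases, dtype=float)
    hosp = np.asarray(hosp, dtype=float)
    return float(np.dot(hosp, cases) / np.dot(hosp, hosp))

# Toy data (not Sciensano values): cases are exactly 4.5 times the hospital occupancy
hosp_toy = np.array([1000.0, 1500.0, 2000.0, 2500.0])
cases_toy = 4.5 * hosp_toy
print(fit_k_least_squares(cases_toy, hosp_toy))
```

Compared with the mean of daily ratios, this estimate gives more weight to days with many hospitalised patients.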


In [81]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=defsum_case_t0_t1_k.index, y=defsum_case_t0_t1_k["CASES"], name="Cases"))
fig1.add_trace(go.Scatter(x=defsum_hosp_t0_t1_k.index, y=defsum_hosp_t0_t1_k["TOTAL_IN"], name="Hospitalisation k =1"))
fig1.add_trace(go.Scatter(x=defsum_hosp_t0_t1_k.index, y=defsum_hosp_t0_t1_k["TOTAL_IN"]*2.6, name="Hospitalisation k =2.6"))
fig1.add_trace(go.Scatter(x=defsum_hosp_t0_t1_k.index, y=defsum_hosp_t0_t1_k["TOTAL_IN"]*4.5, name="Hospitalisation k =4.5"))
fig1.add_trace(go.Scatter(x=defsum_hosp_t0_t1_k.index, y=defsum_hosp_t0_t1_k["TOTAL_IN"]*5.8, name="Hospitalisation k =5.8"))
fig1.update_layout(legend=dict(
    yanchor="top",
    y=0.99,
    xanchor="left",
    x=0.01
))
fig1.write_html("figures/adjusted_infected.html")
fig1.update_layout(title_text="Covid19: Adjust the infected population number", height=700, width=1000)
fig1.write_image("figures/adjusted_data.jpeg")
fig1.show()

In [82]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=dfsum_case.index, y=dfsum_case["CASES"], name="Cases"))
fig1.add_trace(go.Scatter(x=dfsum_hosp.index, y=np.round(dfsum_hosp["TOTAL_IN"]*5.8), name="adapted number of cases"))
fig1.write_html("figures/adjusted_cases.html")
fig1.update_layout(title_text="Covid19: Adjust the infected population number",height=700, width=1000)
fig1.write_image("figures/adjusted_comparaison.jpeg")
fig1.show()

***
## 3) Fit Data :
The goal here is to fit the coefficient $\beta(t)$ to the data over time.
You will need to extend the basic SIR model to a SEIR model that accounts for an exposed population group. This group is infected but not yet contagious for a period of time. 
***
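
For reference, the continuous SEIR equations that correspond to the discrete updates coded in this section are written out below. Reading $\alpha$ as the rate at which exposed people become infectious and $\gamma$ as the removal rate is taken from how the two parameters are used in the code; note that $\alpha$ therefore plays a different role than in the SIR model of the Context section.
\begin{align*}
    \dot{S} &= -\beta S I\\
    \dot{E} &= \beta S I - \alpha E\\
    \dot{I} &= \alpha E - \gamma I\\
    \dot{R} &= \gamma I\\
\end{align*}
As in the SIR case, the four right-hand sides sum to zero, so $S + E + I + R$ stays constant.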

In [83]:
def SIERmodel(S0,E0,I0,R0,beta,alpha,gamma,dt,N):
    """
    inspired from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
    :param S0: (int)initial value of Susceptible
    :param E0: (int)initial value of Exposed
    :param I0: (int)initial value of Infected
    :param R0: (int)initial value of Removed
    :param beta: (array) beta values
    :param alpha: (float) alpha value
    :param gamma: (float) gamma value
    :param dt: (int) time between two value (nb of day)
    :param N: (int) nb of days to calculate
    :return: (tuples of array) Susceptible(S),Exposed(E),Infected(I),Removed(R)
    """
    S = np.zeros(N)
    E = np.zeros(N)
    I = np.zeros(N)
    R = np.zeros(N)
    S[0] = S0
    E[0] = E0
    I[0] = I0
    R[0] = R0
    for i in range(0,N-1) : 
        S[i+1] = S[i] - (beta[i]*S[i]*I[i])*dt
        E[i+1] = E[i] + (beta[i]*S[i]*I[i] - alpha*E[i])*dt
        I[i+1] = I[i] + (alpha*E[i] - gamma*I[i])*dt
        R[i+1] = R[i] + (gamma*I[i])*dt
    return S,E,I, R

In [84]:
k = 5.8
N_pop = 11492641  #value from https://en.wikipedia.org/wiki/Belgium
t0 = '3/15/2020'   
t1 = '2/28/2021'
range_T0_T1 = pd.date_range(start=t0, end=t1).to_list()
N=len(range_T0_T1) 

# Comprehension variable named d so that the global `time` array used by refresh() is not overwritten
lst_range = [d.strftime("%Y-%m-%d") for d in range_T0_T1]

In [85]:
defsum_case_t0_t1 = dfsum_case.loc[lst_range]
defsum_hosp_t0_t1 = dfsum_hosp.loc[lst_range]
defsum_vacc_t0_t1 = dfsum_vacc.loc[lst_range]
defsum_dead_t0_t1 = dfsum_dead.loc[lst_range]
defsum_test_t0_t1 = dfsum_test.loc[lst_range]

In [86]:
infected_t0_t1 = np.round(np.array(defsum_hosp_t0_t1["TOTAL_IN"] *k))

## Compute Beta

To compute $\beta$, we equate the daily decrease of the susceptible group (time step of one day) with the estimated number of new infections: 
$$ S[i+1] - S[i] = - \mathrm{INFECTION}_{\mathrm{ESTIMATION}}[i]  = - \beta[i] \, S[i] \, I[i] $$

which gives

$$\beta [i] = \frac{ \mathrm{INFECTION}_{\mathrm{ESTIMATION}}[i]}{S[i] \, I[i]}$$

Any system of ODEs needs initial values. We use normalized population values for $S_0$, $E_0$, etc. Because the true starting values are unknown, we use the following arbitrary values for our model.
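
As a sanity check of the inversion formula for $\beta[i]$, the sketch below simulates a SEIR model with a known $\beta$, records the new infections $\beta S I$ at each step, and then recovers $\beta$ from them with the same forward-Euler updates. The helper names `simulate_seir` and `recover_beta`, the initial fractions and the $\beta$ profile are assumptions made for this toy example.

```python
import numpy as np

def simulate_seir(beta, s0, e0, i0, r0, alpha, gamma, dt=1.0):
    """Forward-Euler SEIR; also returns the new infections beta*S*I per step."""
    n = len(beta)
    S, E, I, R = (np.zeros(n) for _ in range(4))
    S[0], E[0], I[0], R[0] = s0, e0, i0, r0
    new_inf = np.zeros(n)
    for i in range(n - 1):
        new_inf[i] = beta[i] * S[i] * I[i]
        S[i + 1] = S[i] - new_inf[i] * dt
        E[i + 1] = E[i] + (new_inf[i] - alpha * E[i]) * dt
        I[i + 1] = I[i] + (alpha * E[i] - gamma * I[i]) * dt
        R[i + 1] = R[i] + gamma * I[i] * dt
    return S, E, I, R, new_inf

def recover_beta(new_inf, s0, e0, i0, alpha, gamma, dt=1.0):
    """Invert the same scheme: beta[i] = new_inf[i] / (S[i] * I[i])."""
    n = len(new_inf)
    S, E, I = np.zeros(n), np.zeros(n), np.zeros(n)
    S[0], E[0], I[0] = s0, e0, i0
    beta = np.zeros(n)
    for i in range(n - 1):
        beta[i] = new_inf[i] / (S[i] * I[i])
        S[i + 1] = S[i] - new_inf[i] * dt
        E[i + 1] = E[i] + (new_inf[i] - alpha * E[i]) * dt
        I[i + 1] = I[i] + (alpha * E[i] - gamma * I[i]) * dt
    return beta

# Known beta profile (illustrative), decreasing from 0.8 to 0.4 over 30 days
true_beta = np.linspace(0.8, 0.4, 30)
*_, new_inf = simulate_seir(true_beta, 0.997, 0.001, 0.001, 0.001, alpha=0.2, gamma=0.5)
beta_hat = recover_beta(new_inf, 0.997, 0.001, 0.001, alpha=0.2, gamma=0.5)

# The last entry is never computed by the loop, so it is excluded from the comparison
print(np.allclose(beta_hat[:-1], true_beta[:-1]))
```

The recovery is only exact when the infection estimate and the initial values match the simulated ones, which is why the arbitrary starting values matter for the first days of the fitted $\beta$.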

In [87]:
I0 = 15000/N_pop
R0 = 15000/N_pop
E0 = 15000/N_pop
S0 = (1 - I0 - E0 - R0)
alpha = 0.2  #value from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
gamma = 0.5  #value from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
dt = 1

In [88]:
def calcul_beta(S0,E0,I0,R0,alpha,gamma,dt,N,infection_estimation):
    """
    inspired from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
    
    :param S0: (int)initial value of Susceptible
    :param E0: (int)initial value of Exposed
    :param I0: (int)initial value of Infected
    :param R0: (int)initial value of Removed
    :param alpha: (float) alpha value
    :param gamma: (float) gamma value
    :param dt: (int) time between two value (nb of day)
    :param N: (int) nb of days to calculate
    :param infection_estimation: (array) Estimation of number of infection for each day
    :return:(array) beta values
    """
    S = np.zeros(N)
    E = np.zeros(N)
    I = np.zeros(N)
    R = np.zeros(N)
    S[0] = S0
    E[0] = E0
    I[0] = I0
    R[0] = R0
    beta = np.zeros(N)
    for i in range(0,N-1) : 
        beta[i] = infection_estimation[i]/(S[i]*I[i])
        S[i+1] = S[i] - (beta[i]*S[i]*I[i])*dt
        E[i+1] = E[i] + (beta[i]*S[i]*I[i] - alpha*E[i])*dt
        I[i+1] = I[i] + (alpha*E[i] - gamma*I[i])*dt
        R[i+1] = R[i] + (gamma*I[i])*dt
    return beta

In [89]:
beta = calcul_beta(S0,E0,I0,R0,alpha,gamma,dt,N,infected_t0_t1/N_pop)

In [90]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=lst_range[:-1], y=beta[:-1], name="beta"))

fig1.update_layout(title_text="Covid19: beta between 15/03/2020 and 28/02/2021",height=700, width=1000)
fig1.write_html("figures/beta_calculated.html")  # own file name, so later figures do not overwrite this one
fig1.write_image("figures/beta_calculated.jpeg")
fig1.show()

In [91]:
print("mean of beta",np.mean(beta))
print("std of beta",np.std(beta))

mean of beta 0.6456846847891323
std of beta 0.22224480208621505


In [92]:
S,E,I,R = SIERmodel(S0,E0,I0,R0,beta,alpha,gamma,dt,N)

In [93]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=lst_range, y=I, name="infected"))
fig1.add_trace(go.Scatter(x=lst_range, y=S, name="susceptible"))
fig1.add_trace(go.Scatter(x=lst_range, y=R, name="removed"))
fig1.add_trace(go.Scatter(x=lst_range, y=E, name="Exposed"))


fig1.update_layout(legend=dict(
    yanchor="top",
    y=0.99,
    xanchor="right",
    x=0.99
))

fig1.update_layout(title_text="Covid19: SEIR model between 15/03/2020 and 28/02/2021",height=700, width=1000)
fig1.write_html("figures/SEIR_T0_T1.html")  # own file name, so other figures do not overwrite this one
fig1.write_image("figures/SEIR_T0_T1.jpeg")
fig1.show()

### Compute $R_{0}$

The reproduction number is the average number of people infected by a single infected person. Since $\beta$ varies over time, this number $R_{0}$ also fluctuates over time; it is computed with the following expression: $R_{0}=\frac{\beta}{\gamma}$
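
A short numerical illustration of this ratio follows. The $\beta$ values are invented for the example (they are not the fitted ones), and $\gamma = 0.5$ is the value used in this notebook. Under the usual reading, values above 1 indicate a growing epidemic as long as most of the population is still susceptible.

```python
import numpy as np

gamma = 0.5                                  # removal rate used in this notebook
beta_toy = np.array([0.4, 0.5, 0.65, 0.9])   # illustrative values, not fitted ones

r0_toy = beta_toy / gamma
# Above 1, each infected person infects more than one other person on average
growing = r0_toy > 1
print(r0_toy, growing)
```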

In [94]:
def R_0_calculate(beta, gamma):
    """
    :param beta: (array) beta values
    :param gamma: (float) gamma value
    :return: (array) R0 values
    """
    return beta / gamma

In [95]:
R_0 = R_0_calculate(beta, gamma)

In [96]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=lst_range[:-1], y=R_0[:-1], name="R_0"))

fig1.update_layout(title_text="Covid19: reproduction rate(R_0) between 15/03/2020 and 28/02/2021",height=700, width=1000)
fig1.write_html("figures/reproduction_rate.html")  # own file name, so other figures do not overwrite this one
fig1.write_image("figures/reproduction_rate.jpeg")
fig1.show()

***
## 4) Predict the future : 
The goal is to make predictions about the evolution of beta after $t_1$
*** 

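We test three scenarios for $\beta$ after $t_{1}$. In the first one, $\beta$ keeps evolving along a linear trend fitted on the last days before $t_{1}$; it corresponds to no action by the government. In the second one, $\beta$ stays constant at its last computed value. In the third one, $\beta$ follows polynomial fits of the shape observed between September and November 2020.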

In [113]:
t1 = '2/27/2021'
t2 = '5/28/2021'
range_T1_T2 = pd.date_range(start=t1, end=t2).to_list()
N_T1_T2 = len(range_T1_T2) 

# Comprehension variable named d so that the global `time` array used by refresh() is not overwritten
lst_range_T1_T2 = [d.strftime("%Y-%m-%d") for d in range_T1_T2]

In [114]:
range_T0_T2 = pd.date_range(start=t0, end=t2).to_list()

N_T0_T2 = len(range_T0_T2) 

# Comprehension variable named d so that the global `time` array used by refresh() is not overwritten
lst_range_T0_T2 = [d.strftime("%Y-%m-%d") for d in range_T0_T2]

In [115]:
defsum_case_t0_t2 = dfsum_case.loc[lst_range_T0_T2]
defsum_hosp_t0_t2 = dfsum_hosp.loc[lst_range_T0_T2]
defsum_vacc_t0_t2 = dfsum_vacc.loc[lst_range_T0_T2]
defsum_dead_t0_t2 = dfsum_dead.loc[lst_range_T0_T2]
defsum_test_t0_t2 = dfsum_test.loc[lst_range_T0_T2]

In [116]:
infected_t0_t2 = np.round(np.array(defsum_hosp_t0_t2["TOTAL_IN"] *k))
beta_T0_T2 = calcul_beta(S0,E0,I0,R0,alpha,gamma,dt,N_T0_T2,infected_t0_t2/N_pop)

### Prediction beta : linear

In [117]:
from sklearn.linear_model import LinearRegression

model = LinearRegression()

x = np.arange(-19,0,1)
x= x.reshape(-1, 1)

model.fit(x,beta[-20:-1])
new_x = np.arange(0,N_T1_T2)
new_x = new_x.reshape(-1, 1)
new_beta_1 = model.predict(new_x)
intercept = beta[-2] - new_beta_1[0]
new_beta_1 += intercept

### Constant beta

In [118]:
new_beta_2 = np.ones(N_T1_T2)*beta[-2]

### Second order Beta

In [119]:
low_bound = lst_range.index("2020-09-18")
upper_bound = lst_range.index("2020-10-10")

low_bound_1 = lst_range.index("2020-10-10")
upper_bound_1 = lst_range.index("2020-11-07")

low_bound_2 = lst_range.index("2020-11-07")
upper_bound_2 = lst_range.index("2020-11-15")


In [120]:
middle = upper_bound-low_bound
x = np.arange(-middle//2,middle//2,1)
p = np.polyfit(x,beta[low_bound:upper_bound],2)
nb_days = len(x)

In [121]:
beta_3 = np.polyval(p,x)
intercept = beta[-2] - beta_3[0]
beta_3 += intercept

In [122]:
middle_1 = upper_bound_1-low_bound_1
x_1 = np.arange(-middle_1//2,middle_1//2,1)
p_1 = np.polyfit(x_1,beta[low_bound_1:upper_bound_1],2)
nb_days_1 = len(x_1)

In [123]:
beta_3_1 = np.polyval(p_1,x_1)
intercept_1 = beta_3[-1] - beta_3_1[0]
beta_3_1 += intercept_1

In [124]:
middle_2 = upper_bound_2-low_bound_2
x_2 = np.arange(-middle_2//2,middle_2//2,1)
p_2 = np.polyfit(x_2,beta[low_bound_2:upper_bound_2],1)

In [125]:
nb_last_days = N_T1_T2 - nb_days - nb_days_1
x_3 = np.arange(-nb_last_days//2,nb_last_days//2,1)
beta_3_2 = np.polyval(p_2,x_3)
intercept_2 = beta_3_1[-1] - beta_3_2[0]
beta_3_2 += intercept_2

In [126]:
beta_3_concatenate = np.concatenate([beta_3,beta_3_1[1:],beta_3_2[1:]])

In [127]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=lst_range_T0_T2[:-1], y=beta_T0_T2[:-1], name="real value after T1"))
fig1.add_trace(go.Scatter(x=lst_range[:-1], y=beta[:-1], name="beta before T1"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=new_beta_1, name="beta 1 prediction (linear)"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=new_beta_2, name="beta 2 prediction (constant)"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=beta_3_concatenate, name="beta 3 prediction (second order)"))

fig1.update_layout(legend=dict(
    yanchor="top",
    y=0.99,
    xanchor="right",
    x=0.85
))
fig1.update_layout(title_text="Covid19: beta prediction between 28/02/2021 and 28/05/2021",height=700, width=1000)
fig1.write_html("figures/beta_prediction.html")
fig1.write_image("figures/beta_prediction.jpeg")
fig1.show()

In [128]:
I0 = I[-1]
R0 = R[-1]
E0 = E[-1]
S0 = (1 - I0 - E0 - R0)
alpha = 0.2
gamma = 0.5
dt = 1

In [129]:
S_beta_1,E_beta_1,I_beta_1,R_beta_1 = SIERmodel(S0,E0,I0,R0,new_beta_1,alpha,gamma,dt,N_T1_T2-1)
S_beta_2,E_beta_2,I_beta_2,R_beta_2 = SIERmodel(S0,E0,I0,R0,new_beta_2,alpha,gamma,dt,N_T1_T2-1)
S_beta_3,E_beta_3,I_beta_3,R_beta_3 = SIERmodel(S0,E0,I0,R0,beta_3_concatenate,alpha,gamma,dt,N_T1_T2-1)

In [130]:
I0 = 15000/N_pop
R0 = 15000/N_pop
E0 = 15000/N_pop
S0 = (1 - I0 - E0 - R0)
alpha = 0.2
gamma = 0.5
dt = 1

In [131]:
S,E,I,R = SIERmodel(S0,E0,I0,R0,beta_T0_T2,alpha,gamma,dt,N_T0_T2)

In [133]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=lst_range_T0_T2, y=I, name="infected"))
fig1.add_trace(go.Scatter(x=lst_range_T0_T2, y=S, name="susceptible"))
fig1.add_trace(go.Scatter(x=lst_range_T0_T2, y=R, name="removed"))
fig1.add_trace(go.Scatter(x=lst_range_T0_T2, y=E, name="Exposed"))

fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=I_beta_1, name="infected beta 1"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=S_beta_1, name="susceptible beta 1"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=R_beta_1, name="removed beta 1"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=E_beta_1, name="Exposed beta 1"))


fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=I_beta_2, name="infected beta 2"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=S_beta_2, name="susceptible beta 2"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=R_beta_2, name="removed beta 2"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=E_beta_2, name="Exposed beta 2"))


fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=I_beta_3, name="infected beta 3"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=S_beta_3, name="susceptible beta 3"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=R_beta_3, name="removed beta 3"))
fig1.add_trace(go.Scatter(x=lst_range_T1_T2, y=E_beta_3, name="Exposed beta 3"))


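fig1.update_layout(title_text="Covid19: SEIR model from t0 to t2 with predictions after t1",height=700, width=1000)
fig1.write_html("figures/SEIR_prediction.html")  # own file name, so other figures do not overwrite this one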

fig1.write_image("figures/SEIR_prediction.jpeg")
fig1.show()

***
## 5) Improve the model to take into account additional features.
***

Vaccinating the population helps to slow the spread of the infection. In this simplified model, vaccinated people are assumed to no longer contract the disease, so the number of Susceptible people, and in turn of Exposed and Infected people, falls as vaccination progresses. Vaccination also influences the transmission rate. The number of Susceptible people then becomes: $S = N - E - I - R - V$, where $V$ is the number of vaccinated people. 

In [None]:
def calcul_beta_vaccinated(S0,E0,I0,R0,alpha,gamma,dt,N,infection_estimation,vaccinated):
    """
    inspired from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
    
    :param S0: (int)initial value of Susceptible
    :param E0: (int)initial value of Exposed
    :param I0: (int)initial value of Infected
    :param R0: (int)initial value of Removed
    :param alpha: (float) alpha value
    :param gamma: (float) gamma value
    :param dt: (int) time between two value (nb of day)
    :param N: (int) nb of days to calculate
    :param infection_estimation: (array) Estimation of number of infection for each day
    :param vaccinated: (array) nb of person vaccinated for each day
    :return:(array) beta values
    """
    S = np.zeros(N)
    E = np.zeros(N)
    I = np.zeros(N)
    R = np.zeros(N)
    S[0] = S0
    E[0] = E0
    I[0] = I0
    R[0] = R0
    beta = np.zeros(N)
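    for i in range(0,N-1) : 
        beta[i] = infection_estimation[i]/(S[i]*I[i])
        # Assumption: vaccinated[i] is the fraction of the population vaccinated on day i;
        # these people leave the susceptible group (S is kept non-negative)
        S[i+1] = max(S[i] - (beta[i]*S[i]*I[i] + vaccinated[i])*dt, 0)
        E[i+1] = E[i] + (beta[i]*S[i]*I[i] - alpha*E[i])*dt
        I[i+1] = I[i] + (alpha*E[i] - gamma*I[i])*dt
        R[i+1] = R[i] + (gamma*I[i])*dt
    return beta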

In [134]:
def SIERmodel(S0,E0,I0,R0,beta,alpha,gamma,dt,N,vaccinated):
    """
    inspired from https://towardsdatascience.com/social-distancing-to-slow-the-coronavirus-768292f04296
    :param S0: (int)initial value of Susceptible
    :param E0: (int)initial value of Exposed
    :param I0: (int)initial value of Infected
    :param R0: (int)initial value of Removed
    :param beta: (array) beta values
    :param alpha: (float) alpha value
    :param gamma: (float) gamma value
    :param dt: (int) time between two value (nb of day)
    :param N: (int) nb of days to calculate
    :param vaccinated: (array) nb of person vaccinated for each day
    :return: (tuples of array) Susceptible(S),Exposed(E),Infected(I),Removed(R)
    """
    S = np.zeros(N)
    E = np.zeros(N)
    I = np.zeros(N)
    R = np.zeros(N)
    S[0] = S0
    E[0] = E0
    I[0] = I0
    R[0] = R0
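    for i in range(0,N-1) : 
        # Assumption: vaccinated[i] is the fraction of the population vaccinated on day i;
        # these people leave the susceptible group (S is kept non-negative)
        S[i+1] = max(S[i] - (beta[i]*S[i]*I[i] + vaccinated[i])*dt, 0)
        E[i+1] = E[i] + (beta[i]*S[i]*I[i] - alpha*E[i])*dt
        I[i+1] = I[i] + (alpha*E[i] - gamma*I[i])*dt
        R[i+1] = R[i] + (gamma*I[i])*dt
    return S,E,I, R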

In [None]:
dfsum_vacc