Goal:
- Allow the spatial model to include local vaccinations
    - From the data
    - From a prediction into the future (many scenarios are possible, but keep it simple). This is handled in the `__call__` method of `make_vaccination_function`.
- Allow the model to include local measures (no priority)
- Allow the model to include local spread of VOCs (no priority)
- Main question: what is the distribution of hospitalisations in Belgium and how is this linked to the degree of vaccination?
- Side questions
    - How does the increase in Brussels affect the rest of the country?
    - How can this be alleviated with local measures? How is this different from national measures?
    - How can this be alleviated with isolation of Brussels?
    - What would the situation have been if Brussels had been vaccinated at the average rate of the rest of the country? And if Belgium had been vaccinated at the rate of Brussels?

**Remarks**

- Why are the vaccine-eligible subjects only those from class S and R? Would make sense to also include E and A
- Why do subjects from R become dS_v?
- `dD_v = (m_ICU/dICU_D)*ICU_v + (m_C/dc_D)*C_v` seems incomplete, unless the chance of dying once a subject is hospitalised is the same with or without vaccination. Not sure if there is data available on this.
- Note: `N_vacc` must be stratified by age and by space. It is not yet obvious how this is implemented in the main model (we'd have to dig deep into the code). The same problem applies to `Nc` if we want to stratify it spatially (per province).
- What type of vaccination does `get_public_spatial_vaccination_data` return? First vaccination, I suppose?
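
The stratification requirement on `N_vacc` can be made concrete with a small sketch. It assumes a long-format table with hypothetical columns `NIS`, `age` and `doses` (toy numbers, not Sciensano data) and pivots it into a (place x age) array, which is the shape a spatially stratified model would presumably need at each time step.

```python
import numpy as np
import pandas as pd

# Toy long-format data for one week: hypothetical column names and made-up numbers
toy = pd.DataFrame({
    'NIS':   [10000, 10000, 21000, 21000, 21000],
    'age':   ['70-79', '80+', '60-69', '70-79', '80+'],
    'doses': [120, 300, 40, 80, 150],
})

age_classes = ['60-69', '70-79', '80+']
places = [10000, 21000]

# One row per place, one column per age class; missing combinations become 0
N_vacc_toy = (
    toy.pivot_table(index='NIS', columns='age', values='doses',
                    aggfunc='sum', fill_value=0)
       .reindex(index=places, columns=age_classes, fill_value=0)
       .to_numpy()
)

print(N_vacc_toy.shape)  # (number of places, number of age classes)
```

Reindexing on explicit lists of places and age classes keeps the array shape fixed, even for weeks in which a province or age class has no recorded doses.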

# Load packages and functions

In [None]:
# Preamble
import corner
import pandas as pd
import json
import numpy as np
from IPython.display import IFrame
import matplotlib.pyplot as plt

from covid19model.models import models
from covid19model.models.time_dependant_parameter_fncs import make_mobility_update_function, make_contact_matrix_function, make_VOC_function, make_vaccination_function
from covid19model.models.utils import initial_state
# from covid19model.optimization.run_optimization import checkplots, calculate_R0
# from covid19model.optimization.objective_fcns import prior_custom, prior_uniform
from covid19model.data import mobility, sciensano, model_parameters
# from covid19model.optimization import pso, objective_fcns
from covid19model.visualization.output import _apply_tick_locator 
from covid19model.visualization.optimization import autocorrelation_plot, traceplot, plot_fit, plot_calibration_fit
from covid19model.visualization.utils import moving_avg
from covid19model.optimization.utils import perturbate_PSO, run_MCMC
from covid19model.optimization.utils import samples_dict_to_emcee_chain # used for corner plots
import random
import datetime

# OPTIONAL: Load the "autoreload" extension so that package code can change
%load_ext autoreload
# OPTIONAL: always reload modules so that as you change code in src, it gets loaded
# This may be useful because the `covid19model` package is under construction
%autoreload 2

# Additional imports, copied from mrollier-calibration-WAVE2_spatial.ipynb
import gc
import sys, getopt
import os
import ujson as json  # replaces the standard-library json imported above
import emcee
import argparse
import multiprocessing as mp
from multiprocessing import Pool

from covid19model.optimization.objective_fcns import prior_custom, prior_uniform
from covid19model.data import VOC
from covid19model.optimization import pso, objective_fcns
from covid19model.models.time_dependant_parameter_fncs import ramp_fun


agg='prov'

In [None]:
# Total population and contact matrices
initN, Nc_all = model_parameters.get_integrated_willem2012_interaction_matrices(spatial=agg)

# Google Mobility data (for social contact Nc)
df_google = mobility.get_google_mobility_data(update=False)

# Load and format mobility dataframe (for mobility place)
proximus_mobility_data, proximus_mobility_data_avg = mobility.get_proximus_mobility_data(agg, dtype='fractional', beyond_borders=False)
# Converting the index as date. Probably not necessary because it's already done.
# all_mobility_data.index = pd.to_datetime(all_mobility_data.index)

# Load and format national VOC data (for time-dependent VOC fraction)
df_VOC_abc = VOC.get_abc_data()

# Load and format local vaccination data
public_spatial_vaccination_data = sciensano.get_public_spatial_vaccination_data(update=False,agg=agg)

params = model_parameters.get_COVID19_SEIRD_parameters(spatial=agg, vaccination=True,VOC=True)

values = 'hospitalised_IN'
df_sciensano = sciensano.get_sciensano_COVID19_data_spatial(agg=agg, values=values, moving_avg=False)


### TIME-DEPENDENT FUNCTIONS ###
# Define time-dependent social contact matrix
# Build the contact matrix object once and reuse its methods
contact_matrix_function = make_contact_matrix_function(df_google, Nc_all)
ramp_fun = contact_matrix_function.ramp_fun
# Covers the entire pandemic period
policies_WAVE2_no_relaxation = contact_matrix_function.policies_WAVE2_no_relaxation

# Mobility update function from class __call__ and function wrapper to get the right signature
mobility_wrapper_function = make_mobility_update_function(proximus_mobility_data, proximus_mobility_data_avg).mobility_wrapper_func

# function to update VOC fraction
VOC_function = make_VOC_function(df_VOC_abc)

# function to update local vaccination level (first dose only)
vaccination_function = make_vaccination_function(public_spatial_vaccination_data, spatial=True)
sciensano_spatial_first_dose = vaccination_function.get_sciensano_spatial_first_dose

# Visualise raw vaccination data

In [None]:
# Visualise vaccination data per age
fig, ax = plt.subplots(figsize=(12,4))
# public_spatial_vaccination_data[['CUMULATIVE']].groupby(['start_week', 'age']).sum().plot(ax=ax)

df = public_spatial_vaccination_data[['CUMULATIVE']].groupby(['start_week','age']).sum()
df.unstack().plot(kind='area', stacked=True, ax=ax)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[::-1], labels[::-1], loc='upper left')
ax.grid(False)
ax.set_xlabel(None)
ax.set_ylabel('First vaccination per age')

In [None]:
# Visualise vaccination data per province
fig, ax = plt.subplots(figsize=(12,5))
# public_spatial_vaccination_data[['CUMULATIVE']].groupby(['start_week', 'age']).sum().plot(ax=ax)

df = public_spatial_vaccination_data[['CUMULATIVE']].groupby(['start_week','NIS']).sum()
df.unstack().plot(kind='area', stacked=True, ax=ax)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[::-1], labels[::-1], loc='upper left')
ax.grid(False)
ax.set_xlabel(None)
ax.set_ylabel('First vaccination per province')

# Visualise VOCs

In [None]:
### EXTRA -- may delete ###

# ------------------
# Visualize variants
# ------------------

# Call VOC function and save data
dates = pd.date_range(start = '2020-09-01', end = '2021-09-01', freq='D')
wt=[]
alpha=[]
delta=[]
for date in dates:
    # Evaluate the VOC function once per date and split the result per variant
    fractions = VOC_function(date, {}, {})
    wt.append(fractions[0])
    alpha.append(fractions[1])
    delta.append(fractions[2])
        
# Make figure
fig,ax=plt.subplots(figsize=(12,4))
ax.plot(dates,np.array(wt)*100, color='black')
ax.plot(dates,np.array(alpha)*100, color='red')
ax.plot(dates,np.array(delta)*100, color='green')
ax.set_ylabel('Prevalence of variant (%)')
ax.legend(['WT',r'alpha, beta, gamma','delta'], bbox_to_anchor=(.04, .5), loc='center left', fontsize=14)
ax.grid(False)

# plt.savefig('VOC_changes.png', dpi=400, bbox_inches='tight')

# Visualise processed vaccination data

In [None]:
# Per province

dates_vacc = pd.date_range(start = '2020-09-01', end = '2021-09-01', freq='W-MON')
vaccs_prov=[]
for date in dates_vacc:
    vacc_prov = sciensano_spatial_first_dose(date).sum(axis=1)
    vaccs_prov.append(vacc_prov)
vaccs_prov=np.array(vaccs_prov).cumsum(axis=0)

fig, axs = plt.subplots(1, 2, figsize=(16,5), sharey=True)
axs[0].grid(False)
labels = ['Antwerpen', 'Vlaams-Brabant', 'Waals-Brabant', 'Brussels', 'West-Vlaanderen', 'Oost-Vlaanderen',\
          'Hainaut', 'Liège', 'Limburg', 'Luxembourg', 'Namur']
# One stacked band per province (the columns of vaccs_prov)
axs[0].stackplot(dates_vacc, vaccs_prov.T, labels=labels)
handles, labels = axs[0].get_legend_handles_labels()
axs[0].legend(handles[::-1], labels[::-1], loc='upper left')
axs[0].tick_params('x', labelrotation=30)
axs[0].set_title('First vaccination dose per province')

# Per age

vaccs = []
for date in dates_vacc:
    vacc = sciensano_spatial_first_dose(date).sum(axis=0)
    vaccs.append(vacc)
vaccs=np.array(vaccs).cumsum(axis=0)

axs[1].grid(False)
labels = ['0-9', '10-19', '20-29', '30-39', '40-49', '50-59', '60-69', '70-79', '80+']
# One stacked band per age class (the columns of vaccs)
axs[1].stackplot(dates_vacc, vaccs.T, labels=labels)
handles, labels = axs[1].get_legend_handles_labels()
axs[1].legend(handles[::-1], labels[::-1], loc='upper left')
axs[1].tick_params('x', labelrotation=30)
axs[1].set_title('First vaccination dose per age class')

# plt.savefig('vaccinations-per-province-and-age.png', dpi=400, bbox_inches='tight')

In [None]:
# Visualise vaccination degree per region (Flanders, Wallonia, Brussels)

inhabitants = pd.read_csv('../../data/raw/GIS/inhabitants.csv')
pop_flanders = inhabitants[inhabitants['NIS']==2000]['inhabitants'].values[0]
pop_wallonia = inhabitants[inhabitants['NIS']==3000]['inhabitants'].values[0]
pop_brussels = inhabitants[inhabitants['NIS']==4000]['inhabitants'].values[0]

vaccs_flanders = vaccs_prov[:,0] + vaccs_prov[:,1] + vaccs_prov[:,4] + vaccs_prov[:,5] + vaccs_prov[:,8]
vaccs_wallonia = vaccs_prov[:,2] + vaccs_prov[:,6] + vaccs_prov[:,7] + vaccs_prov[:,9] + vaccs_prov[:,10]
vaccs_brussels = vaccs_prov[:,3]

rel_vaccs_flanders = (vaccs_flanders / pop_flanders)*100
rel_vaccs_wallonia = (vaccs_wallonia / pop_wallonia)*100
rel_vaccs_brussels = (vaccs_brussels / pop_brussels)*100

fig, ax = plt.subplots(figsize=(16,5))
ax.grid(False)
labels = ['Flanders', 'Wallonia', 'Brussels']
ax.plot(dates_vacc, rel_vaccs_flanders, label=labels[0])
ax.plot(dates_vacc, rel_vaccs_wallonia ,label=labels[1])
ax.plot(dates_vacc, rel_vaccs_brussels ,label=labels[2])
ax.legend(loc='upper left', fontsize=24)
ax.tick_params('x', labelrotation=30)
ax.set_title('Population percentage of first vaccination dose per region (all ages)')
ax.set_ylabel('Percentage')

plt.savefig('relative-vaccination-per-region.png', dpi=400, bbox_inches='tight')

# Visualise hospitalisations per province

In [None]:
values = 'hospitalised_IN'
df_sciensano = sciensano.get_sciensano_COVID19_data_spatial(agg=agg, values=values, moving_avg=False)

fig,ax=plt.subplots(figsize=(16,5))
df_sciensano.plot(kind='area', stacked=True,ax=ax)
ax.grid(False)

handles, labels = ax.get_legend_handles_labels()
ax.legend(handles[::-1], labels[::-1], loc='upper right', fontsize=14)
ax.set_title('Daily new hospitalisations per province')
ax.set_ylabel("new hospitalisations")
ax.set_xlabel(None)

# plt.savefig('daily-new-hospitalisations.png', dpi=400, bbox_inches='tight')

df_sciensano

# Initialise model

In [None]:
# Model initial condition on August 27th

# Problem! These are the initial states for arrondissements.
# We are only interested in provinces for now, so we need to **aggregate**

##############################
# STARTING IN SEPTEMBER 2020 #
##############################

# partial

init_states_name = 'arr_WAVE1_job-FULL_1000xPSO_20000xMCMC_arr_3-index-in-frac_2021-06-10_states_2020-08-27_full-manual-fix.json'
with open(f'../../data/interim/model_parameters/COVID19_SEIRD/calibrations/arr/{init_states_name}', 'r') as fp:
    initial_states = json.load(fp)

initial_states_prov_sept = dict()
for key in initial_states:
    initial_states_prov_sept[key] = np.zeros([11,9])
    # The NIS codes in the comments are the province codes used by the model.
    # NOTE: Liège has four arrondissements, so the slice 28:33 (five rows) and the slices
    # after it may be shifted by one. Check against the arrondissement order in the JSON file.
    initial_states_prov_sept[key][0] = np.array(initial_states[key][0:3]).sum(axis=0) # 10000 Antwerpen
    initial_states_prov_sept[key][1] = np.array(initial_states[key][4:6]).sum(axis=0) # 20001 Vlaams-Brabant
    initial_states_prov_sept[key][2] = np.array(initial_states[key][6]) # 20002 Waals-Brabant
    initial_states_prov_sept[key][3] = np.array(initial_states[key][3]) # 21000 Brussels
    initial_states_prov_sept[key][4] = np.array(initial_states[key][7:15]).sum(axis=0) # 30000 West-Vlaanderen
    initial_states_prov_sept[key][5] = np.array(initial_states[key][15:21]).sum(axis=0) # 40000 Oost-Vlaanderen
    initial_states_prov_sept[key][6] = np.array(initial_states[key][21:28]).sum(axis=0) # 50000 Hainaut
    initial_states_prov_sept[key][7] = np.array(initial_states[key][28:33]).sum(axis=0) # 60000 Liège
    initial_states_prov_sept[key][8] = np.array(initial_states[key][33:36]).sum(axis=0) # 70000 Limburg
    initial_states_prov_sept[key][9] = np.array(initial_states[key][36:41]).sum(axis=0) # 80000 Luxembourg
    initial_states_prov_sept[key][10] = np.array(initial_states[key][41:]).sum(axis=0) # 90000 Namur

# Add empty entries for vaccination states
vacc_states = ['S_v', 'E_v', 'I_v', 'A_v', 'M_v', 'C_v', 'C_icurec_v', 'ICU_v', 'R_v']
for state in vacc_states:
    initial_states_prov_sept[state]=np.zeros([11, 9])
    
    
##########################
# STARTING IN MARCH 2020 #
##########################

# complete

age=40
dist='frac'
init_number = 3 # if warmup = 54 and start_date = 2020-03-05
initE = initial_state(dist='frac', agg=agg, age=age, number=init_number) # 40-somethings dropped in Brussels (arrival by plane)
initial_states_prov_march = {'S': initN, 'E': initE}
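
The arrondissement-to-province aggregation in this cell relies on hard-coded slices, which break silently if the row order changes. A possible alternative is to aggregate by NIS code. The sketch below assumes the arrondissement NIS codes are available as a list in the same order as the rows; `province_of` and `aggregate_to_provinces` are hypothetical helpers and the NIS list is a toy subset, not the full set of arrondissements. The province codes 20001, 20002 and 21000 are the ones used in the comments above.

```python
import numpy as np

def province_of(nis_arr):
    """Map an arrondissement NIS code to the province code used in this notebook."""
    if nis_arr == 21000:
        return 21000            # Brussels
    if nis_arr in (23000, 24000):
        return 20001            # Vlaams-Brabant
    if nis_arr == 25000:
        return 20002            # Waals-Brabant
    return (nis_arr // 10000) * 10000

def aggregate_to_provinces(values, nis_arr_list):
    """Sum the rows of `values` (one row per arrondissement) per province."""
    values = np.asarray(values, dtype=float)
    prov_codes = sorted({province_of(n) for n in nis_arr_list})
    out = np.zeros((len(prov_codes), values.shape[1]))
    for row, nis in zip(values, nis_arr_list):
        out[prov_codes.index(province_of(nis))] += row
    return prov_codes, out

# Toy example: 7 arrondissements, 9 age classes, one person in every cell
toy_nis = [11000, 12000, 21000, 23000, 24000, 25000, 31000]
toy_values = np.ones((len(toy_nis), 9))
prov_codes, prov_values = aggregate_to_provinces(toy_values, toy_nis)
print(prov_codes)
print(prov_values[:, 0])
```

Because the mapping depends only on the code of each row, it gives the same result regardless of the order in which the arrondissements are stored.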

In [None]:
# Add definition of function that controls seasonality


dates = pd.date_range(start = '2020-03-01', end = '2022-01-01', freq='D')
ref_date = pd.to_datetime('2021-01-01')

# Note that this definition is different from Tijs's definition of seasonality
def seasonality(t, states, param, amplitude, peak_shift):
    # If peak_shift = 0, the max is on the first of January
    maxdate = pd.Timedelta(days=peak_shift) + ref_date
    # One period is one year long (seasonality)
    t = (t - pd.to_datetime(maxdate))/pd.Timedelta(days=1)/365
    return 1+amplitude*np.cos( 2*np.pi*(t))


season_effects = []
amplitude = 0.1
peak_shift = 50 # days
for date in dates:
    season_effect = seasonality(date, {}, {}, amplitude, peak_shift)
    season_effects.append(season_effect)
    
fig, ax = plt.subplots(figsize=(16,3))
ax.plot(dates, season_effects)
ax.grid(False)
ax.set_ylim([0.75, 1.25])
ax.set_ylabel('Rescaling')
ax.axhline(1, color='k', linewidth=1, linestyle='--')
ax.axvline(ref_date, color='k', linewidth=1, linestyle='--')
title=ax.set_title(f'Seasonality with amplitude {amplitude}, phase {peak_shift} days')
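
As a quick sanity check of the seasonality definition, the sketch below re-implements the same cosine (without the unused `states` and `param` arguments) under the name `seasonality_sketch`, and evaluates it on the peak date and half a year later. The values of `amplitude` and `peak_shift` are the illustrative ones used in the plot.

```python
import numpy as np
import pandas as pd

ref_date = pd.Timestamp('2021-01-01')

def seasonality_sketch(t, amplitude, peak_shift):
    # Date on which the rescaling factor is maximal
    maxdate = ref_date + pd.Timedelta(days=peak_shift)
    # Time elapsed since the peak, expressed in years (one period is 365 days)
    years = (t - maxdate) / pd.Timedelta(days=365)
    return 1 + amplitude * np.cos(2 * np.pi * years)

amplitude = 0.1
peak_shift = 50  # days
peak_date = ref_date + pd.Timedelta(days=peak_shift)

# On the peak date the factor equals 1 + amplitude
peak_value = seasonality_sketch(peak_date, amplitude, peak_shift)
# Half a period later the factor equals 1 - amplitude
trough_value = seasonality_sketch(peak_date + pd.Timedelta(days=182.5), amplitude, peak_shift)
print(peak_value, trough_value)
```

The rescaling factor therefore stays between `1 - amplitude` and `1 + amplitude`, so it remains positive as long as `amplitude` is smaller than 1.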

In [None]:
# Model set-up

params.update({'l' : 5,
               'prev_schools' : 1,
               'prev_work' : 1,
               'prev_rest_lockdown' : 1,
               'prev_rest_relaxation' : 1,
               'prev_home' : 1,
               'initN' : initN,
               'daily_first_dose' : 60000, # copy default values from vaccination_function, which are currently not used, I think
               'delay_immunity' : 21,
               'vacc_order' : [8, 7, 6, 5, 4, 3, 2, 1, 0],
               'stop_idx' : 9,
               'refusal' : [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]})
params.update({'season_factor' : 1})
# params.update({'season_factor' : 1,
#                'amplitude' : 0.1,
#                'peak_shift' : 0})

params.update({'beta_R' : 0.015,
               'beta_U' : 0.015,
               'beta_M' : 0.015})

model = models.COVID19_SEIRD_spatial_vacc(initial_states_prov_march, params, spatial=agg,
                        time_dependent_parameters={'Nc' : policies_WAVE2_no_relaxation, 
                                                   'N_vacc' : vaccination_function, 
                                                   'alpha' : VOC_function})
#                                                    'season_factor' : seasonality})

In [None]:
# -------------------
# Perform simulations
# -------------------

n_samples = 1
n_draws_per_sample = 1

warmup = 43 # 0
start_sim = '2020-03-05' # '2020-08-27' # Date of initial conditions
end_sim = '2022-01-01'

print('\n1) Simulating COVID-19 SEIRD '+str(n_samples)+' times')
out = model.sim(end_sim,start_date=start_sim,warmup=warmup)

In [None]:
# Inspect output of single simulation pre-calibration

fig,ax=plt.subplots(figsize=(16,5))

today = pd.Timestamp.today()
ax.axvline(today, color='k', linestyle='--', linewidth=1.5)

comp = 'H_in'

out[comp].sum(dim='Nc').sum(dim='place').plot(ax=ax, label='National simulation')
out[comp].sum(dim='Nc').sel(place=10000).plot(ax=ax, label='Antwerp simulation')
out[comp].sum(dim='Nc').sel(place=21000).plot(ax=ax, label='Brussels simulation')

# reset_index to make it into a DataFrame instead of a Series
df_sciensano.sum(axis=1).reset_index().plot(kind='scatter', x='DATE', y=0, ax=ax, label='National data', color='k', linewidth=1)

ax.set_title('New daily hospitalisations (simulation and data)')
ax.legend(loc='center')
ax.grid(False)
ax.set_ylabel(comp)

# Calibration

In [None]:
# ------------------
# Calibration set-up
# ------------------

# Start date of recalibration ramp
start_calibration = '2020-03-02' # first data available
# Last datapoint used to calibrate warmup and beta
end_calibration = '2021-08-27' # last data available

# Spatial unit: depends on aggregation
spatial_unit = f'{agg}_FULL-CALIBRATION'

# PSO settings
processes = int(os.getenv('SLURM_CPUS_ON_NODE', mp.cpu_count()))
sys.stdout.flush()
multiplier = 10
maxiter = 100
popsize = multiplier*processes

# MCMC settings
max_n = 1000
print_n = 10

# Offset needed to deal with zeros in data in a Poisson distribution-based calibration
poisson_offset = 1
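
To illustrate why the offset is needed: in a Poisson log-likelihood, a predicted value of exactly zero makes the logarithm undefined. The sketch below shows one common way to handle this, by shifting both model and data by the same offset. `poisson_loglik_sketch` is a hypothetical helper, not necessarily the objective function used inside `covid19model`.

```python
import math
import numpy as np

def poisson_loglik_sketch(y_model, y_data, offset=1):
    """Poisson log-likelihood with the same offset added to model and data."""
    mu = np.atleast_1d(np.asarray(y_model, dtype=float)) + offset
    k = np.atleast_1d(np.asarray(y_data, dtype=float)) + offset
    # log(k!) computed with the log-gamma function
    log_factorial = np.array([math.lgamma(v + 1) for v in k])
    return float(np.sum(k * np.log(mu) - mu - log_factorial))

# A zero prediction for a zero observation stays finite thanks to the offset
ll_zero = poisson_loglik_sketch([0.0], [0])

# A prediction that matches the data scores higher than one that does not
toy_data = [0, 5, 10]
ll_match = poisson_loglik_sketch([0.0, 5.0, 10.0], toy_data)
ll_mismatch = poisson_loglik_sketch([3.0, 1.0, 20.0], toy_data)
print(ll_zero, ll_match, ll_mismatch)
```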

In [None]:
# -------------------------
# Print statement to stdout
# -------------------------

print('\n------------------------------------------')
print('PERFORMING CALIBRATION OF WARMUP and BETAs')
print('------------------------------------------\n')
print('Using data from '+start_calibration+' until '+end_calibration+'\n')
print('1) Particle swarm optimization')
print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n')
print(f'Using {str(processes)} cores for a population of {popsize}, for maximally {maxiter} iterations.\n')
sys.stdout.flush()

In [None]:
# --------------
# define dataset
# --------------

# Only use hospitalisation data
data=[df_sciensano[start_calibration:end_calibration]]
states = ["H_in"]
weights = [1]

In [None]:
# CALIBRATION WITHOUT SEASONALITY. Takes a long time (at least 3 minutes per iteration at present). And it seems to get stuck :(

# -----------
# Perform PSO
# -----------

pars = ['warmup','beta_R', 'beta_U', 'beta_M']
bounds=((0.0,50.0),(0.005,0.060), (0.005,0.060), (0.005,0.060))

# run optimisation
theta = pso.fit_pso(model, data, pars, states, bounds, weights=weights, maxiter=maxiter, popsize=popsize, dist='poisson',
                    poisson_offset=poisson_offset, agg=agg, start_date=start_calibration, processes=processes)

# Assign estimate.
# warmup, pars_PSO = assign_PSO(model.parameters, pars, theta)
# model.parameters = pars_PSO

# Perform simulation with best-fit results
# out = model.sim(end_calibration,start_date=start_calibration,warmup=warmup)

# Print statement to stdout once
print(f'\nPSO RESULTS:')
print(f'------------')
print(f'warmup: {int(theta[0])}') # calibrated warmup (first entry of theta)
print(f'infectivities {pars[1:4]}: {theta[1:4]}.')
# print(f'seasonality {pars[4:6]}: {theta[4:6]}.')
sys.stdout.flush()
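
For intuition about what the optimiser does with `popsize` and `maxiter`, below is a minimal particle swarm sketch on a toy objective. This is not the implementation behind `pso.fit_pso`: the function `pso_sketch`, the inertia and acceleration coefficients, and the toy target are all assumptions chosen for illustration.

```python
import numpy as np

def pso_sketch(objective, bounds, popsize=30, maxiter=100, seed=0):
    """Minimise `objective` inside box `bounds` with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lower, upper = np.array(bounds, dtype=float).T
    dim = len(lower)
    pos = rng.uniform(lower, upper, size=(popsize, dim))
    vel = np.zeros((popsize, dim))
    best_pos = pos.copy()                                # best position per particle
    best_cost = np.array([objective(p) for p in pos])
    g = best_pos[np.argmin(best_cost)].copy()            # best position of the swarm
    for _ in range(maxiter):
        r1 = rng.random((popsize, dim))
        r2 = rng.random((popsize, dim))
        # Inertia + pull towards own best + pull towards swarm best
        vel = 0.7 * vel + 1.5 * r1 * (best_pos - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, lower, upper)           # stay inside the bounds
        cost = np.array([objective(p) for p in pos])
        improved = cost < best_cost
        best_pos[improved] = pos[improved]
        best_cost[improved] = cost[improved]
        g = best_pos[np.argmin(best_cost)].copy()
    return g, float(best_cost.min())

# Toy problem: recover a known (warmup, beta) pair from a relative squared error
toy_bounds = ((0.0, 50.0), (0.005, 0.060))
toy_target = np.array([30.0, 0.02])
theta_toy, cost_toy = pso_sketch(lambda p: np.sum(((p - toy_target) / toy_target) ** 2), toy_bounds)
print(theta_toy, cost_toy)
```

In the real calibration every evaluation of the objective is a full model simulation, which is why the iterations are slow and why the evaluations are spread over several processes.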

In [None]:
# plot result of best-fit

warmup = int(theta[0])
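# Update the parameters of the model object itself, so that the simulation below uses them
model.parameters.update({'beta_R' : theta[1],
                         'beta_U' : theta[2],
                         'beta_M' : theta[3]})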

out = model.sim(end_calibration,start_date=start_calibration,warmup=warmup)

# Inspect output of single simulation with the PSO best-fit parameters

fig,ax=plt.subplots(figsize=(16,5))

today = pd.Timestamp.today()
ax.axvline(today, color='k', linestyle='--', linewidth=1.5)

comp = 'H_in'

out[comp].sum(dim='Nc').sum(dim='place').plot(ax=ax, label='National simulation')
out[comp].sum(dim='Nc').sel(place=10000).plot(ax=ax, label='Antwerp simulation')
out[comp].sum(dim='Nc').sel(place=21000).plot(ax=ax, label='Brussels simulation')

# reset_index to make it into a DataFrame instead of a Series
df_sciensano.sum(axis=1).reset_index().plot(kind='scatter', x='DATE', y=0, ax=ax, label='National data', color='k', linewidth=1)

ax.set_title('New daily hospitalisations (simulation and data)')
ax.legend(loc='center')
ax.grid(False)
ax.set_ylabel(comp)

In [None]:
# NOTE: I changed the boundaries

# -----------
# Perform PSO
# -----------

pars = ['warmup','beta_R', 'beta_U', 'beta_M', 'amplitude', 'peak_shift']
bounds=((20.0,80.0),(0.005,0.060), (0.005,0.060), (0.005,0.060), (0.5, 1.5), (-182.0, 183.0))

# run optimisation
theta = pso.fit_pso(model, data, pars, states, bounds, weights=weights, maxiter=maxiter, popsize=popsize, dist='poisson',
                    poisson_offset=poisson_offset, agg=agg, start_date=start_calibration, processes=processes)

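# Assign estimate by hand (`assign_PSO` is not imported in this notebook).
# Note: 'amplitude' and 'peak_shift' presumably only have an effect if `seasonality`
# is registered as a time-dependent parameter when the model is built.
warmup = int(theta[0])
model.parameters.update(dict(zip(pars[1:], theta[1:])))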

# Perform simulation with best-fit results
out = model.sim(end_calibration,start_date=start_calibration,warmup=warmup)

# Print statement to stdout once
print(f'\nPSO RESULTS:')
print(f'------------')
print(f'warmup: {warmup}')
print(f'infectivities {pars[1:4]}: {theta[1:4]}.')
print(f'seasonality {pars[4:6]}: {theta[4:6]}.')
sys.stdout.flush()