# Demo 1: epidemic model in C++. Visual comparison of the samplers

As a first example, let's use black-it to calibrate a C++ agent-based model (ABM) from the epidemiological domain. We will illustrate how the various components (samplers, losses) are configured and plugged in.

Finally, we will use the built-in visualization tools to form an intuition about **how different search strategies explore the parameter space**.

#### Basic elements of a SIR model

<table style="font-size: 110%">
    <tr style="background-color: white;">
    <td width="50%" style="vertical-align: top; text-align: left">In our model, each agent transitions among 3 states: <strong>Susceptible</strong>, <strong>Infectious</strong> and <strong>Recovered</strong>.<br><br>At each epoch, an Infectious agent has a probability <strong>β</strong> of infecting its Susceptible neighbours, and a probability <strong>γ</strong> to transition to the Recovered state.<br />From that moment on, it will no longer participate in the spreading of the disease.<img src="data/sir-model.png" alt="SIR agent model" style="width: 400px;"/></td>
    <td style="vertical-align: top; text-align: left">The connectivity between agents is modeled as a <strong>Watts-Strogatz</strong> small world random graph, a regular ring lattice of mean degree <strong>K</strong> where each node has a probability <strong>r</strong> of being randomly rewired.<br><br>In our model, these parameters will be <strong>input-calibrated</strong> (i.e., fixed).<img src="data/watts-strogatz-network.png" alt="Watts-Strogatz Network" style="width: 200px;"/></td>
    </tr>
</table>
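To make the dynamics described above concrete, here is a minimal pure-Python sketch of a SIR process on a Watts-Strogatz graph. It is not the C++ simulator used in this tutorial: the function names, the default values and the choice to test each Susceptible neighbour independently with probability β are all assumptions made for illustration.

```python
import random


def watts_strogatz(n, k, r, rng):
    """Ring lattice of even degree k; each lattice edge is rewired with probability r."""
    neighbours = [set() for _ in range(n)]
    for i in range(n):
        for j in range(1, k // 2 + 1):
            neighbours[i].add((i + j) % n)
            neighbours[(i + j) % n].add(i)
    for i in range(n):
        for j in range(1, k // 2 + 1):
            old = (i + j) % n
            if old in neighbours[i] and rng.random() < r:
                # Pick a new endpoint that is neither i nor already linked to i.
                free = [v for v in range(n) if v != i and v not in neighbours[i]]
                if free:
                    new = rng.choice(free)
                    neighbours[i].remove(old)
                    neighbours[old].remove(i)
                    neighbours[i].add(new)
                    neighbours[new].add(i)
    return neighbours


def sir_step(state, neighbours, beta, gamma, rng):
    """One epoch: Infectious nodes may infect Susceptible neighbours, then may recover."""
    new_state = list(state)
    for node, status in enumerate(state):
        if status != "I":
            continue
        for nb in sorted(neighbours[node]):
            if state[nb] == "S" and rng.random() < beta:
                new_state[nb] = "I"
        if rng.random() < gamma:
            new_state[node] = "R"  # Recovered nodes never change state again
    return new_state


def simulate(n=200, k=4, r=0.1, beta=0.2, gamma=0.1, epochs=30, seed=0):
    """Return the (S, I, R) counts at every epoch, starting from one Infectious node."""
    rng = random.Random(seed)
    neighbours = watts_strogatz(n, k, r, rng)
    state = ["S"] * n
    state[0] = "I"
    counts = []
    for _ in range(epochs + 1):
        counts.append(tuple(state.count(s) for s in "SIR"))
        state = sir_step(state, neighbours, beta, gamma, rng)
    return counts


counts = simulate()
print(counts[-1])  # final (S, I, R) counts
```

The three counts always sum to the population size, the Susceptible count can only decrease and the Recovered count can only increase. Those monotone curves are what the calibration tries to match against data.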

## Calibration of a SIR model against realistic data

In this part of the tutorial we will use black-it to find the parameters of a SIR model fitted to the Italian Covid-19 epidemiological data.

We will see that properly modelling the first wave of the epidemic requires introducing a **structural break** in the SIR simulator, i.e., a specific point in time at which the parameters change abruptly.

This is useful to model the effect of the lockdown over the spreading of the epidemic.
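The effect of a structural break can be sketched with a deterministic, mean-field SIR recursion in which the infection rate switches from `beta1` to `beta2` at time `brktime`. This is a simplified stand-in, not the network-based simulator calibrated below; the function name, the initial infectious fraction `i0` and the numeric values are illustrative assumptions.

```python
def sir_with_break(beta1, beta2, gamma, brktime, n_steps, i0=0.01):
    """Discrete-time mean-field SIR on population fractions with one break in beta."""
    s, i, r = 1.0 - i0, i0, 0.0
    series = [(s, i, r)]
    for t in range(n_steps):
        # Before the break the disease spreads at rate beta1, afterwards at beta2.
        beta = beta1 if t < brktime else beta2
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        series.append((s, i, r))
    return series


# Same starting rate, with and without a lockdown-like drop at week 5.
no_break = sir_with_break(beta1=0.5, beta2=0.5, gamma=0.2, brktime=5, n_steps=20)
with_break = sir_with_break(beta1=0.5, beta2=0.05, gamma=0.2, brktime=5, n_steps=20)
print(no_break[-1][0], with_break[-1][0])  # final Susceptible fractions
```

The two trajectories coincide up to the break and diverge afterwards. A model with a single constant infection rate cannot reproduce that kind of kink in the data.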

In [None]:
# preparatory imports
import numpy as np

import sir_util

from black_it.calibrator import Calibrator
from black_it.loss_functions.minkowski import MinkowskiLoss
from black_it.plot.plot_results import (
    plot_convergence,
    plot_losses_interact,
    plot_sampling_interact,
)
from black_it.samplers.best_batch import BestBatchSampler
from black_it.samplers.halton import HaltonSampler
from black_it.samplers.random_forest import RandomForestSampler

### Load reference data

For didactic purposes, let's load a previously prepared dataset containing a very rough estimate of the SIR data for the first 20 weeks of the Italian Covid-19 epidemic. As the official data underestimates the number of cases, Susceptible and Recovered were rescaled by a constant factor.

Let's load and plot the real time series we want to reproduce:

In [None]:
real_data = np.loadtxt("data/italy_20_weeks.txt")
sir_util.plotSeries("Real data", real_data)

### Initialize a calibrator object

#### 1. Model simulator

In [None]:
from models.sir.sir_docker import SIR_w_breaks

#### 2. Loss function

We'll use a quadratic loss, a simple squared difference between the two series.

In [None]:
loss = MinkowskiLoss()
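For intuition, the Minkowski distance of order `p` between two series can be written in a few lines. The sketch below is only the textbook formula for one-dimensional series, not black-it's implementation; it assumes that the default order is `p = 2`, which should be checked against the library documentation.

```python
def minkowski_distance(real, simulated, p=2):
    """Minkowski distance of order p between two equally long 1-D series."""
    return sum(abs(a - b) ** p for a, b in zip(real, simulated)) ** (1 / p)


print(minkowski_distance([0, 0], [3, 4]))       # order 2: Euclidean distance
print(minkowski_distance([0, 0], [3, 4], p=1))  # order 1: sum of absolute differences
```

With `p = 2` the squared differences dominate, so a few large mismatches between the real and simulated series are penalised more heavily than many small ones.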

#### 3. Samplers

Let's choose an advanced, static combination of samplers. In the next demo we will follow a different approach.

In [None]:
sampler_batch_size = 16
samplers = [
    HaltonSampler(batch_size=sampler_batch_size),
    RandomForestSampler(batch_size=sampler_batch_size),
    BestBatchSampler(batch_size=sampler_batch_size),
]
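The first sampler in the list draws points from a Halton low-discrepancy sequence, which covers the space more evenly than independent uniform draws. The idea can be sketched with the classic radical-inverse construction. This is a generic illustration, not the code of `HaltonSampler`; the function names and the fixed bases 2 and 3 are assumptions.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the decimal point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_points(n_points, bases=(2, 3)):
    """First n_points of a Halton sequence in the unit hypercube, one prime base per axis."""
    return [tuple(radical_inverse(i, b) for b in bases) for i in range(1, n_points + 1)]


for point in halton_points(4):
    print(point)
```

Each coordinate would then be rescaled to the bounds of the corresponding parameter. The other two samplers in the list are adaptive: they use the losses already observed to propose new points.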

#### 4. Parameter space (bounds and precision)

In [None]:
#    brktime, beta1, beta2, gamma
bounds_w_breaks = [
    [2, 0.1, 0,   0.1],
    [7, 0.2, 0.1, 0.3],
]
precisions_w_breaks = [1, 0.0005, 0.0005, 0.0005]

#### Initialize the Calibrator

There are almost **100 million** possible parameter combinations to explore:

In [None]:
saving_folder = "output"
cal = Calibrator(
    samplers=samplers,
    real_data=real_data,
    model=SIR_w_breaks,
    parameters_bounds=bounds_w_breaks,
    parameters_precision=precisions_w_breaks,
    ensemble_size=1,
    loss_function=loss,
    saving_folder=saving_folder,
    random_state=0,
)
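The "almost 100 million" figure can be checked by counting the grid points implied by the bounds and precisions. The sketch below assumes that each parameter takes values from its lower to its upper bound inclusive, in steps equal to its precision; that discretisation rule is an assumption about how the search space is built.

```python
from math import prod

# Same values as in the parameter space cell: brktime, beta1, beta2, gamma
bounds = [[2, 0.1, 0, 0.1], [7, 0.2, 0.1, 0.3]]
precisions = [1, 0.0005, 0.0005, 0.0005]

# Number of admissible values per parameter (both endpoints included).
points_per_parameter = [
    round((upper - lower) / step) + 1
    for lower, upper, step in zip(bounds[0], bounds[1], precisions)
]
n_combinations = prod(points_per_parameter)
print(points_per_parameter, n_combinations)
```

Under that assumption the grid has 6 × 201 × 201 × 401 points, about 97 million, so an exhaustive search is not an option if every point costs one run of the simulator.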

### Calibration

Perform 15 calibration rounds.

Note that, with these settings, running 30 calibration rounds instead of 15 would achieve a much lower loss.

In [None]:
params, losses = cal.calibrate(15)

Best parameters obtained so far:

In [None]:
sir_util.printBestParams(params)

### Compare the original and the calibrated time series

In [None]:
idxmin = np.argmin(cal.losses_samp)
sir_util.plotSeries("real (───) vs calibrated (-----)", real_data, cal.series_samp[idxmin, 0])

## Plots

Let's use the **black-it built-in functions** to visually explore how the calibration progressed.

In [None]:
plot_sampling_interact(saving_folder)

In [None]:
plot_convergence(saving_folder)