# Model calibration

Prepared by Omar A. Guerrero (oguerrero@turing.ac.uk, @guerrero_oa)

In this tutorial, we will calibrate the free parameters of PPI's model. First, we load all the data prepared in the previous tutorials. Then, we extract the relevant information and store it in adequate data structures. Finally, we run the calibration function and save the resulting parameter values.

## Importing Python's libraries to manipulate and visualise data

In [1]:
import pandas as pd
import numpy as np

## Importing PPI's functions

In this example, we will import the PPI source code directly from the repository. This means that we will send a request to GitHub, download the `ppi.py` file, and save a copy in the folder where these tutorials are stored. Then, we will import `ppi`.

In [2]:
import requests
url = 'https://raw.githubusercontent.com/oguerrer/ppi/main/source_code/ppi.py'
r = requests.get(url)
r.raise_for_status() # stop here if the download failed, instead of saving an error page as ppi.py
with open('ppi.py', 'w') as f:
    f.write(r.text)
import ppi

## Load data

### Indicators

In [3]:
df_indis = pd.read_csv('https://raw.githubusercontent.com/oguerrer/ppi/main/tutorials/clean_data/data_indicators.csv')

N = len(df_indis)
I0 = df_indis.I0.values # initial values
IF = df_indis.IF.values # final values
success_rates = df_indis.successRates.values # success rates
R = df_indis.instrumental.values # instrumental indicators (1 if the indicator is instrumental)
qm = df_indis.qm.values # quality of monitoring
rl = df_indis.rl.values # quality of the rule of law
indis_index = {code: i for i, code in enumerate(df_indis.seriesCode)} # used to build the network matrix

### Interdependency network

In [4]:
df_net = pd.read_csv('https://raw.githubusercontent.com/oguerrer/ppi/main/tutorials/clean_data/data_network.csv')

A = np.zeros((N, N)) # adjacency matrix
for index, row in df_net.iterrows():
    i = indis_index[row.origin]
    j = indis_index[row.destination]
    w = row.weight
    A[i,j] = w
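
The loop above fills `A` one edge at a time. As a minimal sketch of the same construction on toy data (the indicator codes and weights below are made up for illustration and are not taken from `data_network.csv`), the origin and destination codes can be mapped to indices and assigned in a single step with NumPy indexing:

```python
import numpy as np
import pandas as pd

# toy stand-ins for indis_index and df_net (hypothetical codes and weights)
toy_index = {'a': 0, 'b': 1, 'c': 2}
toy_net = pd.DataFrame({'origin': ['a', 'b', 'c'],
                        'destination': ['b', 'c', 'a'],
                        'weight': [0.5, -0.2, 0.1]})

n = len(toy_index)
rows = toy_net.origin.map(toy_index).to_numpy() # index of each origin indicator
cols = toy_net.destination.map(toy_index).to_numpy() # index of each destination indicator

A_toy = np.zeros((n, n))
A_toy[rows, cols] = toy_net.weight.to_numpy() # A_toy[i, j] holds the weight of the edge i -> j
```

Either way, rows correspond to origins and columns to destinations, so `A_toy[0, 1]` is the weight of the edge from `'a'` to `'b'`.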

### Budget

In [5]:
df_exp = pd.read_csv('https://raw.githubusercontent.com/oguerrer/ppi/main/tutorials/clean_data/data_expenditure.csv')

Bs = df_exp.values[:,1::] # disbursement schedule (assumes that the expenditure programmes are properly sorted)
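
To make the slicing above concrete, here is a small sketch on a made-up expenditure table. It assumes, as the cell above implies, that the first column holds a programme label and that the remaining columns hold one disbursement per period; the column names and amounts are hypothetical. The sketch uses `.iloc` and `to_numpy(dtype=float)` rather than `.values`, so that the result is a float array even when the label column contains text:

```python
import pandas as pd

# toy stand-in for df_exp (hypothetical labels and amounts)
toy_exp = pd.DataFrame({'programme': ['p0', 'p1'],
                        '0': [10.0, 5.0],
                        '1': [12.0, 6.0],
                        '2': [11.0, 7.0]})

# drop the label column and keep the disbursement schedule as floats
Bs_toy = toy_exp.iloc[:, 1:].to_numpy(dtype=float)
print(Bs_toy.shape) # (number of programmes, number of periods)
```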

### Budget-indicator mapping

In [6]:
df_rela = pd.read_csv('https://raw.githubusercontent.com/oguerrer/ppi/main/tutorials/clean_data/data_relational_table.csv')

B_dict = {}
for index, row in df_rela.iterrows():
    codes = row.values[1::] # skip the first column
    B_dict[index] = [indis_index[indi] for indi in codes[codes.astype(str)!='nan']]
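
To see what the cell above produces, here is a sketch on a made-up relational table. It assumes the layout that the cell implies: a label in the first column and indicator codes in the remaining columns, padded with missing values. All names and codes are hypothetical, and `pd.notna` is used here in place of the string comparison with `'nan'`:

```python
import numpy as np
import pandas as pd

# toy stand-ins for indis_index and df_rela (hypothetical labels and codes)
toy_index = {'a': 0, 'b': 1, 'c': 2}
toy_rela = pd.DataFrame({'label': ['p0', 'p1'],
                         'col_1': ['a', 'c'],
                         'col_2': ['b', np.nan]})

B_toy = {}
for index, row in toy_rela.iterrows():
    codes = row.values[1::] # everything after the label column
    present = codes[pd.notna(codes)] # drop the missing-value padding
    B_toy[index] = [toy_index[code] for code in present]

print(B_toy) # {0: [0, 1], 1: [2]}
```

Note that the keys of the resulting dictionary are the row positions of the table, and the values are lists of indicator indices.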

## Calibrate

In [7]:
T = 69

ppi.calibrate(I0, IF, success_rates, A=A, R=R, qm=qm, rl=rl, Bs=Bs, B_dict=B_dict,
              T=T, threshold=.8, parallel_processes=None, verbose=True)

AssertionError: The number of keys in B_dict should be the same as the number of ones in R
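
The assertion message compares two quantities that can be inspected before calling `ppi.calibrate`: the number of ones in `R` and the number of keys in `B_dict`. The following sketch shows that comparison on toy values (not the tutorial's data); the helper `check_budget_mapping` is hypothetical and is not part of `ppi`:

```python
import numpy as np

def check_budget_mapping(R, B_dict):
    """Return (number of ones in R, number of keys in B_dict)."""
    n_instrumental = int(np.sum(np.asarray(R) == 1))
    return n_instrumental, len(B_dict)

R_toy = np.array([1, 0, 1, 1]) # three instrumental indicators
B_toy = {0: [0], 1: [2]} # only two keys

n_ones, n_keys = check_budget_mapping(R_toy, B_toy)
if n_ones != n_keys:
    print(f'mismatch: {n_ones} ones in R but {n_keys} keys in B_dict')
```

Running the same comparison on the `R` and `B_dict` built in the cells above shows how far apart the two counts are, which is a reasonable first step in tracking down the source of the error.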