# Interactive notebook for using `helmpy` mean field models for parameter inference from data

The mean field (or 'deterministic') models run with the `helmpy.run_meanfield` method provide an efficient way to perform approximate posterior parameter inference from one or more datasets. Before illustrating how to do this, we describe the theoretical background that the inference relies on.

## 1. Theoretical background

Here we outline the basic theoretical background for performing Bayesian parameter inference using mean field helminth models under the assumption that the system is close to a state of endemic equilibrium. For helminth transmission parameter inference far out of equilibrium, or close to the unstable breakpoint in the transmission phase plane (for reference: [https://www.sciencedirect.com/science/article/pii/S002251931930445X]), it is not recommended that mean field models are used, for many of the reasons outlined in [https://www.medrxiv.org/content/10.1101/2019.12.17.19013490v1] - in such instances, the stochastic inference method is recommended, though it is more computationally expensive. Bear in mind, also, that inference with the mean field model will typically underestimate the variance of the posterior over parameters in comparison to inference with the stochastic model, so it should be used when population sizes are large enough that this additional variance is small.

The formalism we will outline here assumes that the diagnostic data are either Kato-Katz counts (for _Ascaris lumbricoides, Trichuris trichiura, Schistosoma mansoni_ and hookworm diagnostic testing) or urine filtration counts (for _Schistosoma haematobium_ diagnostic testing). As ever in any canonical Bayesian problem, specification of the likelihood function ${\cal L}$ is not the end of the story. To infer a joint posterior distribution ${\cal P}$ over the collection of transmission and diagnostic parameters $\{ \Lambda (a,t),k,z,\lambda_{\rm d},k_{\rm d}\}$ given a dataset ${\cal D}$, Bayes' rule here reads

$${\cal P}[ \Lambda (a,t),k,z,k_{\rm d} \vert {\cal D},\lambda_{\rm d}] = \frac{1}{{\cal E}} \pi [\Lambda (a,t)] \, \pi (k) \, \pi (z) \, \pi (k_{\rm d}) \,{\cal L}[{\cal D}\vert \Lambda (a,t),k,z,k_{\rm d},\lambda_{\rm d}] \,,$$

where ${\cal E}$ is a normalisation constant, we have assumed that $\lambda_{\rm d}$ is known prior to the inference and $\pi [\Lambda (a,t)]$, $\pi (k)$, $\pi (z)$ and $\pi (k_{\rm d})$ are the prior distributions over $\Lambda (a,t)$, $k$, $z$ and $k_{\rm d}$, which are the age and time-dependent force of infection, worm aggregation, density dependent fecundity factor and measured diagnostic aggregation parameter, respectively (all assumed to be independent of each other _a priori_). Kato-Katz and urine filtration counts typically follow a distribution which appears to be negative binomial in shape. By computing the mean diagnostically-detected egg count $\hat{{\sf e}}_{\rm d}=\lambda_{\rm d}\hat{{\sf e}}$ from the transmission parameters $\{ M (a,t),k,z,\lambda_{\rm d},k_{\rm d}\}$, the likelihood used for the inference of these parameters is therefore likely to be well-approximated by

$${\cal L}[{\cal D}\vert M(a,t),k,z,\lambda_{\rm d},k_{\rm d} ] = \prod_{\forall {\sf e}_i\in {\cal D}}{\rm NB}\big\{ {\sf e}_i; \lambda_{\rm d}\hat{{\sf e}}[M(a,t),k,z],k_{\rm d}\big\} \,,$$

where $k_{\rm d}$ accounts for some diagnostic variance; $\lambda_{\rm d}$ accounts for the number of diagnostically-detected eggs per female worm; and $M(a,t)$ is the age and time-dependent total mean worm burden. Note that if only prevalence data are available, ${\cal D}=\{ N_{\rm inf}/N \}$ and the likelihood is a binomial distribution with probability derived from the negative binomial above

$${\cal L}_{\rm prev}[N_{\rm inf}/N\vert  M(a,t),k,z,\lambda_{\rm d},k_{\rm d} ] = {\rm Bin}\big\{ N_{\rm inf}; N,{\sf P}_{\rm d}[ M(a,t),k,z,\lambda_{\rm d},k_{\rm d} ]\big\}$$

$${\sf P}_{\rm d}[ M(a,t),k,z,\lambda_{\rm d},k_{\rm d}]=1-\bigg\{ 1 + \frac{\lambda_{\rm d}\hat{{\sf e}}[M(a,t),k,z]}{k_{\rm d}}\bigg\}^{-k_{\rm d}}\,.$$
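A minimal sketch of the two likelihoods above, assuming SciPy is available. The function names and example numbers are illustrative and are not part of `helmpy`; the mean detected egg count $\lambda_{\rm d}\hat{{\sf e}}$ is passed in directly as `mean_detected`. SciPy's `nbinom` is parameterised by $(n,p)$, so a negative binomial with mean $m$ and aggregation $k_{\rm d}$ corresponds to $n=k_{\rm d}$ and $p=k_{\rm d}/(k_{\rm d}+m)$.

```python
import numpy as np
from scipy.stats import binom, nbinom

def nb_log_likelihood(egg_counts, mean_detected, k_d):
    """Sum of negative binomial log-pmfs with mean `mean_detected` and aggregation `k_d`."""
    p = k_d / (k_d + mean_detected)  # SciPy success probability giving this mean
    return np.sum(nbinom.logpmf(egg_counts, k_d, p))

def detected_prevalence(mean_detected, k_d):
    """P_d = 1 - (1 + mean/k_d)^(-k_d): probability of a non-zero egg count."""
    return 1.0 - (1.0 + mean_detected / k_d) ** (-k_d)

def prevalence_log_likelihood(n_inf, n_total, mean_detected, k_d):
    """Binomial log-likelihood of observing n_inf positives out of n_total."""
    return binom.logpmf(n_inf, n_total, detected_prevalence(mean_detected, k_d))

# Illustrative (made-up) egg counts and parameter values
counts = np.array([0, 0, 12, 48, 3, 0, 150])
print(nb_log_likelihood(counts, mean_detected=30.0, k_d=0.3))
print(prevalence_log_likelihood(4, 7, mean_detected=30.0, k_d=0.3))
```

Working with log-likelihoods rather than the raw product avoids numerical underflow when the dataset contains many individuals.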

If prevalence is used, due to the reduced amount of information, it is typically required that $k_{\rm d}$ is also assumed to be known _a priori_. See, e.g., Anderson & May, 1991 or [https://www.sciencedirect.com/science/article/pii/S002251931930445X] for the motivations behind the calculation of the proportionality factor for the first moment of the egg count distribution $\hat{{\sf e}}$, which, for example, in the case of STH (fully polygamous male worms) is analytic (it is not in the case of monogamous schistosomes)

$$\hat{{\sf e}}_{\rm STH}[M(a,t),k,z] = \phi [M(a,t);k,z]\, f [M(a,t);k,z] M(a,t)$$

$$f[M(a,t);k,z] \equiv \left[ 1+(1-z)\frac{M(a,t)}{k}\right]^{-(k+1)} $$

$$\phi [M(a,t);k,z] \equiv 1-\left[ \frac{1+(1-z)M(a,t)/k}{1+(2-z)M(a,t)/(2k)}\right]^{k+1} \,.$$
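The three expressions above translate directly into code. The sketch below is a standalone illustration and the function names are hypothetical, not the `helmpy` API.

```python
import numpy as np

def density_dependence(M, k, z):
    """f(M; k, z) = [1 + (1 - z) M / k]^-(k + 1)."""
    return (1.0 + (1.0 - z) * M / k) ** (-(k + 1.0))

def mating_probability(M, k, z):
    """phi(M; k, z) for fully polygamous male worms (STH)."""
    ratio = (1.0 + (1.0 - z) * M / k) / (1.0 + (2.0 - z) * M / (2.0 * k))
    return 1.0 - ratio ** (k + 1.0)

def egg_factor_sth(M, k, z):
    """First-moment factor of the egg count distribution: phi * f * M."""
    return mating_probability(M, k, z) * density_dependence(M, k, z) * M

# Illustrative values: mean worm burdens evaluated at k = 0.3, z = 0.93
M = np.array([0.5, 2.0, 5.0])
print(egg_factor_sth(M, k=0.3, z=0.93))
```

Since both $\phi$ and $f$ lie between 0 and 1 for $0<z\leq 1$, the factor $\hat{{\sf e}}_{\rm STH}$ never exceeds $M$ itself.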

The attentive reader will notice that the arguments of the equation directly above do not exactly match the likelihood required to complete Bayes' rule. In fact, one must consider the transmission dynamics to map between values of $\Lambda (a,t)$ and $M(a,t)$. Note: from the form of the equation above, one might be tempted to simply infer $M(a,t)$ directly. This is, however, problematic for ensuring that samples of the system are at equilibrium for a typical set of baseline data.

From Anderson & May, 1991, the mean field (or 'deterministic') transmission dynamics of the helminth infections considered here (STH and schistosomes) with age structure can be described by the following system 

$$\frac{\partial M}{\partial t} + \frac{\partial M}{\partial a} = \Lambda (a,t) - \mu_1 M(a,t) \,.$$

This equation may be converted to a differential equation with respect to time only, while discretising the mean worm burden into age bins $\{ a_i\}$, by integrating over $a$ using a survival rate kernel $S(a)$ like so

$$M_i(t) \equiv M(a_i,t) = \frac{\int^{a_{i+1/2}}_{a_{i-1/2}}{\rm d}a \, M(a,t) S(a)}{\int^{\infty}_{0}{\rm d}a S(a)} \,.$$

Choosing $S(a) = e^{-\mu a}$ and assuming an age-constant $\Lambda (a_i,t)$ within the bin (as we have assumed before in the fitting procedure), one may obtain the following first-order differential equation corresponding to the dynamics in the $i$-th age bin

$$\frac{{\rm d} M_i}{{\rm d} t} = \Lambda (a_i,t) - (\mu + \mu_1)M_i(t) \,.$$

In order to obtain this equation above, we have assumed that the boundary flux between age bins must vanish

$$\left.\frac{\partial M_i}{\partial a} \right\vert_{a_{i+1/2}}=0 \,,$$

due to an approximated instantaneous change in the worm burden for the individual (as a consequence of individuals changing force of infection $\Lambda (a_i,t) \rightarrow \Lambda (a_{i+1},t)$) - note that this also sets the other boundary flux

$$\left.\frac{\partial M_i}{\partial a} \right\vert_{a_{i-1/2}}=\left.\frac{\partial M_{i-1}}{\partial a} \right\vert_{a_{i-1/2}}=\left.\frac{\partial M_{(i-1)}}{\partial a} \right\vert_{a_{(i-1)+1/2}}=0 \,.$$

As a side note: in a fully stochastic individual-based model, the change in the expected worm burden will occur over a timescale of $1/\mu_1$, so ensuring that the age bin widths are wider than this timescale is a necessity for this approximation to remain accurate. Note also that the birth rate into the first age bin should be set to $\mu$ to match the simulation.

The implicit solution to the equation for ${\rm d} M_i/{\rm d} t$ above is

$$M_i(t) = M_i(t_0)e^{-(\mu + \mu_1) (t-t_0)} + \int^{t}_{t_0} {\rm d}t' \Lambda (a_i,t')e^{-(\mu + \mu_1) (t-t')}\,,$$

this solution can be inserted into ${\cal L}[{\cal D}\vert  M_i(t),k,z,\lambda_{\rm d},k_{\rm d} ]$ to perform inference with ${\cal L}[{\cal D}\vert  \Lambda (a_i,t),k,z,\lambda_{\rm d},k_{\rm d}]$. Note also that given the rapid equilibration of the infectious reservoir ${\rm d}\Lambda (a_i,t) /{\rm d}t\rightarrow 0$, we may also identify (see Anderson & May, 1991 again...)

$$\Lambda (a_i,t) = (\mu + \mu_1)R_{0,i}\sum_{j=1}^{N_a}\frac{N_j}{N_{\rm tot}} \, \hat{{\sf e}}[M_j(t),k,z] \,,$$

where we have included a host death rate $\mu$, the number of people $N_i$ within an age group (and $N_{\rm tot}$ in total) and defined an age-dependent coefficient which contributes to the basic reproduction number $R_{0,i}$. By inserting $\Lambda (a_i,t)$ into the equation for ${\rm d} M_i/{\rm d} t$ above, the nonlinear dynamical system of equations that this generates is 

$$\frac{{\rm d} M_{i}}{{\rm d} t} = (\mu +\mu_1) R_{0,i}\sum_{j=1}^{N_a}\left\{ \frac{N_j}{N_{\rm tot}}\, \hat{{\sf e}}[M_j(t),k,z] \right\} - (\mu +\mu_1)M_i(t) \,,$$

where the value of the overall $R_0$ may be obtained through the relation

$$R_{0}=\frac{1}{N_{\rm tot}}\sum^{N_a}_{i=1} N_iR_{0,i} \,.$$
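To make the nonlinear system for ${\rm d} M_i/{\rm d} t$ concrete, here is a small standalone sketch that integrates it for two age bins with SciPy's `solve_ivp`. It is not the `helmpy.run_meanfield` implementation, and every parameter value below (rates per year, $k$, $z$, $R_{0,i}$, $N_i$, initial burdens) is made up for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

def egg_factor_sth(M, k, z):
    """First-moment factor for STH: phi * f * M (polygamous mating)."""
    f = (1.0 + (1.0 - z) * M / k) ** (-(k + 1.0))
    ratio = (1.0 + (1.0 - z) * M / k) / (1.0 + (2.0 - z) * M / (2.0 * k))
    phi = 1.0 - ratio ** (k + 1.0)
    return phi * f * M

# Illustrative values only; none are helmpy defaults
mu, mu1 = 1.0 / 70.0, 0.5        # host and worm death rates (per year)
k, z = 0.3, 0.93                 # worm aggregation and fecundity factor
R0_i = np.array([2.0, 3.0])      # age-dependent R0 coefficients
N_i = np.array([300.0, 700.0])   # people in each age bin
weights = N_i / N_i.sum()        # N_j / N_tot

def rhs(t, M):
    """dM_i/dt = (mu + mu1) * (R0_i * sum_j w_j e(M_j) - M_i)."""
    reservoir = np.sum(weights * egg_factor_sth(M, k, z))
    return (mu + mu1) * (R0_i * reservoir - M)

# Integrate for 100 years from an initial mean burden of 2 worms per bin
sol = solve_ivp(rhs, (0.0, 100.0), [2.0, 2.0], rtol=1e-8, atol=1e-10)
M_final = sol.y[:, -1]
R0 = np.sum(weights * R0_i)      # overall R0 from the relation above

print("M_i after 100 years:", M_final)
print("overall R0:", R0)
```

Starting above the breakpoint, the burdens in this example relax towards an endemic equilibrium; a start sufficiently close to zero would instead decay, which is one reason the mean field approach is not recommended near the breakpoint.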

Note also that, at equilibrium $M(a_i,t)\rightarrow M(a_i)\,\, \forall i$, the value of $R_0$ is constrained to

$$R_{0} = \frac{\sum_{i=1}^{N_a} N_iM_i}{\sum_{j=1}^{N_a} N_j \,\hat{{\sf e}}(M_j,k,z)} \,.$$
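As a quick numerical check of the equilibrium constraint, the sketch below computes $R_0$ from a set of equilibrium burdens and confirms that it agrees with the population-weighted average of the $R_{0,i}$ implied by setting ${\rm d} M_i/{\rm d} t=0$. The function names and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

def egg_factor_sth(M, k, z):
    """First-moment factor for STH: phi * f * M (polygamous mating)."""
    f = (1.0 + (1.0 - z) * M / k) ** (-(k + 1.0))
    ratio = (1.0 + (1.0 - z) * M / k) / (1.0 + (2.0 - z) * M / (2.0 * k))
    phi = 1.0 - ratio ** (k + 1.0)
    return phi * f * M

def r0_at_equilibrium(N_i, M_i, k, z):
    """R0 = sum_i N_i M_i / sum_j N_j e(M_j, k, z)."""
    return np.sum(N_i * M_i) / np.sum(N_i * egg_factor_sth(M_i, k, z))

# Made-up equilibrium burdens and bin populations
N_i = np.array([300.0, 700.0])
M_eq = np.array([2.5, 4.0])
k, z = 0.3, 0.93

R0 = r0_at_equilibrium(N_i, M_eq, k, z)

# Setting dM_i/dt = 0 gives M_i = R0_i * sum_j (N_j / N_tot) e(M_j),
# so the per-bin coefficients follow from the same burdens
reservoir = np.sum(N_i / N_i.sum() * egg_factor_sth(M_eq, k, z))
R0_i = M_eq / reservoir
R0_from_bins = np.sum(N_i * R0_i) / N_i.sum()

print(R0, R0_from_bins)
```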

To include migration between clusters in the inference, $\Lambda (a_i,t)$ in the ${\rm d} M_i/{\rm d} t$ equation above would be modified by the expectations of compound Poisson processes modelling the net ingoing and outgoing eggs/larvae (see: [https://www.sciencedirect.com/science/article/pii/S002251931930445X]). We will not handle this case here though.

## 2. Setup with mock data

First we must import `helmpy` and the other modules necessary for the inference...

In [None]:
import sys
path_to_helmpy = '/Users/Rob/work/helmpy' # Give your path to helmpy here
sys.path.append(path_to_helmpy + '/source/') 
from helmpy import helmpy

# These modules are not necessary to run helmpy alone but will be useful for our demonstrations

# LEAVE THESE IMPORTS COMMENTED AS THEY ARE FOR PRODUCING LaTeX-STYLE FIGURES ONLY
#import matplotlib as mpl
#mpl.use('Agg')
#mpl.rc('font',family='CMU Serif')
#mpl.rcParams['xtick.labelsize'] = 15
#mpl.rcParams['ytick.labelsize'] = 15
#mpl.rcParams['axes.labelsize'] = 20
#from matplotlib import rc
#rc('text',usetex=True)
#rc('text.latex',preamble=r'\usepackage{mathrsfs}')
#rc('text.latex',preamble=r'\usepackage{sansmath}')
# LEAVE THESE IMPORTS COMMENTED AS THEY ARE FOR PRODUCING LaTeX-STYLE FIGURES ONLY

import numpy as np
import matplotlib.pyplot as plt
import scipy.special as spec
import time

...and make up some mock, e.g., Kato-Katz data to use...

In [None]:
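meanegg = 30.0 # Mean egg count of the mock sample
varegg = 8000.0 # Variance of the mock egg counts (must exceed the mean)
# NumPy's negative binomial takes (n, p): here n = mean^2/(var - mean) and p = mean/var
kksamps = np.random.negative_binomial(meanegg**2.0/np.abs(varegg-meanegg),meanegg/varegg,size=250)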
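The choice of arguments passed to the negative binomial sampler can be checked against the analytic moments. The sketch below assumes SciPy is available and uses a seeded NumPy generator so that the mock draw is reproducible; the `_check` variable names are illustrative.

```python
import numpy as np
from scipy.stats import nbinom

meanegg, varegg = 30.0, 8000.0

# Moment matching for the (n, p) parameterisation:
# mean = n (1 - p) / p and variance = n (1 - p) / p^2
n = meanegg**2 / (varegg - meanegg)
p = meanegg / varegg

mean_check, var_check = nbinom.stats(n, p, moments='mv')
print(float(mean_check), float(var_check))

# Reproducible mock sample of the same size as above
rng = np.random.default_rng(0)
kksamps_check = rng.negative_binomial(n, p, size=250)
print(kksamps_check.mean(), kksamps_check.var())
```

With only 250 samples and such a strongly overdispersed distribution, the sample mean and variance can differ noticeably from the target values.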