# Building a model of STI transmission

Working through the discussion prompt, you will hopefully have identified some important characteristics of sexually transmitted infections (STIs) like syphilis that make them different from acute respiratory infections like influenza.

One important (and obvious) difference is that transmission of STIs requires sexual contact, or another type of contact allowing exchange of bodily fluids (such as intravenous drug use), in contrast to directly transmitted respiratory infections. This means that only the subset of the population that is sexually active is at risk of infection. It also means that transmission of STIs does not depend on population density, i.e. the rate of transmission is not directly driven by crowded conditions. Although the natural history of different STIs is highly disease-specific, they can be of long duration with prolonged asymptomatic periods, particularly in women (e.g. syphilis). Many STIs (e.g. gonorrhoea, chlamydia) do not induce long-term immunity and re-infection can be common. Infection with one STI (e.g. herpes simplex virus 2) may also enhance transmission of other STIs (e.g. HIV).

These are just some important examples of characteristics that affect the development of mathematical models. As you have seen, the basic reproduction number for STIs is given by:
\begin{align}
R_0 = b c D
\end{align}

where $b$ is the transmission probability per partnership, $c$ is the mean rate of partner change per unit time, and $D$ is the average duration of the infection. This highlights the trade-off between the transmission probability and the duration of infection, which allows STIs to spread and persist in populations. 

Rearranging this equation and applying it to syphilis, with $b$ = 0.35 and $D$ = 0.25 years, the condition $R_0 \geq 1$ requires $c \geq 1/(bD) = 1/(0.35 \times 0.25) \approx 11.4$. In other words, for this infection to persist, the average rate of partner change $c$ would have to be at least 11.4 partners per year across the whole sexually active population. It is clearly unrealistic to assume that this represents typical rates of partner exchange in the population! *However*, we know that a small proportion of high-risk individuals may have very high rates of partner exchange, and this heterogeneity may be enough to allow syphilis to persist in a population. This example illustrates why accounting for heterogeneity in sexual behaviour is so important in mathematical models of STIs!
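The threshold quoted above can be checked with a quick calculation. This is a minimal sketch; the variable name `c_min` is chosen here for illustration and is not part of the model code.

```r
# Transmission probability per partnership and duration of infection (years)
b <- 0.35
D <- 0.25

# Setting R0 = b * c * D equal to 1 and solving for c gives the
# minimum rate of partner change needed for persistence
c_min <- 1 / (b * D)
round(c_min, 1)   # 11.4 partners per year
```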

## Syphilis model

Now, you will code and use a very simple model of syphilis, to explore the effect of population heterogeneity on its transmission dynamics and control. Similarly to the age-structured models you developed for influenza, we will stratify the model into two groups: individuals with high sexual activity levels (subscript H), and the general sexually active population (subscript L), as shown in the diagram below. Note that we are not representing population groups that are not sexually active in the model, and we are assuming no one enters the sexually active population (e.g. through reaching sexual maturity) over the course of the simulation.

<img src="../GraphicsAndData/w10_nb4_model_diagram.png">

Untreated syphilis is characterised by a complex natural history, with clinical symptoms and infectivity varying over the course of the chronic infection. Primary and secondary syphilis last several weeks or months and are characterised by painless symptoms such as lesions and rash. However, after resolution of these clinical manifestations, the infection enters a long-lasting latent stage. The latent stage of infection is asymptomatic and not infectious for the most part, although more severe symptoms and infectivity may recur years or decades after the initial infection. 

Nowadays, syphilis can be treated, so progression to latent infection is less common where treatment services are available. Here, we are using the simplified framework shown above to represent syphilis transmission in a population with some treatment services. You can find more realistic structures and assumptions for syphilis models in the references at the end of the etivity.

We assume that syphilis infection is preceded by an exposed (but not yet infectious) period - compartment E. Exposed individuals become infectious at a rate $\sigma$, moving into compartment I, which includes both primary and secondary syphilis. We assume no one develops latent infection. Instead, infectious individuals return to the susceptible compartment at a rate $\gamma$ as a result of receiving treatment, because syphilis infection does not confer immunity and people can therefore be reinfected.

**Parameters and initial conditions:**

You will model a sexually active population of 500,000 people, 2.8% of whom are in the high-activity group. Assume that the initial prevalence of infection is 1% in both the high- and the low-activity group, and that everyone else is susceptible. The parameter values for the model are:

$b$ = 0.35  
$c_H$ = 29  
$c_L$ = 1.3   
$\sigma$ = 1/0.077   
$\gamma$ = 1/0.25

Rates are in units of years$^{-1}$.
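The group sizes implied by these numbers follow from simple arithmetic. The sketch below uses illustrative variable names (`N`, `prop_high`, `init_prev`, `inf_H`, `inf_L`) that are not part of the model code. It only counts how many people are initially infected in each group; which compartment they start in is left to the model set-up.

```r
# Total sexually active population and proportion in the high-activity group
N <- 500000
prop_high <- 0.028

# Size of each activity group
N_H <- N * prop_high         # 14,000 high-activity individuals
N_L <- N * (1 - prop_high)   # 486,000 low-activity individuals

# Number initially infected in each group at 1% prevalence
init_prev <- 0.01
inf_H <- N_H * init_prev     # 140
inf_L <- N_L * init_prev     # 4,860
```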

**Calculating the force of infection for the STI model:**

The approach to defining the force of infection of an STI in a heterogeneous population is slightly different from the one you used in the influenza model. Remember that previously, the force of infection was defined in proportion to the prevalence of infection in a given age group, because we had data on the daily number of contacts made between different age groups - we had values for each parameter in the contact matrix.

In the case of sexual contacts, this mixing pattern cannot be measured directly. We may have an estimate of the transmission probability per partnership, $b$, and of the overall rate of partner change in the high- and the low-activity groups, $c_H$ and $c_L$, respectively, but we usually don't know what proportion of these partnerships were with people from each activity group. As a result, we have to make an **assumption** about how the high- and low-activity groups mix within and between groups.

For the purpose of this exercise, we will assume *proportionate mixing*, which means that new partnerships are formed randomly in proportion to the activity levels of both members of the partnership. Here, we define an individual's 'activity level' by the parameters $c_H$ and $c_L$. 

Using our parameter values, we can calculate the probability that a partner selected according to proportionate mixing belongs to the high-activity group, $p_H$, as follows:

\begin{align}
p_H &= \frac{c_H \frac{N_H}{N}}{c_H \frac{N_H}{N} + c_L \frac{N_L}{N}} \\
    &= \frac{29 \times 0.028}{29 \times 0.028 + 1.3 \times (1-0.028)} \\
    &= 0.39
\end{align}

The probability that a partner selected according to proportionate mixing belongs to the low-activity group, $p_L$, is then:

\begin{align}
p_L &= 1 - p_H \\
    &= 0.61
\end{align}
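The two mixing probabilities can be reproduced numerically. This is a minimal sketch; `prop_high`, `partners_H` and `partners_L` are names introduced here for illustration.

```r
# Partner change rates (per year) and proportion in the high-activity group
c_H <- 29
c_L <- 1.3
prop_high <- 0.028

# Share of all partnerships "offered" by each group, per capita of the total population
partners_H <- c_H * prop_high
partners_L <- c_L * (1 - prop_high)

# Probability that a randomly selected partner belongs to each group
p_H <- partners_H / (partners_H + partners_L)
p_L <- 1 - p_H

round(c(p_H = p_H, p_L = p_L), 2)   # 0.39 and 0.61
```

Although only 2.8% of the population is in the high-activity group, about 39% of partnerships involve a high-activity partner under this assumption.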

The forces of infection acting on susceptibles in the high- and low-activity groups are:

\begin{align}
\lambda_H &= b \times c_H \times p_H \times \frac{I_H(t)}{N_H} + b \times c_H \times p_L \times \frac{I_L(t)}{N_L}\\
\lambda_L &= b \times c_L \times p_H \times \frac{I_H(t)}{N_H} + b \times c_L \times p_L \times \frac{I_L(t)}{N_L}
\end{align}

where $b$ is the transmission probability per partnership, $c_H$ is the mean rate of partner change per unit time for high-activity individuals and $c_L$ is the mean rate of partner change per unit time for low-activity individuals. $\frac{I_H(t)}{N_H}$ and $\frac{I_L(t)}{N_L}$ are the prevalence of infection in the high- and low-activity groups at time $t$, respectively.
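As a quick sanity check on these expressions, they can be evaluated at the start of the simulation. The worked example below assumes that the 1% initial prevalence refers to infectious individuals, so that $\frac{I_H(0)}{N_H} = \frac{I_L(0)}{N_L} = 0.01$; this is an assumption made here for illustration only.

\begin{align}
\lambda_H(0) &= b \times c_H \times 0.01 \times (p_H + p_L) = 0.35 \times 29 \times 0.01 = 0.1015 \\
\lambda_L(0) &= b \times c_L \times 0.01 \times (p_H + p_L) = 0.35 \times 1.3 \times 0.01 = 0.00455
\end{align}

Both values are in units of years$^{-1}$. Because $p_H + p_L = 1$, equal prevalence in the two groups means the forces of infection differ only through the partner change rates, so $\lambda_H / \lambda_L = c_H / c_L \approx 22$.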

In the cell below, fill in the differential equations and the given parameter values, then simulate the model for 10 years. Plot the output to confirm syphilis infection reaches endemic equilibrium in this population, then answer the following questions.

```r
# PACKAGES
require(deSolve)
require(reshape2)
require(ggplot2)

# Initial conditions
initial_state_values <- c(SH = #YOUR CODE#, 
                          EH = #YOUR CODE#,
                          IH = #YOUR CODE#,   
                          SL = #YOUR CODE#,  
                          EL = #YOUR CODE#,
                          IL = #YOUR CODE#)

# Parameters
parameters <- c(#YOUR CODE#)  

# Run simulation for 10 years
times <- seq(from = 0, to = 10, by = 0.1)

# MODEL FUNCTION
sti_model <- function(time, state, parameters) {  
  
  with(as.list(c(state, parameters)), {
    
    NH <- SH+EH+IH
    NL <- SL+EL+IL
    
    # Defining the force of infection 
    lambda_H <- #YOUR CODE#
    lambda_L <- #YOUR CODE#
    
    # The differential equations
    dSH <- #YOUR CODE#          
    dEH <- #YOUR CODE#
    dIH <- #YOUR CODE#

    dSL <- #YOUR CODE#       
    dEL <- #YOUR CODE#
    dIL <- #YOUR CODE#
    
    # Output
    return(list(c(dSH, dEH, dIH, dSL, dEL, dIL))) 
  })
}


# MODEL OUTPUT

output <- as.data.frame(ode(y = initial_state_values, 
                            times = times, 
                            func = sti_model,
                            parms = parameters))

# Turn output into long format
output_long <- melt(as.data.frame(output), id = "time") 

# Plot number of people in all compartments over time
ggplot(data = output_long,                                               
       aes(x = time, y = value, colour = variable, group = variable)) +  
  geom_line() +                                                          
  xlab("Time (years)")+                                                   
  ylab("Number of people") +                                
  labs(colour = "Compartment") 
```

### Question: What is the overall equilibrium prevalence (%) of infectious syphilis in this population? What is the prevalence (%) in the high- and low-activity groups, respectively?

The proportion of all new syphilis cases that are generated by individuals in the high-activity group can be calculated using the following formula:

\begin{align}
p(cases_H) = \frac{p_H \times \frac{I_H^{*}}{N_H}}{p_H \times \frac{I_H^{*}}{N_H} + p_L \times\frac{I_L^{*}}{N_L}}
\end{align}

where $\frac{I_H^{*}}{N_H}$ is the equilibrium prevalence in the high-activity group and $\frac{I_L^{*}}{N_L}$ is the equilibrium prevalence in the low-activity group (the general sexually active population).
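One way to wrap this formula is as a small helper function. This is only a sketch: `prop_cases_high`, `prev_H` and `prev_L` are hypothetical names introduced here, and the prevalences in the example call are placeholders rather than model results.

```r
# Proportion of new infections generated by high-activity individuals,
# given the prevalence in each group and the mixing probability p_H
prop_cases_high <- function(prev_H, prev_L, p_H, p_L = 1 - p_H) {
  (p_H * prev_H) / (p_H * prev_H + p_L * prev_L)
}

# Sanity check with placeholder values (NOT model output):
# if prevalence is the same in both groups, the result reduces to p_H
check <- prop_cases_high(prev_H = 0.02, prev_L = 0.02, p_H = 0.39)
check
```

With equal prevalence in both groups the function simply returns $p_H$, which gives a baseline to compare against once the equilibrium prevalences from the model are plugged in.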

### Question: What proportion of new syphilis infections arose from the high-activity group in this model?

```r
### YOUR CODE GOES HERE ###
```

## Effect of heterogeneity in sexual behaviour on syphilis control 

Even though we are assuming that the population we are modelling already has some access to treatment services, we are now interested in investigating the potential impact of a proactive screening and treatment programme for syphilis, and how best to deliver it.

In the model code you developed above, add two additional parameters to represent treatment of infectious individuals in the high-activity group, $\tau_H$, and treatment of infectious individuals in the low-activity group, $\tau_L$. Treatment of infectious syphilis returns individuals to the susceptible compartment. 

Assume that the available resources allow for 125,000 individuals to be screened and treated per year.

### Question: What relative reduction in endemic equilibrium prevalence can be achieved by targeted screening of high-activity individuals, targeted screening of low-activity individuals, and random screening, respectively?

### Question: Which different assumptions could we make about sexual mixing in this population? Are there further heterogeneities with regard to sexual contact you might want to represent in the model?

## References

**A good introduction to the mathematical epidemiology of STIs:**

Boily, M.C. and Mâsse, B., 1997. Mathematical models of disease transmission: a precious tool for the study of sexually transmitted diseases. Canadian journal of public health, 88(4), pp.255-265.

**Further explanations on sexual mixing assumptions:**

Gupta, S., Anderson, R.M. and May, R.M., 1989. Networks of sexual contacts: implications for the pattern of spread of HIV. AIDS (London, England), 3(12), pp.807-817.

**Examples of syphilis modelling studies:**

Garnett, G.P., Aral, S.O., Hoyle, D.V., Cates Jr, W. and Anderson, R.M., 1997. The natural history of syphilis: implications for the transmission dynamics and control of infection. Sexually transmitted diseases, 24(4), pp.185-200.

Pourbohloul, B., Rekart, M.L. and Brunham, R.C., 2003. Impact of mass treatment on syphilis transmission: a mathematical modeling approach. Sexually transmitted diseases, 30(4), pp.297-305.