# Naïve Pooled Tumour Growth model

The tumour growth model presented in [1] is a hierarchical model, which consists of a structural model, a population model and an error model. The idea of this modelling approach is to use a structural model to provide a mechanistic understanding of the tumour growth, while acknowledging the biological differences between individuals with a population model. The error model is necessary to understand deviations of the model predictions from the observations due to uncaptured biological processes or measurement uncertainties.

In this notebook we start with a simplified model structure to challenge the necessity for a hierarchical model structure. In other words, we assume that biological differences between the mouse tumours are negligible, and that the growth dynamics can be captured by a structural model and an error model with only one set of model parameters.

## Structural model

The structural model in [1] for the tumour growth in the absence of treatment is an ordinary differential equation for the tumour volume

\begin{equation*}
\frac{\text{d}V^s_T}{\text{d}t} = \frac{2\lambda _0\lambda _1 V^s_T}{2\lambda _0 V^s_T + \lambda _1},
\end{equation*}

where
- $V^s_T$ is the tumour volume predicted by the structural model in $[\text{cm}^3]$, 
- $\lambda_0$ is the exponential growth rate of the tumour in $[1/\text{day}]$, 
- $\lambda_1$ is the linear growth rate of the tumour in $[\text{cm}^3/\text{day}]$.
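As a minimal sketch of how this equation behaves, the right-hand side can be integrated with a fixed-step fourth-order Runge-Kutta scheme in plain Python. The function names, the step count and the parameter values below are illustrative assumptions, not the implementation or the estimates from [1]; the notebook itself relies on myokit for the numerical solution.

```python
def tumour_growth_rhs(volume, lambda_0, lambda_1):
    """Right-hand side dV/dt of the structural model."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


def integrate_rk4(initial_volume, lambda_0, lambda_1, t_end, n_steps=1000):
    """Integrate the structural model from t=0 to t_end with classical RK4."""
    dt = t_end / n_steps
    volume = initial_volume
    for _ in range(n_steps):
        k1 = tumour_growth_rhs(volume, lambda_0, lambda_1)
        k2 = tumour_growth_rhs(volume + 0.5 * dt * k1, lambda_0, lambda_1)
        k3 = tumour_growth_rhs(volume + 0.5 * dt * k2, lambda_0, lambda_1)
        k4 = tumour_growth_rhs(volume + dt * k3, lambda_0, lambda_1)
        volume += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return volume


# Illustrative values only (not fitted): V_0 = 0.1 cm^3,
# lambda_0 = 0.1 / day, lambda_1 = 0.2 cm^3 / day, simulated for 30 days.
final_volume = integrate_rk4(0.1, 0.1, 0.2, t_end=30)
print(final_volume)
```

A fixed-step scheme is used here only to keep the sketch dependency-free; an adaptive solver is generally the more robust choice when the parameters vary over orders of magnitude during inference.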

The superscript of the volume variable $V^s_T$ indicates that the tumour volume was predicted by the structural model. The distinction between the true tumour volume predictions $V_T$ and the predictions of the structural model $V^s_T$ will become clear once we have introduced the error model. In short, the volume predictions $V_T$ and the structural model predictions $V^s_T$ are connected through the model error $V_T = V^s_T + \varepsilon$.

The structural model makes a number of assumptions and approximations to describe the tumour growth. The most obvious assumption of the model is a transition of the tumour growth from an exponential growth to a linear growth at a characteristic tumour volume

\begin{equation*}
V^c_T = \frac{\lambda_1}{2\lambda _0}.
\end{equation*}

For $V^s_T\ll V^c_T$ the tumour is modelled by an exponential growth, while for $V^s_T\gg V^c_T$ the tumour growth is linear. This tumour growth model was first introduced in [2], and builds on the intuition that in the early stages of the tumour an abundance of oxygen and nutrients leads to a constant doubling time of cancerous cells, and therefore to exponential growth. However, in later stages of the tumour growth oxygen and other nutrients are depleted inside the tumour, and only the cells in the outer 'shell' of the tumour are able to proliferate at the initial rate. While there are ways for the tumour mass to expand on the inside too, by switching the mode of metabolism to glycolysis or by rewiring the blood vessels to improve the oxygen supply, the total growth rate of the tumour should be expected to slow down. Due to the complexity of the process it is not obvious that the growth should change qualitatively from exponential to linear. However, in [2] it was argued that a linear growth phase was observed in experiments for later stages of tumour evolution. By investigating the estimate for $V^c_T$ we will be able to assess this modelling choice directly.
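
The two regimes can be read off from the structural model equation, as the following short derivation sketches. Dividing numerator and denominator of the right-hand side by $2\lambda _0$ and inserting $V^c_T = \lambda _1 / 2\lambda _0$ gives

\begin{equation*}
\frac{\text{d}V^s_T}{\text{d}t} = \frac{\lambda _1 V^s_T}{V^s_T + V^c_T} \approx
\begin{cases}
2\lambda _0 V^s_T, & V^s_T \ll V^c_T,\\
\lambda _1, & V^s_T \gg V^c_T.
\end{cases}
\end{equation*}

Two consequences of this parameterisation are worth keeping in mind when interpreting estimates: the effective rate constant in the exponential regime is $2\lambda _0$, and at $V^s_T = V^c_T$ the growth rate equals $\lambda _1 / 2$, i.e. half of its asymptotic value.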

### Validity of structural model

It is intuitively clear that the validity of the model cannot hold for all values of $V^s_T\in \mathbb{R}_{\geq 0}$ and $t\in\mathbb{R}_{\geq 0}$. For small tumour sizes, where the tumour may only consist of a small number of cancerous cells, it is no longer appropriate to assume a deterministic growth of the tumour. A stochastic model incorporating drift, and in particular a finite probability of extinction, may be more appropriate. In addition, assuming a constant growth rate of the tumour can only be justified if the mean growth behaviour of the cancerous cells does not substantially change over the simulated time period. This assumption will almost surely break down in the infancy of the tumour, where mutations are essential for the selective advantage of the cancerous cells. It may therefore be expected that the proliferation rate of the cancer cells is not constant for small tumour volumes. Both these arguments suggest that the above model should have a lower limit, below which the model loses validity. It is not clear what this limit should be exactly, but we may set it somewhat conservatively to $1 \, \text{mm}^3$. This tumour volume approximately translates to a cell count of $10^7$ cells, for which the deterministic approximation is well justified (assuming an average cell volume of $100\, \mu\text{m}^3$, i.e. length, height and width of a cell between $1\, \mu\text{m}$ and $10\, \mu\text{m}$).
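
The cell count quoted above follows from a simple unit conversion, sketched below. The helper name is hypothetical, and the estimate assumes densely packed cells of identical volume, which is a simplification.

```python
# Unit conversion: 1 mm = 10^3 micrometres, so 1 mm^3 = 10^9 micrometre^3
MICROMETRE3_PER_MM3 = 1e9


def approximate_cell_count(tumour_volume_mm3, cell_volume_um3=100.0):
    """Estimate the number of cells, assuming densely packed equal-sized cells."""
    return tumour_volume_mm3 * MICROMETRE3_PER_MM3 / cell_volume_um3


print(approximate_cell_count(1.0))  # 10000000.0, i.e. 10^7 cells
```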

Similarly, with an average volume of a mouse of the order of $1\, \text{dm}^3$, it may be expected that the growth behaviour of the tumour will significantly change no later than for values of about $10^6\, \text{mm}^3$. We shall therefore limit the applicability of our model to the regime

\begin{equation*}
V^s_T\in [10^{-3}, 10^3] \, \text{cm}^3.
\end{equation*}

The above arguments also suggest that we may want to introduce an upper time limit after which the modelling predictions can no longer be trusted. This time limit approximates the time point when mutations should be expected to alter the speed of the tumour growth, by either changing the metabolism, the blood vessel structure or other properties that may change the proliferation rate. It is not easy to estimate the order of this time point, but it may be approximated by an average mutation rate that can be derived from other studies. For now we will somewhat arbitrarily set the valid time interval for predictions to 

\begin{equation*}
t\in [0, 30] \, \text{day}.
\end{equation*}

This has little biological justification and is simply driven by the fact that the PKPD study contains samples over a range of 30 days. This limit may be challenged at a later stage.
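
Because the text states the validity regime in $\text{cm}^3$ while the implementation below works in $\text{mm}^3$, a small guard with an explicit unit conversion can help avoid mix-ups. This is only a sketch; the constant and function names are illustrative and not part of the project code.

```python
VOLUME_RANGE_CM3 = (1e-3, 1e3)  # from 1 mm^3 up to 10^6 mm^3
TIME_RANGE_DAY = (0.0, 30.0)    # span of the PKPD study


def mm3_to_cm3(volume_mm3):
    """Convert a volume from mm^3 to cm^3 (1 cm^3 = 1000 mm^3)."""
    return volume_mm3 / 1000.0


def in_validity_regime(volume_cm3, time_day):
    """Return True if a prediction lies inside the assumed validity regime."""
    return (VOLUME_RANGE_CM3[0] <= volume_cm3 <= VOLUME_RANGE_CM3[1]
            and TIME_RANGE_DAY[0] <= time_day <= TIME_RANGE_DAY[1])


# A 100 mm^3 tumour at day 10 is inside the regime; a 0.5 mm^3 tumour is not.
print(in_validity_regime(mm3_to_cm3(100.0), 10.0))
print(in_validity_regime(mm3_to_cm3(0.5), 10.0))
```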

## Error model

Generally, the structural model predictions $V^s_T$ should not be expected to capture the observed tumour growth exactly. On the one hand the accuracy of measurements is always bound by the intrinsic uncertainty of the measurement apparatuses and other uncertainties in the measurement process, and on the other hand the structural model should rather be seen as a gross simplification of the true underlying biological processes. As such the many uncaptured subtle processes may lead to random fluctuations around the predictions $V^s_T$. In [1] those fluctuations of the observations $V^{\text{obs}}_T$ from the structural model predictions $V^s_T$ were modelled by the residual error

\begin{equation*}
\varepsilon = V_T - V^s_T = (\sigma _{\text{base}} + \sigma _{\text{rel}} V^s_T)x ,
\end{equation*}

where $\sigma _{\text{base}}$ and $\sigma _{\text{rel}}$ are non-negative constants and $x$ is a standard Gaussian random variable, $x \sim \mathcal{N}(0, 1)$. $V_T$ emulates the behaviour of future measurements and incorporates their randomness due to measurement error and subtle processes that may not be captured by the structural model. So $V^{\text{obs}}_T$ may be interpreted as realisations of the random variable $V_T$. Intuitively, the combined error model is a mixture of a constant Gaussian noise, which formalises the expectation of a base-level noise, and a heteroscedastic noise, which assumes that the error grows in proportion to the predicted volume. Like the structural model, the error model is at this point an assumption that remains to be critically assessed by the end of the analysis.
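
To make the role of the two noise parameters concrete, the sketch below draws realisations of $V_T$ for a fixed structural prediction and compares the empirical spread with $\sigma _{\text{base}} + \sigma _{\text{rel}} V^s_T$. The function name and all numerical values are illustrative assumptions.

```python
import random


def sample_tumour_volume(structural_volume, sigma_base, sigma_rel, rng=random):
    """Draw one realisation of V_T = V^s_T + (sigma_base + sigma_rel * V^s_T) * x."""
    x = rng.gauss(0.0, 1.0)  # standard Gaussian random variable
    return structural_volume + (sigma_base + sigma_rel * structural_volume) * x


rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_tumour_volume(1.0, 0.05, 0.1, rng) for _ in range(10000)]
mean = sum(samples) / len(samples)
std = (sum((s - mean) ** 2 for s in samples) / len(samples)) ** 0.5

# Expect a mean close to 1.0 and a spread close to 0.05 + 0.1 * 1.0 = 0.15
print(mean, std)
```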

It is important to note at this point that all meaningful predictions of the tumour growth model will be made by $V_T$. It is not justified to infer model parameters $\theta $ using the above described tumour growth model structure and strip away the error model for future predictions. One might be tempted to remove the undesired measurement noise in that way, which seems irrelevant for theoretical predictions of the tumour growth. However, neglecting the error model also disregards the biology that is too complex to be captured by the structural model. As long as we cannot distinguish between measurement error and biology, it is not justified to ignore the uncertainty introduced by the error model.

## Summary: Naïve pooled model structure

The naïve pooled model consists of a structural model $V^s_T$ that captures the mechanisms of the tumour growth, and an error model $\varepsilon $ which describes measurement uncertainties and oversimplifications of the structural model. The structural model and error model combined define a distribution of tumour growth curves that may predict tumour growth

\begin{equation*}
    V_T \sim \mathcal{N}(V^s_T, \sigma _{\text{tot}}^2).
\end{equation*}

Here $V^s_T$ is the solution of the structural model, which is parameterised by time $t$ and the structural model parameters $\psi = (V_0, \lambda _0, \lambda _1)$. $V_0$ is the initial tumour volume at $t=0$. $\sigma _{\text{tot}}$ is the standard deviation of the distribution of predicted tumour volumes around $V^s_T$ defined by the error model, $\sigma _{\text{tot}} = \sigma _{\text{base}} + \sigma _{\text{rel}} V^s_T$.

In a more abstract notation we may refer to the naïve pooled model as 

\begin{equation*}
    V_T \sim \mathbb{P}(\cdot | \psi, \theta _V),
\end{equation*}

which makes the parameterisation of the model by $\psi = (V_0, \lambda _0, \lambda _1)$ and $\theta _V = (\sigma _{\text{base}}, \sigma _{\text{rel}})$ explicit.
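
For a set of measurements, the distribution above translates into a log-likelihood of the parameters. The following is a minimal sketch, assuming that the observations are independent given the parameters and that the structural predictions have already been evaluated at the measurement times. The function and argument names are illustrative and not taken from the project code.

```python
import math


def combined_error_log_likelihood(observed, predicted, sigma_base, sigma_rel):
    """Log-likelihood of observations under V_T ~ N(V^s_T, sigma_tot^2)."""
    log_likelihood = 0.0
    for v_obs, v_s in zip(observed, predicted):
        # Standard deviation grows linearly with the structural prediction
        sigma_tot = sigma_base + sigma_rel * v_s
        residual = v_obs - v_s
        log_likelihood += (
            -0.5 * math.log(2 * math.pi)
            - math.log(sigma_tot)
            - 0.5 * (residual / sigma_tot) ** 2
        )
    return log_likelihood


# Illustrative observations and predictions (arbitrary values)
print(combined_error_log_likelihood([1.1, 2.3], [1.0, 2.0], 0.05, 0.1))
```

Note that $\sigma _{\text{tot}}$ depends on $V^s_T$, so the normalisation term $-\log \sigma _{\text{tot}}$ cannot be dropped during inference as it could be for a constant noise level.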

## Implementation of naïve pooled model

We are using [myokit](http://myokit.org/) for the implementation of the structural model. Myokit enables us to solve the structural model ODE with an adaptive numerical solver called CVODE [3]. To implement the error model and perform the inference we are using [pints](https://pints.readthedocs.io/). 

Note that in general the quality of the inference of $\psi $ and $\theta _V$ can be significantly improved when all parameters are appropriately transformed. We will however choose not to transform the parameters at first, to illustrate how the inference may be stabilised with transformations.
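
One common choice for strictly positive parameters is a logarithmic transformation, so that the optimiser works on an unconstrained scale. The sketch below only illustrates the idea with hypothetical helper names; it is not necessarily the transformation used later in this project.

```python
import math


def to_search_space(parameters):
    """Map strictly positive parameters to the unconstrained log scale."""
    return [math.log(p) for p in parameters]


def to_model_space(log_parameters):
    """Map log-scale parameters back to the positive model scale."""
    return [math.exp(q) for q in log_parameters]


def wrap_with_log_transform(function):
    """Return a function of log-parameters that evaluates `function`."""
    def transformed(log_parameters):
        return function(to_model_space(log_parameters))
    return transformed


# Example: any objective defined on positive parameters can be wrapped
objective = wrap_with_log_transform(sum)
print(objective(to_search_space([1.0, 2.0, 3.0])))
```

Wrapping the objective in this way leaves the location of its optimum unchanged. When a transformed density is sampled rather than optimised, a Jacobian correction is required in addition.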

### Implementation of structural model in myokit

We have implemented the structural model in myokit with untransformed parameters in a separate [module](https://github.com/DavAug/ErlotinibGefitinib/blob/master/pkpd/model.py). The structural model can now be created by calling `pkpd.model.create_tumour_growth_model()`.

In [2]:
#
# Implementing the structural model in myokit.
#

from pkpd import model as m


# Create model
model = m.create_tumour_growth_model()

# Print structural model
print(model.code())

[[model]]
# Initial values
central.volume_tumor = 0

[central]
lambda_0 = 0
    in [1/day]
lambda_1 = 1
    in [mm^3/day]
time = 0 bind time
    in [day]
dot(volume_tumor) = 2 * (lambda_0 * (lambda_1 * volume_tumor)) / (2 * (lambda_0 * volume_tumor) + lambda_1)
    in [mm^3]




In [1]:
#
# Wrapping the structural model for pints and inferring the model parameters.
#

import myokit
import numpy as np
import pandas as pd
import pints

from pkpd import model as m


# Create Model
class Model(pints.ForwardModel):
    def __init__(self):
        # Create myokit model
        model = m.create_tumour_growth_model()

        # Create simulator
        self.sim = myokit.Simulation(model)

    def n_parameters(self):
        """
        Number of parameters to fit. Here initial V_T, lambda_0, lambda_1
        """
        return 3

    def n_outputs(self):
        return 1

    def simulate(self, parameters, times):
        # Reset simulation
        self.sim.reset()

        # Sort input parameters
        initial_volume, lambda_0, lambda_1 = parameters

        # Set initial condition
        self.sim.set_state([initial_volume])

        # Set growth constants
        self.sim.set_constant('central.lambda_0', lambda_0)
        self.sim.set_constant('central.lambda_1', lambda_1)

        # Define logged variable
        loggedVariable = 'central.volume_tumor'

        # Simulate the tumour volume at the requested times
        output = self.sim.run(times[-1] + 1, log=[loggedVariable], log_times=times)
        result = output[loggedVariable]

        return np.array(result)

# Create inverse problem for each mouse ID
# NOTE: `lxf_data` is assumed to be a pandas DataFrame with the tumour growth
# measurements that was loaded beforehand (loading step not shown here).
mouse_ids = lxf_data['#ID'].unique()
log_likelihoods = []
for mouse_id in mouse_ids:
    # Mask dataframe for mouse data
    mouse_mask = lxf_data['#ID'] == mouse_id
    times = lxf_data[mouse_mask]['TIME in day'].to_numpy()
    observed_volumes = lxf_data[mouse_mask]['TUMOUR VOLUME in mm^3'].to_numpy()
    problem = pints.SingleOutputProblem(Model(), times, observed_volumes)
    log_likelihoods.append(pints.GaussianLogLikelihood(problem))

# Create one log_likelihood for the inference from the individual problems
log_likelihood = pints.SumOfIndependentLogPDFs(log_likelihoods)

# Log priors (Just a quick choice)
log_prior_volume = pints.GaussianLogPrior(mean=100, sd=100)  # Not log-transformed yet!
log_prior_lambda_0 = pints.HalfCauchyLogPrior(location=0, scale=20)
log_prior_lambda_1 = pints.HalfCauchyLogPrior(location=0, scale=20)
log_prior_std = pints.HalfCauchyLogPrior(location=0, scale=20)

# Create one log_prior
log_prior = pints.ComposedLogPrior(log_prior_volume, log_prior_lambda_0, log_prior_lambda_1, log_prior_std)

# Create posterior
log_posterior = pints.LogPosterior(log_likelihood, log_prior)

# initial guess of parameters
parameters = [1, 1, 1, 1]

# choose optimisation method
optimiser = pints.CMAES

# create optimisation object
opt = pints.OptimisationController(
    function=log_posterior,
    x0=parameters,
    method=optimiser)

# estimate parameters
estimates, _ = opt.run()

print(' ')
print('Estimates: ')
print('Initial tumour volume [mm^3]: ', estimates[0])
print('Exponential growth rate lambda_0 [1/day]: ', estimates[1])
print('Linear growth rate lambda_1 [mm^3/day]: ', estimates[2])
print('Standard-deviation of base-level noise [mm^3]: ', estimates[3])


## Prior selection

In this project, we will follow a Bayesian inference scheme. As a result, we need to specify priors for the parameters $\psi $ and $\theta _V$.

## Bibliography

- <a name="ref1"> [1] </a> Eigenmann et al., Combining Nonclinical Experiments with Translational PKPD Modeling to Differentiate Erlotinib and Gefitinib, Mol Cancer Ther (2016)
- <a name="ref2"> [2] </a> Koch et al., Modeling of tumor growth and anticancer effects of combination therapy, Journal of Pharmacokinetics and Pharmacodynamics (2009)
- <a name="ref3"> [3] </a> Hindmarsh, Brown, Woodward, et al., SUNDIALS: Suite of nonlinear and differential/algebraic equation solvers, ACM Transactions on Mathematical Software (2005)

[Back to project overview](https://github.com/DavAug/ErlotinibGefitinib/blob/master/README.md)