# Identifiability of Structural Model for Lung Cancer Growth in Absence of Treatment

The tumour growth inhibitory effects of Erlotinib and Gefitinib were modelled with a population PKPD model in [1]. A population PKPD model is a hierarchical model which consists of a structural model, a population model, and an error model. Each sub-model captures a different aspect of the tumour growth inhibition biology, and is parametrised by a set of parameters. 

In this notebook we explore the identifiability of the structural model for the untreated tumour growth of LXF A677 explants implanted in mice [1]. In particular, we investigate the effects of log-transformations and non-dimensionalisation of the model parameters on the stability of the optimisation. 

## Structural growth model in absence of treatment

In a PKPD model the structural model is a mechanistic or empirical description of the drug's pharmacokinetics (PK) and pharmacodynamics (PD) for an individual. The PK model describes how the drug is distributed in the body upon administration to that individual, which is often referred to as "what the body does to the drug". The PD model captures the drug's effects on the progression of the disease, or "what the drug does to the body". 'Progression of the disease' may be quantified by any disease-related observable, such as biomarkers or, in this case, the tumour volume. A PD model therefore needs to capture not only the disease progression of an individual *under the influence* of a compound, but also its progression in the *absence* of any treatment.

In [1] a structural model is proposed for the tumour growth in absence of the compound, which assumes that the tumour grows exponentially for small tumour volumes, and linearly for large tumour volumes:

\begin{equation*}
    \frac{\text{d}V^s_T}{\text{d}t} = \frac{2\lambda _0\lambda _1 V^s_T}{2\lambda _0 V^s_T + \lambda _1}.
\end{equation*}

Here, $V^s_T$ is the predicted tumour volume by the structural model, $\lambda _0$ is the exponential growth rate, and $\lambda _1$ is the linear growth rate. The tumour growth of an individual in absence of the drug is thus parameterised by three parameters

\begin{equation*}
    \psi = (V_0, \lambda _0, \lambda _1),
\end{equation*}

where $V_0$ is the initial tumour volume, $V^s_T(t=0, \psi) = V_0$.
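
The two growth regimes can be checked numerically by evaluating the right-hand side of the growth equation for a very small and a very large tumour volume. The following is a minimal sketch; the function name `growth_rate` and the parameter values are illustrative assumptions and not estimates from [1]. With the equation as written above, the small-volume limit is $\text{d}V^s_T/\text{d}t \approx 2\lambda _0 V^s_T$ and the large-volume limit is $\text{d}V^s_T/\text{d}t \approx \lambda _1$.

```python
def growth_rate(volume, lambda_0, lambda_1):
    """Return dV/dt of the structural growth model for a given tumour volume."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


# Illustrative parameter values (assumptions, not estimates from [1])
lambda_0 = 0.1  # in 1/day
lambda_1 = 1.0  # in cm^3/day

# Small volumes: 2 * lambda_0 * V << lambda_1, so dV/dt is approximately 2 * lambda_0 * V
small_volume = 1e-6
relative_rate_small = growth_rate(small_volume, lambda_0, lambda_1) / small_volume

# Large volumes: 2 * lambda_0 * V >> lambda_1, so dV/dt is approximately lambda_1
large_volume = 1e6
rate_large = growth_rate(large_volume, lambda_0, lambda_1)

print("Relative growth rate for small volume:", relative_rate_small)
print("Growth rate for large volume:", rate_large)
```

The transition between the two regimes occurs around the volume at which $2\lambda _0 V^s_T$ and $\lambda _1$ are of comparable size.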

## Identifiability

The identifiability of a parametric model addresses the question of whether a unique set of model parameters $\psi $ that most closely captures the data can be inferred from that data, see e.g. [2, 3, 4]. In particular, under the assumption that the model could have generated the data with a given set of model parameters, identifiability addresses the question of whether those *true* parameters can be recovered from the data. 

In the PKPD modelling context, the identifiability of the structural model is of particular importance, as parameter values can often be biologically interpreted. This allows us, for example, to characterise biochemical properties of the compound, or to translate the model from preclinical to clinical application. However, for those biological interpretations to be meaningful, we need to make sure that the values tightly couple the observed data and the proposed structural model. If multiple parameter sets exist that capture the observed behaviour equally well, it is hard to imagine that the parameter values carry significant biological meaning.

In this study, we are interested in learning the structural model parameters $\psi $ of the tumour growth in absence of the compound from an *in vivo* experiment reported in [1]. Here, patient-derived tumour explants LXF A677 (adenocarcinoma of the lung) were implanted in mice. The tumour growth was monitored over a period of 30 days, see the [Lung Cancer Tumour Growth in Absence of Treatment](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/lung_cancer/control_growth/data_preparation.ipynb) notebook for details.

In [1]:
#
# Visualise control growth data.
#

import os

import pandas as pd

import pkpd.plots


# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')

# Create scatter plot
fig = pkpd.plots.plot_measurements(data)

# Show figure
fig.show()

**Figure 1 - Untreated tumour growth:** Untreated tumour growth of patient-derived tumour explants LXF A677 (adenocarcinoma of the lung) implanted in mice. The colouring of the data points indicates the identity of the mice. The evolution of the body weight can be explored by using the buttons in the top right.

## Naïve optimisation of model parameters

Let us begin to explore the identifiability of the model by attempting to find an optimal set of model parameters $\psi $ for each mouse. Typically such a set of optimal parameters can be found by defining an objective function and an optimisation algorithm. The objective function $L(\psi | V^{\text{obs}}_{T})$ quantifies how close the model predictions $V^s_T(t_i, \psi)$ for a given set of model parameters $\psi $ are to the observed data $V^{\text{obs}}_{T, i}$. There are many choices for objective functions. We somewhat arbitrarily choose the Squared Distance Error Measure

\begin{equation*}
    L(\psi | V^\text{obs}_{T}) = \sum ^{n}_{i=1}\left( V^\text{obs}_{T, i} - V^s_T(t_i, \psi)\right) ^2,
\end{equation*}

where $n$ is the number of measurements, $V^\text{obs}_{T, i}$ is the measured tumour volume at time $t_i$, and $V^s_T(t_i, \psi)$ is the model prediction at time $t_i$ for model parameters $\psi $. This objective function is greater than or equal to zero for all $\psi $, and only vanishes for a model parameter set $\hat \psi $ for which the model exactly reproduces the observations. We would like to find this parameter set $\hat \psi $.
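
As a minimal sketch of this objective function, the sum of squared differences can be written in a few lines of plain Python. The function name `squared_distance` and the tumour volumes below are made up for illustration; the notebook itself uses `pints.SumOfSquaresError` for this purpose.

```python
def squared_distance(observed_volumes, predicted_volumes):
    """Sum of squared differences between observations and model predictions."""
    if len(observed_volumes) != len(predicted_volumes):
        raise ValueError("Observations and predictions must have the same length.")
    return sum(
        (obs - pred) ** 2
        for obs, pred in zip(observed_volumes, predicted_volumes))


# Made-up tumour volumes in cm^3, for illustration only
observed = [0.10, 0.15, 0.24, 0.35]
perfect_prediction = [0.10, 0.15, 0.24, 0.35]
imperfect_prediction = [0.10, 0.20, 0.30, 0.40]

# The score vanishes only if the predictions reproduce the observations exactly
score_perfect = squared_distance(observed, perfect_prediction)
score_imperfect = squared_distance(observed, imperfect_prediction)

print("Score of perfect prediction:", score_perfect)
print("Score of imperfect prediction:", score_imperfect)
```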

For real measurements, the measurement noise alone will make it improbable that the model will be able to reproduce the observations exactly, even if the structural model did capture the true underlying biological process. As a result, we are forced to weaken our condition for an optimal parameter set and look for the set of parameters $\hat \psi $ that globally minimises the objective function $L$. Note that this step introduces uncertainty into the modelling process, not only because we can no longer decide whether a given set of parameters is optimal by comparing its objective function score to zero, but also because it's much harder to decide whether the deviations from the observations are due to noise or due to the wrong modelling choice (more about model selection in following notebooks).

There are many algorithms that are designed to minimise an objective function. One of those algorithms is the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) optimiser [5]. We will choose to minimise the objective function $L$ with this algorithm, but other choices are equally valid. From a naïve perspective, we can choose any initial starting point $\psi _0 $ for the optimisation algorithm, as the true global minimum is independent of the method we use to find it. We therefore choose to start the optimisation somewhat arbitrarily at $\psi _0 = (1, 1, 1)$.

In [2]:
#
# Create structural growth model
#

import pkpd

# Create model
model = pkpd.TumourGrowthModel()

# Show model
print(model)

[[model]]
# Initial values
central.volume_t = 0

[central]
lambda_0 = 0
    in [1/day]
lambda_1 = 1
    in [cm^3/day]
time = 0 bind time
    in [day]
dot(volume_t) = 2 * (lambda_0 * (lambda_1 * volume_t)) / (2 * (lambda_0 * volume_t) + lambda_1)
    in [cm^3]




In [3]:
#
# Naive attempt to find an optimal parameter set psi for each mouse in the LXF A677 population.
#

import os

import numpy as np
import pandas as pd
import pints
from tqdm.notebook import tqdm

import pkpd


# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_parameters)
n_parameters = 3
mouse_parameters = np.empty(shape=(n_mice, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores = np.empty(shape=n_mice)

# Define "arbitrary" starting point for the optimisations
initial_parameters = [1, 1, 1]

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(tqdm(mouse_ids)):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy()

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy()

    # Create inverse problem
    problem = pints.SingleOutputProblem(pkpd.TumourGrowthModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create optimisation controller with a CMA-ES optimiser
    optimiser = pints.OptimisationController(
        function=error,
        x0=initial_parameters,
        method=pints.CMAES)

    # Disable logging mode
    optimiser.set_log_to_screen(False)

    # Parallelise optimisation
    optimiser.set_parallel(True)

    # Find optimal parameters
    estimates, score = optimiser.run()

    # Save estimates and score
    mouse_parameters[index, :] = estimates
    mouse_scores[index] = score


In [3]:
#
# Solve structural model for inferred model parameters.
#
# This cell needs the inferred model parameters:
# [mouse_parameters]
#

import numpy as np

import pkpd


# Create tumour growth model
model = pkpd.TumourGrowthModel()

# Define simulation time points in day
times = np.linspace(start=0, stop=30, num=200)
n_times = len(times)

# Create container for simulated tumour growth
# Shape (n_mice, n_times)
n_mice = len(mouse_parameters)
tumour_growth = np.empty(shape=(n_mice, n_times))

# Solve structural model for LXF A677 population
for mouse_id, mouse_params in enumerate(mouse_parameters):
    # Simulate mouse tumour growth
    tumour_growth[mouse_id, :] = model.simulate(parameters=mouse_params, times=times)


In [36]:
#
# Visualise simulated tumour growth together with control growth data.
#
# This cell needs the optimised model parameters, the simulated tumour growth and the control growth data:
# [mouse_parameters, times, tumour_growth, data]
#

import pandas as pd
import plotly.colors
import plotly.graph_objects as go


# Get number of individual mice
n_mice = len(tumour_growth)

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_mice]

# Create figure
fig = go.Figure()

# Scatter plot LXF A677 time-series data for each mouse
mouse_ids = data['#ID'].unique()
for index, id_m in enumerate(mouse_ids):
    # Create mask for mouse
    mask = data['#ID'] == id_m

    # Get time points for mouse
    observed_times = data['TIME in day'][mask]

    # Get observed tumour volumes for mouse
    observed_volumes = data['TUMOUR VOLUME in cm^3'][mask]

    # Get simulated tumour volumes for mouse
    simulated_volumes = tumour_growth[index, :]

    # Get mouse parameters
    params = mouse_parameters[index, :]

    # Plot data
    fig.add_trace(go.Scatter(
        x=observed_times,
        y=observed_volumes,
        legendgroup="ID: %d" % id_m,
        name="ID: %d" % id_m,
        showlegend=True,
        hovertemplate=
            "<b>Measurement </b><br>" +
            "ID: %d<br>" % id_m +
            "Time: %{x:} day<br>" +
            "Tumour volume: %{y:.02f} cm^3<br>" +
            "Cancer type: LXF A677<br>" +
            "<extra></extra>",
        mode="markers",
        marker=dict(
            symbol='circle',
            opacity=0.7,
            line=dict(color='black', width=1),
            color=colors[index])
    ))

    # Plot simulated growth
    fig.add_trace(go.Scatter(
        x=times,
        y=simulated_volumes,
        legendgroup="ID: %d" % id_m,
        name="ID: %d" % id_m,
        showlegend=False,
        hovertemplate=
            "<b>Simulation </b><br>" +
            "ID: %d<br>" % id_m +
            "Time: %{x:.0f} day<br>" +
            "Tumour volume: %{y:.02f} cm^3<br>" +
            "Cancer type: LXF A677<br>" +
            "<br>" +
            "<b>Parameter estimates </b><br>" +
            "Initial tumour volume: %.02f cm^3<br>" % params[0] +
            "Expon. growth rate: %.02f 1/day<br>" % params[1] +
            "Lin. growth rate: %.02f cm^3/day<br>" % params[2] +
            "<extra></extra>",
        mode="lines",
        line=dict(color=colors[index])
    ))

# Set X, Y axis and figure size
fig.update_layout(
    autosize=True,
    xaxis_title=r'$\text{Time in day}$',
    yaxis_title=r'$\text{Tumour volume in cm}^3$',
    template="plotly_white")

# Add switch between linear and log y-scale
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "left",
            buttons=list([
                dict(
                    args=[{"yaxis.type": "linear"}],
                    label="Linear y-scale",
                    method="relayout"
                ),
                dict(
                    args=[{"yaxis.type": "log"}],
                    label="Log y-scale",
                    method="relayout"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.15,
            yanchor="top"
        ),
    ]
)

# Show figure
fig.show()

**Figure 2 - Naïve fits:** Naïve optimisation of the structural model starting from an initial point $\psi _0=(1, 1, 1)$ in parameter space for each mouse. We used the CMA-ES optimiser to minimise the squared distance between the model predictions and the observations.

## Stability of optimisation

Minimising the squared distance between the structural model predictions and the observations using the CMA-ES optimiser results in more or less(!) good fits of the data. In particular, each fit corresponds to a mouse-specific set of model parameters $\hat \psi$, which in theory globally minimise the associated objective function $L$. But how can we be sure that those estimates $\hat \psi$ truly minimise the objective function?

While a numerical method will never be able to globally explore a continuous space of model parameters, we can check that the optimisation at least reliably returns the same set of model parameters for each mouse. In other words, are the $\hat \psi $ consistent across multiple optimisation runs? If all runs return the same parameters for a mouse, we can be more confident that the parameter estimates (in combination with the proposed model) do capture its biology (assuming that there are no systematic problems with the objective function or the optimiser). 

To test the stability of the CMA-ES optimiser, we choose different starting points for a number of optimisations (uniform in $[10^{-3}, 10^3]$ for each parameter). Ideally we would like to see that each optimisation returns very similar model parameters for a mouse. This would give us confidence to interpret those model parameters in a biological context.
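
It is worth keeping in mind what uniform sampling over $[10^{-3}, 10^3]$ implies for the starting points: almost all samples are larger than 1, so the small-parameter region is rarely used as a starting point. The sketch below compares this with log-uniform sampling, which is shown here only as an assumed alternative for comparison and is not what this notebook uses.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

lower, upper = 1e-3, 1e3
n_samples = 10000

# Uniform sampling, as used for the starting points in this notebook
uniform_samples = [random.uniform(lower, upper) for _ in range(n_samples)]

# Log-uniform sampling (an alternative, not used in this notebook):
# draw the exponent uniformly and exponentiate
log_uniform_samples = [
    10 ** random.uniform(math.log10(lower), math.log10(upper))
    for _ in range(n_samples)]

# Fraction of starting points below 1 for both sampling schemes
fraction_uniform_below_one = sum(x < 1 for x in uniform_samples) / n_samples
fraction_log_uniform_below_one = sum(x < 1 for x in log_uniform_samples) / n_samples

print("Uniform, fraction below 1:", fraction_uniform_below_one)
print("Log-uniform, fraction below 1:", fraction_log_uniform_below_one)
```

Under uniform sampling, only about one in a thousand starting points is expected to fall below 1, whereas log-uniform sampling places roughly half of them there.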

In [3]:
#
# Run optimisation multiple times from random initial starting points.
#
# This cell uses the structural model from the pkpd package:
# [pkpd.TumourGrowthModel]
#

import os

import numpy as np
import pandas as pd
import pints
from tqdm.notebook import tqdm

import pkpd


# Define number of optimisation runs for each individual
n_runs = 10

# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
mouse_parameters_multi_runs = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores_multi_runs = np.empty(shape=(n_mice, n_runs))

# Define random starting points over many orders of magnitude
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(tqdm(mouse_ids)):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy()

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy()

    # Create inverse problem (same structural model as in the naive optimisation above)
    problem = pints.SingleOutputProblem(pkpd.TumourGrowthModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=initial_params,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score
        mouse_parameters_multi_runs[index, run_id, :] = estimates
        mouse_scores_multi_runs[index, run_id] = score

----------------------------------------
Unexpected termination.
Current best score: 355.8225057866121
Current best position:
 5.77505038035620366e+00
 2.37741144625073872e+02
 1.06447912564149871e-01
----------------------------------------



----------------------------------------
Unexpected termination.
Current best score: 1.1143903892920437
Current best position:
-1.98193859211602985e-01
 6.85203818703210459e+02
 6.01382961286088605e-02
----------------------------------------

----------------------------------------
Unexpected termination.
Current best score: 1.1204039287515835
Current best position:
-1.83690385962424457e-01
 1.15435668722883790e+02
 5.87309614935655616e-02
----------------------------------------



----------------------------------------
Unexpected termination.
Current best score: 6567.761433282033
Current best position:
 8.11486075169024161e+01
-2.44603126403854219e+02
-2.37478826592146220e+02
----------------------------------------



The "Unexpected termination" message indicates that we ran into optimisation errors for at least one run. Let us explore the parameters from the successful optimisations below.

In [5]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above optimised model parameters and their respective objective function scores, as well as the data
# [mouse_parameters_multi_runs, mouse_scores_multi_runs, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = mouse_parameters_multi_runs.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameter sets
optimised_parameters =  mouse_parameters_multi_runs

# Get scores for parameters
scores = mouse_scores_multi_runs

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white",
    yaxis_title="Estimates")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=[
                dict(
                    # Show only the four traces (3 parameters + score) of mouse i
                    args=[{"visible": (
                        [False]*(4 * i) + [True]*4 + [False]*(4 * (len(mouse_ids) - 1 - i)))}],
                    label="ID: %d" % id_m,
                    method="restyle"
                )
                for i, id_m in enumerate(mouse_ids)
            ],
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


**Figure 3 - Naïve optimisation:** Box plot of the structural model parameter estimates $\hat \psi $ (initial tumour volume $V_0$, exponential growth rate $\lambda _0$, linear growth rate $\lambda _1$) found by naïve optimisation of the squared distance between the predictions and the observations using a CMA-ES optimiser. For each individual the optimisation routine was run 10 times from a uniformly sampled starting point in $[10^{-3}, 10^3]$. In addition to the optimised parameters, also the distribution of the associated objective function scores is presented.

Running the same optimisation routine multiple times from varying starting points reveals that the CMA-ES optimiser indeed does not return the same "optimal" set of model parameters every time, see Figure 3. In particular, the estimates for the exponential growth rate $\lambda _0$ and the linear growth rate $\lambda _1$ appear to vary substantially between optimisations for all individuals. This exercise highlights the importance of running optimisation routines multiple times to test their stability.

The reasons why numerical optimisers, such as CMA-ES, may fail to produce the same estimates when running the optimisation routine multiple times from different starting points can be grouped into two categories: limitations of the optimisation algorithm, and non-identifiability of the model.

No numerical ("black box") optimiser can guarantee that it finds the global minimum of an objective function. That is because such optimisers try to find the smallest value of an objective function by repeatedly evaluating it at different points until an optimiser-specific convergence criterion is met. The point with the smallest objective function value is then returned as the minimum of the objective function. This illustrates a generic problem of numerical optimisation: no numerical optimiser explores the search space globally (this is numerically infeasible), but instead relies on good heuristics to suggest points for evaluation and to determine convergence. As a result, the performance of an optimiser strongly depends on whether those strategies are well suited for the objective function at hand. Fortunately, there are a number of strategies that tend to improve the performance of optimisers generically, such as limiting the search space, non-dimensionalising the model, and log-transforming the model parameters. For more details on the pitfalls of optimisation routines, please have a look at the dedicated notebook, [Numerical Optimisation and its Pitfalls](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/inference_pitfalls/optimisation_pitfalls.ipynb).

In addition, a model may not have a unique minimum in the search space. Such a model is said to be non-identifiable [6]. This occurs when either the model is redundantly parameterised (structural non-identifiability), or the predictions of the observed behaviour are not sensitive to a subset of the model parameters (practical non-identifiability). For more details on structural or practical identifiability, please have a look at the dedicated notebook, [Structural and Practical Identifiability Explained](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/inference_pitfalls/identifiability.ipynb).
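
To illustrate what a lack of sensitivity can look like for the growth model considered here, the sketch below simulates the tumour volume after 30 days for different parameter values. It is a minimal sketch under stated assumptions: the fixed-step Runge-Kutta (RK4) integrator is a simple stand-in for the solver behind `pkpd.TumourGrowthModel`, and all parameter values are illustrative rather than estimates. For these values the tumour remains in the exponential regime, so the prediction barely depends on $\lambda _1$.

```python
def growth_rate(volume, lambda_0, lambda_1):
    """Right-hand side dV/dt of the structural growth model."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


def simulate(initial_volume, lambda_0, lambda_1, end_time=30.0, n_steps=3000):
    """Integrate the growth model with fixed-step RK4 and return V(end_time)."""
    step = end_time / n_steps
    volume = initial_volume
    for _ in range(n_steps):
        k1 = growth_rate(volume, lambda_0, lambda_1)
        k2 = growth_rate(volume + 0.5 * step * k1, lambda_0, lambda_1)
        k3 = growth_rate(volume + 0.5 * step * k2, lambda_0, lambda_1)
        k4 = growth_rate(volume + step * k3, lambda_0, lambda_1)
        volume += step * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return volume


# Illustrative reference parameters (assumptions, not estimates)
v_0, lambda_0, lambda_1 = 0.1, 0.1, 1e3

volume_reference = simulate(v_0, lambda_0, lambda_1)

# Tenfold increase of the linear growth rate
volume_large_lambda_1 = simulate(v_0, lambda_0, 10 * lambda_1)

# Increase of the exponential growth rate by only 10 percent
volume_changed_lambda_0 = simulate(v_0, 1.1 * lambda_0, lambda_1)

# Relative changes of the predicted tumour volume after 30 days
change_lambda_1 = abs(volume_large_lambda_1 - volume_reference) / volume_reference
change_lambda_0 = abs(volume_changed_lambda_0 - volume_reference) / volume_reference

print("Relative change for 10x lambda_1:", change_lambda_1)
print("Relative change for 1.1x lambda_0:", change_lambda_0)
```

In this setting a tenfold change of $\lambda _1$ alters the prediction far less than a 10% change of $\lambda _0$, so data of this kind would constrain $\lambda _1$ only weakly.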

## Limiting the search space

Figure 3 shows that naïvely minimising the squared distance between the model predictions and the observations does not lead to reliable parameter estimates. In particular, extremely large parameter estimates as well as negative values are not biologically feasible, and therefore are almost certainly artefacts of the numerical optimisation procedure. 

In fact, one can show that for large parameter magnitudes the optimiser fails to explore the parameter space meaningfully as differences in the objective function score are beyond numerical floating point accuracy. As a result, the CMA-ES optimiser is no longer able to detect a descent of the objective function in any direction and terminates prematurely. For more details please have a look at the dedicated notebook, [Numerical Optimisation and its Pitfalls](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/inference_pitfalls/optimisation_pitfalls.ipynb). In addition, negative parameter values may locally minimise the objective, but are biologically not feasible and should therefore be excluded from the search space.
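
The loss of resolution at large magnitudes can be illustrated with plain floating point arithmetic. The sketch below is a generic illustration with made-up score values; it is not a reproduction of the objective function scores encountered in the optimisation runs above.

```python
import sys

# A very large objective function score (made-up value for illustration)
large_score = 1e20

# An improvement of the score by 1.0 cannot be represented at this magnitude
improved_score = large_score - 1.0
improvement_is_detected = improved_score < large_score

# The same improvement is easily resolved for scores of moderate magnitude
moderate_score = 1e2
moderate_improvement_is_detected = (moderate_score - 1.0) < moderate_score

print("Machine epsilon:", sys.float_info.epsilon)
print("Improvement detected at 1e20:", improvement_is_detected)
print("Improvement detected at 1e2:", moderate_improvement_is_detected)
```

An optimiser that compares scores of this size sees no difference between neighbouring points, even if one of them is genuinely better.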

As a result, our first measure to stabilise the optimisation will be to limit the search space to biologically feasible parameter values. We choose to limit the model parameters to values in $[10^{-3}, 10^3]$. Those parameter ranges should be sufficient, unless we have chosen vastly inappropriate units for the model parameters.

In [7]:
#
# Run optimisation multiple times from random initial starting points. The search space is limited to [10^-3, 10^3].
#
# This cell uses the structural model from the pkpd package:
# [pkpd.TumourGrowthModel]
#

import os

import numpy as np
import pandas as pd
import pints
from tqdm.notebook import tqdm

import pkpd


# Define number of optimisation runs for each individual
n_runs = 10

# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
mouse_parameters_constrained = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores_constrained = np.empty(shape=(n_mice, n_runs))

# Define random starting points over many orders of magnitude
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(tqdm(mouse_ids)):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy()

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy()

    # Create inverse problem (same structural model as in the naive optimisation above)
    problem = pints.SingleOutputProblem(pkpd.TumourGrowthModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create boundaries to biologically relevant values
    boundary = pints.RectangularBoundaries(lower=[1E-3]*3, upper=[1E3]*3)

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=initial_params,
            boundaries=boundary,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score
        mouse_parameters_constrained[index, run_id, :] = estimates
        mouse_scores_constrained[index, run_id] = score





In [8]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above optimised model parameters and their respective objective function scores, as well as the data
# [mouse_parameters_constrained, mouse_scores_constrained, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = mouse_parameters_constrained.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameter sets
optimised_parameters =  mouse_parameters_constrained

# Get scores for parameters
scores = mouse_scores_constrained

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white",
    yaxis_title="Estimates")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


**Figure 4 - Constrained search space:** Scatter and box plot of the structural model parameter estimates $\hat \psi $ (initial tumour volume $V_0$, exponential growth rate $\lambda _0$, linear growth rate $\lambda _1$) found by minimising the squared distance between the predictions and the observations using a CMA-ES optimiser. For each individual the optimisation routine was run 10 times from a uniformly sampled starting point in $[10^{-3}, 10^3]$. The search space was constrained to values within a biologically realistic range of $[10^{-3}, 10^3]$ for all parameters. The distribution of the associated objective function scores is also shown.

Constraining the search space for the optimisation leads to a significantly improved stability of the optimisation. However, for some mice the spread of the optimised model parameters still extends over the entire allowed parameter range.
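The spread across runs can also be summarised numerically rather than read off the box plots. The following is a minimal sketch, assuming the estimates are stored in an array of shape `(n_mice, n_runs, n_parameters)` as in the cells above. The helper `spread_in_decades` and the example values are hypothetical and not part of the original analysis.

```python
import numpy as np


def spread_in_decades(estimates):
    """
    Returns the spread of the estimates across runs in orders of magnitude.

    Expects an array of shape (n_mice, n_runs, n_parameters) with positive
    entries. Failed runs marked with NaN are ignored.
    """
    log_estimates = np.log10(estimates)
    highest = np.nanmax(log_estimates, axis=1)
    lowest = np.nanmin(log_estimates, axis=1)

    return highest - lowest


# Hypothetical estimates for 2 mice, 3 runs and 3 parameters
example = np.array([
    [[0.1, 0.2, 1.0], [0.1, 0.2, 1.0], [0.1, 0.2, 1.0]],
    [[0.1, 1E-3, 1.0], [0.1, 1E3, 1.0], [np.nan, np.nan, np.nan]]])

# Shape (n_mice, n_parameters)
spread = spread_in_decades(example)
print(spread)
```

A spread close to zero indicates that all runs converged to the same estimate, while a spread of six decades corresponds to estimates at both ends of the allowed range $[10^{-3}, 10^3]$.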

## Non-dimensionalisation of the model

The reason why constraining the parameter space is not sufficient to remove the instabilities of the optimisation is that the structural model introduces apparent practical non-identifiabilities for vastly different growth parameters, $2\lambda _0\ll \lambda _1$ or $2\lambda _0\gg \lambda _1$. This becomes most obvious in the limit where the parameter ratio $V_{\text{crit}} = \lambda _1 / 2\lambda _0$ goes to infinity or zero, e.g.

\begin{equation*}
    \lim _{V_{\text{crit}}\rightarrow 0} \frac{\text{d}V^s_T}{\text{d}t} = \lambda _1.
\end{equation*}

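In this limit the exponential growth rate $\lambda _0$ becomes non-identifiable. But already for finite, small critical volumes $V_{\text{crit}}\ll V^s_T$ the differences in the predicted tumour growth for different values of $\lambda _0$ become very small. For any small but finite $V_{\text{crit}}$ the growth rate $\lambda _0$ is therefore theoretically identifiable, but in practice not with numerical methods.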
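The loss of sensitivity to $\lambda _0$ for small critical volumes can be checked directly on the right hand side of the structural model. The sketch below uses illustrative parameter values, which are assumptions and not estimates from the data.

```python
def growth_rate(volume, lambda_0, lambda_1):
    """Right hand side of the structural tumour growth model."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


# Illustrative values (assumptions, not estimates from the data)
volume = 1.0    # in cm^3
lambda_1 = 0.1  # in cm^3/day

# Both choices of lambda_0 lead to V_crit = lambda_1 / (2 lambda_0) << volume
rate_a = growth_rate(volume, lambda_0=1E2, lambda_1=lambda_1)
rate_b = growth_rate(volume, lambda_0=1E3, lambda_1=lambda_1)

# Relative change of the growth rate for a tenfold change of lambda_0
relative_difference = abs(rate_a - rate_b) / rate_b
print(rate_a, rate_b, relative_difference)
```

For these values a tenfold change of $\lambda _0$ changes the growth rate by well below one percent, and both rates are close to $\lambda _1$.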

One of the simplest strategies to avoid a structural non-identifiability of a model is to transform the model parameters $\psi $ from a biologically meaningful set of parameters $(V_0, \lambda _0, \lambda _1)$ to a set of dimensionless model parameters. This can be done by introducing characteristic scales of the problem, such as a characteristic tumour volume $V^c_T$ and a characteristic time $t^c$. If we express the structural predictions of the tumour volume in units of the characteristic tumour volume $v = V^s_T / V^c_T$ and the time in units of the characteristic time $\tau = t / t^c $, we can rewrite the structural model in a nondimensional form

\begin{equation*}
   \frac{\text{d}v}{\text{d}\tau} = \frac{a_1 v}{v + a_0},
\end{equation*}

where $a_0 = \lambda _1 / (2 \lambda _0 V^c_T)$ and $a_1 = \lambda _1 t^c / V^c_T$. In other words, we used the characteristic volume and time scales as units for the model parameters

\begin{equation*}
   (V_0, \lambda _0, \lambda _1) = \left( v_0\, V^c_T, \frac{a_1}{2a_0}\frac{1}{t^c}, a_1 \frac{V^c_T}{t^c} \right) .
\end{equation*}

Here, $(v_0, a_0, a_1)$ take the role of the model parameters. Arguably for this biological process a characteristic volume scale is $1\, \text{cm}^3$ and a characteristic time scale is $1\, \text{day}$.
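The mapping between the two parametrisations can be written as a pair of functions. This is a minimal sketch of the relations above. The function names `to_dimensionless` and `to_dimensional` as well as the example values are hypothetical.

```python
import numpy as np


def to_dimensionless(parameters, v_c=1.0, t_c=1.0):
    """Maps (V_0, lambda_0, lambda_1) to (v_0, a_0, a_1)."""
    initial_volume, lambda_0, lambda_1 = parameters
    v_0 = initial_volume / v_c
    a_0 = lambda_1 / (2 * lambda_0 * v_c)
    a_1 = lambda_1 * t_c / v_c

    return np.array([v_0, a_0, a_1])


def to_dimensional(parameters, v_c=1.0, t_c=1.0):
    """Maps (v_0, a_0, a_1) back to (V_0, lambda_0, lambda_1)."""
    v_0, a_0, a_1 = parameters
    initial_volume = v_0 * v_c
    lambda_0 = a_1 / (2 * a_0 * t_c)
    lambda_1 = a_1 * v_c / t_c

    return np.array([initial_volume, lambda_0, lambda_1])


# Hypothetical parameters and characteristic scales
psi = np.array([0.2, 0.1, 0.5])  # V_0 in cm^3, lambda_0 in 1/day, lambda_1 in cm^3/day
dimless = to_dimensionless(psi, v_c=2.0, t_c=7.0)

# Transforming back recovers the original parameters
round_trip = to_dimensional(dimless, v_c=2.0, t_c=7.0)
print(dimless, round_trip)
```

With $V^c_T = 1\, \text{cm}^3$ and $t^c = 1\, \text{day}$ the transformation leaves $V_0$ and $\lambda _1$ numerically unchanged and only replaces $\lambda _0$ by the ratio $a_0 = \lambda _1 / 2\lambda _0$.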


In [11]:
#
# Create structural model.
#

import pkpd.model as m


# Create model
model = m.create_dimless_tumour_growth_model()

# Show model
print(model.code())

[[model]]
# Initial values
central.volume_t = 0

[central]
a_0 = 1
    in [1]
a_1 = 0
    in [1]
time = 0 bind time
    in [1]
dot(volume_t) = a_1 * volume_t / (volume_t + a_0)
    in [1]




In [12]:
#
# Define pints model wrapper such that myokit model can be used for inference.
#

import myokit
import numpy as np
import pints

from pkpd import model as m


# Wrap myokit model, so it can be used with pints
class DimensionlessPintsModel(pints.ForwardModel):
    def __init__(self):
        # Create myokit model
        model = m.create_dimless_tumour_growth_model()

        # Create simulator
        self.sim = myokit.Simulation(model)

    def n_parameters(self):
        """
        Number of parameters to fit. Here initial v, a_0, a_1.
        """
        return 3

    def n_outputs(self):
        return 1

    def simulate(self, parameters, times):
        # Reset simulation
        self.sim.reset()

        # Sort input parameters
        initial_volume, a_0, a_1 = parameters

        # Set initial condition
        self.sim.set_state([initial_volume])

        # Set growth constants
        self.sim.set_constant('central.a_0', a_0)
        self.sim.set_constant('central.a_1', a_1)

        # Define logged variable
        loggedVariable = 'central.volume_t'

        # Simulate
        output = self.sim.run(times[-1] + 1, log=[loggedVariable], log_times=times)
        result = output[loggedVariable]

        return np.array(result)

In [13]:
#
# Run optimisation multiple times from random initial starting points. 
#
# Characteristic scales were chosen to be: V^c_T = 1 cm^3, t^c = 1 day.
#
# This cell needs above defined wrapped myokit model:
# [DimensionlessPintsModel]
#

import os

import myokit
import numpy as np
import pandas as pd
import pints


# Define characteristic scales
characteristic_volume = 1  # in cm^3
characteristic_time = 1  # in day

# Define number of optimisation runs for each individual
n_runs = 10

# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
mouse_parameters_multi_runs_dimless = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores_multi_runs_dimless = np.empty(shape=(n_mice, n_runs))

# Define random starting points over many orders of magnitude
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(mouse_ids):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy() / characteristic_time

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy() / characteristic_volume

    # Create inverse problem
    problem = pints.SingleOutputProblem(DimensionlessPintsModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create boundaries to biologically relevant values
    boundary = pints.RectangularBoundaries(lower=[1E-3]*3, upper=[1E3]*3)

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=initial_params,
            boundaries=boundary,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score
        mouse_parameters_multi_runs_dimless[index, run_id, :] = estimates
        mouse_scores_multi_runs_dimless[index, run_id] = score

In [14]:
#
# Transform dimensionless parameters back to biological parameters.
# 
# This cell needs the above inferred dimensionless parameters and the characteristic volume and time scale:
# [mouse_parameters_multi_runs_dimless, characteristic_volume, characteristic_time]
#

import numpy as np


# Initialise container for backtransformed parameters
# Shape (n_mice, n_runs, n_parameters)
mouse_parameters_dimless_optimisation = np.empty(shape=mouse_parameters_multi_runs_dimless.shape)

# Transform initial volumes
mouse_parameters_dimless_optimisation[:, :, 0] = mouse_parameters_multi_runs_dimless[:, :, 0] * characteristic_volume

# Transform exponential growth rate
# lambda_0 = a_1 / (2 * a_0) / t^c
mouse_parameters_dimless_optimisation[:, :, 1] = \
    mouse_parameters_multi_runs_dimless[:, :, 2] / 2 / mouse_parameters_multi_runs_dimless[:, :, 1] / characteristic_time

# Transform linear growth rate
# lambda_1 = a_1 * V^c_T / t^c
mouse_parameters_dimless_optimisation[:, :, 2] = \
    mouse_parameters_multi_runs_dimless[:, :, 2] * characteristic_volume / characteristic_time


In [15]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above optimised parameters and their respective objective function scores, as well as the data
# [mouse_parameters_dimless_optimisation, mouse_scores_multi_runs_dimless, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = mouse_parameters_multi_runs_dimless.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameter sets
optimised_parameters =  mouse_parameters_dimless_optimisation

# Get scores for parameters
scores = mouse_scores_multi_runs_dimless

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


**Figure 5:** Scatter and box plot of the structural model parameters $\psi $ (initial tumour volume $V_0$, exponential growth rate $\lambda _0$, linear growth rate $\lambda _1$) found by minimising the squared distance between the predictions and the observations using a CMA-ES optimiser. The optimisation was performed on the dimensionless parameters $(v_0, a_0, a_1)$, which were subsequently transformed back to $(V_0, \lambda _0, \lambda _1)$. For each individual the optimisation routine was run 10 times from a uniformly sampled starting point in $[10^{-3}, 10^3]$. The distribution of the associated objective function scores is also shown.

## Log-transforming the model parameters

In fact, one can show that for large parameter magnitudes the model can develop apparent practical non-identifiabilities, which are related to the floating point accuracy of the numerically evaluated objective function. For example, in a scenario where $2\lambda _0 \ll \lambda _1 $, the structural growth model approximately reduces to

\begin{equation*}
    \frac{\text{d}V^s_T}{\text{d}t} \approx 2\lambda _0 V^s_T.
\end{equation*}
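In the regime $2\lambda _0 V^s_T \ll \lambda _1$ the right hand side of the structural model is close to the purely exponential growth term $2\lambda _0 V^s_T$, so that the predictions barely depend on $\lambda _1$. The sketch below quantifies this for a few illustrative values of $\lambda _1$. All numerical values are assumptions chosen for illustration.

```python
import numpy as np


def growth_rate(volume, lambda_0, lambda_1):
    """Right hand side of the structural tumour growth model."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


# Illustrative values (assumptions, not estimates from the data)
volume = 1.0    # in cm^3
lambda_0 = 0.1  # in 1/day
lambda_1_values = np.array([1E1, 1E2, 1E3])  # in cm^3/day

# Full model and its exponential approximation
full = growth_rate(volume, lambda_0, lambda_1_values)
approximation = 2 * lambda_0 * volume

# Relative deviation shrinks as lambda_1 grows
relative_deviation = np.abs(full - approximation) / approximation
print(relative_deviation)
```

The larger $\lambda _1$ becomes, the less a further change of $\lambda _1$ alters the predicted growth, which makes its value hard to resolve numerically.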

In [16]:
#
# Define log-transformed pints model wrapper such that myokit model can be used for inference.
# In contrast to the PintsModel above, this model expects log-transformed parameters log(psi).
#

import myokit
import numpy as np
import pints

from pkpd import model as m


# Wrap myokit model, so it can be used with pints
class LogTransformedPintsModel(pints.ForwardModel):
    def __init__(self):
        # Create myokit model
        model = m.create_tumour_growth_model()

        # Create simulator
        self.sim = myokit.Simulation(model)

    def n_parameters(self):
        """
        Number of parameters to fit. Here log(V_0), log(lambda_0), log(lambda_1).
        """
        return 3

    def n_outputs(self):
        return 1

    def simulate(self, log_parameters, times):
        # Reset simulation
        self.sim.reset()

        # Sort input parameters and transform to linear scale
        initial_volume, lambda_0, lambda_1 = np.exp(log_parameters)

        # Set initial condition
        self.sim.set_state([initial_volume])

        # Set growth constants
        self.sim.set_constant('central.lambda_0', lambda_0)
        self.sim.set_constant('central.lambda_1', lambda_1)

        # Define logged variable
        loggedVariable = 'central.volume_t'

        # Simulate
        output = self.sim.run(times[-1] + 1, log=[loggedVariable], log_times=times)
        result = output[loggedVariable]

        return np.array(result)

In [17]:
#
# Run optimisation multiple times from random initial starting points.
#
# This cell needs above defined wrapped myokit model:
# [LogTransformedPintsModel]
#

import os

import myokit
import numpy as np
import pandas as pd
import pints


# Define number of optimisation runs for each individual
n_runs = 10

# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
mouse_parameters_multi_runs_log_transformed = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores_multi_runs_log_transformed = np.empty(shape=(n_mice, n_runs))

# Define random starting points over many orders of magnitude
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(mouse_ids):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy()

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy()

    # Create inverse problem
    problem = pints.SingleOutputProblem(LogTransformedPintsModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create boundaries to biologically relevant values
    boundary = pints.RectangularBoundaries(lower=np.log([1E-3]*3), upper=np.log([1E3]*3))

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Transform parameters to log-scale
        log_initial_params = np.log(initial_params)

        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=log_initial_params,
            boundaries=boundary,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score (back transformed to linear scale)
        mouse_parameters_multi_runs_log_transformed[index, run_id, :] = np.exp(estimates)
        mouse_scores_multi_runs_log_transformed[index, run_id] = score

In [18]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above optimised model parameters and their respective objective function scores, as well as the data
# [mouse_parameters_multi_runs_log_transformed, mouse_scores_multi_runs_log_transformed, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = mouse_parameters_multi_runs_log_transformed.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameters
optimised_parameters = mouse_parameters_multi_runs_log_transformed

# Get scores for parameters
scores = mouse_scores_multi_runs_log_transformed

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


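**Figure 6:** Scatter and box plot of the structural model parameter estimates (initial tumour volume $V_0$, exponential growth rate $\lambda _0$, linear growth rate $\lambda _1$) found by minimising the squared distance between the predictions and the observations using a CMA-ES optimiser on the log-transformed parameters. For each individual the optimisation routine was run 10 times from a uniformly sampled starting point in $[10^{-3}, 10^3]$, and the search space was constrained to the same range. The estimates are shown on the linear scale, together with the distribution of the associated objective function scores.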

## Non-dimensional model with log-transformed parameters

In [1]:
#
# Define pints model wrapper such that myokit model can be used for inference.
#

import myokit
import numpy as np
import pints

from pkpd import model as m


# Wrap myokit model, so it can be used with pints
class DimensionlessLogTransformedPintsModel(pints.ForwardModel):
    def __init__(self):
        # Create myokit model
        model = m.create_dimless_tumour_growth_model()

        # Create simulator
        self.sim = myokit.Simulation(model)

    def n_parameters(self):
        """
        Number of parameters to fit. Here log v, log a_0, log a_1.
        """
        return 3

    def n_outputs(self):
        return 1

    def simulate(self, log_parameters, times):
        # Reset simulation
        self.sim.reset()

        # Sort input parameters
        initial_volume, a_0, a_1 = np.exp(log_parameters)

        # Set initial condition
        self.sim.set_state([initial_volume])

        # Set growth constants
        self.sim.set_constant('central.a_0', a_0)
        self.sim.set_constant('central.a_1', a_1)

        # Define logged variable
        loggedVariable = 'central.volume_t'

        # Simulate
        output = self.sim.run(times[-1] + 1, log=[loggedVariable], log_times=times)
        result = output[loggedVariable]

        return np.array(result)

In [2]:
#
# Run optimisation multiple times from random initial starting points.
#
# This cell needs the above defined wrapped myokit model:
# [DimensionlessLogTransformedPintsModel]
#

import os

import myokit
import numpy as np
import pandas as pd
import pints


# Define characteristic scales
characteristic_volume = 1  # in cm^3
characteristic_time = 1  # in day

# Define number of optimisation runs for each individual
n_runs = 10

# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')
n_mice = len(data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
mouse_parameters_dimless_log_transformed = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
mouse_scores_dimless_log_transformed = np.empty(shape=(n_mice, n_runs))

# Define random starting points between 1E-3 and 1E3 (sampled uniformly on the linear scale)
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = data['#ID'].unique()
for index, mouse_id in enumerate(mouse_ids):
    # Create mask for mouse with specified ID
    mouse_mask = data['#ID'] == mouse_id

    # Get relevant time points
    times = data[mouse_mask]['TIME in day'].to_numpy() / characteristic_time

    # Get measured tumour volumes
    observed_volumes = data[mouse_mask]['TUMOUR VOLUME in cm^3'].to_numpy() / characteristic_volume

    # Create inverse problem
    problem = pints.SingleOutputProblem(DimensionlessLogTransformedPintsModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create boundaries to biologically relevant values
    boundary = pints.RectangularBoundaries(lower=np.log([1E-3]*3), upper=np.log([1E3]*3))

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Transform parameters to log-scale
        log_initial_params = np.log(initial_params)

        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=log_initial_params,
            boundaries=boundary,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score (back transformed to linear scale)
        mouse_parameters_dimless_log_transformed[index, run_id, :] = np.exp(estimates)
        mouse_scores_dimless_log_transformed[index, run_id] = score
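
The starting points above are drawn with `np.random.uniform`, i.e. uniformly on the linear scale, so roughly 90% of the draws fall between 1E2 and 1E3. If the intention is to cover all orders of magnitude within the boundaries evenly, one option is to sample uniformly on the log scale instead. The following is a minimal sketch using only the standard library; the function name `sample_log_uniform` and the seed are illustrative assumptions and not part of the notebook or of pints.

```python
import math
import random


def sample_log_uniform(low, high, n_runs, n_parameters, seed=None):
    """
    Draws starting points uniformly in log-space between low and high.

    Returns a list of n_runs lists with n_parameters log-parameters each.
    The values are already on the log scale, so no further log transform
    is needed before using them as starting points.
    """
    rng = random.Random(seed)
    log_low, log_high = math.log(low), math.log(high)
    return [
        [rng.uniform(log_low, log_high) for _ in range(n_parameters)]
        for _ in range(n_runs)]


# Hypothetical usage with the boundaries used in this notebook
log_initial_parameters = sample_log_uniform(
    low=1E-3, high=1E3, n_runs=10, n_parameters=3, seed=1)
```

With this choice each decade between 1E-3 and 1E3 is equally likely to contain a starting point.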

In [4]:
#
# Transform dimensionless parameters back to biological parameters.
# 
# This cell needs the above inferred dimensionless parameters and the characteristic volume and time scale:
# [mouse_parameters_dimless_log_transformed, characteristic_volume, characteristic_time]
#

import numpy as np


# Initialise container for backtransformed parameters
# Shape (n_mice, n_runs, n_parameters)
mouse_parameters_dimless_log_transformed_optimisation = np.empty(shape=mouse_parameters_dimless_log_transformed.shape)

# Transform initial volumes
mouse_parameters_dimless_log_transformed_optimisation[:, :, 0] = mouse_parameters_dimless_log_transformed[:, :, 0] * characteristic_volume

# Transform exponential growth rate
# lambda_0 = a_1 / (2 * a_0 * t^c)
mouse_parameters_dimless_log_transformed_optimisation[:, :, 1] = \
    mouse_parameters_dimless_log_transformed[:, :, 2] / 2 / mouse_parameters_dimless_log_transformed[:, :, 1] / characteristic_time

# Transform linear growth rate
# lambda_1 = a_1 * V^c / t^c
mouse_parameters_dimless_log_transformed_optimisation[:, :, 2] = \
    mouse_parameters_dimless_log_transformed[:, :, 2] * characteristic_volume / characteristic_time
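
The back-transformation in the cell above can also be written as a small function. The sketch below assumes that the dimensionless model has the form $\text{d}v/\text{d}\tau = a_1 v / (v + a_0)$ with $V^s_T = v V^c_T$ and $t = \tau t^c$, which is consistent with the transformations used in the code above. The function name `to_biological_parameters` and the example values are hypothetical and not part of the `pkpd` package.

```python
def to_biological_parameters(
        v_0, a_0, a_1, characteristic_volume=1.0, characteristic_time=1.0):
    """
    Maps dimensionless parameters (v_0, a_0, a_1) to the biological
    parameters (V_0, lambda_0, lambda_1).
    """
    # Initial tumour volume: V_0 = v_0 * V^c
    initial_volume = v_0 * characteristic_volume

    # Exponential growth rate: lambda_0 = a_1 / (2 * a_0 * t^c)
    lambda_0 = a_1 / (2 * a_0 * characteristic_time)

    # Linear growth rate: lambda_1 = a_1 * V^c / t^c
    lambda_1 = a_1 * characteristic_volume / characteristic_time

    return initial_volume, lambda_0, lambda_1


# Hypothetical dimensionless estimates, not results from the notebook
initial_volume, lambda_0, lambda_1 = to_biological_parameters(
    v_0=0.1, a_0=2.0, a_1=4.0)
```

Keeping the mapping in one place makes it harder for the comments and the array arithmetic to drift apart.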


In [22]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above optimised parameters from the multiple runs from random initial starting points, and their respective objective function scores, as well as the data
# [mouse_parameters_dimless_log_transformed_optimisation, mouse_scores_dimless_log_transformed, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = mouse_parameters_dimless_log_transformed_optimisation.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameters
optimised_parameters = mouse_parameters_dimless_log_transformed_optimisation

# Get objective function scores
scores = mouse_scores_dimless_log_transformed

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


**Figure 7:**

A linear growth rate of 100 cm^3/day does not seem biologically feasible and looks more like an artefact of the model: an extremely large linear growth rate keeps the model in the exponential growth phase for longer.
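
The two growth regimes of the structural model make this explicit. Comparing the two terms in the denominator of the right hand side gives the approximation

\begin{equation*}
    \frac{\text{d}V^s_T}{\text{d}t} \approx
    \begin{cases}
        2\lambda _0 V^s_T, & V^s_T \ll \lambda _1 / (2\lambda _0), \\
        \lambda _1, & V^s_T \gg \lambda _1 / (2\lambda _0),
    \end{cases}
\end{equation*}

so the transition between the regimes occurs around $V^s_T = \lambda _1 / (2\lambda _0)$. Increasing $\lambda _1$ at fixed $\lambda _0$ pushes this transition volume up. If it lies above all observed tumour volumes, the data are described by the exponential regime alone and carry almost no information about $\lambda _1$.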

## Ultimate identifiability test

We now have a good idea of reasonable parameter values $\psi $ for all mice. The ultimate identifiability test is to simulate data with the inferred parameters and check whether the optimisation recovers the same parameters.
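
One way to make "recovering the same parameters" concrete is to compare each recovered parameter with the value used for the simulation and accept a run only if all relative errors stay below a tolerance. The following is a minimal sketch of such a check; the function names, the tolerance of 1% and the example values are assumptions for illustration and are not results from this notebook.

```python
def relative_errors(true_parameters, recovered_parameters):
    """Returns |recovered - true| / |true| for each parameter."""
    return [
        abs(recovered - true) / abs(true)
        for true, recovered in zip(true_parameters, recovered_parameters)]


def is_recovered(true_parameters, recovered_parameters, rtol=0.01):
    """Checks whether all parameters are recovered within rtol."""
    errors = relative_errors(true_parameters, recovered_parameters)
    return all(error <= rtol for error in errors)


# Hypothetical values for (V_0, lambda_0, lambda_1)
true_parameters = [0.1, 0.2, 1.0]

# A run that recovers all parameters within 1%
good_run = [0.1001, 0.1995, 1.004]

# A run that recovers V_0 and lambda_0, but not lambda_1
bad_run = [0.1001, 0.1995, 100.0]

good_run_recovered = is_recovered(true_parameters, good_run)
bad_run_recovered = is_recovered(true_parameters, bad_run)
```

A per-parameter check like this separates runs in which only one parameter, for example the linear growth rate, is poorly recovered from runs in which the optimisation failed altogether.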

In [20]:
#
# Simulate "noise-free" data with the median of the inferred model parameters.
#
# This cell needs the above inferred model parameters, and the above defined model.
# [mouse_parameters_dimless_log_transformed_optimisation, PintsModel]
#

import os

import numpy as np
import pandas as pd


# Import data
# Get path of current working directory
path = os.getcwd()

# Import LXF A677 control growth data
data = pd.read_csv(path + '/data/lxf_control_growth.csv')

# Get mouse IDs and times
mouse_ids_and_times = data[['#ID', 'TIME in day']]

# Get median parameters for each mouse (ignoring failed runs, which are stored as nan)
median_parameters = np.nanmedian(mouse_parameters_dimless_log_transformed_optimisation, axis=1)

# Instantiate model
model = PintsModel()

# Create container for simulated synthesised data
simulated_data = []

# Simulate "noise-free" data
mouse_ids = mouse_ids_and_times['#ID'].unique()
for index, mouse_id in enumerate(mouse_ids):
    # Create mask for mouse
    mask = mouse_ids_and_times['#ID'] == mouse_id

    # Get times
    times = mouse_ids_and_times[mask]['TIME in day'].to_numpy()

    # Get parameters
    parameters = median_parameters[index, :]

    # Simulate data
    simulated_volumes = model.simulate(parameters, times)

    # Save dataframe
    df = pd.DataFrame({
        '#ID': mouse_ids_and_times[mask]['#ID'],
        'TIME in day': mouse_ids_and_times[mask]['TIME in day'],
        'SIMULATED TUMOUR VOLUME in cm^3': simulated_volumes})
    simulated_data.append(df)

# Merge mouse dataframes to one dataframe
simulated_data = pd.concat(simulated_data)


In [24]:
#
# Run optimisation multiple times from random initial starting points.
#
# This cell needs the above defined wrapped myokit model and the simulated data:
# [DimensionlessLogTransformedPintsModel, simulated_data]
#

import myokit
import numpy as np
import pandas as pd
import pints


# Define characteristic scales
characteristic_volume = 1  # in cm^3
characteristic_time = 1  # in day

# Define number of optimisation runs for each individual
n_runs = 10

# Get number of mice
n_mice = len(simulated_data['#ID'].unique())

# Define container for the structural model estimates
# Shape (n_mice, n_runs, n_parameters)
n_parameters = 3
recovered_parameters = np.empty(shape=(n_mice, n_runs, n_parameters))

# Define container for the objective function score for the optimised parameters
recovered_parameters_scores = np.empty(shape=(n_mice, n_runs))

# Define random starting points between 1E-3 and 1E3 (sampled uniformly on the linear scale)
# Shape = (n_runs, n_parameters)
initial_parameters = np.random.uniform(low=1E-3, high=1E3, size=(n_runs, n_parameters))

# Find mouse parameters for LXF A677 population
mouse_ids = simulated_data['#ID'].unique()
for index, mouse_id in enumerate(mouse_ids):
    # Create mask for mouse with specified ID
    mouse_mask = simulated_data['#ID'] == mouse_id

    # Get relevant time points
    times = simulated_data[mouse_mask]['TIME in day'].to_numpy() / characteristic_time

    # Get simulated tumour volumes
    observed_volumes = simulated_data[mouse_mask]['SIMULATED TUMOUR VOLUME in cm^3'].to_numpy() / characteristic_volume

    # Create inverse problem
    problem = pints.SingleOutputProblem(DimensionlessLogTransformedPintsModel(), times, observed_volumes)

    # Create sum of squares error objective function
    error = pints.SumOfSquaresError(problem)

    # Create boundaries to biologically relevant values
    boundary = pints.RectangularBoundaries(lower=np.log([1E-3]*3), upper=np.log([1E3]*3))

    # Run optimisation multiple times
    for run_id, initial_params in enumerate(initial_parameters):
        # Transform parameters to log-scale
        log_initial_params = np.log(initial_params)

        # Create optimisation controller with a CMA-ES optimiser
        optimiser = pints.OptimisationController(
            function=error,
            x0=log_initial_params,
            boundaries=boundary,
            method=pints.CMAES)

        # Disable logging mode
        optimiser.set_log_to_screen(False)

        # Parallelise optimisation
        optimiser.set_parallel(True)

        # Find optimal parameters
        try:
            estimates, score = optimiser.run()
        except Exception:
            # If inference breaks fill estimates with nan
            estimates = np.array([np.nan, np.nan, np.nan])
            score = np.nan

        # Save estimates and score (back transformed to linear scale)
        recovered_parameters[index, run_id, :] = np.exp(estimates)
        recovered_parameters_scores[index, run_id] = score

In [25]:
#
# Transform dimensionless parameters back to biological parameters.
# 
# This cell needs the above inferred dimensionless parameters and the characteristic volume and time scale:
# [recovered_parameters, characteristic_volume, characteristic_time]
#

import numpy as np


# Initialise container for backtransformed parameters
# Shape (n_mice, n_runs, n_parameters)
recovered_parameters_transformed = np.empty(shape=recovered_parameters.shape)

# Transform initial volumes
recovered_parameters_transformed[:, :, 0] = recovered_parameters[:, :, 0] * characteristic_volume

# Transform exponential growth rate
# lambda_0 = a_1 / (2 * a_0 * t^c)
recovered_parameters_transformed[:, :, 1] = \
    recovered_parameters[:, :, 2] / 2 / recovered_parameters[:, :, 1] / characteristic_time

# Transform linear growth rate
# lambda_1 = a_1 * V^c / t^c
recovered_parameters_transformed[:, :, 2] = \
    recovered_parameters[:, :, 2] * characteristic_volume / characteristic_time


In [26]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above recovered parameters from the multiple runs from random initial starting points, and their respective objective function scores, as well as the data
# [recovered_parameters_transformed, recovered_parameters_scores, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = recovered_parameters_transformed.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameters
optimised_parameters = recovered_parameters_transformed

# Get objective function scores
scores = recovered_parameters_scores

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for linear tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Linear growth rate in cm^3/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


The model seems to be fine, but we can already see that the optimisation sometimes keeps the model in the exponential growth phase by increasing the linear growth rate. This limits the interpretability of the linear growth rate.
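
This effect can be reproduced with a small numerical experiment. The sketch below integrates the structural model with a fixed-step fourth-order Runge-Kutta scheme (a stand-in for the myokit simulation used in this notebook) and compares the final tumour volume for three linear growth rates. All parameter values are hypothetical and chosen only for illustration.

```python
def growth_rate(volume, lambda_0, lambda_1):
    """Right hand side of the structural model."""
    return 2 * lambda_0 * lambda_1 * volume / (2 * lambda_0 * volume + lambda_1)


def simulate(initial_volume, lambda_0, lambda_1, end_time, n_steps=1000):
    """Integrates the structural model with a fixed-step RK4 scheme."""
    dt = end_time / n_steps
    volume = initial_volume
    for _ in range(n_steps):
        k1 = growth_rate(volume, lambda_0, lambda_1)
        k2 = growth_rate(volume + 0.5 * dt * k1, lambda_0, lambda_1)
        k3 = growth_rate(volume + 0.5 * dt * k2, lambda_0, lambda_1)
        k4 = growth_rate(volume + dt * k3, lambda_0, lambda_1)
        volume += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return volume


# Hypothetical parameters: V_0 in cm^3, lambda_0 in 1/day, time in day
final_volumes = {
    lambda_1: simulate(
        initial_volume=0.1, lambda_0=0.1, lambda_1=lambda_1, end_time=10)
    for lambda_1 in [1, 100, 1000]}

# Relative difference between lambda_1 = 100 and lambda_1 = 1000 cm^3/day
diff_large = abs(final_volumes[100] - final_volumes[1000]) / final_volumes[1000]

# Relative difference between lambda_1 = 1 and lambda_1 = 1000 cm^3/day
diff_small = abs(final_volumes[1] - final_volumes[1000]) / final_volumes[1000]
```

For these values the final volumes for $\lambda _1 = 100$ and $\lambda _1 = 1000$ differ by less than 1%, while $\lambda _1 = 1$ gives a clearly smaller tumour. Once the linear growth rate is large enough, the simulated volumes barely depend on it, so an optimiser cannot distinguish between such values.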

A more feasible parameterisation may be the following

\begin{equation*}
    \frac{\text{d}V^s_T}{\text{d}t} = \frac{\lambda V^s_T}{V^s_T / V_{\text{crit}} + 1}.
\end{equation*}

Here, $\lambda $ is the exponential growth rate and $V_{\text{crit}}$ is the critical tumour volume above which the tumour growth transitions to a linear growth phase with linear growth rate $\lambda V_{\text{crit}}$.

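In terms of the dimensionless parameters, $V_{\text{crit}} = a_0 V^c_T$ and $\lambda = a_1 / (a_0 t^c)$, where $V^c_T$ and $t^c$ denote the characteristic volume and time scale.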
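
The two parameterisations describe the same model. Dividing the numerator and the denominator of the original right hand side by $\lambda _1$ gives

\begin{equation*}
    \frac{2\lambda _0\lambda _1 V^s_T}{2\lambda _0 V^s_T + \lambda _1} = \frac{2\lambda _0 V^s_T}{2\lambda _0 V^s_T / \lambda _1 + 1},
\end{equation*}

so that $\lambda = 2\lambda _0$ and $V_{\text{crit}} = \lambda _1 / (2\lambda _0)$. The linear growth rate is recovered as $\lambda V_{\text{crit}} = \lambda _1$.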

In [27]:
#
# Transform dimensionless parameters to new set of biological parameters.
# 
# This cell needs the above inferred dimensionless parameters and the characteristic volume and time scale:
# [recovered_parameters, characteristic_volume, characteristic_time]
#

import numpy as np


# Initialise container for backtransformed parameters
# Shape (n_mice, n_runs, n_parameters)
new_parameters = np.empty(shape=recovered_parameters.shape)

# Transform initial volumes
new_parameters[:, :, 0] = recovered_parameters[:, :, 0] * characteristic_volume

# Transform critical volume 
# V_crit = a_0 * V^c
new_parameters[:, :, 1] = recovered_parameters[:, :, 1] * characteristic_volume

# Transform exponential growth rate
# lambda = a_1 / a_0 / t^c
new_parameters[:, :, 2] = \
    recovered_parameters[:, :, 2] / recovered_parameters[:, :, 1] / characteristic_time


In [28]:
#
# Visualisation of the spread of optimised model parameters for multiple runs from different initial points.
#
# This cell needs the above transformed parameters from the multiple runs from random initial starting points, and their respective objective function scores, as well as the data
# [new_parameters, recovered_parameters_scores, data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = data['#ID'].unique()

# Get number of parameters + score (for visualisation)
n_params = new_parameters.shape[2] + 1

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_params]

# Get optimised parameters
optimised_parameters = new_parameters

# Get objective function scores
scores = recovered_parameters_scores

# Create figure
fig = go.Figure()

# Box plot of optimised model parameters
for index, id_m in enumerate(mouse_ids):
    # Get optimised parameters
    parameters = optimised_parameters[index, ...]

    # Get scores
    score = scores[index, :]

    # Create box plot for initial tumour volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 0],  
            name="Initial tumour volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[0],
            line_color=colors[0]))

    # Create box plot for critical volume
    fig.add_trace(
        go.Box(
            y=parameters[:, 1],  
            name="Critical volume in cm^3",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[1],
            line_color=colors[1]))

    # Create box plot for exponential tumour growth
    fig.add_trace(
        go.Box(
            y=parameters[:, 2],  
            name="Exponential growth rate in 1/day",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[2],
            line_color=colors[2]))
    
    # Create box plot for objective function score
    fig.add_trace(
        go.Box(
            y=score,  
            name="Score",
            boxpoints='all',
            jitter=0.2,
            pointpos=-1.5,
            visible=True if index == 0 else False,
            marker=dict(
                symbol='circle',
                opacity=0.7,
                line=dict(color='black', width=1)),
            marker_color=colors[3],
            line_color=colors[3]))

# Set figure size
fig.update_layout(
    autosize=True,
    template="plotly_white")

# Add switch between mice
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "right",
            buttons=list([
                dict(
                    args=[{"visible": [True]*4 + [False]*(4 * 7)}],
                    label="ID: %d" % mouse_ids[0],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*4 + [True]*4 + [False]*(4 * 6)}],
                    label="ID: %d" % mouse_ids[1],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 2) + [True]*4 + [False]*(4 * 5)}],
                    label="ID: %d" % mouse_ids[2],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 3) + [True]*4 + [False]*(4 * 4)}],
                    label="ID: %d" % mouse_ids[3],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 4) + [True]*4 + [False]*(4 * 3)}],
                    label="ID: %d" % mouse_ids[4],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 5) + [True]*4 + [False]*(4 * 2)}],
                    label="ID: %d" % mouse_ids[5],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 6) + [True]*4 + [False]* 4}],
                    label="ID: %d" % mouse_ids[6],
                    method="restyle"
                ),
                dict(
                    args=[{"visible": [False]*(4 * 7) + [True]*4}],
                    label="ID: %d" % mouse_ids[7],
                    method="restyle"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.1,
            yanchor="top"
        )
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()


**Figure X:**

There is a qualitative difference to observe: some mice have biologically infeasibly large critical volumes. Although mice 95 and 140 seem to be among the lighter mice in the control group, there is little evidence as yet to correlate this with body mass. It is more likely that the qualitative change in growth behaviour is induced by a change in the tumour biology, i.e. metabolic adaptation of the tumour.

## Bibliography

- <a name="ref1"> [1] </a> Eigenmann et al., Combining Nonclinical Experiments with Translational PKPD Modeling to Differentiate Erlotinib and Gefitinib, Mol Cancer Ther (2016)
- <a name="ref2"> [2] </a> Bellman, R. & Åström, K., On structural identifiability, Mathematical Biosciences, 7, 329–339 (1970)
- <a name="ref2"> [3] </a> Janzén, D. L. I. et al., Parameter identifiability of fundamental pharmacodynamic models, Frontiers in Physiology, 7, 590 (2016)
- <a name="ref2"> [4] </a> Lavielle, M. & Aarons, L., What do we mean by identifiability in mixed effects models?, Journal of Pharmacokinetics and Pharmacodynamics, 43, 111–122 (2016)
- <a name="ref2"> [5] </a> Hansen, N., Müller, S. D. & Koumoutsakos, P., Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES), Evolutionary Computation, 1–18 (2003)
- <a name="ref2"> [6] </a> Raue, A. et al., Structural and practical identifiability analysis of partially observed dynamical models by exploiting the profile likelihood, Bioinformatics, 25, 1923–1929 (2009)

[Back to project overview](https://github.com/DavAug/ErlotinibGefitinib/blob/master/README.md) | [Back to lung cancer control growth overview](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/lung_cancer/control_growth/overview.ipynb) | [Forward to next notebook](https://nbviewer.jupyter.org/github/DavAug/ErlotinibGefitinib/blob/master/notebooks/lung_cancer/control_growth/error_model.ipynb)