# Control Growth Data

In [1], the tumour growth inhibition (TGI) PKPD model of Erlotinib and Gefitinib was derived from two separate *in vivo* experiments. In particular, the growth of patient-derived tumour explants LXF A677 (adenocarcinoma of the lung) and cell line-derived tumour xenografts VXF A431 (vulva cancer) in mice was monitored. Each experiment comprised a control growth group and three groups that were treated with either Erlotinib or Gefitinib at one of three dose levels. Treatments were administered orally once a day.

In this notebook, we focus on understanding the untreated tumour growth. This allows us to critically assess the modelling choices in [1] and to explore alternatives. It also allows us to derive posteriors for the growth parameters that may inform the choice of priors for the inference of the full TGI-PKPD model.

We will now import the data sets and standardise their format for the inference.

## Raw LXF A677 control growth data

In [1]:
#
# Import raw LXF A677 data.
#

import os
import pandas as pd


# Import LXF A677 data
path = os.getcwd()  # make import independent of local path structure
lxf_data_raw = pd.read_csv(path + '/data_raw/Ctrl_Growth_LXF.csv')

# Display data
print('Raw LXF A677 Control Growth Data Set:')
lxf_data_raw

Raw LXF A677 Control Growth Data Set:


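| | #ID | TIME | DOSE | ADDL | II | Y | YTYPE | CENS | CELL LINE | DOSE GROUP | DRUG | DRUGCAT | EXPERIMENT | BW | YTV | KA | V | KE | w0 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | 0 | . | . | . | 191.808 | 2 | . | 1 | 0 | 2 | 0 | 2 | 26.8 | . | 55 | 1.11 | 3.98 | 191.8080 |
| 1 | 94 | 0 | . | . | . | 77.2475 | 2 | . | 1 | 0 | 2 | 0 | 2 | 18.3 | . | 55 | 1.11 | 3.98 | 77.2475 |
| 2 | 95 | 0 | . | . | . | 186.2 | 2 | . | 1 | 0 | 2 | 0 | 2 | 22.3 | . | 55 | 1.11 | 3.98 | 186.2000 |
| 3 | 40 | 3 | 0 | . | . | . | . | . | 1 | 0 | 2 | 0 | 2 | 26.1 | . | 55 | 1.11 | 3.98 | 191.8080 |
| 4 | 40 | 4 | 0 | 2 | 1 | . | . | . | 1 | 0 | 2 | 0 | 2 | 26.5 | . | 55 | 1.11 | 3.98 | 191.8080 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 153 | 140 | 2 | . | . | . | 126.852 | 2 | . | 1 | 0 | 2 | 0 | 2 | 23.6 | 126.852 | 55 | 1.11 | 3.98 | 79.3305 |
| 154 | 94 | 4 | . | . | . | 125.316 | 2 | . | 1 | 0 | 2 | 0 | 2 | 18.5 | 125.316 | 55 | 1.11 | 3.98 | 77.2475 |
| 155 | 170 | 4 | . | . | . | 109.33 | 2 | . | 1 | 0 | 2 | 0 | 2 | 27.9 | 109.33 | 55 | 1.11 | 3.98 | 80.0565 |
| 156 | 170 | 2 | . | . | . | 94.221 | 2 | . | 1 | 0 | 2 | 0 | 2 | 27.7 | 94.221 | 55 | 1.11 | 3.98 | 80.0565 |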


## Cleaning the data

The datasets were kindly provided to us by the authors of [1], who developed their PKPD model using the commercial software [Monolix](http://lixoft.com/products/monolix/). As a result, the keys of the dataset follow Monolix's naming conventions.

The keys of interest for our analysis are

- **#ID** indicating which mouse was measured,
- **TIME** indicating the time point of each measurement,
- **TUMOUR VOLUME** indicating the measured tumour volume,
- **BODY WEIGHT** indicating the body weight of the mouse.

In discussion with the authors and by comparison with the Roche study report, the relevant columns in the dataset were identified as **#ID**, **TIME**, **Y** and **BW**, where **Y** encodes **TUMOUR VOLUME** and **BW** encodes **BODY WEIGHT**. The remaining keys are partly Monolix-specific modelling keys and partly inferred model parameters. We thus ignore those columns.
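The column selection described above can be illustrated on the first rows of the raw table. The following is a minimal sketch that uses only the standard library, whereas the notebook itself uses pandas. The helper name `to_float` and the inlined sample rows are introduced here for illustration only, and the sketch assumes, as the raw table suggests, that `.` marks a missing entry.

```python
import csv
import io

# First five rows of the raw table shown above, restricted to the four
# columns of interest (inlined here for illustration).
raw = """#ID,TIME,Y,BW
40,0,191.808,26.8
94,0,77.2475,18.3
95,0,186.2,22.3
40,3,.,26.1
40,4,.,26.5
"""


def to_float(entry):
    """Returns the entry as float, or None if it is the '.' placeholder."""
    return None if entry == '.' else float(entry)


# Parse the rows and keep only those with a tumour volume measurement
measurements = []
for row in csv.DictReader(io.StringIO(raw)):
    volume = to_float(row['Y'])
    if volume is None:
        continue  # record without a tumour volume measurement
    measurements.append({
        'ID': int(row['#ID']),
        'time': to_float(row['TIME']),
        'tumour volume': volume,
        'body weight': to_float(row['BW'])})

for measurement in measurements:
    print(measurement)
```

Only the three rows with an entry in **Y** survive the filter, which mirrors the masking for non-null **Y** rows in the cleaning cell.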

Remarks on units of relevant columns:

The raw data sets do not contain the units of the measured quantities. From [1] as well as Roche's study report, we may infer that

- **TIME** is measured in $\text{day}$,
- **TUMOUR VOLUME** is measured in $\text{mm}^3$,
- **BODY WEIGHT** is measured in $\text{g}$.

For reasons that will become clear later, we will choose to measure the tumour volume in $\text{cm}^3$.
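Since 1 cm = 10 mm, one cubic centimetre equals 1000 cubic millimetres, so the conversion amounts to a multiplication by $10^{-3}$. A minimal sketch, where the function name `mm3_to_cm3` is hypothetical and the input values are the first three baseline volumes from the raw table:

```python
def mm3_to_cm3(volume_mm3):
    """Converts a volume from cubic millimetres to cubic centimetres."""
    # 1 cm^3 = (10 mm)^3 = 1000 mm^3
    return volume_mm3 * 1E-03


# Baseline tumour volumes of mice 40, 94 and 95 in mm^3 (from the raw table)
baseline_mm3 = {40: 191.808, 94: 77.2475, 95: 186.2}

# Convert each baseline volume to cm^3
baseline_cm3 = {
    mouse_id: mm3_to_cm3(volume) for mouse_id, volume in baseline_mm3.items()}

for mouse_id, volume in baseline_cm3.items():
    print('ID %d: %.6f cm^3' % (mouse_id, volume))
```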

## Cleaned LXF A677 control growth data

In [3]:
#
# Create LXF A677 data from raw data set.
#

import os
import pandas as pd


# Import LXF A677 data
path = os.getcwd()  # to make import independent of local path structure
lxf_data_raw = pd.read_csv(path + '/data_raw/Ctrl_Growth_LXF.csv')

# Make sure that data is stored as numeric data
lxf_data = lxf_data_raw.apply(pd.to_numeric, errors='coerce')

# Mask data for non-null Y rows
lxf_data = lxf_data[lxf_data['Y'].notnull()]

# Rename TIME to TIME in day
lxf_data = lxf_data.rename(columns={'TIME': 'TIME in day'})

# Rename Y to TUMOUR VOLUME in mm^3
lxf_data = lxf_data.rename(columns={'Y': 'TUMOUR VOLUME in mm^3'})

# Rename BW to BODY WEIGHT in g
lxf_data = lxf_data.rename(columns={'BW': 'BODY WEIGHT in g'})

# Raise error if DOSE, ADDL, II, YTYPE, CENS, CELL LINE, DOSE GROUP, DRUG, EXPERIMENT or DRUGCAT are not uni-valued
constant_keys = [
    'DOSE', 'ADDL', 'II', 'YTYPE', 'CENS', 'CELL LINE', 'DOSE GROUP', 'DRUG',
    'EXPERIMENT', 'DRUGCAT']
for key in constant_keys:
    if len(lxf_data[key].unique()) > 1:
        raise ValueError('Column <%s> is not uni-valued.' % key)

# Keep only #ID, TIME, TUMOUR VOLUME and BODY WEIGHT columns
lxf_data = lxf_data[['#ID', 'TIME in day', 'TUMOUR VOLUME in mm^3', 'BODY WEIGHT in g']]

# Sort data such that time is increasing (for later convenience).
# Reassigning instead of sorting in place avoids modifying a slice of the
# masked data frame.
lxf_data = lxf_data.sort_values('TIME in day')

# Convert tumour measurements from mm^3 to cm^3
lxf_data['TUMOUR VOLUME in mm^3'] *= 1E-03
lxf_data = lxf_data.rename(columns={'TUMOUR VOLUME in mm^3': 'TUMOUR VOLUME in cm^3'})

# Delete raw data from memory
del lxf_data_raw

# Display cleaned data set
print('LXF A677 Control Growth:')
lxf_data

LXF A677 Control Growth:


| | #ID | TIME in day | TUMOUR VOLUME in cm^3 | BODY WEIGHT in g |
|---|---|---|---|---|
| 0 | 40 | 0 | 0.191808 | 26.8 |
| 1 | 94 | 0 | 0.077248 | 18.3 |
| 2 | 95 | 0 | 0.186200 | 22.3 |
| 59 | 136 | 0 | 0.118588 | 25.4 |
| 60 | 140 | 0 | 0.079330 | 22.7 |
| ... | ... | ... | ... | ... |
| 77 | 136 | 30 | 1.459342 | 24.2 |
| 103 | 94 | 30 | 0.576240 | 19.2 |
| 90 | 169 | 30 | 0.746986 | 28.0 |
| 67 | 140 | 30 | 2.122582 | 24.1 |


## Illustrate control growth data

We use [plotly](https://plotly.com/python/) to create interactive visualisations of the time-series data.

In [17]:
#
# Visualise control growth data.
#
# This cell needs the cleaned lung cancer tumour growth data:
# [lxf_data]
#

import plotly.colors
import plotly.graph_objects as go


# Get mouse ids
mouse_ids = lxf_data['#ID'].unique()

# Get number of mice
n_mice = len(mouse_ids)

# Define colorscheme
colors = plotly.colors.qualitative.Plotly[:n_mice]

# Create figure
fig = go.Figure()

# Scatter plot LXF A677 time-series data for each mouse
for index, id_m in enumerate(mouse_ids):
    # Create mask for mouse
    mask = lxf_data['#ID'] == id_m

    # Get time points for mouse
    times = lxf_data['TIME in day'][mask]

    # Get observed tumour volumes for mouse
    observed_volumes = lxf_data['TUMOUR VOLUME in cm^3'][mask]

    # Get mass time series
    masses = lxf_data['BODY WEIGHT in g'][mask]

    # Plot tumour volume over time
    fig.add_trace(go.Scatter(
        x=times,
        y=observed_volumes,
        name="ID: %d" % id_m,
        showlegend=True,
        hovertemplate=
            "<b>ID: %d</b><br>" % (id_m) +
            "Time: %{x:} day<br>" +
            "Tumour volume: %{y:.02f} cm^3<br>" +
            "Body weight: %{text}<br>" +
            "Cancer type: Lung cancer (LXF A677)<br>" +
            "<extra></extra>",
        text=['%.01f g' % mass for mass in masses],
        mode="markers",
        marker=dict(
            symbol='circle',
            color=colors[index],
            opacity=0.7,
            line=dict(color='black', width=1))
    ))

    # Plot mass over time
    fig.add_trace(go.Scatter(
        x=times,
        y=masses,
        name="ID: %d" % id_m,
        showlegend=True,
        visible=False,
        hovertemplate=
            "<b>ID: %d</b><br>" % (id_m) +
            "Time: %{x:} day<br>" +
            "Tumour volume: %{text}<br>" +
            "Body weight: %{y:.01f} g<br>" +
            "Cancer type: Lung cancer (LXF A677)<br>" +
            "<extra></extra>",
        # y holds the body weight here, so the tumour volume goes into text
        text=['%.02f cm^3' % volume for volume in observed_volumes],
        mode="markers",
        marker=dict(
            symbol='circle',
            color=colors[index],
            opacity=0.7,
            line=dict(color='black', width=1))
    ))

# Set X, Y axis and figure size
fig.update_layout(
    autosize=True,
    xaxis_title=r'$\text{Time in day}$',
    yaxis_title=r'$\text{Tumour volume in cm}^3$',
    template="plotly_white")

# Add switch between linear and log y-scale
fig.update_layout(
    updatemenus=[
        dict(
            type = "buttons",
            direction = "left",
            buttons=list([
                dict(
                    args=[{"yaxis.type": "linear"}],
                    label="Linear y-scale",
                    method="relayout"
                ),
                dict(
                    args=[{"yaxis.type": "log"}],
                    label="Log y-scale",
                    method="relayout"
                )
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=0.0,
            xanchor="left",
            y=1.15,
            yanchor="top"
        ),
        dict(
            type = "buttons",
            direction = "down",
            buttons=list([
                dict(
                    args=[
                        {"visible": [True, False] * n_mice,},
                        {"yaxis": {"title": r'$\text{Tumour volume in cm}^3$'}}],
                    label="Tumour volume",
                    method="update"
                ),
                dict(
                    args=[
                        {"visible": [False, True] * n_mice}, 
                        {"yaxis": {"title": r'$\text{Body weight in g}$'}}],
                    label="Body weight",
                    method="update"
                ),
            ]),
            pad={"r": 0, "t": -10},
            showactive=True,
            x=1.07,
            xanchor="left",
            y=1.1,
            yanchor="top"
        ),
    ]
)

# Position legend
fig.update_layout(legend=dict(
    yanchor="bottom",
    y=0.01,
    xanchor="left",
    x=1.05))

# Show figure
fig.show()

**Figure 1:** Untreated tumour growth of patient-derived tumour explants LXF A677 (adenocarcinoma of the lung) that were implanted in mice. Data points of the same colour belong to the same mouse. The mouse ID and further information can be explored by hovering over the data points. The evolution of the body weight can be explored using the buttons in the top right.
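As a rough complement to the figure, the per-mouse grouping behind the plot can be sketched without plotly. The snippet below uses only the day 0 and day 30 rows visible in the cleaned table above and computes the fold change in tumour volume for the mice that appear at both time points. The variable names are illustrative, and the apparent growth rate assumes purely exponential growth between the two time points, which is an assumption made here for illustration and not a modelling choice of [1].

```python
import math
from collections import defaultdict

# (mouse ID, time in day, tumour volume in cm^3), taken from the rows of the
# cleaned table that are displayed above
rows = [
    (40, 0, 0.191808), (94, 0, 0.077248), (95, 0, 0.186200),
    (136, 0, 0.118588), (140, 0, 0.079330),
    (136, 30, 1.459342), (94, 30, 0.576240),
    (169, 30, 0.746986), (140, 30, 2.122582)]

# Group the measurements by mouse: {mouse ID: {time: volume}}
series = defaultdict(dict)
for mouse_id, time, volume in rows:
    series[mouse_id][time] = volume

# Compute fold change and apparent exponential growth rate (assumption!)
# for the mice with a measurement at both day 0 and day 30
fold_changes = {}
growth_rates = {}
for mouse_id, volumes in series.items():
    if 0 not in volumes or 30 not in volumes:
        continue
    fold_changes[mouse_id] = volumes[30] / volumes[0]
    growth_rates[mouse_id] = math.log(fold_changes[mouse_id]) / 30

for mouse_id in sorted(fold_changes):
    print('ID %d: fold change %.1f, apparent rate %.3f / day' % (
        mouse_id, fold_changes[mouse_id], growth_rates[mouse_id]))
```

If the growth were exactly exponential, the data of each mouse would fall on a straight line on the log y-scale, which is what the "Log y-scale" button in the figure helps to inspect.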

## Export cleaned data

In [18]:
#
# Export cleaned data sets for inference in other notebooks.
#

import os
import pandas as pd


# Get path of current working directory
path = os.getcwd()

# Export cleaned LXF A677 control growth data
lxf_data.to_csv(path + '/data/lxf_control_growth.csv')

## Bibliography

- <a name="ref1"> [1] </a> Eigenmann et al., Combining Nonclinical Experiments with Translational PKPD Modeling to Differentiate Erlotinib and Gefitinib, Mol Cancer Ther (2016)
