# Simulating distributions of MT catastrophe times in a two-step model {#exr-two-step-sim}

<hr />

In @exr-two-step-mt-model, you worked out the PDF for a model in which two biochemical processes must happen in succession to trigger microtubule catastrophe. The resulting PDF and the corresponding CDF are

$$\begin{align}
&f(t;\beta_1, \beta_2) = \frac{\beta_1 \beta_2}{\beta_2 - \beta_1}\left(\mathrm{e}^{-\beta_1 t} - \mathrm{e}^{-\beta_2 t}\right), \\[1em]
&F(t; \beta_1, \beta_2) = 
\frac{\beta_1 \beta_2}{\beta_2-\beta_1}\left[
\frac{1}{\beta_1}\left(1-\mathrm{e}^{- \beta_1 t}\right)- \frac{1}{\beta_2}\left(1-\mathrm{e}^{-\beta_2 t}\right)
\right].
\end{align}
$$

In a typical experiment, Gardner and Zanic measured about 150 catastrophe events. Use random number generation to simulate one of these experiments with this successive Poisson process model and plot the ECDF of times to catastrophe. That is, generate 150 random numbers that are distributed according to the story of the model. You can plot the time axis of the ECDF in units of $\beta_1^{-1}$. Do this for several values of $\beta_2/\beta_1$. Overlay the analytical CDF with an ECDF from your simulation to verify that they match.


## Solution

<hr>

In [1]:
# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
    cmd = "pip install --upgrade polars iqplot watermark"
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    data_path = "https://s3.amazonaws.com/bebi103.caltech.edu/data/"
else:
    data_path = "../data/"
# ------------------------------

import polars as pl
import numpy as np
import scipy.stats as st

import iqplot

import bokeh.io
import bokeh.palettes
import bokeh.plotting
bokeh.io.output_notebook()

To simulate this model, we simply draw a time out of each of two Exponential distributions and add them.

In [2]:
rng = np.random.default_rng(3252)

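def draw_model(beta_1, beta_2, size=1):
    """Draw times for two successive Poisson processes with rates beta_1 and beta_2."""
    # NumPy parametrizes the Exponential by its scale, which is the inverse of the rate
    return rng.exponential(1/beta_1, size=size) + rng.exponential(1/beta_2, size=size)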
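
As a quick sanity check on this sampling scheme, we can compare the sample mean and variance of many draws against the values expected for a sum of two independent Exponential variables, namely $1/\beta_1 + 1/\beta_2$ and $1/\beta_1^2 + 1/\beta_2^2$. The sketch below is self-contained; the names `rng_check` and `draw_two_step`, the choice $\beta_2/\beta_1 = 3$, and the sample size of 200,000 are illustrative assumptions, not part of the original solution.

```python
import numpy as np

# Separate generator so this check does not disturb the main simulation
rng_check = np.random.default_rng(3252)


def draw_two_step(beta_1, beta_2, size=1, rng=rng_check):
    """Sum of two independent Exponential waiting times with rates beta_1, beta_2."""
    # rng.exponential() takes the scale (1 / rate), not the rate itself
    return rng.exponential(1 / beta_1, size=size) + rng.exponential(
        1 / beta_2, size=size
    )


beta_1, beta_2 = 1.0, 3.0  # illustrative rates
samples = draw_two_step(beta_1, beta_2, size=200_000)

# Means and variances of independent random variables add
expected_mean = 1 / beta_1 + 1 / beta_2
expected_var = 1 / beta_1**2 + 1 / beta_2**2

print(f"mean: sample {samples.mean():.3f}, expected {expected_mean:.3f}")
print(f"var : sample {samples.var():.3f}, expected {expected_var:.3f}")
```

With this many draws, the sample moments should agree with the expected ones to within a percent or so. A large discrepancy would usually point to passing a rate where NumPy expects a scale.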

We will set $\beta_1 = 1$ (since that sets the units of time), and will generate ECDFs for various values of $\beta_2 / \beta_1$.

In [3]:
n_samples = 150

p = bokeh.plotting.figure(
    frame_height=250,
    frame_width=400,
    x_axis_label="time to catastrophe × β₁",
    y_axis_label="ECDF",
)

beta_ratio = [0.1, 0.3, 1, 3, 10]

catastrophe_times = np.concatenate(
    [draw_model(1, br, size=n_samples) for br in beta_ratio]
)
beta_ratios = np.concatenate([[br] * n_samples for br in beta_ratio])
df = pl.DataFrame(
    {"β₂/β₁": beta_ratios, "time to catastrophe × β₁": catastrophe_times}
)

# Draw the ECDFs on the figure set up above so its size and axis labels are used
p = iqplot.ecdf(
    df,
    q="time to catastrophe × β₁",
    cats="β₂/β₁",
    palette=bokeh.palettes.Blues7[1:-1][::-1],
    p=p,
)
p.legend.title = "β₂/β₁"

bokeh.io.show(p)

Zooming in on short times, we can see that each of these CDFs has an inflection point, unlike the CDF of the Exponential distribution, which is concave everywhere. We can now compute a smooth CDF to overlay.

In [4]:
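def model_cdf(t, beta_1, beta_2):
    """CDF of the two-step model with rates beta_1 and beta_2."""
    # For equal rates the general formula divides by zero. In that case the
    # sum of two i.i.d. Exponentials is Gamma distributed with shape parameter 2.
    if np.isclose(beta_1, beta_2):
        return st.gamma.cdf(t, 2, loc=0, scale=1 / beta_1)

    cdf = (1 - np.exp(-beta_1 * t)) / beta_1 - (1 - np.exp(-beta_2 * t)) / beta_2

    return beta_1 * beta_2 * cdf / (beta_2 - beta_1)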
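
The special case for equal rates can be checked numerically: as $\beta_2 \to \beta_1$, the general expression should approach the CDF of a Gamma distribution with shape parameter 2. The following is a minimal sketch of that check, assuming $\beta_1 = 1$; the helper name `two_step_cdf`, the time grid, and the offsets `eps` are illustrative choices.

```python
import numpy as np
import scipy.stats as st


def two_step_cdf(t, beta_1, beta_2):
    """General two-step CDF; only valid for beta_1 != beta_2."""
    term = (1 - np.exp(-beta_1 * t)) / beta_1 - (1 - np.exp(-beta_2 * t)) / beta_2
    return beta_1 * beta_2 * term / (beta_2 - beta_1)


t = np.linspace(0, 10, 201)

# Limiting case: sum of two Exponentials with the same rate (here, rate 1)
gamma_cdf = st.gamma.cdf(t, 2, loc=0, scale=1.0)

max_diffs = []
for eps in [1e-1, 1e-2, 1e-3]:
    # Largest pointwise gap between the general formula and the Gamma CDF
    max_diff = np.max(np.abs(two_step_cdf(t, 1.0, 1.0 + eps) - gamma_cdf))
    max_diffs.append(max_diff)
    print(f"beta_2 = 1 + {eps:g}: max |F - F_gamma| = {max_diff:.2e}")
```

The maximum difference should shrink roughly in proportion to `eps`, which supports switching to the Gamma CDF when the two rates are numerically indistinguishable.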

Let's now overlay the theoretical CDF on the plot.

In [5]:
t = np.linspace(0, 70, 500)

for br in beta_ratio:
    cdf = model_cdf(t, 1, br)
    p.line(t, cdf, color="orange")

bokeh.io.show(p)

Looks like we got it right!
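
To complement the visual comparison, one option is a Kolmogorov-Smirnov test of simulated times against the analytical CDF for each value of $\beta_2/\beta_1$. This is only a sketch of one possible check, not part of the original solution. It redefines the sampler and CDF so that it runs on its own, and the names `rng_ks`, `two_step_cdf`, and `ks_results` are hypothetical.

```python
import numpy as np
import scipy.stats as st

rng_ks = np.random.default_rng(3252)  # illustrative seed


def two_step_cdf(t, beta_1, beta_2):
    """Analytical CDF of the two-step model, with the equal-rate case handled."""
    if np.isclose(beta_1, beta_2):
        return st.gamma.cdf(t, 2, loc=0, scale=1 / beta_1)
    term = (1 - np.exp(-beta_1 * t)) / beta_1 - (1 - np.exp(-beta_2 * t)) / beta_2
    return beta_1 * beta_2 * term / (beta_2 - beta_1)


n_samples = 150
beta_1 = 1.0
ks_results = {}

for ratio in [0.1, 0.3, 1, 3, 10]:
    beta_2 = ratio * beta_1

    # Simulate one experiment: sum of two Exponential waiting times
    times = rng_ks.exponential(1 / beta_1, size=n_samples) + rng_ks.exponential(
        1 / beta_2, size=n_samples
    )

    # One-sample KS test; extra args are passed on to the CDF callable
    res = st.kstest(times, two_step_cdf, args=(beta_1, beta_2))
    ks_results[ratio] = (res.statistic, res.pvalue)
    print(f"ratio = {ratio:>4}: KS statistic = {res.statistic:.3f}, p-value = {res.pvalue:.3f}")
```

If the simulation and the analytical CDF describe the same distribution, the KS statistics should be small for 150 samples. Because five tests are run, an occasional modest p-value is not by itself a cause for concern.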

## Computing environment

In [6]:
%load_ext watermark
%watermark -v -p numpy,polars,bokeh,iqplot,jupyterlab

Python implementation: CPython
Python version       : 3.13.5
IPython version      : 9.4.0

numpy     : 2.2.6
polars    : 1.31.0
bokeh     : 3.7.3
iqplot    : 0.3.7
jupyterlab: 4.4.5

