<html>
    <summary></summary>
         <div> <p></p> </div>
         <div style="font-size: 20px; width: 800px;"> 
              <h1>
               <left>Group practice on fitting models to data.</left>
              </h1>
              <p><left>============================================================================</left> </p>
<pre>Course: ASU CBP Summer School 2025
Instructor: Dr. Douglas Shepherd
Contact Info: douglas.shepherd@asu.edu
Authors: Dr. Douglas Shepherd
</pre>
         </div>

</html>

<details>
  <summary>Copyright info</summary>

```
Copyright 2025 Douglas Shepherd

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```
</details>



<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/QI2lab/2025-CBP-SummerSchool/blob/main/Module3-ModelingBiochemicalReactions/M3D_Fitting_Models_to_Data.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/QI2lab/2025-CBP-SummerSchool/blob/main/Module3-ModelingBiochemicalReactions/M3D_Fitting_Models_to_Data.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

----------
# Learning Objectives for this group exercise:
--------------

After this group exercise, you should be able to:
* Describe how to use least-squares fitting to fit a model to data
* Perform parameter fitting for ODE, SSA, and CME models of birth-decay
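
As a reminder of what "least-squares fitting" means in the first objective: the fit picks the parameters that minimize the sum of squared differences between the data and the model prediction. The notation below is chosen here for illustration and is not taken from the course slides: $y_i$ is the measured value at time $t_i$, $f(t_i;\theta)$ is the model prediction, and $\theta = (k_1, k_2)$ collects the rate parameters.

$$
\hat{\theta} = \underset{\theta}{\arg\min} \sum_{i=1}^{M} \left( y_i - f(t_i;\theta) \right)^2
$$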

![alt text](Figures/StoichiometryAndPropensity_2023/Slide27.png)

# 1. Simulate birth-decay using random parameters

In [None]:
# Load necessary packages and libraries
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import rc
rc('animation', html='jshtml')
figSize=600
rng = np.random.default_rng()

### Create the birth-decay model system

In [None]:
# Here we have a single species (N=1) that starts at x0 = 10 molecules
x0=np.array([10])

# Define the stoichiometry matrix as a 1 (species) by 2 (reaction) numpy array.
# The nested brackets make S two-dimensional, so S[:, i] selects the column for reaction i.
S=np.array([[1,-1]])

# Define the reaction rate parameters as a dictionary
k1_random = rng.uniform(low=0.3, high=0.7)
k2_random = rng.uniform(low=0.3, high=0.7)
pars = {'k1': k1_random, 'k2': k2_random}
# Units are:  k1 = molecules / minute
#             k2 = 1 / minute

# Define the reaction rate (propensity) functions
def W(x,t,pars=pars):
  return np.array([pars['k1'],pars['k2']*x[0]])
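
The same stoichiometry and propensities also define the deterministic (ODE) version of the model, $dx/dt = S\,W(x,t)$, which for birth-decay reduces to $dx/dt = k_1 - k_2 x$ with steady state $k_1/k_2$. The following is a minimal sketch, assuming SciPy is available. The names `S_demo`, `W_demo`, `pars_demo`, and `rhs` and the rate values 0.5 and 0.4 are illustrative choices, not part of the notebook.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative (hypothetical) values; the notebook draws its rates at random
S_demo = np.array([[1, -1]])           # 1 species x 2 reactions
pars_demo = {'k1': 0.5, 'k2': 0.4}     # birth rate, decay rate

def W_demo(x, t, pars):
    """Propensities for birth (constant) and decay (proportional to x)."""
    return np.array([pars['k1'], pars['k2'] * x[0]])

def rhs(t, x):
    """Right-hand side of the ODE: dx/dt = S @ W(x, t)."""
    return S_demo @ W_demo(x, t, pars_demo)

# Integrate from x(0) = 10 over the same 0-100 minute window
t_eval = np.linspace(0, 100, 11)
sol = solve_ivp(rhs, t_span=(0, 100), y0=[10.0], t_eval=t_eval)
x_ode = sol.y[0]

print("ODE value at t = 100:", x_ode[-1])
print("Expected steady state k1/k2:", pars_demo['k1'] / pars_demo['k2'])
```

The ODE solution describes the mean of the stochastic model for this linear system, so it is a natural candidate model when fitting to trajectory averages.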

### Define the functions needed to run an SSA at set time points

In [None]:
# Helper functions for running the stochastic simulation algorithm (SSA)
# Waiting time until the next reaction: exponentially distributed with a rate equal to the total propensity
def next_time(x, t, pars):
    return -np.log(np.random.rand()) / np.sum(W(x, t, pars))

# Let's define a simple function that computes the index of the next reaction given our propensity functions:
def next_reaction(x, t, pars):
    Wx = W(x, t, pars)
    Wx_sum = np.sum(Wx)
    r = np.random.rand()
    i = 0
    W_sum = Wx[0]
    while W_sum / Wx_sum < r:
        i += 1
        W_sum += Wx[i]
    return i

# Gillespie algorithm to return the trajectories at specific time points.
def gillespie1(x0, t0, tmax, S, pars, trajectory_times = None):
    # If the user did not provide specific time points, we will use the default
    if trajectory_times is None:
        trajectory_times = np.linspace(t0, tmax, 20)
    
    # Initialize the time and the state
    t = t0
    x = x0
     
    # Initialize the output
    states = np.zeros((len(trajectory_times), len(x0)))

    # next time index
    nextTimeIndex = 0

    # Run the simulation
    while t < tmax:
        # Compute the time of the next reaction
        tau = next_time(x, t, pars)
        t += tau
        
        # Check to see if we need to save the state
        while t >= trajectory_times[nextTimeIndex]:
            states[nextTimeIndex] = x
            nextTimeIndex += 1
            if nextTimeIndex >= len(trajectory_times):
                return states
        
        # Compute the index of the next reaction
        i = next_reaction(x, t, pars)

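        # Update the state
        x = x + S[:, i]

    # Reached only if some requested time points lie beyond tmax; those entries remain zero
    return states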

### Run the SSA and plot the results for the model

In [None]:
n_traj = 1000
t_max = 100
trajectory_times = np.linspace(0,t_max,1000)

# Initialize the results
experimental_results = np.zeros((n_traj, len(trajectory_times)))
for i in range(n_traj):
    # The model has a single species, so its counts are in column 0
    experimental_results[i] = gillespie1(x0, 0, t_max, S, pars, trajectory_times)[:, 0]

# Plot the individual trajectories together with their mean +/- one standard deviation
mean_counts = np.mean(experimental_results, axis=0)
std_counts = np.std(experimental_results, axis=0)
plt.plot(trajectory_times, experimental_results.T, color='gray', alpha=0.5)
plt.plot(trajectory_times, mean_counts, color='r', linewidth=2)
plt.fill_between(trajectory_times, mean_counts - std_counts, mean_counts + std_counts, color='r', alpha=0.3)
plt.xlabel('Time (minutes)')
plt.ylabel('Number of molecules')
plt.title('Counts versus time')
plt.show()
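
One quick sanity check on simulated data like this: at long times the birth-decay model settles into a Poisson distribution with mean $k_1/k_2$, so the late-time mean and variance should be nearly equal (Fano factor close to 1). The sketch below uses synthetic Poisson counts as a stand-in for the late-time columns of the simulated trajectories. The names `late_counts` and `rng_demo` and the rate values are hypothetical.

```python
import numpy as np

# Stand-in for late-time SSA counts (hypothetical rates; the notebook's are random)
rng_demo = np.random.default_rng(0)
k1_demo, k2_demo = 0.5, 0.4
late_counts = rng_demo.poisson(lam=k1_demo / k2_demo, size=(1000, 200))

# Pool all trajectories and late time points
mean_ss = late_counts.mean()
var_ss = late_counts.var()
fano = var_ss / mean_ss  # close to 1 for a Poisson distribution

print("Steady-state mean:", mean_ss, " (k1/k2 =", k1_demo / k2_demo, ")")
print("Fano factor:", fano)
```

With the notebook's own data, one could apply the same statistics to something like `experimental_results[:, trajectory_times > 50]`; the cutoff of 50 minutes is an assumption about when the transient has decayed. Note that the steady state alone constrains only the ratio $k_1/k_2$, so the early transient is what separates the two rates.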

# 2. Fitting the data

### Strategy

1. Define what kind of model to use (ODE, SSA, or CME)
2. Decide on a fitting routine to use (least squares, likelihood, etc.)
3. Run the fitting and compare the obtained $k_1$, $k_2$ to the known `k1_random`, `k2_random`
4. Discuss how to improve your parameter estimates
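
As one possible starting point for steps 1 to 3, the sketch below fits the ODE mean of the birth-decay model by least squares using `scipy.optimize.curve_fit`. It is not the intended solution to the exercise. It runs on synthetic data generated from hypothetical "true" rates (0.5 and 0.4) with added Gaussian noise, so that it is self-contained; `X0_DEMO`, `birth_decay_mean`, `t_data`, and `y_data` are illustrative names.

```python
import numpy as np
from scipy.optimize import curve_fit

X0_DEMO = 10.0  # assumed initial count, matching x0 = 10 in the notebook

def birth_decay_mean(t, k1, k2):
    """Solution of dx/dt = k1 - k2*x with x(0) = X0_DEMO."""
    x_ss = k1 / k2
    return x_ss + (X0_DEMO - x_ss) * np.exp(-k2 * t)

# Synthetic stand-in for trajectory-averaged data (hypothetical "true" rates)
rng_demo = np.random.default_rng(1)
k1_true, k2_true = 0.5, 0.4
t_data = np.linspace(0, 20, 50)
noise = rng_demo.normal(0, 0.05, t_data.size)
y_data = birth_decay_mean(t_data, k1_true, k2_true) + noise

# Least-squares fit; the small positive lower bound keeps k2 away from zero
popt, pcov = curve_fit(birth_decay_mean, t_data, y_data,
                       p0=[1.0, 1.0], bounds=(1e-6, np.inf))
k1_fit, k2_fit = popt
k1_err, k2_err = np.sqrt(np.diag(pcov))  # one-sigma uncertainty estimates

print(f"k1: fit = {k1_fit:.3f} +/- {k1_err:.3f}, true = {k1_true}")
print(f"k2: fit = {k2_fit:.3f} +/- {k2_err:.3f}, true = {k2_true}")
```

To use this on the simulated data, one would replace `t_data` and `y_data` with `trajectory_times` and the mean of `experimental_results` over trajectories. For step 4, it is worth comparing how the estimates change with the number of trajectories and with how much of the early transient is sampled.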