<html>
    <summary></summary>
         <div> <p></p> </div>
         <div style="font-size: 20px; width: 800px;"> 
              <h1>
               <left>Modeling Biological Processes - Chemical Master Equations</left>
              </h1>
              <p><left>============================================================================</left> </p>
<pre>Course: ASU CBP Summer School 2025
Instructor: Dr. Douglas Shepherd
Contact Info: douglas.shepherd@asu.edu
Authors: Dr. Michael May, Dr. Brian Munsky, Dr. Douglas Shepherd
</pre>
         </div>

</html>

<details>
  <summary>Copyright info</summary>

```
Copyright 2024 Brian Munsky

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```
</details>



<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/QI2lab/2025-CBP-SummerSchool/blob/main/Module3-ModelingBiochemicalReactions/M3C_Chemical_Master_Equation.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/QI2lab/2025-CBP-SummerSchool/blob/main/Module3-ModelingBiochemicalReactions/M3C_Chemical_Master_Equation.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

----------
# Learning Objectives for this Notebook:
--------------

After completing this notebook, you should be able to:
* Define and describe the **Chemical Master Equation** in terms of stoichiometry vectors and propensity functions.
* Define and construct the **state space** for a CME analysis.
* Build a CME **Infinitesimal Generator matrix** using knowledge of stoichiometry vectors and propensity functions.
* Recognize important properties of the **Infinitesimal Generator**, especially those involving special **eigenvalues** and **eigenvectors**.
* Set up and solve the CME with the **Finite State Projection** (FSP) approach, using either **ODE integrators** or **matrix exponentials**.
* Make plots and animations of CME solutions.

# 1. Motivation for Computing Single-Cell distributions
![alt text](Figures/ChemicalMasterEquation_2022/Slide2.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide3.png)

# 2. The Markov Chain Representation for Chemical Kinetics
![alt text](Figures/ChemicalMasterEquation_2022/Slide6.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide7.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide8.png)

# 3. The Chemical Master Equation

![alt text](Figures/ChemicalMasterEquation_2022/Slide9.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide11.png)


# 4. The Infinitesimal Generator Matrix
![alt text](Figures/ChemicalMasterEquation_2022/Slide13.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide15.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide16.png)

# 5. The Finite State Projection Approximation
![alt text](Figures/ChemicalMasterEquation_2022/Slide17.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide18.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide19.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide20.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide21.png)
![alt text](Figures/ChemicalMasterEquation_2022/Slide22.png)

# 6. Python Codes
## 6.1. Import Libraries

In [None]:
# Load necessary packages and libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy
import scipy.linalg  # explicit import so that scipy.linalg.expm is available below
from IPython.display import Image
import matplotlib.animation as animation
from matplotlib import rc
from scipy.integrate import ode

rc('animation', html='jshtml')
figSize=600

## 6.2. Construct the Infinitesimal Generator

Assuming that one has a defined model (in terms of a stoichiometry matrix and propensity functions), the main new concept needed to solve the CME using the FSP approach is the construction of the infinitesimal generator matrix.  Below is a simple function that achieves this task.

In [None]:
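# Here, I provide a function to build the infinitesimal generator matrix for a continuous-time Markov chain.
# The function takes the stoichiometry matrix S, the propensity vector function W, the states in the FSP
# approximation, the parameters pars, and the time t as input.
# It returns the per-reaction generators, the sink vector, and the full infinitesimal generator matrix.

def build_inf_gen(S, W, states, pars, t=0):
    """ Build the infinitesimal generator matrix for a continuous-time Markov chain.
    S: Stoichiometry matrix (nSpecies x nReactions)
    W: Propensity vector function, called as W(states, t, pars)
    states: array of states in the FSP approximation (nSpecies x nStates)
    pars: model parameters passed to W
    t: time at which the propensities are evaluated

    Returns (infGens, sink, infGen): the list of per-reaction generators, the rates
    of flow from each state into the sink, and the full generator with the sink as
    its last row and column.
    """
    
    # Determine the number of species, states and reactions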
    nSpecies, nStates = states.shape
    nReactions = S.shape[1]
    
    # Compute the propensity functions for all states.
    propensities = W(states, t, pars)

    # Step through each reaction
    infGens = []
    infGen = np.zeros((nStates+1,nStates+1))
    sink = np.zeros(nStates)
    for mu in range(nReactions):
        # Compute flow of probability out of all states due to reaction mu
        infGens.append(-np.diag(propensities[mu,:]))
        
        for i in range(nStates):
            # Compute the states after the reaction mu
            newState = (states[:,i] + S[:,mu]).reshape(-1,1) 

            # Check if the state is non-negative
            if np.all(newState >= 0):
                # Find the index of the state
                try:
                    j = np.where((states == newState).all(axis=0))[0][0]
                    infGens[mu][j,i] += propensities[mu,i]

                except IndexError:
                    # The new state is outside the FSP state space, so send the flow to the sink
                    sink[i] += propensities[mu,i]
        # Add the current reaction to the infinitesimal generator
        infGen[:nStates,:nStates] += infGens[mu]
    
    # Add the sink as the final row of the infinitesimal generator
    infGen[nStates,:nStates] = sink

    return infGens, sink, infGen
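
To see the structure that such a function produces, the following minimal sketch builds a generator by hand for a one-species birth-death process (production at rate `k`, degradation at rate `g*x`) truncated at `nMax` molecules. The model, the parameter values, and the helper name `birth_death_generator` are assumptions chosen for illustration and are not part of the example that follows. As in `build_inf_gen`, the last row collects the probability that leaves the truncated state space.

```python
import numpy as np

def birth_death_generator(k, g, nMax):
    """Hypothetical helper: generator for 0 -> X (rate k) and X -> 0 (rate g*x),
    truncated to x = 0..nMax, with one extra sink row and column."""
    n = nMax + 1
    A = np.zeros((n + 1, n + 1))
    for x in range(n):
        # Probability leaves state x at the total propensity.
        A[x, x] = -(k + g * x)
        # Degradation moves x -> x-1 (its propensity is zero at x = 0).
        if x > 0:
            A[x - 1, x] = g * x
        # Production moves x -> x+1, or into the sink when x+1 is outside the truncation.
        if x < nMax:
            A[x + 1, x] = k
        else:
            A[n, x] = k
    return A

A_toy = birth_death_generator(k=2.0, g=0.5, nMax=4)
col_sums = A_toy.sum(axis=0)
print(np.round(A_toy, 2))
print('Largest absolute column sum:', np.max(np.abs(col_sums)))
```

Each column holds the outflow on the diagonal and the matching inflows elsewhere, so every column sums to zero once the sink row is included.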

# 7. Examples
## 7.1. Bursting Gene Expression

![alt text](Figures/StoichiometryAndPropensity_2023/Slide29.png)

### (Steps 1-4) Define Stoichiometry Matrix And Propensity Functions

In [None]:
# Note - This code is copied from the previous Module M5A.  We are going to use the same steps as
# before to define the model.  The only difference is that we are going to solve the CME using the
# FSP approach (and later compare to the Gillespie algorithm) instead of using the ODE solver.

# Step 1: The number of species is 3:  'OFF', 'ON', 'Protein'
#         The initial conditions are:  x0 = [1, 0, 5]
#         The initial condition corresponds to 1 molecule of 'OFF', 0 molecules of 'ON', and 5 molecules of 'Protein'
#         The initial condition is defined as a numpy array of length 3:
x0 = np.array([1, 0, 5])

# Step 2: The number of reactions is 4:
#         R1:  OFF -> ON
#         R2:  ON -> OFF
#         R3:  ON -> ON + Protein
#         R4:  Protein -> null

# Step 3: The stoichiometry matrix is a 3 x 4 matrix:
#         Rows correspond to species and columns correspond to reactions.
#         The matrix is defined as follows:
S = np.array([[-1, 1, 0, 0],
        [1, -1, 0, 0],
        [0, 0, 1, -1]])

# Step 4: Define the reaction rate parameters as a dictionary, and the propensity functions
#         as a function of the state x, the time t, and the parameters.
pars = {'kon': 0.02, 'koff': 0.05, 'kP': 10, 'gam': 0.1}
def W(x,t,pars=pars):
  return np.array([pars['kon']*x[0],
                   pars['koff']*x[1],
                   pars['kP']*x[1],
                   pars['gam']*x[2]])


### Step 5 - Specify State Space

In [None]:
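# Specify the State Space of the FSP projection.
# Each protein level i = 0, ..., maxProtein contributes two states (columns):
# column 2*i is [OFF=1, ON=0, Protein=i] and column 2*i+1 is [OFF=0, ON=1, Protein=i].
maxProtein = 180
states = np.zeros((3, maxProtein*2+2))
for i in range(maxProtein+1):
    states[:,2*i:2*i+2] = [[1,0], [0,1], [i,i]]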
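
When the state space is enumerated in a fixed order like this, it can be convenient to also keep a lookup table from each state to its column index. The sketch below does this for a much smaller truncation; the names `maxProteinToy`, `toyStates`, and `state_index` are hypothetical and are used only for this illustration. A dictionary lookup is one possible alternative to searching the state array each time an index is needed.

```python
import numpy as np

maxProteinToy = 3  # small, hypothetical truncation for illustration

# Enumerate (OFF, ON, Protein) states with the same ordering as above:
# for each protein count, the OFF state comes first and the ON state second.
toyStates = np.array([[1 - on, on, p]
                      for p in range(maxProteinToy + 1)
                      for on in (0, 1)]).T

# Map each state (as a tuple) to its column index in toyStates.
state_index = {tuple(col): j for j, col in enumerate(toyStates.T)}

print(toyStates)
print('Index of state (OFF=1, ON=0, Protein=2):', state_index[(1, 0, 2)])
```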

### Step 6 - Build Infinitesimal Generator Matrix.

In [None]:
# Compute the Infinitesimal generators:
infGens, sink, infGenTotal = build_inf_gen(S, W, states, pars)

In [None]:
#Show that probability is conserved by checking the column sum of the infinitesimal generator is zero.

# Perform the column sum of 'infGenTotal'
print('Maximum absolute column sum of the Infinitesimal Generator Matrix: ',
       np.max(np.abs(np.sum(infGenTotal, axis=0))))
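
Zero column sums have a useful consequence for a generator without a sink: zero is an eigenvalue, and the corresponding eigenvector, once normalized to sum to one, is the stationary distribution. The sketch below checks this for the two-state OFF/ON switch alone, using the same `kon` and `koff` values as in `pars`. Reducing the model to these two states is an assumption made to keep the example small, and the name `A_switch` is hypothetical.

```python
import numpy as np

kon, koff = 0.02, 0.05

# Generator for the closed two-state switch; state order is [OFF, ON].
A_switch = np.array([[-kon,  koff],
                     [ kon, -koff]])

eigvals, eigvecs = np.linalg.eig(A_switch)

# Pick the eigenvector whose eigenvalue is closest to zero and normalize it.
iZero = np.argmin(np.abs(eigvals))
p_stat = np.real(eigvecs[:, iZero])
p_stat = p_stat / p_stat.sum()

print('Eigenvalues:', eigvals)
print('Stationary distribution [OFF, ON]:', p_stat)
```

For this switch the other eigenvalue is `-(kon + koff)`, which sets the rate at which the distribution relaxes toward the stationary one.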

### Step 7 - Specify the Initial Probability Distribution


In [None]:
# Here is a convenient way to find which index corresponds to the initial state:

# Find which state corresponds to the initial condition
idx0 = np.where([(states[:,i]==x0).all() for i in range(states.shape[1])])

# Initialize the initial probability vector
P0 = np.zeros(infGenTotal.shape[1])

# Set the probability of the initial state to 1
P0[idx0] = 1


In [None]:
print(x0, P0)

### Step 8 - Solve the FSP

In [None]:
# Define the time span for the simulation
tspan = np.linspace(0, 2000, 100)

def FSPrhs(t, P):
    return infGenTotal @ P

def jac(t, P):
    return infGenTotal

In [None]:
# Define the ODE solver
solver = ode(FSPrhs, jac=jac)
solver.set_integrator('vode', method='bdf')
solver.set_initial_value(P0)

# Solve the ODE
P_approach1 = np.zeros((len(tspan), len(P0)))
P_approach1[0,:] = P0
for i in range(1, len(tspan)):
    P_approach1[i,:] = solver.integrate(tspan[i])


In [None]:
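# Plot the results
# Probability that has escaped into the sink state (the last entry of P)
plt.figure(figsize=(4,4))
plt.plot(tspan, P_approach1[:,-1])
plt.xlabel('Time')
plt.ylabel('Escape Probability')

# Marginal protein distribution at the final time (sum over the OFF and ON states), on a log scale
plt.figure(figsize=(4,4))
pProtein = P_approach1[:,:-1:2]+P_approach1[:,1:-1:2]
plt.plot(np.log(pProtein[-1,:]))
plt.xlabel('Protein Level')
plt.ylabel('log(Probability)')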
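
The probability that accumulates in the sink is what makes the FSP useful as an approximation: it measures how much probability has left the truncated state space, and it shrinks as the truncation is enlarged. The sketch below illustrates this on a one-species birth-death process rather than on the bursting gene model. The model, its rates, and the helper name `toy_generator` are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import expm

def toy_generator(k, g, nMax):
    """Hypothetical helper: birth-death generator on x = 0..nMax plus a sink state."""
    n = nMax + 1
    A = np.zeros((n + 1, n + 1))
    for x in range(n):
        A[x, x] = -(k + g * x)       # total outflow from state x
        if x > 0:
            A[x - 1, x] = g * x      # degradation: x -> x-1
        A[x + 1, x] = k              # production: x -> x+1 (row n is the sink)
    return A

k, g, t_final = 2.0, 0.5, 10.0
sink_prob = {}
for nMax in (5, 10, 20):
    A = toy_generator(k, g, nMax)
    P_init = np.zeros(nMax + 2)
    P_init[0] = 1.0                  # start with zero molecules
    P_final = expm(A * t_final) @ P_init
    sink_prob[nMax] = P_final[-1]    # probability that escaped the truncation
    print('nMax =', nMax, ' sink probability =', sink_prob[nMax])
```

If the sink probability at the final time is larger than the error one is willing to accept, the usual remedy is to enlarge the state space and solve again.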

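Solve again using the matrix exponential approach.
For a time-invariant infinitesimal generator $\mathbf{A}$, the probability vector after a time step $\Delta t$ is given by

$\mathbf{P}(t+\Delta t)=\exp(\mathbf{A}\Delta t)\cdot \mathbf{P}(t)$

where $\exp(\cdot)$ denotes the matrix exponential (computed below with `scipy.linalg.expm`).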
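
The formula above can be checked on a system small enough to solve by hand. The sketch below uses the two-state OFF/ON switch on its own (an assumption made for illustration, with the `kon` and `koff` values from `pars`), for which the ON probability starting from OFF is $\frac{k_{on}}{k_{on}+k_{off}}\left(1-e^{-(k_{on}+k_{off})t}\right)$. It compares repeated stepping with a single matrix exponential over the whole interval and with this closed-form result.

```python
import numpy as np
from scipy.linalg import expm

kon, koff = 0.02, 0.05
A_switch = np.array([[-kon,  koff],
                     [ kon, -koff]])   # state order is [OFF, ON]
P_init = np.array([1.0, 0.0])         # start in the OFF state

dt, nSteps = 5.0, 20
t_end = dt * nSteps

# Approach A: apply the same one-step propagator nSteps times.
step = expm(A_switch * dt)
P_stepped = P_init.copy()
for _ in range(nSteps):
    P_stepped = step @ P_stepped

# Approach B: a single matrix exponential over the whole interval.
P_direct = expm(A_switch * t_end) @ P_init

# Closed-form ON probability for the two-state switch.
p_on_exact = kon / (kon + koff) * (1 - np.exp(-(kon + koff) * t_end))

print('Stepped :', P_stepped)
print('Direct  :', P_direct)
print('Exact ON:', p_on_exact)
```

Stepping with a fixed propagator is convenient when the solution is needed on an evenly spaced time grid, because the matrix exponential is computed only once.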

In [None]:
deltaT = tspan[1]-tspan[0]
expmAt = scipy.linalg.expm(infGenTotal*deltaT)

# Propagate the distribution one time step at a time using the matrix exponential
P_approach2 = np.zeros((len(tspan), len(P0)))
P_approach2[0,:] = P0
for i in range(1, len(tspan)):
    P_approach2[i,:] = expmAt @ P_approach2[i-1,:]


In [None]:
# Plot the results
plt.figure(figsize=(4,4))
plt.plot(tspan, P_approach1[:,-1])
plt.plot(tspan, P_approach2[:,-1],'--')
# labels
plt.xlabel('Time')
plt.ylabel('Escape Probability')
plt.legend(['ODE Solver', 'Matrix Exponential'])

plt.figure(figsize=(4,4))
pProtein1 = P_approach1[:,:-1:2]+P_approach1[:,1:-1:2]
pProtein2 = P_approach2[:,:-1:2]+P_approach2[:,1:-1:2]
plt.plot(pProtein1[-1,:])
plt.plot(pProtein2[-1,:],'--')
plt.xlabel('Protein Level')
plt.ylabel('Probability')
plt.legend(['ODE Solver', 'Matrix Exponential'])

**Compare the FSP Solution to the SSA solution.**

In [None]:
# Here is our Gillespie algorithm from Module M5B (slightly simplified)

def gillespie(x0, t0, tmax, S, pars, trajectoryTimes = None):
    # If the user did not provide specific time points, we will use the default
    if trajectoryTimes is None:
        trajectoryTimes = np.linspace(t0, tmax, 20)
    
    # Initialize the time and the state
    t = t0
    x = x0
     
    # Initialize the output
    states = np.zeros((len(trajectoryTimes), len(x0)))

    # next time index
    nextTimeIndex = 0

    # Run the simulation
    while t < tmax:

        Wx = W(x, t, pars)
        Wx_sum = np.sum(Wx)

        # Compute the time of the next reaction
        t += -np.log(np.random.rand()) / Wx_sum

        # Check to see if we need to save the state
        while t >= trajectoryTimes[nextTimeIndex]:
            states[nextTimeIndex] = x
            nextTimeIndex += 1
            if nextTimeIndex >= len(trajectoryTimes):
                return states
        
        # Find the index of the next reaction
        r = np.random.rand()
        i = 0
        W_sum = Wx[0]
        while W_sum / Wx_sum < r:
            i += 1
            W_sum += Wx[i]

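        # Update the state
        x = x + S[:, i]

    # Fallback: if the loop ends before every requested time has been recorded
    # (for example, when trajectoryTimes extends past tmax), return what was saved.
    return states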

def nGillespies(x0, t0, tmax, S, pars, nTrajectories, trajectoryTimes = None):
    if trajectoryTimes is None:
        trajectoryTimes = np.linspace(t0, tmax, 20)
    
    # Initialize the output
    states = np.zeros((nTrajectories, len(trajectoryTimes), len(x0)))

    # Run the simulation
    for i in range(nTrajectories):
        states[i,:,:] = gillespie(x0, t0, tmax, S, pars, trajectoryTimes)
    
    return states
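
The reaction-selection step in `gillespie` walks through the propensities until their running sum exceeds a random threshold. The sketch below shows one possible vectorized version of the same idea using a cumulative sum and `np.searchsorted`, and checks that reactions are chosen in proportion to their propensities. The propensity values and the names `Wx_toy`, `cumW`, and `reactions` are hypothetical; this is an illustration of the selection rule, not a replacement for the function above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical propensities for four reactions (two of them are zero).
Wx_toy = np.array([0.02, 0.0, 0.0, 0.5])
cumW = np.cumsum(Wx_toy)

# Draw many uniform random numbers and select a reaction for each one:
# reaction i is chosen when cumW[i-1] <= r * total < cumW[i].
r = rng.random(10000)
reactions = np.searchsorted(cumW, r * cumW[-1], side='right')
reactions = np.minimum(reactions, len(Wx_toy) - 1)  # guard against round-off at the upper edge

# Empirical selection frequencies versus the expected probabilities.
freq = np.bincount(reactions, minlength=len(Wx_toy)) / len(r)
print('Empirical:', freq)
print('Expected :', Wx_toy / Wx_toy.sum())
```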

In [None]:
# Run the Gillespie algorithm
nTrajectories = 100
SSAsolns = nGillespies(x0, 0, np.max(tspan), S, pars, nTrajectories, tspan)

In [None]:
# Plot a histogram of the results at the final time.

proteinCounts = SSAsolns[:,-1,2]
plt.hist(proteinCounts, bins=range(0, maxProtein+1), density=True, label='Gillespie')
plt.plot(pProtein[-1,:], alpha=0.5, label='FSP')

plt.xlabel('Protein Level')
plt.ylabel('Probability')
plt.legend()