In [1]:
# some initial imports

import jax.numpy as jnp
import jax.lax

# Using the JAX-powered resource-aware cell modelling package _rc_e_coli_jax_ for gene circuit simulations

This Jupyter notebook provides a step-by-step guide for simulating how a gene circuit behaves in the context of the host cell and other competing synthetic genes. Underlying these simulations is a resource-aware coarse-grained cell model published in [Sechkar et al., 2024](https://doi.org/10.1038/s41467-024-46410-9), which here has been implemented in JAX to enable efficient parallelised simulations on the GPU.

As an example, we consider a gene _ta_ that encodes a transcription activation factor which, upon chemical modulation, promotes the expression of the output gene _x_. In addition to these and the host cell's own genes, we consider a further synthetic gene _dist_, which acts as a disturbance by competing with _ta_ and _x_ for the host's gene expression resources. Each gene has two modelled variables associated with it: its mRNA concentration in the cell _m_ and its protein level _p_. The resultant circuit is depicted in the figure below (adapted from [Sechkar et al., 2024](https://doi.org/10.1038/s41467-024-46410-9)).

<div>
<img src="example.png" width="750"/>
</div>

## Circuit model definition

We start by defining the functions needed to model the circuit of interest: the initialiser _initialise()_, the gene regulation function _F_calc()_, the deterministic ordinary differential equation (ODE) model function _ode()_, and, if needed, a reaction propensity function _v()_ for stochastic simulations.

In the file _one_constit.py_, these functions are specified for simulating a single constitutive gene present in the cell. For any new gene circuit, one can copy this file's contents into a new file and then modify the circuit-specific code fragments, which have all been highlighted as follows:

````{python}
    # -------- SPECIFY [X] FROM HERE...
    # -------- ...TO HERE
````

### Circuit initialisation
In our case, this means that the genes specified in the initialiser function are _ta_, _x_ and _dist_. For our case study, we assume that the DNA concentrations $c$ and promoter strengths $\alpha$ of _ta_ and _x_ differ from the default values, so we re-specify them below. We additionally set the parameters of the Hill functions that describe the chemical inducer's binding to the transcription activation factor and the subsequent binding between the output gene's DNA and the inducer-bound transcription activation factor.

In [2]:
def initialise():
    # -------- SPECIFY CIRCUIT COMPONENTS FROM HERE...
    genes = ['ta','x','dist']  # names of genes in the circuit
    miscs = []  # names of miscellaneous species involved in the circuit (e.g. metabolites)
    # -------- ...TO HERE

    # for convenience, one can refer to the species' concs. by names instead of positions in x
    # e.g. x[name2pos['m_xtra']] will return the concentration of mRNA of the gene 'xtra'
    name2pos = {}
    for i in range(0, len(genes)):
        name2pos['m_' + genes[i]] = 8 + i  # mRNA
        name2pos['p_' + genes[i]] = 8 + len(genes) + i  # protein
    for i in range(0, len(miscs)):
        name2pos[miscs[i]] = 8 + len(genes) * 2 + i  # miscellaneous species
    for i in range(0, len(genes)):
        name2pos['k_' + genes[i]] =  i  # effective mRNA-ribosome dissociation constants (in k_het, not x!!!)
    for i in range(0, len(genes)):
        name2pos['F_' + genes[i]] =  i  # transcription regulation functions (in F, not x!!!)
    for i in range(0, len(genes)):
        name2pos['mscale_' + genes[i]] =  i  # mRNA count scaling factors (in mRNA_count_scales, not x!!!)

    # default gene parameters to be imported into the main model's parameter dictionary
    default_par = {}
    for gene in genes: # gene parameters
        default_par['func_' + gene] = 1.0  # gene functionality - 1 if working, 0 if mutated
        default_par['c_' + gene] = 1.0  # copy no. (nM)
        default_par['a_' + gene] = 100.0  # promoter strength (unitless)
        default_par['b_' + gene] = 6.0  # mRNA decay rate (/h)
        default_par['k+_' + gene] = 60.0  # ribosome binding rate (/h/nM)
        default_par['k-_' + gene] = 60.0  # ribosome unbinding rate (/h)
        default_par['n_' + gene] = 300.0  # protein length (aa)
        default_par['d_' + gene] = 0.0  # rate of active protein degradation - zero by default (/h)

    # default initial conditions
    default_init_conds = {}
    for gene in genes:
        default_init_conds['m_' + gene] = 0
        default_init_conds['p_' + gene] = 0
    for misc in miscs:
        default_init_conds[misc] = 0

    # -------- DEFAULT VALUES OF CIRCUIT-SPECIFIC PARAMETERS CAN BE SPECIFIED FROM HERE...
    
    # gene concentrations (nM) and promoter strengths (unitless)
    default_par['c_ta']=100
    default_par['c_x']=100
    default_par['a_ta']=50
    default_par['a_x']=50
    
    # binding between the chemical inducer and the transcription activation factor
    default_par['K_ta-f']=1000  # Half-saturation constant (nM)
    
    # binding between the output gene's DNA and the inducer-bound transcription activation factor
    default_par['K_dna(x)-taf']=700  # Half-saturation constant (nM)
    default_par['eta_dna(x)-taf']=2 # Hill coefficient/cooperativity of binding
    default_par['baseline']=0.1 # baseline output gene promoter activity in absence of binding
    
    # time of the inducer's addition to the medium (h)
    default_par['t_add']=25
    default_par['f_added']=1000  # added inducer concentration (nM)
    # -------- ...TO HERE

    # default palette and dashes for plotting (5 genes + misc. species max)
    default_palette = ["#0072BD", "#D95319", "#4DBEEE", "#A2142F", "#FF00FF"]
    default_dash = ['solid']
    # match default palette to genes and miscellaneous species, looping over the five colours we defined
    circuit_styles = {'colours': {}, 'dashes': {}}  # initialise dictionary
    for i in range(0, len(genes)):
        circuit_styles['colours'][genes[i]] = default_palette[i % len(default_palette)]
        circuit_styles['dashes'][genes[i]] = default_dash[i % len(default_dash)]
    for i in range(len(genes), len(genes) + len(miscs)):
        circuit_styles['colours'][miscs[i - len(genes)]] = default_palette[i % len(default_palette)]
        circuit_styles['dashes'][miscs[i - len(genes)]] = default_dash[i % len(default_dash)]

    # --------  YOU CAN RE-SPECIFY COLOURS FOR PLOTTING FROM HERE...
    # -------- ...TO HERE

    return default_par, default_init_conds, genes, miscs, name2pos, circuit_styles
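
To make the indexing convention concrete, the standalone sketch below rebuilds the mRNA and protein part of the _name2pos_ mapping for our three genes. It only reuses the offset of 8 that appears in _initialise()_; the interpretation that the first eight entries of the state vector are reserved for the host cell's own variables is an assumption, and the name `n_host` is introduced here purely for illustration.

```python
# Standalone sketch of the mRNA/protein index layout built in initialise()
genes = ['ta', 'x', 'dist']  # same order as in initialise()
miscs = []                   # no miscellaneous species in this circuit
n_host = 8                   # offset used in initialise() (assumed: host cell variables come first)

name2pos = {}
for i, gene in enumerate(genes):
    name2pos['m_' + gene] = n_host + i               # mRNAs come first
    name2pos['p_' + gene] = n_host + len(genes) + i  # proteins follow all mRNAs
for i, misc in enumerate(miscs):
    name2pos[misc] = n_host + 2 * len(genes) + i     # miscellaneous species come last

# print the species in the order in which they appear in the state vector
for name, pos in sorted(name2pos.items(), key=lambda item: item[1]):
    print(pos, name)
```

With three genes, the mRNAs occupy positions 8 to 10 and the proteins positions 11 to 13, which is the order in which the ODEs have to be listed later on.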

### Gene transcription regulation

We now proceed to define the gene transcription regulation function _F_calc()_.

The genes _ta_ and _dist_ are constitutive, so their transcription regulation functions $F_{ta}$ and $F_{dist}$ equal 1 at all times. For the output gene _x_, regulation happens in two steps. First, the inducer with the concentration $f$ binds the transcription activation factor $p_{ta}$, so the concentration of the active inducer-bound factor is given by a Hill function:
$$p_{ta}^{act}=p_{ta} \frac{f}{f+K_{ta-f}}$$
Second, the active transcription activation factor binds the output gene's DNA, so the output gene's promoter activity is given by another Hill function:
$$F_{x}=baseline+\frac{(p_{ta}^{act})^{\eta_{dna(x)-taf}}}{(p_{ta}^{act})^{\eta_{dna(x)-taf}} + (K_{dna(x)-taf})^{\eta_{dna(x)-taf}}}$$

The inducer may not be present in the culture medium from the start, but rather added to it at some time $t_{add}$ as a step input (see below). This can be recreated by using the _jax.lax.select()_ function in _F_calc()_.
$$f(t)=\begin{cases}0 & \text{ if } t<t_{add} \\ f_{added} & \text{ if } t\geq t_{add}\end{cases}$$


In [3]:
def F_calc(t, x, par, name2pos):
    # --------  SPECIFY THE TRANSCRIPTION REGULATION FUNCTION FROM HERE...
    F_ta = 1 # ta gene is constitutive
    F_dist = 1 # dist gene is constitutive
    
    # get the time-dependent inducer concentration
    f = jax.lax.select(t<par['t_add'], 0, par['f_added'])
    
    # get the concentration of the transcription activation factor using the 'variable name-position in state vector' mapper
    p_ta = x[name2pos['p_ta']] 
    
    # binding between the chemical inducer and the transcription activation factor
    p_ta_act = p_ta * f/(f+par['K_ta-f'])
    
    # binding between the output gene's DNA and the inducer-bound transcription activation factor
    F_x = par['baseline']+(p_ta_act**par['eta_dna(x)-taf'])/(p_ta_act**par['eta_dna(x)-taf']+par['K_dna(x)-taf']**par['eta_dna(x)-taf'])
    
    # returning the regulation function values in the same order as we specified the genes in the 'initialise()' function
    return jnp.array([F_ta, F_x, F_dist])
    # -------- ...TO HERE
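
As a quick sanity check of the two Hill functions, the sketch below evaluates $F_x$ before and after the inducer's addition. It is a standalone illustration rather than the package's code: the helper `F_x_sketch` and the protein level `p_ta = 700` nM are hypothetical, the parameter values are copied from _initialise()_, and _jnp.where()_ is used in place of _jax.lax.select()_ only because it promotes mixed scalar types automatically.

```python
import jax.numpy as jnp

# parameter values copied from initialise() above
par = {'K_ta-f': 1000.0, 'K_dna(x)-taf': 700.0, 'eta_dna(x)-taf': 2.0,
       'baseline': 0.1, 't_add': 25.0, 'f_added': 1000.0}

def F_x_sketch(t, p_ta, par):
    """Hypothetical helper: promoter activity of gene x at time t for a given p_ta."""
    # step input: no inducer before t_add, f_added afterwards
    f = jnp.where(t < par['t_add'], 0.0, par['f_added'])
    # first Hill function: inducer binding to the activation factor
    p_ta_act = p_ta * f / (f + par['K_ta-f'])
    # second Hill function: active factor binding to the output gene's DNA
    eta = par['eta_dna(x)-taf']
    return par['baseline'] + p_ta_act**eta / (p_ta_act**eta + par['K_dna(x)-taf']**eta)

p_ta = 700.0                             # assumed protein level (nM), for illustration only
F_before = F_x_sketch(10.0, p_ta, par)   # t < t_add: only the baseline activity remains
F_after = F_x_sketch(30.0, p_ta, par)    # t >= t_add: half of p_ta is active (350 nM)
print(float(F_before), float(F_after))
```

Before the inducer is added, $p_{ta}^{act}=0$ and $F_x$ reduces to the baseline of 0.1. Afterwards, $p_{ta}^{act}=350$ nM, so $F_x = 0.1 + 350^2/(350^2+700^2) = 0.3$.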

### Deterministic ODE model definition

We now define the deterministic ODE model function _ode()_, taken from Equations (S128) to (S133) of the Supplementary Information to [Sechkar et al., 2024](https://doi.org/10.1038/s41467-024-46410-9):

\begin{align}
    \dot{m}_{ta} &= F_{ta} c_{ta} \alpha_{ta} \lambda(\epsilon,B) - (\beta_{ta} + \lambda(\epsilon,B)) m_{ta}
    \\
    \dot{m}_{x} &= F_{x}(f,p_{ta}) c_{x} \alpha_{x} \lambda(\epsilon,B) - (\beta_{x} + \lambda(\epsilon,B)) m_{x}
    \\
    \dot{m}_{dist} &= F_{dist} c_{dist} \alpha_{dist} \lambda(\epsilon,B) - (\beta_{dist} + \lambda(\epsilon,B)) m_{dist}
    \\
    \dot{p}_{ta} &= \frac{\epsilon(t^c)}{n_{ta}} \cdot
    \frac{m_{ta} / k_{ta}}{D} R - (\delta_{ta} + \lambda(\epsilon,B)) \cdot p_{ta}
    \\
    \dot{p}_{x} &= \frac{\epsilon(t^c)}{n_{x}} \cdot
    \frac{m_{x} / k_{x}}{D} R - (\delta_{x} + \lambda(\epsilon,B)) \cdot p_{x}
    \\
    \dot{p}_{dist} &= \frac{\epsilon(t^c)}{n_{dist}} \cdot
    \frac{m_{dist} / k_{dist}}{D} R - (\delta_{dist} + \lambda(\epsilon,B)) \cdot p_{dist}
\end{align}

Here, $\lambda$ is the cell's growth rate, $R$ is the concentration of ribosomes in the cell, $\epsilon$ is the translation elongation rate, and $D$ is the 'resource competition denominator' capturing the extent of competition for gene expression resources (i.e. ribosomes) in the cell. For gene $i$, $F_i$ is its transcription regulation function (implemented in _F_calc()_ as discussed above), $c_i$ is its DNA concentration, $\alpha_i$ is its promoter strength, $\beta_i$ is its mRNA degradation rate, $\delta_i$ is its protein degradation rate, $n_i$ is the number of amino acids in the encoded protein, and $k_i$ is the effective mRNA-ribosome dissociation constant. A more detailed discussion of the model equations can be found in [Sechkar et al., 2024](https://doi.org/10.1038/s41467-024-46410-9).

In our JAX implementation _ode()_, each of the ODEs above is simply typed into the corresponding entry in the array dxdt, where the order of the genes is the same as specified in _initialise()_ and all mRNA level ODEs come before those for the protein concentrations. Once again, the variable with a given name can be retrieved from the state vector using the name2pos mapper, e.g. _x[name2pos['m_ta']]_ will return the value of the mRNA concentration of the gene _ta_.

In [None]:
def ode(F_calc,     # calculating the transcription regulation functions
            t,  x,  # time, cell state
            e, l, # translation elongation rate, growth rate
            R, # ribosome count in the cell, resource
            k_het, D, # effective mRNA-ribosome dissociation constants for synthetic genes, resource competition denominator
            par,  # system parameters
            name2pos  # name to position decoder
            ):
    # GET REGULATORY FUNCTION VALUES
    F = F_calc(t, x, par, name2pos)

    # --------  SPECIFY THE ODEs FROM HERE...
    return [# mRNAs
            # mRNA synthesis scales with the growth rate l, matching the ODEs above
            F[name2pos['F_ta']] * par['c_ta'] * par['a_ta'] * l - (par['b_ta'] + l) * x[name2pos['m_ta']],  # m_ta
            F[name2pos['F_x']] * par['c_x'] * par['a_x'] * l - (par['b_x'] + l) * x[name2pos['m_x']],  # m_x
            F[name2pos['F_dist']] * par['c_dist'] * par['a_dist'] * l - (par['b_dist'] + l) * x[name2pos['m_dist']],  # m_dist
            # proteins
            (e / par['n_ta']) * (x[name2pos['m_ta']] / k_het[name2pos['k_ta']] / D) * R - (l + par['d_ta']) * x[name2pos['p_ta']],  # p_ta
            (e / par['n_x']) * (x[name2pos['m_x']] / k_het[name2pos['k_x']] / D) * R - (l + par['d_x']) * x[name2pos['p_x']],  # p_x
            (e / par['n_dist']) * (x[name2pos['m_dist']] / k_het[name2pos['k_dist']] / D) * R - (l + par['d_dist']) * x[name2pos['p_dist']]  # p_dist
            ]
    # -------- ...TO HERE
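
To build intuition for the mRNA equations, the sketch below integrates the ODE for $m_{ta}$ on its own and compares the result with the analytical steady state $m_{ta}^{*} = F_{ta} c_{ta} \alpha_{ta} \lambda / (\beta_{ta} + \lambda)$. This is a deliberately simplified illustration: the growth rate is held at a hypothetical constant value of 1 /h, whereas in the full model $\lambda$ depends on the cell's state, and the forward Euler loop is a stand-in for the package's actual solver.

```python
# Isolated mRNA ODE for the gene ta with a fixed growth rate (illustrative assumption)
l = 1.0       # growth rate (/h), hypothetical constant value
F_ta = 1.0    # constitutive gene
c_ta = 100.0  # DNA concentration (nM), as set in initialise()
a_ta = 50.0   # promoter strength, as set in initialise()
b_ta = 6.0    # mRNA decay rate (/h), default value from initialise()

def dm_dt(m):
    # synthesis minus degradation and dilution, as in the first ODE above
    return F_ta * c_ta * a_ta * l - (b_ta + l) * m

# analytical steady state obtained by setting dm/dt = 0
m_ss = F_ta * c_ta * a_ta * l / (b_ta + l)

# forward Euler integration over 5 h, starting with no mRNA
m = 0.0
dt = 0.001
for _ in range(5000):
    m += dt * dm_dt(m)

print(m_ss, m)
```

With these values, the steady state is $5000/7 \approx 714$ nM, and the simulated trajectory settles to it within a few multiples of the time constant $1/(\beta_{ta}+\lambda) \approx 0.14$ h.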

### Stochastic model definition

To account for the stochasticity of gene expression, our package also supports hybrid simulations of gene circuit performance, in which the host cell variables are still treated deterministically (since they are coarse-grained variables representing the mean dynamics of many different variables, whose fluctuations are averaged out) and the synthetic gene variables are treated stochastically.

Stochastic simulation of synthetic gene expression requires defining the reaction propensity function _v()_. This can be done simply by putting each ODE term, in order of appearance in the ODE array above, into a separate entry in the propensity vector, with a few caveats.

First, due to the way that the translation of a single mRNA by several ribosomes is modelled, all terms involving mRNAs must be scaled by a factor found in the _mRNA_count_scales_ vector, which can be accessed using the same decoder _name2pos_. 

Second, the model can account for mRNA removal due to antibiotic action, which we do not consider in this case. However, to maintain the correct order of the propensities, we include an additional term, set to zero, for each mRNA. Should it be needed, the details of how to model antibiotic action can be found in the Supplementary Information to [Sechkar et al., 2024](https://doi.org/10.1038/s41467-024-46410-9).

In [None]:
def v(F_calc,     # calculating the transcription regulation functions
            t,  x,  # time, cell state
            e, l, # translation elongation rate, growth rate
            R, # ribosome count in the cell, resource
            k_het, D, # effective mRNA-ribosome dissociation constants for synthetic genes, resource competition denominator
            mRNA_count_scales, # scaling factors for mRNA counts
            par,  # system parameters
            name2pos
            ):
    # GET REGULATORY FUNCTION VALUES
    F = F_calc(t, x, par, name2pos)

    # --------  SPECIFY THE PROPENSITIES FROM HERE...
    return [
            # synthesis, degradation, dilution of ta gene mRNA - note the scaling factor added
            l * F[name2pos['F_ta']] * par['c_ta'] * par['a_ta'] / mRNA_count_scales[name2pos['mscale_ta']],
            par['b_ta'] * x[name2pos['m_ta']] / mRNA_count_scales[name2pos['mscale_ta']],
            l * x[name2pos['m_ta']] / mRNA_count_scales[name2pos['mscale_ta']],
            # mRNA removal due to chloramphenicol action - set to zero
            0,
            # synthesis, degradation, dilution of x gene mRNA - note the scaling factor added
            l * F[name2pos['F_x']] * par['c_x'] * par['a_x'] / mRNA_count_scales[name2pos['mscale_x']],
            par['b_x'] * x[name2pos['m_x']] / mRNA_count_scales[name2pos['mscale_x']],
            l * x[name2pos['m_x']] / mRNA_count_scales[name2pos['mscale_x']],
            # mRNA removal due to chloramphenicol action - set to zero
            0,
            # synthesis, degradation, dilution of dist gene mRNA - note the scaling factor added
            l * F[name2pos['F_dist']] * par['c_dist'] * par['a_dist'] / mRNA_count_scales[name2pos['mscale_dist']],
            par['b_dist'] * x[name2pos['m_dist']] / mRNA_count_scales[name2pos['mscale_dist']],
            l * x[name2pos['m_dist']] / mRNA_count_scales[name2pos['mscale_dist']],
            # mRNA removal due to chloramphenicol action - set to zero
            0,
            # synthesis, degradation, dilution of ta gene protein
            (e / par['n_ta']) * (x[name2pos['m_ta']] / k_het[name2pos['k_ta']] / D) * R,
            par['d_ta'] * x[name2pos['p_ta']],
            l * x[name2pos['p_ta']],
            # synthesis, degradation, dilution of x gene protein
            (e / par['n_x']) * (x[name2pos['m_x']] / k_het[name2pos['k_x']] / D) * R,
            par['d_x'] * x[name2pos['p_x']],
            l * x[name2pos['p_x']],
            # synthesis, degradation, dilution of dist gene protein
            (e / par['n_dist']) * (x[name2pos['m_dist']] / k_het[name2pos['k_dist']] / D) * R,
            par['d_dist'] * x[name2pos['p_dist']],
            l * x[name2pos['p_dist']]
    ]
    # -------- ...TO HERE
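
The relation between the propensities and the ODE terms can be checked numerically for a single mRNA species. In the sketch below, the four propensities of one gene are combined with signs of +1 for synthesis and -1 for degradation, dilution and antibiotic-induced removal. The numerical values (growth rate, mRNA level, scaling factor) are hypothetical, and the sign vector `stoich` is an assumption made for illustration; how the package itself applies the reactions is not shown here.

```python
# Hypothetical values for one gene, for illustration only
l = 1.0        # growth rate (/h)
F = 1.0        # transcription regulation function
c = 100.0      # DNA concentration (nM)
a = 50.0       # promoter strength
b = 6.0        # mRNA decay rate (/h)
m = 200.0      # current mRNA concentration (nM)
mscale = 2.0   # mRNA count scaling factor (assumed value)

# propensities in the same order as in v(): synthesis, degradation, dilution, antibiotic
propensities = [
    l * F * c * a / mscale,
    b * m / mscale,
    l * m / mscale,
    0.0,  # no antibiotic action in this case
]

# assumed net effect of each reaction on the mRNA level
stoich = [+1, -1, -1, -1]
net_rate = sum(s * p for s, p in zip(stoich, propensities))

# right-hand side of the deterministic mRNA ODE for the same values
ode_rhs = F * c * a * l - (b + l) * m

print(net_rate, ode_rhs / mscale)
```

Under these assumptions, the signed sum of the propensities equals the deterministic right-hand side divided by the scaling factor, which is a convenient way to catch a missing or misordered term when writing _v()_ for a new circuit.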

## Gene circuit simulation

Now that the circuit has been defined, we can proceed to simulate its behaviour.