# L6a: Incorporating Gene Expression Logic into Flux Balance Analysis
In this lecture, we'll continue our discussion of Flux Balance Analysis (FBA), particularly what the constraints are saying in the flux estimation problem. Last time we simplified the material balance constraints and the flux bounds constraints. Today, we'll talk about how we can incorporate gene expression logic into the FBA problem. The key ideas of this lecture are:
* __Flux balance analysis (FBA)__ is a mathematical approach used to analyze the flow of metabolites through a metabolic network. It assumes a steady state where metabolite production, consumption, and transport rates are balanced. The FBA problem is formulated as a linear programming (LP) problem to maximize or minimize fluxes through the network, subject to constraints. 
* __Flux bounds constraints__ limit the range of possible fluxes through a metabolic network. These bounds can incorporate additional information, such as experimental data or prior knowledge about the system, into the FBA problem.
* __Gene expression logic__ can be incorporated into the FBA problem by using gene-protein-reaction (GPR) rules to link gene expression levels to enzyme activity and metabolic fluxes. GPR rules define the relationship between genes, proteins, and reactions in a metabolic network, allowing for the integration of gene expression data into the FBA model.

Lecture notes can be downloaded: [here!](https://github.com/varnerlab/CHEME-5450-Lectures-Spring-2025/blob/main/lectures/week-6/L6a/docs/Notes.pdf)


## A model for flux bounds
The flux bounds are important constraints in flux balance analysis calculations and the convex decomposition of the stoichiometric array. Beyond their role in the flux estimation problem, the flux bounds are _integrative_, i.e., these constraints integrate many types of genetic and biochemical information into the problem. A general model for these bounds is given by:
$$
\begin{align*}
-\delta_{j}\underbrace{\left[{V_{max,j}^{\circ}}\left(\frac{e}{e^{\circ}}\right)\theta_{j}\left(\dots\right){f_{j}\left(\dots\right)}\right]}_{\text{reverse: other functions or parameters?}}\leq\hat{v}_{j}\leq{V_{max,j}^{\circ}}\left(\frac{e}{e^{\circ}}\right)\theta_{j}\left(\dots\right){f_{j}\left(\dots\right)}
\end{align*}
$$
where $V_{max,j}^{\circ}$ denotes the maximum reaction velocity (units: `flux`) computed at some _characteristic enzyme abundance_. Thus, the maximum reaction velocity is given by:
$$
V_{max,j}^{\circ} \equiv k_{cat,j}^{\circ}e^{\circ}
$$
where $k_{cat,j}^{\circ}$ is the catalytic constant or turnover number for the enzyme (units: `1/time`) and $e^{\circ}$ is a characteristic enzyme abundance (units: `concentration`). The term $\left(e/e^{\circ}\right)$ is a correction to account for the _actual_ enzyme abundance catalyzing the reaction (units: `dimensionless`). The $\theta_{j}\left(\dots\right)\in\left[0,1\right]$ term is the current fraction of maximal enzyme activity of enzyme $e$ in reaction $j$. The activity model $\theta_{j}\left(\dots\right)$ describes [allosteric effects](https://en.wikipedia.org/wiki/Allosteric_regulation) on the reaction rate, and is a function of the regulatory and chemical state of the system, e.g., the concentration of substrates, products, and cofactors (units: `dimensionless`).
Finally, $f_{j}\left(\dots\right)$ is a function describing the substrate (reactant) dependence of the rate of reaction $j$ (units: `dimensionless`).

* __Parameters__: We need estimates for the $k_{cat,j}^{\circ}$ for all enzymes in the system we are interested in and a _reasonable policy_ for specifying a characteristic value for $e^{\circ}$. In addition, the $\theta_{j}\left(\dots\right)$ and $f_{j}\left(\dots\right)$ models can also have associated parameters, e.g., saturation or binding constants, etc. Thus, we need to estimate these from literature studies or experimental data.
* __Reversibility__: Next, we need to estimate the binary direction parameter $\delta_{j}\in\left\{0,1\right\}$. The value of $\delta_{j}$ describes the reversibility of reaction $j$: if reaction $j$ is __reversible__, then $\delta_{j}=1$; if reaction $j$ is __irreversible__, then $\delta_{j}=0$.
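
To make the general bounds model concrete, here is a minimal Python sketch that evaluates the lower and upper bound for a single reaction. The function name `flux_bounds`, its argument names, and the numerical values are hypothetical. It also assumes the reverse bound uses the same expression as the forward bound, which the model above leaves open.

```python
def flux_bounds(vmax_char, enzyme_ratio=1.0, theta=1.0, f=1.0, delta=0):
    """Return (lower, upper) flux bounds for one reaction.

    vmax_char    : V_max at the characteristic enzyme abundance (flux units)
    enzyme_ratio : e / e_char, actual over characteristic abundance (dimensionless)
    theta        : allosteric activity fraction in [0, 1] (dimensionless)
    f            : substrate-dependence factor (dimensionless)
    delta        : 1 if the reaction is reversible, 0 if irreversible
    """
    if delta not in (0, 1):
        raise ValueError("delta must be 0 (irreversible) or 1 (reversible)")
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    upper = vmax_char * enzyme_ratio * theta * f
    # Assumption: the reverse direction reuses the forward expression.
    lower = -delta * upper
    return lower, upper


# Hypothetical numbers: half the characteristic enzyme level, 80% activity,
# saturating substrate, reversible reaction.
lower, upper = flux_bounds(10.0, enzyme_ratio=0.5, theta=0.8, f=1.0, delta=1)
print(lower, upper)
```

Leaving `enzyme_ratio`, `theta`, and `f` at their default value of 1 gives bounds that depend only on $V_{max,j}^{\circ}$ and $\delta_{j}$.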


### Simplified bounds model
Let's initially assume that $(e/e^{\circ})\sim{1}$, there are no allosteric inputs $\theta_{j}\left(\dots\right)\sim{1}$, and the substrates are saturating $f_{j}\left(\dots\right)\sim{1}$. 
Then, the flux bounds are given by:
$$
\begin{align*}
-\delta_{j}V_{max,j}^{\circ}\leq{\hat{v}_{j}}\leq{V_{max,j}^{\circ}}
\end{align*}
$$
This is a simple model for the flux bounds: each bound depends only on the turnover number $k_{cat,j}^{\circ}$ and our assumed characteristic enzyme abundance $e^{\circ}$ (through $V_{max,j}^{\circ}=k_{cat,j}^{\circ}e^{\circ}$), together with the direction parameter $\delta_{j}$.
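
As a worked illustration of the simplified model, the sketch below computes $V_{max,j}^{\circ}$ from a turnover number and a characteristic enzyme abundance, then forms the bounds. The unit choices (`1/s` for $k_{cat}$, µmol/gDW for $e^{\circ}$, µmol/gDW-h for the flux), the function names, and the numerical values are assumptions made for the example only.

```python
SECONDS_PER_HOUR = 3600.0


def vmax_from_kcat(kcat_per_s, e_char_umol_per_gdw):
    """V_max = k_cat * e_char, converted from per-second to per-hour.

    Returns a flux in umol/gDW-h (assumed units for this example).
    """
    return kcat_per_s * SECONDS_PER_HOUR * e_char_umol_per_gdw


def simplified_bounds(kcat_per_s, e_char_umol_per_gdw, delta):
    """Bounds for e/e_char ~ 1, no allosteric effects, saturating substrate."""
    vmax = vmax_from_kcat(kcat_per_s, e_char_umol_per_gdw)
    return -delta * vmax, vmax


# Hypothetical enzyme: k_cat = 10 1/s, e_char = 0.01 umol/gDW, irreversible.
print(simplified_bounds(10.0, 0.01, delta=0))
```

Keeping the time units of $k_{cat}$ and the flux consistent matters here, because turnover numbers are often reported per second while fluxes are often reported per hour.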

### Turnover numbers
The turnover number, $k_{cat}$, measures an enzyme's catalytic activity, defined as the number of substrate molecules converted to product per enzyme molecule per unit time. Units are typically `1/time` (e.g., `1/s` or `1/min`). Values can be obtained from primary literature or databases like [BRENDA](https://www.brenda-enzymes.org/):

* [Antje Chang et al., BRENDA, the ELIXIR core data resource in 2021: new developments and updates, Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D498–D508, https://doi.org/10.1093/nar/gkaa1025](https://academic.oup.com/nar/article/49/D1/D498/5992283)

Use [BRENDA](https://www.brenda-enzymes.org/) to find turnover numbers for:
* __Enzyme 1__: Arginase (EC 3.5.3.1) in humans.
* __Enzyme 2__: Argininosuccinate synthase (EC 6.3.4.5) in humans.

### Reversibility
The second thing we need to estimate is the reversibility parameter $\delta_{j}$, which can be computed in several ways. For example, [one method in the literature](https://pubmed.ncbi.nlm.nih.gov/27159581/) is to use the sign of the Gibbs reaction energy relative to a threshold:
$$
\begin{equation*}
\delta_{j} = \begin{cases}
0 & \text{if }\text{sign}\left(\Delta{G}^{\circ} - \Delta{G}^{\star}\right)= -1 \quad\text{irreversible} \\
1 & \text{if }\text{sign}\left(\Delta{G}^{\circ} - \Delta{G}^{\star}\right)= +1 \quad\text{reversible}
\end{cases}
\end{equation*}
$$
where $\Delta{G}^{\circ}$ is the [standard Gibbs free energy change of the reaction](https://en.wikipedia.org/wiki/Gibbs_free_energy#Gibbs_free_energy_of_reactions), and $\Delta{G}^{\star}$ is a threshold value (hyperparameter). The threshold value can be set to zero or some other value. Alternatively, the value of $\delta_{j}$ can be assigned based upon a cutoff $K^{\star}$ on the equilibrium constant:
$$
\begin{equation*}
\delta_{j} = \begin{cases}
0 & \text{if }K_{eq}>\,K^{\star}\quad\text{irreversible} \\
1 & \text{if }K_{eq}\leq\,K^{\star}\quad\text{reversible}
\end{cases}
\end{equation*}
$$
where you specify the value $K^{\star}$ based upon some intuition or other criteria. We can compute the $\Delta{G}^{\circ}$ values using [eQuilibrator](https://equilibrator.weizmann.ac.il):
* [Beber ME, Gollub MG, Mozaffari D, Shebek KM, Flamholz AI, Milo R, Noor E. eQuilibrator 3.0: a database solution for thermodynamic constant estimation. Nucleic Acids Res. 2022 Jan 7;50(D1): D603-D609. doi: 10.1093/nar/gkab1106. PMID: 34850162; PMCID: PMC8728285.](https://pubmed.ncbi.nlm.nih.gov/34850162/)

The [eQuilibrator application programming interface](https://equilibrator.weizmann.ac.il) is a tool for thermodynamic calculations in biological reaction networks. It was developed by the [Milo lab](https://www.weizmann.ac.il/plants/Milo/) at the Weizmann Institute in Rehovot, Israel. The [`eQuilibrator.jl` package](https://github.com/stelmo/eQuilibrator.jl) is a [Julia](https://julialang.org) wrapper around eQuilibrator (which is written in Python).

Use [eQuilibrator](https://equilibrator.weizmann.ac.il) to find the $\delta$ values for:
* __Enzyme 1__: Arginase (EC 3.5.3.1) in humans.
* __Enzyme 2__: Argininosuccinate synthase (EC 6.3.4.5) in humans

where we assume a threshold value of $\Delta{G}^{\star}=-5.5$ kJ/mol (__hmmm__: where did this come from?).
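
The two reversibility rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used by eQuilibrator: the function names are hypothetical, the $\Delta{G}^{\circ}$ inputs are made-up placeholders rather than values for arginase or argininosuccinate synthase, the temperature is assumed to be 298.15 K, and a value exactly at the threshold is treated as reversible. The link between the two rules is the standard relation $K_{eq}=\exp\left(-\Delta{G}^{\circ}/RT\right)$.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def delta_from_dg(dg0, dg_threshold=-5.5):
    """Rule 1: irreversible (0) if dG0 is below the threshold, else reversible (1).

    Both arguments are in kJ/mol. A tie counts as reversible (assumption).
    """
    return 0 if dg0 < dg_threshold else 1


def keq_from_dg(dg0, temperature=298.15):
    """Equilibrium constant from the standard Gibbs energy: K = exp(-dG0 / RT)."""
    return math.exp(-dg0 / (R_KJ_PER_MOL_K * temperature))


def delta_from_keq(keq, k_threshold):
    """Rule 2: irreversible (0) if K_eq exceeds the cutoff, else reversible (1)."""
    return 0 if keq > k_threshold else 1


# The cutoff K* that corresponds to a threshold of -5.5 kJ/mol at 298.15 K.
k_star = keq_from_dg(-5.5)

# Hypothetical standard Gibbs energies (kJ/mol), for illustration only.
for dg0 in (-30.0, -2.0):
    print(dg0, delta_from_dg(dg0), delta_from_keq(keq_from_dg(dg0), k_star))
```

With a matched pair of thresholds, the two rules give the same answer, because $\Delta{G}^{\circ}<\Delta{G}^{\star}$ is equivalent to $K_{eq}>K^{\star}$ when $K^{\star}=\exp\left(-\Delta{G}^{\star}/RT\right)$.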

## Gene expression logic
Suppose the flux problem we were interested in was composed of enzymes encoded by the genes $\mathcal{G}=\left\{1,2,\dots,N\right\}$.
The _action_ of each gene is described by two differential equations, one for mRNA ($m_{j}$) and a second for the corresponding protein ($p_{j}$):
$$
\begin{align*}
	\dot{m}_{j} &= r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}+\lambda_{j}\quad{j=1,2,\dots,N}\\
	\dot{p}_{j} &= r_{L,j}w_{j}\left(\dots\right) - \left(\theta_{p,j}+\mu\right)\cdot{p_{j}}
\end{align*}
$$
Terms in the balances:
* The term $r_{X,j}u_{j}\left(\dots\right)$ in the mRNA balance denotes the _regulated rate of transcription_ for gene $j$. This is the product of a _kinetic limit_ $r_{X,j}$ (units: nmol/gDW-h) and a transcription control function $0\leq{u_{j}\left(\dots\right)}\leq{1}$ (dimensionless). The final term $\lambda_{j}$ is the _unregulated expression rate_ of mRNA $j$ (units: nmol/gDW-h), i.e., the _leak_ expression rate.
* The _regulated rate of translation_ of mRNA $j$, denoted by $r_{L,j}w_{j}\left(\dots\right)$, is also the product of a kinetic limit of translation $r_{L,j}$ (units: nmol/gDW-h) and a translational control term $0\leq{w_{j}\left(\dots\right)}\leq{1}$ (dimensionless).
* Lastly, $\theta_{\star,j}$ denotes the first-order rate constants (units: 1/time) governing the degradation of mRNA ($\star=m$) and protein ($\star=p$), and $\mu$ is the specific growth rate of the cell (units: 1/time). The $\mu$ term appears because we are using cell-specific concentration units (e.g., nmol/gDW), so growth dilutes both species.
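
To see how the terms in the two balances interact, the sketch below integrates them for a single gene with a forward Euler step. It is a rough illustration under several assumptions: the control functions $u_{j}$ and $w_{j}$ are held constant, the translation limit $r_{L,j}$ is treated as a constant, the function names are hypothetical, and all parameter values are invented.

```python
def expression_rates(m, p, r_X, u, lam, theta_m, r_L, w, theta_p, mu):
    """Right-hand sides of the mRNA and protein balances for one gene."""
    dm = r_X * u - (theta_m + mu) * m + lam
    dp = r_L * w - (theta_p + mu) * p
    return dm, dp


def simulate(params, t_final=20.0, dt=0.001, m0=0.0, p0=0.0):
    """Forward Euler integration from (m0, p0); returns the final (m, p)."""
    m, p = m0, p0
    for _ in range(int(round(t_final / dt))):
        dm, dp = expression_rates(m, p, **params)
        m += dt * dm
        p += dt * dp
    return m, p


# Hypothetical parameters: rates in nmol/gDW-h, rate constants in 1/h.
params = dict(r_X=100.0, u=0.5, lam=1.0, theta_m=5.0,
              r_L=200.0, w=0.8, theta_p=0.2, mu=0.5)

m_end, p_end = simulate(params)
dm_end, dp_end = expression_rates(m_end, p_end, **params)
print(m_end, p_end)    # levels after 20 h
print(dm_end, dp_end)  # remaining rates of change
```

Starting from zero, both species rise and then level off as the degradation and dilution terms grow to balance the synthesis terms.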

### Steady-state assumption
We have publicly said (without proof, yet) that gene expression _is slow_ and metabolism _is fast_. This means that the mRNA and protein concentrations are at approximate steady state, i.e., $\dot{m}_{j}=\dot{p}_{j}=0$ from the perspective of the metabolic network. This allows us to solve the gene expression equations for the steady-state mRNA and protein concentrations. Let's show the steps to compute the steady-state mRNA concentration $m^{\star}_{j}$:
$$
\begin{align*}
r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}+\lambda_{j} & = \dot{m}_{j}\\
r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}+\lambda_{j} &= 0 \\
r_{X,j}u_{j}\left(\dots\right) + \lambda_{j} & = \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}\\
\frac{r_{X,j}u_{j}\left(\dots\right) + \lambda_{j}}{\theta_{m,j}+\mu} & = m^{\star}_{j}
\end{align*}
$$
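
The closed-form result can be checked numerically. The short sketch below evaluates $m^{\star}_{j}$ and substitutes it back into the mRNA balance to confirm that the rate of change vanishes. The function names and parameter values are hypothetical, and the control function $u_{j}$ is treated as a fixed number.

```python
def steady_state_mrna(r_X, u, lam, theta_m, mu):
    """m* = (r_X * u + lambda) / (theta_m + mu)."""
    return (r_X * u + lam) / (theta_m + mu)


def mrna_rate(m, r_X, u, lam, theta_m, mu):
    """Right-hand side of the mRNA balance."""
    return r_X * u - (theta_m + mu) * m + lam


# Hypothetical parameters: rates in nmol/gDW-h, rate constants in 1/h.
r_X, lam, theta_m, mu = 100.0, 1.0, 5.0, 0.5

for u in (0.0, 0.5, 1.0):
    m_star = steady_state_mrna(r_X, u, lam, theta_m, mu)
    residual = mrna_rate(m_star, r_X, u, lam, theta_m, mu)
    print(u, m_star, residual)
```

Because $0\leq{u_{j}}\leq{1}$, the steady-state mRNA level is bracketed by the leak-only level $\lambda_{j}/\left(\theta_{m,j}+\mu\right)$ at $u_{j}=0$ and the fully induced level $\left(r_{X,j}+\lambda_{j}\right)/\left(\theta_{m,j}+\mu\right)$ at $u_{j}=1$.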
