# L6c: Fuzzy Logic and Boltzmann Gene Expression Control Models in Flux Balance Analysis
In this lecture, we'll introduce additional models for the $u(...)$ and $w(...)$ functions that appear in the mRNA and protein equations. The key concepts in the lecture are:
* __Boolean gene expression logic__ can be incorporated into the FBA problem by using gene-protein-reaction (GPR) rules to link gene expression levels to enzyme activity and metabolic fluxes. GPR rules define the relationship between genes, proteins, and reactions in a metabolic network, allowing for the integration of gene expression data into the FBA model. The GPR rules are logical expressions in [a boolean model](https://en.wikipedia.org/wiki/Boolean_algebra).
* __Fuzzy logic__ is a form of many-valued logic that allows for the processing of variables with truth values ranging between 0 and 1, enabling the handling of imprecise or uncertain data in a way that mimics human decision-making processes. Thus, the model is still rule-based (with all the advantages of that), but it can now also deal with uncertainty or ambiguity.
* __Boltzmann gene expression models__ are a probabilistic (not really) approach to modeling gene expression that incorporates the effects of transcription factor binding and other regulatory interactions, allowing for a more detailed description of how genes are expressed in response to various signals and conditions.

## Review
In the last lecture, we introduced our first model of gene expression logic, which relied on [boolean logic](https://en.wikipedia.org/wiki/Boolean_algebra). Let's recap how we described gene regulation in flux balance analysis.
Suppose the flux problem we are interested in is composed of enzymes encoded by the genes $\mathcal{G}=\left\{1,2,\dots, N\right\}$.
The _action_ of each gene is described by two differential equations, one for mRNA concentration ($m_{j}$, units: `nmol/gDW`) and a second for the corresponding protein concentration ($p_{j}$, units: `nmol/gDW`):
$$
\begin{align*}
	\dot{m}_{j} &= r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}+\lambda_{j}\quad{j=1,2,\dots,N}\\
	\dot{p}_{j} &= r_{L,j}w_{j}\left(\dots\right) - \left(\theta_{p,j}+\mu\right)\cdot{p_{j}}
\end{align*}
$$
Terms in the balances:
* _Transcription_: The term $r_{X,j}u_{j}\left(\dots\right)$ in the mRNA balance denotes the _regulated rate of transcription_ for gene $j$. This is the product of a _kinetic limit_ $r_{X,j}$ (units: `nmol/gDW-time`) and a transcription control function $0\leq{u_{j}\left(\dots\right)}\leq{1}$ (dimensionless). The final term $\lambda_{j}$ is the _unregulated expression rate_ of mRNA $j$ (units: `nmol/gDW-time`), i.e., the _leak_ expression rate.
* _Translation_: The _regulated rate of translation_ of mRNA $j$, denoted by $r_{L,j}w_{j}\left(\dots\right)$, is also the product of a kinetic limit of translation $r_{L,j}$ (units: `nmol/gDW-time`) and a translational control term $0\leq{w_{j}\left(\dots\right)}\leq{1}$ (dimensionless).
* _Degradation_: Lastly, $\theta_{\star,j}$ denotes the first-order rate constant (units: `1/time`) governing degradation of protein and mRNA, and $\mu$ is the specific growth rate of the cell (units: `1/time`). The $\mu$ (dilution due to growth) term appears because we use cell-specific concentration units (e.g., `nmol/gDW`).

### Steady-State Concentrations
At steady state, the time derivatives vanish, i.e., $\dot{m}_{j} = 0$ and $\dot{p}_{j} = 0$. Let's show the steps to compute the steady-state mRNA concentration $m^{\star}_{j}$:
$$
\begin{align*}
r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}+\lambda_{j} & = \dot{m}_{j}\\
r_{X,j}u_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m^{\star}_{j}}+\lambda_{j} &= 0 \\
r_{X,j}u_{j}\left(\dots\right) + \lambda_{j} & = \left(\theta_{m,j}+\mu\right)\cdot{m^{\star}_{j}}\\
\frac{r_{X,j}u_{j}\left(\dots\right) + \lambda_{j}}{\theta_{m,j}+\mu} &= m^{\star}_{j}\quad\text{for }j=1,2,\dots,N\quad\blacksquare
\end{align*}
$$
Following the same steps, we can compute the steady-state protein concentration $p^{\star}_{j}$:
$$
\begin{equation*}
p^{\star}_{j} = \frac{r_{L,j}w_{j}\left(\dots\right)}{\theta_{p,j}+\mu}\quad\text{for }j=1,2,\dots,N\quad\blacksquare
\end{equation*}
$$

With boolean descriptions for the control functions, we can estimate the steady-state mRNA and protein concentrations. The steady-state mRNA and protein concentration expressions in the `ON` case ($u_{j} = 1$ and $w_{j} = 1$) are given by:
$$
\begin{align*}
m^{\star}_{j} &= \frac{r_{X,j} + \lambda_{j}}{\theta_{m,j}+\mu}\qquad\,p^{\star}_{j} = \frac{r_{L,j}}{\theta_{p,j}+\mu}\\
\end{align*}
$$
and in the `OFF` case ($u_{j} = 0$) with $\lambda_{j}>0$, where the leak mRNA is still translated ($w_{j} = 1$):
$$
\begin{align*}
m^{\star}_{j} &= \frac{\lambda_{j}}{\theta_{m,j}+\mu}\qquad\,p^{\star}_{j} = \frac{r_{L,j}}{\theta_{p,j}+\mu}\\
\end{align*}
$$
or in the `OFF` case if $\lambda_{j}=0$: 
$$
\begin{align*}
m^{\star}_{j} &= 0\qquad\,p^{\star}_{j} = 0\\
\end{align*}
$$
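
As a quick numerical check of the steady-state expressions, the sketch below evaluates $m^{\star}_{j}$ and $p^{\star}_{j}$ for a single gene in the `ON` and `OFF` cases and confirms that the mRNA balance vanishes at the computed steady state. The function names are hypothetical, and all parameter values are made-up placeholders rather than measured quantities.

```python
def steady_state_mrna(r_X, u, lam, theta_m, mu):
    """m* = (r_X*u + lambda)/(theta_m + mu), units: nmol/gDW."""
    return (r_X * u + lam) / (theta_m + mu)


def steady_state_protein(r_L, w, theta_p, mu):
    """p* = r_L*w/(theta_p + mu), units: nmol/gDW."""
    return (r_L * w) / (theta_p + mu)


def mrna_balance(m, r_X, u, lam, theta_m, mu):
    """Right-hand side of the mRNA balance, i.e., dm/dt."""
    return r_X * u - (theta_m + mu) * m + lam


# Placeholder parameter values (assumptions for illustration only)
params = dict(r_X=10.0, lam=0.1, theta_m=5.0, mu=0.5)

m_on = steady_state_mrna(u=1.0, **params)    # ON case: u = 1
m_off = steady_state_mrna(u=0.0, **params)   # OFF case with leak: u = 0, lambda > 0
m_off_noleak = steady_state_mrna(r_X=10.0, u=0.0, lam=0.0, theta_m=5.0, mu=0.5)
p_on = steady_state_protein(r_L=50.0, w=1.0, theta_p=0.2, mu=0.5)

# The balance should be (numerically) zero at the steady state
residual = mrna_balance(m_on, u=1.0, **params)
print(m_on, m_off, m_off_noleak, p_on, residual)
```

With a nonzero leak, the `OFF` mRNA level is small but not zero, which is exactly the distinction between the two `OFF` cases above.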

### Wrinkle: Going from protein to enzyme abundance
Once we have $p^{\star}_{j}$, we can compute the enzyme abundance $e_{j}$ in the system. However, there are three _base cases_ to consider:
* __One to one__: If enzyme $e$ corresponds directly to protein $p$, then we can use the $p^{\star}$ expression directly.
* __Multisubunit__: If $e_{j}$ is a complex of different protein subunits, we can use the gene-protein-reaction (GPR) rules to compute the enzyme abundance $e_{j}$ as an `AND` combination. The `AND` rule requires all subunits to be expressed.
* __Isoforms__: Alternatively, if $e_{j}$ is a single protein but there are multiple isoforms, we can use the GPR rules to compute the enzyme abundance $e_{j}$ as an `OR` combination.
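
The three base cases can be sketched in a few lines. The lecture states the rules as `AND`/`OR` logic; mapping them onto abundances requires a convention, and the one used here (minimum over subunits for a complex, sum over isoforms) is an assumption for illustration, as are the function names and the gene labels.

```python
def enzyme_one_to_one(p):
    """One-to-one: the enzyme abundance is the protein abundance."""
    return p


def enzyme_complex(subunits):
    """AND rule (assumed convention): the scarcest subunit limits the complex.
    Assumes 1:1 subunit stoichiometry."""
    return min(subunits)


def enzyme_isoforms(isoforms):
    """OR rule (assumed convention): any isoform can do the job, so abundances add."""
    return sum(isoforms)


def is_available(abundance, threshold=0.0):
    """Boolean view: the enzyme is present if its abundance exceeds a threshold."""
    return abundance > threshold


# Hypothetical steady-state protein levels (nmol/gDW)
p_star = {"geneA": 2.0, "geneB": 0.0, "geneC": 1.5}

e_complex = enzyme_complex([p_star["geneA"], p_star["geneB"]])    # A AND B
e_isoform = enzyme_isoforms([p_star["geneA"], p_star["geneC"]])   # A OR C

print(e_complex, is_available(e_complex))
print(e_isoform, is_available(e_isoform))
```

Because `geneB` is not expressed, the `AND` complex is unavailable, while the `OR` combination remains available through either isoform.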

__Boolean logic in FBA example papers__:
* [Covert MW, Palsson BO. Constraints-based models: regulation of gene expression reduces the steady-state solution space. J Theor Biol. 2003 Apr 7;221(3):309-25. doi: 10.1006/jtbi.2003.3071. PMID: 12642111.](https://pubmed.ncbi.nlm.nih.gov/12642111/)
* [Orth JD, Fleming RM, Palsson BØ. Reconstruction and Use of Microbial Metabolic Networks: the Core Escherichia coli Metabolic Model as an Educational Guide. EcoSal Plus. 2010 Sep;4(1). doi: 10.1128/ecosalplus.10.2.1. PMID: 26443778.](https://pubmed.ncbi.nlm.nih.gov/26443778/)
* [Covert MW, Knight EM, Reed JL, Herrgard MJ, Palsson BO. Integrating high-throughput and computational data elucidates bacterial networks. Nature. 2004 May 6;429(6987):92-6. doi: 10.1038/nature02456. PMID: 15129285.](https://pubmed.ncbi.nlm.nih.gov/15129285/)

Incorporating Boolean regulatory models, which are typically parameter-free, into flux balance analysis calculations (and ultimately metabolic design calculations) improves the ability of this type of mathematical model to simulate (predict) metabolic function.

__Could we do better?__ What if we could build a model of the logic driving gene expression for each individual promoter that is a continuous function of the concentrations of the transcription factors (and other regulators) that bind to the promoter?

## Fuzzy Logic

Fuzzy logic is an extension of Boolean logic in which variables vary continuously between 0 and 1. The same basic Boolean operations (conjunction, disjunction, and negation) are defined on fuzzy variables. However, these operations now produce continuous outputs when applied to fuzzy inputs. Fuzzy logic has been used to model signal transduction systems and regulatory logic in various systems.
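
To make the three operations concrete, here is a minimal sketch using one common set of definitions, the Zadeh (min/max) operators. The lecture does not say which definitions it has in mind, so treat this choice as an assumption; other choices (e.g., product-based operators) exist. The function names are hypothetical.

```python
def fuzzy_and(a, b):
    """Conjunction (Zadeh): the minimum of the two truth values."""
    return min(a, b)


def fuzzy_or(a, b):
    """Disjunction (Zadeh): the maximum of the two truth values."""
    return max(a, b)


def fuzzy_not(a):
    """Negation: the complement of the truth value."""
    return 1.0 - a


# Fuzzy inputs give continuous outputs
x, y = 0.8, 0.3
print(fuzzy_and(x, y), fuzzy_or(x, y), fuzzy_not(x))

# On the crisp values 0 and 1, the operators reduce to Boolean logic
for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, fuzzy_and(a, b), fuzzy_or(a, b))
```

The last loop shows why fuzzy logic is an extension rather than a replacement: restricted to 0 and 1, the truth tables are the usual Boolean ones.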

### Fuzzy logic example papers:
* [Morris MK, Saez-Rodriguez J, Clarke DC, Sorger PK, Lauffenburger DA. Training signaling pathway maps to biochemical data with constrained fuzzy logic: quantitative analysis of liver cell responses to inflammatory stimuli. PLoS Comput Biol. 2011 Mar;7(3):e1001099. doi: 10.1371/journal.pcbi.1001099. Epub 2011 Mar 3. PMID: 21408212; PMCID: PMC3048376.](https://pubmed.ncbi.nlm.nih.gov/21408212/)
* [Mitsos A, Melas IN, Morris MK, Saez-Rodriguez J, Lauffenburger DA, Alexopoulos LG. Non-Linear Programming (NLP) formulation for quantitative modeling of protein signal transduction pathways. PLoS One. 2012;7(11):e50085. doi: 10.1371/journal.pone.0050085. Epub 2012 Nov 30. PMID: 23226239; PMCID: PMC3511450.](https://pubmed.ncbi.nlm.nih.gov/23226239/)
* [Hu CY, Varner JD, Lucks JB. Generating Effective Models and Parameters for RNA Genetic Circuits. ACS Synth Biol. 2015 Aug 21;4(8):914-26. doi: 10.1021/acssynbio.5b00077. Epub 2015 Jul 2. PMID: 26046393.](https://pubmed.ncbi.nlm.nih.gov/26046393/)
* [Gould R, Bassen DM, Chakrabarti A, Varner JD, Butcher J. Population Heterogeneity in the Epithelial to Mesenchymal Transition is Controlled by NFAT and Phosphorylated Sp1. PLoS Comput Biol. 2016;12(12):e1005251. Published 2016 Dec 27. doi:10.1371/journal.pcbi.1005251](https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC5189931/)


### Discrete state promoter model
Suppose a promoter $P$ can exist in one of a set $\mathcal{S}$ of possible _discrete_ microstates, where each microstate $s\in\mathcal{S}$ has some pseudo energy $\epsilon_{s}$, and the _ground state_ $s_{1}\in\mathcal{S}$ has $\epsilon_{1}=0$ (by definition). Some microstates will lead to expression (the ability to produce an mRNA molecule by transcription), while others will not.

The probability (weight) that promoter $P$ is in microstate $s$ follows a [Boltzmann distribution](https://en.wikipedia.org/wiki/Boltzmann_distribution) which says:
$$
\begin{align*}
p_{s} & = \frac{1}{Z} \times f_{s}\exp\left(-\beta\epsilon_{s}\right)\qquad\forall{s\in\mathcal{S}}
\end{align*}
$$
where $p_{s}$ is the probability of microstate $s\in\mathcal{S}$, $f_{s}\in\left[0,1\right]$ is a system-state-specific factor, $\beta$ is the [thermodynamic beta](https://en.wikipedia.org/wiki/Thermodynamic_beta), and $Z$ is a normalization factor (called the [partition function](https://en.wikipedia.org/wiki/Partition_function_(statistical_mechanics)) in the statistical physics community). We can find $Z$ using the summation law of discrete probability, $\sum_{s}p_{s} = 1$, which gives:
$$
\begin{align*}
\sum_{s\in\mathcal{S}}\frac{1}{Z}\times{f_{s}}\exp\left(-\beta\epsilon_{s}\right) & = \sum_{s\in\mathcal{S}}p_{s}\\
\frac{1}{Z}\sum_{s\in\mathcal{S}}f_{s}\exp\left(-\beta\epsilon_{s}\right) & = 1\\
\sum_{s\in\mathcal{S}}f_{s}\exp\left(-\beta\epsilon_{s}\right) & = Z\\
\end{align*}
$$
which leads to the probability of microstate $s$:
$$
\begin{align*}
p_{s} & = \frac{f_{s}\exp\left(-\beta\epsilon_{s}\right)}{\displaystyle \sum_{s^{\prime}\in\mathcal{S}}f_{s^{\prime}}\exp\left(-\beta\epsilon_{s^{\prime}}\right)}\qquad{s\in\mathcal{S}}
\end{align*}
$$
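
The normalized microstate probabilities are straightforward to compute. The sketch below is a direct transcription of the expression above; the function name and the example values of $f_{s}$, $\epsilon_{s}$, and $\beta$ are placeholders chosen for illustration.

```python
import math


def microstate_probabilities(f, eps, beta=1.0):
    """Return p_s = f_s*exp(-beta*eps_s)/Z for each microstate s."""
    weights = [f_s * math.exp(-beta * e_s) for f_s, e_s in zip(f, eps)]
    Z = sum(weights)  # partition function (normalization factor)
    return [w / Z for w in weights]


# Three hypothetical microstates; the first is the ground state (eps = 0)
f = [1.0, 0.5, 0.9]
eps = [0.0, -2.0, 1.0]

p = microstate_probabilities(f, eps)
print(p, sum(p))
```

Lower (more negative) pseudo energies carry more weight, so the second microstate can be the most probable here even though its $f_{s}$ is the smallest.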
Finally, we relate the probability that promoter $P$ is in microstate $s$ back to the $\bar{u}\left(\dots\right)$ control function by computing the overall probability that the desired event happens, e.g., promoter $P$ undergoes transcription. To do this, we define two subsets: $\mathcal{A}\subseteq\mathcal{S}$ is the subset of states in which regulated transcription could occur, and $\mathcal{B}\subseteq\mathcal{S}$ is the subset of states in which unregulated transcription could occur.
Given $\mathcal{A}$ and $\mathcal{B}$, the control function $\bar{u}\left(\dots\right)$ becomes:
$$
\begin{align*}
\bar{u} & = \underbrace{\sum_{s\in{\mathcal{A}}}p_{s}}_{\text{regulated}\,u}+
\underbrace{\sum_{s^{\prime}\in\mathcal{B}}p_{s^{\prime}}}_{\text{unregulated}\,u^{\dagger}}\\
\end{align*}
$$
Thus, the control function $\bar{u}$ can be written as the sum of a regulated $u\left(\dots\right)$ and unregulated $u^{\dagger}\left(\dots\right)$ component: $\bar{u} = u + u^{\dagger}$. We can then substitute this expression for $\bar{u}$ into the mRNA balance:
$$
\begin{align*}
\dot{m}_{j} &= r_{X,j}\bar{u}_{j}\left(\dots\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}\\
&= r_{X,j}\left(u_{j} + u^{\dagger}_{j}\right) - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}}\\
&= r_{X,j}u_{j} - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}} + r_{X,j}u^{\dagger}_{j}\\
&= r_{X,j}u_{j} - \left(\theta_{m,j}+\mu\right)\cdot{m_{j}} + \lambda_{j}\quad\forall{j}\quad\blacksquare
\end{align*}
$$
where the unregulated expression rate $\lambda_{j}$ is given by:
$$
\begin{align*}
\lambda_{j} &\equiv r_{X,j}u^{\dagger}_{j}\\
\end{align*}
$$
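
Given microstate probabilities, the split into regulated and unregulated components, and the resulting leak rate, can be sketched as follows. The probabilities, the index sets, and the value of $r_{X,j}$ are placeholders, and the function name is hypothetical. The sketch additionally assumes that $\mathcal{A}$ and $\mathcal{B}$ are disjoint, so that $\bar{u}\leq{1}$; the text does not state this explicitly.

```python
def control_function(p, A, B):
    """Return (u, u_dagger): total probability of the states in A and in B."""
    if set(A) & set(B):
        # Assumption of this sketch: a state is regulated or unregulated, not both
        raise ValueError("A and B must be disjoint")
    u = sum(p[s] for s in A)          # regulated component
    u_dagger = sum(p[s] for s in B)   # unregulated (leak) component
    return u, u_dagger


# Hypothetical microstate probabilities (index 0 is a non-expressing state)
p = [0.6, 0.3, 0.1]
A = {1}   # states with regulated transcription
B = {2}   # states with unregulated transcription

r_X = 10.0  # placeholder kinetic limit, nmol/gDW-time
u, u_dagger = control_function(p, A, B)
lam = r_X * u_dagger  # leak rate: lambda = r_X * u_dagger

print(u, u_dagger, u + u_dagger, lam)
```

In this picture the leak rate is no longer a free parameter: it follows from the kinetic limit and the probability of the unregulated states.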

#### What are the $f_{s}$ and $\epsilon_{s}$?
The $f_{s}$ and $\epsilon_{s}$ are the system state specific factors and pseudo energies, respectively, that describe the microstates of the promoter. But what are they?
* The $f_{s}$ values can be thought of as weights that represent the relative likelihood of each microstate occurring, while the $\epsilon_{s}$ values represent the energy associated with each microstate. The specific values of $f_{s}$ and $\epsilon_{s}$ depend on the details of the system being modeled, such as the concentrations of transcription factors and other regulatory molecules that influence gene expression. 
* The $f_{s}$ terms are often [assumed to be Hill-type functions](https://en.wikipedia.org/wiki/Hill_equation_(biochemistry)) that describe the binding of transcription factors to the promoter, while the $\epsilon_{s}$ values must be estimated from data.
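
To see how a Hill-type $f_{s}$ enters the model, consider a hypothetical three-state promoter: state 0 is the bare promoter (ground state, no transcription), state 1 is an unregulated (leak) state, and state 2 requires a bound activator at concentration $x$. This layout, the pseudo energies, and the Hill parameters $K$ and $n$ are all illustrative assumptions, not values from the lecture or the cited papers.

```python
import math


def hill(x, K, n):
    """Hill function: x^n/(K^n + x^n), bounded between 0 and 1."""
    return x**n / (K**n + x**n)


def promoter_control(x, eps=(0.0, 2.0, -1.0), K=1.0, n=2.0, beta=1.0):
    """Return (u, u_dagger) for the hypothetical three-state promoter."""
    # f_s: states 0 and 1 do not depend on the activator; state 2 does
    f = (1.0, 1.0, hill(x, K, n))
    weights = [f_s * math.exp(-beta * e_s) for f_s, e_s in zip(f, eps)]
    Z = sum(weights)
    p = [w / Z for w in weights]
    return p[2], p[1]  # regulated u (state 2), unregulated u_dagger (state 1)


# Sweep the activator concentration (arbitrary units)
for x in (0.0, 0.5, 1.0, 5.0):
    u, u_dagger = promoter_control(x)
    print(x, round(u, 3), round(u_dagger, 3))
```

With no activator, only the leak state contributes; as $x$ increases, the regulated state takes over and $u$ rises smoothly toward its maximum, giving the continuous control function that the Boolean model lacks.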

Let's look at a few examples of how we can use this model to describe the expression of a gene.

### Examples
* [G.K. Ackers, A.D. Johnson, & M.A. Shea, Quantitative model for gene regulation by lambda phage repressor., Proc. Natl. Acad. Sci. U.S.A. 79 (4) 1129-1133, https://doi.org/10.1073/pnas.79.4.1129 (1982).](https://pubmed.ncbi.nlm.nih.gov/6461856/)
* [Moon TS, Lou C, Tamsir A, Stanton BC, Voigt CA. Genetic programs constructed from layered logic gates in single cells. Nature. 2012;491(7423):249-253. doi:10.1038/nature11516](https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC3904217/)
* [Tasseff R, Jensen HA, Congleton J, Dai D, Rogers KV, Sagar A, Bunaciu RP, Yen A, Varner JD. An Effective Model of the Retinoic Acid Induced HL-60 Differentiation Program. Sci Rep. 2017 Oct 30;7(1):14327. doi: 10.1038/s41598-017-14523-5. PMID: 29085021; PMCID: PMC5662654.](https://pubmed.ncbi.nlm.nih.gov/29085021/)
* [Adhikari A, Vilkhovoy M, Vadhin S, Lim HE, Varner JD. Effective Biophysical Modeling of Cell Free Transcription and Translation Processes. Front Bioeng Biotechnol. 2020 Nov 26;8:539081. doi: 10.3389/fbioe.2020.539081. PMID: 33324619; PMCID: PMC7726328.](https://pubmed.ncbi.nlm.nih.gov/33324619/)

# Today?
That's a wrap! Let's review - what are some things we discussed today?