# Consistent Reproduction Of growth/no growth Phenotype (CROP)
Predictions from metabolic models do not necessarily match experimental observations.

Calibration of these models is challenging due to their size and complexity, as well as sparse data.

CROP builds on flux balance analysis methods and provides a coarse-grained calibration of metabolic networks based on growth / no growth phenotype data:
1. If the model predicts growth, but the data show no growth --> remove reactions (gap creation)
2. If the model predicts no growth, but the data show growth --> add reactions (gap filling)

This notebook examines simple models to aid with the validation of the CROP algorithm.

## Testing toy model 1

Consider a simple metabolic reaction system at steady state, where a nutrient 'A' is imported into the cell, converted into an intermediate metabolite 'B', then converted into a waste product 'C' that is exported out of the cell.


<img src="crop_test_model_simple.PNG" alt="Toy Model 1" width="800"/>

### Preliminary flux balance analysis 

Flux balance analysis requires a stoichiometric matrix, **S**, and net reaction rate vector, $\vec{v}$, such that we can solve for the steady-state fluxes: $\textbf{S}\vec{v}=0$.

The stoichiometries for this reaction system are given by the following table:

|          |     $A_{SRC}\rightarrow A_{int}$    | $A_{int}\rightarrow B_{int}$    |$B_{int}\rightarrow C_{int}$    |$C_{int}\rightarrow C_{SNK}$    |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |
| $B_{int}$ | 0 |+1 |-1 | 0 |
| $C_{int}$ | 0 | 0 |+1 |-1 |

Note that we model the internal (closed) reaction processes and metabolites, with external (open) reactions included as inputs and outputs.
    
This implies the following stoichiometric matrix form:
$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0\\
0 & +1 & -1 & 0\\
0 & 0 & +1 & -1\\
\end{bmatrix}
$$

Similarly, the net reaction rates/fluxes (forward minus reverse) are the following:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
\end{bmatrix}
$$







At steady-state, $\textbf{S}\vec{v}=0$, which gives the following: 

$$
\textbf{S}\vec{v}
=
\begin{bmatrix}
+1 & -1 & 0 & 0\\
0 & +1 & -1 & 0\\
0 & 0 & +1 & -1\\
\end{bmatrix}
\cdot
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
\end{bmatrix}
=
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}} \\
v_{A_{int} \rightarrow B_{int}} - v_{B_{int} \rightarrow C_{int}}  \\
v_{B_{int} \rightarrow C_{int}} - v_{C_{int} \rightarrow C_{SNK}}\\
\end{bmatrix}
=
\begin{bmatrix}
0 \\
0 \\
0 \\
\end{bmatrix}
$$



This is an underdetermined system with 4 unknown net fluxes but only 3 equations - which is typical for metabolic models. Additional constraints are required to solve for the flux values. 

However, it's trivial to see from this toy example that the fluxes are all tightly coupled (more specifically, they are equal).

From the above equation: 

$v_{A_{SRC} \rightarrow A_{int}} = v_{A_{int} \rightarrow B_{int}} =  v_{B_{int} \rightarrow C_{int}} = v_{C_{int} \rightarrow C_{SNK}}$
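The coupling can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; it computes the rank of $\textbf{S}$ and a basis of its null space from the singular value decomposition. The variable names (`null_basis`, `n_free`) are illustrative only and are not part of CROP.

```python
import numpy as np

# Stoichiometric matrix of toy model 1 (rows: A_int, B_int, C_int).
S = np.array([
    [1, -1,  0,  0],
    [0,  1, -1,  0],
    [0,  0,  1, -1],
], dtype=float)

n_mets, n_rxns = S.shape
rank = np.linalg.matrix_rank(S)
n_free = n_rxns - rank  # degrees of freedom left by S v = 0

# Rows of vt beyond the rank span the null space of S.
_, _, vt = np.linalg.svd(S)
null_basis = vt[rank:].T  # shape (n_rxns, n_free)

# Scale the single basis vector so that the import flux equals 1.
v = null_basis[:, 0] / null_basis[0, 0]

print("rank:", rank, "free fluxes:", n_free)
print("steady-state direction:", np.round(v, 6))
```

The rank is 3 and one degree of freedom remains, so every steady-state flux vector is a multiple of a vector with four equal entries, which matches the equality above.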

### Expected test results for CROP

We use the above toy model as the 'provisional model' that needs to be updated based on data.

#### Test case 1: growth is predicted, but not observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} < U_{\textrm{growth}}$ (no growth), whereas the provisional model predicts a flux above this threshold.

$\rightarrow$ If the model predicts growth, removing any single reaction will remove the flux, because all four fluxes are equal. This means that there are multiple solutions to this problem.

#### Test case 2: growth is not predicted, but is observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} > U_{\textrm{growth}}$ (growth), whereas the provisional model predicts a flux below this threshold.

$\rightarrow$ If the model does not predict growth, then all of the fluxes are below $U_{\textrm{growth}}$. Adding another internal reaction will not increase the flux, because the import flux $v_{A_{SRC} \rightarrow A_{int}}$ caps the total flux through the system. So there is no solution to this problem, unless an additional source can be added.
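Both test cases can be illustrated with a small flux balance calculation. This is a sketch under stated assumptions, not the CROP algorithm itself: it uses `scipy.optimize.linprog` to maximise the export flux, and the bounds (`10`, `1`, `1000`), the threshold `U_GROWTH = 5`, and the helper `max_export_flux` are hypothetical values and names chosen for illustration. The extra column added for test case 2 is an assumed internal reaction $A_{int} \rightarrow C_{int}$.

```python
import numpy as np
from scipy.optimize import linprog

# Toy model 1: columns are A_SRC->A_int, A_int->B_int, B_int->C_int, C_int->C_SNK.
S = np.array([
    [1, -1,  0,  0],
    [0,  1, -1,  0],
    [0,  0,  1, -1],
], dtype=float)
C_IDX = 3        # export reaction used as the growth flux
U_GROWTH = 5.0   # hypothetical growth threshold


def max_export_flux(S, upper_bounds):
    """Maximise the export flux subject to S v = 0 and 0 <= v <= upper_bounds."""
    cost = np.zeros(S.shape[1])
    cost[C_IDX] = -1.0  # linprog minimises, so negate the objective
    res = linprog(cost, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[(0.0, ub) for ub in upper_bounds])
    return -res.fun


# Test case 1: the model predicts growth; knock out one reaction at a time.
full_bounds = np.array([10.0, 1000.0, 1000.0, 1000.0])
baseline = max_export_flux(S, full_bounds)
knockout_fluxes = []
for i in range(S.shape[1]):
    ub = full_bounds.copy()
    ub[i] = 0.0  # removing reaction i
    knockout_fluxes.append(max_export_flux(S, ub))

# Test case 2: the source is limited below the threshold.
limited_bounds = np.array([1.0, 1000.0, 1000.0, 1000.0])
limited = max_export_flux(S, limited_bounds)

# Adding an internal reaction A_int -> C_int does not lift the source limit.
S_extra = np.hstack([S, [[-1.0], [0.0], [1.0]]])
limited_extra = max_export_flux(S_extra, np.append(limited_bounds, 1000.0))

print("baseline:", baseline, "knockouts:", knockout_fluxes)
print("limited:", limited, "with extra internal reaction:", limited_extra)
```

Every single knockout brings the export flux to zero, and the added internal reaction leaves the source-limited flux unchanged, which is the behaviour described in the two test cases.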

### Formulation of CROP problem

fill in details here


## Testing toy model 2

This model extends toy model 1 by introducing another internal reaction: $A \rightarrow C$.

This creates 2 pathways that yield the waste product 'C'.

<img src="crop_test_model2_simple.PNG" alt="Toy Model 2" width="800"/>

### Preliminary flux balance analysis

Stoichiometric table: 

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |


Stoichiometric matrix, **S**:

$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1\\
0 & +1 & -1 & 0 & 0\\
0 & 0 & +1 & -1 & +1\\
\end{bmatrix}
$$


Flux vector, $\vec{v}$:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
v^{AC}_{\textrm{exg}} - v^{CA}_{\textrm{exg}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
\end{bmatrix}
$$





At steady-state, $\textbf{S}\vec{v}=0$, which gives the following: 

$$
\textbf{S}\vec{v}
=
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1\\
0 & +1 & -1 & 0 & 0\\
0 & 0 & +1 & -1 & +1\\
\end{bmatrix}
\cdot
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
\end{bmatrix}
=
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}} - v_{A_{int} \rightarrow C_{int}} \\
v_{A_{int} \rightarrow B_{int}} - v_{B_{int} \rightarrow C_{int}}  \\
v_{B_{int} \rightarrow C_{int}} - v_{C_{int} \rightarrow C_{SNK}} + v_{A_{int} \rightarrow C_{int}}\\
\end{bmatrix}
=
\begin{bmatrix}
0 \\
0 \\
0 \\
\end{bmatrix}
$$

Doing some algebra, we see that:

$v_{A_{SRC} \rightarrow A_{int}} = v_{A_{int} \rightarrow B_{int}} + v_{A_{int} \rightarrow C_{int}}$

$v_{A_{int} \rightarrow B_{int}} = v_{B_{int} \rightarrow C_{int}} $ 

$v_{C_{int} \rightarrow C_{SNK}} =  v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}$

which means that:

$v_{A_{SRC} \rightarrow A_{int}} = v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}} = v_{C_{int} \rightarrow C_{SNK}}$
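These relations can be verified numerically on an arbitrary steady-state flux vector. The sketch below assumes NumPy; it draws a random combination of null-space basis vectors of $\textbf{S}$ and checks each identity. Sign constraints on the fluxes are ignored here, since only the linear algebra is being checked, and the random seed is arbitrary.

```python
import numpy as np

# Toy model 2: the fifth column is the added reaction A_int -> C_int.
S = np.array([
    [1, -1,  0,  0, -1],
    [0,  1, -1,  0,  0],
    [0,  0,  1, -1,  1],
], dtype=float)

rank = np.linalg.matrix_rank(S)
_, _, vt = np.linalg.svd(S)
null_basis = vt[rank:].T  # shape (5, 2): two free fluxes

# Any combination of the basis vectors satisfies S v = 0.
rng = np.random.default_rng(0)
v = null_basis @ rng.normal(size=null_basis.shape[1])
v_in, v_ab, v_bc, v_out, v_ac = v

print("null space dimension:", null_basis.shape[1])
print("import == A->B + A->C :", np.isclose(v_in, v_ab + v_ac))
print("A->B   == B->C        :", np.isclose(v_ab, v_bc))
print("export == B->C + A->C :", np.isclose(v_out, v_bc + v_ac))
print("import == export      :", np.isclose(v_in, v_out))
```

The null space has dimension 2, so the split between the two pathways is free while the import and export fluxes stay equal.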


### Expected test results for CROP

We use the above toy model as the 'provisional model' that needs to be updated based on data.

#### Test case 1: growth is predicted, but not observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} < U_{\textrm{growth}}$ (no growth), whereas the provisional model predicts a flux above this threshold.

$\rightarrow$ If the model predicts growth, removing the $v_{A_{SRC} \rightarrow A_{int}}$ reaction, or removing both the $v_{B_{int} \rightarrow C_{int}}$ and $v_{A_{int} \rightarrow C_{int}}$ reactions, will remove the growth flux. This means that there are multiple solutions to this problem.

#### Test case 2: growth is not predicted, but is observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} > U_{\textrm{growth}}$ (growth), whereas the provisional model predicts a flux below this threshold.

$\rightarrow$ If the model does not predict growth, then $v_{A_{SRC} \rightarrow A_{int}}$ and $v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}$ are also below $U_{\textrm{growth}}$. Adding another internal reaction will not increase the flux, because the import flux $v_{A_{SRC} \rightarrow A_{int}}$ caps the total flux through the system. So there is no solution to this problem, unless an additional source can be added.
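For test case 1, the sets of reactions whose removal abolishes growth can be enumerated directly. The following is an illustrative sketch using `scipy.optimize.linprog`, not the CROP formulation; the bounds, the threshold `U_GROWTH`, and the helper `max_export_flux` are assumptions made for this example. It lists the minimal knockout sets, meaning sets for which no proper subset already blocks growth.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

RXNS = ["A_SRC_to_A_int", "A_int_to_B_int", "B_int_to_C_int",
        "C_int_to_C_SNK", "A_int_to_C_int"]
S = np.array([
    [1, -1,  0,  0, -1],
    [0,  1, -1,  0,  0],
    [0,  0,  1, -1,  1],
], dtype=float)
C_IDX = 3
U_GROWTH = 5.0                                          # hypothetical threshold
UPPER = np.array([10.0, 1000.0, 1000.0, 1000.0, 1000.0])  # hypothetical bounds


def max_export_flux(removed):
    """Maximal export flux after setting the bounds of removed reactions to zero."""
    ub = UPPER.copy()
    ub[list(removed)] = 0.0
    cost = np.zeros(len(RXNS))
    cost[C_IDX] = -1.0  # linprog minimises
    res = linprog(cost, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[(0.0, u) for u in ub])
    return -res.fun


minimal_sets = []
for size in range(1, len(RXNS) + 1):
    for removed in combinations(range(len(RXNS)), size):
        # Skip supersets of a knockout set that is already known to work.
        if any(set(prev) <= set(removed) for prev in minimal_sets):
            continue
        if max_export_flux(removed) < U_GROWTH:
            minimal_sets.append(removed)

for removed in minimal_sets:
    print([RXNS[i] for i in removed])
```

Besides the import reaction and the pair $B_{int} \rightarrow C_{int}$ with $A_{int} \rightarrow C_{int}$ named above, the enumeration also reports the export reaction itself and the pair $A_{int} \rightarrow B_{int}$ with $A_{int} \rightarrow C_{int}$, since each of these also disconnects the source from the sink.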

### Formulation of CROP problem

fill in details here

## CROP Gap Creation (Reaction Removal) Test

Toy model 3:

<img src="crop_test_model3_simple.PNG" alt="Toy Model 3" width="800"/>



Stoichiometric table: 

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  | $B_{SRC}\rightarrow B_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |0 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |+1|
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |0 |

Stoichiometric matrix, **S**:

$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1 & 0\\
0 & +1 & -1 & 0 & 0 & +1\\
0 & 0 & +1 & -1 & +1 & 0\\
\end{bmatrix}
$$

Flux vector, $\vec{v}$:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
v^{AC}_{\textrm{exg}} - v^{CA}_{\textrm{exg}} \\
v^{B}_{\textrm{imp}} - v^{B}_{\textrm{exp}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
v_{B_{SRC} \rightarrow B_{int}} \\
\end{bmatrix}
$$

Initial CROP Z list (all reactions included): 
$$
\vec{Z_0} = 
\begin{bmatrix}
1 \\
1 \\
1 \\
1 \\
1 \\
1 \\
\end{bmatrix}
$$

CROP weight list (all reactions are equally probable): 
$$
\vec{weights} =
\begin{bmatrix}
0.5 \\
0.5 \\
0.5 \\
0.5 \\
0.5 \\
0.5 \\
\end{bmatrix}
$$

Experimental conditions and observations:

1. $v_{A_{SRC} \rightarrow A_{int}} = U_{A_1}$ and $v_{B_{SRC} \rightarrow B_{int}} = U_{B_1} $
    * growth observed: $v_{C_{int} \rightarrow C_{SNK}} > U_{\textrm{grow}}$
2. $v_{A_{SRC} \rightarrow A_{int}} = U_{A_2} $ and $v_{B_{SRC} \rightarrow B_{int}} = U_{B_2} $
    * no growth observed: $v_{C_{int} \rightarrow C_{SNK}} < U_{\textrm{grow}}$

where $U_{B_1} = U_{B_2} = U_{A_1} \gg U_{A_2}$.

Additional constraints: 
1. steady-state flux: $\textbf{S}\vec{v}=0$
2. bounded non-negative net fluxes: $0 \leq v_i \leq U_i$ for flux $i$.

Comments:

1. Intuitively, we see that there are two pathways that generate 'C' flux. One using the 'A' nutrient and another using the 'B' nutrient. If the 'B' pathway is removed, then the model cannot grow without A and will therefore be consistent with the data. 
This means that CROP should remove a reaction (or reactions) corresponding to the 'B' pathway, such as $v_{B_{SRC} \rightarrow B_{int}}$ or $v_{B_{int} \rightarrow C_{int}}$. 
2. Algebraically, we find that at steady-state the fluxes are:
$$
\begin{aligned}
v_{A_{int} \rightarrow C_{int}} &= v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}}\\
v_{B_{int} \rightarrow C_{int}} &= v_{A_{int} \rightarrow B_{int}} + v_{B_{SRC} \rightarrow B_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} &= v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}
\end{aligned}
$$


Expected output for CROP:
1. $\vec{Z} = \begin{bmatrix} 1  & 1  & 1  & 1  & 1  & 0 \end{bmatrix}$ or
2. $\vec{Z} = \begin{bmatrix} 1  & 1  & 0  & 1  & 1  & 1 \end{bmatrix}$ or
3. $\vec{Z} = \begin{bmatrix} 1  & 1  & 0  & 1  & 1  & 0 \end{bmatrix}$
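The expected outputs above can be cross-checked by brute force, since toy model 3 has only $2^6$ candidate $\vec{Z}$ vectors. The sketch below is not the CROP MILP; it solves one small linear program per condition with `scipy.optimize.linprog`. It assumes that the experimental values act as upper bounds on the two uptake fluxes, and the numbers ($U_{A_1} = U_{B_1} = U_{B_2} = 10$, $U_{A_2} = 1$, $U_{\textrm{grow}} = 5$) and the helper `max_export_flux` are hypothetical.

```python
from itertools import product

import numpy as np
from scipy.optimize import linprog

# Toy model 3: columns follow the stoichiometric table above.
S = np.array([
    [1, -1,  0,  0, -1, 0],
    [0,  1, -1,  0,  0, 1],
    [0,  0,  1, -1,  1, 0],
], dtype=float)
A_SRC, C_SNK, B_SRC = 0, 3, 5
U_GROW = 5.0   # hypothetical growth threshold
BIG = 1000.0   # hypothetical bound for internal and export reactions


def max_export_flux(z, u_a, u_b):
    """Maximal export flux for reaction set z with uptake bounds u_a and u_b."""
    ub = np.full(S.shape[1], BIG)
    ub[A_SRC], ub[B_SRC] = u_a, u_b
    ub = ub * np.asarray(z)  # z_i = 0 removes reaction i
    cost = np.zeros(S.shape[1])
    cost[C_SNK] = -1.0  # linprog minimises
    res = linprog(cost, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=[(0.0, u) for u in ub])
    return -res.fun


consistent = []
for z in product([0, 1], repeat=S.shape[1]):
    grows_1 = max_export_flux(z, u_a=10.0, u_b=10.0) > U_GROW  # condition 1
    grows_2 = max_export_flux(z, u_a=1.0, u_b=10.0) > U_GROW   # condition 2
    if grows_1 and not grows_2:
        consistent.append(z)

# With equal weights, the cost of a solution is the number of removed reactions.
fewest_removed = min(len(z) - sum(z) for z in consistent)
minimal = [z for z in consistent if len(z) - sum(z) == fewest_removed]
print("minimal solutions:", minimal)
```

Under these assumptions, the first two expected vectors are the solutions with the fewest removed reactions, and the third is consistent with the data but removes one more reaction than necessary.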



### CROP problem formulation

$$\begin{equation}\begin{array}{l}
\min_z \; weights^T(1-z) \\
\begin{array}{lll}
& v_{nogrowth,C} = v_{A_{max}}\cdot r_{A} \\
& Sv_{nogrowth}= 0 & \text{inner problem} \\
& 0\leq v_{nogrowth,i}\leq U_{nogrowth,i}\cdot z_i \\
& S^Tm +c = r \\
& r_i\leq\Omega_i\cdot(1-z_i) & \text{for } i\neq A \\
& r_{A} \geq 0 \\
\end{array}\\
v_{nogrowth,C} \leq\text{minimal-growth} \\
Sv_{growth}= 0 \\
0\leq v_{growth,i}\leq U_{growth,i}\cdot z_i \\
v_{growth,C} \geq \text{minimal-growth} \\
z_i\in \{0,1\} \\
\end{array}\end{equation}$$


where $S$ is the stoichiometric matrix of toy model 3:

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  | $B_{SRC}\rightarrow B_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |0 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |+1|
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |0 |


In [15]:
# code formulation
import cvxpy as cp
import numpy as np
import pandas as pd

# Stoichiometric matrix: rows are metabolites, columns are reactions
S_index = ['A_int', 'B_int', 'C_int']
S_dict = {'A_SRC_to_A_int': [1, 0, 0],
          'A_int_to_B_int': [-1, 1, 0],
          'B_int_to_C_int': [0, -1, 1],
          'C_int_to_C_SNK': [0, 0, -1],
          'A_int_to_C_int': [-1, 0, 1],
          'B_SRC_to_B_int': [0, 1, 0],
          }
S_table = pd.DataFrame(S_dict, index=S_index)
S = S_table.values

n_mets, n_rxns = S.shape

max_influx = 10   # v_A_max in the formulation
min_growth = 5    # minimal-growth threshold on the C export flux
U_nogrowth = np.eye(n_rxns)*1000.0   # diagonal matrices of flux upper bounds
U_growth   = np.eye(n_rxns)*1000.0
weights = np.ones(n_rxns)*0.5
Omega   = 1000.0

A_idx = 0 # index for A_SRC to A_int
C_idx = 3 # index for C_int to C_SNK

v_growth = cp.Variable(n_rxns)
m = cp.Variable(n_mets)
r = cp.Variable(n_rxns)
v_nogrowth = cp.Variable(n_rxns)

z = cp.Variable(n_rxns, boolean=True)  # z_i = 1 keeps reaction i, z_i = 0 removes it

c = np.zeros(n_rxns)
c[C_idx] = 1  # how to set c?

constraints = [
    # no-growth condition (inner problem)
    v_nogrowth[C_idx] == max_influx*r[A_idx],
    S@v_nogrowth == 0,
    0 <= v_nogrowth,
    v_nogrowth <= U_nogrowth@z,
    S.T@m - r == -c,
    r[A_idx] >= 0,
    # r[:A_idx] <= Omega*np.eye(A_idx)@(1 - z[:A_idx]) was removed: with A_idx = 0
    # the slice is empty and it raised "ValueError: Invalid dimensions (0, 0)"
    r[A_idx+1:] <= Omega*(1 - z[A_idx+1:]),
    # bound only the C export flux, as in the formulation above
    v_nogrowth[C_idx] <= min_growth,
    # growth condition
    S@v_growth == 0,
    0 <= v_growth,
    v_growth <= U_growth@z,
    v_growth[C_idx] >= min_growth,
]

single_level_milp = cp.Problem(cp.Minimize(weights.T@(1-z)), constraints)

results = single_level_milp.solve(solver=cp.SCIPY)

print(results)

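inf

An optimal value of `inf` is how CVXPY reports an infeasible minimization problem, so the solver found no $z$ that satisfies all of the constraints as coded.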