# Consistent Reproduction Of growth/no growth Phenotype (CROP)
Predictions from metabolic models do not necessarily match experimental observations.

Calibration of these models is challenging due to their size and complexity, as well as sparse data.

CROP builds on flux balance analysis methods and provides a coarse-grained calibration of metabolic networks based on growth / no growth phenotype data: 
1. If the model predicts growth but none is observed in the data --> remove reactions (gap creation)
2. If the model predicts no growth but growth is observed in the data --> add reactions (gap filling)

This notebook examines simple models to aid with the validation of the CROP algorithm.

## Testing toy model 1

Consider a simple metabolic reaction system at steady state in which a nutrient 'A' is imported into the cell, converted into an intermediate metabolite 'B', and then converted into a waste product 'C' that is exported out of the cell.


<img src="crop_test_model_simple.PNG" alt="Toy Model 1" width="800"/>

### Preliminary flux balance analysis 

Flux balance analysis requires a stoichiometric matrix, **S**, and net reaction rate vector, $\vec{v}$, such that we can solve for the steady-state fluxes: $\textbf{S}\vec{v}=0$.

The stoichiometries for this reaction system are given by the following table:

|          |     $A_{SRC}\rightarrow A_{int}$    | $A_{int}\rightarrow B_{int}$    |$B_{int}\rightarrow C_{int}$    |$C_{int}\rightarrow C_{SNK}$    |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |
| $B_{int}$ | 0 |+1 |-1 | 0 |
| $C_{int}$ | 0 | 0 |+1 |-1 |

Note that we model the internal (closed) reaction processes and metabolites, with external (open) reactions included as inputs and outputs.
    
This implies the following stoichiometric matrix form:
$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0\\
0 & +1 & -1 & 0\\
0 & 0 & +1 & -1\\
\end{bmatrix}
$$

Similarly, the net reaction rates/fluxes (forward minus reverse) are the following:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
\end{bmatrix}
$$







At steady-state, $\textbf{S}\vec{v}=0$, which gives the following: 

$$
\textbf{S}\vec{v}
=
\begin{bmatrix}
+1 & -1 & 0 & 0\\
0 & +1 & -1 & 0\\
0 & 0 & +1 & -1\\
\end{bmatrix}
\cdot
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
\end{bmatrix}
=
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}} \\
v_{A_{int} \rightarrow B_{int}} - v_{B_{int} \rightarrow C_{int}}  \\
v_{B_{int} \rightarrow C_{int}} - v_{C_{int} \rightarrow C_{SNK}}\\
\end{bmatrix}
=
\begin{bmatrix}
0 \\
0 \\
0 \\
\end{bmatrix}
$$



This is an underdetermined system with 4 unknown net fluxes but only 3 equations - which is typical for metabolic models. Additional constraints are required to solve for the flux values. 

However, it's trivial to see from this toy example that the fluxes are all tightly coupled (more specifically, they are equal).

From the above equation: 

$v_{A_{SRC} \rightarrow A_{int}} = v_{A_{int} \rightarrow B_{int}} =  v_{B_{int} \rightarrow C_{int}} = v_{C_{int} \rightarrow C_{SNK}}$
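
As a quick numerical check of this coupling, the sketch below computes the null space of $\textbf{S}$ with NumPy. The variable names and the SVD-based null-space computation are illustrative choices made here; they are not part of CROP itself.

```python
import numpy as np

# Stoichiometric matrix of toy model 1 (rows: A_int, B_int, C_int).
S = np.array([
    [1, -1,  0,  0],
    [0,  1, -1,  0],
    [0,  0,  1, -1],
], dtype=float)

# Right-singular vectors beyond the rank of S span its null space.
_, singular_values, Vt = np.linalg.svd(S)
rank = int(np.sum(singular_values > 1e-10))
null_basis = Vt[rank:]          # shape: (n_rxns - rank, n_rxns)

# Normalise so that the import flux equals 1.
mode = null_basis[0] / null_basis[0][0]
print("rank:", rank)
print("steady-state flux mode:", mode)
```

The null space is one-dimensional, so every steady-state flux vector is a multiple of a single mode in which all four fluxes are equal.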

### Expected test results for CROP

We use the above toy model as the 'provisional model' that needs to be updated based on data.

#### Test case 1: growth is predicted, but not observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} < U_{\textrm{growth}}$ (no growth), whereas the provisional model predicts growth.

$\rightarrow$ Because all four fluxes are equal at steady state, removing any single reaction forces the growth flux to zero. There are therefore multiple solutions to this problem.
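
The claim that any single removal blocks the flux can be checked with a rank argument: once a column of $\textbf{S}$ is deleted, a non-zero steady-state flux exists only if the reduced matrix has a non-trivial null space. The following is a minimal sketch under that assumption; the helper name `allows_nonzero_flux` is hypothetical, and the check ignores flux sign constraints, which is sufficient for showing that no flux at all remains.

```python
import numpy as np

reactions = ["A_SRC->A_int", "A_int->B_int", "B_int->C_int", "C_int->C_SNK"]
S = np.array([
    [1, -1,  0,  0],
    [0,  1, -1,  0],
    [0,  0,  1, -1],
], dtype=float)

def allows_nonzero_flux(S_reduced):
    """True if S_reduced @ v = 0 has a non-trivial solution."""
    n_rxns = S_reduced.shape[1]
    return np.linalg.matrix_rank(S_reduced) < n_rxns

blocking = []
for j, rxn in enumerate(reactions):
    S_reduced = np.delete(S, j, axis=1)   # drop the column of the removed reaction
    if not allows_nonzero_flux(S_reduced):
        blocking.append(rxn)

print("single removals that block all flux:", blocking)
```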

#### Test case 2: growth is not predicted, but is observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} > U_{\textrm{growth}}$ (growth), whereas the provisional model predicts no growth.

$\rightarrow$ If the model does not predict growth, then all of the fluxes are below $U_{\textrm{growth}}$. Adding another internal reaction will not increase the flux, because the total flux through the system is limited by the import of 'A' and does not allow growth. So there is no solution to this problem unless an additional source can be added.

### Formulation of CROP problem

fill in details here


## Testing toy model 2

This model extends toy model 1 by introducing another internal reaction: $A \rightarrow C$.

This creates 2 pathways that yield the waste product 'C'.

<img src="crop_test_model2_simple.PNG" alt="Toy Model 2" width="800"/>

### Preliminary flux balance analysis

Stoichiometric table: 

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |


Stoichiometric matrix, **S**:

$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1\\
0 & +1 & -1 & 0 & 0\\
0 & 0 & +1 & -1 & +1\\
\end{bmatrix}
$$


Flux vector, $\vec{v}$:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
v^{AC}_{\textrm{exg}} - v^{CA}_{\textrm{exg}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
\end{bmatrix}
$$





At steady-state, $\textbf{S}\vec{v}=0$, which gives the following: 

$$
\textbf{S}\vec{v}
=
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1\\
0 & +1 & -1 & 0 & 0\\
0 & 0 & +1 & -1 & +1\\
\end{bmatrix}
\cdot
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
\end{bmatrix}
=
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}} - v_{A_{int} \rightarrow C_{int}} \\
v_{A_{int} \rightarrow B_{int}} - v_{B_{int} \rightarrow C_{int}}  \\
v_{B_{int} \rightarrow C_{int}} - v_{C_{int} \rightarrow C_{SNK}} + v_{A_{int} \rightarrow C_{int}}\\
\end{bmatrix}
=
\begin{bmatrix}
0 \\
0 \\
0 \\
\end{bmatrix}
$$

Doing some algebra, we see that:

$v_{A_{SRC} \rightarrow A_{int}} = v_{A_{int} \rightarrow B_{int}} + v_{A_{int} \rightarrow C_{int}}$

$v_{A_{int} \rightarrow B_{int}} = v_{B_{int} \rightarrow C_{int}} $ 

$v_{C_{int} \rightarrow C_{SNK}} =  v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}$

which means that:

$v_{A_{SRC} \rightarrow A_{int}} = v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}} = v_{C_{int} \rightarrow C_{SNK}}$
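
These relations can be verified numerically. The sketch below builds an arbitrary steady-state flux as a non-negative combination of the two routes to 'C' and checks the balance. The two route vectors and the random coefficients are illustrative choices, not taken from the notebook.

```python
import numpy as np

# Columns: A_SRC->A_int, A_int->B_int, B_int->C_int, C_int->C_SNK, A_int->C_int
S = np.array([
    [1, -1,  0,  0, -1],
    [0,  1, -1,  0,  0],
    [0,  0,  1, -1,  1],
], dtype=float)

via_B  = np.array([1, 1, 1, 1, 0], dtype=float)   # A -> B -> C
direct = np.array([1, 0, 0, 1, 1], dtype=float)   # A -> C

# Any non-negative mix of the two routes is a steady-state flux.
rng = np.random.default_rng(0)
a, b = rng.uniform(0, 10, size=2)
v = a * via_B + b * direct

assert np.allclose(S @ v, 0)
print("import:", v[0], " export:", v[3], " B->C + A->C:", v[2] + v[4])
```

The import flux, the export flux, and the sum of the two fluxes into 'C' coincide, as in the algebra above.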


### Expected test results for CROP

Using the above toy model as the 'provisional model' that needs to be updated based on data.

#### Test case 1: growth is predicted, but not observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} < U_{\textrm{growth}}$ (no growth), whereas the provisional model predicts growth.

$\rightarrow$ If the model predicts growth, the growth flux can be removed either by removing the import reaction $A_{SRC} \rightarrow A_{int}$ alone, or by removing both the $B_{int} \rightarrow C_{int}$ and $A_{int} \rightarrow C_{int}$ reactions. There are therefore multiple solutions to this problem.
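
To make "multiple solutions" concrete, the sketch below enumerates the minimal sets of reactions whose removal blocks every route from the source to the sink. It assumes the two routes listed in `pathways` are the only ways to produce 'C' and that all reactions are irreversible. The brute-force enumeration is an illustration, not the CROP algorithm.

```python
from itertools import combinations

reactions = ["A_SRC->A_int", "A_int->B_int", "B_int->C_int",
             "C_int->C_SNK", "A_int->C_int"]

# The two routes from source to sink (assumed irreversible).
pathways = [
    {"A_SRC->A_int", "A_int->B_int", "B_int->C_int", "C_int->C_SNK"},
    {"A_SRC->A_int", "A_int->C_int", "C_int->C_SNK"},
]

def blocks_growth(removed):
    """Growth is blocked when every pathway loses at least one reaction."""
    return all(pathway & removed for pathway in pathways)

minimal_cut_sets = []
for size in range(1, len(reactions) + 1):
    for subset in combinations(reactions, size):
        removed = set(subset)
        # Skip supersets of a cut set that was already found.
        is_superset = any(found <= removed for found in minimal_cut_sets)
        if blocks_growth(removed) and not is_superset:
            minimal_cut_sets.append(removed)

for cut in minimal_cut_sets:
    print(sorted(cut))
```

Besides the two options named in the text, the enumeration also reports the export reaction alone and the pair $A_{int} \rightarrow B_{int}$, $A_{int} \rightarrow C_{int}$.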

#### Test case 2: growth is not predicted, but is observed in data
The data show that $v_{C_{int} \rightarrow C_{SNK}} > U_{\textrm{growth}}$ (growth), whereas the provisional model predicts no growth.

$\rightarrow$ If the model does not predict growth, then $v_{A_{SRC} \rightarrow A_{int}}$ and $v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}$ are also below $U_{\textrm{growth}}$. Adding another internal reaction will not increase the flux, because the total flux through the system does not allow growth. So there is no solution to this problem unless an additional source can be added.

### Formulation of CROP problem

fill in details here

## CROP Gap Creation (Reaction Removal) Test

Toy model 3:

<img src="crop_test_model3_simple.PNG" alt="Toy Model 3" width="800"/>



Stoichiometric table: 

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  | $B_{SRC}\rightarrow B_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |0 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |+1|
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |0 |

Stoichiometric matrix, **S**:

$$
\textbf{S} =
\begin{bmatrix}
+1 & -1 & 0 & 0 & -1 & 0\\
0 & +1 & -1 & 0 & 0 & +1\\
0 & 0 & +1 & -1 & +1 & 0\\
\end{bmatrix}
$$

Flux vector, $\vec{v}$:
$$
\vec{v} =
\begin{bmatrix}
v^A_{\textrm{imp}} - v^A_{\textrm{exp}} \\
v^{AB}_{\textrm{exg}} - v^{BA}_{\textrm{exg}}\\
v^{BC}_{\textrm{exg}} - v^{CB}_{\textrm{exg}} \\
v^{C}_{\textrm{exp}} - v^{C}_{\textrm{imp}} \\
v^{AC}_{\textrm{exg}} - v^{CA}_{\textrm{exg}} \\
v^{B}_{\textrm{imp}} - v^{B}_{\textrm{exp}} \\
\end{bmatrix}
= 
\begin{bmatrix}
v_{A_{SRC} \rightarrow A_{int}} \\
v_{A_{int} \rightarrow B_{int}} \\
v_{B_{int} \rightarrow C_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} \\
v_{A_{int} \rightarrow C_{int}} \\
v_{B_{SRC} \rightarrow B_{int}} \\
\end{bmatrix}
$$

Initial CROP Z list (all reactions included): 
$$
\vec{Z_0} = 
\begin{bmatrix}
1 \\
1 \\
1 \\
1 \\
1 \\
1 \\
\end{bmatrix}
$$

CROP weight list (all reactions are equally probable): 
$$
\vec{weights} =
\begin{bmatrix}
0.5 \\
0.5 \\
0.5 \\
0.5 \\
0.5 \\
0.5 \\
\end{bmatrix}
$$



Experimental conditions and observations:

1. $v_{A_{SRC} \rightarrow A_{int}} = U_{A_1}$ and $v_{B_{SRC} \rightarrow B_{int}} = U_{B_1} $
    * growth observed: $v_{C_{int} \rightarrow C_{SNK}} \geq U_{\textrm{grow}}$
    * Growth condition: only A in the media, and no B in the media $\implies$ growth
2. $v_{A_{SRC} \rightarrow A_{int}} = U_{A_2} $ and $v_{B_{SRC} \rightarrow B_{int}} = U_{B_2} $
    * no growth observed: $v_{C_{int} \rightarrow C_{SNK}} \leq U_{\textrm{no-grow}}$
    * No Growth condition: no A in the media, and only B in the media $\implies$ no growth

where $U_{A_1} \gg U_{A_2}$ and $U_{B_1} = 0 \ll U_{B_2}$.

Additional constraints: 
1. steady-state flux: $\textbf{S}\vec{v}=0$
2. bounded non-negative net fluxes: $0 \leq v_i \leq U_i$ for flux $i$ (the bound must be non-strict, since $U_{B_1} = 0$ forces the B import flux to zero).
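
A small helper makes these two constraints concrete. The sketch below checks a candidate flux vector against steady state and the flux bounds. The function name `is_feasible` and the numerical bounds (10 for the A import, 1000 for internal reactions) are illustrative assumptions.

```python
import numpy as np

# Columns: A_SRC->A_int, A_int->B_int, B_int->C_int,
#          C_int->C_SNK, A_int->C_int, B_SRC->B_int
S = np.array([
    [1, -1,  0,  0, -1, 0],
    [0,  1, -1,  0,  0, 1],
    [0,  0,  1, -1,  1, 0],
], dtype=float)

def is_feasible(v, upper_bounds, tol=1e-9):
    """Check steady state (S v = 0) and the bounds 0 <= v <= U."""
    v = np.asarray(v, dtype=float)
    steady = np.allclose(S @ v, 0, atol=tol)
    bounded = np.all(v >= -tol) and np.all(v <= np.asarray(upper_bounds) + tol)
    return bool(steady and bounded)

# Condition 1: A available (illustrative bound of 10), B absent.
U_condition_1 = [10, 1000, 1000, 1000, 1000, 0]

print(is_feasible([10, 0, 0, 10, 10, 0], U_condition_1))   # all A routed directly to C
print(is_feasible([0, 0, 10, 10, 0, 10], U_condition_1))   # needs B import, which is closed
```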

Comments:

1. Intuitively, we see that there are two pathways that generate 'C' flux. One using the 'A' nutrient and another using the 'B' nutrient. If the 'B' pathway is removed, then the model cannot grow without A and will therefore be consistent with the data. 
This means that CROP should remove a reaction (or reactions) corresponding to the 'B' pathway, such as $v_{B_{SRC} \rightarrow B_{int}}$ or $v_{B_{int} \rightarrow C_{int}}$. 
2. Algebraically, we find that at steady state the fluxes are:
$$
\begin{aligned}
v_{A_{int} \rightarrow C_{int}} &= v_{A_{SRC} \rightarrow A_{int}} - v_{A_{int} \rightarrow B_{int}}\\
v_{B_{int} \rightarrow C_{int}} &= v_{A_{int} \rightarrow B_{int}} + v_{B_{SRC} \rightarrow B_{int}} \\
v_{C_{int} \rightarrow C_{SNK}} &= v_{B_{int} \rightarrow C_{int}} + v_{A_{int} \rightarrow C_{int}}
\end{aligned}
$$


Expected output for CROP:
1. $\vec{Z} = \begin{bmatrix} 1  & 1  & 1  & 1  & 1  & 0 \end{bmatrix}$ or
2. $\vec{Z} = \begin{bmatrix} 1  & 1  & 0  & 1  & 1  & 1 \end{bmatrix}$ or
3. $\vec{Z} = \begin{bmatrix} 1  & 1  & 0  & 1  & 1  & 0 \end{bmatrix}$
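
Each candidate $\vec{Z}$ can be checked by solving the flux balance problem under both media. The sketch below uses `scipy.optimize.linprog` for this. The uptake bound of 10, the internal bound of 1000, and the helper name `max_growth` are assumptions made for illustration, and the sketch only verifies candidate solutions rather than implementing CROP.

```python
import numpy as np
from scipy.optimize import linprog

# Columns: A_SRC->A_int, A_int->B_int, B_int->C_int,
#          C_int->C_SNK, A_int->C_int, B_SRC->B_int
S = np.array([
    [1, -1,  0,  0, -1, 0],
    [0,  1, -1,  0,  0, 1],
    [0,  0,  1, -1,  1, 0],
], dtype=float)
A_IMPORT, C_EXPORT, B_IMPORT = 0, 3, 5

def max_growth(z, a_uptake, b_uptake, internal_bound=1000.0):
    """Maximise the C export flux with reactions switched on/off by z."""
    upper = np.full(S.shape[1], internal_bound)
    upper[A_IMPORT], upper[B_IMPORT] = a_uptake, b_uptake
    upper = upper * np.asarray(z, dtype=float)    # z_i = 0 closes reaction i
    objective = np.zeros(S.shape[1])
    objective[C_EXPORT] = -1.0                    # linprog minimises
    result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=[(0.0, float(u)) for u in upper], method="highs")
    return -result.fun

candidates = {
    "provisional": [1, 1, 1, 1, 1, 1],
    "Z1": [1, 1, 1, 1, 1, 0],
    "Z2": [1, 1, 0, 1, 1, 1],
    "Z3": [1, 1, 0, 1, 1, 0],
}
for name, z in candidates.items():
    growth_on_A = max_growth(z, a_uptake=10.0, b_uptake=0.0)
    growth_on_B = max_growth(z, a_uptake=0.0, b_uptake=10.0)
    print(f"{name}: A-only medium -> {growth_on_A:.1f}, B-only medium -> {growth_on_B:.1f}")
```

Under these assumptions the provisional model grows on both media, while each of the three expected outputs grows on A only.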



### CROP problem formulation (relaxed)

For a no-growth condition where only $B$ is in the media (and not $A$) and a growth condition where $A$ is in the media (but not $B$), the problem can be written as:

$$\begin{array}{ll}
\min_z &  weights^T(1 - z)  + M\cdot r_{nogrowth,\rightarrow B}\\
 &\begin{array}{lll}
& S^Tm + e_{C\rightarrow} = r_{nogrowth} \\
& r_{nogrowth,i}\leq\Omega_i\cdot(1-z_i) & \text{for $i\neq \rightarrow A$ or $\rightarrow B$} \\
& r_{nogrowth,\rightarrow A} \geq 0  \\
& r_{nogrowth,\rightarrow B} \geq 0 \\ 
\end{array} \\
& Sv_{growth} = 0 \\
& 0\leq v_{growth,i} \leq U_{growth,i}\cdot z_i  \text{  for $i\neq \rightarrow A$ or $\rightarrow B$}\\
& v_{growth,\rightarrow A} \leq M \\
& v_{growth,\rightarrow B} \leq 0 \\
& v_{growth,C\rightarrow} \geq \text{minimal growth} \\
& z_{\rightarrow A} = z_{\rightarrow B} = 1 \\
& z\in \{0,1\}
\end{array}$$

where the constant $M$ balances
1. the model's consistency with reaction weights versus
2. consistency with growth phenotypes

and $ e_{C\rightarrow}$ is a unit vector which is all zeros except for the export flux for C. This selects for the flux of C (bio growth flux).

and $r_{nogrowth}$ is the dual variable associated with the flux bounds in the primal problem (regular FBA). 0 means the bounds are not tight, negative means the lower bound is tight, positive means the upper bound is tight. 

where the stoichiometric matrix $S$ is given by:

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  | $B_{SRC}\rightarrow B_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |0 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |+1|
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |0 |


### CROP problem formulation (not relaxed)

For a no-growth condition where only $B$ is in the media (and not $A$) and a growth condition where $A$ is in the media (but not $B$), the problem can be written as:

$$\begin{array}{lll}
\min_z &  weights^T(1 - z) \text{\ \ \ \ \ \ if you have to remove a reaction, choose one with lowest likelihood (weight) }\\ \\
 &\begin{array}{lll}
& S^Tm + e_{C\rightarrow} = r_{nogrowth} & \text{Gibbs free energy-like constraint. Dual version of steady-state constraint. All equality constraints come into this part.}\\ 
& r_{nogrowth,i}\leq\Omega\cdot(1-z_i) & \text{for $i\neq \rightarrow A$ or $\rightarrow B$ and the reaction is part of the model $z_{i}=1$ then the dual variable, $r$, is {\it{not}} allowed to be positive. If the reaction is not part of model, then $r$ is allowed to be positive.} \\
& r_{nogrowth,\rightarrow A} \geq 0 & \text{Dual variable $r$ is allowed to be positive (tight) for A import reaction.}\\
& r_{nogrowth,\rightarrow B} \geq 0 & \text{Dual variable $r$ is allowed to be positive (tight) for B import reaction.}\\ 
& U_{B_2}\cdot r_{nogrowth,\rightarrow B}   \leq U_{nogrowth} & \text{The solution to the dual will exactly equal the primal, when this condition is met. Upper bound for $r_{nogrowth,\rightarrow B}$ } \\
\end{array} \\
& Sv_{growth} = 0 \text{\ \ \ \ \ \ steady state constraint in growth condition. }\\
& 0\leq v_{growth,i} \leq U_{growth,i}\cdot z_i  \text{\ \ \ \ \ \ for $i\neq \rightarrow A$ or $\rightarrow B$. No flux is allowed if you aren't part of the model.}\\
& v_{growth,\rightarrow A} \leq U_{growth,\rightarrow A} \text{\ \ \ \ \ \ upper bound for A import flux reaction}\\
& v_{growth,\rightarrow B} \leq 0 \text{\ \ \ \ \ \ upper bound for B import flux reaction} \\
& v_{growth,C\rightarrow} \geq U_{growth,C\rightarrow} \text{\ \ \ \ \ \ minimal growth condition} \\
& z_{\rightarrow A} = z_{\rightarrow B} = 1 \text{\ \ \ \ \ \ both A and B import reactions are in the model} \\  % maybe not necessary
& z\in \{0,1\} \text{\ \ \ \ \ \ $z_i=0$ means the reaction $i$ is not in the model, and $z_i=1$ means the reaction $i$ is in the model}
\end{array}$$

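Unlike the relaxed formulation, this version has no penalty constant $M$: consistency with the no-growth phenotype is enforced by the hard constraint $U_{B_2}\cdot r_{nogrowth,\rightarrow B} \leq U_{nogrowth}$ rather than traded off against the reaction weights.

Here $e_{C\rightarrow}$ is a unit vector which is all zeros except for the export flux for C. This selects for the flux of C (bio growth flux).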

and $r_{nogrowth}$ is the dual variable associated with the flux bounds in the primal problem (regular FBA). 0 means the bounds are not tight, negative means the lower bound is tight, positive means the upper bound is tight. 

and $m$ is the dual variable associated with the steady state equality constraint. $m<0$ if relaxing lower bound on steady state would increase growth. $m > 0$ if relaxing the upper bound on steady state would increase growth.

and $\Omega$ is a large positive number

where the stoichiometric matrix $S$ is given by:

|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ |  $A_{int}\rightarrow C_{int}$  | $B_{SRC}\rightarrow B_{int}$  |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ |+1 |-1 | 0 | 0 |-1 |0 |
| $B_{int}$ | 0 |+1 |-1 | 0 | 0 |+1|
| $C_{int}$ | 0 | 0 |+1 |-1 |+1 |0 |



## Unit test for ABC toy model 3

The current model includes the A-->B reaction.

The true model does not include the A-->B reaction.

Conditions
1. observe growth with A in media but not B
2. observe growth with B in media but not A
3. observe no growth with A in media but not B - with a knockout for A-->C
4. observe growth with B in the media but not A - with a knockout for A-->C
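
Before setting up the optimization, the four conditions can be checked by hand against both models. The sketch below does this with simple route logic. The short reaction names, the `ROUTES` table, and the assumption that import and export reactions are always present are illustrative simplifications, not part of the CROP formulation.

```python
# Routes that can produce C, written as (required nutrient, reactions used).
ROUTES = [
    ("A", {"A->C"}),
    ("A", {"A->B", "B->C"}),
    ("B", {"B->C"}),
]

def predicts_growth(model_reactions, nutrient, knockouts=frozenset()):
    """Growth is predicted if some route for the nutrient is fully available."""
    active = set(model_reactions) - set(knockouts)
    return any(source == nutrient and route <= active for source, route in ROUTES)

# (nutrient in medium, knocked-out reactions, observed growth)
conditions = [
    ("A", set(),    True),
    ("B", set(),    True),
    ("A", {"A->C"}, False),
    ("B", {"A->C"}, True),
]

current_model = {"A->B", "B->C", "A->C"}
true_model    = {"B->C", "A->C"}

def mismatches(model_reactions):
    """1-based indices of conditions the model fails to reproduce."""
    return [i for i, (nutrient, ko, observed) in enumerate(conditions, start=1)
            if predicts_growth(model_reactions, nutrient, ko) != observed]

print("current model mismatches:", mismatches(current_model))
print("true model mismatches:", mismatches(true_model))
```

Only condition 3 distinguishes the two models, so a calibration that reproduces all four phenotypes has to remove the A-->B reaction.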

In [18]:
# code formulation
from dataclasses import dataclass
import cvxpy as cp
import numpy as np
import pandas as pd
import optlang as op
import cobra as cb 
from cobra.core.reaction import Reaction
from cobra.core.metabolite import Metabolite
from optlang import Model, Variable, Constraint, Objective


def model_from_stoich_matrix(S:pd.DataFrame, name:str, obj:dict[str,int], lower_flux_bound:dict[str,float], upper_flux_bound:dict[str,float])->cb.core.model.Model:
    """Creates a cobra model from a stoichiometric matrix."""
    model = cb.core.model.Model(name)
    # Reaction() takes no objective argument, so objective coefficients are set once the reactions are in the model.
    model.add_reactions([Reaction(rxn_id, lower_bound=lower_flux_bound[rxn_id], upper_bound=upper_flux_bound[rxn_id]) for rxn_id in S.columns])
    for rxn in model.reactions:
        rxn.add_metabolites({Metabolite(met_id):stoichiometry for met_id,stoichiometry in S[rxn.id].to_dict().items() if stoichiometry !=0})
        rxn.objective_coefficient = obj[rxn.id]
    return model


@dataclass
class PhenotypeObservation:
    """Specifies media conditions, gene knockouts, reaction knockouts, and growth/no growth observation."""
    medium:dict[str,float]
    gene_knockouts:dict[str,bool]  # TODO
    reaction_knockouts:dict[str,bool]
    growth_phenotype:bool


class ConsistentReproductionOfPhenotype:
    """Adding and removing reactions based on phenotype observations."""

    def __init__(self, model:cb.core.model.Model, phenotypes:dict[str,PhenotypeObservation]):
        self.model = model
        self.phenotypes = phenotypes

    def create_problem(self):
        """Create an OptLang representation of the CROP problem."""


    
# stoichiometry 

# S_index = [r'$A_{int}$', r'$B_{int}$', r'$C_{int}$']
# S_dict = {r'$A_{SRC}\rightarrow%A_{int}$': [1,0,0],
#           r'$A_{int}\rightarrow%B_{int}$': [-1,1,0],
#           r'$B_{int}\rightarrow%C_{int}$': [0,-1,1],
#           r'$C_{int}\rightarrow%C_{SNK}$': [0,0,-1],
#           r'$A_{int}\rightarrow%C_{int}$': [-1,0,1],
#           r'$B_{SRC}\rightarrow%B_{int}$': [0,1,0],
#           }

S_index = ['A_int', 'B_int', 'C_int']
S_dict = {'A_SRC->A_int': [1,0,0],
          'A_int->B_int': [-1,1,0],
          'B_int->C_int': [0,-1,1],
          'C_int->C_SNK': [0,0,-1],
          'A_int->C_int': [-1,0,1],
          'B_SRC->B_int': [0,1,0],
          }

S_table = pd.DataFrame( S_dict ,index=S_index)
mets, rxns = S_table.index, S_table.columns
n_mets = len(mets)
n_rxns = len(rxns)

display(S_table)


# constant Omega
upper_flux_bound = 1e3

min_growth_flux = 10
max_nogrowth_flux = 5


A_not_B_medium = {
 'A_SRC->A_int': min_growth_flux,
 'B_SRC->B_int': 0.0
 }

B_not_A_medium = {
 'A_SRC->A_int': 0.0,
 'B_SRC->B_int': min_growth_flux
 }

reaction_knockouts = {
 'A_int->C_int': True,
 }


phenotype_observations = {
    # gene_knockouts has no default value, so it is passed explicitly (empty: no gene knockouts in this test)
    'A_not_B': PhenotypeObservation(medium=A_not_B_medium, gene_knockouts={}, reaction_knockouts={}, growth_phenotype=True),
    'B_not_A': PhenotypeObservation(medium=B_not_A_medium, gene_knockouts={}, reaction_knockouts={}, growth_phenotype=True),
    'A_not_B_ko_AC':PhenotypeObservation(medium=A_not_B_medium, gene_knockouts={}, reaction_knockouts=reaction_knockouts, growth_phenotype=False),
    'B_not_A_ko_AC':PhenotypeObservation(medium=B_not_A_medium, gene_knockouts={}, reaction_knockouts=reaction_knockouts, growth_phenotype=True)
}


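# model_from_stoich_matrix also needs the objective coefficients and the flux bounds
objs = {rxn_id:1 if rxn_id=='C_int->C_SNK' else 0 for rxn_id in rxns}
lower_flux_bounds = {rxn_id:0 for rxn_id in rxns}
upper_flux_bounds = {rxn_id:upper_flux_bound for rxn_id in rxns}
cobra_model = model_from_stoich_matrix(S_table, "ABC_toy_model_3", objs, lower_flux_bounds, upper_flux_bounds)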

# list of weights (reaction size)
likelihoods = [0.5 for _ in rxns]

# list of z (reaction size)
reaction_indicator = {rxn_id:op.Variable(f"z_{rxn_id}", type="binary") for rxn_id in rxns}
display(reaction_indicator)

# flux dual, metabolite dual, and flux variables
flux_dual = {}
flux = {}
metabolite_dual = {}
for observation_id, observation in phenotype_observations.items():
    if observation.growth_phenotype: # primal problem when growth is observed
        flux[observation_id] = {rxn_id:op.Variable(f"v_{observation_id}_{rxn_id}") for rxn_id in rxns}
    else: # dual problem when no growth is observed
        flux_dual[observation_id] = {rxn_id:op.Variable(f"r_{observation_id}_{rxn_id}") for rxn_id in rxns}
        metabolite_dual[observation_id] = {met_id:op.Variable(f"m_{observation_id}_{met_id}") for met_id in mets}

display(flux_dual)
display(flux)
display(metabolite_dual)

# constraints
# Gibbs free energy constraint S^T m + e_c = r_nogrowth

objs = {rxn_id:1 if rxn_id=='C_int->C_SNK' else 0 for rxn_id in rxns}

constraint = {}
for observation_id, observation in phenotype_observations.items():
    if not observation.growth_phenotype:
        for reaction in cobra_model.reactions:
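            # reaction.metabolites maps Metabolite objects to coefficients, so iterate over .items() and index the duals by met.id
            constraint[f"Gibbs_{observation_id}_{reaction.id}"] = Constraint(sum(stoichiometry*metabolite_dual[observation_id][met.id] 
                                                                for met,stoichiometry in reaction.metabolites.items())-flux_dual[observation_id][reaction.id] + objs[reaction.id], 
                                                                lb=0, ub=0)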



# objective


# list of U_nogrowth and U_growth (reaction size)
U_growth = {rxn_id:upper_flux_bound for rxn_id in rxns}
U_nogrowth = {rxn_id:upper_flux_bound for rxn_id in rxns}


lower_flux_bounds = {rxn_id:0 for rxn_id in rxns}

U_growth['A_SRC->A_int'] = min_growth_flux
U_growth['B_SRC->B_int'] = 0




display(cobra_model.summary())



# list of r (no growth)
flux_dual = {rxn_id:op.Variable(f"r_nogrowth_{rxn_id}") for rxn_id in rxns}
display(flux_dual)

# list of v (growth)
flux = {rxn_id:op.Variable(f"v_growth_{rxn_id}") for rxn_id in rxns}
display(flux)

# list of m (metabolite size)
metabolite_dual = {met_id:op.Variable(f"m_nogrowth_{met_id}") for met_id in mets}
display(metabolite_dual)

# # list of e_c (reaction size)
# e_c = 1
# display(e_c)



# matrix for stoichiometric matrix
S_table


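|  | A_SRC->A_int | A_int->B_int | B_int->C_int | C_int->C_SNK | A_int->C_int | B_SRC->B_int |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| A_int | 1 | -1 | 0 | 0 | -1 | 0 |
| B_int | 0 | 1 | -1 | 0 | 0 | 1 |
| C_int | 0 | 0 | 1 | -1 | 1 | 0 |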



In [25]:
dir(cobra_model.problem)

['Configuration',
 'Constraint',
 'GLP_BV',
 'GLP_CV',
 'GLP_DB',
 'GLP_ETMLIM',
 'GLP_FEAS',
 'GLP_FR',
 'GLP_FX',
 'GLP_INFEAS',
 'GLP_IV',
 'GLP_LO',
 'GLP_MAX',
 'GLP_MIN',
 'GLP_MSG_ALL',
 'GLP_MSG_ERR',
 'GLP_MSG_OFF',
 'GLP_MSG_ON',
 'GLP_NOFEAS',
 'GLP_OFF',
 'GLP_ON',
 'GLP_OPT',
 'GLP_SF_AUTO',
 'GLP_UNBND',
 'GLP_UNDEF',
 'GLP_UP',
 'Model',
 'Objective',
 'TemporaryFilename',
 'Variable',
 '_GLPK_STATUS_TO_STATUS',
 '_GLPK_VTYPE_TO_VTYPE',
 '_VTYPE_TO_GLPK_VTYPE',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_glpk_validate_id',
 'doubleArray',
 'get_col_duals',
 'get_col_primals',
 'get_row_duals',
 'get_row_primals',
 'glp_add_cols',
 'glp_add_rows',
 'glp_adv_basis',
 'glp_create_index',
 'glp_create_prob',
 'glp_del_cols',
 'glp_del_rows',
 'glp_find_col',
 'glp_find_row',
 'glp_get_col_dual',
 'glp_get_col_kind',
 'glp_get_col_lb',
 'glp_get_col_name',
 'glp_get_col_prim',
 'glp_get_col_ub',
 'glp_ge

In [47]:
#cobra_model.problem.Model.constraints.keys()
# dir(cobra_model)
# type(cobra_model.constraints)
cobra_model.constraints['A_int'].expression

1.0*A_SRC->A_int - 1.0*A_SRC->A_int_reverse_81d32 - 1.0*A_int->B_int + 1.0*A_int->B_int_reverse_b5f99 - 1.0*A_int->C_int + 1.0*A_int->C_int_reverse_56b96

In [4]:
# optlang tutorial
from optlang import Model, Variable, Constraint, Objective

# All the (symbolic) variables are declared, with a name and optionally a lower and/or upper bound.
x1 = Variable('x1', lb=0)
x2 = Variable('x2', lb=0)
x3 = Variable('x3', lb=0)

# A constraint is constructed from an expression of variables and a lower and/or upper bound (lb and ub).
c1 = Constraint(x1 + x2 + x3, ub=100)
c2 = Constraint(10 * x1 + 4 * x2 + 5 * x3, ub=600)
c3 = Constraint(2 * x1 + 2 * x2 + 6 * x3, ub=300)

# An objective can be formulated
obj = Objective(10 * x1 + 6 * x2 + 4 * x3, direction='max')

# Variables, constraints and objective are combined in a Model object, which can subsequently be optimized.
model = Model(name='Simple model')
model.objective = obj
model.add([c1, c2, c3])
status = model.optimize()
print("status:", model.status)
print("objective value:", model.objective.value)
print("----------")
for var_name, var in model.variables.items():
    print(var_name, "=", var.primal)

status: optimal
objective value: 733.3333333333333
----------
x1 = 33.33333333333333
x2 = 66.66666666666667
x3 = 0.0


In [4]:
S_table*[1, 1, 1,1,0,1]

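|  | $A_{SRC}\rightarrow A_{int}$ | $A_{int}\rightarrow B_{int}$ | $B_{int}\rightarrow C_{int}$ | $C_{int}\rightarrow C_{SNK}$ | $A_{int}\rightarrow C_{int}$ | $B_{SRC}\rightarrow B_{int}$ |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{int}$ | 1 | -1 | 0 | 0 | 0 | 0 |
| $B_{int}$ | 0 | 1 | -1 | 0 | 0 | 1 |
| $C_{int}$ | 0 | 0 | 1 | -1 | 0 | 0 |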


In [26]:
S = S_table.values

n_mets, n_rxns = S.shape

max_influx = 10
min_growth = 5
max_nogrowth = 2
U_nogrowth       = 1000.0
U_growth         = 1000.0
weights = np.ones(n_rxns)*0.5
Omega   = 1000.0

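# NOTE: these lookups assume S_table was built with LaTeX-formatted column labels (as in the outputs shown); with plain labels such as 'A_SRC->A_int', get_loc raises a KeyError.
A_rxn = S_table.columns.get_loc(r'$A_{SRC}\rightarrow A_{int}$')# index for A_SRC to A_int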
C_rxn = S_table.columns.get_loc(r'$C_{int}\rightarrow C_{SNK}$') # index for C_int to C_SNK
B_rxn = S_table.columns.get_loc(r'$B_{SRC}\rightarrow B_{int}$')
B_to_C_rxn = S_table.columns.get_loc(r'$B_{int}\rightarrow C_{int}$')
A_to_C_rxn = S_table.columns.get_loc(r'$A_{int}\rightarrow C_{int}$')

v_growth = cp.Variable(n_rxns)
m_nogrowth = cp.Variable(n_mets)
r_nogrowth = cp.Variable(n_rxns)
v_nogrowth = cp.Variable(n_rxns)

z = cp.Variable(n_rxns, boolean=True)

c = np.zeros(n_rxns)
c[C_rxn] = 1  # how to set c?

fba = cp.Problem(
    cp.Maximize( v_growth[C_rxn] ),
    [
        S@v_growth == 0,
        v_growth >= 0,
        v_growth[B_to_C_rxn] == 0,
        v_growth[B_rxn] <= max_influx,
        v_growth[A_rxn] <= max_influx,
    ])
results_fba = fba.solve(solver=cp.SCIPY, verbose=False )
display(pd.DataFrame({'v': v_growth.value}, index=S_table.columns))

|  | v |
|----------|:-------------:|
| $A_{SRC}\rightarrow A_{int}$ | 10.0 |
| $A_{int}\rightarrow B_{int}$ | -0.0 |
| $B_{int}\rightarrow C_{int}$ | -0.0 |
| $C_{int}\rightarrow C_{SNK}$ | 10.0 |
| $A_{int}\rightarrow C_{int}$ | 10.0 |
| $B_{SRC}\rightarrow B_{int}$ | -0.0 |


# Phenotype observations

* When only $A$ is in the media, growth is observed
* When only $B$ is in the media, no growth is observed


$$\begin{array}{ll}
\min_z &  weights^T(1 - z)  + M\cdot r_{nogrowth,\rightarrow B}\\
 &\begin{array}{lll}
 & v_{nogrowth,C} \leq Mr_{A} \\
& Sv_{nogrowth}= 0 & \text{inner problem} \\
& 0\leq v_{nogrowth,i}\leq U_{nogrowth,i}\cdot z_i \\
& S^Tm + e_{C\rightarrow} = r_{nogrowth} \\
& r_{nogrowth,i}\leq\Omega_i\cdot(1-z_i) & \text{for $i\neq \rightarrow A$ or $\rightarrow B$} \\
& r_{nogrowth,\rightarrow A} \geq 0  \\
& r_{nogrowth,\rightarrow B} \geq 0 \\ 
\end{array} \\
& Sv_{growth} = 0 \\
& 0\leq v_{growth,i} \leq U_{growth,i}\cdot z_i  \text{  for $i\neq \rightarrow A$ or $\rightarrow B$}\\
& v_{growth,\rightarrow A} \leq M \\
& v_{growth,\rightarrow B} \leq 0 \\
& v_{growth,C\rightarrow} \geq \text{minimal growth} \\
& z_{\rightarrow A} = z_{\rightarrow B} = 1 \\
& z\in \{0,1\}
\end{array}$$


In [50]:
single_level_milp = cp.Problem(
    cp.Minimize( weights.T@(1-z) + max_nogrowth*r_nogrowth[B_rxn]),
             [  
               v_nogrowth[C_rxn] <= max_nogrowth*r_nogrowth[B_rxn],
               S@v_nogrowth == 0,
               0 <= v_nogrowth,
               v_nogrowth <= U_nogrowth*z,    
                 
               S.T@m_nogrowth + c       == r_nogrowth,                         # Delta G-like constraint
               r_nogrowth[A_rxn]      >= 0,                              # A is constraining        #
               r_nogrowth[A_rxn+1:B_rxn]   <= Omega*(1 - z[A_rxn + 1: B_rxn]), # If anything but A is constraining, remove it from the model.
               r_nogrowth[B_rxn] >= 0,
            S@v_growth  == 0,
            0                <= v_growth,
            v_growth[A_rxn] <= max_influx,
            v_growth         <= U_growth*z,
            v_growth[B_rxn]  <= 0,
            v_growth[C_rxn]  >= min_growth,
            z[B_rxn] == 1,
             ])

results = single_level_milp.solve(solver=cp.SCIPY, verbose=True )
fluxes = pd.DataFrame( z.value, index=S_table.columns, columns=[r'$z$'])
fluxes[r'r_{nogrowth}'] = pd.Series(np.squeeze(np.asarray(r_nogrowth.value)),index=rxns)
fluxes[r'v_{growth}'] = pd.Series(np.squeeze(np.asarray(v_growth.value)),index=rxns)
fluxes[r'v_{nogrowth}'] = pd.Series(np.squeeze(np.asarray(v_nogrowth.value)),index=rxns)
metabolites = pd.DataFrame( m_nogrowth.value, index=S_table.index, columns=['$m$'])
display(fluxes)
display(metabolites)
display(S_table.T)
#r[:A_idx]     <= Omega*np.eye(A_idx)@(1 - z[:A_idx]), # removed constraint due to error ValueError: Invalid dimensions (0, 0).

                                     CVXPY                                     
                                     v1.3.2                                    
(CVXPY) Dec 20 07:00:23 PM: Your problem has 27 variables, 15 constraints, and 0 parameters.
(CVXPY) Dec 20 07:00:23 PM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Dec 20 07:00:23 PM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Dec 20 07:00:23 PM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Dec 20 07:00:23 PM: Compiling problem (target solver=SCIPY).
(CVXPY) Dec 20 07:00:23 PM: Reduction chain: Dcp2Cone -> CvxAttr2Constr -> ConeMatrixStuffing 

|  | $z$ | r_{nogrowth} | v_{growth} | v_{nogrowth} |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{SRC}\rightarrow A_{int}$ | 1.0 | 1.0 | 5.0 | -0.0 |
| $A_{int}\rightarrow B_{int}$ | 1.0 | -1.0 | -0.0 | -0.0 |
| $B_{int}\rightarrow C_{int}$ | 0.0 | 1.0 | -0.0 | -0.0 |
| $C_{int}\rightarrow C_{SNK}$ | 1.0 | -0.0 | 5.0 | -0.0 |
| $A_{int}\rightarrow C_{int}$ | 1.0 | -0.0 | 5.0 | -0.0 |
| $B_{SRC}\rightarrow B_{int}$ | 1.0 | -0.0 | -0.0 | -0.0 |


|  | $m$ |
|----------|:-------------:|
| $A_{int}$ | 1.0 |
| $B_{int}$ | -0.0 |
| $C_{int}$ | 1.0 |


|  | $A_{int}$ | $B_{int}$ | $C_{int}$ |
|----------|:-------------:|:-------------:|:-------------:|
| $A_{SRC}\rightarrow A_{int}$ | 1 | 0 | 0 |
| $A_{int}\rightarrow B_{int}$ | -1 | 1 | 0 |
| $B_{int}\rightarrow C_{int}$ | 0 | -1 | 1 |
| $C_{int}\rightarrow C_{SNK}$ | 0 | 0 | -1 |
| $A_{int}\rightarrow C_{int}$ | -1 | 0 | 1 |
| $B_{SRC}\rightarrow B_{int}$ | 0 | 1 | 0 |
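
The reported solution can be checked against the constraints of the formulation without re-running the solver. The sketch below copies the values of $z$, $r_{nogrowth}$, $m$ and $v_{growth}$ from the output tables above into NumPy arrays. The array names are chosen here for illustration, and $\Omega = 1000$ matches the value used in the code cell.

```python
import numpy as np

# Columns: A_SRC->A_int, A_int->B_int, B_int->C_int,
#          C_int->C_SNK, A_int->C_int, B_SRC->B_int
S = np.array([
    [1, -1,  0,  0, -1, 0],
    [0,  1, -1,  0,  0, 1],
    [0,  0,  1, -1,  1, 0],
], dtype=float)

# Values copied from the solver output above.
z = np.array([1, 1, 0, 1, 1, 1], dtype=float)
r = np.array([1, -1, 1, 0, 0, 0], dtype=float)
m = np.array([1, 0, 1], dtype=float)
v_growth = np.array([5, 0, 0, 5, 5, 0], dtype=float)

e_C = np.zeros(6)
e_C[3] = 1.0          # selects the C export flux
Omega = 1000.0

dual_equality_ok = np.allclose(S.T @ m + e_C, r)        # S^T m + e_C = r
big_m_ok = np.all(r[1:5] <= Omega * (1 - z[1:5]))       # same slice as the MILP
steady_state_ok = np.allclose(S @ v_growth, 0)          # S v_growth = 0
print(dual_equality_ok, big_m_ok, steady_state_ok)
```

The only reaction with a positive dual value inside the constrained slice is $B_{int}\rightarrow C_{int}$, which is exactly the reaction the solver removed ($z = 0$).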


In [42]:
S.T@m_nogrowth.value - r_nogrowth.value

array([ 0.,  0.,  0., -1.,  0.,  0.])


$$m_{B_{int}} - m_{B_{src}} + c = r_{B_{int}}$$
$$m_{B_{int}} - m_{A_{int}} + c = r_{B_{int}}$$



<img src="crop_test_model3_simple.PNG" alt="Toy Model 3" width="800"/>



In [21]:
single_level_milp = cp.Problem(
    cp.Minimize(weights.T @ (1 - z)),
    [
        # --- no-growth condition ---
        v_nogrowth[C_idx] == max_nogrowth * r[A_idx],  # primal objective equals dual objective
        S @ v_nogrowth == 0,                           # steady state
        0 <= v_nogrowth,                               # all reactions are irreversible
        v_nogrowth <= U_nogrowth @ z,                  # removed reactions have no flux
        v_nogrowth[A_idx] <= 10,                       # hard-coded bound on A uptake
        v_nogrowth[B_idx] <= max_influx,               # uptake rate of B is bounded
        S.T @ m - r == -c,                             # dual constraint
        # v_nogrowth[A_idx] <= max_influx,
        # v_nogrowth[B_idx] <= max_influx,
        r[A_idx] >= 0,
        # Everything after A is either not constraining or not in the model.
        r[A_idx + 1:] <= Omega * np.eye(n_rxns - A_idx - 1) @ (1 - z[A_idx + 1:]),
        v_nogrowth[C_idx] <= 2,                        # not enough C flux for growth
        # --- growth condition ---
        S @ v_growth == 0,                             # steady state
        0 <= v_growth,                                 # all reactions are irreversible
        v_growth[A_idx] <= max_influx,                 # uptake rate of A is bounded
        v_growth <= U_growth @ z,                      # removed reactions have no flux
        # v_growth[B_idx] <= max_influx,
        v_growth[C_idx] >= min_growth,                 # growth rate exceeds min growth threshold
    ],
)

results = single_level_milp.solve(solver=cp.SCIPY, verbose=True)

# Collect the solution into one table, indexed by reaction.
fluxes = pd.DataFrame(z.value, index=S_table.columns, columns=[r'$z$'])
fluxes[r'$r$'] = pd.Series(np.squeeze(np.asarray(r.value)), index=rxns)
fluxes[r'$v_{growth}$'] = pd.Series(np.squeeze(np.asarray(v_growth.value)), index=rxns)
fluxes[r'$v_{nogrowth}$'] = pd.Series(np.squeeze(np.asarray(v_nogrowth.value)), index=rxns)

display(fluxes)

#r[:A_idx]     <= Omega*np.eye(A_idx)@(1 - z[:A_idx]), # removed constraint due to error ValueError: Invalid dimensions (0, 0).

                                     CVXPY                                     
                                     v1.3.2                                    
(CVXPY) Dec 06 06:34:42 PM: Your problem has 27 variables, 16 constraints, and 0 parameters.
(CVXPY) Dec 06 06:34:42 PM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Dec 06 06:34:42 PM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Dec 06 06:34:42 PM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Dec 06 06:34:42 PM: Compiling problem (target solver=SCIPY).
(CVXPY) Dec 06 06:34:42 PM: Reduction chain: Dcp2Cone -> CvxAttr2Constr -> ConeMatrixStuffing 

|          | $z$ | $r$ | $v_{growth}$ | $v_{nogrowth}$ |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{SRC}\rightarrow A_{int}$ | 1.0 | 1.0 | 5.0 | 2.0 |
| $A_{int}\rightarrow B_{int}$ | 1.0 | -1000.0 | -0.0 | 0.0 |
| $B_{int}\rightarrow C_{int}$ | 0.0 | 1000.0 | 0.0 | 0.0 |
| $C_{int}\rightarrow C_{SNK}$ | 1.0 | 0.0 | 5.0 | 2.0 |
| $A_{int}\rightarrow C_{int}$ | 1.0 | 0.0 | 5.0 | 2.0 |
| $B_{SRC}\rightarrow B_{int}$ | 1.0 | -999.0 | -0.0 | -0.0 |
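
As a sanity check on the solution above, the reported fluxes can be tested against two of the constraints: steady state and zero flux through removed reactions. This is a minimal sketch, assuming the stoichiometric matrix is the one displayed earlier for the same six reactions. The values of `z`, `v_growth` and `v_nogrowth` are copied from the result table.

```python
import numpy as np

# Assumption: same stoichiometry as the S table displayed earlier
# (rows: A_int, B_int, C_int; columns in the order of the result table).
S = np.array([
    [1, -1,  0,  0, -1, 0],
    [0,  1, -1,  0,  0, 1],
    [0,  0,  1, -1,  1, 0],
])

z = np.array([1, 1, 0, 1, 1, 1])                       # 0 marks a removed reaction
v_growth = np.array([5.0, 0.0, 0.0, 5.0, 5.0, 0.0])
v_nogrowth = np.array([2.0, 0.0, 0.0, 2.0, 2.0, 0.0])

checks = {}
for name, v in {"growth": v_growth, "nogrowth": v_nogrowth}.items():
    checks[name] = {
        "steady_state": bool(np.allclose(S @ v, 0)),          # S v = 0
        "removed_carry_no_flux": bool(np.allclose(v[z == 0], 0)),
        "irreversible": bool(np.all(v >= 0)),
    }
    print(name, checks[name])
```

In both conditions all flux runs through the direct route $A_{int}\rightarrow C_{int}$, so removing $B_{int}\rightarrow C_{int}$ does not break mass balance.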


### Let B in, but do not allow A

In [11]:
single_level_milp = cp.Problem(
    cp.Minimize(weights.T @ (1 - z)),
    [
        # --- no-growth condition ---
        v_nogrowth[C_idx] == max_nogrowth * r[B_idx],  # the uptake rate is equal to the biomass flux
        S @ v_nogrowth == 0,                           # steady state
        v_nogrowth[A_idx] <= 0,                        # no A uptake
        v_nogrowth[B_idx] <= max_influx,               # uptake rate of B is bounded
        0 <= v_nogrowth,                               # all reactions are irreversible
        v_nogrowth <= U_nogrowth @ z,                  # removed reactions have no flux
        S.T @ m - r == -c,                             # dual constraint
        # v_nogrowth[A_idx] <= max_influx,
        # v_nogrowth[B_idx] <= max_influx,
        # Everything before B is either not constraining or not in the model.
        r[:B_idx] <= Omega * np.eye(B_idx) @ (1 - z[:B_idx]),
        r[B_idx] >= 0,                                 # there is no A in the medium, but there is B
        # Disabled: the same bound for the reactions after B.
        # r[B_idx + 1:] <= Omega * np.eye(n_rxns - B_idx - 1) @ (1 - z[B_idx + 1:]),
        v_nogrowth[C_idx] <= max_nogrowth,             # not enough C flux for growth
        # --- growth condition ---
        S @ v_growth == 0,                             # steady state
        0 <= v_growth,                                 # all reactions are irreversible
        v_growth[A_idx] <= max_influx,                 # uptake rate of A is bounded
        v_growth <= U_growth @ z,                      # removed reactions have no flux
        v_growth[B_idx] >= min_growth,                 # B uptake is at least the min growth threshold
        v_growth[B_idx] <= max_influx,                 # uptake rate of B is bounded
        v_growth[C_idx] >= min_growth,                 # growth rate exceeds min growth threshold
    ],
)

results = single_level_milp.solve(solver=cp.SCIPY, verbose=True)

# Collect the solution into one table, indexed by reaction.
fluxes = pd.DataFrame(z.value, index=S_table.columns, columns=[r'$z$'])
fluxes[r'$r$'] = pd.Series(np.squeeze(np.asarray(r.value)), index=rxns)
fluxes[r'$v_{growth}$'] = pd.Series(np.squeeze(np.asarray(v_growth.value)), index=rxns)
fluxes[r'$v_{nogrowth}$'] = pd.Series(np.squeeze(np.asarray(v_nogrowth.value)), index=rxns)

display(fluxes)

#r[:A_idx]     <= Omega*np.eye(A_idx)@(1 - z[:A_idx]), # removed constraint due to error ValueError: Invalid dimensions (0, 0).

                                     CVXPY                                     
                                     v1.3.2                                    
(CVXPY) Dec 06 06:24:51 PM: Your problem has 27 variables, 17 constraints, and 0 parameters.
(CVXPY) Dec 06 06:24:51 PM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Dec 06 06:24:51 PM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Dec 06 06:24:51 PM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Dec 06 06:24:51 PM: Compiling problem (target solver=SCIPY).
(CVXPY) Dec 06 06:24:51 PM: Reduction chain: Dcp2Cone -> CvxAttr2Constr -> ConeMatrixStuffing 

|          | $z$ | $r$ | $v_{growth}$ | $v_{nogrowth}$ |
|----------|:-------------:|:-------------:|:-------------:|:-------------:|
| $A_{SRC}\rightarrow A_{int}$ | 0.0 | 1000.0 | 0.0 | -0.0 |
| $A_{int}\rightarrow B_{int}$ | 1.0 | -999.0 | -0.0 | -0.0 |
| $B_{int}\rightarrow C_{int}$ | 1.0 | 0.0 | 5.0 | 2.0 |
| $C_{int}\rightarrow C_{SNK}$ | 1.0 | 0.0 | 5.0 | 2.0 |
| $A_{int}\rightarrow C_{int}$ | 1.0 | -999.0 | -0.0 | 0.0 |
| $B_{SRC}\rightarrow B_{int}$ | 1.0 | 1.0 | 5.0 | 2.0 |
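
The constraint `r[:B_idx] <= Omega*np.eye(B_idx)@(1 - z[:B_idx])` acts as a big-M bound: a reaction that stays in the model (`z = 1`) must have `r <= 0`, while a removed reaction (`z = 0`) may take any value up to `Omega`. The sketch below checks this against the table above. It assumes `Omega = 1000` (inferred from the magnitude of the `r` entries, not stated in this section) and `B_idx = 5` (the position of $B_{SRC}\rightarrow B_{int}$ in the table).

```python
import numpy as np

Omega = 1000.0  # assumption: inferred from the +/-1000 entries in the r column
B_idx = 5       # assumption: row position of B_SRC -> B_int in the result table

# z and r columns copied from the result table above.
z = np.array([0, 1, 1, 1, 1, 1])
r = np.array([1000.0, -999.0, 0.0, 0.0, -999.0, 1.0])

# Big-M upper bound on r for every reaction before B_idx.
upper = Omega * (1 - z[:B_idx])
bound_holds = r[:B_idx] <= upper

# Reactions that are kept (z = 1) may not have positive r.
kept_positive = (z[:B_idx] == 1) & (r[:B_idx] > 0)

n_removed = int((1 - z).sum())  # equals the objective if all weights are 1

print("bound holds for all reactions before B:", bool(bound_holds.all()))
print("kept reactions with positive r:", int(kept_positive.sum()))
print("number of removed reactions:", n_removed)
```

Only $A_{SRC}\rightarrow A_{int}$ is removed, and it is the only reaction before `B_idx` for which the bound leaves room for a positive `r`.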


## CROP Test - Toy model 4 

Toy model 4:

<img src="crop_test_model4_simple.PNG" alt="Toy Model 4" width="800"/>




In [None]:
# crop formulation goes here

## CROP Test - Toy model 5

Toy model 5:

<img src="crop_test_model5_simple.PNG" alt="Toy Model 5" width="800"/>

In [None]:
# crop formulation goes here