# Import dependencies

In [None]:
%reload_ext autoreload
%autoreload 1
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import cobra
import escher

# Load model

In [None]:
model = cobra.io.load_json_model("./models/iMM904.json")

# Utilities

In [None]:
def print_formulas(reaction):
    """Print formulas of reactants and products of a reaction."""
    print('reactants')
    for reactant in reaction.reactants:
        print(f'{reactant.id} ({reactant.name}): F {reactant.formula}')
    print('products')
    for product in reaction.products:
        print(f'{product.id} ({product.name}): F {product.formula}')

def print_formula_weights(reaction):
    """Print formula weights of reactants and products of a reaction."""
    print('reactants')
    for reactant in reaction.reactants:
        print(f'{reactant.id} ({reactant.name}): MW {reactant.formula_weight}')
    print('products')
    for product in reaction.products:
        print(f'{product.id} ({product.name}): MW {product.formula_weight}')
        
def print_stoichiometry(reaction):
    """Pretty-print stoichiometry of reaction."""
    for metabolite, coeff in reaction.metabolites.items():
        print(f'{coeff}, {metabolite.name}')

# Objective function

Get objective function (biomass/growth)

In [None]:
biomass = model.reactions.BIOMASS_SC5_notrace
biomass

In [None]:
print_stoichiometry(biomass)

> These stoichiometric constants were determined by the relative compositions (i.e. mass fractions) of these compounds in exponentially growing yeast, as determined from experiments (Mo et al., 2009).
>
> If the protocol described in Thiele & Palsson (2010) is followed, particularly figure 11, then the stoichiometric constants are expressed in terms of mmol substrate per g<sub>DW</sub>.  For example, there is 0.4588 mmol alanine in 1 gram (dry weight) of cell.
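
As a rough illustration of these units, the sketch below converts a biomass coefficient in mmol per g<sub>DW</sub> into a mass contribution. The 0.4588 mmol figure is the alanine coefficient quoted above; the molar mass of free alanine (89.09 g/mol) and the helper name are assumptions added for illustration.

```python
def coefficient_to_mass_fraction(coeff_mmol_per_gdw, molar_mass_g_per_mol):
    """Grams of a compound per gram dry weight, given its biomass coefficient."""
    # mmol -> mol (factor 1e-3), then mol -> g via the molar mass
    return coeff_mmol_per_gdw * 1e-3 * molar_mass_g_per_mol


ALANINE_MOLAR_MASS = 89.09  # g/mol, free alanine (C3H7NO2); assumed value
alanine_g_per_gdw = coefficient_to_mass_fraction(0.4588, ALANINE_MOLAR_MASS)
print(f"alanine: {alanine_g_per_gdw:.4f} g per gDW")
```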

Medium

In [None]:
model.medium

Set exchange bounds: glucose-limited uptake, unrestricted oxygen

In [None]:
model.reactions.get_by_id('EX_glc__D_e').bounds = (-10, 0)
model.reactions.get_by_id('EX_o2_e').bounds = (-999999.0, 0)

> Problem: If I set the bounds of glucose exchange to be (-99999, 0), the objective function reaches a huge value, so I left it at the default value.
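
One plausible reading of this behaviour: when glucose uptake is the only binding constraint, the optimum of a linear programme grows roughly in proportion to the uptake bound. The toy below assumes a fixed, hypothetical biomass yield per mmol of glucose; the number is not derived from iMM904.

```python
HYPOTHETICAL_YIELD = 0.03  # gDW biomass per mmol glucose; illustrative only


def toy_growth_rate(glucose_uptake_bound):
    """Growth rate (1/h) when glucose uptake (mmol/gDW/h) is the only limit."""
    return HYPOTHETICAL_YIELD * glucose_uptake_bound


for bound in (10, 1000, 99999):
    print(f"uptake bound {bound:>6} -> growth rate {toy_growth_rate(bound):.2f} 1/h")
```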

Linear reaction coefficients

In [None]:
cobra.util.solver.linear_reaction_coefficients(model)

> The objective function is the biomass reaction and the biomass reaction only, not a linear combination of anything.
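
To make the point concrete, an FBA objective is in general a weighted sum of fluxes, and here only one weight is non-zero. The sketch below uses hypothetical flux values and a hypothetical helper name; only the reaction identifiers come from this notebook.

```python
def objective_value(coefficients, fluxes):
    """Return sum(c_i * v_i) over reactions that carry an objective coefficient."""
    return sum(c * fluxes[rxn_id] for rxn_id, c in coefficients.items())


# Only the biomass reaction has a non-zero objective coefficient.
coefficients = {"BIOMASS_SC5_notrace": 1.0}
# Hypothetical fluxes, for illustration only.
fluxes = {"BIOMASS_SC5_notrace": 0.25, "ATPM": 1.0, "EX_glc__D_e": -10.0}
toy_objective = objective_value(coefficients, fluxes)
print(toy_objective)
```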

Optimise using (vanilla) FBA

In [None]:
solution = model.optimize()

In [None]:
model.summary()

In [None]:
solution

# Non-growth ATP maintenance

In [None]:
model.reactions.get_by_id('ATPM')

# Fluxes

Import list of metabolites represented in biomass reaction.  `biomass_type` is manually labelled.

In [None]:
biomass_metabolites_df = pd.read_csv('iMM904-biomass-categories.csv', delimiter=',')

In [None]:
biomass_metabolites_df

Get fluxes of each metabolite in the (optimised) biomass reaction, append to the dataframe.

In [None]:
model.metabolites.get_by_id('13BDglcn_c').summary()

> Each metabolite has at least one producing reaction and at least one consuming reaction.  Among the consuming reactions, one of them is the biomass reaction `BIOMASS_SC5_notrace` because I'm specifically looking at metabolites that are represented in this biomass reaction.
>
> Because FBA assumes steady-state, the sum of producing reaction fluxes must be equal to the sum of consuming reaction fluxes.  In other words, if the metabolite is $s_i$, $\frac{ds_i}{dt} = \sum (\mathrm{flux}_{\mathrm{producing}}) - \sum (\mathrm{flux}_{\mathrm{consuming}}) = 0$, with both sums taken as magnitudes.  Therefore, using $\frac{ds_i}{dt}$ for time approximations does not make sense as it will be zero for all metabolites.
>
> In the time approximation, I estimate how much time it takes to produce each metabolite to the amount that is sufficient to produce a new cell.  Some metabolites have more than one consuming reaction, which means that some of the metabolite is used to produce something that is not biomass.  Because of these metabolites, using the magnitude of the sum of all producing reactions' fluxes (which is equal to the magnitude of the sum of all consuming reactions' fluxes) for my calculations does not make sense.  Therefore, I use the flux of the biomass reaction for each metabolite.
>
> The flux of the biomass reaction for each metabolite is computed by taking the overall flux of the biomass reaction and multiplying it by the stoichiometric coefficient of said metabolite in the biomass reaction, i.e. $\mathrm{flux}_{\mathrm{biomass},s_i} = \mathrm{flux}_{\mathrm{biomass}} \cdot \mathrm{coeff}_{s_i}$
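
The balance described above can be sketched with a toy metabolite that has one producer and two consumers. All numbers and reaction names here are hypothetical; the point is only that the contributions sum to zero at steady state, while the biomass reaction accounts for just part of the consumption.

```python
biomass_flux = 0.25   # 1/h, hypothetical growth rate
biomass_coeff = -2.0  # mmol/gDW, hypothetical coefficient of metabolite s_i

# Flux contributions to s_i in mmol/gDW/h: positive produces, negative consumes.
contributions = {
    "SYNTHESIS": 2.0,
    "OTHER_CONSUMER": -1.5,
    "BIOMASS": biomass_flux * biomass_coeff,
}

net_rate = sum(contributions.values())  # ds_i/dt, zero at steady state
produced = sum(v for v in contributions.values() if v > 0)
biomass_share = -contributions["BIOMASS"] / produced  # fraction going to biomass

print(f"net rate: {net_rate}, share used for biomass: {biomass_share:.0%}")
```

In this toy case only a quarter of the produced metabolite ends up in biomass, which is why the total producing flux and the biomass-reaction flux can give different time estimates.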

> **PROBLEM: How do I account for the fact that for some metabolites, not all of it is used for biomass?**

In [None]:
biomass_metabolites_df['flux'] = [np.nan] * len(biomass_metabolites_df)

In [None]:
# Flux through the biomass reaction for each metabolite -- USE THIS ONE
for df_idx, metabolite_id in enumerate(biomass_metabolites_df.id):
    metabolite = model.metabolites.get_by_id(metabolite_id)
    # Only reactants of the biomass reaction (negative coefficients) are considered
    if biomass.metabolites[metabolite] < 0:
        flux_in_biomass = metabolite.summary().consuming_flux.loc['BIOMASS_SC5_notrace'].flux
        # Assign through .loc on the frame itself; chained ['flux'].iloc[...] may write to a copy
        biomass_metabolites_df.loc[biomass_metabolites_df.index[df_idx], 'flux'] = flux_in_biomass

In [None]:
# Sum of producing reaction fluxes, kept in its own column for comparison
# so that it does not overwrite the biomass-reaction flux in the 'flux' column
for df_idx, metabolite_id in enumerate(biomass_metabolites_df.id):
    metabolite = model.metabolites.get_by_id(metabolite_id)
    # Only reactants of the biomass reaction (negative coefficients) are considered
    if biomass.metabolites[metabolite] < 0:
        flux_producing = sum(metabolite.summary().producing_flux.flux)
        #print(f'{metabolite.name}: {-1/flux_in_biomass}')
        biomass_metabolites_df.loc[biomass_metabolites_df.index[df_idx], 'flux_producing'] = flux_producing

In [None]:
biomass_metabolites_df

# Timescale

## From objective function

The objective function is expressed in units of h<sup>-1</sup>.

The biomass reactions of FBA models are commonly scaled so that the flux through them is equal to the exponential growth rate (Orth et al., 2010).  This scaling is performed by adjusting the stoichiometric coefficients in the biomass reaction (to be discussed again later).

In [None]:
solution.objective_value

Converting this into estimated time:

In [None]:
CELL_DRY_MASS = 15e-12

#time = CELL_DRY_MASS/solution.objective_value
time = 1/solution.objective_value

In [None]:
time
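
For reference, a growth rate can be turned into a timescale in more than one way. The notebook uses 1/μ; the doubling time ln(2)/μ is the standard relation for exponential growth and is shown here only for comparison. The growth rate of 0.25 h<sup>-1</sup> and the function name are hypothetical.

```python
import math


def growth_timescales(growth_rate_per_h):
    """Return (1/mu, ln(2)/mu) in hours for an exponential growth rate mu."""
    e_folding_time = 1.0 / growth_rate_per_h
    doubling_time = math.log(2) / growth_rate_per_h
    return e_folding_time, doubling_time


e_fold, doubling = growth_timescales(0.25)  # hypothetical growth rate
print(f"1/mu = {e_fold:.2f} h, doubling time = {doubling:.2f} h")
```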

## From fluxes

Compute sum of reactant coefficients.

This is needed to calculate the mole fraction of each metabolite.  This calculation is based on the assumption that the relative values of the reactant coefficients reflect the mole fraction of each metabolite, as documented by Mo et al. (2009).

In [None]:
sum_reactant_coeffs = -sum([coeff for (_, coeff) in biomass.metabolites.items() if coeff < 0])
sum_reactant_coeffs

Remove the ATP hydrolysis coefficients, as these were considered separately when constructing the biomass reaction of the model.

The ATP hydrolysis term in the biomass reaction represents the energy cost of growth (growth-associated maintenance, GAM), so the stoichiometric coefficients for ATP, ADP, H<sub>2</sub>O, and H<sup>+</sup> do not reflect the biomass components of the cell.  Non-growth-associated maintenance (NGAM) is modelled separately by the `ATPM` reaction.  In addition, H<sub>2</sub>O must be excluded because we consider dry weight.  In an FBA model, the biomass reaction includes components that form a new cell (e.g. building blocks like lipids and amino acids) and components that are needed for the building machinery (e.g. ATP).

In [None]:
sum_reactant_coeffs -= -biomass.metabolites[model.metabolites.get_by_id('atp_c')]
sum_reactant_coeffs -= -biomass.metabolites[model.metabolites.get_by_id('h2o_c')]
sum_reactant_coeffs

> The sum of reactant coefficients, apart from the ATP hydrolysis term (ATP + H<sub>2</sub>O --> ADP + H<sup>+</sup> + P<sub>i</sub>), should be 6.40.
>
> The biomass reaction is scaled so that the flux through it is equal to the exponential growth rate ($\mu$) of the model organism.  The sum of these coefficients is not 1.  Recalling the stoichiometric coefficients at the beginning of the notebook: if the growth rate, i.e. the flux of the biomass reaction, is X h<sup>-1</sup>, then the consuming flux of alanine is (0.4588 mmol Alanine g<sub>DW</sub><sup>-1</sup>)(X h<sup>-1</sup>) = 0.4588 $\cdot$ X mmol g<sub>DW</sub><sup>-1</sup> h<sup>-1</sup>, with units consistent with the rest of the FBA model.
>
> Furthermore, the stoichiometric coefficients for the ATP hydrolysis (growth-associated maintenance) components were derived from polymerisation cost per gram dry weight and the biomass yield (Förster et al., 2003).
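
A minimal sketch of the bookkeeping above, using a toy coefficient dictionary rather than the real biomass reaction: the energy-related reactants are dropped before the remaining reactant coefficients are normalised into mole fractions. All coefficient values and the name `ENERGY_REACTANTS` are hypothetical.

```python
# Hypothetical biomass coefficients (negative = reactant, positive = product).
toy_coeffs = {
    "ala__L_c": -0.5,
    "gly_c": -0.25,
    "tre_c": -0.25,
    "atp_c": -60.0,
    "h2o_c": -60.0,
    "adp_c": 60.0,
}
ENERGY_REACTANTS = {"atp_c", "h2o_c"}  # excluded from the composition

# Keep structural reactants only, with coefficients flipped to positive values.
structural = {
    met_id: -coeff
    for met_id, coeff in toy_coeffs.items()
    if coeff < 0 and met_id not in ENERGY_REACTANTS
}
total = sum(structural.values())
mole_fractions = {met_id: coeff / total for met_id, coeff in structural.items()}
print(mole_fractions)
```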

Construct new columns in `biomass_metabolites_df`.

In [None]:
biomass_metabolites_df['stoich'] = np.nan
biomass_metabolites_df['mole_fraction'] = np.nan
biomass_metabolites_df['formula_weight'] = np.nan

for df_idx, metabolite_id in enumerate(biomass_metabolites_df.id):
    metabolite = model.metabolites.get_by_id(metabolite_id)
    row = biomass_metabolites_df.index[df_idx]
    coeff = biomass.metabolites[metabolite]
    # Assign through .loc on the frame itself; chained ['col'].iloc[...] may write to a copy
    biomass_metabolites_df.loc[row, 'stoich'] = coeff
    # Only reactants of the biomass reaction (negative coefficients) are considered
    if coeff < 0:
        # ATP and water are excluded from the mole fractions; their entries stay NaN
        if metabolite_id not in ['atp_c', 'h2o_c']:
            biomass_metabolites_df.loc[row, 'mole_fraction'] = -coeff / sum_reactant_coeffs
        biomass_metabolites_df.loc[row, 'formula_weight'] = metabolite.formula_weight

Compute mass fractions

In [None]:
m = biomass_metabolites_df['mole_fraction'] * biomass_metabolites_df['formula_weight']
m /= m.sum()
biomass_metabolites_df['mass_fraction'] = m
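
The same conversion on a two-component toy system, as a sketch of what the cell above computes: each mole fraction is weighted by its molar mass, then the weights are renormalised to sum to one. The component names and values are hypothetical.

```python
def mass_fractions(mole_fractions, molar_masses):
    """Convert mole fractions to mass fractions using molar masses in g/mol."""
    weighted = {m: x * molar_masses[m] for m, x in mole_fractions.items()}
    total = sum(weighted.values())
    return {m: w / total for m, w in weighted.items()}


toy_mole_fractions = {"small": 0.75, "large": 0.25}
toy_molar_masses = {"small": 100.0, "large": 300.0}  # g/mol, hypothetical
fractions = mass_fractions(toy_mole_fractions, toy_molar_masses)
print(fractions)
```

The heavier component makes up a quarter of the molecules but half of the mass, which is why mole and mass fractions should not be used interchangeably.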

In [None]:
# Sanity check
p = pd.pivot_table(
    biomass_metabolites_df,
    values='mass_fraction',
    index='biomass_type',
    aggfunc='sum',
)
p.sort_values(by='mass_fraction', ascending=False)

Estimate time.

Logic:
- Take (1-3)β-D-glucan as an example.
- The mass fraction is 0.19, i.e. 19% of cell dry mass is (1-3)β-D-glucan.  For a cell of dry mass 15e-12 g, this is (0.19)(15e-12) g.
- Given that the molecular weight of (1-3)β-D-glucan is 162.14 g/mol, this translates to (0.19)(15e-12)/(162.14) mol.
- If the flux is 6.0 mmol g<sub>DW</sub><sup>-1</sup> h<sup>-1</sup>, 1 g<sub>DW</sub> of cells makes 6.0 mmol (1-3)β-D-glucan in an hour.
- i.e. 1 cell makes (6.0)(15e-12) mmol = (6.0)(15e-12)(1e-3) mol (1-3)β-D-glucan in an hour.
- i.e. 1 cell makes the required (1-3)β-D-glucan in $\frac{(0.19)(15 \times 10^{-12})}{162.14} \times \frac{1}{(6.0)(15 \times 10^{-12})(1 \times 10^{-3})} \mathrm{hours} = \frac{1000 \times 0.19}{(162.14)(6.0)} \mathrm{hours}$.
- In general terms: $\frac{1000 \cdot \mathrm{mass\ fraction}}{\mathrm{MW} \times \mathrm{flux}}$, where the flux is taken as a magnitude.

In [None]:
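# The derivation above uses the mass fraction; the leading minus sign assumes
# the stored biomass-reaction fluxes are negative (consuming)
biomass_metabolites_df['time'] = -1000 * biomass_metabolites_df['mass_fraction'] / (biomass_metabolites_df['formula_weight'] * biomass_metabolites_df['flux'])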

In [None]:
biomass_metabolites_df

In [None]:
biomass_metabolites_df['time'].sum()
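
The (1-3)β-D-glucan example from the logic above can be worked through as a standalone sketch. The numbers (mass fraction 0.19, molecular weight 162.14 g/mol, flux 6.0 mmol g<sub>DW</sub><sup>-1</sup> h<sup>-1</sup>) are the illustrative ones used in the bullet points; the function name is hypothetical.

```python
def time_to_make_component(mass_fraction, molar_mass, flux_mmol_per_gdw_h):
    """Hours needed to make one cell's worth of a biomass component.

    moles needed per gDW       = mass_fraction / molar_mass
    moles made per gDW per hour = |flux| * 1e-3
    The cell dry mass cancels out, so it does not appear here.
    """
    return 1000.0 * mass_fraction / (molar_mass * abs(flux_mmol_per_gdw_h))


glucan_hours = time_to_make_component(0.19, 162.14, 6.0)
print(f"{glucan_hours:.3f} h")
```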

# Demand reactions

Orth et al. (2010) describe a method to calculate the maximum yield of each biosynthetic precursor.

For each precursor, add a demand reaction: a reaction that consumes the precursor and doesn't produce anything.  Then, optimise the model, using the demand reaction as the objective function.

## Example: trehalose

As an example, I'm going to start with trehalose (`tre_c`).

Create demand reaction and add to model.

In [None]:
model.add_boundary(model.metabolites.get_by_id('tre_c'), type='demand')

Set objective of model to this demand reaction, and optimise.

In [None]:
model.objective = 'DM_tre_c'
solution = model.optimize()
model.summary()

Estimate time

In [None]:
1/solution.objective_value
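
As a sanity check on the demand-reaction optimum, a simple carbon balance gives an upper bound that the solver result should not exceed. This sketch assumes glucose is the only carbon source in the medium and ignores energy costs; the helper name is hypothetical.

```python
def carbon_limited_flux(uptake_flux, substrate_carbons, product_carbons):
    """Upper bound on product flux if every substrate carbon ends up in product."""
    return uptake_flux * substrate_carbons / product_carbons


# Glucose is C6H12O6 and trehalose is C12H22O11.
# The glucose uptake bound set earlier is 10 mmol/gDW/h.
max_trehalose_flux = carbon_limited_flux(10.0, 6, 12)
print(f"carbon-limited trehalose flux: {max_trehalose_flux} mmol/gDW/h")
```

If the optimised `DM_tre_c` flux came out above this bound, it would suggest another carbon source is open in the medium or a bound is set incorrectly.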

## Repeat for all metabolites

Reload model, create demand reactions, set exchange reaction bounds (aerobic with glucose limitation).

In [None]:
model = cobra.io.load_json_model("./models/iMM904.json")

for metabolite_id in biomass_metabolites_df.id:
    model.add_boundary(model.metabolites.get_by_id(metabolite_id), type='demand')
    
model.reactions.get_by_id('EX_glc__D_e').bounds = (-10, 0)
model.reactions.get_by_id('EX_o2_e').bounds = (-999999.0, 0)

Iterate through list of reactions, optimise, and then store solutions in dataframe

In [None]:
biomass_metabolites_df['demand_flux'] = np.nan

for df_idx, metabolite_id in enumerate(biomass_metabolites_df.id):
    # Demand reactions created by add_boundary are named 'DM_<metabolite id>'
    demand_reaction_id = 'DM_' + metabolite_id
    model.objective = demand_reaction_id
    solution = model.optimize()
    # Assign through .loc on the frame itself; chained ['col'].iloc[...] may write to a copy
    biomass_metabolites_df.loc[biomass_metabolites_df.index[df_idx], 'demand_flux'] = solution.objective_value

In [None]:
# Same estimate as in the 'From fluxes' logic, which is derived from the mass fraction
biomass_metabolites_df['demand_time'] = 1000 * biomass_metabolites_df['mass_fraction'] / (biomass_metabolites_df['formula_weight'] * biomass_metabolites_df['demand_flux'])

In [None]:
biomass_metabolites_df

In [None]:
biomass_metabolites_df['demand_time'].sum()

In [None]:
# alternatives
# - resource balance analysis