# Inspecting the pyTFA package
Metabolomics data and thermodynamic information can be integrated into a metabolic model through Thermodynamics-based Flux Analysis (TFA). This is implemented in the pyTFA package.
```
Thermodynamics-based Flux Analysis, in Python. Paper : Pierre Salvy, Georgios Fengos, Meric Ataman, Thomas Pathier, Keng C Soh, Vassily Hatzimanikatis. "pyTFA and matTFA: a Python package and a Matlab toolbox for Thermodynamics-based Flux Analysis" Bioinformatics (2018), bty499, DOI: https://doi.org/10.1093/bioinformatics/bty499
```
The first step is to evaluate the performance to make sure the integration is feasible. Five different operations have to be considered:

1. Translate the model from cobrapy to pyTFA.
2. Add user-provided thermodynamics data.
3. Add user-provided metabolomics data.
4. Compute group contributions. The package only does this for SEED ID annotations.
5. Solve the resulting optimization problem (a MILP once the thermodynamic constraints are added).

These operations are tested on the [tutorial that reproduces the figure of the publication](https://github.com/EPFL-LCSB/pytfa/blob/master/tutorials/figure_paper.py).
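
Outside a notebook, where the `%%time` cell magic is not available, the same per-step measurements could be collected with a small helper. The sketch below uses a hypothetical `timed` context manager (not part of pyTFA or cobrapy) built only on the standard library; the step label and the workload are placeholders.

```python
import time
from contextlib import contextmanager

timings = {}  # step label -> elapsed wall time in seconds


@contextmanager
def timed(label):
    """Record the wall time of the enclosed block under `label` (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Stand-in workload; in the notebook this would be e.g. tmodel.optimize()
with timed("solve"):
    total = sum(i * i for i in range(100_000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

Wrapping each of the five operations in its own `with timed(...)` block would give one wall-time entry per step.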

In [1]:
! [ ! -f "thermo_data.thermodb" ] && curl -O -L "https://raw.githubusercontent.com/EPFL-LCSB/pytfa/master/data/thermo_data.thermodb"

In [2]:
import os
import errno
import pytfa

from pytfa.io import import_matlab_model, load_thermoDB

from pytfa.optim.variables import DeltaG, DeltaGstd, ThermoDisplacement
from pytfa.analysis import (
    variability_analysis,
    apply_reaction_variability,
    apply_generic_variability,
    apply_directionality,
)

from cobra.flux_analysis.variability import flux_variability_analysis

from math import log

Academic license - for non-commercial use only


In [3]:
def apply_concentration_bound(measure):
    met, lb, ub = measure
    the_conc_var = tmodel.log_concentration.get_by_id(met)
    # Do not forget the variables in the model are logs !
    the_conc_var.ub = log(ub)
    the_conc_var.lb = log(lb)

CPLEX = 'optlang-cplex'
GUROBI = 'optlang-gurobi'
GLPK = 'optlang-glpk'

metabolomics_data = [
    ('atp_c', 1e-3, 1e-2),
    ('adp_c', 4e-4, 7e-4),
    ('amp_c', 2e-4, 3e-4)
]
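
Because the model stores log-concentrations, it can help to convert and sanity-check the measurements before touching the model. The following is a minimal sketch, assuming natural logarithms as in `apply_concentration_bound`; `to_log_bounds` is a hypothetical helper, not a pyTFA function, and the sample values are only illustrative.

```python
from math import log


def to_log_bounds(measurements):
    """Map (metabolite, lb, ub) tuples to {metabolite: (log lb, log ub)}.

    Hypothetical helper: rejects non-positive or inverted bounds and
    repeated metabolite IDs, which would silently overwrite each other.
    """
    bounds = {}
    for met, lb, ub in measurements:
        if met in bounds:
            raise ValueError(f"duplicate metabolite ID: {met}")
        if not 0 < lb <= ub:
            raise ValueError(f"invalid bounds for {met}: {lb}, {ub}")
        bounds[met] = (log(lb), log(ub))
    return bounds


example = [("atp_c", 1e-3, 1e-2), ("adp_c", 4e-4, 7e-4)]
log_bounds = to_log_bounds(example)
print(log_bounds["atp_c"])  # natural logs of 1e-3 and 1e-2
```

Checking for repeated IDs up front matters because applying two measurements to the same metabolite leaves only the last pair of bounds in place.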

In [4]:
# Load the cobra_model
cobra_model = import_matlab_model('small_ecoli.mat')
cobra_model.solver = "glpk"

In [5]:
%%time
# Load reaction DB
thermo_data = load_thermoDB('thermo_data.thermodb')

CPU times: user 303 ms, sys: 55.5 ms, total: 358 ms
Wall time: 356 ms


In [6]:
%%time
# Initialize the thermodynamic model
tmodel = pytfa.ThermoModel(thermo_data, cobra_model)
tmodel.solver = GLPK
tmodel.prepare()
tmodel.convert(add_displacement = True)
tmodel.print_info()

2019-11-15 13:59:02,428 - thermomodel_new - INFO - # Model initialized with units kcal/mol and temperature 298.15 K
2019-11-15 13:59:02,429 - thermomodel_new - INFO - # Model preparation starting...




2019-11-15 13:59:03,152 - thermomodel_new - INFO - # Model preparation done.
2019-11-15 13:59:03,153 - thermomodel_new - INFO - # Model conversion starting...




2019-11-15 13:59:05,048 - thermomodel_new - INFO - # Model conversion done.
2019-11-15 13:59:05,048 - thermomodel_new - INFO - # Updating cobra_model variables...
2019-11-15 13:59:05,065 - thermomodel_new - INFO - # cobra_model variables are up-to-date


                value
key                  
name              new
description       new
num constraints  3765
num variables    3898
num metabolites   304
num reactions     599
                           value
key                             
num metabolites(thermo)      300
num reactions(thermo)        418
pct metabolites(thermo)  98.6842
pct reactions(thermo)     69.783
CPU times: user 2.83 s, sys: 58.4 ms, total: 2.89 s
Wall time: 2.87 s


In [7]:
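%%time
# map() is lazy in Python 3: loop explicitly so the bounds are actually applied
for measure in metabolomics_data:
    apply_concentration_bound(measure)

Note that `map(apply_concentration_bound, metabolomics_data)` is negligible to time (5 µs CPU, 8.34 µs wall) only because it returns a lazy `<map at 0x7efd8eb39160>` object without ever calling the function. The explicit loop above is needed for the bounds to take effect. Setting three bounds should still be negligible, though it was not timed here.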
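
One caveat when timing a `map` call: in Python 3, `map` returns a lazy iterator, so the function is only called once the iterator is consumed. The self-contained sketch below illustrates this with a stand-in function (`record` is hypothetical and merely logs its calls).

```python
calls = []


def record(item):
    """Stand-in for apply_concentration_bound: just logs that it ran."""
    calls.append(item)


data = ["a", "b", "c"]

lazy = map(record, data)   # builds the iterator; record() has not run yet
calls_before = len(calls)

for item in data:          # an explicit loop runs the side effects immediately
    record(item)
calls_after = len(calls)

print(calls_before, calls_after)  # prints "0 3"
```

For functions called only for their side effects, a plain `for` loop states the intent more directly than wrapping `map` in `list(...)`.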

In [8]:
%%time
tmodel.optimize()

CPU times: user 10.6 s, sys: 17.5 ms, total: 10.6 s
Wall time: 10.5 s


|  | fluxes | reduced_costs |
|---|---|---|
| DM_4CRSOL | 0.000181 |  |
| DM_5DRIB | 0.000187 |  |
| DM_AMOB | 0.000002 |  |
| DM_MTHTHF | 0.001087 |  |
| Ec_biomass_iJO1366_WT_53p95M | 0.810997 |  |
| ... | ... | ... |
| LMPD_250_trp-L_c | 0.044795 |  |
| LMPD_251_tyr-L_c | 0.108668 |  |
| LMPD_252_udcpdp_c | 0.000045 |  |
| LMPD_253_utp_c | 0.113622 |  |


In [9]:
%%time
# FVA cobra
fva_fluxes = flux_variability_analysis(cobra_model)

CPU times: user 97.3 ms, sys: 148 ms, total: 246 ms
Wall time: 555 ms


In [10]:
# Perform variability analysis again
# gets stuck on my computer
#tva_fluxes_lc = variability_analysis(tmodel, kind='reactions')

### Results
Solving the optimization problem looks like the only demanding operation: it took about 10.5 s here, but it should stay below 10 s when running on the platform. `variability_analysis` is set aside for now, because it solves an optimization problem for every reaction in the model, using each reaction in turn as the objective.
On my machine it takes longer than it should because the solver struggles with some infeasible solutions, and Gurobi does not seem to work.
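
A rough back-of-the-envelope estimate shows why variability analysis is costly. The sketch below assumes one minimization and one maximization per reaction and, pessimistically, that each solve costs as much as the single `optimize()` call timed above. In practice later solves may be cheaper, so treat the result as a crude guess rather than a measurement.

```python
n_reactions = 599          # "num reactions" reported by tmodel.print_info()
solves_per_reaction = 2    # assumption: one minimization and one maximization
seconds_per_solve = 10.5   # assumption: same wall time as the single optimize() call

total_solves = n_reactions * solves_per_reaction
estimated_hours = total_solves * seconds_per_solve / 3600

print(f"{total_solves} solves, roughly {estimated_hours:.1f} h")
# prints "1198 solves, roughly 3.5 h"
```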

## Plotting
For consistency with the tutorial.

In [11]:
from pytfa.io.plotting import plot_fva_tva_comparison
from bokeh.plotting import show, output_file
from bokeh.layouts import column

In [12]:
output_file('outputs/va_comparison.html')
p1 = plot_fva_tva_comparison(fva_fluxes, tva_fluxes)
p2 = plot_fva_tva_comparison(fva_fluxes, tva_fluxes_lc)
c = column(p1,p2)
show(c)

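NameError: name 'tva_fluxes' is not defined

The cell fails because `tva_fluxes` and `tva_fluxes_lc` are never computed in this notebook: the `variability_analysis` call in cell 10 is commented out.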