# Inspect the profile of the new test function
We want to speed up the metabolite production test. Even with the new logic, it is still too slow.

In [1]:
import pandas as pd
import cobra
import numpy as np

import cProfile
import pstats

Functions to test.

In [2]:
def open_exchanges(model):
    for rxn in model.exchanges:
        rxn.bounds = (-1000, 1000)

def solve_boundary(metabolite, rxn, val=-1):
    """
    Solve the model after `rxn` has been added to the `metabolite`'s constraint with coefficient `val`.
    """
    constraint = metabolite.constraint
    constraint.set_linear_coefficients({rxn: val})
    solution = metabolite.model.slim_optimize()
    # TODO: it seems like with context doesn't catch these changes, need to check
    # restore constraint
    constraint.set_linear_coefficients({rxn: 0})
    return solution

def run_fba(model, rxn_id, direction="max", single_value=True):
    model.objective = model.reactions.get_by_id(rxn_id)
    model.objective_direction = direction
    if single_value:
        return model.slim_optimize()
    else:
        try:
            # optimize() only raises on a failed solve when raise_error=True
            return model.optimize(raise_error=True)
        except cobra.exceptions.Infeasible:
            return np.nan

def test_new(model):
    """
    New test
    """
    mets_not_produced = list()
    open_exchanges(model)
    irr = model.problem.Variable("irr", lb=0, ub=1000)
    with model:
        model.add_cons_vars(irr)
        # helper.run_fba() only accepts reactions in the model
        model.objective = irr
        for met in model.metabolites:
            solution = solve_boundary(met, irr)
            if np.isnan(solution) or solution < model.tolerance:
                mets_not_produced.append(met)
    return mets_not_produced

def test_old(model):
    """
    old_test in consistency.py
    """
    mets_not_produced = list()
    open_exchanges(model)
    for met in model.metabolites:
        with model:
            exch = model.add_boundary(
                met, type="irrex", reaction_id="IRREX", lb=0, ub=1000
            )
            solution = run_fba(model, exch.id)
            if np.isnan(solution) or solution < model.tolerance:
                mets_not_produced.append(met)
    return mets_not_produced
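
The TODO in `solve_boundary` notes that the coefficient has to be restored by hand. If the solver call raised in between, the reset would be skipped. A minimal sketch of one way to guard against that is a small context manager with `try`/`finally`. The names `temporary_coefficient` and `FakeConstraint` are hypothetical and not part of cobra or optlang; the fake class only mimics the `set_linear_coefficients` call used above so that the example runs without a model. Like `solve_boundary`, it assumes the variable had a coefficient of 0 beforehand.

```python
from contextlib import contextmanager


class FakeConstraint:
    """Hypothetical stand-in for a solver constraint; it only records coefficients."""

    def __init__(self):
        self.coefficients = {}

    def set_linear_coefficients(self, coefficients):
        self.coefficients.update(coefficients)


@contextmanager
def temporary_coefficient(constraint, variable, value):
    """Set the coefficient of `variable` to `value` and reset it to 0 on exit."""
    constraint.set_linear_coefficients({variable: value})
    try:
        yield constraint
    finally:
        # Runs even if the code inside the `with` block raises.
        constraint.set_linear_coefficients({variable: 0})


constraint = FakeConstraint()
with temporary_coefficient(constraint, "irr", -1):
    inside = constraint.coefficients["irr"]  # a solver call would go here
after = constraint.coefficients["irr"]
print(inside, after)
```

With a real model, the body of the `with` block would presumably hold the `slim_optimize()` call.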

## Profiling of new test
First, write the profile to a file, then parse it with the `pstats` library (the raw output format is hard to read).

In [3]:
model = cobra.io.read_sbml_model("iAB_RBC_283.xml")
model.solver = "glpk"


cProfile.run('test_new(model)', 'testats')



Now read the profile back from the file with `pstats` from the standard library.

In [4]:
p_new = pstats.Stats('testats')
p_new.strip_dirs().sort_stats("cumulative").print_stats()

Thu Nov  7 10:46:05 2019    testats

         201481 function calls (199744 primitive calls) in 2.069 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.069    2.069 {built-in method builtins.exec}
        1    0.000    0.000    2.069    2.069 <string>:1(<module>)
        1    0.004    0.004    2.069    2.069 <ipython-input-2-2558f62ab766>:29(test_new)
      342    0.004    0.000    2.045    0.006 <ipython-input-2-2558f62ab766>:5(solve_boundary)
      342    0.002    0.000    1.911    0.006 model.py:1020(slim_optimize)
      342    0.001    0.000    1.905    0.006 interface.py:1454(optimize)
      342    0.001    0.000    1.903    0.006 glpk_interface.py:700(_optimize)
      478    0.002    0.000    1.845    0.004 glpk_interface.py:674(_run_glp_simplex)
      478    1.842    0.004    1.842    0.004 {built-in method swiglpk._swiglpk.glp_simplex}
      684    0.031    0.000    0.125    0.000 gl

        6    0.000    0.000    0.000    0.000 blocks.py:131(_check_ndim)
        1    0.000    0.000    0.000    0.000 generic.py:5145(__finalize__)
        3    0.000    0.000    0.000    0.000 nanops.py:58(check)
       20    0.000    0.000    0.000    0.000 {built-in method numpy.geterrobj}
        9    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}
       39    0.000    0.000    0.000    0.000 {method 'isspace' of 'str' objects}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.all}
        1    0.000    0.000    0.000    0.000 model.py:1228(__enter__)
        1    0.000    0.000    0.000    0.000 glpk_interface.py:119(type)
        6    0.000    0.000    0.000    0.000 {built-in method swiglpk._swiglpk.glp_set_obj_coef}
        3    0.000    0.000    0.000    0.000 interface.py:1175(interface)
        3    0.000    0.000    0.000    0.000 interface.py:1191(<lambda>)
        2    0.000    0.000    0.000    0.000 sparse.py:223(construct

<pstats.Stats at 0x7f211ecc8828>
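
The full listing is long, and only the top entries matter here. As a sketch, `print_stats` accepts restrictions such as a line count, and `pstats.Stats` accepts a `stream` argument, so the report can be trimmed and captured as a string. The `busy_work` function is a hypothetical toy workload standing in for `test_new(model)`.

```python
import cProfile
import io
import pstats


def busy_work(n=20000):
    """Toy workload used only to have something to profile."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
busy_work()
profiler.disable()

# Send the report to a string buffer instead of stdout.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)

# Keep only the 10 entries with the largest cumulative time.
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

For the profile written above, the equivalent call would be `pstats.Stats('testats').strip_dirs().sort_stats("cumulative").print_stats(10)`. A string restriction such as `print_stats("glpk_interface")` filters the entries by regular expression instead.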

## Profiling of old test
Since I can't see anything relevant there, I will compare it with the profile of the old test.

In [5]:
cProfile.run('test_old(model)', 'oldstats')
p_old = pstats.Stats('oldstats')
p_old.strip_dirs().sort_stats("cumulative").print_stats()

Thu Nov  7 10:46:08 2019    oldstats

         1478970 function calls (1457416 primitive calls) in 3.134 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.134    3.134 {built-in method builtins.exec}
        1    0.000    0.000    3.134    3.134 <string>:1(<module>)
        1    0.007    0.007    3.134    3.134 <ipython-input-2-2558f62ab766>:46(test_old)
      342    0.002    0.000    2.428    0.007 <ipython-input-2-2558f62ab766>:17(run_fba)
      342    0.002    0.000    2.323    0.007 model.py:1020(slim_optimize)
      342    0.001    0.000    2.317    0.007 interface.py:1454(optimize)
      342    0.002    0.000    2.315    0.007 glpk_interface.py:700(_optimize)
      554    0.004    0.000    2.225    0.004 glpk_interface.py:674(_run_glp_simplex)
      554    2.219    0.004    2.219    0.004 {built-in method swiglpk._swiglpk.glp_simplex}
      342    0.001    0.000    0.381    0.001 model

        2    0.000    0.000    0.000    0.000 {pandas._libs.lib.infer_datetimelike_array}
        1    0.000    0.000    0.000    0.000 re.py:231(compile)
        3    0.000    0.000    0.000    0.000 {method 'match' of '_sre.SRE_Pattern' objects}
        1    0.000    0.000    0.000    0.000 _methods.py:47(_all)
        6    0.000    0.000    0.000    0.000 series.py:443(_set_subtyp)
        2    0.000    0.000    0.000    0.000 managers.py:325(__len__)
        6    0.000    0.000    0.000    0.000 blocks.py:131(_check_ndim)
        3    0.000    0.000    0.000    0.000 _internal.py:865(npy_ctypes_check)
        1    0.000    0.000    0.000    0.000 re.py:286(_compile)
        1    0.000    0.000    0.000    0.000 {built-in method numpy.arange}
        9    0.000    0.000    0.000    0.000 {method 'rpartition' of 'str' objects}
        7    0.000    0.000    0.000    0.000 series.py:460(name)
        2    0.000    0.000    0.000    0.000 sparse.py:223(construct_from_string)
        3 

<pstats.Stats at 0x7f21355896d8>

Both the `context` manager (0.380 s) and the `add_boundary` method (0.291 s) add more overhead than the `set_linear_coefficients` calls in the new test (0.125 s, with twice as many calls).
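
Reading these numbers off the printed tables is error prone. A minimal sketch of pulling them out programmatically follows. It assumes the `stats` attribute of a `pstats.Stats` object, a dictionary keyed by `(file, line, name)` that `pstats` uses internally and that is not part of its documented interface. The helper `cumulative_time` and the functions `inner` and `outer` are hypothetical.

```python
import cProfile
import pstats


def cumulative_time(stats, function_name):
    """Sum cumulative seconds over all profiled functions with this name."""
    total = 0.0
    # Each value is (primitive calls, total calls, tottime, cumtime, callers).
    for (_file, _line, name), (_cc, _nc, _tt, cumtime, _callers) in stats.stats.items():
        if name == function_name:
            total += cumtime
    return total


def inner():
    return sum(range(50000))


def outer():
    return [inner() for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()
outer()
profiler.disable()
stats = pstats.Stats(profiler)

inner_time = cumulative_time(stats, "inner")
outer_time = cumulative_time(stats, "outer")
print(f"inner: {inner_time:.4f} s, outer: {outer_time:.4f} s")
```

Applied to the profiles above, this would look like `cumulative_time(p_old, "add_boundary")` or `cumulative_time(p_new, "set_linear_coefficients")`. Matching by name alone merges unrelated functions that share a name, so the result is only a rough guide.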

I don't see anything unexpected, but I don't know whether this value of 0.125 s should be lower, or whether it scales poorly with bigger models (which it shouldn't).
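
To put 0.125 s in perspective, the figures from the two profiles can be turned into a per-call cost and into shares of the total run time. This is only arithmetic on the numbers reported above. Cumulative times of nested calls can overlap, so the shares should not be added up.

```python
# Figures copied from the two profiles (seconds and call counts).
new_total, old_total = 2.069, 3.134
coeff_time, coeff_calls = 0.125, 684        # set_linear_coefficients, new test
context_time, boundary_time = 0.380, 0.291  # context manager, add_boundary, old test
new_simplex, old_simplex = 1.842, 2.219     # glp_simplex in each test

per_call_ms = 1000 * coeff_time / coeff_calls
shares = {
    "set_linear_coefficients (new)": 100 * coeff_time / new_total,
    "glp_simplex (new)": 100 * new_simplex / new_total,
    "context manager (old)": 100 * context_time / old_total,
    "add_boundary (old)": 100 * boundary_time / old_total,
    "glp_simplex (old)": 100 * old_simplex / old_total,
}

print(f"set_linear_coefficients: {per_call_ms:.3f} ms per call")
for label, share in shares.items():
    print(f"{label}: {share:.1f} % of total")
```

In both tests the simplex calls account for most of the run time, so the coefficient updates are a small part of what remains.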