Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loopless FVA produces odd results #698

Closed
tpfau opened this issue Apr 27, 2018 · 14 comments
Closed

Loopless FVA produces odd results #698

tpfau opened this issue Apr 27, 2018 · 14 comments

Comments

@tpfau
Copy link

tpfau commented Apr 27, 2018

Loopless FVA produces odd results

Please explain:
Using the very simple model attached I tried to run loopless FVA. What I got was all 0 fluxes except for the exchange reactions. I assume this is due to the use of CycleFreeFVA to achieve looplessness. I think the reason is that an objective coefficient of 0 is incompatible with CycleFreeFVA, as that was what I originally used. This is due to a zero coefficient leading to the CycleFreeFVA solution being always zero for reactions which can have any flux in a cycle.
However, when using a higher objective coefficient, I get non zero flux through R1 and R2 due to the forced flux through the objective reaction. In fact, the "cycling" reactions (R2 and R3) are then forced to carry flux depending on the coefficient, while the exchange reactions are pushed towards 0.
Not sure, but this looks pretty odd to me.

Code Sample

model = cobra.io.read_sbml_model( "Test.xml")
A:
>>> fva = flux_variability_analysis(model, loopless=True, fraction_of_optimum=0)
B:
>>> fva = flux_variability_analysis(model, loopless=True, fraction_of_optimum=0.001)
C:
>>> fva = flux_variability_analysis(model, loopless=True, fraction_of_optimum=1)

Actual Output

A:
>>> fva
    maximum  minimum
R1      0.0      0.0
R2      0.0      0.0
R3      0.0      0.0
EA   1000.0      0.0
EB   1000.0      0.0
B:
>>> fva
    maximum  minimum
R1   1000.0      1.0
R2      1.0      1.0
R3      1.0      1.0
EA    999.0      0.0
EB    999.0      0.0

C:
>>> fva
    maximum  minimum
R1   1000.0   1000.0
R2   1000.0   1000.0
R3   1000.0   1000.0
EA      0.0      0.0
EB      0.0      0.0

Expected Output

A:
>>> fva
    maximum  minimum
R1    1000.0      0.0
R2      0.0      0.0
R3      0.0      0.0
EA   1000.0      0.0
EB   1000.0      0.0
B:
>>> fva
    maximum  minimum
R1   1000.0      1.0
R2      0.0      0.0
R3      0.0      0.0
EA    1000.0      1.0
EB    1000.0      1.0

C:
>>> fva
    maximum  minimum
R1   1000.0   1000.0
R2      0.0      0.0
R3      0.0      0.0
EA      1000.0      1000.0
EB      1000.0      1000.0

Output of cobra.show_versions()

System Information

OS Linux
OS-release 4.4.0-121-generic
Python 2.7.12

Package Versions

pip 8.1.1
setuptools 39.0.1
cobra 0.11.3
future 0.16.0
swiglpk 1.4.4
optlang 1.4.1
ruamel.yaml 0.14.12
pandas 0.22.0
numpy 1.14.2
tabulate 0.8.2
python-libsbml 5.13.0
lxml 3.5.0
scipy 1.0.0
matplotlib 2.2.2

@Midnighter
Copy link
Member

Midnighter commented Apr 27, 2018

For clarification: in the model that you have attached the objective is set to R2: B -> C. The model boundary is defined by EA: -> A and EB: B ->.

I think the offending lines are probably the following ones in loopless_fva_iter:

# find the reactions with loops using the current reaction and remove
# the loops
for rxn in model.reactions:
    rid = rxn.id
    if ((abs(ll_sol[rid]) < zero_cutoff) and (abs(almost_ll_sol[rid]) > zero_cutoff)):
        rxn.bounds = max(0, rxn.lower_bound), min(0, rxn.upper_bound)

however, I'd have to inspect ll_sol and almost_ll_sol to make a call on that and I'm too tired for that right now 😄 I'll get back to you on that, though.

@tpfau
Copy link
Author

tpfau commented Apr 27, 2018

Argh wait... maybe I mixed something up there.
Let me double check this.

@tpfau
Copy link
Author

tpfau commented Apr 27, 2018

Thinking about this, the objective should not make any difference in this call. Without cycles, this Model should only be able to carry flux on R1, EA and EB regardless of the objective.
The only diffference between objectives would be the lower bounds of EA, R1 and EB, which increase with an increasing objective fraction.
With R1 as objective it should still give the mentioned solution (but it gets more problematic with the CycleFreeFVA concept).

@cdiener
Copy link
Member

cdiener commented Apr 27, 2018

Without really having seen the model it is important to mention that CycleFreeFVA only removes loops if it does not create infeasibility. If your objectives is a reaction that requires a loop (like R2) you will get loops in the solution.

@tpfau
Copy link
Author

tpfau commented Apr 27, 2018

The model is quite trivial:

 EA         R1         EB
 ---->  A  -----> B ------> 
         ^       / R2
        R3 \   v
             C

If R2 is used, and the original solution is not cycle free (which would be contradicting the CycleFreeFBA formulation), yes, I would expect to see cycling flux. Even though one could argue, that this should not produce any flux neither if the solutions are requested to be cycle free.
But even if R1 or EB is used I get the same results as mentioned above.

@cdiener
Copy link
Member

cdiener commented Apr 27, 2018

So I can't completely reproduce the results you get:

In [46]: test.objective = "EB"

In [47]: fva = flux_variability_analysis(test, loopless=True, fraction_of_optimum=.001)

In [48]: fva
Out[48]: 
    maximum  minimum
R1      1.0      1.0
R2      0.0      0.0
R3      0.0      0.0
EA   1000.0      1.0
EB   1000.0      1.0

In [49]: fva = flux_variability_analysis(test, loopless=True, fraction_of_optimum=1)

In [50]: fva
Out[50]: 
    maximum  minimum
R1   1000.0   1000.0
R2      0.0      0.0
R3      0.0      0.0
EA   1000.0   1000.0
EB   1000.0   1000.0

In [51]: fva = flux_variability_analysis(test, loopless=True, fraction_of_optimum=.5)

In [52]: fva
Out[52]: 
    maximum  minimum
R1    500.0    500.0
R2    500.0      0.0
R3    500.0      0.0
EA   1000.0    500.0
EB   1000.0    500.0

So if I set the objetive to EB, for your cases I actually get the expected output but for the fraction=0.5 case you get nonsense. ~~~For most real models that would probably be not a problem since most exchanges are reversible which alleviates this problem.~~~ (edit: this was nonsense, still a problem)

@cdiener
Copy link
Member

cdiener commented Apr 27, 2018

Ok I think I see what's wrong. Will try a fix...

@cdiener
Copy link
Member

cdiener commented Apr 27, 2018

Sent a PR to fix a part of the problem. When using a reaction outside the loop as the objective you now get a result without loops:

In [1]: from cobra.io import read_sbml_model

In [2]: from cobra.flux_analysis import flux_variability_analysis

In [3]: test = read_sbml_model("/home/cdiener/Downloads/Test.xml")

In [4]: test.objective = "EB"

In [5]: flux_variability_analysis(test, loopless=True, fraction_of_optimum=0.5)
Out[5]: 
    maximum  minimum
R1   1000.0    500.0
R2      0.0      0.0
R3      0.0      0.0
EA   1000.0    500.0
EB   1000.0    500.0

This will still not work when setting the objective to R1. But the paper also assumes that the objective is not part of a cycle which is expressed in the last equation in equation block 2 of the paper (fixed objective so if the objective requires a cycle the method won't work since the objective forces inclusion of a Type III EFM).

@Midnighter
Copy link
Member

@tpfau can you please install the latest devel version and see if that matches your expectations in light of the above discussion?

pip install https://github.com/opencobra/cobrapy/archive/devel.zip

@tpfau
Copy link
Author

tpfau commented May 15, 2018

I can replicate the results by @cdiener. However some more comments 😄

wrt:

This will still not work when setting the objective to R1. But the paper also assumes that the objective is not part of a cycle which is expressed in the last equation in equation block 2 of the paper (fixed objective so if the objective requires a cycle the method won't work since the objective forces inclusion of a Type III EFM).

Would it be possible to identify such reactions during the FVA run, as this otherwise leads to really odd results if a reaction within a cycle is part of the objective.

But there are also some things I don't understand about the methodology:
If I get CycleFreeFlux (CFF) correct it is essentially:

  1. determine a flux distribution v_0 by:
Optimize c*v
s.t. 
    S * v = 0
  1. get a cycle free distribution v_free as:
minimize |v|
s.t 
    S*v = 0
    keep directionalities of fluxes from v_0
    fix exchange reactions from v_0
    fix objective from v_0

For FVA this becomes (if I understand this correctly):

  1. determine a maximal/minimal flux for v(i) and an associated distribution v_0
    Then try to eliminate cycles to obtain v_1:
minimize |v|
s.t. 
    S*v = 0
    keep directionalities of fluxes from v_0
    fix exchange reactions from v_0

The objective is not fixed in this instance, as it would otherwise not be possible to determine a cycle free solution, the solution would simply be the same again including the cycle.

  1. If v_0(i) == v_1(i) stop, and assume, that reaction i is not in 2 different cycles and that the value for v_i is a cycle free value.
    Else:
    fix v(i) = v_0(i) and rerun
minimize |v|
s.t. 
    S*v = 0
    keep directionalities of fluxes from v_0
    fix exchange reactions from v_0

to obtain v_2.
There should now be some reactions v(j) which carry flux in v_2 but which do not carry flux in v_1. These reactions are set to 0 (except for the objective) and the process is repeated starting with a new determination of v_0.

This leads to odd sitations as the following:
If you change the toy model to contain an Exchanger for C instead of A, set the objective to EB, and run FVA again you suddenly get:

>>> flux_variability_analysis(model, loopless=True, fraction_of_optimum=0.0)
    maximum  minimum
R1      0.0      0.0
R2      0.0      0.0
R3      0.0      0.0
EC   1000.0      0.0
EB   1000.0      0.0
>>> flux_variability_analysis(model, loopless=True, fraction_of_optimum=0.5)
    maximum  minimum
R1   1000.0    500.0
R2      0.0      0.0
R3   1000.0    500.0
EC   1000.0    500.0
EB   1000.0    500.0

In the first example, the initial solution found by the solver for R1 seems to have been R1 -> R2 -> R3. Thus it eliminated R3 from the solution, and got a 0 flux.
In the second run, since R1 has an enforced flux, which is also enforced for R3, its value is non zero for v_1 and therefore doesn't eliminate R3, obtaining the correct flux.
Essentially: If there are any required chains of reactions in the model, which are also part of a cycle, and which do not carry forced flux, it seems rather random whether the result is correct.

But even without the changes, R1 should be reported with a max flux of 1000 instead of the enforced value from the objective_fraction. R2/R3 can be wrong (because of the enforced flux through the cycle) but at least R1 should be correct (imo).

@cdiener
Copy link
Member

cdiener commented May 24, 2018

Would it be possible to identify such reactions during the FVA run, as this otherwise leads to really odd results if a reaction within a cycle is part of the objective.

In theory yes, but you would need a "real" loopless method such as the original one that uses a MIP formulation. However that would completely obliterate the speed benefit com CycleFreeFlux. I would probably doubt how much sense it makes to have a cyclic biomass objective for your model. I think one of the major properties of a biomass reaction is to actually consume things which usually makes it non-cyclic. We could maybe make that clear in the docs so that shortcoming is transparent to the user.

Regarding the exchange for C I think you are absolutely correct and I agree with the concerns there. Basically if you have a cycle where each intermediate has its own exchange CycleFreeFlux has a problem due to the initial solution it uses which is one arbitrary solution from the infinite possible ones. So if you get a solution with a non-zero exchange in the first optimization (basically classic FBA) you get an enforced cycle since CycleFreeFVA will now always enforce that exchange. But this is more a shortcoming of the method itself since it will only find the least "loopy" solution given the particular reference solution. And if that one happens to have loop-enforcing exchange that loop will be maintained. You can fix this by adding an intermediate step to the exchanges for instance <-> precursor_C <-> C. In most larger models this intermediate step is always included since there is usually ad external compartment and a cytosolic one and exchanges are formulated as <-> M_e <-> M_c. So that particular problem probably does not affect too many models.

@tpfau
Copy link
Author

tpfau commented May 24, 2018

Regarding the exchange for C I think you are absolutely correct and I agree with the concerns there. Basically if you have a cycle where each intermediate has its own exchange CycleFreeFlux has a problem due to the initial solution it uses which is one arbitrary solution from the infinite possible ones. So if you get a solution with a non-zero exchange in the first optimization (basically classic FBA) you get an enforced cycle since CycleFreeFVA will now always enforce that exchange. But this is more a shortcoming of the method itself since it will only find the least "loopy" solution given the particular reference solution. And if that one happens to have loop-enforcing exchange that loop will be maintained. You can fix this by adding an intermediate step to the exchanges for instance <-> precursor_C <-> C. In most larger models this intermediate step is always included since there is usually ad external compartment and a cytosolic one and exchanges are formulated as <-> M_e <-> M_c. So that particular problem probably does not affect too many models.

Actually, the problem is not the Exchange for C as such, but rather, that now there are 2 reactions of the cycle which are in any instance required for each of them to be able to carry flux.
The problem is, when those reactions are not required for the original objective (e.g. biomass) to carry flux.
If they are, they will not carry zero flux during the cycle removal step, and thus not be deleted. But, if they are independent of the original objective, then they can easily not carry any flux during the general minimisation step, and would thus both be deleted, indicating zero flux. And I can easily imagine these kinds of reactions/constellations in normal models.

@cdiener
Copy link
Member

cdiener commented May 24, 2018

Hmm, yeah I get it. It still kind of depends on the fixed exchanges though. In the end I think the crux of the problem lies in the fact that CycleFree removes cycles from a given reference solution and in FVA you only get one arbitrary reference for each optimization. But as you have argumented perfectly yourself, that is a problem with the method itself which was not really addressed in the original publication. The magnitude of the problem probably depends on the size of the flux cone after constraining the objective.

Would be good to check how much that affects most models. This can be done with a consistency check with a varied fraction of optimum for instance. Ergo run the FVA once with a fraction of 1.0, then with a fraction of 0.5 and check whether the first min-maxes lie within the second ones. Since the textbook model is already part of the tests and is ran a few times there it does not seem to get affected by that too much. The same for the salmonella model from the docs...

For now I don't really see a good solution other than stressing that problem in the docs. At least I am not aware of any other fast loopless FVA method...

Edits: spelling

@Midnighter
Copy link
Member

Eivind Almaas mentioned an extra-fast cycle checking method at the COBRA conference. We're in touch with him and eagerly awaiting a pre-print or publication 😉 I'll close this for now but feel free to re-open the discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants