Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loopless solutions for optlang [fix #385] #391

Merged
merged 13 commits into from Feb 27, 2017

Conversation

cdiener
Copy link
Member

@cdiener cdiener commented Feb 14, 2017

Up to now only implements the original method from Schellenberger. First time the optlang interface really shines. I could basically implement the original method as is. This requires less constraints than the prior version in cobrapy and brings a speed up (twice as fast).

Missing:

  • a posteriori version (should be much faster)
  • faster loopless FVA using the method from CycleFreeFlux (semi a posteriori)

As expected a posteriori methods are faster. For FVA the loopless version is still slower than the non-loopless one for large problems (kind of expected). However it is much faster than using add_loopless or construct_loopless_model followed by calling "normal" FVA. Faster meaning 2 seconds vs 2 minutes for textbook and a much larger gap (~1000x) for ecoli.

API

This one I am still not sure about. So fire away with your comments

A priori:

with model:
    add_loopless(model)
    model.optimize()
    single_gene_deletion(model)

A posteriori:

with model:
    model.optimize()
    fluxes = loopless_solution(model)

Maybe that one should take a flux distribution as argument alongside with the model? That way you could use it to convert flux samples or saved flux distributions to loopless ones as well...

Benchmarks

test_output.txt

@cdiener cdiener added the WIP work in progress label Feb 14, 2017
@hredestig hredestig added ready and removed WIP work in progress labels Feb 14, 2017
Erratum in: Biophys J. 2011 Mar 2;100(5):1381.
"""
internal = [i for i, r in enumerate(model.reactions) if not r.boundary]
Sint = model.S[:, np.array(internal)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we think this is an operation we'll do many times? Just thinking that we could speed things up if I had the array-generating function accept an optional argument to only give you certain columns...

In this case though, probably not a speed-limiting factor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now it's only used here...

coefs = {model.solver.variables[
"delta_g_" + model.reactions[ridx].id]: row[i]
for i, ridx in enumerate(internal) if abs(row[i]) > zero_cutoff}
model.solver.constraints[name].set_linear_coefficients(coefs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this something we'll want to context-manage eventually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already is due to the add_to_solver above. Had to do it like that because you can't set linear coefficients for a Constraint that is not part of an optlang.Model yet.

def _():
with test_model:
add_loopless(test_model)
test_model.optimize(solver="optlang-glpk")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I'm confused: glpk can handle the MILP formulation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! Absolutely. GLPK includes a MIP solver. It does not support indicator variables like cplex or gurobi so you have to declare them by hand.

@codecov-io
Copy link

codecov-io commented Feb 15, 2017

Codecov Report

Merging #391 into devel-2 will increase coverage by 0.16%.
The diff coverage is 79.72%.

@@             Coverage Diff             @@
##           devel-2     #391      +/-   ##
===========================================
+ Coverage    65.35%   65.52%   +0.16%     
===========================================
  Files           64       64              
  Lines         7716     7822     +106     
  Branches      1338     1359      +21     
===========================================
+ Hits          5043     5125      +82     
- Misses        2442     2459      +17     
- Partials       231      238       +7
Impacted Files Coverage Δ
cobra/core/model.py 85.97% <ø> (+0.66%)
cobra/flux_analysis/init.py 50% <ø> (ø)
cobra/flux_analysis/sampling.py 96.1% <ø> (-0.17%)
cobra/util/array.py 100% <100%> (ø)
cobra/flux_analysis/variability.py 92.85% <100%> (+0.31%)
cobra/util/solver.py 80.9% <100%> (-0.67%)
cobra/flux_analysis/loopless.py 81.25% <74.39%> (-18.75%)
cobra/test/test_flux_analysis.py 86.81% <80%> (-0.64%)
cobra/solvers/mosek.py 0% <ø> (ø)
cobra/core/formula.py 18.42% <ø> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ac928ef...9ed85d7. Read the comment docs.

----------
.. [1] Elimination of thermodynamically infeasible loops in steady-state
metabolic models.
Schellenberger J, Lewis NE, Palsson BO.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this not be fixed by using u""" rather than reverting to ASCII? Internationalization is a good thing in general, we're not in the 80s anymore :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case it's not that critical since Palsson is cited like that pretty often. If it would in actual code I would agree. But really have no preference. I can add the u""" if you would like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, he is cited with O fairly often. Let's leave it then.

@cdiener cdiener changed the title Loopless solutions for optlang [WIP] Loopless solutions for optlang Feb 15, 2017
@cdiener cdiener mentioned this pull request Feb 15, 2017
Copy link
Contributor

@hredestig hredestig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • made some cosmetic changes hope it was ok
  • Although haven't tried it live so don't not very informed opinion but I like the API you suggest, shall we proceed with it as is? Guess just need to remove the additional benchmarking code or you want to keep it?

@@ -143,7 +158,7 @@ def _fva_optlang(model, reaction_list, fraction):
fva_old_objective = prob.Variable(
Copy link
Contributor

@hredestig hredestig Feb 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need m.optimize() above instead to save the solution? also, I reverted m.optimize() to raise an exception when doing the pfba, was not intended as a final solution there but work-in-progress, with that change I guess you don't need to raise ValueError here , think that's where the error comes from...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. The thing is that everything involving constructing a Solution (even LazySolution) adds overhead. This is why I use the pure optlang interface and m.solver.objective.value below (which is set by calling m.solver.optimize already). In that instance it does not make so much difference except for small models. However, in the loop below that make quite a difference.

"""Add variables and constraints to make a model thermodynamically
feasible. This removes flux loops. The used formulation is described
in [1]_.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps could give an example of when to use add_loopless versus loopless_solution to help understand the reason for both of them?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that is a good idea. Also note that zero_cutoff becomes unnecessary if there would a way in optlang to get the tolerance for the solver.

@pstjohn
Copy link
Contributor

pstjohn commented Feb 17, 2017

So here's a random question:

what if I did:

add_loopless(model)
model.add_reaction(my_new_reaction)
model.optimize()

or even

add_moma(model)
model.add_reaction(my_new_reaction)
model.optimize()

My point is that if we want these routines to return/modify model objects, maybe we should be superclassing model, and then either providing add_reaction wrappers that handle the new constraints, or return NotImplementedErrors?

but then we might lose speed and the ability to context manage.

Either way, probably not a huge concern... Maybe a simple docstring warning would be sufficient

@cdiener
Copy link
Member Author

cdiener commented Feb 17, 2017

@pstjohn That is a good point. For now mentioning it in the docstrings and the future docs is the easiest thing. We could add a check in add_reactions but I wouldn't know for what to check right now...

@cdiener cdiener changed the title Loopless solutions for optlang Loopless solutions for optlang [WIP] Feb 17, 2017
@cdiener
Copy link
Member Author

cdiener commented Feb 17, 2017

Returning to WIP status since the changes in the Solution class broke the loopless functions.

@hredestig I had to rebase due to some problems. I tried to integrate all your suggestions and some more. Could you check whether it's fine?

@@ -143,7 +158,7 @@ def _fva_optlang(model, reaction_list, fraction):
fva_old_objective = prob.Variable(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. The thing is that everything involving constructing a Solution (even LazySolution) adds overhead. This is why I use the pure optlang interface and m.solver.objective.value below (which is set by calling m.solver.optimize already). In that instance it does not make so much difference except for small models. However, in the loop below that make quite a difference.

"""Add variables and constraints to make a model thermodynamically
feasible. This removes flux loops. The used formulation is described
in [1]_.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that is a good idea. Also note that zero_cutoff becomes unnecessary if there would a way in optlang to get the tolerance for the solver.

"""
internal = [i for i, r in enumerate(model.reactions) if not r.boundary]
Sint = model.S[:, numpy.array(internal)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually forgot to throw an error here if there is no numpy :O

@cdiener
Copy link
Member Author

cdiener commented Feb 18, 2017

So here is the gist of the last commits:

Model.objective still had the @resettable decorator. That together with the new set_objective caused some bugs filling up the contexts. Taking @resettable off Model.objective uncovered some bugs with set_objective where it would first zero the objective and than read the reset value (which is surprisingly zero in this case :) ).

hredestig
hredestig previously approved these changes Feb 20, 2017
@hredestig
Copy link
Contributor

Looks great now, is it still wip?

@cdiener
Copy link
Member Author

cdiener commented Feb 20, 2017

@hredestig Still figuring out whether loopless solution should have an argument fluxes that takes a dict of fluxes.

Might be useful for flux sampling (convert sample to a loopless solution) or saved flux solutions. The problem there is how to get the objective value from a dict alone. Easy if Model.objective_coefficients gives the correct objective, but what happens if I set an objective in the optlang interface directy (non-linear, uses other variables, etc.)?

cdiener added a commit to cdiener/cobrapy that referenced this pull request Feb 20, 2017
hredestig pushed a commit that referenced this pull request Feb 21, 2017
* Remove unimplemented/incomplete optlang solvers from get_solver_name
* Add test_choose_solver fix from #391
* fix typo
@hredestig hredestig dismissed their stale review February 22, 2017 13:51

still work in progress

@cdiener cdiener changed the title Loopless solutions for optlang [WIP] Loopless solutions for optlang Feb 22, 2017
@cdiener
Copy link
Member Author

cdiener commented Feb 22, 2017

Ok, should be fine now. Added the fluxes arg as well.

Note that this also fixes two bugs I found on the way (#416 and another one in util.solver_valid_atoms). Let me know if you want those in a separate PR.

@hredestig hredestig changed the title Loopless solutions for optlang Loopless solutions for optlang [fix #385] Feb 24, 2017
model.objective.set_linear_coefficients(
{rxn.forward_variable: -1, rxn.reverse_variable: 1})
model.solver.objective.direction = "min"
model.optimize(objective_sense=None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this raises a SolveError if status is not optimal - be in try/catch?

@cdiener
Copy link
Member Author

cdiener commented Feb 24, 2017

Okay rebased onto devel-2 😅 and added fixes and tests. Also implemented the review from @hredestig. For now this PR also fixes #416 and #418.

Copy link
Contributor

@hredestig hredestig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@hredestig hredestig merged commit 6cf03f8 into opencobra:devel-2 Feb 27, 2017
@hredestig hredestig removed the ready label Feb 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants