Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dynamic FBA implementation #452

Closed
phantomas1234 opened this issue Mar 10, 2017 · 14 comments
Closed

dynamic FBA implementation #452

phantomas1234 opened this issue Mar 10, 2017 · 14 comments

Comments

@phantomas1234
Copy link
Contributor

A much requested feature here at DTU Biosustain. Any particular preference if this should be part of cobrapy or a separate package? I am leaning more towards the first.

@pstjohn
Copy link
Contributor

pstjohn commented Mar 10, 2017

i've coded up a few dynamic fba implementations, a really basic one is here. There are better algorithms out there, this for instance. If I remember correctly all their stuff was in either messy fortran or matlab 😄 .

In terms of speed, you'll really want to make sure the ODE is running in a compiled language. That will be the most difficult part of getting something implemented, since we'll want people to be able to specify uptake rate rules in python. You could probably use things like theano, tensorflow, or casadi to construct the symbolic graph in python and then do a jit compilation.

There's also the option of doing a collocation-based approach, like this that works really well in casadi, but you start to run into limits on how large of a model you can simulate. Also, the IPOPT extension he mentions seems to no longer be supported.

@cdiener
Copy link
Member

cdiener commented Mar 10, 2017

Yes would be practical. @pstjohn what you're saying sounds interesting but like you said would be a of of C code since you have to implement the LP solving there as well. So is the function call overhead for Python larger than solving the LP in a lower-level language?

@pstjohn
Copy link
Contributor

pstjohn commented Mar 10, 2017

If you're following the approach from paul barton's group, you only end up resolving the LP when you change solution bases. If you're using the collocation-based approach, you actually don't end up solving the LP at all, you embed the optimality by using the KKT conditions in the nonlinear program.

If you're doing it naively, the way I did, then you need to resolve the LP at each interval call to f(x,p). The bounds are only going to change very slightly, so my expectation is the solve would be very fast - in terms of overhead, just think about all the conversations we've had about whether or not a solution object should pull the flux values from the solver 😄 . That says nothing about whether its mathematically advisable to be solving the LP at each call to f(x,p), I'm sure the writers of the ODE integration code would scoff at all the assumptions you're breaking on continuity, etc.

Solving an ODE will involve hundreds to thousands of calls to dx/dt = f(x, p), so the overhead is nontrivial. Also worth considering (if we're trying to use the method from tidor's group) is that we'll need rootfinding in the ODE integration. I don't think scipy's solvers have that, nor do the SUNDIALS wrappers from casadi (casadi/casadi#974). SUNDIALs proper does though, but that's in C.

@cdiener
Copy link
Member

cdiener commented Mar 12, 2017

Thanks for the info! Really need to read the papers 😄 So that would probably need a C/C++ implementation with Python bindings so sounds like a separate package would be better. I remember implementing a gauss collocation in CUDA and non-linear root funding is not so hard if you have a decent linear solver since most software just builds a damped Newton on top of that (at least most FEM methods just do that). Definitely sounds like an interesting project!

@pstjohn
Copy link
Contributor

pstjohn commented Mar 12, 2017

I don't want to be too discouraging, its most likely possible to hack together something in python using scipy's integrators: it would be about the same as integrating these things in MATLAB. I just worry someone will start using these things for parameter estimation of genome-scale models and the computational time will go through the roof :)


I would probably expect that the best way to do it would be some sort of cython/c++ implementation so you could call IDA/GLPK directly like I did in my implementation, but also using some of the theoretical developments from here or here].

I don't think that group has released any code though, I did email them a while back asking about it but I don't remember thinking that adapting it to general models would be fairly easy.

The problem with any of these options though is allowing a user to specify their functions (i.e., uptake rake expressions) in python, and then compile them on-the-fly for the C++ package. Theano/tensorflow/casadi might be able to help there.

@cdiener
Copy link
Member

cdiener commented Mar 13, 2017

Took a little peak at the Barton paper you linked. The stability issues are quite surprising. I didn't know convential algorithms could fail so easily...
I agree that if you want something future-proof you would probably would want to start out with a C++ implementation which you would wrap with Cython. Sorry I misunderstood the root finding thing, now I get that you meant event detection when the basis has to be changed.

Regarding the import rates: are there that many different forms that are commonly used? If not one could pre-implement some and parametrize them from Python. Another alternative would be using the C-wrapper from symengine. @phantomas1234 might know more about that. Parameter estimation would be for the uptake rates only correct?

If it would be a C++ it would probably fit better in its own package maybe also implementening some parameter optimization as well. If we want it in cobrapy we would probably have to go the scipy way to get something functional fast...

@phantomas1234
Copy link
Contributor Author

phantomas1234 commented Mar 28, 2017

The Barton papers seem to be state-of-the-art. I remember reading this paper here http://onlinelibrary.wiley.com/doi/10.1002/bit.24748/full. Annoying that the code is not freely available. @Midnighter is a pretty good C++ coder so maybe we could get this started? As far as I know the symengine folks use also cython for their Python bindings. Probably not too hard to let the user define rate equations symbolically using symengine.py and then use them directly in the C++ dfba tool via symengine. Would it be important to support different LP solvers?

@pstjohn
Copy link
Contributor

pstjohn commented Mar 28, 2017

You can email the barton group to get access to their source code. The descriptions are here. I'm not sure about the license restrictions, but I know I got a copy of it a couple years back. Its essentially fortran hooks into their own DAEPACK dae simulator.

I'm assuming the most maintainable / straightforward implementation would use IDAS (from sundials) and something to pass the rate equation from python.

@cdiener
Copy link
Member

cdiener commented Mar 28, 2017

As far as I can tell the papers you linked are newer than the implementation of DFBSIM so I would also tend to start from scratch and use the newer methods. I agree with Sundials IDA. What would be used for the LP solver? The paper used cplex but I think we should probably start out with glpk if there is a way to actually access its solution basis (it can be saved but I have never seen any getters for it)...

@pstjohn
Copy link
Contributor

pstjohn commented Mar 28, 2017

I met a couple of his group members at AIChE, I know this is something they're still working on. They might be able to share some insight on the best way to code it up. Although as far as I could tell most of their newer implementation was in MATLAB.

https://aiche.confex.com/aiche/2016/webprogram/Paper453134.html

@phantomas1234
Copy link
Contributor Author

Look like there is another cobrapy based dfba package here https://github.com/matthiaskoenig/dfba @matthiaskoenig is this similar to @pstjohn implementation?

@matthiaskoenig
Copy link
Contributor

Just commenting on this. If you have any questions let me know.

The implementation is based on SOA (stationary optimization approach) doing iterative updates between the ode and LP part. The main objective here is not to implement a DFBA solver but to find a clean/reproducible way to encode a general DFBA problem (not only the bioreactor parts) in SBML.
So going with a simple approach.

I am building on cobrapy for the FBA part and roadrunner on the ODE part for now, but if a nice implementation in python of the Barton Algorithm, i.e. a direct approach, becomes available I would support it.
I need very general ODEs with additional features like Events, RateRules and the possibility to couple models (also multiple FBA).

This is still in the beta stage:
https://sbmlutils.readthedocs.io/en/latest/notebooks/dfba.html
https://github.com/matthiaskoenig/sbmlutils

but hopefully getting ready in the next few weeks

@Midnighter
Copy link
Member

@synchon worked on this in his GSoC'18 project. The code is in his repository and we'll pick up the work from there. I'm closing this issue here since a separate package makes more sense.

@davidtourigny
Copy link

davidtourigny commented Oct 2, 2019

A Python/C++-based dfba extension for cobrapy is now available here https://pypi.org/project/dfba/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants