META: FORTRAN Code inventory #18566
Comments
I recently took a close look at ODEPACK to review gh-14552. Converting that will be a bear since it consists of large pieces of machinery with sophisticated control logic. I'd be up for helping to replace it with a C, C++, or Cython implementation if we can find someone else with sufficient time, knowledge, and interest to work on it together with me. I think it would have to wait until all of the deliverables are met for the SDG @tirthasheshpatel and I have to work on |
I am interested in looking at |
You don't need to know Fortran. If you know about the functionality then you can approximate as closely as possible without following the code faithfully. You don't need to jump through the GOTOs but kind of get why and where it jumps. I am basically following the logic and not the code. Because you go quickly insane with Arithmetic IFs and Computed GOTOs. |
I'd also be interested in picking up some of these. I'll take a close look during the week and pick somewhere to get started. |
I'd be happy to review PRs for replacing old Fortran code. I think I can take a look at gh-18570 this week.
Picking up the language itself is actually not too hard. You could look through https://web.stanford.edu/class/me200c/tutorial_77/ to get started, and look things up here https://docs.oracle.com/cd/E19957-01/805-4939/index.html as needed as you work through code. Like @ilayn said, it's the unstructured control flow that can make understanding the code so difficult.
What I tend to do is follow along with a notebook and try to work out flowcharts in an almost mechanical fashion, and then use these to reason about parts of the program. I've found that trying to follow along and keep track of the state in one's head, as one can in structured programs, is basically impossible. |
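As an illustration of the "follow the logic, not the code" advice above, here is a minimal sketch of how the two F77 constructs mentioned in this thread map onto structured code. The function names `arithmetic_if` and `computed_goto` are hypothetical helpers invented for this example, not part of SciPy or of any translated routine. The mapping assumes standard F77 semantics: `IF (x) 10, 20, 30` branches on the sign of `x`, and `GO TO (10, 20, 30), i` selects a label by a 1-based index and falls through when `i` is out of range.

```python
def arithmetic_if(value, on_negative, on_zero, on_positive):
    """Structured equivalent of the F77 arithmetic IF: IF (value) 10, 20, 30."""
    if value < 0:
        return on_negative()   # corresponds to jumping to the first label
    elif value == 0:
        return on_zero()       # corresponds to jumping to the second label
    return on_positive()       # corresponds to jumping to the third label


def computed_goto(index, branches, fall_through):
    """Structured equivalent of the F77 computed GOTO: GO TO (10, 20, 30), index.

    Labels are selected with a 1-based index. An out-of-range index
    transfers no control, so execution continues with the next statement,
    modelled here by the fall_through callable.
    """
    if 1 <= index <= len(branches):
        return branches[index - 1]()
    return fall_through()


# Example: each "label" becomes a small callable.
sign = arithmetic_if(-2.5, lambda: "negative", lambda: "zero", lambda: "positive")
mode = computed_goto(2, [lambda: "init", lambda: "iterate", lambda: "finish"],
                     lambda: "continue")
```

In a real port the callables would usually disappear in favour of plain `if`/`elif` blocks or a loop with an explicit state variable; the helpers only make the correspondence with the Fortran statements visible.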
@j-bowhay The LSODA does provide unique functionality (automatic stiffness detection and switching between stiff/nonstiff methods), but I don't know that the algorithm itself is especially complex. The stiff and nonstiff methods individually should be straightforward (pure-Python BDF---LSODA's stiff solver---is already available via But rather than simply directly reimplementing LSODA, I think it would be a better use of time to implement abstracted utilities for method switching/composition (which could then be used to replicate LSODA). DifferentialEquations.jl does this and finds that combinations of other stiff and nonstiff methods can outperform LSODA. Obviously this goes beyond the mission of this issue, but if effort is to be spent on the ODE solvers I don't think it should be to simply directly translate LSODA/odepack. With a proper, modular abstraction, I would expect the individual components (stiff and nonstiff solvers with their own dense output, stiffness detection, and logic for dynamically switching) to be simpler to tackle than the LSODA source would suggest.1 (And I don't immediately see a reason these should be implemented in compiled code, at least at first.) On a semi-related note: a year ago I rewrote Footnotes |
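To make the idea of composing a nonstiff and a stiff method behind a switching rule concrete, here is a deliberately tiny sketch. It is not LSODA's algorithm and not how DifferentialEquations.jl implements its composite solvers: it uses fixed-step explicit and implicit Euler on a scalar ODE, and the stiffness test (`|h * df/dy|` above a threshold) is a hypothetical heuristic chosen only for illustration. All function names are invented for this example.

```python
def estimate_jacobian(f, t, y, eps=1e-7):
    """Forward-difference estimate of df/dy for a scalar ODE y' = f(t, y)."""
    return (f(t, y + eps) - f(t, y)) / eps


def explicit_euler_step(f, t, y, h):
    """Nonstiff method: one explicit (forward) Euler step."""
    return y + h * f(t, y)


def implicit_euler_step(f, t, y, h, newton_iters=20, tol=1e-12):
    """Stiff method: solve z = y + h*f(t+h, z) for z with Newton's method."""
    z = y
    for _ in range(newton_iters):
        residual = z - y - h * f(t + h, z)
        slope = 1.0 - h * estimate_jacobian(f, t + h, z)
        step = residual / slope
        z -= step
        if abs(step) < tol:
            break
    return z


def integrate_with_switching(f, t0, y0, t_end, h, stiff_threshold=1.0):
    """Fixed-step integrator that chooses a method at every step.

    A step is treated as stiff when |h * df/dy| exceeds stiff_threshold
    (an assumption made for this sketch, not a published criterion).
    Returns the final value and the list of methods used per step.
    """
    t, y = t0, y0
    methods_used = []
    n_steps = round((t_end - t0) / h)
    for _ in range(n_steps):
        if abs(h * estimate_jacobian(f, t, y)) > stiff_threshold:
            y = implicit_euler_step(f, t, y, h)
            methods_used.append("implicit")
        else:
            y = explicit_euler_step(f, t, y, h)
            methods_used.append("explicit")
        t += h
    return y, methods_used


# A stiff problem (y' = -1000 y) and a nonstiff one (y' = -y), same step size.
y_stiff, used_stiff = integrate_with_switching(lambda t, y: -1000.0 * y, 0.0, 1.0, 0.1, 0.01)
y_easy, used_easy = integrate_with_switching(lambda t, y: -y, 0.0, 1.0, 1.0, 0.01)
```

The point of the sketch is the separation of concerns: the two steppers and the detection rule are independent pieces, so any of them could be swapped (for example for adaptive Runge-Kutta and BDF steppers) without touching the others.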
I don't have enough solver knowledge to contribute to your analysis. But it sounds like we can indeed have a look at the Cython implementation to see whether we can improve it, or I don't know what you would like to get out of that exercise. Typically, verbose Cython code can get to wrapped-Fortran speed without much effort. It all depends on the sophistication. The main issue we are trying to tackle is getting rid of this dormant, unmaintained code, which is blocking us from achieving a more user-friendly and feature-rich API. In a way, I have come to believe that the code you wrap forces you into its own API, good or bad. And it is impossible to troubleshoot these things. #14807 is one of my favorite bug reports I've encountered here. In other words, scientific Python still smells like a Fortran-like API, which is a shame if you compare it with other coding domains and frameworks. Hence, if domain experts think there are better alternatives, then we can switch to them. If they think it's a waste of time, then we deprecate it. Or, if it is a daunting task to do anything about it but we still need the functionality, then a few of us who are crazy enough can sit down and give it a go. After spending the last month with this Fortran code, I'm simultaneously amazed that the authors provided such robust code, which still works and is being used by so many people, and have lost almost all my respect for the scientific computing community for leaving this code untouched for the last 40 years while making the same excuses. Please let me know if I can provide any help in these issues. I'm more inclined to touch things in sparse.linalg and optimize because I know something about those things, but I can try taking a stab at translations or Cython code. |
I'm happy to share it but I should've been clearer that I think it's independent from the Fortran issue (I was just rewriting the pure-Python routines in Cython). Just wanted to mention for visibility to people thinking about compiled code for IVP solvers. (And indeed I was able to get the overhead to
I should've also mentioned that, since a large fraction of the odepack-dependence comes only from the "old"
Thanks for sharing - would be glad to see the ecosystem move toward being not only more maintainable and robust to such correctness issues, but also more extensible and improvable. |
Thanks @zachjweiner for the summary. I did wonder what the overlap was with the existing Python code but was yet to put in any effort to investigate. If I understand you correctly, the only barrier to removing the |
There is also some Fortran code in stats, all probably rather low-hanging fruit.
EDIT: edited these into the OP, with a subjective difficulty assessment. |
This is great @ilayn !
I can help with this sort of work, depending on what you are planning @ilayn . |
Indeed, I am very much all in for redesigning certain items if the codebase has left us quite behind. DifferentialEquations.jl is an example of how a specialized task force can take things much further than what this Fortran code can offer. So in that sense, I'm all in. One thing I am starting to notice is that once you understand the code flow you start to get rather grandiose ideas about how to generalize, so better be careful with that :) Unfortunately, I'm seriously illiterate when it comes to anything that is not linalg or optimize. Hence, I am not sure if I can be of any help in designing the strategy. What I want is to get rid of the usual suspects, So TL;DR, let's
If |
One thing we have to be careful about with stability statements is that, though they are stable, they are fantastically old and use outdated ways of doing things in the absence of modern tooling (say fixed lapack bugs, threaded/reentrant C libs, and so on). Core of Long story short, though they are stable, this code has huge inertia for newcomers to implement things. Moreover, it also carries a mystique that this code is battle-tested/should not be touched etc. That is demonstrably not true if you actually look at it 😃. I'm sure the algorithms are solid but the implementations are far from perfect. |
I had a look into the optimizers. The biggest problem is in my opinion
I agree with @ilayn 's observation that the cores of these algorithms often look like homegrown linear algebra algorithms or copies of the classic F77 implementations which are nowadays available in LAPACK. Even the modern Fortran code includes again copies of BLAS routines and constrained least squares codes. To avoid linking complexity, this probably makes sense for Fortran or C but for scipy .. |
Just to clarify - does the modern and actively maintained Fortran project present any new problems in tooling? I think that's the key question. |
I don't know if we have any actively maintained new fortran code (maybe PRIMA will be if we manage to link it) so probably we would need to experience it to see if there are new issues. Folks mention iso_c_binding and ctypes routes but I didn't see any demonstration yet. If we don't need to wrestle with ABI issues and/or linking strangeness, I'm fine with it. Like I mentioned before, this is not a crusade against Fortran, but to get rid of dormant code which happens to be fortran77. |
cc #19079 (comment) for a C++ port of AMOS, which seems to be license-compatible. |
A full Python interface for MINPACK has been available for a year now: https://github.com/fortran-lang/minpack/tree/main/python A few tests serving as examples can be found here: https://github.com/fortran-lang/minpack/blob/main/python/minpack/test_library.py But I guess this doesn't really solve any of the problems you have with F77 which are:
It appears easier for people to create something new in their own (closed) community than to work across programming languages and tools. Some time ago I rewrote NNLS with modern Fortran flow-control constructs in an attempt to make it more maintainable and easy to interface. Unfortunately, I cannot live off of refactoring Fortran codes, hence I had no motivation to contribute it to SciPy, despite being a user. |
Maybe try BLIS which was written in C.
AMD also builds their BLAS library (AOCL-BLAS) based on BLIS.
|
Yes, BLIS is on my radar. I need to spare some time and see how we can find a fitting LAPACK for it. In the meantime, Rust folks are also covering quite some distance. I am looking into those. The absolute last resort is going to be to have a performant BLAS-like library and write a small subset of LAPACK with it. But I am really not looking forward to it. Julia folks made some really nice progress with |
For ARPACK, I wonder how feasible it would be to use https://github.com/JuliaLinearAlgebra/ArnoldiMethod.jl as a reference and implement it in cython/etc. According to https://discourse.julialang.org/t/ann-arnoldimethod-jl-v0-4/110604, it is more stable than ARPACK. |
Thanks for sharing @cournape. That looks great - way easier to understand than ARPACK and MIT-licensed. |
Note that ArnoldiMethod.jl only implements the non-symmetric standard eigenproblem; it's lacking Lanczos and the generalized problem. I don't know what the state of the art is, but as far as I remember Lanczos needs extra orthogonalization because it's unstable, and with that in place it looks rather similar to the non-symmetric case, so you might as well only do the non-symmetric case. |
This issue is meant as a central place to track our Fortran (almost exclusively F77) codebase per module and, where possible, remove it, depending on contributor affinity to the subject matter and fortran-fu.
The list gives the projected effort for each item, estimated only from code length, so it is subjective.
scipy.integrate
- dop: Convert: Easy
- mach: Will be deleted
- odepack: Convert: Hard
- quadpack: Convert: Moderate-Hard
scipy.interpolate
- fitpack: Convert: Moderate-Hard (modernized f90 implementation https://github.com/perazz/fitpack)
scipy.io
scipy.linalg
- lu.f: Replaced in ENH:MAINT:linalg:lu Cythonized and ndarray support added #18358
- det.f: Replaced in ENH:MAINT:linalg det in Cython and with nDarray support #18225
- id_dist: Convert: Moderate (replacement in progress) ENH: linalg: Pythonize id_dist FORTRAN code #20558
- blas: Convert: Very Hard (Depends on BLAS/LAPACK lib). A lot of historical wrapper inconsistencies
- lapack: Convert: Very Hard (Depends on BLAS/LAPACK lib). A lot of historical wrapper inconsistencies
scipy.odr
- odrpack: Convert: Moderate
scipy.optimize
- nnls: Replaced in ENH:optimize: Rewrite nnls in Python #18570
- cobyla: Convert: Very Hard (The new implementation PRIMA is going to replace cobyla)
- L-BFGS-B: Convert: Hard
- minpack: Convert: Hard (LFortran team is active on this one too)
- minpack2: Replaced in MAINT: port minpack2.dcsrch from Fortran to Python, remove Fortran code #19060
- slsqp: Replaced in WIP:ENH:optimize:Rewrite SLSQP solver #19121
scipy.sparse.linalg
- iterative: Replaced in MAINT:ENH:sparse.linalg: Rewrite iterative solvers in Python, remove FORTRAN code #18488
scipy.special
- AMOS: Replaced in ENH:MAINT:special:Rewrite amos F77 code #19587
- cdflib: Replaced in ENH:MAINT:special:Cythonize cdflib #19560
- mach: Deleted
- specfun: Replaced in ENH:Rewrite specfun F77 code in C #19824
scipy.stats
- statlib: Replaced in MAINT:stats:Cythonize and remove Fortran statlib code #18679
- mvndst: Already implemented in ENH: stats.multivariate_t: add cdf method #17410 but not public yet