-
-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: odr: upgrade to ODRPACK95 (port to Python or beyond...) #7107
Comments
If anyone is looking for the source code: it's a zip file linked at the top of http://www.netlib.org/odrpack/. It doesn't include a license; not written by government employees so not public domain. ACM is usually a pain, did you find license info anywhere? Looks like the most straightforward thing to do if the license is compatible is to include that Fortran code in the same way as done for ODRPACK now. The trust-region algorithm that's the basis for ODRPACK95 is available in |
Maybe wait for @rkern to comment before starting, he's the expert for |
ODRPACK was indeed written by NIST folk, but ODRPACK95 was not. The main work was done by people at Virginia Tech. It used to be distributed as part of JigCell, but the download and SVN links no longer work, so I couldn't say if there was a license attached to that project independent of the TOMS publication. You can try contacting Layne T. Watson or Jason Zwolak and ask. I have always thought that it would be nice to reimplement the algorithm in pure Python. |
One more possible GSoC topic? How does the difficulty compare to a GSoC
project scope?
…On Tue, Feb 28, 2017 at 10:30 PM, Robert Kern ***@***.***> wrote:
ODRPACK was indeed written by NIST folk, but ODRPACK95 was not. The main
work was done by people at Virginia Tech. It used to be distributed as part
of JigCell <http://jigcell.cs.vt.edu/index.php>, but the download and SVN
links no longer work, so I couldn't say if there was a license attached to
that project independent of the TOMS publication. You can try contacting
Layne T. Watson or Jason Zwolak and ask.
I have always thought that it would be nice to reimplement the algorithm
in pure Python.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#7107 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACCPSNnzYIv7UnJKX1mG6qwCwTV60dYzks5rhHW5gaJpZM4MN4Rw>
.
|
Fits pretty well, I think, for a student that's comfortable implementing numerical algorithms from their papers. |
Any chance you'd be interested in writing a project description / mentoring
/ co-mentoring it if there's a student?
Fits pretty well, I think, for a student that's comfortable implementing
… numerical algorithms from their papers.
|
Not (co-)mentoring, no, but I might be able to write up a description. |
Probably should establish the license first to avoid tears. |
There isn't much interesting in the ODRPACK95 code itself that isn't in the ODRPACK code. The algorithmic innovation was adding bound constraints to its trust-region solver, but we'd probably have to do it differently to add that functionality to our own trust-region solvers in any case. |
Sent an email inquiry about the license to Layne Watson at ltw@cs.vt.edu. |
Sent also a message to the same email on 3Feb2017, but got no replay. I am also interested in this upgrade and could dedicate some time. |
I would like to help here, but I think I am a bit oversubscribed to lead. I would certainly be able to help test. It would be really nice to clear up the copyright issues, including if we decide to drop the Fortran code altogether and write from scratch in Python/Cython. |
If the copyright issues are not addressed, writing from scratch cannot
be avoided. At this point it appears the authors of the package are not
going to reply, so we can assume the copyright issues will not be resolved.
|
It is not clear to me that even writing from scratch would alleviate all the potential copyright/license issues. I tried reviewing the ACM TOMS policy, but I'm no lawyer: Would we need to be able to prove that we did NOT somehow "copy" the F95 bound constraint implementation over to Scipy? |
IANAL, but if you have not ever looked at the F95 source code in
question, it is unlikely there is any copyright issue
https://en.wikipedia.org/wiki/Clean_room_design. Also simply from a
moral point of view, if you are planning to work on it, do not read
their source code.
|
Just got a reply from Jason:
This sounds like a green semaphore. |
Hi everyone. Any updates on this problem? It would be really useful to have the bound constraints feature |
scipy is trying to remove most Fortran codes to reduce our build complexity and maintenance burden (see #18566 ), so we are unlikely to accept ORDPACK95. Are there any ports to other languages? |
I am willing to revisit converting ODR to Python with the addition of parameter bounds. This could presumably build off of other scipy algorithms (e.g., aforementioned trust region optimization). However, I have found the existing Fortran-based code to be blazingly fast, and I make heavy use of the implicit model formulation option. I think speed and feature parity should be well screened as criteria for the translation. |
Based on the existing translations, It's typically on par, sometimes faster (due to avoiding the glue), or at most 2x-ish slower but removal of the maintenance burden, unnecessary compiling olympics and removed code rot (since we can't fix bugs) is much more important for us. We can always make the code run faster by going lower level or if there is any other better maintained code base. |
Rewriting in Python, and if it's slow then accelerating the bottleneck with C/C++/Cython, would be great. And much easier to add any needed features then. |
Also it goes much faster if there is someone who knows what ODR code should do. So please let me know if you would like to take stab @markcampanelli. I'd be happy to do the boring parts of the translation. It's somehow getting easier for me the more I do it (and making me more furious about what I am seeing 😅 ) |
FWIW, "trust region" is more of an adjective than a Zwolak2006 has a clear explanation of the ODRPACK95 algorithm. |
@rkern Thanks for shining a bit of light forward. I will read the paper and decide if I think I can do this. Making time is perhaps the biggest challenge, of course. Out of curiosity, is the scipy team settled on preferentially using, say, Cython for compiled implementations? Being able to write some high performance Rust would be a deal sweetener for me, but C/Cython wouldn’t necessarily be a deal breaker. |
Cython/C/C++ are all acceptable. We're more and more coming to the conclusion that Cython generates binaries that are way too large, so C/C++ is nicer if you're equally comfortable there. If the code supports only Rust: I suspect that eventually we'll get there, but it's a big lift and we don't have the build infrastructure for it nor done the experiments to figure out what the total impact would be. Maybe in a couple of years. |
I can try to rewrite the code in python. @markcampanelli if you don't mind I can help you... |
@markcampanelli @GuiiFerrari, I am also interested in volunteering to help port ODRPACK95 to scipy. |
Had not expected to see so much enthusiasm for porting ODRPACK to another language. This is highly appreciated folks! Feel free to coordinate a little here :) |
The ODRPACK95 code is actually very well structured in a module; nothing like older F77 code. Modern Fortran allows easy interfacing between C and Fortran (either direction). This being so, what is the advantage of converting well-written and validated code to C when the same result can be achieved just by writing the C bindings? |
We are quite aware of PRIMA but every Fortran user who thinks it is a trivial task to do the bindings when attempt it quickly change their mind. it is not that straightforward and @zaikunzhang can comment on it better as the author of PRIMA. Also see #18118 |
Doing the binding cannot be more complex than rewriting and validating the whole code! ;) SUBROUTINE ODR
& (FCN,
& N,M,NP,NQ,
& BETA,
& Y,X,
& DELTA,
& WE,WD,
& IFIXB,IFIXX,
& JOB,NDIGIT,TAUFAC,
& SSTOL,PARTOL,MAXIT,
& IPRINT,LUNERR,LUNRPT,
& STPB,STPD,
& SCLB,SCLD,
& WORK,IWORK,
& INFO,
& LOWER,UPPER)
C...Routine names used as subprogram arguments
C FCN: The user-supplied subroutine for evaluating the model.
C...Variable definitions (alphabetically)
C BETA: The function parameters.
C DELTA: The initial error in the X data
C IFIXB: The values designating whether the elements of BETA are
C fixed at their input values or not.
C IFIXX: The values designating whether the elements of X are
C fixed at their input values or not.
C INFO: The variable designating why the computations were stopped.
C IPRINT: The print control variable.
C IWORK: The integer work space.
C JOB: The variable controlling problem initialization and
C computational method.
C LOWER: The lower bound on BETA.
C LUNERR: The logical unit number for error messages.
C LUNRPT: The logical unit number for computation reports.
C M: The number of columns of data in the explanatory variable.
C MAXIT: The maximum number of iterations allowed.
C N: The number of observations.
C NDIGIT: The number of accurate digits in the function results, as
C supplied by the user.
C NP: The number of function parameters.
C NQ: The number of responses per observation.
C PARTOL: The parameter convergence stopping tolerance.
C SCLB: The scaling values for BETA.
C SCLD: The scaling values for DELTA.
C STPB: The relative step for computing finite difference
C derivatives with respect to BETA.
C STPD: The relative step for computing finite difference
C derivatives with respect to DELTA.
C SSTOL: The sum-of-squares convergence stopping tolerance.
C TAUFAC: The factor used to compute the initial trust region
C diameter.
C UPPER: The upper bound on BETA.
C WD: The DELTA weights.
C WD1: A dummy array used when WD(1,1,1)=0.0E0_R8.
C WE: The EPSILON weights.
C WORK: The REAL (KIND=R8) work space.
C X: The explanatory variable.
C Y: The dependent variable. Unused when the model is implicit. |
You would be surprised. Especially when half of the functions are home-grown (but outdated) LAPACK functions. But I am too invested in removing F77 code from SciPy so I should not comment. |
As a matter of fact, there exists at least one scipy dev who would be interested in seeing the result (me). First of all, note that this interest does not guarantee the result is going to be accepted. The value of the exercise is to help us judge the pros and cons of this way of binding F90 code, to be considered together with other pros and cons of maintaining it. Even if it's not going to make it into scipy proper, a non-trivial real world worked example would be very useful in many circumstances. So, if you're still interested :-), it'd be great to understand:
|
The Python binding of PRIMA has been finished, thanks to the precious contribution of @nbelakovski. The binding is in fact not that complicated. It works on all platforms, and its build system does not depend on anything that is "dirty" or outdated as far as I can see. Note that PRIMA is a comprehensive package for (derivative-free) optimization. It needs to bind not only one but five solvers. This increases the complexity considerably. I hope PRIMA provides an example of binding modern Fortran code to Python. |
Ok, I will give it a try. ;) |
Sounds good! I think it's okay to replace old F77 code with new F95 code, as long as it's better quality and not a ton more code. That's a net win, and when we do get to rewriting it in another language (because we would really like a Fortran-free future!), that job will be easier to do when starting from better F95 code. |
Hi @HugoMVale, |
I am interested in volunteering to help port ODRPACK95 (http://dl.acm.org/citation.cfm?id=1268782) to scipy in order to get the bound constraints feature. I need some direction on how to proceed, however.
The text was updated successfully, but these errors were encountered: