[GSoC] WIP: Bounded LSQ algorithms #5019

Closed · wants to merge 10 commits

Conversation

@nmayorov (Contributor) commented Jul 6, 2015

Hi!

I finally prepared the code for a PR. It is based on the not-yet-merged #4884, so the first commit related to least squares is e908f76.

The main (and only public) function is scipy.optimize.least_squares, which wraps scipy.optimize.leastsq (as method='lm') and the two new algorithms I have been working on, 'dogbox' and 'trf'. The docstring is in reasonable shape, so I encourage you to try least_squares on your problems. Here is the link to the benchmarks I have run with the algorithms.
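
A rough usage sketch (the toy data and parameter values below are only illustrative, not taken from the benchmarks):

import numpy as np
from scipy.optimize import least_squares

# Toy problem: fit y = x1 * exp(-t / x0) to noisy data.
rng = np.random.RandomState(0)
t = np.linspace(0, 10, 50)
y = 3.0 * np.exp(-t / 2.0) + 0.01 * rng.randn(t.size)

def residuals(x):
    # least_squares expects the vector of residuals, not a scalar cost
    return x[1] * np.exp(-t / x[0]) - y

# Both parameters constrained to [0, 10]; 'trf' and 'dogbox' honour the bounds.
res = least_squares(residuals, x0=[1.0, 1.0],
                    bounds=([0.0, 0.0], [10.0, 10.0]), method='trf')
print(res.x, res.cost)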

The issues at the moment:

  1. Rendering of the HTML documentation. We wanted to do the same as in scipy.optimize.minimize, where the general documentation is accompanied by method-specific pages, like http://docs.scipy.org/doc/scipy-dev/reference/optimize.minimize-neldermead.html. Currently I don't know how to do that, and I'm not sure it is the best decision in general.
  2. How to wrap leastsq. I think least_squares might feel more solid and homogeneous if I write a new wrapper around the MINPACK functions instead of calling leastsq.
  3. Whether to wrap leastsq at all. For now it seems OK to wrap it, but I'm working on an extension to sparse Jacobians which will add quite a few additional options that apply to both new algorithms but not at all to leastsq. The tempting solution is to drop leastsq (it will still be available on its own) and make a very clean interface (without extra options, additional doc pages, etc.).
  4. As I said, I'm working on supporting sparse Jacobians with an appropriate iterative solver (LSMR), so the interface may change significantly. This PR is likely not final, but I'm not going to add the sparse-support features here anyway.

@nmayorov (Contributor, Author) commented Jul 6, 2015

Why is test_ltisys.py suddenly not passing?

@pv (Member) commented Jul 6, 2015

The failing ltisys test checks that certain warnings are emitted. You have warnings.simplefilter('ignore') in one of the files. simplefilter can be used, but it needs to be wrapped inside with catch_warnings():, as is done in the ltisys case, so that the global warnings state is preserved.
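
Something along these lines (sketch only):

import warnings

# Scope the filter with catch_warnings so the global warnings state is
# restored once the block exits.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    pass  # code that would otherwise emit the unwanted warnings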

@andyfaff (Contributor) commented Jul 6, 2015

Is there any possibility of using optimize.differential_evolution or other optimize.minimize methods (e.g. L-BFGS-B) here? They are widely used for curve fitting.

@pv (Member) commented Jul 6, 2015

@andyfaff: least squares has more information available than scalar minimization, so those seem out of scope here. In principle, what's done here has similarities to L-BFGS, but the problem solved is different and the L-BFGS Fortran code is not very reusable.

@ev-br (Member) commented Jul 6, 2015

@andyfaff Nikolay benchmarked L-BFGS-B on a series of LSQ problems; see the link in the OP. AFAICT the conclusion is that the specialized algorithms he implemented often outperform the general-purpose ones.

@ev-br added the enhancement and scipy.optimize labels Jul 6, 2015
@andyfaff (Contributor) commented Jul 6, 2015

I'm really looking at a way of plugging in the global minimisers here, like differential_evolution or basinhopping. Gradient-based techniques have a difficult time on a lot of least-squares problems which possess local minima.

@pv (Member) commented Jul 6, 2015

At least for basinhopping, these are probably OK as the local optimization step. Some extra plumbing seems to be required, since the LSQ solver needs the residual vector rather than the scalar cost.
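
Roughly, the plumbing could look something like the sketch below (illustrative only; the residual function, bounds and helper names are placeholders, not part of this PR):

import numpy as np
from scipy.optimize import OptimizeResult, basinhopping, least_squares

def residuals(x):
    # placeholder residual vector of some fitting problem
    return np.array([x[0] - 1.0, 10.0 * (x[1] - x[0] ** 2)])

def cost(x):
    # scalar cost seen by basinhopping
    return 0.5 * np.sum(residuals(x) ** 2)

def lsq_local_step(fun, x0, args=(), **kwargs):
    # custom local minimizer: ignore the scalar `fun` and solve the
    # underlying least-squares problem, then report back a scalar result
    res = least_squares(residuals, x0, bounds=([-5, -5], [5, 5]), method='trf')
    return OptimizeResult(x=res.x, fun=res.cost, success=res.success,
                          nfev=res.nfev)

result = basinhopping(cost, x0=[2.0, 2.0], niter=20,
                      minimizer_kwargs={"method": lsq_local_step})
print(result.x)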

@andyfaff (Contributor) commented Jul 6, 2015 via email

@ev-br (Member) commented Jul 6, 2015

> Gradient based techniques have a difficult time on a lot of least squares problems which possess local minima.

Would you list some specific problems? More problems to test current algorithms on are definitely very welcome --- the selection in Nikolay's sandbox repo could definitely be expanded. I wouldn't be surprised if he accepts PRs with more benchmarks.

Regarding the new minimizers, I am tempted to consider these as an enhancement request. It would be most helpful if you could detail your suggestions in, e.g., this wiki page or even implement the needed changes to the least_squares wrapper as suggested in this PR.

The main aim of this PR, as far as I understand it, is to provide two new algorithms, dogbox and trf, for least squares with bounds. As such, they complement and compete with implementations that handle bounds by a transformation of variables, as is done in leastsqbound and lmfit.
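
For context, a minimal sketch of that transformation-of-variables idea (illustrative only, not the actual leastsqbound or lmfit code; the residuals and bounds are placeholders):

import numpy as np
from scipy.optimize import leastsq

lb = np.array([0.0, 0.0])
ub = np.array([10.0, 10.0])

def to_external(u):
    # map unbounded internal parameters u onto the box [lb, ub]
    return lb + (ub - lb) * (np.sin(u) + 1.0) / 2.0

def residuals(x):
    # placeholder residual vector with its minimum at x = (3, 7)
    return np.array([x[0] - 3.0, x[1] - 7.0])

def wrapped(u):
    # unconstrained problem seen by leastsq
    return residuals(to_external(u))

u_opt, ier = leastsq(wrapped, np.zeros(2))
print(to_external(u_opt))  # solution expressed in the original bounded space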

@ev-br added this to the 0.17.0 milestone Jul 6, 2015
@pv (Member) commented Jul 6, 2015

Yes, the least_squares interface will need to consider how to support global optimizers, including basinhopping, without making the interface too convoluted. The focus of the present PR seems to be more on the new gradient-based bounded LSQ algorithms, so it may be clearer to discuss the required interface extensions in a separate ticket.

@pv mentioned this pull request Jul 6, 2015
@pv (Member) commented Jul 6, 2015

I'd suggest diverting discussion of the interface details to gh-5020 and continuing to discuss the trf/dogbox algorithms here.

@andyfaff (Contributor) commented Jul 6, 2015

> Would you list some specific problems?

Sure. Anything which depends on the starting position would be useful here. There are a few problems from my global-optimisation PR that could be used.

@nmayorov (Contributor, Author) commented Jul 6, 2015

> Gradient based techniques have a difficult time on a lot of least squares problems which possess local minima.

It's a question of finding the global minimum among several local ones, right? That's not directly related to what I'm doing.

@nmayorov (Contributor, Author) commented Jul 7, 2015

As Evgeni asked, I removed all sparse features from _numdiff.py (they weren't used in any way).

@andyfaff (Contributor) commented Jul 7, 2015

An example where leastsq can fail to converge to the correct global minimum is taken from my WIP #4191, which contains many global minimisation test functions. I found it at http://www.itl.nist.gov/div898/strd/nls/data/mgh09.shtml

import numpy as np
from numpy.testing import assert_almost_equal
from scipy.optimize import leastsq, differential_evolution

bounds = list(zip([-5.0] * 4, [5.0] * 4))
global_optimum = [0.192833, 0.190836, 0.123117, 0.135766]
# certified residual sum of squares from the NIST MGH09 page
minimum_energy = 3.0750560385e-04

a = np.asarray([4.0, 2.0, 1.0, 1 / 2.0, 1 / 4.0, 1 / 6.0, 1 / 8.0,
                1 / 10.0, 1 / 12.0, 1 / 14.0, 1 / 16.0])
b = np.asarray([0.1957, 0.1947, 0.1735, 0.1600, 0.0844, 0.0627,
                0.0456, 0.0342, 0.0323, 0.0235, 0.0246])

def f(x):
    vec = b - (x[0] * (a ** 2 + a * x[1])
               / (a ** 2 + a * x[2] + x[3]))
    return vec

def g(x):
    return np.sum(f(x) ** 2)

res = leastsq(f, [-1, -1, 3, 3])
energy = g(res[0])
# from this starting point leastsq does not reach the global minimum,
# so this assertion fails
assert_almost_equal(energy, minimum_energy)

res2 = differential_evolution(g, bounds, seed=1)
energy2 = g(res2.x)
assert_almost_equal(energy2, minimum_energy)

You'll find that, starting from that point, leastsq falls into a local minimum.

@ev-br (Member) commented Jul 8, 2015

In your example, leastsq simply does not converge:

In [39]: r = leastsq(f, [-1, -1, 3, 3], full_output=True)

In [40]: r[-2]
Out[40]: 'Number of calls to function has reached maxfev = 1000.'

Both methods included in this PR are local, and they may well converge to a local optimum. That being said,

In [32]: res = least_squares(f, [-1, -1, 3, 3], bounds=([-5.]*4, [5]*4), method='dogbox')

In [33]: res.success
Out[33]: True

In [34]: res.x
Out[34]: array([ 0.19283212,  0.19086615,  0.12312293,  0.13577984])

and least_squares(..., method='trf') converges to a different (local) minimum on a boundary.

@ev-br (Member) commented Jul 22, 2015

The original idea of separating the dense and sparse computations did not work out. Closing this in favor of gh-5044.

@ev-br closed this Jul 22, 2015
@ev-br removed this from the 0.17.0 milestone Jul 22, 2015
@nmayorov deleted the bounded_lsq branch October 17, 2015 14:05