Fix optimize defaults for older scipy versions for L-BFGS-B #476

Merged
merged 4 commits into nipy:master from Garyfallidis/fix_optimize_defaults on Nov 23, 2014

Conversation

@Garyfallidis
Member

Garyfallidis commented Nov 22, 2014

Enabling the Optimizer class to work with Scipy versions < 0.12 for L-BFGS-B can be tricky. The parameters of the L-BFGS-B function changed from 0.11 to 0.12. Here is a workaround. I also added some more prints reporting what is actually happening when Scipy < 0.12. GGT! @arokem release the Kraken!
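
To make the version issue concrete, here is a minimal sketch (not dipy's actual implementation) of the kind of dispatch being discussed; the SCIPY_LESS_0_12 flag and the run_lbfgsb helper are illustrative names, and the parameters follow the public scipy documentation:

import numpy as np
import scipy
import scipy.optimize as opt

# Illustrative version check; dipy defines an equivalent flag of its own.
SCIPY_LESS_0_12 = tuple(int(p) for p in scipy.__version__.split('.')[:2]) < (0, 12)


def run_lbfgsb(fun, x0, ftol=1e-7, maxiter=1000):
    if SCIPY_LESS_0_12:
        # Old interface: fmin_l_bfgs_b takes factr instead of ftol and has
        # no shared options dict or callback support.
        factr = ftol / np.finfo(float).eps
        xopt, fopt, info = opt.fmin_l_bfgs_b(fun, x0, approx_grad=True,
                                             factr=factr, maxfun=maxiter)
        return xopt, fopt
    # New interface (scipy >= 0.12): one entry point with a common options dict.
    res = opt.minimize(fun, x0, method='L-BFGS-B',
                       options={'ftol': ftol, 'maxiter': maxiter})
    return res.x, res.fun

The ftol / eps conversion in the old branch mirrors what minimize does internally for L-BFGS-B, which is also where the _eps discussion further down in this thread comes from.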

@@ -104,7 +99,7 @@ def __init__(self, fun, x0, args=(), method='L-BFGS-B', jac=None,
     callback : callable, optional
         Called after each iteration, as ``callback(xk)``, where ``xk`` is
-        the current parameter vector.
+        the current parameter vector. Only available using Scipy 0.12+.

@arokem

arokem Nov 22, 2014

Member

Small nit: I'd use "scipy > 0.12", rather than "0.12+"

@@ -117,7 +112,8 @@ def __init__(self, fun, x0, args=(), method='L-BFGS-B', jac=None,
         `show_options('minimize', method)`.
     evolution : bool, optional
-        save history of x for each iteration
+        save history of x for each iteration. Only available using Scipy
+        0.12+.

@arokem

arokem Nov 22, 2014

Member

Here too

if SCIPY_LESS_0_12:
    if evolution is True:
        print('Saving the history is only available with Scipy 0.12+.')

@arokem

arokem Nov 22, 2014

Member

And here

@@ -4,8 +4,6 @@
Scipy < 0.12. All optimizers are available for scipy >= 0.12.

@arokem

arokem Nov 22, 2014

Member

This will soon be inaccurate, as we add stuff from my SFM and LiFE PRs. I might just change this in my PR. I have a few more changes proposed on top of this. You can leave this as-is, but I wanted to give you a heads up on that.

@Garyfallidis

Garyfallidis Nov 23, 2014

Member

Okay, cool.

@arokem

Member

arokem commented Nov 22, 2014

General thought: with the new Travis config, should we set up CI testing for both scipy < 0.12 and scipy >= 0.12? We already have 6 machines testing there...

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

@arokem I changed the 0.12+ to >=0.12

As for setting up CI testing, I think it would be great to have a range of different numpy and scipy versions, so be my guest. Go ahead with merging this, and I will push the other one, with the enhancement of SLR to support previous transformations, in a bit.

@arokem

Member

arokem commented Nov 23, 2014

Sure. Will wait for Travis to finish and then merge this in. More changes are coming to this module in the PRs I have been working on, after I rebase on this.

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

Travis has finished. GGT!

arokem added a commit that referenced this pull request Nov 23, 2014

Merge pull request #476 from Garyfallidis/fix_optimize_defaults
Fix optimize defaults for older scipy versions for L-BFGS-B

@arokem arokem merged commit 23e9c2b into nipy:master Nov 23, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed
@arokem

Member

arokem commented Nov 23, 2014

Yeah - I am not very happy with this - test coverage on that file is very poor (40%), but I will merge it anyway, so you don't have to wait on it for the SLR stuff. I wouldn't be amazed if this comes back to bite us on the buildbots, etc.

self._evol_kx = None
_eps = np.finfo(float).eps

@samuelstjean

samuelstjean Nov 23, 2014

Contributor

Why is eps changed to the lowest possible value? You might get convergence issues at line 155, for example, and anywhere else it is used.

@matthew-brett

Member

matthew-brett commented Nov 23, 2014

Really - have we merged a file with 40% coverage?

@arokem

Member

arokem commented Nov 23, 2014

It will never be much more than that. If scipy < 0.12 it runs one block of code; if scipy >= 0.12 it runs another block. Any ideas? Should I revert that merge?

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

Hey wait a minute guys. The coverage is low because it depends on which version of scipy you are running.

@samuelstjean

Contributor

samuelstjean commented Nov 23, 2014

I'm really wondering about the tolerance set at the bare floating point minimum though.

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

Otherwise we need to separate the tests into two functions and skip them depending on the scipy version. That is an alternative.

@arokem

Member

arokem commented Nov 23, 2014

yep - we're having two conversations here ...

@arokem

Member

arokem commented Nov 23, 2014

now it's three - including the meta-conversation I just started :-)

@arokem

Member

arokem commented Nov 23, 2014

I don't think that would solve the problem - there would still be low
coverage on the module code itself. Or am I misunderstanding your proposal?

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

Okay, look, I will make another PR using the skip mechanism. Maybe that is better and will not reduce the coverage. Let me give it a try.
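
As a rough illustration of the skip mechanism being proposed here (not the test code that ended up in the follow-up PR; dipy's suite used nose-style decorators at the time, and the flag and test names below are made up), the tests could be split along these lines:

import unittest

import scipy

# Illustrative flag; the real one lives next to the Optimizer class.
SCIPY_LESS_0_12 = tuple(int(p) for p in scipy.__version__.split('.')[:2]) < (0, 12)


class TestOptimizer(unittest.TestCase):

    @unittest.skipIf(SCIPY_LESS_0_12, "history/callback need scipy >= 0.12")
    def test_new_scipy_path(self):
        # Runs only on scipy >= 0.12 and exercises the minimize() branch.
        pass

    @unittest.skipIf(not SCIPY_LESS_0_12, "exercises the scipy < 0.12 fallback")
    def test_old_scipy_fallback(self):
        # Runs only on scipy < 0.12 and exercises the fmin_l_bfgs_b branch.
        pass

As the rest of the thread points out, skipping only makes the split explicit in the test report; the coverage of the module itself still depends on which scipy version is installed in a given run.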

@arokem

Member

arokem commented Nov 23, 2014

As for _eps - I think it is used to set the factr keyword so that the number of iterations stays the same across different values of the floating point minimum.

@arokem

Member

arokem commented Nov 23, 2014

Maybe use Travis to test on both versions of scipy, and compare the
coverage reports, to see that they have complementary 'holes'?

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

Okay, no. The coverage will be small no matter what I do. I don't see how I can increase the coverage here. It is truly conditional on the scipy version that you have on your machine. If you don't have scipy >= 0.12, a large part of the code will not be tested, and the opposite.

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

@arokem with what scipy version was the coverage 40%? Do you remember? I cannot see Travis results anymore.

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

If you don't remember, no worries.

@arokem

Member

arokem commented Nov 23, 2014

It didn't run lines 130-210, so it must be >= 0.12.

@samuelstjean

Contributor

samuelstjean commented Nov 23, 2014

I don't see how factr is related to the number of iterations; according to the doc at http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.optimize.fmin_l_bfgs_b.html it is set at a way overkill level that will never converge / take forever.

By default:
eps = 1e-8
ftol = 1e-7
factr = options['ftol'] / _eps = 450359962.73704958 (which is between 1e8 and 1e9)

It converges if the relative decrease is <= factr * eps, with the doc saying: typical values for factr are 1e12 for low accuracy, 1e7 for moderate accuracy, and 10.0 for extremely high accuracy. Currently it stands at an effective ftol of 1e-7.

With scipy's default the tolerance would be factr * eps = 10000000.0 * 1e-8 = 0.1, and if they use eps = 1e-16 on 64-bit systems (do they actually? the doc says eps is the machine precision, so that is not really clear) it is 1e-09.

Is there any reason to change those defaults? The maximum number of iterations is already controlled by another variable; this one rather governs the relative convergence of the cost function.
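
For concreteness, the arithmetic in the comment above can be written out directly (assuming 64-bit floats, which reproduces the quoted numbers):

import numpy as np

_eps = np.finfo(float).eps   # ~2.22e-16 for 64-bit floats
ftol = 1e-7                  # the ftol value under discussion

# The conversion used when mapping an ftol option onto fmin_l_bfgs_b's factr:
factr = ftol / _eps
print(factr)                 # ~4.5e8, i.e. between 1e8 and 1e9

# fmin_l_bfgs_b stops when the relative decrease of the cost function falls
# below factr * eps, so this factr corresponds to a relative tolerance of:
print(factr * _eps)          # 1e-7 (up to rounding)

# scipy's own default factr, for comparison:
print(1e7 * _eps)            # ~2.2e-9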

@Garyfallidis

Member

Garyfallidis commented Nov 23, 2014

@samuelstjean you are looking at the wrong scipy version dude! Please be more careful with your comments.

@arokem

Member

arokem commented Nov 23, 2014

The default value of the factr kwarg is 10000000.0

@arokem

Member

arokem commented Nov 23, 2014

OK - turns out we did have a machine running the minimal versions of the
dependencies, so scipy 0.9. Maybe all we need from it is also a coverage
report? #479

@samuelstjean

Contributor

samuelstjean commented Nov 23, 2014

What do you mean? The branch of code I mentioned is entered if scipy < 0.12, is it not? It still exists in 0.14, and they seem to have added a maxiter argument as the only difference. Anyway, I still don't see how factr and the number of iterations are related, nor why factr should be changed.

@Garyfallidis

Member

Garyfallidis commented Nov 24, 2014

I didn't invent this myself; I saw what they do in minimize to call fmin_l_bfgs_b with a common interface of parameters. I did this some time ago. I will check again just in case I missed something.

@samuelstjean

Contributor

samuelstjean commented Nov 24, 2014

Well, it's just that, according to the doc, setting factr as it is gives an effective tolerance of 1e-7, which means the function will only stop if the relative difference between two evaluations of the cost function is lower than 1e-7.

They also list 10 as extremely high accuracy, so 1e-7 is way overkill and should probably never converge on this criterion, only hit the maximum number of iterations every time instead.

@Garyfallidis

Member

Garyfallidis commented Nov 24, 2014

Okay, can you make some suggestions for default options which you think will be good?

@samuelstjean

Contributor

samuelstjean commented Nov 24, 2014

Use the defaults they are using; if they picked those, there is probably a good reason, so why change them after all? As it stands, scipy's default is factr=10000000.0, which falls back to a tolerance of 0.1 if the machine precision is 1e-8, and 1e-9 if it is 1e-16 (and I still find that low, but oh well).

@Garyfallidis

Member

Garyfallidis commented Nov 24, 2014

When we tried the default values with the streamline-based registration they didn't work, and the cost function was very simple. Anyway, I will have a look again. But let me make something clear: at this stage I am not trying to make a general Optimizer class; we don't have enough use cases yet to design something like that. What I am trying to do is make a simple class that deals with different versions of Scipy. Hopefully, when we switch to Scipy 0.11 or 0.12, we can forget about all this and just use the wrapper function minimize, which simplifies playing with different parameters across different optimizers.
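
As a quick sketch of that simplification on scipy >= 0.12, with scipy's Rosenbrock helper standing in for a real cost function:

from scipy.optimize import minimize, rosen

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

# One entry point for several optimizers: only `method` and the relevant
# entries in `options` change, instead of calling a different function with
# a different signature for each solver.
for method in ('Nelder-Mead', 'Powell', 'L-BFGS-B'):
    res = minimize(rosen, x0, method=method, options={'maxiter': 1000})
    print(method, res.success, res.x)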

@samuelstjean

Contributor

samuelstjean commented Nov 24, 2014

Well, it was designed as a general optimization wrapper for starters, so it largely depends on the cost function used. If you use it on anything least-squares based with large numbers (like registration of images based on intensities), it is way too high, I'd say. For a specialized application like the bundle registration, you might reach the convergence criterion right away with those same values.

It's hard to set defaults, as the use cases can be largely different; there is no single answer here.
