
Fix optimize defaults for older scipy versions for L-BFGS-B #476

Merged
merged 4 commits into from Nov 23, 2014

Conversation

Garyfallidis
Contributor

Enabling the Optimizer class to work with Scipy versions < 0.12 for L-BFGS-B can be tricky: the parameters of the L-BFGS-B function changed from 0.11 to 0.12. Here is a workaround. I also added some more prints reporting what is actually happening when Scipy < 0.12. GGT! @arokem release the Kraken!
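The kind of version-conditional dispatch discussed here can be sketched as follows. This is a hedged illustration, not dipy's actual Optimizer code: the `run_lbfgsb` helper and its parameters are hypothetical, chosen only to show how one might route between the old `fmin_l_bfgs_b` interface and the `minimize` wrapper that scipy gained in 0.12.

```python
# Sketch (not dipy's actual code): dispatch on the installed scipy
# version, since scipy.optimize.minimize only gained L-BFGS-B support
# with a unified options interface in 0.12.
import numpy as np
import scipy
from scipy.optimize import fmin_l_bfgs_b

SCIPY_LESS_0_12 = tuple(int(p) for p in scipy.__version__.split('.')[:2]) < (0, 12)

def run_lbfgsb(fun, x0, ftol=1e-7, maxiter=1000):
    """Minimize `fun` with L-BFGS-B across scipy versions (hypothetical helper)."""
    if SCIPY_LESS_0_12:
        # Old interface: fmin_l_bfgs_b takes `factr`, expressed in units
        # of machine epsilon, rather than an absolute `ftol`.
        factr = ftol / np.finfo(float).eps
        xopt, fopt, info = fmin_l_bfgs_b(fun, x0, approx_grad=True,
                                         factr=factr, maxfun=maxiter)
        return xopt, fopt
    # New interface: the `minimize` wrapper accepts `ftol` directly.
    from scipy.optimize import minimize
    res = minimize(fun, x0, method='L-BFGS-B',
                   options={'ftol': ftol, 'maxiter': maxiter})
    return res.x, res.fun
```

Either branch returns the optimal point and the function value there, so calling code does not need to care which scipy is installed.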

@@ -104,7 +99,7 @@ def __init__(self, fun, x0, args=(), method='L-BFGS-B', jac=None,

callback : callable, optional
Called after each iteration, as ``callback(xk)``, where ``xk`` is
the current parameter vector.
the current parameter vector. Only available using Scipy 0.12+.
Contributor


Small nit: I'd use "scipy > 0.12", rather than "0.12+"

@arokem
Contributor

arokem commented Nov 22, 2014

General thought: with the new Travis config, should we set up CI testing for both scipy < 0.12 and scipy >= 0.12? We already have 6 machines tested there...

@Garyfallidis
Contributor Author

@arokem I changed the 0.12+ to >=0.12.

As for setting up CI testing, I think it would be great to have a range of different numpy and scipy versions, so be my guest. Go forward with merging this. I will push the other one, with the enhancement of SLR to support previous transformations, in a bit.

@arokem
Contributor

arokem commented Nov 23, 2014

Sure. Will wait for Travis to finish and then merge this in. More changes coming in this module in the PRs I have been working on, after I rebase on this.


@Garyfallidis
Contributor Author

Travis has finished. GGT!

arokem added a commit that referenced this pull request Nov 23, 2014
Fix optimize defaults for older scipy versions for L-BFGS-B
@arokem arokem merged commit 23e9c2b into dipy:master Nov 23, 2014
@arokem
Contributor

arokem commented Nov 23, 2014

Yeah - I am not very happy with this - test coverage on that file is very poor (40%), but I will merge it anyway, so you don't have to wait on this for the SLR stuff. I wouldn't be amazed if it comes back to bite us on the buildbots, etc.


self._evol_kx = None

_eps = np.finfo(float).eps
Contributor


Why is eps changed to the lowest possible value? You might get convergence issues at line 155, for example, and anywhere else it's used.
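For reference, the `_eps` being questioned is numpy's double-precision machine epsilon, which is orders of magnitude below scipy's own L-BFGS-B defaults. A quick check (nothing here is dipy code):

```python
import numpy as np

# Machine epsilon for 64-bit floats: the smallest eps with 1.0 + eps != 1.0.
_eps = np.finfo(float).eps
print(_eps)   # ~2.220446049250313e-16

# For comparison, fmin_l_bfgs_b's own defaults are much coarser:
# factr=1e7 (an effective f-tolerance of factr * eps, about 2.2e-9) and
# epsilon=1e-8 (the finite-difference step used with approx_grad).
```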

@matthew-brett
Contributor

Really - have we merged a file with 40% coverage?

@arokem
Contributor

arokem commented Nov 23, 2014

It will never be much more than that. If scipy < 0.12 it runs one block of code; if scipy >= 0.12 it runs another block. Any ideas? Should I revert that merge?


@Garyfallidis
Contributor Author

Hey wait a minute guys. The coverage is low because it depends on which version of scipy you are running.

@samuelstjean
Contributor

I'm really wondering about the tolerance set at the bare floating point minimum though.

@Garyfallidis
Contributor Author

Otherwise we need to separate the tests into two functions and skip the tests for the scipy version that is not installed. This is an alternative.

@arokem
Contributor

arokem commented Nov 23, 2014

yep - we're having two conversations here ...


@arokem
Contributor

arokem commented Nov 23, 2014

now it's three - including the meta-conversation I just started :-)


@arokem
Contributor

arokem commented Nov 23, 2014

I don't think that would solve the problem - there would still be low
coverage on the module code itself. Or am I misunderstanding your proposal?


@Garyfallidis
Contributor Author

Okay look I will make another PR using the skip mechanism. Maybe that is better and will not reduce the coverage. Let me give it a try.
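The skip mechanism proposed here might look something like the sketch below. The class and test names are hypothetical (this is not dipy's actual test file): each scipy code path gets its own test, and `unittest.skipIf` skips whichever one does not apply to the installed version.

```python
# Hedged sketch of the proposed skip mechanism (hypothetical test names):
# one test per scipy code path, skipping the path that cannot run here.
import unittest
import scipy

SCIPY_LESS_0_12 = tuple(int(p) for p in scipy.__version__.split('.')[:2]) < (0, 12)

class TestOptimizer(unittest.TestCase):

    @unittest.skipIf(SCIPY_LESS_0_12, "needs scipy >= 0.12 (minimize wrapper)")
    def test_new_scipy_path(self):
        from scipy.optimize import minimize
        res = minimize(lambda x: float((x[0] - 2.0) ** 2), [0.0],
                       method='L-BFGS-B')
        self.assertAlmostEqual(float(res.x[0]), 2.0, places=3)

    @unittest.skipIf(not SCIPY_LESS_0_12, "exercises the scipy < 0.12 workaround")
    def test_old_scipy_path(self):
        from scipy.optimize import fmin_l_bfgs_b
        xopt, fopt, _ = fmin_l_bfgs_b(lambda x: float((x[0] - 2.0) ** 2),
                                      [0.0], approx_grad=True)
        self.assertAlmostEqual(float(xopt[0]), 2.0, places=3)
```

As the thread notes, this keeps the test suite green on any scipy, but it does not raise module coverage: on a given machine only one of the two branches ever executes.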

@arokem
Contributor

arokem commented Nov 23, 2014

As for _eps - I think it's used to set the factr keyword so that the number of iterations is the same across different values of the floating point minimum.


@arokem
Contributor

arokem commented Nov 23, 2014

Maybe use Travis to test on both versions of scipy, and compare the
coverage reports, to see that they have complementary 'holes'?


@Garyfallidis
Contributor Author

Okay, no. The coverage will be small no matter what I do. I don't see how I can increase the coverage here. It is truly conditional on the scipy version you have on your machine: if you don't have scipy >= 0.12, a large part of the code will not be tested, and the opposite.

@Garyfallidis
Contributor Author

@arokem with what scipy version was the coverage 40%? Do you remember? I cannot see Travis results anymore.

@Garyfallidis
Contributor Author

If you don't remember. No worries.

@arokem
Contributor

arokem commented Nov 23, 2014

It didn't run lines 130-210, so must be >=0.12


@samuelstjean
Contributor

I don't see how factr is related to the number of iterations. According to the doc at http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.optimize.fmin_l_bfgs_b.html it's set at a way-overkill level that will never converge/take forever.

By default:
eps = 1e-8
ftol = 1e-7
factr = options['ftol'] / _eps = 450359962.73704958 (which is between 1e8 and 1e9)

It converges if the relative decrease is <= factr * eps, with the doc saying: typical values for factr are 1e12 for low accuracy, 1e7 for moderate accuracy, 10.0 for extremely high accuracy. Currently it stands at ftol = 1e-7.

Default scipy will be factr = 10000000.0 * 1e-8 = 0.1, and if they use eps = 1e-16 on a 64-bit system (do they actually? the doc says eps is the machine precision, so that's not really clear) factr = 1e-09.

Is there any reason to change those defaults? The maximum number of iterations is already controlled by another variable; this is rather for the relative convergence of the cost function.
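The arithmetic above can be checked directly with numpy's actual machine epsilon (which is ~2.22e-16, not 1e-8 or 1e-16). This is just the round-trip computation, not a claim about which choice is right:

```python
import numpy as np

_eps = np.finfo(float).eps      # ~2.220446049250313e-16
ftol = 1e-7

# The workaround maps minimize's ftol onto fmin_l_bfgs_b's factr:
factr = ftol / _eps
print(factr)                    # 450359962.73704958, between 1e8 and 1e9

# fmin_l_bfgs_b stops once the relative decrease of f falls below
# factr * eps, so the round trip recovers the requested ftol:
print(factr * _eps)             # 1e-07
```

Since eps is a power of two, `factr * _eps` recovers `ftol` exactly, i.e. the conversion preserves the intended relative tolerance.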

@Garyfallidis
Contributor Author

@samuelstjean you are looking at the wrong scipy version dude! Please be more careful with your comments.

@arokem
Contributor

arokem commented Nov 23, 2014

The default value of the factr kwarg is 10000000.0


@arokem
Contributor

arokem commented Nov 23, 2014

OK - turns out we did have a machine running the minimal versions of the
dependencies, so scipy 0.9. Maybe all we need from it is also a coverage
report? #479


@samuelstjean
Contributor

What do you mean? The branch of code I mentioned is entered if scipy < 0.12, is it not? It still exists in 0.14, and they seem to have added a maxiter argument as the only difference. Anyway, I still don't see how factr and the number of iterations are related, nor why factr should be changed.

@Garyfallidis
Contributor Author

I didn't invent this by myself: I saw what they do in minimize to call fmin_l_bfgs_b with a common interface of parameters. I did this some time ago. I will check again just in case I missed something.

@samuelstjean
Contributor

Well, it's just that according to the doc, setting factr as it is will give 1e-7, which means the function will only stop if the relative difference between two evaluations of the cost function is lower than 1e-7.

They also list 10 as being extreme precision, so 1e-7 is way overkill and should probably never converge for this criterion, only hitting the maximum number of iterations every time instead.

@Garyfallidis
Contributor Author

Okay, can you make some suggestions for default options which you think will be good?

@samuelstjean
Contributor

Use the defaults they are using. If they picked those, there is probably a good reason, so why change them after all? As it stands it's
factr = 10000000.0
which falls back to 0.1 if the machine precision is 1e-8, and to 1e-9 if it's 1e-16 (and I still find that low, but oh well).
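For what it's worth, with numpy's actual machine epsilon the scipy default works out to a third value. The fmin_l_bfgs_b docstring defines the stopping rule as a relative decrease <= factr * eps, where eps is the machine precision (not the `epsilon` kwarg, which is the finite-difference step), so:

```python
import numpy as np

factr = 10000000.0   # fmin_l_bfgs_b's default (1e7)

# Effective relative tolerance on f under the documented stopping rule,
# using the true double-precision machine epsilon (~2.22e-16):
print(factr * np.finfo(float).eps)   # ~2.22e-09
```

That puts the default effective tolerance near 2.2e-9, between the 0.1 and 1e-9 figures computed above from the 1e-8 and 1e-16 guesses.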

@Garyfallidis
Contributor Author

When we tried the default values with the streamline-based registration they didn't work, and the cost function was very simple. Anyway, I will have a look again. But let me make something clear: at this stage I am not trying to make a general Optimizer class; we don't have enough use cases yet to design something like that. What I am trying to do is make a simple class that deals with different versions of Scipy. Hopefully, when we switch to Scipy 0.11 or 0.12 we can forget about all this and just use the wrapper function minimize, which simplifies playing with different parameters across different optimizers.

@samuelstjean
Contributor

Well, it was designed as a general optimization wrapper for starters, so it largely depends on the cost function used. If you use it on anything least-squares based with large numbers (like registration of images based on intensities), that tolerance is way too high, I'd say. For a specialized application like bundle registration, you might reach the convergence criterion right away with those same values.

It's hard to set defaults as the use cases might be largely different, of course, so there is no single answer here.


4 participants