
ENH: add scipy's minimize to optimizer #3193

Merged
merged 1 commit into statsmodels:master from Inoryy:scipy-minimize on Feb 27, 2017

Conversation

@inoryy (Contributor) commented Sep 10, 2016

Initial implementation for #2629.

Right now I've used it to add proof-of-concept access to the dogleg optimization method.

As a next step, I see a lot of room for refactoring the existing internal _fit_* methods to use the added _fit_minimize rather than accessing scipy's methods directly. @josef-pkt, would that fit in with what you've envisioned for #2629?

@@ -156,11 +155,14 @@ def _fit(self, objective, gradient, start_params, fargs, kwargs,
         extra_fit_funcs = kwargs.setdefault('extra_fit_funcs', dict())

         methods = ['newton', 'nm', 'bfgs', 'lbfgs', 'powell', 'cg', 'ncg',
-                   'basinhopping']
+                   'basinhopping', 'dogleg']
@josef-pkt (Member) commented on the diff:

I thought of adding a 'minimize' keyword here directly, so we can use whatever scipy's minimize provides without having to hardcode any methods.
The user would then have to provide the name of the actual method, either in the **kwargs, or we could try parsing a compound method name here, e.g. `'minimize-dogleg'.split('-')`.

Adding some specific methods like you did with dogleg is also possible.
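
To illustrate the compound-name idea, a rough dispatch sketch (here fit_funcs and _fit_minimize stand in for the statsmodels internals and are only assumed names, not the actual implementation):

def _select_fit_func(method, kwargs, fit_funcs, _fit_minimize):
    # route any 'minimize-<name>' method through a generic scipy.optimize.minimize wrapper
    if method.startswith('minimize'):
        # 'minimize-dogleg' -> scipy method 'dogleg'; a bare 'minimize' leaves min_method to **kwargs
        _, _, min_method = method.partition('-')
        kwargs.setdefault('min_method', min_method or None)
        return _fit_minimize
    # otherwise fall back to the existing hardcoded wrappers ('newton', 'bfgs', ...)
    return fit_funcs[method]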

@inoryy (Contributor Author):

I definitely like that idea more, but what about the existing methods? Should I keep them as they are, or refactor them to use _fit_minimize?

On a somewhat related note, I've noticed that the newton method has its own implementation entirely. Is there a specific edge case that scipy.optimize.newton doesn't cover?

@josef-pkt (Member):

I answered the first part in the main comments.

About newton: the statsmodels newton predates scipy's dogleg and newton by many years (#449).
The newer scipy newton-style optimizers are among the prime candidates for use as defaults, and are potentially big improvements compared to our newton.
Again, our newton works reasonably well (and we have more control over some details, like regularization), so we need separate keywords. But I would guess that if the scipy newton methods work as advertised, then we will switch the discrete models to use those as the default.

@josef-pkt (Member):

Related to the Travis failures: the problems are with the oldest scipy version, which will be dropped from the supported versions, most likely before this is merged. So you can ignore those failures for now.
Or add a conditional skipif to the test, to stay with an easier-to-check green Travis.

@josef-pkt (Member):

To continue the inline comment:

The plan was to add a generic method as an enhancement so we have access to whatever scipy's minimize provides, without having to hardcode the options to minimize (including the selection of the optimizer method).

It's safer to stick with the enhancement in this PR and not replace/refactor the existing use of the old connected optimizers. The old wrappers have been working for a long time, but we don't have any experience with how well minimize works (and whether its overhead causes problems or slow-downs). There is also a high risk (I'd guess a certainty) that there are incompatibilities in the current keyword use for extra optimizer keywords, where either a name or a default has changed.

An idea for a follow-up is to create a testing PR that internally switches the old methods to go through minimize, e.g. delegates bfgs to minimize-bfgs. That PR would not be merged for some time, but it would give us a good idea of how well the current test suite works with minimize, and whether there are only minor problems or some widespread incompatibilities.
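
For orientation, a minimal sketch of what such a generic wrapper around scipy.optimize.minimize might look like (the signature and the 'converged' key are illustrative assumptions; the actual _fit_minimize added in this PR may differ in detail):

import numpy as np
from scipy import optimize

def _fit_minimize_sketch(f, score, start_params, fargs, kwargs,
                         hess=None, maxiter=100, callback=None,
                         full_output=True, disp=True):
    # the scipy solver is selected through an extra keyword, e.g. min_method='dogleg';
    # with min_method=None scipy's minimize falls back to BFGS for unconstrained problems
    min_method = kwargs.get('min_method', None)
    res = optimize.minimize(f, start_params, args=fargs, method=min_method,
                            jac=score, hess=hess, callback=callback,
                            options={'maxiter': maxiter, 'disp': disp})
    xopt = res.x
    retvals = None
    if full_output:
        retvals = {'fopt': res.fun,
                   'iterations': getattr(res, 'nit', np.nan),  # nit missing in some older scipy results
                   'converged': res.success}
    return xopt, retvals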

@inoryy (Contributor Author) commented Sep 10, 2016

@josef-pkt alright, that makes sense. Thanks for the detailed responses, I should be able to wrap up this PR now.

@inoryy (Contributor Author) commented Sep 11, 2016

@josef-pkt should be working now. If all is ok, I'll start working on documentation updates.

    xopt = res.x
    retvals = None
    if full_output:
        retvals = {'fopt': res.fun, 'iterations': res.nit,
@josef-pkt (Member) commented on the diff:

Since it's just one line, we can add scipy 0.14 compatibility, AFAICS:
use np.nan if res.nit is not available, e.g.
nit = getattr(res, 'nit', np.nan)
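
A self-contained illustration of the fallback (the toy objective here is only for demonstration):

import numpy as np
from scipy import optimize

res = optimize.minimize(lambda x: (x[0] - 2.0) ** 2, x0=[0.0], method='Nelder-Mead')
nit = getattr(res, 'nit', np.nan)   # np.nan on scipy versions whose result lacks nit
print(nit)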

@josef-pkt (Member):

Looks good to me; you can add the documentation for the way it is now.

I still need to check some details:
- do we need to worry about any of the scipy version incompatibilities?
- should and can we avoid the runtime error for the extra gradient and hessian, since it's essentially statsmodels that is causing them?

options['maxiter'] = maxiter

# we're using BFGS by default, which doesn't need hessian
if kwargs['min_method'] == None: hess = None
@josef-pkt (Member) commented on the diff:

To avoid the RuntimeWarning (based on your comment in the tests, I didn't check myself),
we could expand this to the other major "known" min_methods,
e.g.

no_hess = ['Nelder-Mead', 'bfgs', ...]   # could also be defined outside of the function, but likely not worth doing that
if kwargs['min_method'] in no_hess:
    hess = None

@inoryy (Contributor Author) commented Sep 11, 2016

> do we need to worry about any of the scipy version incompatibilities?

I think that's reasonable, even if the plan is to move away from older versions eventually. It shouldn't complicate the code too much.

> should and can we avoid the runtime error for the extra gradient and hessian, since it's essentially statsmodels that is causing them?

Note that these errors appear only in the test suite, due to the way it's set up. In the real world a user would follow the documentation and/or their knowledge of how the methods work, so the only way they'd see the extra hessian/jac errors is if they'd done something wrong, in which case it most definitely makes sense to notify them about it.

@josef-pkt (Member):

About the last point, the RuntimeWarning for jacobian and hessian:
AFAIR, we add those internally, and there is no way for users to disable them.
We can check what the existing wrapper for 'nm', for example, is doing.

@inoryy (Contributor Author) commented Sep 11, 2016

@josef-pkt ah, okay then filtering is the way to go. 👍

@josef-pkt (Member):

For scipy version compatibility:
Your added code wrapping minimize is generic and doesn't depend on the availability of specific min_methods in different scipy versions, except for changes to scipy's minimize options and return keywords like nit, AFAICS. Users will get the error from scipy's minimize if their optimization options are not compatible with their scipy version (that's not our problem).
However, the unit tests for specific min_methods will depend on when those methods were added to scipy. E.g. min_method='dogleg' requires a scipy version in which dogleg has been added.
That means the skip will need to be conditional on the specific method and not on the availability of scipy's minimize itself.
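
One way to express such a method-specific skip (a sketch only; the 0.13 cutoff for dogleg and the pytest skip mechanism are assumptions, not the PR's actual test code):

import scipy
import pytest

# crude version gate: assume 'dogleg' appeared in scipy.optimize.minimize around 0.13
_scipy_version = tuple(int(p) for p in scipy.__version__.split('.')[:2])
has_dogleg = _scipy_version >= (0, 13)

@pytest.mark.skipif(not has_dogleg, reason="this scipy has no 'dogleg' minimize method")
def test_fit_minimize_dogleg():
    # hypothetical test body: fit a model through the new minimize path with min_method='dogleg'
    ...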

@josef-pkt (Member):

BTW: this is going to be a nice focused enhancement PR that is easy to review and merge, but adds a lot of improvement options.

@josef-pkt josef-pkt added this to the 0.9 milestone Sep 11, 2016
@inoryy (Contributor Author) commented Sep 12, 2016

Compatibility issues should be fixed.

I've removed the has_minimize check in the tests, since minimize is available even in the lowest supported scipy version (0.11). The check for has_dogleg looks funky, since scipy doesn't provide a cleaner way to tell.

@coveralls

Coverage Status

Coverage decreased (-0.005%) to 89.256% when pulling 07a0807 on Inoryy:scipy-minimize into bb1db2b on statsmodels:master.

@inoryy (Contributor Author) commented Sep 12, 2016

Added documentation as well. I'm not sure how specific I should be in there, since any relevant info is available in the scipy docs.

Should I do something about the coveralls error? I think the decrease only happens because of the version it's checked against, where dogleg isn't available.

@coveralls

Coverage Status

Coverage decreased (-0.005%) to 89.256% when pulling c86353e on Inoryy:scipy-minimize into bb1db2b on statsmodels:master.

@josef-pkt (Member):

@inoryy Thank you for your first contribution.

Everything looks fine from what I can see reading through it. We will soon update the Python/dependency versions that coveralls uses, so there is nothing to worry about there. And your unit tests look good.
I think this can be squashed into one commit, given that the individual commits don't contain extra information. Can you squash (with an interactive rebase)?

(I might not be back to merging/maintenance for a few more days.)

@inoryy inoryy changed the title WIP ENH: add scipy's minimize to optimizer ENH: add scipy's minimize to optimizer Sep 14, 2016
@inoryy (Contributor Author) commented Sep 14, 2016

@josef-pkt, thank you for walking me through this! I've squashed the commits; should be good to go.

@coveralls

Coverage Status

Coverage decreased (-0.005%) to 89.256% when pulling 14b202a on Inoryy:scipy-minimize into bb1db2b on statsmodels:master.

@josef-pkt (Member):

@inoryy Can you rebase this on master?

We have updated our dependency versions on TravisCI to more recent ones.

Based on the previous comments, this should be ready to merge.
I will go over it again and maybe check a difficult Poisson example to see whether the new newton methods work better than the currently available minimizers.

@inoryy (Contributor Author) commented Feb 10, 2017

@josef-pkt done (rebase worked without conflicts)

@coveralls commented Feb 10, 2017

Coverage Status

Coverage increased (+0.002%) to 90.186% when pulling ce70a61 on Inoryy:scipy-minimize into c2acb1a on statsmodels:master.

@josef-pkt josef-pkt added this to finalize in try_josef_queue Feb 27, 2017
@josef-pkt (Member):

It's a bit behind master, but that's still ok; merging.

@inoryy Thank you, this should help a lot with some difficult estimation problems.

@josef-pkt josef-pkt merged commit 1328f0f into statsmodels:master Feb 27, 2017
@josef-pkt josef-pkt moved this from finalize to done in try_josef_queue Feb 27, 2017
@josef-pkt josef-pkt mentioned this pull request Apr 15, 2018