
FEAT: Add the Nelder-Mead algorithm #441

Merged
3 commits merged into QuantEcon:master from QBatista:multivariate_optim on Nov 14, 2018

Conversation

QBatista
Member

@QBatista QBatista commented Oct 16, 2018

Adds a jitted implementation of the Nelder-Mead algorithm for multivariate optimization.

The two main references are Lagarias et al. (1998) for the algorithm and Singer and Singer (2004) for efficiency.

It might be a good idea to have a common interface for scalar_maximization.py and this multivariate version. Please let me know if you have any suggestions for setting up this interface and choosing names.

Performance

import quantecon as qe
from numba import njit
import numpy as np

@njit
def rosenbrock(x):
    return -(100 * (x[1] - x[0] ** 2) ** 2 + (1 - x[0]) ** 2)

x0 = np.array([-2, 1])
qe.optimize.maximize(rosenbrock, x0, tol_x=1e-20)

%timeit qe.optimize.maximize(rosenbrock, x0, tol_x=1e-20)
# 106 µs ± 6.83 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

from scipy.optimize import minimize

%timeit minimize(rosenbrock, x0, method='Nelder-Mead')
# 6.23 ms ± 308 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Close #419

@natashawatkins
Member

I agree with having a common interface which can be used for either scalar or multivariate functions.

Maybe we could do something similar to SciPy and have both individual functions, i.e. nelder_mead and brentq, and a common interface: a function maximize with a method argument.
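
Something along these lines, perhaps (a rough sketch only; the function names and signatures here are placeholders, not the library's actual API):

def nelder_mead(fun, x0, **kwargs):
    ...  # multivariate maximizer (stub)

def brent_max(fun, a, b, **kwargs):
    ...  # scalar maximizer on [a, b] (stub)

def maximize(fun, method='nelder-mead', **kwargs):
    # common entry point dispatching on `method`
    if method == 'nelder-mead':
        return nelder_mead(fun, **kwargs)
    elif method == 'brent':
        return brent_max(fun, **kwargs)
    raise ValueError("unknown method: " + method)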

@natashawatkins
Member

But if we decide not to go for a common interface for now, I think we should call this function something more like nelder_mead to differentiate it from brent_max.

(commit) Jitting is commented out for generating the coverage report
@coveralls

coveralls commented Oct 17, 2018

Coverage Status

Coverage increased (+0.1%) to 94.571% when pulling 09016f9 on QBatista:multivariate_optim into 3bb4d7d on QuantEcon:master.

@natashawatkins
Member

I've tried adding this to the lecture but don't seem to get convergence... Here is a look at the output from this function (stored in solver) vs. the 'brute-force' method we're currently using:

[screenshot: output of the solver vs. the brute-force method]

Here is the piece of code just in case you spot something wrong with it:

[screenshot of the code]

Also, why is the bounds parameter set to have a shape of two by default? Is this needed?

Thanks

@QBatista
Member Author

The bounds parameter is set this way to get Numba to work without using generated_jit. Would it be possible for you to upload the notebook so that I can have a look at it?
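
Roughly, the idea is the following (a sketch, not the exact code in the PR): keeping bounds as a 2-column array, with an empty (0, 2) array standing in for "no bounds", means the jitted code always sees a single array type.

import numpy as np
from numba import njit

@njit
def _in_bounds(x, bounds):
    # bounds has shape (k, 2); an empty (0, 2) array imposes no constraints
    for i in range(bounds.shape[0]):
        if x[i] < bounds[i, 0] or x[i] > bounds[i, 1]:
            return False
    return True

no_bounds = np.empty((0, 2))                  # default: unconstrained
box = np.array([[0., 1.], [0., 1.]])          # box constraints on two variables
_in_bounds(np.array([0.5, 2.0]), box)         # False
_in_bounds(np.array([0.5, 2.0]), no_bounds)   # True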

@natashawatkins
Member

Is that an issue for maximising with more than 2 arguments? Should we have something that automatically detects the size of params and creates the bounds?


@QBatista
Member Author

It looks like the problem is coming from the algorithm. Both my version and SciPy often seem to cycle. I can sometimes get convergence by tweaking the parameters depending on the JVWorker instance, but this occurs quite rarely. Most other methods in SciPy seem to fail, but TNC sometimes works. However, this also doesn't seem to be very robust -- changing the initial guess causes failure.

@natashawatkins
Member

So in the live lecture at the moment, it's using the COBYLA method - do you think we should implement this one in the QuantEcon library?

@QBatista
Member Author

Actually, after checking the live lecture, the problem seems to be that the constraints are missing. By passing bounds = np.array([[0., 1.], [0., 1.]]) as in the live lecture, it seems to be converging consistently. The graph is slightly different though. I tried a naive way of encoding s + ϕ <= 1 by changing the objective function to return -np.inf if the constraint is not satisfied, but the algorithm seems to eventually get stuck.
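
Concretely, the naive encoding looks something like this (the objective below is just a stand-in for illustration, not the lecture's):

import numpy as np
from numba import njit

@njit
def objective(x):
    s, phi = x[0], x[1]
    if s + phi > 1:      # the constraint s + ϕ <= 1
        return -np.inf   # infeasible points are made maximally unattractive
    return -((s - 0.3) ** 2 + (phi - 0.4) ** 2)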

@natashawatkins
Member

Is there a way we can add a constraints argument to the function? Also are you finding it quite slow? It seems to be a lot slower than the 'brute force' method

@chrishyland

Hey @natashawatkins,
Just thought I'd jump in: a previous iteration I had of implementing the bound constraints involved shifting a point to its closest boundary whenever it falls outside the feasible region. This is touched on here if you scroll towards the end.
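
In code, the idea is roughly the following (a sketch, not the notebook's exact implementation):

import numpy as np
from numba import njit

@njit
def clamp_to_box(x, bounds):
    # move each coordinate that violates its bound back to the nearest boundary
    out = x.copy()
    for i in range(bounds.shape[0]):
        if out[i] < bounds[i, 0]:
            out[i] = bounds[i, 0]
        elif out[i] > bounds[i, 1]:
            out[i] = bounds[i, 1]
    return out

clamp_to_box(np.array([-0.2, 1.5]), np.array([[0., 1.], [0., 1.]]))  # array([0., 1.])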

@QBatista
Member Author

@chrishyland Thanks for commenting on this. From your notebook, I see how you've implemented box constraints such as a <= x_n <= b by setting x_n equal to the closest boundary when the constraint is violated. I was reluctant to use this technique because it modifies the algorithm; instead, I set the value returned by the function to be maximized to -np.inf when the constraint is violated. However, this didn't seem to work for linear inequality constraints of the form x_1 + ... + x_n <= b. Do you have a better idea for implementing such constraints?

@chrishyland

Hi @QBatista,
I believe the box constraints implementation in the other link would be sufficient for most problems, and I recall seeing an implementation somewhere (will have to look again) in a paper that uses something like this. I've also struggled to find a solution for the linear inequality constraints you mentioned: Nelder-Mead is an unconstrained optimisation algorithm, so I'm not sure adding such constraints is possible unless you alter the algorithm. However, I recall @jstac saying that just having box constraints is sufficient for now for QuantEcon's needs. Sorry I couldn't be of more help, but I'd be happy to discuss things with you if you have any other ideas on what to do.

@QBatista
Member Author

QBatista commented Nov 4, 2018

@jstac @natashawatkins Having a common interface for the Nelder-Mead algorithm and brent_max would require choosing a common set of required arguments. However, the Nelder-Mead algorithm requires an initial guess of the solution x0 while brent_max doesn't.

One option would be to follow SciPy, which uses two different interfaces: one dedicated to minimizing functions of a single variable (called minimize_scalar) and a more general one which handles functions of one or more variables (called minimize). The algorithms behind the two interfaces are different: Brent's method is used in minimize_scalar, while Nelder-Mead is used in minimize.

Another option would be to make the initial guess argument optional and raise an error message if the method used requires an initial guess. Scipy uses a similar idea for its L-BFGS-B method in the minimize interface, where the Jacobian of the function is a keyword argument, but an error message is raised if the Jacobian is not passed because this particular method requires it.
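
As a rough sketch of option two (names and signatures are placeholders only):

def nelder_mead(fun, x0, **kwargs):
    ...  # multivariate maximizer (stub)

def brent_max(fun, a, b, **kwargs):
    ...  # scalar maximizer (stub)

def maximize(fun, x0=None, method='nelder-mead', **kwargs):
    # the initial guess is optional at the interface level;
    # methods that need it raise an error when it is missing
    if method == 'nelder-mead':
        if x0 is None:
            raise ValueError("the Nelder-Mead method requires an initial guess x0")
        return nelder_mead(fun, x0, **kwargs)
    elif method == 'brent':
        return brent_max(fun, **kwargs)
    raise ValueError("unknown method: " + method)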

In terms of maintenance, the first option would have a more straightforward implementation, especially since we want the interface to be jitted, and should therefore be easier to maintain.

@oyamad Please let me know if you have any suggestions for designing this interface.

@jstac
Contributor

jstac commented Nov 4, 2018

Thanks @QBatista

In option one it's not clear from your discussion how the initial guess issue is resolved. The distinction in SciPy is between multivariate and univariate routines, is it not?

Can you be a bit more specific about the proposed interface?

@QBatista
Member Author

QBatista commented Nov 4, 2018

@jstac In SciPy, the methods that do not require an initial guess are methods that apply exclusively to functions of a single variable, so they all fall under minimize_scalar. Meanwhile, the more general methods that can be used for functions of one or more variables all require an initial guess and are accessed through minimize. Therefore, the minimize interface requires an initial guess to be passed, while minimize_scalar has no option for passing one. This means that a method which requires an initial guess but applies exclusively to functions of a single variable would not fit into either of these two interfaces.
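
For reference, this is how the usage differs in SciPy:

from scipy.optimize import minimize, minimize_scalar

# minimize_scalar takes no initial guess (Brent's method is the default) ...
res1 = minimize_scalar(lambda x: (x - 2) ** 2)
# ... while minimize always requires x0
res2 = minimize(lambda x: (x[0] - 2) ** 2, x0=[0.0], method='Nelder-Mead')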

@jstac
Contributor

jstac commented Nov 4, 2018

Thanks for the explanation. I guess something analogous would be OK for us.

@natashawatkins
Member

Should we just stick to having separate functions for different algorithms? What was the issue regarding generated_jit again?

@QBatista
Member Author

QBatista commented Nov 6, 2018

That's another option -- as long as we don't have too many methods, we could simply have a set of maximize_<method_name> functions.

@natashawatkins generated_jit makes it possible to write a function that has different implementations depending on the input type. This is relevant here because the set of methods we can use depends on the type of the input function. For example, we cannot use brent_max for functions with more than one variable, so if we pass this type of function, the interface should not allow this method to be called. However, using this functionality makes the interface more complex, and therefore more costly to maintain. The issue I mention above is different though -- the problem is that there is no natural set of common required arguments because some methods require an initial guess while others don't.
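
For context, a toy example of what generated_jit allows (not code from this PR): the decorated function receives the Numba types of its arguments and returns the implementation to compile for those types.

import numpy as np
from numba import generated_jit, types

@generated_jit(nopython=True)
def total(x):
    # `x` here is the Numba type of the argument at compile time
    if isinstance(x, types.Array):
        return lambda x: x.sum()   # implementation for array inputs
    else:
        return lambda x: x         # implementation for scalar inputs

total(np.arange(5.0))   # 10.0
total(3.0)              # 3.0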

@natashawatkins
Member

I don't like the idea of having maximize and maximize_scalar because I don't think it's very explicit that there are different algorithms available...

I think we could just call it qe.maximize.brentq etc.

Doesn't this get rid of the generated_jit issue as well?

@QBatista
Member Author

QBatista commented Nov 6, 2018

Yes, that's fine too. Just one thing to note: we currently also have root finding methods in qe.optimize and these would not fit in qe.maximize. For example, brentq is a root finding method while brent is a maximization method. SciPy puts both root finding methods and optimization methods in optimize, so we could either put all of them in qe.optimize along with maximization methods, or have both qe.maximize and qe.root_finding (or something similar).
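
For illustration, the two solve different problems:

from scipy.optimize import brentq, minimize_scalar

# brentq finds a root of f on a bracketing interval [a, b] ...
brentq(lambda x: x ** 2 - 2, 0, 2)                         # ~1.414
# ... while Brent's method in minimize_scalar finds a minimizer
minimize_scalar(lambda x: (x - 1) ** 2, method='brent').x  # ~1.0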

@natashawatkins
Member

Sorry, I meant qe.optimize.brentq

@jstac
Contributor

jstac commented Nov 6, 2018

I'm in favor of @natashawatkins's suggestion.

@QBatista
Member Author

QBatista commented Nov 6, 2018

Sounds good -- I'll make the changes soon.

@QBatista QBatista changed the title [WIP] FEAT: Add the Nelder-Mead algorithm [MRG] FEAT: Add the Nelder-Mead algorithm Nov 7, 2018
@QBatista
Member Author

QBatista commented Nov 7, 2018

This is now ready for review.

@jstac
Contributor

jstac commented Nov 7, 2018

Cleverly written, well documented, excellent tests. Great job @QBatista.

@mmcky
Contributor

mmcky commented Nov 11, 2018

Thanks @QBatista, this is looking good. I will attach a ready tag and merge tomorrow.

@mmcky mmcky added the ready label Nov 11, 2018
@mmcky
Contributor

mmcky commented Nov 11, 2018

Once this is merged, I will issue a new release via PyPI.

@mmcky mmcky changed the title [MRG] FEAT: Add the Nelder-Mead algorithm FEAT: Add the Nelder-Mead algorithm Nov 14, 2018
@mmcky mmcky merged commit 01ee68d into QuantEcon:master Nov 14, 2018
@natashawatkins
Member

@mmcky are you releasing a new version of the library? I might use this in a lecture
