
Use Exitflag in Optimization Result #28

Closed

lindahua opened this issue May 5, 2013 · 12 comments

@lindahua
Contributor

lindahua commented May 5, 2013

An optimization procedure may terminate for a variety of reasons:

  • converged (changes are smaller than the tolerance)
  • reached maximum number of iterations without convergence
  • failed to find a good search direction
  • numerical problems (e.g. the solution became unbounded, Inf or NaN occurred, etc.)

Currently, the optimization functions return OptimizationResults with a field converged to indicate the exit condition. But this may not provide accurate information if the procedure was terminated for special reasons (e.g. numerical problems).

Using a more informative exitflag (instead of only a boolean variable) also addresses problems such as the one you encountered at line 209 of l_bfgs.jl. In such cases, you can simply terminate the procedure and use a proper exitflag to tell the caller what happened.

Here is a possible list of exit flags: http://www.mathworks.com/help/optim/ug/fmincon.html

However, I think using symbols instead of integers might make it more user-friendly.
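For illustration, here is a minimal sketch (in current Julia syntax; the type layout and flag names are assumptions, not the actual Optim.jl API) of what a symbolic exit flag could look like:

```julia
# Sketch only: hypothetical result type carrying a symbolic exit flag
# instead of a lone Bool `converged` field.
struct OptimizationResults
    minimizer::Vector{Float64}
    minimum::Float64
    iterations::Int
    exitflag::Symbol  # e.g. :converged, :max_iterations, :bad_direction, :unbounded
end

# Turn the flag into a human-readable status message.
function describe(res::OptimizationResults)
    if res.exitflag == :converged
        "converged after $(res.iterations) iterations"
    elseif res.exitflag == :max_iterations
        "iteration limit reached without convergence"
    else
        "terminated early: $(res.exitflag)"
    end
end
```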

@lindahua
Contributor Author

lindahua commented May 5, 2013

Note: NLopt.jl also uses a set of integers to indicate different termination reasons. (See lines 73–86 in https://github.com/stevengj/NLopt.jl/blob/master/src/NLopt.jl)

@johnmyleswhite
Contributor

Very good point. I've been thinking about expanding on the simple Boolean exit status for a while. Symbols do seem like a good approach. I'll read through the flags from fmincon and try to add them.

For me, the biggest open question is the boundary between true error conditions in which we should raise an error and conditions in which we return results with warnings that convergence was not reached.
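To make that boundary concrete, here is a hedged sketch (the flag names are hypothetical, not existing Optim.jl symbols) in which hard numerical failures raise an error while a missed iteration budget only warns:

```julia
# Sketch of one possible error-vs-warning policy for exit flags.
function check_exit(flag::Symbol)
    if flag in (:unbounded, :nan_encountered)  # true error conditions
        error("optimization failed: $flag")
    elseif flag == :max_iterations             # soft condition: warn, return results
        @warn "convergence not reached before the iteration limit"
    end
    return flag
end
```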

@johnmyleswhite
Contributor

Looking through NLopt, it seems like we should implement multiple convergence diagnostics -- e.g. convergence in gradient norm vs. convergence in function values vs. convergence in state. I've been debating this for some time, but hesitated. Since NLopt is doing it, it seems we'd be wise to follow suit.

@timholy
Contributor

timholy commented May 5, 2013

Convergence criteria are an interesting topic. It seems that most optimization routines threshold the gradient (e.g., all components have absolute value < 1e-6). However, the physicist in me just cringes: I always imagine my different variables having different units, so with this criterion you're comparing convergence in "per-parsec" vs "per-microsecond," which makes no sense. Another way to say it is that this criterion is not scale-invariant. For that reason, I could not bring myself to adopt this criterion in cgdescent.

Perhaps I shouldn't worry so much about this; it does bother me that cgdescent is different, I think, from the majority of other algorithms. It's also true that the intermediate steps in optimization are not scale-invariant (at least not until the Hessian gets "learned"), but to me it seems more important to have that property for a stopping criterion than for the intermediates in a calculation.

For what it's worth, one that does work out from a "units" perspective is this one:

max(abs(g.*dx)) < tol*(abs(f_new) + abs(f_old))

g is the gradient, dx is the step taken (their product has the same units as the objective function), tol is a dimensionless number (e.g., sqrt(eps(T))), and f_new and f_old are the function values on the current and previous iterations. I use both of them in case the function value happens to "pass through" zero. Whether one wants max or mean or something else is debatable (it looks like I used mean in cgdescent), but thinking about it I suspect max is the better choice.
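A minimal Julia sketch of this test (the function name and default tolerance are assumptions):

```julia
# Units-consistent stopping test described above.
function units_converged(g::AbstractVector, dx::AbstractVector,
                         f_new::Real, f_old::Real;
                         tol::Real = sqrt(eps(Float64)))
    # g .* dx has the same units as the objective, so the comparison is
    # scale-invariant; summing |f_new| and |f_old| guards against the
    # function value passing through zero.
    return maximum(abs.(g .* dx)) < tol * (abs(f_new) + abs(f_old))
end
```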

@johnmyleswhite
Contributor

I'm very sympathetic to the concerns about units, but I'm not sure that scale-invariance is ultimately essential: conventional measures of the difficulty of basic quadratic optimization problems, like the condition number, are arguably not scale-invariant themselves.

For CG, Nocedal and Wright mention using this measure of convergence,

max(abs(g)) < tol * (1 + abs(f_new))

which is quite close to the one you're describing -- but different enough to not be scale-invariant.

For now, I think we should stick with conventional metrics -- while allowing users some flexibility to select among competing standards.
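For reference, the Nocedal and Wright test quoted above as a one-line Julia predicate (the name and default tolerance are assumptions, and max is read as the infinity norm):

```julia
# Gradient test quoted above; tol is a small dimensionless tolerance.
nw_converged(g, f_new; tol = 1e-8) = maximum(abs.(g)) < tol * (1 + abs(f_new))
```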

@johnmyleswhite
Contributor

As a first pass at this, I've enlarged OptimizationResults so that it separately specifies whether the function values converged or the gradient converged. It also includes the full set of function values encountered along the way, for cases where the trajectory is important but maintaining a full trace would be excessive.

@johnmyleswhite
Contributor

I've finished my draft work on this with 1dd3392: the algorithms now assess convergence in terms of the change in x, the change in f(x), and the norm of the gradient, gr. All of this information is stored in OptimizationResults as Bools, along with information about whether the maximum number of iterations was exceeded.

For now, I think we're done, but would like to see if others think we need more information than this.
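A sketch of the shape of that information (the field names here are illustrative and may not match the exact OptimizationResults fields):

```julia
# Sketch of separate convergence flags as described above.
struct ConvergenceInfo
    x_converged::Bool             # change in x fell below x_tol
    f_converged::Bool             # change in f(x) fell below f_tol
    gr_converged::Bool            # norm(gr, Inf) fell below gr_tol
    iteration_limit_reached::Bool # stopped at the maximum iteration count
end

# One possible aggregate: any of the three criteria counts as convergence.
converged(c::ConvergenceInfo) = c.x_converged || c.f_converged || c.gr_converged
```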

@johnmyleswhite
Contributor

What do people want to do here? Right now, you get information about convergence in x, f(x) and norm(g(x), Inf). Which of the other errors from NLopt seem worth implementing?

@timholy
Contributor

timholy commented Mar 8, 2014

As you can tell, I'm looking at this package again. To me the information seems quite adequate.

One small thing I noticed: should converged be exported?

@pkofod
Member

pkofod commented Oct 13, 2015

The current convergence information seems sufficient to me.

@johnmyleswhite
Contributor

Agreed.

@xiaohk

xiaohk commented Jan 6, 2017

Possibly a related issue: #331.
