Going below sqrt(eps) #631

Closed
antoine-levitt opened this issue Jun 20, 2018 · 10 comments

@antoine-levitt
Contributor

A number of linesearches run into trouble when going below the dreaded sqrt(eps) precision. Example:

using Optim, LineSearches

# Random strictly convex quadratic: f(x) = x'Ax + b'x with A symmetric positive definite
n = 10
M = randn(n,n)
M = I + M'M
A = (M + M')/2
b = randn(n)
f(x) = vecdot(x, A*x) + b'x
g(x) = 2*A*x + b
g!(stor, x) = copy!(stor, g(x))
x0 = randn(n)

Optim.optimize(f, g!, x0, Optim.ConjugateGradient(linesearch=LineSearches.BackTracking()),
               Optim.Options(show_trace=true, allow_f_increases=true, f_tol=-1.0, g_tol=-1.0, x_tol=-1.0))

Change the linesearch and method to explore a variety of different bugs (JuliaNLSolvers/LineSearches.jl#123 is just the first one; there are others hiding behind it).

It is not completely clear what to do in this regime, in which differences of objective function values don't make much sense. Of course this is an extreme-precision regime, but it's sometimes useful to go there. HagerZhang seems to be consistently able to converge to machine precision; I'm not sure how. A simple solution would be to impose a minimum step size of C*eps(T)/dphi_0, which should be OK in most cases?
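
A minimal sketch of that floor, assuming f is O(1) so that eps(T) is roughly the rounding level of the objective values (the helper name is illustrative, not an existing LineSearches API):

# Hypothetical step-size floor: below this, the predicted decrease
# alpha*|dphi_0| is smaller than roundoff in f, so objective differences are noise.
min_step(dphi_0::T; C = 100) where {T<:AbstractFloat} = C * eps(T) / abs(dphi_0)

min_step(-1e-12)   # ≈ 2.2e-2 for Float64: don't shrink alpha below this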

cc @cortner, who has experience in this matter

@anriseth
Contributor

Sort-of relevant: JuliaNLSolvers/LineSearches.jl#120

@antoine-levitt
Contributor Author

Very relevant!

So, now with JuliaNLSolvers/LineSearches.jl#123 in, with the code above and the default initial alpha, we have:

  • BackTracking: ERROR: LoadError: Non-finite f(x) while optimizing (NaN)
  • HagerZhang: ERROR: LoadError: ArgumentError: Value and slope at step length = 0 must be finite. (but honorable mention because it goes up to gradients of 1e-16)
  • MoreThuente: ERROR: LoadError: ArgumentError: Invalid parameters to MoreThuente.
  • Static: ERROR: LoadError: AssertionError: α > real(Tα(0))
  • StrongWolfe: infinite loop

So that looks like a good test case ;-)

@mohamed82008
Contributor

After JuliaNLSolvers/LineSearches.jl#136, HagerZhang now doesn't error ;)

using Optim, LineSearches, LinearAlgebra

n = 10
M = randn(n,n)
M = I+M'M
A = (M+M')/2
b = randn(n)
f(x) = dot(x,A*x) + b'x
g(x) = 2*A*x + b
g!(stor,x) = copyto!(stor,g(x))
x0 = randn(n)

Optim.optimize(f, g!, x0, Optim.ConjugateGradient(linesearch=LineSearches.HagerZhang()),
               Optim.Options(show_trace=true, allow_f_increases=true, f_tol=-1.0, g_tol=-1.0, x_tol=-1.0))

@cortner
Contributor

cortner commented Nov 6, 2018

Some general thoughts: although for simple problems Hager-Zhang is very impressive, I've found that in "practical" cases it doesn't do much better than the others. When the optimisers crash, I normally hack something together, but the solution is always to avoid objective evaluations altogether. Some of the following work:

  1. specialised code to evaluate objective-differences in a numerically stable way, then restart the optimiser with the difference as the new objective (but maybe Optim could allow the use of objective differences? I haven't thought about it)
  2. switch to a line search that minimises the gradient (rather than the objective) along the search direction; the break-down normally happens in the asymptotic regime, where one expects a contraction property, so this usually works well (see the sketch after this list)
  3. Infer a "good" step size and switch to static steps
  4. switch to a Newton scheme
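
One way to implement something in the spirit of option 2, sketched below using only gradient evaluations and assuming d is a descent direction; this is not an existing LineSearches algorithm:

using LinearAlgebra

# Locate a step where the directional derivative phi'(alpha) = dot(g(x + alpha*d), d)
# changes sign, by bracket expansion and bisection, never evaluating the objective.
function derivative_only_linesearch(g, x, d; alpha0 = 1.0, maxiter = 60)
    dphi(alpha) = dot(g(x + alpha * d), d)
    lo, hi = 0.0, alpha0
    k = 0
    while dphi(hi) < 0 && k < maxiter   # expand until the slope turns non-negative
        lo, hi = hi, 2hi
        k += 1
    end
    for _ in 1:maxiter                  # bisect on the sign of phi'
        mid = (lo + hi) / 2
        dphi(mid) < 0 ? (lo = mid) : (hi = mid)
    end
    return (lo + hi) / 2
end

On the quadratic above phi' is exactly linear in alpha, so this stays well behaved even when differences of f are pure roundoff.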

@pkofod
Member

pkofod commented Nov 6, 2018

After JuliaNLSolvers/LineSearches.jl#136, HagerZhang now doesn't error ;)

Makes you wonder: was it the actual bug fix, or the handling around zero? :) Probably the latter, but...

@mohamed82008
Contributor

Well the latter was a bug too :)

@antoine-levitt
Contributor Author

I still get various errors like

    97    -5.103860e-01     7.771561e-16
┌ Warning: Linesearch failed, using alpha = 0.004337177887818116 and exiting optimization.
│ The linesearch exited with message:
│ Linesearch failed to converge, reached maximum iterations 50.
└ @ Optim ~/.julia/packages/Optim/ULNLZ/src/utilities/perform_linesearch.jl:47

(with the code pasted above). That also happens with other optimizers. BackTracking does not error, but stagnates around sqrt(eps).

@cortner That depends somewhat on the optimizer, but for LBFGS I agree with you. Line searches should only be used to add some robustness, not help convergence. It seems a simple backtracking strategy works best.
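
For reference, a bare-bones Armijo backtracking loop of the kind referred to above (a textbook sketch, not the LineSearches.BackTracking implementation):

using LinearAlgebra

# Plain Armijo backtracking: shrink alpha until the sufficient-decrease condition
# f(x + alpha*d) <= f(x) + c1*alpha*dot(g(x), d) holds, or give up after maxiter halvings.
function armijo_backtrack(f, g, x, d; alpha = 1.0, c1 = 1e-4, rho = 0.5, maxiter = 50)
    fx = f(x)
    slope = dot(g(x), d)   # negative for a descent direction
    for _ in 1:maxiter
        f(x + alpha * d) <= fx + c1 * alpha * slope && return alpha
        alpha *= rho
    end
    return alpha
end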

@mohamed82008
Contributor

I get the line search failure warning but it exits gracefully, no errors, with |g(x)| = 1.78e-15 and |x - x'| = 6.94e-18. May I ask what's the desired output of this exercise?

@antoine-levitt
Contributor Author

The desired output is that the method converges efficiently to machine precision (i.e. a gradient on the order of machine precision). Currently HZ achieves this, but the other linesearches don't. After that, it's just a stress test that might help catch bugs.
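
For concreteness, with the script above that criterion can be checked directly on the returned result (a rough sanity check, not part of any test suite):

using LinearAlgebra

res = Optim.optimize(f, g!, x0, Optim.ConjugateGradient(linesearch=LineSearches.HagerZhang()),
                     Optim.Options(allow_f_increases=true, f_tol=-1.0, g_tol=-1.0, x_tol=-1.0))
norm(g(Optim.minimizer(res)), Inf)   # "machine precision" here means this is O(eps()) ≈ 2e-16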

@pkofod
Member

pkofod commented Aug 18, 2020

I've noted the point about the lower bound on the step size and will keep this in mind going forward, but otherwise this isn't really actionable. Thanks for the discussion.
