New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with trust region method #89

Open
tpapp opened this Issue Apr 2, 2017 · 11 comments

Comments

Projects
None yet
6 participants
@tpapp

tpapp commented Apr 2, 2017

I have been using this package extensively for solving collocation problems in economics. In my previous experience (with other code), trust region methods have almost always dominated quasi-Newton methods, and definitely Newton methods. This is in accordance with what the Nocedal-Wright book says.

However, with NLsolve, trust region methods frequently don't converge, while plain vanilla Newton converges very well. An MWE is

using NLsolve

"Freudenstein and Roth function."
function f!(x, fval)
    fval[1] = -13 + x[1] + ((5-x[2])*x[2]-2)*x[2]
    fval[2] = -29 + x[1] + ((x[2] + 1)*x[2]-14)*x[2]
end

o1 = nlsolve(f!, [0.5, -2]; autodiff = true)
NLsolve.converged(o1)           # false
o1.zero                         # "bad root"  11.4128, -0.896805

o2 = nlsolve(f!, [0.5, -2]; autodiff = true, method = :newton)
NLsolve.converged(o2)           # true
o2.zero                         # root 5, 4

I don't think there is anything with trust region as a method in theory, so I am wondering if there is a bug in the implementation.

I would like to help out. I could:

  1. factor out parts of the source and write tests for them (exact references of which method is implemented would help though),
  2. code up a library of well-known test cases, both for optimization and rootfinding (this would help Optim.jl too). I am not aware of an existing library/collection like this for Julia.
@sglyon

This comment has been minimized.

Show comment
Hide comment
@sglyon

sglyon Apr 5, 2017

Collaborator

Hey @tpapp thanks for opening this issue. It sounds like we have similar use cases for this library :)

One thing that we (@pkofod and I) have talked about is breaking out the tests in this package in to another package we can require when running tests. This would follow in the style of what Optim.jl has done with the OptimTests.jl package. I think we have a decent start here as we have the MINPACK test suite plus a number of other problems.

Personally, I have never implemented the trust region method so I wouldn't be able to speak to whether or not there are potential coding errors in this implementation. Do you know of a reference that would be best to look at for understanding how trust region methods work?

Collaborator

sglyon commented Apr 5, 2017

Hey @tpapp thanks for opening this issue. It sounds like we have similar use cases for this library :)

One thing that we (@pkofod and I) have talked about is breaking out the tests in this package in to another package we can require when running tests. This would follow in the style of what Optim.jl has done with the OptimTests.jl package. I think we have a decent start here as we have the MINPACK test suite plus a number of other problems.

Personally, I have never implemented the trust region method so I wouldn't be able to speak to whether or not there are potential coding errors in this implementation. Do you know of a reference that would be best to look at for understanding how trust region methods work?

@pkofod

This comment has been minimized.

Show comment
Hide comment
@pkofod

pkofod Apr 5, 2017

Contributor

Nocedal and wright is a good starting point. Let us hunt 🐛s

Contributor

pkofod commented Apr 5, 2017

Nocedal and wright is a good starting point. Let us hunt 🐛s

@pkofod

This comment has been minimized.

Show comment
Hide comment
@pkofod

pkofod Apr 5, 2017

Contributor

Let me add something. I invited nlsolve to JuliaNLSolvers because I really want to see some development going into the package. We've had great success increasing productivity and collaboration with just a few new contributors in Optim, so I'd be thrilled to have more people looking at NLsolve than is currently the case.

Also, I agree that the trust region method usually works very well for the problems I work with (I think my research interests align quite nicely with yours) so let's find out what problem is.

Contributor

pkofod commented Apr 5, 2017

Let me add something. I invited nlsolve to JuliaNLSolvers because I really want to see some development going into the package. We've had great success increasing productivity and collaboration with just a few new contributors in Optim, so I'd be thrilled to have more people looking at NLsolve than is currently the case.

Also, I agree that the trust region method usually works very well for the problems I work with (I think my research interests align quite nicely with yours) so let's find out what problem is.

@tpapp

This comment has been minimized.

Show comment
Hide comment
@tpapp

tpapp Apr 5, 2017

IMO the best starting point for debugging would be factoring the code into smaller functions that one can test separately (dogleg, etc).

In the long run, a package with nothing but test functions would be nice, with the understanding if f(x)=0 is used for testing nonlinear solvers, min ||f(x)|| can be used to test he optimizer. A nice paper with lots of functions is

@article{more1981testing,
  title={Testing unconstrained optimization software},
  author={Mor{\'e}, Jorge J and Garbow, Burton S and Hillstrom, Kenneth E},
  journal={ACM Transactions on Mathematical Software (TOMS)},
  volume={7},
  number={1},
  pages={17--41},
  year={1981},
  publisher={ACM}
}

tpapp commented Apr 5, 2017

IMO the best starting point for debugging would be factoring the code into smaller functions that one can test separately (dogleg, etc).

In the long run, a package with nothing but test functions would be nice, with the understanding if f(x)=0 is used for testing nonlinear solvers, min ||f(x)|| can be used to test he optimizer. A nice paper with lots of functions is

@article{more1981testing,
  title={Testing unconstrained optimization software},
  author={Mor{\'e}, Jorge J and Garbow, Burton S and Hillstrom, Kenneth E},
  journal={ACM Transactions on Mathematical Software (TOMS)},
  volume={7},
  number={1},
  pages={17--41},
  year={1981},
  publisher={ACM}
}
@dehann

This comment has been minimized.

Show comment
Hide comment
@dehann

dehann Apr 5, 2017

Hi All, this might be loosely related to the issue. Thought it might be worth mentioning an outside use case from robotics, which makes extensive use of the trust region method also. Around 3-15 dimensional on fairly well behaved, but not convex, functions, and with possibly weird initial conditions. I have some fair quality assurance testing around aggregate output from batch calls for a variety of cases.

In the last year or two, I've seen a few non convergence issues due to non-start or trivial cases. Usually a small bump to the estimate x+=1e-10*randn(n) solves my non-convergence cases. I had wondered if there might be an untested corner case somewhere. For example when x0=zeros(n) I've seen it get stuck.

For completeness, I've also been using gradient free methods from Optim.jl in some setups minimization rather than root finding cases. Thereby carrying two dependencies. Thanks!

dehann commented Apr 5, 2017

Hi All, this might be loosely related to the issue. Thought it might be worth mentioning an outside use case from robotics, which makes extensive use of the trust region method also. Around 3-15 dimensional on fairly well behaved, but not convex, functions, and with possibly weird initial conditions. I have some fair quality assurance testing around aggregate output from batch calls for a variety of cases.

In the last year or two, I've seen a few non convergence issues due to non-start or trivial cases. Usually a small bump to the estimate x+=1e-10*randn(n) solves my non-convergence cases. I had wondered if there might be an untested corner case somewhere. For example when x0=zeros(n) I've seen it get stuck.

For completeness, I've also been using gradient free methods from Optim.jl in some setups minimization rather than root finding cases. Thereby carrying two dependencies. Thanks!

@ChrisRackauckas

This comment has been minimized.

Show comment
Hide comment
@ChrisRackauckas

ChrisRackauckas Apr 17, 2017

Contributor

I have a problem where the trust region method is going crazy. It's a non-stochastic problem, with the same initial conditions, but I can get wildly different values out each time. For example, the residuals:

[8.86293e-5,1.19202e-5,0.000330256,-919.125,32.0626,-323.151]
[-1.18138e-5,-4.73636e-6,-1.32164e-5,-5464.3,190.616,-1921.17]
[-0.0008622,-0.000345666,-0.000964572,-919.142,32.0623,-323.158]
[0.00180817,0.000712957,0.00200938,0.0078686,0.00149706,0.0045823]
[0.00345202,0.00138402,0.00386197,-919.143,32.0666,-323.154]
[-0.000480206,-0.000192523,-0.000537226,0.009301,-0.000797501,0.00278132]
[0.0,0.0,0.0,-0.0126197,0.000440224,-0.00443692]
[0.000837957,0.000286074,0.000861251,-0.00115442,0.000712696,0.00031013]
[1.37184e-6,5.50121e-7,1.53482e-6,-4.98481e-5,3.09059e-6,-1.61291e-5]
[0.0040354,0.00159477,0.00451992,-5406.55,188.605,-1900.86]
[1.61864e-6,6.48841e-7,1.81049e-6,1.27032e-5,1.15111e-6,6.11227e-6]
[-0.00124305,0.00119902,0.00218492,0.000610128,0.00150093,0.0043943]
[0.0,0.0,0.0,0.0331242,-0.0011555,0.011646]
[-0.00289402,0.000288416,-0.00125191,24808.6,-865.419,8722.35]
[0.0,0.0,0.0,0.033643,-0.0011736,0.0118284]
[0.000579008,0.000202527,0.000630362,-755.516,26.3558,-265.628]
[3.71598e-7,7.17875e-7,7.72998e-8,-8.85129e-6,2.00432e-6,-1.73855e-6]

For now I put it in a loop and just let it run repeatedly until the residual is low enough

save_sol = 0
while true
  bvp = BVProblem(f, domin, cur_bc!, init)
  resid_f = Array(Float64, 6)
  sol = solve(bvp, Shooting(Tsit5(),nlsolve=(f,u0) -> NLsolve.nlsolve(f,u0,autodiff=true)),force_dtmin=true,abstol=1e-10)
  cur_bc!(resid_f,sol)
  println(resid_f)
  if maximum(abs.(resid_f)) < 1e-5
    save_sol = sol
    break
  end
end

but that's just a hack to get around the fact that it's acting very non-deterministic and diverges randomly. (Using finite differencing is even worse on this problem!)

Contributor

ChrisRackauckas commented Apr 17, 2017

I have a problem where the trust region method is going crazy. It's a non-stochastic problem, with the same initial conditions, but I can get wildly different values out each time. For example, the residuals:

[8.86293e-5,1.19202e-5,0.000330256,-919.125,32.0626,-323.151]
[-1.18138e-5,-4.73636e-6,-1.32164e-5,-5464.3,190.616,-1921.17]
[-0.0008622,-0.000345666,-0.000964572,-919.142,32.0623,-323.158]
[0.00180817,0.000712957,0.00200938,0.0078686,0.00149706,0.0045823]
[0.00345202,0.00138402,0.00386197,-919.143,32.0666,-323.154]
[-0.000480206,-0.000192523,-0.000537226,0.009301,-0.000797501,0.00278132]
[0.0,0.0,0.0,-0.0126197,0.000440224,-0.00443692]
[0.000837957,0.000286074,0.000861251,-0.00115442,0.000712696,0.00031013]
[1.37184e-6,5.50121e-7,1.53482e-6,-4.98481e-5,3.09059e-6,-1.61291e-5]
[0.0040354,0.00159477,0.00451992,-5406.55,188.605,-1900.86]
[1.61864e-6,6.48841e-7,1.81049e-6,1.27032e-5,1.15111e-6,6.11227e-6]
[-0.00124305,0.00119902,0.00218492,0.000610128,0.00150093,0.0043943]
[0.0,0.0,0.0,0.0331242,-0.0011555,0.011646]
[-0.00289402,0.000288416,-0.00125191,24808.6,-865.419,8722.35]
[0.0,0.0,0.0,0.033643,-0.0011736,0.0118284]
[0.000579008,0.000202527,0.000630362,-755.516,26.3558,-265.628]
[3.71598e-7,7.17875e-7,7.72998e-8,-8.85129e-6,2.00432e-6,-1.73855e-6]

For now I put it in a loop and just let it run repeatedly until the residual is low enough

save_sol = 0
while true
  bvp = BVProblem(f, domin, cur_bc!, init)
  resid_f = Array(Float64, 6)
  sol = solve(bvp, Shooting(Tsit5(),nlsolve=(f,u0) -> NLsolve.nlsolve(f,u0,autodiff=true)),force_dtmin=true,abstol=1e-10)
  cur_bc!(resid_f,sol)
  println(resid_f)
  if maximum(abs.(resid_f)) < 1e-5
    save_sol = sol
    break
  end
end

but that's just a hack to get around the fact that it's acting very non-deterministic and diverges randomly. (Using finite differencing is even worse on this problem!)

@pkofod

This comment has been minimized.

Show comment
Hide comment
@pkofod

pkofod Apr 17, 2017

Contributor

Since it's the trust region solver, we can at least rule out, that this is a problem with the changes in LineSearches. I hope to have time to have a look soon.

Contributor

pkofod commented Apr 17, 2017

Since it's the trust region solver, we can at least rule out, that this is a problem with the changes in LineSearches. I hope to have time to have a look soon.

@KristofferC

This comment has been minimized.

Show comment
Hide comment
@KristofferC

KristofferC Apr 17, 2017

Collaborator

Are things getting initialized correctly?

Collaborator

KristofferC commented Apr 17, 2017

Are things getting initialized correctly?

@KristofferC

This comment has been minimized.

Show comment
Hide comment
@KristofferC

KristofferC Apr 17, 2017

Collaborator

In the solver I mean

Collaborator

KristofferC commented Apr 17, 2017

In the solver I mean

@pkofod

This comment has been minimized.

Show comment
Hide comment
@pkofod

pkofod Apr 17, 2017

Contributor

Are things getting initialized correctly?

I havn't checked, but I don't think so. I saw this exact same problem with some of the methods in Optim a while back, and there some arrays were assumed (in the code) to be initialized as zeros(n), but was actually initialized using Array{T,1}(n) constructors. This lead to unpredictable behaviour. Very often they would have very small elements, but sometimes it would contain large enough values to throw off the results.

Contributor

pkofod commented Apr 17, 2017

Are things getting initialized correctly?

I havn't checked, but I don't think so. I saw this exact same problem with some of the methods in Optim a while back, and there some arrays were assumed (in the code) to be initialized as zeros(n), but was actually initialized using Array{T,1}(n) constructors. This lead to unpredictable behaviour. Very often they would have very small elements, but sometimes it would contain large enough values to throw off the results.

@ChrisRackauckas

This comment has been minimized.

Show comment
Hide comment
@ChrisRackauckas

ChrisRackauckas Oct 26, 2017

Contributor

I cannot reproduce my original problematic example anymore.

Contributor

ChrisRackauckas commented Oct 26, 2017

I cannot reproduce my original problematic example anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment