Fixes to MoreThuente #75

Merged
merged 6 commits into from Nov 27, 2017

anriseth
Collaborator

After looking through the MoreThuente code, I found that the original translation from the MATLAB code has a bug in it. Several lines in cstep calculate a value s = max(theta, dgx, dg). In the original FORTRAN and MATLAB codes, however, this is set to s = max(abs(theta), abs(dgx), abs(dg)).
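To illustrate why the missing abs calls matter, here is a minimal Python sketch, not the package's code. The function names and input values are made up for illustration, and the gamma expression is the scaled form quoted later in this description. When theta, dgx and dg are all negative, the buggy scale is itself negative and gamma comes out with the wrong sign. If the largest of the three is exactly zero, the buggy scale is zero and the division by s breaks down (NaN or Inf in IEEE arithmetic, a ZeroDivisionError in Python).

```python
import math

def scale_buggy(theta, dgx, dg):
    # Scale as in the faulty translation: no absolute values.
    return max(theta, dgx, dg)

def scale_fixed(theta, dgx, dg):
    # Scale as in the original FORTRAN and MATLAB codes.
    return max(abs(theta), abs(dgx), abs(dg))

def gamma_with_scale(theta, dgx, dg, s):
    # Scaled form of gamma; only valid for s > 0.
    return s * math.sqrt((theta / s) ** 2 - (dgx / s) * (dg / s))

# Illustrative inputs where all three quantities are negative.
theta, dgx, dg = -1.5, -2.0, -1.0

s_bad = scale_buggy(theta, dgx, dg)    # negative scale
s_good = scale_fixed(theta, dgx, dg)   # positive scale

gamma_bad = gamma_with_scale(theta, dgx, dg, s_bad)
gamma_good = gamma_with_scale(theta, dgx, dg, s_good)
print(s_bad, gamma_bad)
print(s_good, gamma_good)
```

With a positive scale the result equals sqrt(theta^2 - dgx * dg), which is non-negative. The negative scale flips its sign.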

I caught this when an optimization problem I was running broke down. It is a high-dimensional problem, and I can't replicate it in a test: the input values to the line search are only shown in pretty-printed (rounded) form, and small changes in the eighth decimal place mean that the problem does not appear...

There also seems to be a weird choice in the original codes, where they calculate gamma = s * sqrt((theta/s)^2 - (dg/s) * (dgx/s)). As s > 0, this should be equivalent (up to floating-point rounding) to gamma = sqrt(theta^2 - dg * dgx). I have made the change, as it omits the potentially problematic division by s.
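As a quick check of the claimed equivalence, the following Python sketch evaluates both forms on moderately sized inputs. The function names and the sample values are illustrative assumptions, not taken from the package.

```python
import math

def gamma_scaled(theta, dgx, dg):
    # Form used in the original codes, with s = max of the absolute values.
    s = max(abs(theta), abs(dgx), abs(dg))
    return s * math.sqrt((theta / s) ** 2 - (dg / s) * (dgx / s))

def gamma_unscaled(theta, dgx, dg):
    # Simplified form without the division by s.
    return math.sqrt(theta ** 2 - dg * dgx)

# Illustrative values with derivatives of opposite sign.
theta, dgx, dg = 0.3, -1.7, 0.9

a = gamma_scaled(theta, dgx, dg)
b = gamma_unscaled(theta, dgx, dg)
print(a, b, abs(a - b))
```

For inputs of this size the two results agree to within rounding error. The sketch says nothing about very large or very small inputs.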

In addition, I have moved the finite-value check from #73 outside of the loop, so that it does not interfere with the algorithm itself, but only ensures that the initial step does not mess things up.
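A rough Python sketch of what such a pre-loop check could look like follows. The halving strategy, the function name shrink_to_finite, the max_halvings limit and the toy objective are all assumptions for illustration; the actual check from #73 may differ.

```python
import math

def shrink_to_finite(phi, alpha, max_halvings=50):
    """Halve alpha until phi(alpha) is finite, or give up after max_halvings."""
    value = phi(alpha)
    halvings = 0
    while not math.isfinite(value) and halvings < max_halvings:
        alpha /= 2
        value = phi(alpha)
        halvings += 1
    return alpha, value

def phi(alpha):
    # Hypothetical 1-D objective along the search direction:
    # finite only for alpha < 1.
    return -math.log(1 - alpha) if alpha < 1 else math.inf

# Run once before the line search proper, so the main loop
# always starts from a step with a finite objective value.
alpha0, phi0 = shrink_to_finite(phi, 4.0)
print(alpha0, phi0)
```

Running the check once up front means the main iteration never sees a non-finite trial value caused only by an overly long initial step.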

@codecov

codecov bot commented Nov 22, 2017

Codecov Report

Merging #75 into master will increase coverage by 0.55%.
The diff coverage is 60%.


@@            Coverage Diff             @@
##           master      #75      +/-   ##
==========================================
+ Coverage   62.27%   62.82%   +0.55%     
==========================================
  Files           8        8              
  Lines         607      616       +9     
==========================================
+ Hits          378      387       +9     
  Misses        229      229
| Impacted Files | Coverage Δ |
|---|---|
| src/morethuente.jl | 56.54% <60%> (+2.14%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5cd79f4...ae164e7. Read the comment docs.

@codecov

codecov bot commented Nov 22, 2017

Codecov Report

Merging #75 into master will increase coverage by 0.3%.
The diff coverage is 57.69%.


@@            Coverage Diff            @@
##           master      #75     +/-   ##
=========================================
+ Coverage   62.27%   62.58%   +0.3%     
=========================================
  Files           8        8             
  Lines         607      620     +13     
=========================================
+ Hits          378      388     +10     
- Misses        229      232      +3
| Impacted Files | Coverage Δ |
|---|---|
| src/morethuente.jl | 55.89% <57.69%> (+1.5%) ⬆️ |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5cd79f4...017e5c2. Read the comment docs.

@pkofod
Member

pkofod commented Nov 23, 2017

Good catch... and did it work after you fixed this?

@anriseth
Collaborator Author

anriseth commented Nov 23, 2017

> Good catch... and did it work after you fixed this?

Yes, it now converges, whilst previously it generated NaN step guesses.

I have also compared behaviour with and without the s term in the gamma calculation. This does change the number of function evaluations in two of the optimisations I ran (these are corner cases with step directions almost orthogonal to the descent direction). The calculated differences in gamma are of order 1e-15 or smaller (sometimes zero), so these algorithms are not always stable with respect to floating-point rounding errors...

@pkofod
Member

pkofod commented Nov 23, 2017

Alright, looks good to me then. They probably did it for a reason. If it's a floating-point reason, maybe @simonbyrne has stumbled across something like it before?

@anriseth
Collaborator Author

I'll get in contact with the people who were involved in the implementations of the original codes, and see if they can remember (originally from 1983 and 1991, so we'll see...)

@anriseth
Collaborator Author

Dianne O'Leary pointed out to me that s is a scaling factor to prevent under- and overflow in the calculation of theta^2 and dgx * dg. It's probably best to keep it that way.
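The effect of the scaling can be seen with extreme inputs. The Python sketch below is illustrative only: the function names and the values 1e200 and 1e-200 are assumptions chosen to force overflow and underflow in double precision.

```python
import math

def gamma_scaled(theta, dgx, dg):
    # Divide by s first, so the squared and multiplied terms stay near 1.
    s = max(abs(theta), abs(dgx), abs(dg))
    return s * math.sqrt((theta / s) * (theta / s) - (dgx / s) * (dg / s))

def gamma_unscaled(theta, dgx, dg):
    # theta * theta and dgx * dg can overflow to inf or underflow to 0.
    return math.sqrt(theta * theta - dgx * dg)

# Very large inputs: the unscaled products overflow.
big_unscaled = gamma_unscaled(1e200, -1e200, 1e200)
big_scaled = gamma_scaled(1e200, -1e200, 1e200)

# Very small inputs: the unscaled products underflow to zero.
tiny_unscaled = gamma_unscaled(1e-200, -1e-200, 1e-200)
tiny_scaled = gamma_scaled(1e-200, -1e-200, 1e-200)

print(big_unscaled, big_scaled)
print(tiny_unscaled, tiny_scaled)
```

In both cases the scaled form returns a finite, nonzero value close to sqrt(2) times the input magnitude, while the unscaled form returns inf or 0.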

I'll merge and tag this on Tuesday, if nobody has any objections.

@pkofod
Member

pkofod commented Nov 27, 2017

Unless you have a specific reason not to, just squash+merge and tag now (or whenever you feel like it).

@anriseth anriseth merged commit deb45fb into master Nov 27, 2017
@anriseth anriseth deleted the fixmt branch November 27, 2017 13:14