Use NaNMath #71

Merged: 2 commits into master from nanmath, Nov 6, 2017
Conversation

@anriseth (Collaborator) commented Nov 5, 2017

In case something goes wrong during the interpolation step.
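For context, a quick illustration of the behaviour the PR relies on: NaNMath's min/max drop NaN arguments, whereas Base's propagate them, so a NaN trial value from the interpolation step gets discarded rather than passed along (the numbers below are purely illustrative):

julia> using NaNMath

julia> min(0.5, NaN), max(0.5, NaN)        # Base propagates the NaN
(NaN, NaN)

julia> NaNMath.min(0.5, NaN), NaNMath.max(0.5, NaN)
(0.5, 0.5)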

Asbjørn Nilsen Riseth added 2 commits November 5, 2017 14:01
codecov bot commented Nov 5, 2017

Codecov Report

Merging #71 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff           @@
##           master     #71   +/-   ##
======================================
  Coverage    61.9%   61.9%           
======================================
  Files           7       7           
  Lines         546     546           
======================================
  Hits          338     338           
  Misses        208     208
Impacted Files Coverage Δ
src/backtracking.jl 94.28% <100%> (ø) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 675f5b5...ca92160.

@anriseth anriseth merged commit 9140e08 into master Nov 6, 2017
@anriseth anriseth deleted the nanmath branch November 6, 2017 10:34
@andreasnoack

This causes the following problem:

julia> NaNMath.min(ForwardDiff.Dual{Float64}(1), ForwardDiff.Dual{Float64}(1))
ERROR: StackOverflowError:
Stacktrace:
 [1] min(::ForwardDiff.Dual{Float64,Int64,0}, ::ForwardDiff.Dual{Float64,Int64,0}) at /Users/andreasnoack/.julia/packages/NaNMath/pEdac/src/NaNMath.jl:291 (repeats 80000 times)

julia> min(ForwardDiff.Dual{Float64}(1), ForwardDiff.Dual{Float64}(1))
Dual{Float64}(1)

which can happen with autodiff=:forward.

@pkofod (Member) commented Feb 8, 2019

@anriseth do you remember what part "could go wrong"?

@anriseth (Collaborator, Author) commented Feb 9, 2019

Hmm, no, unfortunately not. I remember running into an issue where NaNs propagated through, and the culprit may have been alphatmp in some weird edge case.

We can either revert the NaNMath operations, which are unlikely to affect as many people as the ForwardDiff problems, or add an explicit check for NaN instead
(I believe the problem was alphatmp, and not alpha, rhohi, or rholo).
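A minimal sketch of the "explicit check" alternative, using the variable names from this thread (the halving fallback is just an illustrative choice, not the package's logic):

# Guard the interpolated trial step explicitly instead of routing
# min/max through NaNMath.
if isnan(alphatmp)
    alphatmp = alpha / 2   # arbitrary finite fallback for this sketch
end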

@andreasnoack

I tried to change NaNMath.min and max in BackTracking locally and I immediately got gradients full of NaNs, so I probably hit the issue that made you open this PR.

@anriseth (Collaborator, Author) commented Feb 9, 2019

Ah, that's annoying. Do any of the other line search methods work for you?
I assume Static would work. We could make a vanilla backtracking as well, to get something potentially a little better than Static but which supports Duals.
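For reference, a rough sketch of what such a vanilla (Armijo) backtracking could look like; the names and interface are illustrative, not LineSearches' API:

# Illustrative only: plain Armijo backtracking with a fixed shrink factor,
# no interpolation step and hence no NaNMath calls. Works on Duals since
# it only uses *, <=, and isfinite.
# phi(alpha) = f(x + alpha*d), phi0 = phi(0), dphi0 = phi'(0) < 0.
function vanilla_backtracking(phi, phi0, dphi0; alpha0 = 1.0,
                              c1 = 1e-4, rho = 0.5, maxiter = 100)
    alpha = alpha0
    for _ in 1:maxiter
        phia = phi(alpha)
        # Accept once the value is finite and the Armijo condition holds.
        if isfinite(phia) && phia <= phi0 + c1 * alpha * dphi0
            return alpha
        end
        alpha *= rho
    end
    return alpha
end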

@pkofod (Member) commented Feb 9, 2019

The other searches don’t really work for Andreas, because they don’t have the initial finiteness backtracking step (it should just be included in the others as well).

@anriseth (Collaborator, Author) commented Feb 9, 2019

@pkofod (Member) commented Feb 9, 2019

Oh, that was added... did you try MoreThuente, Andreas?

@andreasnoack

I've locally defined DiffRules for NaNMath.min/max but then I hit the NaN issue right after. The culprit seems to be

α_0 = min(α_0, min(alphamax, ls.maxstep / norm(s, Inf)))

Since maxstep is initialized to Inf we get things like

julia> Inf/ForwardDiff.Dual{Float64}(1.0, 0.0)
Dual{Float64}(Inf,NaN)

Initially, I thought I could work around this by flipping the NANSAFE_MODE_ENABLED flag in ForwardDiff but it didn't work, see JuliaDiff/ForwardDiff.jl#179 (comment).

I guess it might be difficult to get correct behavior for Duals when we hit Infs. Maybe another solution could be to set ls.maxstep to a large but finite value.
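For reference, the NaN shows up in the partial because the division rule hits Inf * 0; a back-of-the-envelope check (not ForwardDiff's exact code path):

julia> -Inf * 0.0 / 1.0^2   # d/dx (c/x) = -c*ẋ/x² with c = Inf, x = 1.0, ẋ = 0.0
NaN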

@andreasnoack

I can of course just set maxstep, but it might be difficult to figure out a good value.

@anriseth (Collaborator, Author)

If I remember correctly, maxstep is something @cortner found useful (correct me if I'm wrong). By setting the default to Inf we thought it wouldn't make any difference to people anyway, but you've proved that wrong :p

Two possible ways around this:

  1. If you've defined DiffRules for NaNMath.min, then would it work to make the line
     α_0 = NaNMath.min(α_0, NaNMath.min(alphamax, ls.maxstep / norm(s, Inf))) ?
  2. We can explicitly check for NaN and skip the min operation when there are NaNs (or does that cause issues elsewhere?)

If you remove the line checking for maxstep locally, does the optimization run to convergence?

@cortner (Contributor) commented Feb 12, 2019

Correct - I introduced this as a stabilising mechanism.

Could one just check whether maxstep is Inf and in that case skip this operation?
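A minimal sketch of that guard, reusing the names from the quoted line (assumed surrounding code, not the actual diff):

# Skip the clamp when maxstep is left at its default Inf, so
# ls.maxstep / norm(s, Inf) is never evaluated on a Dual and the
# NaN partial never appears.
if isfinite(ls.maxstep)
    α_0 = min(α_0, min(alphamax, ls.maxstep / norm(s, Inf)))
else
    α_0 = min(α_0, alphamax)
end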

@andreasnoack

Regarding the initial comment, I think it should be fixed by JuliaDiff/DiffRules.jl#31.

Regarding the other issue, JuliaDiff/ForwardDiff.jl#386 might help here, but it's still an option that is off by default.

I'm not sure what the better solution is, but at least I've been able to get things working now for my use case by specifying a finite maxstep.
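For anyone hitting the same thing, a usage sketch of that workaround (assuming BackTracking exposes the maxstep keyword discussed above; the value 1e6 is arbitrary):

using Optim, LineSearches

f(x) = sum(abs2, x)

# A large but finite maxstep avoids Inf / Dual and its NaN partial.
ls = BackTracking(maxstep = 1e6)
res = optimize(f, randn(3), BFGS(linesearch = ls); autodiff = :forward)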
