Die on NaN loss #2363

Closed
jeffdonahue wants to merge 1 commit into BVLC:master from jeffdonahue:die-on-nan

@jeffdonahue
Contributor

This will make training die on a NaN loss, unless turned off by setting SolverParameter.die_on_nan to false. It's been saving the GPUs I've been using many pointless cycles...
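
For readers who want to see the idea in isolation, here is a minimal sketch of a die-on-NaN check in a generic training loop. It is not the code from this pull request: `train`, `step_fn`, and `NaNLossError` are hypothetical names, and the `die_on_nan` argument only mirrors the `SolverParameter.die_on_nan` flag described above. It assumes each training step returns the loss as a plain float.

```python
import math


class NaNLossError(RuntimeError):
    """Hypothetical error raised when training stops because the loss is NaN."""


def train(step_fn, max_iter, die_on_nan=True):
    """Call step_fn(iteration) up to max_iter times and collect the losses.

    step_fn is a hypothetical callable that runs one training step and
    returns the loss as a float. When die_on_nan is true, the loop stops
    at the first NaN loss instead of spending further iterations on a
    run that has already diverged.
    """
    losses = []
    for iteration in range(max_iter):
        loss = step_fn(iteration)
        if die_on_nan and math.isnan(loss):
            raise NaNLossError(f"loss is NaN at iteration {iteration}")
        losses.append(loss)
    return losses
```

Raising an exception rather than aborting the process is a choice made for this sketch only; a caller such as an interactive session can catch it and keep running, while passing `die_on_nan=False` restores the run-to-completion behavior.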

@jeffdonahue
Contributor Author

(Of course, this adds to the existing list of annoyances associated with Caffe's checks -- e.g. in PyCaffe -- but that's kind of a separate issue... Nonetheless I can see how it probably shouldn't be merged until those things are resolved.)

@longjon
Contributor

longjon commented Apr 25, 2015

Makes sense. Yes, it might be a bad idea to crash pycaffe in this normal situation. See also #1349.

@jeffdonahue
Contributor Author

Whoops, thanks -- should have searched... Also missed #1479 which did the same thing, sorry @sguada -- I'll leave this open, at least for now, in case anyone wants a current version of this to use privately, but if any version of this is merged it should be a rebase of #1479 rather than this one.

@jeffdonahue
Contributor Author

Closing this as it's a duplicate and no longer mergeable.
