weights become very large and then loss = nan #373
Comments
I had similar issues when the learning rate was too high for my network. Lowering the learning rate solved the problem for me (I don't think it was a numerical problem in Caffe). An indicator of a too-high learning rate is a diverging error value, and in my experiments this typically ends in NaN.
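A toy sketch of that divergence (plain NumPy, not Caffe code; the quadratic objective is just an illustrative assumption): gradient descent on f(w) = w^2 multiplies w by (1 - 2*lr) each step, so any lr with |1 - 2*lr| > 1 makes the iterates blow up, overflow to inf, and the next update computes inf - inf, which is NaN.

```python
import numpy as np

def train(lr, steps=1000):
    # Gradient descent on f(w) = w^2; the gradient is 2w.
    w = np.float64(1.0)
    for _ in range(steps):
        w = w - lr * 2.0 * w  # each step multiplies w by (1 - 2*lr)
    return w

print(train(lr=0.1))  # |1 - 2*lr| < 1: converges toward 0
print(train(lr=2.0))  # |1 - 2*lr| = 3: overflows to inf, then inf - inf = NaN
```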
I don't think it is the learning rate's fault. I dumped the weights of my network and found that only one bias value in the last inner product layer becomes 3e+29 after a specific iteration, while all the other weights look fine (e.g. between -10.0 and 10.0). Strange...
@chyh1990 I think I might have the same problem as you.
Try setting the initial bias to 0.1 in all layers, or add regularization.

Sergio
@chyh1990 Thanks for your help, you did me a big favor!
Re: #373 (comment)
(I assume this has been addressed, but feel free to reopen the issue should there be further questions.)
@chyh1990 Could you tell me how to get all the parameters of my training model? Thanks very much!
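A minimal pycaffe sketch of one way to do that, assuming the Python bindings are built; the deploy.prototxt and snapshot.caffemodel names are placeholders for your own files:

```python
import caffe

net = caffe.Net('deploy.prototxt', 'snapshot.caffemodel', caffe.TEST)
for layer_name, blobs in net.params.items():
    # blobs[0] holds the layer's weights, blobs[1] its biases (when present)
    for i, blob in enumerate(blobs):
        print(layer_name, i, blob.data.shape, blob.data.min(), blob.data.max())
```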
I understand that regularization keeps the biases and weights from taking unreasonable values, but I wonder how the loss can become NaN. Is it because those values drove the loss to -inf?
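A small NumPy sketch of one such mechanism (the 3e29 figure is borrowed from the weights reported earlier in this thread, and the naive softmax is an illustrative assumption): a logit that large overflows exp(), and both NaN and log(0) = -inf then leak into the loss.

```python
import numpy as np

logits = np.array([3e29, 1.0, -2.0], dtype=np.float32)
exp = np.exp(logits)      # exp(3e29) overflows float32 to inf
probs = exp / exp.sum()   # the inf entry becomes inf/inf = nan; the others become 0
print(-np.log(probs[0]))  # nan: the loss is no longer finite
print(np.log(probs[1]))   # -inf: log(0), the case asked about; negated, the loss is inf
```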
I use Caffe to train my CNN, but the loss becomes NaN after a few thousand iterations.
I dumped the weights just before that iteration and found that some weights in the inner product layers had become very large (e.g. +3e29). I checked my shuffled input data and made sure the values are in a reasonable range.
I can reproduce this in both GPU mode and CPU mode. Does this indicate a numeric problem in Caffe, or some other cause?
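For reference, a hedged pycaffe sketch of the kind of dump described above: load the snapshot saved just before the bad iteration and flag parameter blobs holding non-finite or extreme values. The file names and the 1e6 threshold are placeholders, not values from this issue.

```python
import numpy as np
import caffe

net = caffe.Net('train_val.prototxt', 'snapshot_before_nan.caffemodel', caffe.TEST)
for name, blobs in net.params.items():
    for i, blob in enumerate(blobs):
        peak = np.abs(blob.data).max()
        # Flag blobs that contain NaN/inf or have blown up in magnitude
        if not np.isfinite(blob.data).all() or peak > 1e6:
            print('suspicious parameter %s[%d]: max |value| = %g' % (name, i, peak))
```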