Can you help me solve this? WARNING: non-finite loss, ending training tensor([ nan, 0.22070, 0.00000, nan], device='cuda:0') #20

Open
shenghsiaowong opened this issue Dec 10, 2019 · 5 comments

Comments

@shenghsiaowong

WARNING: non-finite loss, ending training tensor([ nan, 0.22070, 0.00000, nan], device='cuda:0')
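For readers hitting the same warning: the printed tensor appears to hold one value per loss component, so a useful first step is to see which entries are non-finite. Below is a minimal sketch in plain Python. The helper name `nonfinite_components` is hypothetical, and the component names and their order (`box`, `obj`, `cls`, `total`) are an assumption for illustration; check the training script in use for the real order.

```python
import math

# Assumed (hypothetical) order of the values printed in the warning.
ASSUMED_NAMES = ("box", "obj", "cls", "total")


def nonfinite_components(values, names=ASSUMED_NAMES):
    """Return the names of the loss components that are nan or +/-inf."""
    return [name for name, v in zip(names, values) if not math.isfinite(v)]


# Values copied from the warning in this issue.
logged = [float("nan"), 0.22070, 0.00000, float("nan")]
print(nonfinite_components(logged))  # -> ['box', 'total']
```

Under that assumed order, the nan would sit in the box term and propagate into the total, while the objectness and class terms stay finite.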

@tanluren
Owner

Hi, this looks like gradient explosion. Did you load a pretrained model? Did you prepare your data according to https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data? Could you show the command you used to train?
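One way to act on the data-preparation question is to scan the label files for rows that break the expected format. The sketch below assumes the layout described in the linked wiki: one row per object, `class x_center y_center width height`, with the four box values normalized to the 0-1 range. The function name `find_bad_label_rows` and the exact checks are hypothetical, not part of the repository.

```python
def find_bad_label_rows(rows, num_classes):
    """Return (row_index, reason) pairs for rows that break the assumed format."""
    problems = []
    for i, row in enumerate(rows):
        parts = row.split()
        if len(parts) != 5:
            problems.append((i, "expected 5 fields"))
            continue
        try:
            cls = int(parts[0])
            x, y, w, h = (float(p) for p in parts[1:])
        except ValueError:
            problems.append((i, "non-numeric field"))
            continue
        if not 0 <= cls < num_classes:
            problems.append((i, "class index out of range"))
        elif not all(0.0 <= v <= 1.0 for v in (x, y, w, h)):
            # Also catches nan, since comparisons with nan are False.
            problems.append((i, "coordinate outside [0, 1]"))
        elif w <= 0.0 or h <= 0.0:
            problems.append((i, "zero-sized box"))
    return problems


# Example rows for a single-class dataset; only the first row is valid.
rows = [
    "0 0.5 0.5 0.2 0.3",
    "0 0.5 0.5 0.0 0.3",   # zero width
    "1 0.5 0.5 0.2 0.3",   # class 1 does not exist when num_classes=1
    "0 120 80 40 60",      # pixel coordinates instead of normalized ones
]
for index, reason in find_bad_label_rows(rows, num_classes=1):
    print(index, reason)
```

To use it on a real dataset, read each label file and pass its lines to the function; any reported row is worth fixing before touching the learning rate.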

@shenghsiaowong
Author

> Hi, this looks like gradient explosion. Did you load a pretrained model? Did you prepare your data according to https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data? Could you show the command you used to train?

This is my training command: python train.py --cfg cfg/yolov3-1cls.cfg --data data/voc.data --weights darknet53.conv.74

@tanluren
Owner

You may try lowering the learning rate and using yolov3.weights as the pretrained model.

@chouxianyu

@shenghsiaowong
Are you using focal loss?
If so, you may try removing it.

A few days ago I used focal loss and hit an error: obj_loss was nan, and the other losses were occasionally nan.
Following issue 842, I tried many things: lowering the lr, removing the tiny objects, lowering the weights of the losses in the hyp, and so on, but the error persisted.
Finally I removed focal loss from the obj_loss and the nan error no longer appeared.
So you may try removing the focal loss. For example, if your obj_loss is nan, just remove the focal loss on obj_loss.
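To make "removing the focal loss on obj_loss" concrete, here is a minimal plain-Python sketch of binary cross-entropy on a logit with an optional focal modulating factor. It is not the repository's code: the function names, the `use_focal` switch, and the simplified form with a single constant `alpha` are assumptions for illustration, and it handles hard targets (0 or 1) only.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for one logit and a 0/1 target."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def objectness_loss(logit, target, use_focal=False, gamma=1.5, alpha=0.25):
    """BCE, optionally scaled by the focal factor alpha * (1 - p_t) ** gamma.

    For a 0/1 target, p_t = exp(-bce), so 1 - p_t = -expm1(-bce).
    """
    bce = bce_with_logits(logit, target)
    if not use_focal:
        return bce  # plain BCE: the "focal loss removed" setting
    one_minus_pt = -math.expm1(-bce)  # stays in [0, 1] because bce >= 0
    return alpha * one_minus_pt ** gamma * bce


# Same prediction, with and without the focal factor.
print(objectness_loss(2.0, 1.0, use_focal=False))
print(objectness_loss(2.0, 1.0, use_focal=True))
```

Switching `use_focal` off for only the objectness term mirrors the fix described above, while the other loss terms keep whatever settings they had.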

@xiaozhangya

> You may try lowering the learning rate and using yolov3.weights as the pretrained model.

Hello, what if I have changed the network slightly and cannot use the pretrained network? How can I train a model from scratch? Thank you for your reply!
