nan tem_loss in the middle of the training #22
Thank you for the great work!
I tried your training program and printed the loss after each batch. The loss looks correct at the beginning of training, but then "nan" values appear, as shown below. I wonder if you have any guess about the possible reasons.
...
BMN training loss(epoch 0): tem_loss: 1.184, pem class_loss: 0.458, pem reg_loss: 0.024, total_loss: 1.883
BMN training loss(epoch 0): tem_loss: 1.184, pem class_loss: 0.458, pem reg_loss: 0.024, total_loss: 1.884
BMN training loss(epoch 0): tem_loss: 1.184, pem class_loss: nan, pem reg_loss: 0.024, total_loss: nan
BMN training loss(epoch 0): tem_loss: nan, pem class_loss: nan, pem reg_loss: 0.024, total_loss: nan
Thanks!
Comments
This problem also occurred during my training. You closed this issue; has it been resolved? @yangsusanyang
@leemengxing, I remember that when I changed the batch size the problem went away. I guess it may still appear for other settings.
I found that it happens when batch size = 1. @yangsusanyang @JJBOY
Same here. Increasing the batch size solves the issue. @leemengxing
The loss function contains a statistical balancing ratio. With a small batch size, such as 1, the number of samples with IoU > 0.7 or IoU > 0.9 can be zero, which leads to nan. @frostinassiky
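For reference, here is a minimal sketch of that failure mode, assuming PyTorch and a class-balanced binary loss structured along the lines of BMN's PEM classification loss. The function name, the 0.9 IoU threshold, and the clamping guard are illustrative assumptions, not the repository's actual code: when no entry in the batch exceeds the IoU threshold, the balancing ratio divides by zero and the loss turns nan; clamping the positive count keeps it finite.

```python
import torch

def pem_cls_loss_sketch(pred_score, gt_iou_map, valid_mask, eps=1e-6):
    """Class-balanced binary log loss (illustrative sketch, not the repo's code).

    Failure mode reported in this issue: with batch size 1, it can happen that
    no proposal has gt_iou_map > 0.9, so num_positive is 0, the balancing
    ratio becomes inf, and the loss (and then all weights) become nan.
    """
    pmask = (gt_iou_map > 0.9).float()               # positive proposals
    nmask = (gt_iou_map <= 0.9).float() * valid_mask  # valid negative proposals

    # Guard: clamp the positive count so the ratio stays finite even when the
    # batch contains no positive proposals.
    num_positive = torch.clamp(torch.sum(pmask), min=1.0)
    num_entries = num_positive + torch.sum(nmask)
    ratio = num_entries / num_positive

    # Clamp the denominator as well so an all-positive batch cannot divide by zero.
    coef_0 = 0.5 * ratio / torch.clamp(ratio - 1.0, min=eps)
    coef_1 = 0.5 * ratio
    loss_pos = coef_1 * torch.log(pred_score + eps) * pmask
    loss_neg = coef_0 * torch.log(1.0 - pred_score + eps) * nmask
    return -torch.sum(loss_pos + loss_neg) / num_entries
```

The same guard would apply to any other term that balances by a positive-sample count (e.g. the TEM binary loss), and it is consistent with the observation above that increasing the batch size makes the problem disappear: a larger batch makes an empty positive set far less likely.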