Function 'SqrtBackward' returned nan values in its 0th output. Bug in min_enclosing_box.py? #20

Closed
zye1996 opened this issue Apr 25, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@zye1996

zye1996 commented Apr 25, 2021

When backpropagating with the GIoU loss, the sqrt on the line here produces nan gradients whenever its input is 0.

Should we add a small offset to the value inside the sqrt? I tried 1e-8 and the training became unstable while 1e-16 is fine.

num = torch.sqrt( (y2-y1).square() + (x2-x1).square() +1e-16) + 1e-8

@lilanxiao
Owner

lilanxiao commented Apr 25, 2021

Thank you very much for the issue!

Yes, it's a bug. According to this link: pytorch/pytorch#6394, backpropagating through torch.sqrt() produces inf or nan gradients if the input is zero.

Actually, I'm surprised that PyTorch really puts an inf there. I thought the gradient was hard-coded to some very large but finite value. lol. My computer is currently occupied by other tasks, so I cannot run tests right now. But I will fix this ASAP and let you know.
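
For readers who want to reproduce the problem, here is a minimal sketch of how a zero input to torch.sqrt() can turn into nan gradients. The tensor names mirror the line quoted in this issue, but the values are made up for illustration and the snippet is not taken from min_enclosing_box.py.

```python
import torch

# Two coincident points, so the squared distance is exactly zero.
# The values are illustrative assumptions, not data from the repository.
x1 = torch.tensor([1.0], requires_grad=True)
y1 = torch.tensor([2.0], requires_grad=True)
x2 = torch.tensor([1.0])
y2 = torch.tensor([2.0])

# Same expression as in the issue, without any offset inside the sqrt.
dist = torch.sqrt((y2 - y1).square() + (x2 - x1).square())
dist.sum().backward()

# The forward value is a harmless 0, but the sqrt gradient at 0 is inf,
# and inf multiplied by the zero gradient of square() gives nan.
print(dist.item())
print(x1.grad, y1.grad)
```

The forward pass looks fine here, which is why the failure only shows up during backprop.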

@lilanxiao lilanxiao added the bug Something isn't working label Apr 25, 2021
@lilanxiao
Owner

Hi, I've changed that line to
num = torch.sqrt( (y2-y1).square() + (x2-x1).square() +1e-14)
The 1e-8 is no longer necessary because the result of the sqrt is now guaranteed to be positive. Please let me know if there are further issues.
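
A rough sketch of why the offset inside the sqrt helps, assuming the same coincident-point case as the failing input. The helper name safe_distance and its eps parameter are hypothetical and exist only for this illustration; the repository applies the offset inline.

```python
import torch

def safe_distance(x1, y1, x2, y2, eps=1e-14):
    # Hypothetical helper: eps keeps the sqrt argument strictly positive,
    # so the gradient 1 / (2 * sqrt(.)) stays finite.
    return torch.sqrt((y2 - y1).square() + (x2 - x1).square() + eps)

# Coincident points: the squared distance is exactly zero before eps is added.
x1 = torch.tensor([1.0], requires_grad=True)
y1 = torch.tensor([2.0], requires_grad=True)
x2 = torch.tensor([1.0])
y2 = torch.tensor([2.0])

dist = safe_distance(x1, y1, x2, y2)
dist.sum().backward()

# With the offset, the gradients are finite instead of nan.
print(torch.isfinite(x1.grad).all(), torch.isfinite(y1.grad).all())
```

The offset only matters when the squared distance is at or near zero; for ordinary box sizes it is far below the precision of the sum and leaves the result effectively unchanged.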

@zye1996
Author

zye1996 commented Apr 28, 2021

> Hi, I've changed that line to
> num = torch.sqrt( (y2-y1).square() + (x2-x1).square() +1e-14)
> The 1e-8 is no longer necessary because the result of the sqrt is now guaranteed to be positive. Please let me know if there are further issues.

Hi, I have verified that the fix works for backprop. I can also confirm that the DIoU loss improves the detector's performance considerably.

@zye1996 zye1996 closed this as completed Apr 28, 2021