
fp16 training loss=nan #48

Open
ilaij0810 opened this issue Nov 16, 2022 · 2 comments

@ilaij0810

Hi, thank you for your work!
I ran into a problem: when I enable fp16, the training loss is always nan. I found that in the RESA module, after the down, up, right, and left feature fusion, the feature values become very large; many exceed 65504 (the fp16 maximum), so they actually become inf. How can I do mixed-precision (fp16) training without losing too much performance?
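One common workaround (a minimal sketch under assumptions, not code from this repo) is to run the overflow-prone module in fp32 while the rest of the network stays under autocast; `backbone`, `resa`, and `decoder` below are placeholder names for the repo's actual module attributes:

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Stand-in layers; in the real repo these would be the backbone,
        # the RESA aggregator, and the decoder head.
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)
        self.resa = nn.Conv2d(8, 8, 3, padding=1)
        self.decoder = nn.Conv2d(8, 2, 1)

    def forward(self, x):
        feat = self.backbone(x)  # runs in fp16 under autocast
        # Disable autocast for the fusion step and cast its input up to
        # fp32, so accumulated values above 65504 no longer overflow.
        with torch.cuda.amp.autocast(enabled=False):
            feat = self.resa(feat.float())
        return self.decoder(feat)

# Usage: the rest of training stays in mixed precision, e.g.
# with torch.cuda.amp.autocast():
#     out = Net().cuda()(images)
```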

@ilaij0810
Author

I have tried some methods: adding BN after the conv in the RESA module,

[screenshot: conv followed by BatchNorm in the RESA module]

but then no lanes are detected.
If I decrease the value of alpha (e.g., alpha=0.1), or change the activation (originally ReLU) to sigmoid or tanh, won't it lose too much performance?
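For reference, a minimal sketch of a single RESA-style shift-and-add step with a clamp just below the fp16 maximum; the single-direction conv and the alpha parameter are simplifications, and the clamp is an assumption, not the repo's code:

```python
import torch
import torch.nn.functional as F

def resa_step(x: torch.Tensor, conv: torch.nn.Module, alpha: float) -> torch.Tensor:
    # Simplified fusion step: the real RESA module shifts the feature map
    # along four directions (down, up, right, left) and iterates.
    x = x + alpha * F.relu(conv(x))
    # Clamp just below the fp16 maximum (65504) so repeated accumulation
    # stays finite in half precision; a near no-op for typical fp32 values.
    return torch.clamp(x, min=-65000.0, max=65000.0)
```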

Looking forward to your reply.

@2696120622

2696120622 commented Jul 10, 2023

@ilaij0810
I have the same loss=nan problem. If I set alpha to 1.0, I no longer get a nan loss, but the training does not converge.
Do you have any solutions?
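One alternative that may be worth trying (assuming an Ampere-or-newer GPU and PyTorch >= 1.10; a sketch, not something confirmed in this thread): autocast to bfloat16 instead of float16. bfloat16 shares fp32's exponent range, so the fused features cannot overflow at 65504, and no loss scaling is needed:

```python
import torch

model = torch.nn.Conv2d(3, 8, 3, padding=1).cuda()  # stand-in for the detector
x = torch.randn(2, 3, 64, 64, device="cuda")

# bf16 autocast: same exponent range as fp32, so no overflow to inf at 65504.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    out = model(x)

loss = out.float().mean()  # compute the loss in fp32
loss.backward()
```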
