[maybe a bug] loss nan #69

Open
xmy0916 opened this issue Jan 17, 2022 · 4 comments

Comments

@xmy0916
Contributor

xmy0916 commented Jan 17, 2022

https://github.com/yitu-opensource/T2T-ViT/blob/main/models/token_performer.py#L18
I train with fp16 enabled, so the 1e-8 on this line, which is meant to prevent division by zero, is too small for my setup. The loss computed by the network becomes NaN because of this code:
https://github.com/yitu-opensource/T2T-ViT/blob/main/models/token_performer.py#L50
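
To illustrate the report above: the smallest positive fp16 value is 2**-24 (about 5.96e-8), so 1e-8 rounds to exactly zero once it is stored in half precision. The sketch below uses only the Python standard library to emulate fp16 rounding. `to_fp16` is a hypothetical helper, and the `D + epsilon` denominator is an assumed shape for the guarded division, not a copy of the linked line.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The struct format "e" is a 16-bit float; packing then unpacking rounds x.
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Smallest positive (subnormal) fp16 value, about 5.96e-8.
FP16_MIN_SUBNORMAL = 2.0 ** -24

# The epsilon from the issue, as it would be stored in fp16.
eps_fp16 = to_fp16(1e-8)

# Assumed worst case: the normalizer D has itself underflowed to zero.
D = to_fp16(0.0)
denominator = to_fp16(D + eps_fp16)

print(eps_fp16)     # 0.0
print(denominator)  # 0.0
```

With a denominator of exactly zero, an IEEE 754 division yields inf for a nonzero numerator and NaN for a zero numerator, which is consistent with the NaN loss described in this comment.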

@yuanli2333
Collaborator

Yes, you may be right. We can try changing 1e-8 to a larger value. Did you try it?

@xmy0916
Contributor Author

xmy0916 commented Jan 18, 2022

@yuanli2333 I tested 1e-4, but it doesn't fix the problem either.

@jiawangbai

@xmy0916 In my implementation, 1e-6 fixes this problem, with bs=2048 and lr=1e-3.
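
As a rough check on the three epsilon values mentioned in this thread, the sketch below classifies each one after rounding to fp16. It uses only the standard library, and `to_fp16` and `classify` are hypothetical helpers written for this illustration. It only shows how each constant is represented in half precision. It does not reproduce either training setup.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Smallest positive normal fp16 value, about 6.10e-5.
FP16_MIN_NORMAL = 2.0 ** -14


def classify(eps: float) -> str:
    """Describe how a positive constant is represented after fp16 rounding."""
    rounded = to_fp16(eps)
    if rounded == 0.0:
        return "underflows to zero"
    if rounded < FP16_MIN_NORMAL:
        return "subnormal"
    return "normal"


# The three values discussed in this issue.
for eps in (1e-8, 1e-6, 1e-4):
    print(eps, classify(eps))
```

Both 1e-6 and 1e-4 stay nonzero in fp16, so the different outcomes reported above are probably not explained by the rounding of epsilon alone. Other factors, such as the batch size and learning rate quoted in this comment, may also play a role.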

@xmy0916
Contributor Author

xmy0916 commented Jan 19, 2022

Thanks!
