Issues: sail-sg/Adan
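In my CNN model, with lr=0.01, mAP improves quickly during epochs 20-30 but later becomes NaN. With lr=0.001 it does not go straight to NaN, but the results are poor. What problem does this behavior indicate? Thanks!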
#42
opened Dec 10, 2023 by
liiicon
Concrete weight decay configuration for GPT-2 pretraining
#40
opened Aug 31, 2023 by
DesperateExplorer