Loss increases during pretraining #35
Hi @alcinos, @ashkamath, @nguyeho7,

I hope you are doing well.

I was trying to pretrain MDETR using the provided instructions. What I noticed is that the loss started increasing during the 20th epoch: it had decreased steadily to around 39 by the 19th epoch, then jumped to around 77 after the 20th. What could be the reason for this? Note that I am using the EfficientNetB5 backbone. The log.txt is attached.

Thanks

log.txt
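For anyone triaging a similar jump, here is a minimal sketch for locating it in log.txt. It assumes the DETR-style log format that MDETR inherits (one JSON dict per epoch with `epoch` and `train_loss` keys); the 20% threshold is arbitrary, so adjust it to taste.

```python
import json

# Load per-epoch stats; assumes one JSON dict per line, DETR-style.
with open("log.txt") as f:
    stats = [json.loads(line) for line in f if line.strip()]

# Flag any epoch whose training loss grew by more than 20% over the
# previous epoch (an arbitrary threshold for this sketch).
for prev, curr in zip(stats, stats[1:]):
    if curr["train_loss"] > 1.2 * prev["train_loss"]:
        print(f"loss jump at epoch {curr['epoch']}: "
              f"{prev['train_loss']:.2f} -> {curr['train_loss']:.2f}")
```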
Comments

Hi @mmaaz60,

Hmm, that's quite surprising then. Nothing fishy happened, like the job getting preempted and then restarted?

Thank you. Nothing like that happened during training. I am using transformers version 4.5.1.

I actually stopped and then resumed the training from the 19th epoch, and it has now reached the 25th epoch and seems to be converging. I'm not sure what went wrong previously, as I didn't change anything when resuming.
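Since the jump disappeared after resuming, one thing worth double-checking is that a resume restores everything, not just the model weights. Below is a minimal sketch of the usual PyTorch pattern, assuming MDETR follows DETR's checkpoint convention (a dict with `model`, `optimizer`, `lr_scheduler`, and `epoch` keys); the toy model, optimizer, and scheduler are hypothetical stand-ins for the objects built by the training script.

```python
import torch
from torch import nn

# Hypothetical stand-ins; the real objects come from MDETR's training script.
model = nn.Linear(10, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=35)

# What a DETR-style checkpoint is assumed to contain.
torch.save({
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
    "lr_scheduler": lr_scheduler.state_dict(),
    "epoch": 18,  # 0-indexed, i.e. the 19th epoch
}, "checkpoint.pth")

# On resume, all three states matter: restoring only the model weights
# silently resets AdamW's moment estimates and the LR schedule, which
# can show up as exactly this kind of loss jump.
checkpoint = torch.load("checkpoint.pth", map_location="cpu")
model.load_state_dict(checkpoint["model"])
optimizer.load_state_dict(checkpoint["optimizer"])
lr_scheduler.load_state_dict(checkpoint["lr_scheduler"])
start_epoch = checkpoint["epoch"] + 1
```

If the repository's resume path was used as documented, this should already be handled; a partially restored optimizer or a reset LR schedule are just the usual suspects for this symptom.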