Weird loss progression #10

Since I am training the model on VLOG with a very small batch size, the training is going to take forever (about 8 days). Because I don't want to wait that long, I will stop the training before reaching 30 epochs. However, the losses shown in the logs seem odd to me. Could someone provide the log of a complete training run so I can compare the losses and check whether my early results are normal? Thanks.

Comments
An example log follows. Note that the current code will not give you exactly the same loss values, but the trend of how the loss develops should be similar.

[Example training log table: Learning Rate, Train Loss, Theta Loss, Theta Skip Loss]
The only things I modified in your code are the ...

The train loss is slightly increasing instead of decreasing like yours, and the two other losses are barely changing... Do you have any idea what is going on?
A very small batch size will work badly with batch norm. You will also need to adjust the learning rate according to the batch size: if you divide the batch size by 8, you should also divide the learning rate by 8.
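For anyone hitting the same problem, here is a minimal sketch of that linear learning-rate scaling rule in PyTorch. The base batch size, base learning rate, and model below are made-up placeholders for illustration, not this repository's actual defaults:

```python
import torch

# Linear LR scaling rule: scale the learning rate in proportion to the
# batch size. All concrete values here are hypothetical placeholders.
base_batch_size = 32   # batch size the reference learning rate was tuned for (assumed)
base_lr = 2e-4         # reference learning rate (assumed)

my_batch_size = 4      # e.g. batch size cut by 8x to fit in GPU memory
scaled_lr = base_lr * my_batch_size / base_batch_size  # LR reduced by the same 8x

model = torch.nn.Linear(10, 10)  # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=scaled_lr, momentum=0.9)
print(f"Using lr={scaled_lr:.2e} for batch size {my_batch_size}")
```

Note that scaling the learning rate only addresses part of the problem: with only a few samples per batch, the batch-norm statistics themselves become very noisy, which on its own can make the losses behave oddly.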
Thank you for your fast answer; I will try that.