The problems of mismatched evaluation metrics #27
Comments
Hi, thank you for your interest in the project. Also, I'm using
Thanks for your reply. I have also been looking into possible causes of this problem; I think either the optimizer or the lr scheduler may be responsible, so I will try different configs for both to find the best result. Also, are you currently using the approach of manually stopping training and then retraining with the previous model as a starting point?
I'm interrupting the training after some time (I keep watching the progress in wandb, and if the loss stagnates I interrupt) and then resume from the last saved checkpoint (it saves on KeyboardInterrupt after the first epoch): `python train.py --config path-to-checkpoints/model-name/config.yaml --resume`, something like that.
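For illustration, here is a minimal sketch of that interrupt-and-resume pattern. The checkpoint keys and the way the loss is computed are assumptions for the example; the repo's actual `train.py` may do this differently.

```python
# Minimal sketch of the interrupt-and-resume pattern described above.
# Checkpoint keys and the loss computation are assumptions, not this repo's code.
import torch

def train(model, optimizer, scheduler, loader, ckpt_path, start_epoch=0, num_epochs=100):
    epoch = start_epoch
    try:
        for epoch in range(start_epoch, num_epochs):
            for batch in loader:
                optimizer.zero_grad()
                loss = model(batch)      # assumed: the model returns its training loss
                loss.backward()
                optimizer.step()
            scheduler.step()
    except KeyboardInterrupt:
        # On Ctrl+C, save everything needed to resume later.
        torch.save({
            "epoch": epoch,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "scheduler": scheduler.state_dict(),
        }, ckpt_path)
        print(f"Interrupted; checkpoint saved to {ckpt_path}")

def resume(model, optimizer, scheduler, ckpt_path):
    # Restore model, optimizer, and scheduler state; return the epoch to continue from.
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    scheduler.load_state_dict(ckpt["scheduler"])
    return ckpt["epoch"] + 1
```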
With the StepLR scheduler this does not happen anymore.
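For context, a minimal sketch of using PyTorch's StepLR; the model, learning rate, `step_size`, and `gamma` below are placeholders, not the values from this repo's config.

```python
# Minimal StepLR sketch; hyperparameter values are placeholders.
import torch
from torch.optim.lr_scheduler import StepLR

model = torch.nn.Linear(512, 512)                        # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)   # halve the lr every 10 epochs

for epoch in range(30):
    # ... one training epoch would run here ...
    scheduler.step()  # decay the learning rate on a fixed schedule
```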
Hi, thank you for your excellent work. I reproduced your work with the config file named default.yaml, but could not get the same result (BLEU = 0.74). I also found that the train loss increased after a few epochs. Can you give some advice?
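One common source of mismatched BLEU numbers is using a different BLEU implementation or tokenization at evaluation time. A minimal sketch with sacrebleu is below; this is an assumption for illustration, not necessarily the evaluation script used in this repo, and note that sacrebleu reports scores on a 0-100 scale.

```python
# Minimal BLEU sketch with sacrebleu; assumed for illustration,
# not necessarily the evaluation code used in this repo.
import sacrebleu

hypotheses = ["the cat sat on the mat"]            # model outputs (detokenized strings)
references = [["the cat is sitting on the mat"]]   # one reference list per reference set

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)  # sacrebleu reports BLEU on a 0-100 scale
```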