
The problems of mismatched evaluation metrics #27

Closed
Root970103 opened this issue Jul 20, 2021 · 4 comments

Comments

@Root970103

Hi, thank you for your excellent work. I tried to reproduce your results with the config file named default.yaml, but I cannot get the same result (BLEU = 0.74). I also found that the training loss increases after a few epochs. Can you give some advice?

[screenshot: training loss curve]

@lukas-blecher
Owner

Hi, thank you for your interest in the project.
I've had the same problem and I think it is because of the learning rate scheduler. I've created a discussion post here: #11
Restarting the training is not an optimal solution but it seems to work. As stated, maybe another scheduler is a better fit. Any ideas?

Also, I'm using config.yaml right now.

@Root970103
Author

Thanks for your reply. I have also been looking into possible causes of this problem. I suspect either the optimizer or the lr scheduler could be responsible, so I will try different configurations for both (sketched below) to find the best result.

Also, are you currently using the approach of manually stopping the training and then retraining with the previous model as the starting point?
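A minimal sketch of that experiment in plain PyTorch (the build_optimization helper and the concrete hyperparameter values are illustrative placeholders, not taken from the repository's configs):

```python
import torch
from torch import nn, optim

def build_optimization(model: nn.Module, optimizer_name: str, scheduler_name: str):
    # Vary only the component under suspicion; keep the base learning rate fixed.
    if optimizer_name == "adam":
        opt = optim.Adam(model.parameters(), lr=1e-4)
    elif optimizer_name == "adamw":
        opt = optim.AdamW(model.parameters(), lr=1e-4)
    else:
        raise ValueError(f"unknown optimizer: {optimizer_name}")

    if scheduler_name == "steplr":
        sched = optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.5)
    elif scheduler_name == "cosine":
        sched = optim.lr_scheduler.CosineAnnealingLR(opt, T_max=50)
    else:
        raise ValueError(f"unknown scheduler: {scheduler_name}")
    return opt, sched
```

Running the same training once per combination should make it clearer whether the loss increase follows the optimizer or the scheduler.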

@lukas-blecher
Owner

I interrupt the training after some time (I keep an eye on the progress in wandb and interrupt when the loss stagnates) and then resume from the last saved checkpoint (a checkpoint is saved on KeyboardInterrupt once the first epoch has finished).

python train.py --config path-to-checkpoints/model-name/config.yaml --resume

something like that.
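For illustration, a rough sketch of that interrupt-and-resume pattern (the run_one_epoch stub, the checkpoint path, and the save format here are placeholders, not necessarily what train.py actually does):

```python
import torch
from torch import nn, optim

def run_one_epoch(model, optimizer):
    # Stand-in for the real training loop over the data loader.
    pass

def train(model, optimizer, scheduler, epochs, checkpoint_path="checkpoint.pth"):
    epoch = 0
    try:
        for epoch in range(epochs):
            run_one_epoch(model, optimizer)
            scheduler.step()
    except KeyboardInterrupt:
        # Save everything needed to resume training later, then stop cleanly.
        torch.save({
            "epoch": epoch,
            "model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "scheduler": scheduler.state_dict(),
        }, checkpoint_path)
        print(f"Interrupted, checkpoint written to {checkpoint_path}")
```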

@lukas-blecher
Owner

With the StepLR scheduler this does not happen anymore.
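For anyone who wants to try the same change, here is a sketch of a StepLR setup (torch.optim.lr_scheduler.StepLR is the standard PyTorch class; the concrete step_size and gamma values below are guesses, not the ones used in this repo):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 10)  # stand-in for the real model
optimizer = optim.Adam(model.parameters(), lr=1e-4)

# Multiply the learning rate by gamma every `step_size` scheduler steps,
# giving a monotonically decreasing, non-cyclic schedule.
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(100):
    # ... training and validation for one epoch ...
    scheduler.step()
```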
