High perplexity when Further Pre-Training #19

Open
MrRace opened this issue Oct 2, 2021 · 1 comment
Comments


MrRace commented Oct 2, 2021

When I do further pre-training on my own data, the perplexity (ppl) is far too high, for example 709. I have 3,582,619 examples and use batch size = 8, 3 epochs, learning rate = 5e-5. Is there any advice? Thanks a lot!

Owner

xuyige commented Oct 9, 2021

The further pre-training task is masked language modeling, not (causal) language modeling, so I think perplexity may not be a good metric here.
Can you set your batch size larger or use gradient accumulation? You can also check the accuracy of the masked language model, as well as the loss curve, to monitor the further pre-training.
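For reference, here is a minimal sketch of what the suggestion above could look like, assuming the further pre-training loop is written with PyTorch and the Hugging Face `transformers` library (the repository itself may use a different setup, and `encoded_dataset` is a placeholder for your own pre-tokenized corpus). It combines gradient accumulation (to get a larger effective batch size) with tracking masked-token accuracy and average loss:

```python
# Sketch only: assumes Hugging Face transformers + PyTorch, and that
# `encoded_dataset` is your own pre-tokenized dataset of input_ids.
import math
import torch
from torch.utils.data import DataLoader
from transformers import BertForMaskedLM, BertTokenizerFast, DataCollatorForLanguageModeling

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.train()

# The collator masks 15% of tokens and sets labels to -100 at unmasked positions.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
loader = DataLoader(encoded_dataset, batch_size=8, shuffle=True, collate_fn=collator)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
accumulation_steps = 8  # effective batch size = 8 * 8 = 64

correct, total, running_loss = 0, 0, 0.0
for step, batch in enumerate(loader):
    outputs = model(**batch)
    # Scale the loss so gradients average over the accumulated mini-batches.
    (outputs.loss / accumulation_steps).backward()

    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()

    # Masked-LM accuracy: only positions with labels != -100 were masked.
    with torch.no_grad():
        preds = outputs.logits.argmax(dim=-1)
        mask = batch["labels"] != -100
        correct += (preds[mask] == batch["labels"][mask]).sum().item()
        total += mask.sum().item()
        running_loss += outputs.loss.item()

    if (step + 1) % 1000 == 0:
        avg_loss = running_loss / 1000
        print(f"step {step + 1}: mlm_acc={correct / total:.4f}, "
              f"avg_loss={avg_loss:.4f}, pseudo-ppl={math.exp(avg_loss):.2f}")
        correct, total, running_loss = 0, 0, 0.0
```

Note that `exp(avg_loss)` here is only a pseudo-perplexity over masked positions, which is why the loss curve and masked-token accuracy are more informative signals than a single ppl number.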
