Question about expected results #98

Closed
richarddwang opened this issue Sep 30, 2020 · 1 comment

Comments

richarddwang commented Sep 30, 2020

Hi @clarkkev,

  1. How long did you train ELECTRA-Small OWT?
    In the expected results section of README.md, you mention "OWT is the OpenWebText-trained model from above (it performs a bit worse than ELECTRA-Small due to being trained for less time and on a smaller dataset)". How many steps did you train it for? Also, AFAIK OpenWebText should be larger than Wikibooks; does that mean you used only part of the data?

  2. How were the scores in the expected results obtained?
    You also mention "The below scores show median performance over a large number of random seeds." Does that mean the listed scores come from several models pretrained from scratch with different random seeds, each fine-tuned for 10 runs with random seeds, or from a single pretrained model fine-tuned for 10 runs with different random seeds?

  3. Did you use double_unordered when training the models for the expected results?

@richarddwang
Author

Below is Kevin's original reply to my email.

  1. It was trained for 1 million steps. I'm actually not sure how many epochs over the dataset that is, but the (public) OWT dataset is only about 50% bigger than Wikibooks, I believe.

  2. They are from the same pre-trained checkpoint with different random seeds for fine-tuning. The number of runs was at least 10, but many more (I think 100) for some tasks; I left the eval jobs running for a while and took the median of all the results.

  3. Yes
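
For anyone trying to reproduce the numbers, the procedure described in answer 2 (one pre-trained checkpoint, many fine-tuning seeds, report the median) could be sketched roughly as below. This is only an illustration, not the code used for the README: `median_score` is a hypothetical helper, and the scores in `example_runs` are made-up placeholders rather than real ELECTRA results.

```python
from statistics import median


def median_score(scores_by_seed):
    """Return the median dev-set score across fine-tuning runs.

    scores_by_seed maps a fine-tuning seed to the score of that run.
    All runs are assumed to start from the same pre-trained checkpoint.
    """
    if not scores_by_seed:
        raise ValueError("need at least one fine-tuning run")
    return median(scores_by_seed.values())


# Hypothetical scores keyed by fine-tuning seed (placeholder numbers only).
example_runs = {0: 78.5, 1: 79.25, 2: 77.0, 3: 80.0, 4: 78.0}
print(median_score(example_runs))
```

The median is less sensitive than the mean to an occasional diverged fine-tuning run, which matters more when the number of runs differs between tasks (at least 10, possibly around 100 for some).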
