
Time required to train a transformer language model with fairseq #8

Open
Randy-1009 opened this issue Dec 24, 2021 · 1 comment

Comments

@Randy-1009

I am using 3 Tesla V100 GPUs to train a transformer-based model with fairseq. The parameters are set the same as in the given training command. However, each epoch takes a long time (more than 2 hours). Is this normal? I'd like to know how long training took when you did this research. Thank you~
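
For reference, this is a minimal sketch of the kind of fairseq-train command I am running; the data-bin path, save directory, and the exact hyperparameter values below are placeholders rather than the repository's actual settings:

```bash
# Sketch of a fairseq-train language-modeling run (paths and hyperparameters are placeholders).
fairseq-train data-bin/my-corpus \
    --task language_modeling \
    --arch transformer_lm --share-decoder-input-output-embed \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr 0.0005 --lr-scheduler inverse_sqrt --warmup-updates 4000 --warmup-init-lr 1e-07 \
    --dropout 0.1 --weight-decay 0.01 \
    --tokens-per-sample 512 --sample-break-mode none \
    --max-tokens 2048 --update-freq 16 \
    --fp16 \
    --save-dir checkpoints/transformer_lm
```

As far as I know, fairseq splits each batch across all visible GPUs automatically, so with 3 V100s the per-epoch time mostly depends on the corpus size, --max-tokens, and --update-freq.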

@Randy-1009
Author

I can stop training once the perplexity (PPL) reaches 29.xx, right? After 20 epochs, the perplexity is now 30.16.
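
For reference, the perplexity I am reading off is the exponentiated average per-token cross-entropy on the validation set; if I understand the fairseq logs correctly (the loss is reported in base 2), the relationship is roughly

$$\text{PPL} = 2^{H}, \qquad H = -\frac{1}{N}\sum_{i=1}^{N}\log_2 p(w_i \mid w_{<i}).$$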
