Time required to train a transformer-based language model with fairseq #8

Randy-1009 · 2021-12-24T05:04:37Z

I am using 3 Tesla V100 GPUs to train a transformer-based model with fairseq. The parameters are the same as in the provided training command. However, each epoch takes a long time (more than 2 hours). Is this normal? I'd like to know how long training took when you did this research. Thank you~

Randy-1009 · 2021-12-24T05:50:40Z

I can stop training once the perplexity (PPL) reaches 29.xx, right? After 20 epochs, the perplexity is now 30.16.
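For reference, the numbers above can be put on a common scale with a small sketch. It assumes the reported perplexity is computed as 2 raised to the per-token loss (base-2 loss, which is an assumption about the logging setup here), and it treats "29.xx" as "any value below 30.0". The helper names are hypothetical and not part of fairseq.

```python
import math


def ppl_to_loss(ppl: float, base: float = 2.0) -> float:
    """Convert perplexity to per-token cross-entropy in the given log base."""
    return math.log(ppl, base)


def loss_to_ppl(loss: float, base: float = 2.0) -> float:
    """Inverse of ppl_to_loss: perplexity = base ** loss."""
    return base ** loss


def training_hours(epochs: int, hours_per_epoch: float) -> float:
    """Total wall-clock time, assuming every epoch takes the same time."""
    return epochs * hours_per_epoch


# Figures from the post: PPL 30.16 after 20 epochs, more than 2 hours per epoch.
current_loss = ppl_to_loss(30.16)
# Assumption: "29.xx" is read as "below 30.0"; the exact target is not given.
threshold_loss = ppl_to_loss(30.0)
loss_gap = current_loss - threshold_loss

# Lower bound on time already spent, since each epoch takes more than 2 hours.
hours_so_far = training_hours(20, 2.0)

print(f"current loss (base 2): {current_loss:.4f}")
print(f"loss still to shave off to get below PPL 30: {loss_gap:.4f}")
print(f"time spent so far: at least {hours_so_far:.0f} hours")
```

The remaining gap is small in loss terms, so how many more epochs it takes depends on how quickly the validation curve is still falling, which the post does not state.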
