How long does it take to train a sequence_parallel BERT? #1365
Unanswered · tianboh asked this question in Community | Q&A · 0 replies
Hi, I am following the documentation to train in the Docker environment, using the default config.
I am using 4 V100 GPUs to train this model, but after an hour of training I have only completed ~2400 iterations. The config specifies 1,000,000 iterations in total, so at this rate training would take ~17 days to finish. Is this normal, or should I reduce the number of training iterations?
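For reference, here is the back-of-envelope calculation behind the ~17-day figure: it simply extrapolates the observed throughput (about 2400 iterations in the first hour) over the full iteration budget, assuming throughput stays constant.

```python
# ETA estimate, assuming the throughput observed in the first hour
# (~2400 iterations) remains constant for the rest of the run.
observed_iters = 2400      # iterations completed so far
observed_hours = 1.0       # wall-clock hours elapsed
total_iters = 1_000_000    # total train iterations in the config

iters_per_hour = observed_iters / observed_hours
eta_days = total_iters / iters_per_hour / 24
print(f"Estimated total training time: {eta_days:.1f} days")  # → 17.4 days
```

This matches the ~17-day estimate in the question; the real figure can drift if throughput changes after warmup or checkpointing overhead kicks in.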