About GPU and processing time in pretraining of LayoutLMv3 #917
Below are the details of our training setting using V100 (32G) GPUs:
I recommend using fewer training steps if you want to reduce training time on the available hardware. For example, 150,000 steps should be enough to achieve only slightly worse results on most tasks.
Thank you for your very helpful answer! I'll try that number of steps.
I am glad the answer was helpful to you! There are two ways to use fewer training steps. I use the same warm-up ratios as in the paper for both methods.
Thanks for the additional advice! I see I was wrong about the warm-up: it's a ratio, not a fixed number of steps.
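For anyone else who hits the same confusion: because the warm-up is specified as a fraction of the total schedule, shortening the schedule automatically shortens the warm-up by the same factor. A minimal sketch of the arithmetic (the ratio 0.048 and step counts here are illustrative examples, not values confirmed in this thread):

```python
def warmup_steps(total_steps: int, warmup_ratio: float) -> int:
    """Number of warm-up steps for a given schedule length and warm-up ratio."""
    return int(total_steps * warmup_ratio)

# Keeping the same ratio means a shorter schedule warms up for fewer steps.
print(warmup_steps(500_000, 0.048))  # → 24000
print(warmup_steps(150_000, 0.048))  # → 7200
```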
Describe
Model I am using (UniLM, MiniLM, LayoutLM ...): LayoutLMv3
Question
I would like to estimate how long pretraining takes, so I would like to know which GPUs you used and how long pretraining took.
In our environment, it looks like pretraining the base model would take about 2-3 months on 4 x A100 (80GB) ...
I would also like to check whether this estimate is reasonable.
Condition:
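For producing estimates like the 2-3 month figure above, wall-clock time scales roughly linearly with the number of steps and inversely with measured throughput. A rough sketch with hypothetical numbers (`steps_per_second` must be measured on your own hardware; this ignores checkpointing and evaluation overhead):

```python
def pretraining_days(total_steps: int, steps_per_second: float) -> float:
    """Rough wall-clock estimate in days from a measured training throughput."""
    return total_steps / steps_per_second / 86_400  # 86,400 seconds per day

# Hypothetical: 500k steps at a measured 0.1 steps/s.
print(round(pretraining_days(500_000, 0.1), 1))  # → 57.9
```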