This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

pretrain a model with the MLM objective #7

Closed
liujiqiang999 opened this issue Feb 20, 2019 · 3 comments

Comments

@liujiqiang999

Hi, how many GPUs are used when training a model with the MLM objective?

@glample
Contributor

glample commented Feb 20, 2019

Hi,

In practice, we observed that bigger batches seem to help, and they significantly accelerate overall training. I would suggest using at least 8 GPUs, especially if you train a big model.
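For reference, the batch size that actually feeds one optimizer update under data-parallel training scales with the number of GPUs (and with gradient accumulation, if used). A minimal sketch of that arithmetic, with hypothetical names not taken from XLM's code:

```python
def effective_batch_size(per_gpu_batch, n_gpus, accum_steps=1):
    """Total examples contributing to one optimizer update under
    data-parallel training, optionally with gradient accumulation.
    All names here are illustrative, not XLM's actual parameters."""
    return per_gpu_batch * n_gpus * accum_steps

# e.g. 32 sequences per GPU on 8 GPUs -> 256 sequences per update
print(effective_batch_size(32, 8))
```

This is why 8 GPUs help: the same per-GPU batch yields an 8x larger effective batch per update.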

@glample glample closed this as completed Feb 21, 2019
@jiahuigeng

Have you successfully implemented gradient accumulation or multi-GPU settings?

@glample
Contributor

glample commented Apr 9, 2019

Not yet. It will be easy in the next version of PyTorch distributed, so we will wait for that.
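For readers unfamiliar with the technique being discussed: gradient accumulation averages gradients over several micro-batches before applying a single optimizer step, simulating a larger batch on limited hardware. A minimal framework-free sketch of the idea (the function and its arguments are hypothetical, not part of XLM or PyTorch):

```python
def sgd_with_accumulation(params, micro_batches, grad_fn, lr=0.1, accum_steps=4):
    """Average gradients over groups of `accum_steps` micro-batches,
    then apply one SGD update per group. `grad_fn(params, batch)`
    returns one gradient per parameter; all names are illustrative."""
    accum = [0.0] * len(params)
    for step, batch in enumerate(micro_batches, start=1):
        grads = grad_fn(params, batch)                       # per-micro-batch gradients
        accum = [a + g / accum_steps for a, g in zip(accum, grads)]
        if step % accum_steps == 0:                          # one optimizer step per group
            params = [p - lr * a for p, a in zip(params, accum)]
            accum = [0.0] * len(params)
    return params
```

In a real PyTorch loop the same pattern is usually written by scaling each loss by `1 / accum_steps`, calling `backward()` on every micro-batch, and calling `optimizer.step()` plus `optimizer.zero_grad()` only every `accum_steps` iterations.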
