How many GPUs did you use? #48
Comments
The model was trained on 2 Quadro RTX 6000s for about 6 days: 1 day of pre-training on RedWeb plus about 5 days of training on MIX5. Multi-GPU with a low batch size was not an issue, because we froze all batch norm layers.
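Freezing batch norm layers means the running mean/variance statistics are never updated and the affine parameters receive no gradients, so per-GPU batch size no longer affects the normalization. A minimal sketch of how this can be done in PyTorch (this is an illustration, not necessarily the exact MiDaS training code; the model below is a hypothetical stand-in):

```python
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> None:
    """Freeze every batch norm layer in the model:
    running stats stay fixed and affine params get no gradient."""
    for module in model.modules():
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            module.eval()  # use stored running mean/var, not batch statistics
            for p in module.parameters():
                p.requires_grad = False

# Hypothetical toy model for demonstration
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
model.train()
freeze_batchnorm(model)

bn = model[1]
print(bn.training)             # False: BN stays in eval mode
print(bn.weight.requires_grad) # False: affine params frozen
```

Note that a later call to `model.train()` would flip the BN layers back into training mode, so `freeze_batchnorm` must be re-applied after every such call (or the layers' `train` method overridden).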
@ranftlr thanks for your quick response. I am training MidasNet on 2 Tesla V100 GPUs, but it's much slower. So the mean and variance of the BN layers are not updated during the training process?
Correct, the mean and variance are fixed and are not updated. I can't comment on the speed on a V100 as I don't have any available.
@Tord-Zhang |
Hi, since MidasNet is a very large model, how many GPUs did you use, and how long did it take to train the model? Since the batch size is not large (8 for each dataset), would multi-GPU training hurt the performance, given that there are a lot of batch normalization layers in the encoder?
Thanks.