The PyTorch team recently released an official SyncBatchNorm implementation. It requires a specific setup: use `torch.nn.parallel.DistributedDataParallel(...)` instead of `nn.DataParallel(...)` and launch a separate process for each GPU. I wrote a small step-by-step guide here: https://github.com/dougsouza/pytorch-sync-batchnorm-example.
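For reference, a minimal sketch of that setup might look like the following (assumptions: the NCCL backend, one process per GPU launched with `torchrun`, and a toy `nn.Sequential` model standing in for the real network; see the linked repo for the full walkthrough):

```python
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets LOCAL_RANK for each spawned process (one process per GPU)
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl")

    # Toy model with a BatchNorm layer, standing in for your real network
    model = nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(),
    ).cuda(local_rank)

    # Replace every BatchNorm layer with SyncBatchNorm so batch statistics
    # are synchronized across all processes/GPUs
    model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

    # Wrap in DistributedDataParallel: one process drives one GPU
    model = DDP(model, device_ids=[local_rank])

    # ... build a DataLoader with a DistributedSampler and train as usual ...


if __name__ == "__main__":
    main()
```

You would then launch one process per GPU, e.g. `torchrun --nproc_per_node=NUM_GPUS train.py` (older PyTorch versions use `python -m torch.distributed.launch` instead, which by default passes the local rank as a `--local_rank` argument rather than the `LOCAL_RANK` environment variable).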
In my experiments SyncBatchNorm worked well. Using `torch.nn.parallel.DistributedDataParallel(...)` with one process per GPU also provides a huge speed-up in training: the gain from adding more GPUs is almost linear, and it runs a lot faster than `nn.DataParallel(...)`. I believe you could reduce training time drastically by switching to `torch.nn.parallel.DistributedDataParallel(...)`.

BTW, thanks for this implementation!