About process_group in SyncBN #22
Plus, I ran into some issues reproducing SimCLR based on your code. As mentioned in your paper, you have reproduced SimCLR's performance; could you please provide the main.py file for SimCLR, or share some training tips? For example, besides the different loss and the multi-crop augmentation, is there any other difference from SwAV during training? Your implementation is really clear and easy to extend! Looking forward to your reply. Thanks.
Hi @yxgeee, thanks for your interest in this repo. I use communication groups of 8 GPUs when training with 64 GPUs (8 machines) in order to speed up training. Training with globally synchronized batch norm (i.e., stats shared across all processes) takes about 2x longer! Sharing batch statistics only across processes located on the same machine gets rid of inter-machine communication, which we found to be a bottleneck. Surprisingly enough, we did not observe any drop in performance when synchronizing batch norm per machine compared to global SyncBN.
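For reference, per-machine SyncBN groups can be sketched as below. The `machine_groups` helper is a hypothetical illustration (not from the SwAV repo); the commented usage assumes an already-initialized `torch.distributed` environment with 8 GPUs per machine:

```python
def machine_groups(world_size: int, gpus_per_machine: int = 8):
    """Partition global ranks into one list per machine.

    E.g. world_size=16, gpus_per_machine=8 -> [[0..7], [8..15]],
    so batch-norm stats are shared only within each machine.
    """
    return [
        list(range(start, start + gpus_per_machine))
        for start in range(0, world_size, gpus_per_machine)
    ]

# Hypothetical usage inside a training script, once torch.distributed
# is initialized (every rank must call new_group for every group):
#
#   import torch
#   import torch.distributed as dist
#   groups = [dist.new_group(ranks)
#             for ranks in machine_groups(dist.get_world_size(), 8)]
#   my_group = groups[dist.get_rank() // 8]
#   model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(
#       model, process_group=my_group)
```

With a global batch of 4096 over 64 GPUs, each group of 8 GPUs then normalizes over 512 samples instead of all 4096.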
Regarding SimCLR, I have not been planning to share my code since there are already many implementations out there. I might include it if there is interest from the community.
Thanks a lot for your reply! |
Sounds great!! Thanks a lot!
Hi,
I noticed that you use groups of 8 GPUs in SyncBN (https://github.com/facebookresearch/swav/blob/master/main_swav.py#L158) when training with a large batch size of 4096, i.e. 512 training samples per group for synchronized batch norm. I am wondering 1) why you don't use global SyncBN for training, and 2) how much it affects performance?
Thanks!