
About process_group in SyncBN #22

Closed
yxgeee opened this issue Aug 28, 2020 · 5 comments

Comments

@yxgeee

yxgeee commented Aug 28, 2020

Hi,

I noticed that you adopted groups of 8 GPUs in SyncBN (https://github.com/facebookresearch/swav/blob/master/main_swav.py#L158) when training with a large batch size of 4096, i.e., 512 training samples per group for synchronized batch norm. I am wondering: 1) why don't you use global SyncBN for training, and 2) how much does this affect performance?

Thanks!

@yxgeee
Author

yxgeee commented Aug 28, 2020

Also, I ran into some issues when reproducing SimCLR based on your code. As mentioned in your paper, you reproduced SimCLR's performance; could you please provide the main.py file for SimCLR, or share some training tips? For example, besides the different loss and the multi-crop augmentation, are there any other differences from the SwAV training setup?

Your implementation is really clear and easy to extend! Looking forward to your reply. Thanks.

@mathildecaron31
Contributor

Hi @yxgeee, thanks for your interest in this repo.

I use communication groups of 8 GPUs when training with 64 GPUs (8 machines) in order to speed up training. Training with global synchronized batch norm (i.e., statistics shared across all processes) takes roughly 2x longer! Sharing batch statistics only across processes located on the same machine eliminates inter-machine communication, which we found to be the bottleneck.

Surprisingly enough, we did not observe any drop in performance when synchronizing batch norm per machine compared to global SyncBN.
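
For readers who want to reproduce this setup, here is a minimal sketch of per-machine SyncBN groups in plain PyTorch (`torch.distributed.new_group` plus `nn.SyncBatchNorm.convert_sync_batchnorm`). The function name and the default group size of 8 are illustrative and not taken from the repo, which may use a different SyncBN backend at the linked line:

```python
import torch.distributed as dist
import torch.nn as nn

def convert_to_grouped_syncbn(model, group_size=8):
    """Convert BatchNorm layers to SyncBatchNorm layers that synchronize
    statistics only within groups of `group_size` consecutive ranks
    (e.g. the 8 GPUs of one machine) instead of across all processes."""
    world_size = dist.get_world_size()
    rank = dist.get_rank()
    assert world_size % group_size == 0

    # Every process must call new_group() for every group, in the same order.
    process_group = None
    for start in range(0, world_size, group_size):
        ranks = list(range(start, start + group_size))
        group = dist.new_group(ranks=ranks)
        if rank in ranks:
            process_group = group  # the group this rank belongs to

    # BN statistics are then reduced only inside `process_group`.
    return nn.SyncBatchNorm.convert_sync_batchnorm(model, process_group)
```

With 64 GPUs, a global batch size of 4096, and `group_size=8`, each group normalizes over 512 samples, matching the numbers in the question above.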

@mathildecaron31
Contributor

Regarding SimCLR, I have not been planning to share my code since there are already many implementations out there. I might include it if there is interest from the community.

@yxgeee
Author

yxgeee commented Sep 7, 2020

Thanks a lot for your reply!

@BIGBALLON

> Hi @yxgeee, thanks for your interest in this repo.
>
> I use communication groups of 8 GPUs when training with 64 GPUs (8 machines) in order to speed up training. Training with global synchronized batch norm (i.e., statistics shared across all processes) takes roughly 2x longer! Sharing batch statistics only across processes located on the same machine eliminates inter-machine communication, which we found to be the bottleneck.
>
> Surprisingly enough, we did not observe any drop in performance when synchronizing batch norm per machine compared to global SyncBN.

Sounds great! Thanks a lot!
