MemoryBankModule - no distributed all gather prior to queueing a batch #1018

ibro45 · 2022-12-22T22:49:54Z

If I'm not mistaken, the batch is not all_gathered before the queue is updated. I was just comparing the code against the one it was based on (MoCo's memory bank) and noticed this: https://github.com/facebookresearch/moco/blob/78b69cafae80bc74cd1a89ac3fb365dc20d157d3/moco/builder.py#L55

Was this done on purpose?

Is it because, with gathering, every process would end up using the same queued batches, unlike the batches from a DataLoader with a DistributedSampler, which are sharded per process under DDP?


guarin · 2023-01-03T09:10:07Z

Hi, you are right that our implementation differs and that our code does not share memory bank entries between processes. Thanks for pointing it out! I'll open an issue to change this, but it might take a while until we have time to work on it.

Not sharing has the following benefits:

- No sync between processes
- Larger diversity in the memory bank (because memory bank is unique for each process)

But also a couple of drawbacks:

- Training performance might slightly change if number of processes is changed
- Memory bank is updated less frequently
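
To put rough numbers on the trade-off above, here is a small back-of-the-envelope sketch. It assumes a FIFO bank, equal per-process batch sizes, and no repeated samples; the function names are made up for illustration and are not part of any library.

```python
def steps_to_refresh(bank_size: int, batch_size: int, world_size: int, gather: bool) -> int:
    """Training steps until every entry of a FIFO bank has been overwritten once."""
    # With gathering, each step enqueues the batches of all processes.
    entries_per_step = batch_size * world_size if gather else batch_size
    return -(-bank_size // entries_per_step)  # ceiling division


def distinct_entries(bank_size: int, world_size: int, gather: bool) -> int:
    """Distinct keys held across all processes (assumes no duplicate samples)."""
    # With gathering, every process holds the same bank; without, each bank is unique.
    return bank_size if gather else bank_size * world_size


if __name__ == "__main__":
    for gather in (False, True):
        print(
            f"gather={gather}:",
            steps_to_refresh(4096, 128, 4, gather), "steps to refresh,",
            distinct_entries(4096, 4, gather), "distinct entries",
        )
```

With a bank of 4096 entries, a per-process batch of 128, and 4 processes, the per-process bank is refreshed every 32 steps but 16384 distinct keys exist across processes; with gathering, the bank is refreshed every 8 steps and all processes share the same 4096 keys.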

ibro45 · 2023-01-03T18:09:57Z

We could have both approaches by adding all_gather_batches=False to the MemoryBankModule. But I would list the pros and cons in docstrings, as you might want to avoid gathering batches despite using DDP. What do you think?
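
A minimal sketch of what such a switch could look like, assuming PyTorch. `SimpleMemoryBank` is a hypothetical stand-in rather than the actual MemoryBankModule, the flag uses the name proposed above, and `concat_all_gather` loosely follows the helper of the same name in the linked MoCo code.

```python
import torch
import torch.distributed as dist


@torch.no_grad()
def concat_all_gather(x: torch.Tensor) -> torch.Tensor:
    """Concatenate `x` from all processes along dim 0 (no gradients flow back)."""
    if not (dist.is_available() and dist.is_initialized()):
        return x  # single-process run: nothing to gather
    # all_gather expects the same tensor shape on every process.
    gathered = [torch.empty_like(x) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, x.contiguous())
    return torch.cat(gathered, dim=0)


class SimpleMemoryBank(torch.nn.Module):
    """Hypothetical FIFO bank; assumes each enqueued batch is no larger than `size`."""

    def __init__(self, size: int, dim: int, all_gather_batches: bool = False):
        super().__init__()
        self.all_gather_batches = all_gather_batches
        init = torch.nn.functional.normalize(torch.randn(size, dim), dim=1)
        self.register_buffer("bank", init)
        self.register_buffer("ptr", torch.zeros(1, dtype=torch.long))

    @torch.no_grad()
    def update(self, batch: torch.Tensor) -> None:
        if self.all_gather_batches:
            # Every process enqueues the batches of all processes,
            # so the banks stay identical across processes.
            batch = concat_all_gather(batch)
        size = self.bank.shape[0]
        start = int(self.ptr)
        # Wrap around the end of the bank like a ring buffer.
        idx = (start + torch.arange(batch.shape[0], device=self.bank.device)) % size
        self.bank[idx] = batch.detach()
        self.ptr[0] = (start + batch.shape[0]) % size
```

With the flag left at `False`, each process keeps its own bank and no synchronization happens, which matches the current behaviour described above.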

guarin mentioned this issue Jun 19, 2023

Guarin lig 3062 add mocov2 imagenet benchmark #1291

Merged

guarin closed this as completed in #1291 Dec 14, 2023
