
All samples from each GPU combined before applying contrastive loss? #5

Open · fedshyvana opened this issue Jul 16, 2022 · 0 comments
Hi, thank you for the great work Jianwei! I was wondering, for distributed training, do you:

  1. combine the mini-batches across GPUs before applying the contrastive loss (so the effective batch size = n_GPUs × batch size per GPU),
    OR
  2. simply compute the contrastive loss separately on each GPU (so the batch size is just the per-GPU batch size)?

I've seen implementations of contrastive pretraining methods such as this one (SimCLR) use the first option:
https://github.com/Spijkervet/SimCLR/blob/cd85c4366d2e6ac1b0a16798b76ac0a2c8a94e58/simclr/modules/gather.py#L5
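
For context, the SimCLR code linked above implements option 1 with an autograd-aware all_gather. Here is a minimal sketch of that pattern; the names `GatherLayer` and `gather_features` are illustrative and not taken from the UniCL code:

```python
import torch
import torch.distributed as dist


class GatherLayer(torch.autograd.Function):
    """All-gather a tensor from every rank while keeping gradients,
    since plain dist.all_gather is not differentiable."""

    @staticmethod
    def forward(ctx, x):
        gathered = [torch.zeros_like(x) for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, x)
        return tuple(gathered)

    @staticmethod
    def backward(ctx, *grads):
        # Sum gradient contributions from all ranks, then return
        # the slice corresponding to this rank's local input.
        all_grads = torch.stack(grads)
        dist.all_reduce(all_grads)
        return all_grads[dist.get_rank()]


def gather_features(image_feat, text_feat):
    # Option 1: concatenate the per-GPU mini-batches so the contrastive
    # loss sees an effective batch of n_GPUs x batch_size_per_GPU.
    all_image = torch.cat(GatherLayer.apply(image_feat), dim=0)
    all_text = torch.cat(GatherLayer.apply(text_feat), dim=0)
    return all_image, all_text
```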

I ask because your code has a comment that says "# gather features from all gpus", but, if I'm not mistaken, I don't actually see where the features are gathered across all GPUs:

UniCL/main.py, line 177 (commit 4f680ff):

# gather features from all gpus
Thanks!
