
Actual batch size with multiple GPUs #120

Open
IceBubble217 opened this issue Jan 13, 2024 · 0 comments

Comments

@IceBubble217

I am training on 4 GPUs, and each GPU fits 2 examples. I am launching training with:
TRAIN.BATCH_SIZE_TOTAL 8 \
TRAIN.BATCH_SIZE_PER_GPU 2 \

It seems that in the code TRAIN.BATCH_SIZE_PER_GPU doesn't actually matter: the batch size per GPU is determined by TRAIN.BATCH_SIZE_TOTAL divided by the number of GPUs. Can you confirm that?
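
For context, this is the behavior I am assuming (a rough sketch, not the repo's actual code; the function and variable names are placeholders for illustration only):

```python
# Sketch of the per-GPU batch size computation I am assuming.
# Not the actual repo code; cfg keys are referenced only in comments.
import torch.distributed as dist

def per_gpu_batch_size(batch_size_total: int) -> int:
    # Number of GPUs participating in training (1 if not distributed).
    world_size = dist.get_world_size() if dist.is_initialized() else 1
    assert batch_size_total % world_size == 0, \
        "TRAIN.BATCH_SIZE_TOTAL should be divisible by the number of GPUs"
    # Under this assumption, TRAIN.BATCH_SIZE_PER_GPU would be ignored.
    return batch_size_total // world_size

# With 4 GPUs and TRAIN.BATCH_SIZE_TOTAL = 8, this gives 2 examples per GPU.
```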

I also want to confirm that when computing gradient updates, the effective batch size in my setting is 8 (the total across GPUs), not 2 (the per-GPU size).

Finally, I found a parameter GRADIENT_ACCUMULATE_STEP, which defaults to 1. Should I also set this if I want a larger effective batch size, e.g. 16?
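
My understanding of how gradient accumulation would combine with the total batch size (again, just my assumption, not the repo's code):

```python
# Assumed relation between the config values and the effective batch size
# used per optimizer step (illustrative only; names mirror the config keys).
def effective_batch_size(batch_size_total: int, gradient_accumulate_step: int = 1) -> int:
    # Gradients would be accumulated over GRADIENT_ACCUMULATE_STEP
    # forward/backward passes before each optimizer step, so they multiply.
    return batch_size_total * gradient_accumulate_step

# e.g. TRAIN.BATCH_SIZE_TOTAL 8 with GRADIENT_ACCUMULATE_STEP 2
print(effective_batch_size(8, 2))  # -> 16
```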

Thanks in advance for answering so many questions!
