
Large batch size and multiple GPUs #137

Closed
jjcao opened this issue Oct 31, 2017 · 4 comments

@jjcao

jjcao commented Oct 31, 2017

When I train the pix2pix model with batchSize > 1, norm = batch, and multiple GPUs, the results look wrong/strange.

When I train the pix2pix model with batchSize > 1, norm = batch, and a single GPU, the results are correct.

Could this be fixed?

Thank you.

@junyanz
Owner

junyanz commented Oct 31, 2017

We have observed that batchSize=1 with a single GPU gives the best results so far.
According to this post, it seems that PyTorch calculates mean/var statistics separately for each GPU.
So how many images do you have per GPU? Only 1 per GPU might cause problems for batchnorm.
Have you tried instancenorm with the multi-GPU setting and batchSize > 1?
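
A minimal sketch of the difference (toy layers, not the pix2pix generator; the multi-GPU split is only described in the comments, since this snippet runs on CPU):

```python
import torch
import torch.nn as nn

# Toy normalization layers, only to show which statistics each one uses.
bn = nn.BatchNorm2d(3)        # mean/var computed across the samples on *one* device
inorm = nn.InstanceNorm2d(3)  # mean/var computed per sample, per channel

x = torch.randn(4, 3, 8, 8)   # batchSize = 4

# Under nn.DataParallel with 2 GPUs, each replica would see only 2 of these
# 4 samples, so BatchNorm estimates its statistics from just 2 images per
# device. InstanceNorm normalizes each image on its own, so splitting the
# batch across GPUs does not change its behavior.
print(bn(x).shape, inorm(x).shape)  # both torch.Size([4, 3, 8, 8])
```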

@jjcao
Author

jjcao commented Oct 31, 2017

Yes, with --norm instance it worked.

How do I specify the number of images per GPU? Is there an option for it, or would it be simple to change in your code?

@junyanz
Owner

junyanz commented Oct 31, 2017

I guess the number of images per GPU will be batchSize/#gpus.
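
You can check the split directly (assuming 2 GPUs are visible; the probe module and tensor sizes below are just for illustration):

```python
import torch
import torch.nn as nn

class ShapeProbe(nn.Module):
    """Prints the shape of the shard each DataParallel replica receives."""
    def forward(self, x):
        print(x.shape)
        return x

probe = nn.DataParallel(ShapeProbe().cuda(), device_ids=[0, 1])
x = torch.randn(4, 3, 256, 256).cuda()   # batchSize = 4

# Each replica prints torch.Size([2, 3, 256, 256]): 4 images / 2 GPUs = 2 per device.
probe(x)
```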

@jjcao
Author

jjcao commented Oct 31, 2017

If it is batchSize/#gpus, then norm still needs to be "instance" for training to succeed. I have tested this.
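
For reference, a multi-GPU training command would then look roughly like this (exact option names may vary by repo version; the key part is --norm instance):

```
python train.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --norm instance --batchSize 4 --gpu_ids 0,1
```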
