MultiGPU #29

Closed
FengLoveBella opened this issue Jan 30, 2018 · 8 comments

Comments

@FengLoveBella

How can I run your code on multiple GPUs? Thank you very much.

@zhengyang-wang
Owner

I haven't explored multi-GPU training in TensorFlow yet.

@myhooo

myhooo commented Jan 31, 2018

I added a second GPU on the line "os.environ['CUDA_VISIBLE_DEVICES'] = '1,3'" in main.py, and the code can run on these two GPUs. @zhoufengbuaa

@John1231983

I do not think so. For multiple GPUs, you have to average the gradients across devices and also handle batch normalization across devices, which is very difficult. The easier route is to average only the gradients, and it will work. See the MNIST dataset example.
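To make the "average gradient" step concrete, here is a minimal, framework-agnostic sketch. It assumes each GPU ("tower") has already produced a list of (gradient, variable name) pairs in the same variable order; plain Python lists stand in for tensors, and the function name `average_gradients` and the variable names `w` and `b` are hypothetical, not taken from this repository.

```python
def average_gradients(tower_grads):
    """Average per-variable gradients across towers (one tower per GPU).

    tower_grads: one entry per tower; each entry is a list of
    (gradient, variable_name) pairs, in the same variable order on
    every tower. Gradients are plain lists of floats standing in
    for tensors.
    """
    averaged = []
    # zip(*tower_grads) groups the pairs that belong to the same variable.
    for pairs in zip(*tower_grads):
        grads = [g for g, _ in pairs]
        name = pairs[0][1]
        n = len(grads)
        # Element-wise mean over the towers.
        mean = [sum(vals) / n for vals in zip(*grads)]
        averaged.append((mean, name))
    return averaged


tower_grads = [
    [([1.0, 2.0], "w"), ([0.5], "b")],  # gradients from tower 0
    [([3.0, 4.0], "w"), ([1.5], "b")],  # gradients from tower 1
]
avg = average_gradients(tower_grads)
```

In a real multi-GPU setup the averaged gradients would then be applied once to a single shared copy of the variables, which is where variable reuse across towers comes in.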

@FengLoveBella
Author

@myhooo Setting os.environ['CUDA_VISIBLE_DEVICES'] = '1,3' is definitely not enough: GPU 1 and GPU 3 are both allocated, but only GPU 1 is used by the network.
@John1231983 I have tried hard to get multi-GPU working. I do average the gradients and the loss, but there are still problems. reuse_variables and a few other things are driving me crazy.

@FengLoveBella
Author

@zhengyang-wang A large batch size is very important for semantic segmentation. Multi-GPU is absolutely a good choice.

@myhooo

myhooo commented Feb 2, 2018

@zhoufengbuaa Thank you for telling me that I am wrong~ ^_^

@zhengyang-wang
Owner

@zhoufengbuaa I'm aware of that. However, there is an easy way, as suggested by @John1231983, which is to use accumulated gradients. A similar approach is used in the implementation of msc training. You can read my code to see how it is done. This approach lets you use a large batch of larger patches, but it takes longer to train.
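As a rough illustration of accumulated gradients (not the code in this repository), the sketch below uses a toy one-parameter model y = w * x with a mean squared error loss. The names `grad_mse` and `accumulated_step` are hypothetical. It assumes equal-sized micro-batches, in which case averaging the micro-batch gradients gives the same update as one large batch.

```python
def grad_mse(w, batch):
    """Gradient of mean squared error for the model y = w * x on one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w, micro_batches, lr):
    """One optimizer step from gradients accumulated over micro-batches."""
    acc = 0.0
    for batch in micro_batches:
        acc += grad_mse(w, batch)   # forward/backward on one small batch
    acc /= len(micro_batches)       # average so the scale matches one big batch
    return w - lr * acc             # single weight update


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
micro_batches = [data[:2], data[2:]]  # equal-sized micro-batches

w_accum = accumulated_step(0.0, micro_batches, lr=0.01)
w_full = 0.0 - 0.01 * grad_mse(0.0, data)  # same step computed on the full batch
```

The micro-batches are processed one after another on a single GPU, which is why this needs less memory than one large batch but takes longer per update. The equivalence only covers the gradients: layers such as batch normalization still compute their statistics per micro-batch.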

@John1231983

I think gradient averaging is only one problem with multiple GPUs. The other is synchronizing batch norm statistics across GPUs, which is not supported in TensorFlow at the moment.
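To show what synchronized batch norm statistics would involve, here is a small sketch that does not use any TensorFlow API. It assumes each device holds one slice of the batch for a single channel and can share its count, sum, and sum of squares; the function names `local_stats` and `synced_mean_var` are hypothetical.

```python
def local_stats(values):
    """Count, sum, and sum of squares for one device's slice of the batch."""
    return len(values), sum(values), sum(v * v for v in values)


def synced_mean_var(per_device_values):
    """Batch mean and biased variance computed over all devices combined."""
    stats = [local_stats(v) for v in per_device_values]
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    var = total_sq / n - mean * mean  # E[x^2] - (E[x])^2
    return mean, var


device0 = [1.0, 2.0]  # activations seen by the first GPU
device1 = [3.0, 6.0]  # activations seen by the second GPU
mean, var = synced_mean_var([device0, device1])
```

Without this exchange, each GPU normalizes with the mean and variance of its own slice only, so the effective batch size for batch norm stays at the per-GPU batch size even when the gradients are averaged over all GPUs.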
