Chainer MultiGPU #87

Open
ilkarman opened this issue Jun 1, 2018 · 0 comments


ilkarman commented Jun 1, 2018

@mitmul Thank you for highlighting my typo in your PR; I wanted to raise two further issues I am facing here:

  1. Toggling from single-GPU to multi-GPU (4x) improves the time taken from 47min15s to 14min43s; however, for some reason the AUC also drops from 0.8028 (which matches all other examples) to 0.56. This does not happen, for example, with PyTorch. There is also a difference in validation/main/loss, which ends at 0.23 for multi-GPU but 0.15 for single-GPU.

  2. I wondered if there was an update to the pre-trained DenseNet model so that I no longer have to override CaffeFunction with a custom class to reduce the memory footprint? The custom `__call__` lets me use a batch size of 56 instead of 32; however, I am still not able to get as low a memory footprint as with other frameworks, which lets me run a batch of 64.
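One possible explanation for the multi-GPU AUC gap in point 1 (this is a hypothesis, not confirmed from the issue): with data-parallel training across 4 GPUs, the effective batch size is multiplied by the number of devices, and if the learning rate is left at its single-GPU value the optimization dynamics change. A common remedy is the linear scaling rule: scale the learning rate by the same factor as the effective batch size. A minimal sketch of the arithmetic (the function name and the lr value are illustrative; the batch size of 56 and the 4 GPUs are from the issue):

```python
def scaled_lr(base_lr, base_batch, n_gpus, per_gpu_batch):
    """Scale the learning rate proportionally to the growth in
    effective batch size (the linear scaling rule)."""
    effective_batch = n_gpus * per_gpu_batch
    return base_lr * effective_batch / base_batch


# Single-GPU baseline: some lr (say 0.01) tuned for a batch of 56.
# With 4 GPUs each seeing a batch of 56, the effective batch is 224,
# so the rule suggests multiplying the lr by 4 (here, 0.01 -> 0.04).
print(scaled_lr(0.01, 56, 4, 56))
```

If the multi-GPU run instead splits the same batch of 56 across the 4 devices, the effective batch is unchanged and this rule would not apply, so it is worth checking which of the two setups the updater is actually using.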

Chainer:  4.1.0
CuPy:  4.1.0
Numpy:  1.14.1
GPU:  ['Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB']
CUDA Version 9.0.176
CuDNN Version  7.0.5