Chainer MultiGPU #87

Open
ilkarman opened this issue Jun 1, 2018 · 0 comments


ilkarman commented Jun 1, 2018

@mitmul Thank you for highlighting my typo in your PR; I wanted to raise two further issues I am facing here:

  1. Toggling from single-GPU to multi-GPU (4x) improves the time taken from 47min15s to 14min43s; however, for some reason the AUC also drops from 0.8028 (which matches all other examples) to 0.56. This does not happen, for example, with PyTorch. There is also a difference in validation/main/loss, which ends at 0.23 for multi-GPU but 0.15 for single-GPU.

  2. I wondered if there was an update to the pre-trained DenseNet model so that I no longer have to override CaffeFunction with a custom class to reduce the memory footprint? The custom `__call__` lets me use a batch size of 56 instead of 32; however, I am still not able to get as low a memory footprint as with other frameworks, which lets me run a batch of 64.
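One possible explanation for the multi-GPU AUC gap in point 1 (this is a hypothesis, not confirmed from the issue): with data-parallel training across 4 GPUs, the effective batch size is multiplied by the number of devices, and if the learning rate is left at its single-GPU value the optimization dynamics change. A common remedy is the linear scaling rule: scale the learning rate by the same factor as the effective batch size. A minimal sketch of the arithmetic (the function name and the lr value are illustrative; the batch size of 56 and the 4 GPUs are from the issue):

```python
def scaled_lr(base_lr, base_batch, n_gpus, per_gpu_batch):
    """Scale the learning rate proportionally to the growth in
    effective batch size (the linear scaling rule)."""
    effective_batch = n_gpus * per_gpu_batch
    return base_lr * effective_batch / base_batch


# Single-GPU baseline: some lr (say 0.01) tuned for a batch of 56.
# With 4 GPUs each seeing a batch of 56, the effective batch is 224,
# so the rule suggests multiplying the lr by 4 (here, 0.01 -> 0.04).
print(scaled_lr(0.01, 56, 4, 56))
```

If the multi-GPU run instead splits the same batch of 56 across the 4 devices, the effective batch is unchanged and this rule would not apply, so it is worth checking which of the two setups the updater is actually using.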

Chainer:  4.1.0
CuPy:  4.1.0
Numpy:  1.14.1
GPU:  ['Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB', 'Tesla V100-PCIE-16GB']
CUDA Version 9.0.176
CuDNN Version  7.0.5