
Questions about multi-gpu training #14

RyanHTR opened this issue Apr 16, 2018 · 11 comments

RyanHTR commented Apr 16, 2018

This is great work. Does this code support multi-GPU training? I've tried altering NUM_GPUS and GPU_ID, but it seems that the code selects only one GPU for training. Is there any clue about this? Thanks.

@JiahuiYu (Owner)

To enable multi-GPU training, you will need to change this line to MultiGPUTrainer.
Expect some adventures when using multi-GPU for this project. I am not sure about the behavior.
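
A minimal sketch of what that swap might look like in train.py, assuming neuralgym's trainer interface and the names used elsewhere in this repo (multigpu_graph_def, config.NUM_GPUS); treat it as illustrative, not verbatim:

```python
import neuralgym as ng

# Single-GPU (original):
# trainer = ng.train.Trainer(optimizer=g_optimizer, var_list=g_vars, ...)

# Multi-GPU: pass num_gpus and a graph_def callable that builds the loss
# on each GPU, instead of a single precomputed loss tensor.
trainer = ng.train.MultiGPUTrainer(
    num_gpus=config.NUM_GPUS,       # e.g. 4
    optimizer=g_optimizer,
    var_list=g_vars,
    max_iters=config.MAX_ITERS,
    graph_def=multigpu_graph_def,   # builds per-GPU towers and returns loss
    graph_def_kwargs={
        'model': model, 'data': data, 'config': config, 'loss_type': 'g'},
)
```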

@zhiweige

@RyanHTR Hello, were you able to train the network successfully on multiple GPUs?

lipanpeng commented May 7, 2018

@RyanHTR I changed this line to MultiGPUTrainer, but I got the error "TypeError: 'NoneType' object is not callable", which I can't figure out. Did you have this problem?

zengyh1900 commented Jun 28, 2018

@JiahuiYu There is a bug behind the 'NoneType' object is not callable error: None() ends up being called.

@JiahuiYu (Owner)

@1900zyh This is not a bug. The loss should be None for multi-GPU training.

@zengyh1900

@JiahuiYu I think it should be:
assert loss is None, 'For multi-GPU training, graph_def should be provided instead of loss.'
Otherwise it will report a TypeError.
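
For context, a hypothetical sketch of where such a check could sit in the trainer's argument handling (the surrounding names are illustrative, not neuralgym's exact internals):

```python
def check_multigpu_args(loss, graph_def, num_gpus):
    """Fail fast with a clear message instead of the opaque TypeError."""
    if num_gpus > 1:
        # Without these checks, a missing graph_def is later invoked as a
        # function, raising "TypeError: 'NoneType' object is not callable".
        assert loss is None, (
            'For multi-GPU training, graph_def should be provided '
            'instead of loss.')
        assert graph_def is not None, (
            'graph_def must be a callable that builds the per-GPU loss.')
```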

@JiahuiYu (Owner)

@1900zyh Ohhhh I see. Thank you!

bis-carbon commented Apr 4, 2019

I have four GTX 1080 Ti GPUs, and each GPU can handle a batch size of 16, which means that if I use all the GPUs I should be able to set the batch size to 64. But when I do that, my GPUs run out of memory.
I am assuming here that ng.train.MultiGPUTrainer uses data parallelism to split the input data (batch size 64) across the 4 GPUs, so that each GPU gets a batch of 16 images.

Because of this issue, I can only train with a batch size of 16, whether I use 4 GPUs or 1 GPU.
What are your thoughts on this?

JiahuiYu (Owner) commented Apr 4, 2019

@bis-carbon The batch size here is the per-GPU batch size.
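
In other words, the data-parallel trainer consumes NUM_GPUS × BATCH_SIZE images per step, so BATCH_SIZE should stay at what one card can hold. A quick sketch of the arithmetic (config names assumed to match the repo's YAML):

```python
BATCH_SIZE = 16   # per-GPU batch size; what one GTX 1080 Ti can hold
NUM_GPUS = 4
effective_batch = NUM_GPUS * BATCH_SIZE
print(effective_batch)   # 64 images are consumed per training step
```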

@bis-carbon

Thank you for your quick response and great work.

JiahuiYu reopened this Aug 9, 2019
JiahuiYu added the good first issue label Aug 15, 2019
@Adhiyaman-Manickam

@1900zyh @bis-carbon @lipanpeng Hi, have you figured out how to use multiple GPUs for training? If so, kindly let me know; I am struggling. Thanks in advance.
