
Any tips to reduce training time #4

Closed
cricket1 opened this issue Dec 16, 2016 · 7 comments

Comments

@cricket1

  • I am using a p2.xlarge machine on AWS. It has 12 GB of GPU memory.

  • For batch size > 1 I see high GPU utilization and GPU memory usage, and end up with out-of-memory errors.

  • With batch size = 1, an epoch takes 4 hours, so 3000 epochs = 500 days :)

  • Is there a pretrained model that I can use for training?

  • Is such high usage normal? I see that each image is 10 KB and each mask is 2.5 KB. I am using chainer for the first time. Am I doing something wrong?

@bobye
Contributor

bobye commented Dec 30, 2016

Make sure you have cuDNN enabled in your chainer installation. I am able to run with batchsize = 3 after I fixed the cuDNN issue (on a 12 GB GPU).
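
A quick way to check whether chainer can see cuDNN is to inspect the `cudnn_enabled` flag. This is a sketch, not an official API: the attribute has lived in different places across chainer versions (`chainer.cuda` in older releases, `chainer.backends.cuda` in newer ones), so the helper below probes both defensively.

```python
def cudnn_status():
    """Rough check of whether chainer reports cuDNN support.

    The attribute location varies across chainer versions, so this probes
    both the old (chainer.cuda) and newer (chainer.backends.cuda) modules.
    """
    try:
        import chainer
    except ImportError:
        return "chainer not installed"
    cuda = getattr(chainer, "cuda", None) or getattr(
        getattr(chainer, "backends", None), "cuda", None)
    if bool(getattr(cuda, "cudnn_enabled", False)):
        return "cudnn enabled"
    return "cudnn NOT enabled"

print(cudnn_status())
```

If this reports that cuDNN is not enabled, reinstalling chainer after installing the cuDNN libraries usually fixes it, and per-layer convolution memory drops noticeably.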

@shiba24
Owner

shiba24 commented Dec 31, 2016

Hi, thank you for the comments, @bobye and @cricket1 .
Large memory usage is sometimes inevitable for this kind of neural network.
Just one tip: chainer recently implemented the Forget function (http://docs.chainer.org/en/stable/reference/functions.html#forget), and using it we can definitely reduce memory usage and make the batch size larger! (Sorry, I do not have enough time to implement this now, though...)
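
For context, `chainer.functions.forget(func, *xs)` wraps a subgraph so its intermediate activations are recomputed during the backward pass instead of being stored, trading extra compute for memory. The chainer-free sketch below (an illustrative memory model, not chainer code; the numbers and the sqrt-segment scheme are assumptions for illustration) shows why this helps: without checkpointing every layer's activations are kept, while sqrt-style checkpointing keeps only ~sqrt(n) checkpoints plus one recomputed segment.

```python
import math

def activation_memory_mb(n_layers, per_layer_mb, checkpoint=False):
    """Rough peak-memory model for storing forward activations.

    Without checkpointing, every layer's output is kept for backward:
    n_layers * per_layer_mb.  With sqrt-style checkpointing (the idea
    behind chainer's F.forget), only ~sqrt(n_layers) checkpoints plus
    one recomputed segment are resident at a time.
    """
    if not checkpoint:
        return n_layers * per_layer_mb
    seg = max(math.isqrt(n_layers), 1)  # number of checkpoints kept
    return (seg + math.ceil(n_layers / seg)) * per_layer_mb

# e.g. a hypothetical 100-layer net at 2 MB of activations per layer:
print(activation_memory_mb(100, 2))                   # all activations kept
print(activation_memory_mb(100, 2, checkpoint=True))  # sqrt checkpointing
```

In this toy model, memory drops from 200 MB to 40 MB, which is the headroom that lets the batch size grow; the cost is roughly one extra forward pass per segment during backward.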

@cricket1
Author

cricket1 commented Jan 3, 2017

@bobye @shiba24 thanks for the input, will try both.

@xscjun

xscjun commented Jan 5, 2017

Could somebody share a pretrained model?

@bobye
Contributor

bobye commented Jan 7, 2017

@xscjun I believe this project is still in progress and not yet complete. You are welcome to contribute.

@shiba24
Owner

shiba24 commented Jan 18, 2017

In a few weeks I will try to reduce memory usage by implementing the Forget function, so that we can increase GPU utilization.

@GhadeerMohamad

Hello, thanks for the great work. Is there any way to save checkpoints while training the model? I haven't found the trained model after training finished.
Thank you.

@shiba24 shiba24 closed this as completed Jun 21, 2018