Discussion on convergence and memory requirements using ResNet #84
Hi, I found a description on the dede website saying that resnet_50 needs 6GB+ of memory to run. Is that a hard requirement ("MUST") or just a recommendation ("PREFER")? Thanks.

Comments
Thanks for bringing this up. I believe that this is true for training, but not for prediction, so I'll get the info on the website corrected accordingly. At the moment, doing a quick image classification test on a single-image prediction task, the … If you wish to train a model, there's some more info regarding resnets in #60.
Hi @beniz and @anguoyang, if the network architecture is fixed (resnet_50, resnet_101, ...) then batch_size is the important variable that determines whether the algorithm can run on a GPU with a given amount of RAM. For quite a few prediction setups the batch size can even be set as small as 1, reducing the necessary GPU RAM to relatively small amounts, as pointed out by Emmanuel.

Training a network is another matter. However, in my experience this also depends very much on the task. For example, I have been able to finetune resnet_50, resnet_101 and even resnet_152 on certain classification tasks with batch sizes as low as 4 or 8 on a single GPU. According to nvidia-smi, that requires less than 4GB of GPU RAM for resnet_50 (batch_size=8), less than 6GB for resnet_101 (batch_size=8) and a little more than 5GB for resnet_152 (batch_size=4). Classification error was low throughout those experiments, but of course my task was much, much simpler than training ImageNet from scratch, which I do not think is possible that way.

All I would like to point out is that, depending on the complexity of the task (for finetuning I sometimes set the learning rate of all lower-level layers to small values or even zero), relatively small batch sizes can be enough, allowing transfer learning / finetuning of the really ultra-deep nets on moderately large (single) GPUs. As a consequence, even with limited GPU resources (such as a single 4GB or 6GB GPU) it is sometimes possible to use the high-quality ultra-deep nets to learn interesting tasks and later publish them for prediction on moderately priced Amazon 4GB GPUs.
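For readers who want to try this kind of low-memory finetuning with DeepDetect, the sketch below shows roughly what the two API calls could look like. It is a minimal illustration only, not something confirmed in this thread: the host, service name, model repository path, class count, solver settings and data paths are placeholders, and the field names follow the public DeepDetect documentation for the Caffe backend, so adjust them to your own setup.

```python
# Hypothetical sketch: finetuning resnet_50 through the DeepDetect JSON API
# with a small batch size so that it fits on a ~4GB GPU. Host, paths and
# hyperparameters below are assumptions, not values from this thread.
import requests

DD_HOST = "http://localhost:8080"      # assumed DeepDetect server location
SERVICE = "resnet50_finetune"          # assumed service name

# 1. Create a supervised image service from the resnet_50 template,
#    starting from pretrained weights placed in the model repository.
service_def = {
    "mllib": "caffe",
    "description": "ResNet-50 finetuning on a small GPU",
    "type": "supervised",
    "parameters": {
        "input": {"connector": "image", "width": 224, "height": 224},
        "mllib": {
            "template": "resnet_50",
            "nclasses": 10,                        # assumed number of classes
            "finetuning": True,
            "weights": "ResNet-50-model.caffemodel",
            "gpu": True,
        },
    },
    "model": {"repository": "/path/to/model_repo"},  # assumed path
}
requests.put(f"{DD_HOST}/services/{SERVICE}", json=service_def).raise_for_status()

# 2. Train with batch_size=8, which in the experiment described above kept
#    resnet_50 under ~4GB of GPU RAM (your mileage may vary).
train_call = {
    "service": SERVICE,
    "async": True,
    "parameters": {
        "mllib": {
            "gpu": True,
            "net": {"batch_size": 8},
            "solver": {"iterations": 20000, "base_lr": 0.001},
        },
        "input": {"width": 224, "height": 224},
        "output": {"measure": ["acc", "mcll"]},
    },
    "data": ["/path/to/train_data"],               # assumed training folder
}
requests.post(f"{DD_HOST}/train", json=train_call).raise_for_status()
```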
Thanks for all the detailed info. For the record, some people have reported difficulty with convergence; see KaimingHe/deep-residual-networks#6
Hi @beniz, I just found your quick analysis of memory usage. Does GPU memory usage depend on the input image size? For example, when I use resnet_50 and resize the input image to around 1200 x 4000, an out-of-memory error occurs, but when I downsize the image to around 900 x 3000, it works. Could you provide another quick analysis of the relationship between image size and memory (with the batch size fixed to a small constant)?
The ResNets are fully convolutional, i.e. any size above the initial 224x224 training size works. Of course the memory requirement increases with size; I'd expect a quadratic increase, or even more, due to the larger feature maps.
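A rough way to see why 1200x4000 overflows while 900x3000 fits is to compare the number of spatial positions at each resolution, since in a fully convolutional net activation memory scales, to first order, with the feature-map area. The snippet below is only a back-of-envelope sketch under that assumption; it ignores weights, cuDNN workspace and framework overhead.

```python
# Back-of-envelope estimate of how activation memory scales with input size
# for a fully convolutional network such as ResNet-50. Assumes activation
# memory grows linearly with the number of spatial positions; everything
# else (weights, workspace, overhead) is ignored.

def relative_activation_memory(height, width, base=224):
    """Activation memory relative to a base x base input (to first order)."""
    return (height * width) / (base * base)

for h, w in [(224, 224), (900, 3000), (1200, 4000)]:
    factor = relative_activation_memory(h, w)
    print(f"{h}x{w}: ~{factor:.0f}x the activations of a 224x224 input")

# Prints roughly:
#   224x224:   ~1x
#   900x3000:  ~54x
#   1200x4000: ~96x
# Going from 900x3000 to 1200x4000 therefore nearly doubles activation
# memory, which is consistent with the out-of-memory error reported above.
```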
Thank you.
@beniz if the input image size is larger than 224x224, the number of units output by the flatten layer increases. I think that's why the memory requirement increases with size.
I trained faster-rcnn-resnet50 successfully, but when I use the trained model for prediction on the same machine, it fails with "Check failed: out of memory". Does anyone know why?