When encoding a large number of images, the time it takes to set the reference images for the perceptual model keeps growing, and eventually the script crashes with the following error:
2019-02-25 21:02:48.244097: W tensorflow/core/common_runtime/bfc_allocator.cc:271] Allocator (GPU_0_bfc) ran out of memory trying to allocate 320.00MiB. Current allocation summary follows.
2019-02-25 21:02:48.252031: W tensorflow/core/common_runtime/bfc_allocator.cc:275] ****************************______
2019-02-25 21:02:48.257082: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at conv_grad_input_ops.cc:937 : Resource exhausted: OOM when allocating tensor with shape[5,16,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[5,16,1024,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node gradients_10/G_synthesis_1/_Run/G_synthesis/ToRGB_lod0/Conv2D_grad/Conv2DBackpropInput}} = Conv2DBackpropInput[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients_10/G_synthesis_1/_Run/G_synthesis/ToRGB_lod0/Conv2D_grad/ShapeN, G_synthesis_1/_Run/G_synthesis/ToRGB_lod0/mul, gradients_10/G_synthesis_1/_Run/G_synthesis/ToRGB_lod0/add_grad/tuple/control_dependency)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
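For reference, the hint above refers to passing a RunOptions proto into the run call that fails. A minimal sketch, assuming TF 1.x and using a toy graph rather than anything from this repo:

```python
import tensorflow as tf  # TF 1.x

# report_tensor_allocations_upon_oom makes TF dump the live tensors when an OOM hits,
# which helps tell a genuine leak apart from a single oversized allocation.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

with tf.Session() as sess:
    # Stand-in graph for illustration; in the real script, pass `options`
    # to the sess.run call that is running out of memory.
    x = tf.random_normal([4, 1024, 1024])
    y = tf.reduce_sum(x)
    print(sess.run(y, options=run_options))
```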
With a batch size of 1, it takes about 250 images before it crashes; with a batch size of 5, it crashes on the 10th batch (50 images).
Is the perceptual model holding onto previous images? Could there be a memory leak somewhere? As far as I can tell, the crash happens on the self.sess.run call in the optimize method. I also tried removing tqdm from the script, but it still crashes during training.
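One common cause of exactly this symptom in TF 1.x scripts (an assumption about the cause, not taken from this repo's code) is building new ops inside the per-image loop: the graph grows on every batch, so GPU memory climbs until the allocator OOMs. A hedged sketch of the leaky pattern and a safer alternative, with made-up names:

```python
import numpy as np
import tensorflow as tf  # TF 1.x

# Hypothetical stand-in for a "reference image" variable in a perceptual model.
ref_image = tf.get_variable('ref_image', shape=[1, 256, 256, 3], dtype=tf.float32)

# Leaky pattern: tf.assign builds a brand-new op on every call, so each
# image permanently enlarges the graph and its memory footprint.
def set_reference_leaky(sess, image_batch):
    sess.run(tf.assign(ref_image, image_batch))

# Safer pattern: create the placeholder and assign op once, then only feed data.
image_ph = tf.placeholder(tf.float32, shape=[1, 256, 256, 3])
assign_op = tf.assign(ref_image, image_ph)

def set_reference(sess, image_batch):
    sess.run(assign_op, feed_dict={image_ph: image_batch})

with tf.Session() as sess:
    for _ in range(3):
        set_reference(sess, np.zeros([1, 256, 256, 3], dtype=np.float32))
```

Calling tf.get_default_graph().finalize() once the model is fully built is one way to catch this kind of growth: any code that later tries to add ops raises immediately instead of leaking.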