
Memory Problem #49

Closed

adkulas opened this issue Nov 28, 2018 · 11 comments

@adkulas

adkulas commented Nov 28, 2018

I was just playing around with some of the example photos, and I noticed that each time I convert an image, CUDA memory gets allocated and is never freed.

Basically, if I convert a couple of images, the GPU memory gets used up and I get the CUDA out-of-memory error.

I noticed that if I restart the kernel or kill the Python process, I get the memory back and can continue trying to convert a different image.

Shouldn't the memory be deallocated after the image is converted and saved?

(I was running the project on windows 10 with an NVIDIA GTX970 graphics card. I used the weights linked to in your article:
https://blog.floydhub.com/colorizing-and-restoring-old-images-with-deep-learning/ )
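For reference, this is roughly what I'm doing (a rough repro sketch using the notebook's vis.plot_transformed_image call; the image is one of the provided samples):

for i in range(3):
    # each call allocates GPU memory that never seems to be freed
    vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=15)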

@jantic
Owner

jantic commented Nov 28, 2018

What's the render factor you're trying to use? I haven't run into a memory leak myself yet (at least I don't think so), and I have tried looking for them.

@adkulas
Author

adkulas commented Nov 29, 2018

I tried different render factors: 42 gives me the error on the first picture, but with 15 I can do about two or three sample pictures before I get the error.

Do you know of a way to monitor the memory usage of the graphics card? I can try to get some more concrete metrics.

@jantic
Owner

jantic commented Nov 29, 2018

There's nvidia-smi to monitor memory usage. render_factor=42 definitely won't work on the 970.
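You can also query the numbers from Python directly. Something like this (a rough sketch using PyTorch's built-in counters; print it before and after each conversion) should show whether memory is actually being released:

import torch

def print_gpu_mem(tag=""):
    # memory_allocated(): bytes currently held by live tensors
    # memory_cached(): bytes held by PyTorch's caching allocator
    alloc_mb = torch.cuda.memory_allocated() / 1024**2
    cached_mb = torch.cuda.memory_cached() / 1024**2
    print(f"{tag}: allocated={alloc_mb:.0f} MB, cached={cached_mb:.0f} MB")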

Given that graphics card, I think your best bet is to use the Colab notebook for now:
https://colab.research.google.com/github/jantic/DeOldify/blob/master/DeOldify_colab.ipynb

Unfortunately, the next two weeks or so are going to be super busy for me, so I probably won't be able to do much until after that.

@stevemurch

Just jumping in here to confirm adkulas's report above, also on Win10 with an NVIDIA GTX970 graphics card. It works for 2-3 images at render factor 15 before reporting "CUDA error: out of memory". Trying render factor 42 immediately throws an out-of-memory error.

Perhaps this is helpful? I haven't yet tried it out: https://discuss.pytorch.org/t/how-to-clear-some-gpu-memory/1945

(By the way, amazing work Jason!)

@stevemurch

An update:

  • My card is actually a 980 Ti WDDM
  • Win10 config.

Note that in the "Color Visualization" notebook, as of this writing, the render_factor is hard-coded in some cases and uses the default in others. I was just mindlessly pressing the run button on the cells -- e.g., some lines are written like this:

vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=41)

Be sure to use a lower render_factor if you get out-of-memory errors.
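For example, on a card in this class, the same call with a lower render_factor (15 is what worked for a couple of images above) should fit in memory:

vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=15)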

But I can confirm with the nvidia-smi.exe tool (a command-line tool, installed by default in c:\program files\nvidia corporation\nvsmi) that memory does not appear to be released on Windows, and that once the video memory is used up, you'll get out-of-memory errors.

I haven't yet tried out the pytorch link above to try to clear memory.

@stevemurch

stevemurch commented Dec 3, 2018

I was able to get it to render successive images by calling torch.cuda.empty_cache() and gc.collect() after each colorize call.

For instance:

import gc  # be sure to import gc in declarations
import numpy as np
import torch

torch.cuda.empty_cache()
n = 2**14
a_2GB = np.ones((n, n))  # RAM: +2GB
del a_2GB                # RAM: -2GB
gc.collect()

Still trying to isolate which of these are necessary and which are simply decorative. Will update when I know more. I'm still coming up to speed on Python and PyTorch, so bear with the hacky code above -- I'll incorporate it into a new function in a more elegant solution.

On my machine, I also had to keep the render_factor below about 30 -- note that in several calls in the Jupyter notebook sample it's hard-coded to a number larger than 30.

@stevemurch

stevemurch commented Dec 3, 2018

Yes, I can confirm that calling a clean_mem function prior to each visualization call makes the sample notebook work reliably on Win10 with a GTX980. For others with similar hardware: on my specific machine, I had to keep the render_factor at around 30 or below. But there is no longer any need to restart the kernel to get through the entire notebook.

For those, like me, rather new to Python, the steps are:

  1. Add this to the top of the file (because you'll need the general Python garbage collector):

import gc

  2. Add a new function definition toward the top of the notebook:

def clean_mem():
    torch.cuda.empty_cache()
    n = 2**14
    a_2GB = np.ones((n, n))  # RAM: +2GB
    del a_2GB  # RAM: -2GB
    gc.collect()

  3. Call clean_mem() before (or after) rendering each image, as in the example below.
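For example (adapting one of the notebook's existing calls, with a render_factor below 30 as suggested above):

clean_mem()
vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=25)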

I understand from Jason and others that this memory management isn't explicitly required on non-Windows platforms.

@jantic
Owner

jantic commented Dec 3, 2018

Wow, great detective work, Steve! Thank you. I'll keep this open to remind myself to put in a fix for this -- I didn't realize Windows machines would have this sort of issue.

@stevemurch

stevemurch commented Dec 15, 2018

I've switched to Linux (Ubuntu 18.04 for now) and can confirm I get the out-of-memory error there too -- in my case, when I try render factors above about 30, it barfs. Moreover, it doesn't seem to be "self-cleaning": once it hits the CUDA out-of-memory error, it stays there. HOWEVER, the clean_mem() function I wrote above does work.

With minimal modification to the Jupyter color visualization notebook, I run something like this:

for i in range(10):
    clean_mem()
    vis.plot_transformed_image("/home/steve/build/DeOldify/input_images/"+str(i+1)+".jpg", render_factor=30)

(obviously I have a folder called input_images, and have plunked a bunch of B&W photos in there starting with 1.jpg, 2.jpg... 10.jpg)

@jantic
Owner

jantic commented Dec 15, 2018

Good to know. I'm doing a big overhaul of the code right now. This will be part of that.

@jantic
Owner

jantic commented Mar 19, 2019

This will be revisited when we explicitly add support for Windows (a distant-future thing for now). Windows support currently isn't a priority, especially since the Colab notebook exists. Closing.

jantic closed this as completed on Mar 19, 2019