
Memory Problem #49

Closed

adkulas opened this issue Nov 28, 2018 · 11 comments

@adkulas

adkulas commented Nov 28, 2018

I was just playing around with some of the example photos, and I noticed that each time I convert an image, CUDA memory gets allocated and is never freed.

Basically, if I convert a couple of images, the GPU memory gets used up and I get the CUDA out-of-memory error.

I noticed that if I restart the kernel or kill the Python process, I get the memory back and can continue trying to convert a different image.

Shouldn't the memory be deallocated after the image is converted and saved?

(I was running the project on windows 10 with an NVIDIA GTX970 graphics card. I used the weights linked to in your article:
https://blog.floydhub.com/colorizing-and-restoring-old-images-with-deep-learning/ )
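For reference, this is roughly what I'm doing (a rough repro sketch using the notebook's vis.plot_transformed_image call; the image is one of the provided samples):

for i in range(3):
    # each call allocates GPU memory that never seems to be freed
    vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=15)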

@jantic
Owner

jantic commented Nov 28, 2018

What's the render factor you're trying to use? I haven't run into a memory leak myself yet (at least I don't think so), and I have tried looking for them.

@adkulas
Author

adkulas commented Nov 29, 2018

I tried different render factors: 42 gives me the error on the first picture, but with 15 I can do about two or three sample pictures before I get the error.

Do you know of a way to monitor the memory usage of the graphics card? I can try to get some more concrete metrics.

@jantic
Owner

jantic commented Nov 29, 2018

There's nvidia-smi to monitor memory usage. render_factor=42 definitely won't work on the 970.
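You can also query the numbers from Python directly. Something like this (a rough sketch using PyTorch's built-in counters; print it before and after each conversion) should show whether memory is actually being released:

import torch

def print_gpu_mem(tag=""):
    # memory_allocated(): bytes currently held by live tensors
    # memory_cached(): bytes held by PyTorch's caching allocator
    alloc_mb = torch.cuda.memory_allocated() / 1024**2
    cached_mb = torch.cuda.memory_cached() / 1024**2
    print(f"{tag}: allocated={alloc_mb:.0f} MB, cached={cached_mb:.0f} MB")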

Given that graphics card, I think your best bet is to use the Colab notebook for now:
https://colab.research.google.com/github/jantic/DeOldify/blob/master/DeOldify_colab.ipynb

Unfortunately, the next two weeks or so are going to be super busy for me, so I probably won't be able to do much until after that.

@stevemurch

Just jumping in here to confirm adkulas's report above, also on Win10 with an NVIDIA GTX970 graphics card. It works for 2-3 images at render factor 15 before reporting "CUDA error: out of memory". Trying render factor 42 immediately throws an out-of-memory error.

Perhaps this is helpful? I haven't yet tried it out: https://discuss.pytorch.org/t/how-to-clear-some-gpu-memory/1945

(By the way, amazing work Jason!)

@stevemurch

An update:

  • My card is actually a 980 Ti WDDM
  • Win10 config.

Note that in the "Color Visualization" notebook, as of this writing, the render_factor is hard-coded in some cases and uses the default in others. I was just mindlessly pressing the run button on the cells -- e.g., some lines are written like this:

vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=41)

Be sure to use a lower render_factor if you get out-of-memory errors.
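For example, on a card in this class, the same call with a lower render_factor (15 is what worked for a couple of images above) should fit in memory:

vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=15)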

But I can confirm with the nvidia-smi.exe tool (a command-line tool, installed by default in c:\program files\nvidia corporation\nvsmi) that memory does not appear to be released on Windows, and that once the video memory is used up, you'll get out-of-memory errors.

I haven't yet tried out the pytorch link above to try to clear memory.

@stevemurch

stevemurch commented Dec 3, 2018

I was able to get it to render successive images by calling torch.cuda.empty_cache() and gc.collect() after each colorize call.

For instance:

import gc  # be sure to import gc in declarations
import numpy as np
import torch

torch.cuda.empty_cache()
n = 2**14
a_2GB = np.ones((n, n))  # RAM: +2GB
del a_2GB                # RAM: -2GB
gc.collect()

Still trying to isolate which of these are necessary and which are simply decorative. Will update when I know more. I'm still coming up to speed on Python and PyTorch, so bear with the hacky code above -- I'll incorporate it into a new function in a more elegant solution.

On my machine, I also had to keep the render_factor below about 30 -- note that in several calls in the Jupyter notebook sample it's hard-coded to a number larger than 30.

@stevemurch

stevemurch commented Dec 3, 2018

Yes, I can confirm that calling a clean_mem function prior to each visualization call makes the sample notebook work reliably on Win10 with a GTX980. For others with similar hardware: on my specific machine, I had to keep the render_factor at around 30 or below. But there is no longer any need to restart the kernel to get through the entire notebook.

For those, like me, rather new to Python, the steps are:

  1. Add this to the top of the file (because you'll need the general Python garbage collector):

import gc

  2. Add a new function definition toward the top of the notebook:

def clean_mem():
    torch.cuda.empty_cache()
    n = 2**14
    a_2GB = np.ones((n, n))  # RAM: +2GB
    del a_2GB  # RAM: -2GB
    gc.collect()

  3. Call clean_mem() before (or after) rendering each image, as in the example below.
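For example (adapting one of the notebook's existing calls, with a render_factor below 30 as suggested above):

clean_mem()
vis.plot_transformed_image("test_images/DayAtSeaBelgium.jpg", render_factor=25)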

I understand from Jason and others that this memory management isn't explicitly required on non-Windows platforms.

@jantic
Owner

jantic commented Dec 3, 2018

Wow, great detective work, Steve! Thank you. I'll keep this open to remind myself to put in a fix for this -- I didn't realize Windows machines would have this sort of issue.

@stevemurch

stevemurch commented Dec 15, 2018

I've switched to Linux (Ubuntu 18.04 for now) and can confirm I get the out-of-memory error there too -- in my case, when I try render factors above about 30, it barfs. Moreover, it doesn't seem to be "self-cleaning": once it hits the CUDA out-of-memory error, it stays there. HOWEVER, the clean_mem() function I wrote above does work.

With minimal modification to the Jupyter color visualization notebook, I run something like this:

for i in range(10):
    clean_mem()
    vis.plot_transformed_image("/home/steve/build/DeOldify/input_images/"+str(i+1)+".jpg", render_factor=30)

(obviously I have a folder called input_images, and have plunked a bunch of B&W photos in there starting with 1.jpg, 2.jpg... 10.jpg)

@jantic
Owner

jantic commented Dec 15, 2018

Good to know. I'm doing a big overhaul of the code right now. This will be part of that.

@jantic
Owner

jantic commented Mar 19, 2019

This will be revisited when we explicitly add support for Windows (a distant-future thing for now). Windows support currently isn't a priority, especially since the Colab notebook exists. Closing.

jantic closed this as completed on Mar 19, 2019