
How to release GPU memory after sess.close()? #19731

Closed
githubgsq opened this issue Jun 4, 2018 · 45 comments

Labels
stat:awaiting response Status - Awaiting response from author

Comments

@githubgsq

githubgsq commented Jun 4, 2018

Hi all,
I'm training models iteratively. After each model is trained, I run sess.close() and create a new session to run the next training process. But it seems that the GPU memory is not released, and usage keeps increasing.
I tried tf.reset_default_graph() before creating the session and also called gc.collect() after sess.close(), but neither had any effect.
How can I release GPU memory promptly to avoid OOM errors?
Thanks!

@JaeDukSeo

I am not 100% sure about explicitly releasing GPU memory, but these are some methods that I have used.

  1. You probably already know this method:
    config = tf.ConfigProto()
    config.gpu_options.allow_growth=True
    sess = tf.Session(config=config)

  2. Maybe this will help (you can also cap the allocation with per_process_gpu_memory_fraction; a sketch follows at the end of this comment):
    https://stackoverflow.com/questions/34199233/how-to-prevent-tensorflow-from-allocating-the-totality-of-a-gpu-memory

  3. tf.Session.reset() -> this method just resets the session, and I don't believe it necessarily lets go of the memory. ( ## I strongly recommend this one, and here is the link to a blog that I used to train multiple models: http://bit.ly/2J8lhqz)

I think issue #1578 had a similar problem, here is the link: #1578
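
A minimal sketch of the approach behind item 2, assuming a standard TF 1.x setup (the 0.4 fraction is only an illustrative value):

import tensorflow as tf

# Cap this process's allocation at roughly 40% of GPU memory (illustrative value).
# Note: this limits how much TF grabs up front; it does not release memory already held.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
config.gpu_options.allow_growth = True  # optionally grow within that cap as needed

sess = tf.Session(config=config)
# ... build and run the graph ...
sess.close()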

@githubgsq
Author

githubgsq commented Jun 4, 2018

hi @JaeDukSeo,
thanks for your kind advice; I tried your suggestions:

  1. I initialized the session with the config as you said.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth=True
    sess = tf.Session(config=config)
  2. I didn't set 'per_process_gpu_memory_fraction', because I have only one process and the total GPU memory can be used.
  3. I tried sess.reset(), but an error occurred indicating that reset() takes at least 1 argument (0 given). https://www.tensorflow.org/api_docs/python/tf/Session#reset shows that reset() is implemented for distributed sessions. I didn't use a distributed session, and I did not set the target arg here.

Why is the GPU memory usage still lingering after sess.close() and del graph?
How could I use tf.Session.reset() with a single machine and a single session?
Is there any other advice for releasing resources?
Any advice would be appreciated :)

@githubgsq
Author

Hi all,
I tried the method referred to here: #17048
and the memory can be released now!
Thanks a lot~

@JaeDukSeo

haha, I am glad! And my bad, the command was actually
tf.reset_default_graph()

@tensorflowbutler
Member

Nagging Assignee @cy89: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@saxenarohan97

I have a similar problem. I have a couple of questions in this regard:

@githubgsq when you mention the method from #17048, do you mean moving your TensorFlow session code into a subprocess, so that when the subprocess exits, the GPU memory is released?

@JaeDukSeo You mention setting allow_growth=True, but if my model is very large and a large amount of memory is allocated to TensorFlow even with allow_growth=True, then it will not be deallocated, right? (Since the docs say: "Note that we do not release memory, since that can lead to even worse memory fragmentation".)

Thanks in advance.

@githubgsq
Author

@saxenarohan97 Yes, a subprocess runs the session code. I also called tf.reset_default_graph() before the subprocess executed.
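
A minimal sketch of this subprocess approach, assuming each training run can be wrapped in a single function (train_once and its argument are hypothetical names):

import multiprocessing

def train_once(model_id):
    # Import and use TensorFlow only inside the subprocess, so the GPU memory
    # it allocates is returned to the system when the process exits.
    import tensorflow as tf
    tf.reset_default_graph()
    with tf.Session() as sess:
        # ... build the graph, train model `model_id`, and save results to disk ...
        pass

if __name__ == '__main__':
    for model_id in range(10):
        p = multiprocessing.Process(target=train_once, args=(model_id,))
        p.start()
        p.join()  # wait for this run to finish before starting the next one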

@tensorflowbutler
Member

Nagging Assignee @cy89: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@cy89

cy89 commented Jul 21, 2018

@JaeDukSeo do you happen to have an answer for @saxenarohan97 ?

@cy89 cy89 added the stat:awaiting response Status - Awaiting response from author label Jul 21, 2018
@JaeDukSeo

@cy89 I agree with @saxenarohan97: if allow_growth is set to true, I don't think it automatically deallocates GPU memory.

@cy89

cy89 commented Jul 28, 2018

@JaeDukSeo thanks for your reply!
I'll close, as it looks like this thread has answers to all open questions.

@cy89 cy89 closed this as completed Jul 28, 2018
@TanLingxiao

I use numba to release the GPU. With TensorFlow I cannot find an effective method.

@nikolayvoronchikhin

@TanLingxiao were you able to find any other method? numba is a great way, with the drawback being that once you run cuda.close(), you can no longer use the GPU again in the same process/session. I was hoping that TensorFlow has a config option to free GPU memory after processing ends.

@dackster

Hi, I'm new to GitHub and have been trying to get TensorFlow running in Python.

In principle I have the thing up and running, but the GPU memory is not released, causing an OOM error at some point. These few lines already clutter the memory:

import tensorflow as tf
sess = tf.Session()
sess.close()

I've been googling the problem, but so far I have only managed to solve it by either

  • Closing the console (and thus losing all variables, which is what I am actually trying to avoid)
  • Using numba to kill the CUDA device (however, this only works once) - see above. In that case my code looked something like this:

from numba import cuda
cuda.select_device(0)
# do tf stuff
cuda.close()
# the memory was released here!
cuda.select_device(0)
# do tf stuff -> caused an OOM
(compare to https://numba.pydata.org/numba-doc/dev/cuda/device-management.html)

As mentioned above, I would like to avoid killing the session and thus losing my variables in memory (used to train a NN).

  • I am aware that I can allocate only a fraction of the memory (cfg.gpu_options.per_process_gpu_memory_fraction = 0.1) or let the memory grow (cfg.gpu_options.allow_growth=True), and both work fine, but afterwards I am simply unable to release the memory.

  • I have tried to put it into threads and pools, following ideas and workarounds people suggested.

  • The K.session_close() and/or gc.collect() combination does not work either.

  • I also downgraded tf to 1.9 and Keras to 2.1.9 (the second is not really relevant for the above code).

  • I have also upgraded my graphics card driver to the newest release (see the screenshot below and note the memory which is not released after the call from above).

[screenshot: GPU memory usage]

I'd be very thankful for any suggestions on what to do with the code snippet from above to ensure that the GPU memory is freed in the end.

@p890040

p890040 commented Oct 4, 2018

[quoting @dackster's comment above in full]

Exactly the same problem for me.
Really hoping for some suggestions about this.

@anki-xyz

I have the same issue here; I can only fit a model once using Keras with the TensorFlow backend, and the second time (with the very same model), it just crashes (OOM error).
I would also appreciate suggestions here.

@pwais
Contributor

pwais commented Jan 18, 2019

+1

@yurmchg

yurmchg commented Jan 22, 2019

I have solved this issue with some kind of duct tape: I used a bash script which launched my module multiple times, and after every execution the GPU memory was released. It is also possible to use subprocess.Popen to call the module that uses TF with the GPU.
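
A minimal sketch of the subprocess.Popen variant described above, assuming the TF code lives in a separate script (train_step.py is a hypothetical file name):

import subprocess
import sys

# Each iteration runs the TF module in its own Python process; the GPU memory
# it allocated is released when that process exits.
for i in range(5):
    p = subprocess.Popen([sys.executable, "train_step.py", "--run-id", str(i)])
    p.wait()  # block until this run finishes before launching the next one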

@marinone94

I have solved it by running the session in a separate thread. When the session is completed, the memory used by that process is released once the process is killed. Remember to save your session results to disk in the same method.

@pwais
Contributor

pwais commented Mar 18, 2019

@marinone94 do you have code / proof? I tried a thread too but the only thing that worked for me is to use a subprocess. Others have seen similar: #20387 #15880

@p890040

p890040 commented Mar 18, 2019

@yurmchg @marinone94
You guys are right. I have tried this approach, and it does work.
However, I think it's not a very efficient method for a program, because it can cause many problems and inconveniences when we want to communicate with other processes (e.g. to send or receive data).
I really hope TensorFlow can officially provide a function to release GPU memory, or just make it release when a thread (not the whole process) ends.

@pwais
Contributor

pwais commented Mar 18, 2019

@p890040 can you please post a code example? I definitely tried python threads to no avail.

@oliviaolivia700

@p890040 can you please post a code example? I definitely tried python threads to no avail.

This may help
https://github.com/tensorflow/tensorflow/issues/17048#issuecomment-368082470

@marinone94

marinone94 commented Apr 9, 2019

import tensorflow as tf  # the tensorflow-gpu package is imported as plain `tensorflow`
import tensorflow_hub as hub
import threading
import pickle
import os
import time

econdModuleUrl = "https://tfhub.dev/google/universal-sentence-encoder/2"
'''
QUICKLY REWRITTEN, THERE MIGHT BE TYPOS AND/OR OTHER ERRORS, NO TIME TO CHECK IT
I might have missed something for running on GPU; if it doesn't work, add:

tfconfig = tf.ConfigProto()
tfconfig.gpu_options.allow_growth = True

and use it to instantiate the session, but I'm quite confident you don't need it
'''

def ThreadCall(task, args):
    """
    Starts a new thread in which task runs with arguments args.
    If args is a single non-iterable object, e.g. audio = AudioFile(), use args = (audio,) instead of args = audio.
    If there are no args, use args = [].
    """
    t = threading.Thread(target=task, args=args)
    t.start()

def SentenceEmbedding(sentences):
    embed = hub.Module(econdModuleUrl)
    if os.path.exists(r'.\example'):
        os.remove(r'.\example')
    with tf.Session() as session:
        session.run([tf.global_variables_initializer(), tf.tables_initializer()])
        message_embedding = session.run(embed(sentences))
        session.close()  # useless
    with open(r'.\example', 'wb') as f:
        pickle.dump(message_embedding, f)

sentences = ['This is the first sentence', 'and this is the second', 'just to show how to release ']
ThreadCall(task=SentenceEmbedding, args=(sentences,))  # wrap the list in a tuple so it is passed as one argument
time.sleep(1)
# you have to set your own way to check when the thread is done, this is just a simple example
while True:
    if os.path.exists(r'.\example'):
        break
with open(r'.\example', 'rb') as f:
    message_embedding = pickle.load(f)

@jessequinn

I use numba to release the GPU. With TensorFlow I cannot find an effective method.

thanks for this!

@mrgloom

mrgloom commented Nov 11, 2019

numba causes a segmentation fault.

Example:
https://stackoverflow.com/questions/58792739/tensorflow-model-wrapper-that-can-release-gpu-resources

So as I understand it, the only method so far is to run model loading and prediction in a separate process, or maybe to pass the same session to every model used, clearing the old graph before loading a new model.

@saravanabalagi

saravanabalagi commented Nov 20, 2019

I tried the following things, but none of them reliably freed up the memory:

# didn't work for me
tf.reset_default_graph()
K.clear_session()
cuda.select_device(0); cuda.close()
model = get_new_model() # overwrite
model = None
del model
gc.collect()

Creating separate processes always worked and guaranteed that the memory is freed up. Furthermore, that helped me manage and allocate resources the way I wanted.

# works for me
process_train = multiprocessing.Process(target=train_model, args=(...))
process_train.start()
process_train.join()

@pwais
Contributor

pwais commented Nov 20, 2019

+1 @saravanabalagi subprocess works for me, but that is in many cases impractical.

@mirekphd

mirekphd commented Nov 20, 2019

Seeing this issue (and its many variations) closed, we have to accept it as a design flaw and move on. For new projects that need no JavaScript runtime I will always recommend PyTorch, which has a dedicated function for releasing GPU memory: torch.cuda.empty_cache().

It saves a lot of hacking around memory management, and as a bonus the Python API has proved more stable over time...
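
For reference, a minimal sketch of what that looks like on the PyTorch side (the model and tensor here are only illustrative); note that empty_cache() returns PyTorch's cached blocks to the driver, while memory still referenced by live tensors stays allocated:

import gc
import torch

model = torch.nn.Linear(1024, 1024).cuda()          # illustrative model
out = model(torch.randn(64, 1024, device="cuda"))   # illustrative forward pass

# Drop the Python references first, then release the cached GPU blocks.
del out, model
gc.collect()
torch.cuda.empty_cache()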

@leimao

leimao commented Dec 13, 2019

tf.reset_default_graph()

This is deprecated in TF 2.0 and there is no direct equivalent in TF 2.0.

@leimao

leimao commented Dec 13, 2019

Please reopen this issue.

@EKami

EKami commented Feb 5, 2020

Same issue here, please reopen the issue... TF 2.0 didn't solve anything, just made my code run slower...

@asis-shukla

tf.reset_default_graph() is not working in 2.1.
How can I release GPU memory after model training in eager execution mode?
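
Not a full answer, but a minimal sketch (assuming TF 2.1+) of the memory-growth option that at least stops TensorFlow from grabbing all GPU memory up front; it does not return memory already allocated by the running process:

import tensorflow as tf

# Must run before any GPU is initialized, i.e. before building or training models.
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# ... build and train in eager mode ...
# To truly release the memory, run the training in a separate process
# (see the multiprocessing examples elsewhere in this thread).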

@pwais
Contributor

pwais commented Apr 8, 2020

@cy89 Please re-open. Closing this issue was not helpful to the community.

@duduscript

Same issue, please reopen the issue.

@cdeboeser

Same issue, please reopen.

@dsuthar-nvidia

Same issue. It seems crazy that a framework like TensorFlow does not even have one simple way to release the memory.

@asis-shukla

asis-shukla commented Jun 6, 2020

@dd1923 Can you try this one? On TensorFlow 2.1 it seems to work:
import tensorflow.keras.backend as K
K.clear_session()

@dsuthar-nvidia

@asis-shukla Tried it. Does not work. Still leaves all GPUs full, unfortunately.

@ucalyptus2

@asis-shukla doesn't work.

@GF-Huang

Doesn't work on tf 2.2.

@yinghuang

Someone please give a correct answer!

@maxvfischer

Someone please give a correct answer!

@yinghuang Like a lot of other people, I'm using subprocesses to release the memory, and it works fine.

E.g.

import multiprocessing

def run_inference_or_training(param1, param2, ...):
    ...

if __name__ == '__main__':
    p = multiprocessing.Process(
        target=run_inference_or_training,
        args=(param1, param2, )
    )
    p.start()
    p.join()  # Add this if you want to wait for the process to finish.

When the function run_inference_or_training is done, the process p will be closed down and the memory allocated inside run_inference_or_training will be released.

@newbiesitl

There are ways to reset the default graph in tf2:

1:
with tf.Graph().as_default():
    main()

2:
from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.InteractiveSession()

@OtakuOW

OtakuOW commented Nov 22, 2023

Hi, I have the same problem. I have a 12 GB RTX 4070 OC, but I can't create anything because it tells me I have no memory. Not being an expert in programming, could you explain to me very simply how to remedy this problem? Photos are also welcome.
