How to release GPU memory after sess.close()? #19731
I am not 100% sure about how GPU memory gets released, but these are some methods that I have used.
I think issue #1578 had a similar problem, here is the link: #1578
hi, @JaeDukSeo
Why is the GPU memory usage still lingering after sess.close() and del graph?
hi, all
Nagging Assignee @cy89: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.
I have a similar problem, and a couple of questions in this regard: @githubgsq, when you mention the method from #17048, do you mean moving your TensorFlow session code into a subprocess, so that the GPU memory is released when the subprocess exits? @JaeDukSeo, you mention setting allow_growth; does that help release the memory? Thanks in advance.
@saxenarohan97 Yes, a subprocess runs the session code. I also call tf.reset_default_graph() before the subprocess executes.
@JaeDukSeo do you happen to have an answer for @saxenarohan97 ? |
@cy89 I agree with @saxenarohan97: even if allow_growth is set to true, I don't think TensorFlow automatically deallocates GPU memory.
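For reference, the allow_growth option discussed above is set like this in the TF 1.x API (a sketch of the standard usage, not a fix for this issue; as noted in the thread, it only stops TensorFlow from grabbing all GPU memory up front and does not return memory to the driver afterwards):

```python
import tensorflow as tf

# Let the allocator grow on demand instead of claiming the whole GPU.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
# ... run your graph ...
sess.close()  # releases TF-side bookkeeping, but the allocator keeps the memory
```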
@JaeDukSeo thanks for your reply! |
I use numba to release the GPU. With TensorFlow I cannot find an effective method.
@TanLingxiao were you able to find any other method? numba is a great way, with the drawback that once you run cuda.close(), you can no longer use the GPU again in the same process/session. I was hoping that TensorFlow had a config option to free GPU memory after processing ends.
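The numba route mentioned above usually looks like the sketch below (assuming numba is installed, a CUDA GPU is available, and GPU 0 is the device to free). It carries the caveat stated above: after cuda.close(), this process cannot reliably launch CUDA work again.

```python
from numba import cuda

# Bind to GPU 0 (assumed index) and tear down its CUDA context,
# returning the context's memory to the driver. Any framework that
# had a context in this process (e.g. TensorFlow) loses GPU access.
cuda.select_device(0)
cuda.close()
```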
Hi, I'm new to GitHub and have been trying to get TensorFlow running in Python. In principle I have it up and running, but the GPU memory is not released, causing an OOM error at some point. These few lines already clutter the memory:

```python
import tensorflow as tf
```

I've been googling the problem, but so far I have only managed to solve it by either killing the session via

```python
from numba import cuda
```

As mentioned above, I would like to avoid killing the session and thus losing my variables in memory (used to train a NN). I'd be very thankful for any suggestions on what to do with the code snippet above to ensure that the GPU memory is free in the end.
Exactly the same problem for me.
Have the same issue here; I can only fit a model once using Keras with the TensorFlow backend, and the second time (with the very same model), it just crashes (OOM error).
+1 |
I have solved this issue with some kind of duct tape: I used a bash script that launches my module multiple times; after every execution the GPU memory is released. It is also possible to use
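The bash-script workaround described above can be as simple as the loop below. Each iteration runs in a fresh interpreter, so the CUDA driver reclaims all GPU memory when that process exits; `train.py` would be your real entry point, and `python3 -c` is used here only as a runnable stand-in:

```shell
# Replace the python3 -c stand-in with your real entry point,
# e.g.: python3 train.py --run "$run"
for run in 1 2 3; do
  python3 -c "print('run ${run} finished')"
done
```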
I have solved it by running the session in a separate thread. When the session completes and the process is killed, the memory used by that process is released. Remember to save your session results to disk in the same method.
@marinone94 do you have code / proof? I tried a thread too, but the only thing that worked for me was to use a subprocess. Others have seen similar: #20387 #15880
@yurmchg @marinone94 |
@p890040 can you please post a code example? I definitely tried python threads to no avail.
This may help |
```python
import tensorflow_gpu as tf

econdModuleUrl = "https://tfhub.dev/google/universal-sentence-encoder/2"

tfconfig = tf.ConfigProto()
# and use it to instantiate the session, but I'm quite confident you don't need it

def ThreadCall(task, args):
    ...

def SentenceEmbedding(sentences):
    ...

sentences = ['This is the first sentence', 'and this is the second', 'just to show how to release ']
```
thanks for this! |
Example: So as I understand it, the only method so far is to run model loading and prediction in a separate process, or maybe to pass the same session to every model used, clearing the old graph before loading a new model.
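The "clear the old graph before loading a new model" variant is typically written as below in TF 1.x (a sketch with hypothetical model directories; note, per this thread, that it resets the Python-side graph but does not reliably return GPU memory to the driver):

```python
import tensorflow as tf

for model_dir in ['model_a', 'model_b']:  # hypothetical checkpoint dirs
    tf.reset_default_graph()              # drop the previous graph definition
    with tf.Session() as sess:
        # build or restore the model here, then run training/inference
        ...
    # sess.close() is implied by the context manager; the graph is gone,
    # but the GPU allocator generally keeps the memory for reuse.
```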
I tried the following things, but none guaranteed to free the memory up.
Creating separate processes always worked and guaranteed that the memory was freed. Furthermore, that helped me manage and allocate resources the way I wanted.
+1 @saravanabalagi subprocess works for me, but that is in many cases impractical. |
Seeing this issue (and its many variations) closed, we can see it is a design flaw and should move on. For new projects that need no JavaScript runtime I will always recommend PyTorch, which has a dedicated function for releasing GPU memory. That saves a lot of hacking around memory management, and as a bonus the Python API has proved more stable over time...
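The dedicated PyTorch function alluded to above is presumably torch.cuda.empty_cache(); a minimal sketch (requires PyTorch with CUDA support):

```python
import torch

x = torch.empty(1024, 1024, device='cuda')  # allocate something on the GPU
del x                                       # drop the Python reference
torch.cuda.empty_cache()                    # return cached blocks to the driver
```

Note that empty_cache() releases only memory whose tensors have already been freed; live tensors must be deleted (or go out of scope) first.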
This is deprecated in TF 2.0, and there is no equivalent in TF 2.0.
Please reopen this issue. |
Same issue here, please reopen the issue... TF 2.0 didn't solve anything, just made my code run slower... |
tf.reset_default_graph() not working in 2.1. |
@cy89 Please re-open. Closing this issue was not helpful to the community. |
Same issue, please reopen the issue. |
Same issue, please reopen. |
Same issue. It seems crazy that a framework like TensorFlow does not even have one simple way to release the memory.
@dd1923 Can you try this one? On TensorFlow 2.1 it seems to work.
@asis-shukla Tried it. It does not work; it still leaves all GPUs full, unfortunately.
@asis-shukla doesn't work. |
Does not work on tf-2.2.
Someone please give a correct answer! |
@yinghuang As a lot of other people, I'm using subprocesses to release the memory, which works fine. E.g.

```python
import multiprocessing

def run_inference_or_training(param1, param2, ...):
    ...

if __name__ == '__main__':
    p = multiprocessing.Process(
        target=run_inference_or_training,
        args=(param1, param2, )
    )
    p.start()
    p.join()  # Add this if you want to wait for the process to finish. When the function
```
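A self-contained illustration of the same subprocess pattern (no TensorFlow required, so the process lifecycle is easy to see; the worker function stands in for session code, and results come back through a queue):

```python
import multiprocessing

def worker(x, queue):
    # In the real use case, build the graph, run the session, and put the
    # results on the queue here; everything the process allocated
    # (including GPU memory) is returned to the OS/driver when it exits.
    queue.put(x * x)

def run_in_subprocess(x):
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(x, queue))
    p.start()
    result = queue.get()  # fetch the result before joining to avoid deadlock
    p.join()
    return result

if __name__ == '__main__':
    print(run_in_subprocess(6))  # prints 36
```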
There are ways to get a fresh default graph in TF2, e.g. `with tf.Graph().as_default():`
Hi, I have the same problem. I have a 12 GB RTX 4070 OC, but I can't create a session because it tells me I have no memory. Not being an expert in programming, could you explain to me very simply how to remedy this problem? Photos are also welcome.
hi, all:
I'm training models iteratively. After each model is trained, I run sess.close() and recreate a new session to run a new training process. But it seems that the GPU memory is not released, and it keeps increasing.
I tried tf.reset_default_graph() before running the session, and also called gc.collect() after sess.close(), but it has no effect.
How can I release GPU memory in time to avoid OOM errors?
Thanks!