
Clearing GPU memory in Keras #12625

Closed
SphrGhfri opened this issue Apr 5, 2019 · 13 comments
Labels
type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited.

Comments

@SphrGhfri

80% of my GPU memory gets full after loading the pre-trained Xception model, but after deleting the model the memory is not freed. I've also tried K.clear_session(), gc.collect(), tf.reset_default_graph(), and del model, but none of them worked. GPU properties say 85% of memory is in use.

Nothing flushes GPU memory except numba.cuda.close(), but that won't let me use the GPU again. The only fix is restarting the kernel and rerunning my code.

I'm looking for a script I can add to my code so that I can train in a for loop and clear the GPU on every iteration.

Part of my code :

from keras.layers import Input
from keras.applications.xception import Xception

image_input = Input(shape=(224, 224, 3))
base_model = Xception(input_tensor=image_input, include_top=False, weights='imagenet')
base_model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['accuracy'])
hist = base_model.fit(X, Y, epochs=2)

System information

OS Platform: Windows 10 64-bit
TensorFlow installed from: conda install tensorflow-gpu
TensorFlow version: 1.3
Python version: 3.6
CUDA/cuDNN version: 9.2
GPU model and memory: Asus GTX 1060 6 GB
@jvishnuvardhan jvishnuvardhan added backend:tensorflow type:support User is asking for help / asking an implementation question. Stackoverflow would be better suited. labels Apr 5, 2019
@nateraw

nateraw commented Apr 9, 2019

This function looks promising (stolen from the fastai forums):

from keras.backend.tensorflow_backend import set_session
from keras.backend.tensorflow_backend import clear_session
from keras.backend.tensorflow_backend import get_session
import tensorflow
import gc

# Reset Keras session
def reset_keras():
    sess = get_session()
    clear_session()
    sess.close()
    sess = get_session()

    try:
        del classifier  # this is from global space - change this as you need
    except NameError:
        pass

    print(gc.collect())  # if it freed something, you should see a nonzero number printed

    # use the same config as you used to create the session
    config = tensorflow.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    set_session(tensorflow.Session(config=config))
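For the for-loop workflow the original poster asked about, a function like the one above would be called between runs. A minimal sketch of that pattern, with reset_fn, build_fn, and fit_fn passed in as placeholders (hypothetical names, not from this thread) standing in for reset_keras(), your model factory, and your training call:

```python
# Hedged sketch: retrain several times, resetting the Keras session
# between runs. reset_fn would be reset_keras() from above; build_fn
# and fit_fn stand in for your own model factory and training call.
def train_in_loop(n_runs, reset_fn, build_fn, fit_fn):
    histories = []
    for _ in range(n_runs):
        reset_fn()                 # free the old session/GPU memory first
        model = build_fn()         # e.g. Xception(...) compiled as in the OP
        histories.append(fit_fn(model))
    return histories
```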

@SphrGhfri
Author

@nateraw
Thank you very much!
This is the only script that lets me rerun a training session without restarting. Now I can use a for loop to train multiple times instead of restarting the kernel for every run.

@Moondra

Moondra commented Jun 19, 2019

> (quoting nateraw's reset_keras() snippet above)

I keep getting "CUDA_ERROR_OUT_OF_MEMORY" when running the above function. The only thing that clears up my memory is restarting my computer.

@ambigus9

ambigus9 commented Sep 6, 2019

> (quoting nateraw's reset_keras() snippet above)

Thanks! Works perfectly!

@cpoptic

cpoptic commented Oct 2, 2019

Running the above reset_keras() function still throws an OOM error on Ubuntu with 16 GB of GPU memory:

InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: out of memory

TensorFlow installed from conda install tensorflow-gpu
TensorFlow version: 1.14
Python version: 3.6
CUDA/cuDNN version: 10.0.168
GPU model and memory: Tesla V100-PCIE-16GB 16gb

Same when I try running:

import tensorflow as tf
from tensorflow.keras import backend as K

# close the current session (TF 1.x API)
curr_session = tf.get_default_session()
if curr_session is not None:
    curr_session.close()
# reset the graph
K.clear_session()
# create a new session
s = tf.InteractiveSession()
K.set_session(s)

I find it fascinating that the TensorFlow team has not provided a straightforward way to clear GPU memory from a session. So much is broken with TF, little annoyances like this included; a user reasonably expects TF to release CUDA memory and not leak it, yet there appears to be no explicit way to do so. Even K.clear_session() doesn't work. That expectation is not unreasonable. Maybe the blame should be directed at NVIDIA, since even the following code doesn't clear the memory:

from numba import cuda
cuda.select_device(0)
cuda.close()

After several hours of scouring StackOverflow and the Github issues, and trying the above approaches (none of which worked for some reason), I'm left with the decidedly inelegant approach of restarting the entire kernel. Frustrating.
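One workaround not spelled out in this thread: run each training job in a child process, so the CUDA context dies with the process and the driver reclaims all GPU memory on exit. A minimal sketch under that assumption; train_once is a hypothetical stand-in for the real model-building and fitting code:

```python
import multiprocessing as mp

def train_once(run_id, queue):
    # In real use you would import TF/Keras *inside* this function, build
    # the model, and fit it; the CUDA context then lives only in this
    # process. Here we put a placeholder "loss" on the queue so the
    # sketch runs as-is without TensorFlow installed.
    queue.put((run_id, 0.5 / (run_id + 1)))

def run_isolated(n_runs):
    results = []
    for i in range(n_runs):
        q = mp.Queue()
        p = mp.Process(target=train_once, args=(i, q))
        p.start()
        results.append(q.get())   # read before join to avoid queue deadlock
        p.join()                  # GPU memory held by the child is freed here
    return results
```

Because all TF imports happen in the child, each iteration starts with a clean GPU, at the cost of re-initializing CUDA per run.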

@ilyarudyak

Why do they close issues that aren't actually solved at all?

@mattalhonte

Having this issue too. Restarting doesn't always fix it either; even shutting down Jupyter doesn't necessarily help. I'm on Windows and have no idea how to fix this when it pops up. Any time I restart a kernel with TF on it, there's a random chance it will happen again after restarting.

@akshaybabloo

Is this still happening if you use fit_generator()? Looking at the OP's issue with full memory, I'd guess it's because the dataset is too big.

@elmahyai

Any solution yet?

@azelk

azelk commented Mar 29, 2020

> (quoting cpoptic's comment above about restarting the entire kernel)

I have to do the same ☹️

@omertortumlu

TensorFlow version: 2.1.0
Keras version: 2.3.1
When I use the reset_keras() function, I get this error:
get_session is not available when using TensorFlow 2.0.
Do you have a working function for this version? :)

@devdimit93

> (quoting omertortumlu's question above)

This works for me on Google Colab with TF 2.3.0:

import tensorflow as tf

# Reset Keras session (TF 2.x, via the compat.v1 API)
def reset_keras():
    sess = tf.compat.v1.keras.backend.get_session()
    tf.compat.v1.keras.backend.clear_session()
    sess.close()
    sess = tf.compat.v1.keras.backend.get_session()

    try:
        del classifier  # this is from global space - change this as you need
    except NameError:
        pass

    # use the same config as you used to create the session
    config = tf.compat.v1.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = 1
    config.gpu_options.visible_device_list = "0"
    tf.compat.v1.keras.backend.set_session(tf.compat.v1.Session(config=config))
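For TF 2.x there is also a configuration-level option (a sketch, not code from this thread): enable memory growth so TF allocates GPU memory on demand instead of grabbing nearly all of it at startup, and use the TF2-native clear_session between runs. The memory-growth calls must run before any model or tensor touches the GPU.

```python
import gc
import tensorflow as tf

# Config fragment: run at program start, before building any model.
# With memory growth enabled, TF grows its allocation as needed rather
# than reserving almost the whole card up front.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)

# Between runs in TF 2.x, the native reset is:
tf.keras.backend.clear_session()   # drop the global Keras graph/state
gc.collect()                       # release lingering Python references
```

Note that even then the memory already handed to TF's allocator is cached for reuse within the process, not returned to the driver; only process exit fully releases it.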

@SUFEHeisenberg

> (quoting devdimit93's TF 2.x reset_keras() snippet above)

Thanks a lot for sharing your TF 2.0 code, but I still can't get it working with TF 2.2.0 and Keras 2.3.1. After I run your code, the console shows the following messages, but the GPU is still almost full, which confuses me a lot:

2021-05-09 23:12:03.111033: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:21:00.0 name: Quadro P2000 computeCapability: 6.1
coreClock: 1.4805GHz coreCount: 8 deviceMemorySize: 5.00GiB deviceMemoryBandwidth: 130.53GiB/s
2021-05-09 23:12:03.111607: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-05-09 23:12:03.111866: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-05-09 23:12:03.112118: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-05-09 23:12:03.112363: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-05-09 23:12:03.112637: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-05-09 23:12:03.112897: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-05-09 23:12:03.113170: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-05-09 23:12:03.113497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-05-09 23:12:03.113796: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-09 23:12:03.114042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2021-05-09 23:12:03.114196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2021-05-09 23:12:03.114435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3841 MB memory) -> physical GPU (device: 0, name: Quadro P2000, pci bus id: 0000:21:00.0, compute capability: 6.1)

May I ask whether you have ever run into this issue?
BTW, could you tell me what classifier (from the global space) refers to in your own code? Should I just del model?
