
Different behaviour of Keras 2.0.9 and 2.0.8 #8353

Closed
mdoulaty opened this issue Nov 2, 2017 · 9 comments

Comments

@mdoulaty

mdoulaty commented Nov 2, 2017

With a pip installation of Keras 2.0.9 (the latest as of now) and the TF backend, all available GPU resources are allocated immediately after importing Keras. In 2.0.8, this GPU allocation did not happen on import. Is this expected behaviour in 2.0.9?

@yanpanlau

I am seeing the same behavior. The following code works in 2.0.8 but not in 2.0.9:

import tensorflow as tf

# Create a session whose GPU memory grows on demand instead of being
# allocated all at once, then hand it to Keras before it creates its own.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

from keras import backend as K
K.set_session(sess)

@AndrasEros

Maybe this change was introduced here:
#8311

@mark86092

mark86092 commented Nov 2, 2017

I encountered the same problem.

The code here https://github.com/fchollet/keras/blob/master/keras/backend/tensorflow_backend.py#L50

_LOCAL_DEVICES = device_lib.list_local_devices()

attempts to allocate the GPU resources. Not sure if this is expected behavior.

@datumbox
Contributor

datumbox commented Nov 3, 2017

This is a known issue. The specific method call re-registers all the GPUs/resources instead of just counting the number of available devices. I intend to send a patch over the weekend.

@datumbox
Contributor

datumbox commented Nov 3, 2017

The problem is that device_lib.list_local_devices() initialises a TF session and registers all available GPUs on the system. Judging from the name of the function, I believe this is a bug in TensorFlow: I don't see why listing the devices should require registering them in a session.

Reproducing the problem is tricky, as you also need more than one GPU. Here is a pure-TF example that shows the problem:

$ python
Python 2.7.6 (default, Oct 26 2016, 20:30:19) 
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> 
>>> tf.__version__
'1.4.0'
>>> 
>>> config = tf.ConfigProto()
>>> config.gpu_options.per_process_gpu_memory_fraction = 0.9
>>> config.gpu_options.visible_device_list = str('1')
>>> sess = tf.Session(config=config)
2017-11-03 13:02:14.730453: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-11-03 13:02:14.966925: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro K2200 major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:81:00.0
totalMemory: 3.95GiB freeMemory: 3.91GiB
2017-11-03 13:02:14.967000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 1, name: Quadro K2200, pci bus id: 0000:81:00.0, compute capability: 5.0)

As we can see, it has registered only GPU 1. This can be confirmed with nvidia-smi:

$ nvidia-smi 
Fri Nov  3 13:03:05 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2200        Off  | 00000000:03:00.0  On |                  N/A |
| 42%   51C    P0     2W /  39W |    613MiB /  4040MiB |      9%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro K2200        Off  | 00000000:81:00.0 Off |                  N/A |
| 42%   41C    P8     1W /  39W |   3676MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1397      G   /usr/bin/X                                   399MiB |
|    0      2772      G   compiz                                       200MiB |
|    1     16788      C   python                                      3664MiB |
+-----------------------------------------------------------------------------+

In the same Python shell, let's call the method that lists the available GPUs:

>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2017-11-03 13:04:08.611074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: 
name: Quadro K2200 major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:03:00.0
totalMemory: 3.95GiB freeMemory: 3.31GiB
2017-11-03 13:04:08.611198: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Device peer to peer matrix
2017-11-03 13:04:08.611248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1051] DMA: 0 1 
2017-11-03 13:04:08.611263: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 0:   Y N 
2017-11-03 13:04:08.611275: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1061] 1:   N Y 
2017-11-03 13:04:08.611293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: Quadro K2200, pci bus id: 0000:03:00.0, compute capability: 5.0)
2017-11-03 13:04:08.611315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:1) -> (device: 1, name: Quadro K2200, pci bus id: 0000:81:00.0, compute capability: 5.0)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13475891616555218543
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3240755200
locality {
  bus_id: 1
}
incarnation: 15732487042202847083
physical_device_desc: "device: 0, name: Quadro K2200, pci bus id: 0000:03:00.0, compute capability: 5.0"
, name: "/device:GPU:1"
device_type: "GPU"
memory_limit: 147456000
locality {
  bus_id: 2
}
incarnation: 7726238831769587034
physical_device_desc: "device: 1, name: Quadro K2200, pci bus id: 0000:81:00.0, compute capability: 5.0"
]

Ooops! It just registered both GPUs! Let's confirm with nvidia-smi:

$ nvidia-smi 
Fri Nov  3 13:04:28 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.90                 Driver Version: 384.90                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2200        Off  | 00000000:03:00.0  On |                  N/A |
| 42%   51C    P0     4W /  39W |   3740MiB /  4040MiB |     14%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro K2200        Off  | 00000000:81:00.0 Off |                  N/A |
| 42%   42C    P8     1W /  39W |   3676MiB /  4042MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1397      G   /usr/bin/X                                   410MiB |
|    0      2772      G   compiz                                       200MiB |
|    0     16788      C   python                                      3116MiB |
|    1     16788      C   python                                      3664MiB |
+-----------------------------------------------------------------------------+

As we can see, the process has now also acquired GPU 0 and is using all of its available resources.
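For reference, the memory_limit values in the device listing above are in bytes; converting them (a quick sanity check, not part of the original repro) shows how much memory TF carved out on each card:

```python
# memory_limit values copied from the device listing above, in bytes.
gpu0_limit = 3240755200
gpu1_limit = 147456000

gib = 1024 ** 3  # bytes per GiB
print(round(gpu0_limit / gib, 2))  # GPU:0 -> 3.02 GiB (nearly all of its free memory)
print(round(gpu1_limit / gib, 2))  # GPU:1 -> 0.14 GiB (most was already held by our session)
```

This matches the nvidia-smi output: the listing call grabbed almost everything that was still free on GPU 0.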

@alxy

alxy commented Nov 3, 2017

To work around this temporarily, you can make only specific GPUs visible before any Keras import:

import os

# Must run before the first Keras/TensorFlow import in the process.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

from keras.engine import Model

This way, Keras will only use the GPU with ID 1.
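The same mechanism can also hide every GPU, which is handy for forcing a CPU-only import on a shared machine. A sketch; as above, the variable must be set before TensorFlow is first imported in the process:

```python
import os

# An empty value hides all GPUs from TensorFlow, so importing Keras
# cannot allocate any GPU memory at all. This only takes effect if it
# runs before the first `import tensorflow` / `import keras`.
os.environ['CUDA_VISIBLE_DEVICES'] = ''
```

Once the TF runtime has started, changing this variable has no effect on device visibility.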

@alsrgv
Contributor

alsrgv commented Nov 5, 2017

This affects Horovod as well. Unfortunately, the CUDA_VISIBLE_DEVICES workaround is not desirable, as it prevents NCCL from doing CUDA IPC.

@fchollet
Member

fchollet commented Nov 5, 2017

Please take a look at the outstanding fix: #8377

@wt-huang

Closing as this is resolved.
