Switching GPUs before training when tensorflow is used #1602

Open
snurkabill opened this Issue Jan 30, 2016 · 23 comments

@snurkabill
Contributor

snurkabill commented Jan 30, 2016

Hi,

I would like to ask how one can change the GPU used when Keras is in use. I have two GPUs on my machine and I would like to run two separate scripts on them, parametrized by GPU id.

Is it possible?

The use case in TF is to run a session with device("/gpu:x"), where "x" is the GPU's id.
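
For reference, a minimal Tensorflow-only sketch of that pattern (the device id and the ops are placeholders):

import tensorflow as tf

# Pin graph construction to the second GPU; id 1 is an arbitrary choice.
with tf.device("/gpu:1"):
    a = tf.constant([1.0, 2.0])
    b = a * 2.0

# allow_soft_placement falls back to CPU for ops without GPU kernels.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(b))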

@grahamannett

Contributor

grahamannett commented Feb 3, 2016

I'm curious about this as well, since Theano seems to need the device configured at startup (via flags or .theanorc), and this isn't covered here: http://keras.io/faq/#how-can-i-run-keras-on-gpu. I tried to get it to work with (prior to Keras fitting the model):

with tf.Session() as sess:
    with tf.device("/gpu:1"):
        # build and fit the Keras model here
        pass

but it didn't work.

@parag2489

Contributor

parag2489 commented Feb 3, 2016

This article on running Keras on multiple GPUs may give you some direction, though it's untested.

@jfsantos

Contributor

jfsantos commented Mar 7, 2016

The referenced article is only valid for Theano, and setting a device in .theanorc or running Python with THEANO_FLAGS=device=gpu0 also works. Any ideas on how to do this in Tensorflow? I have a machine with 2 GPUs and would like to run two experiments in parallel, one on each GPU, but Tensorflow always picks the first GPU by default.

@parag2489

Contributor

parag2489 commented Mar 7, 2016

I think this link explains using multiple GPUs and device placement in Tensorflow.

@jfsantos

Contributor

jfsantos commented Mar 7, 2016

Following the link @parag2489 posted, I did the following: I created my model inside a with tf.device('/gpu:1'): context and changed the session to allow soft placement (in case the model uses any Tensorflow operations that do not have GPU implementations). Here's an example:

import keras.backend.tensorflow_backend as K
from keras.models import Sequential

with K.tf.device('/gpu:1'):
    K._set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)))
    model = Sequential()
    # define the rest of your model here...
    model.compile(loss='mse', optimizer='adam')

(note: you should set log_device_placement=False, I'm using it just for debugging)

This seems to work properly, but I did not test with all operations to make sure.

@fchollet Any comments on this? Should we add a function to configure the Tensorflow backend to specify a device? I think we could maybe hide device selection easily using the variable and placeholder functions.

@grahamannett

Contributor

grahamannett commented Mar 8, 2016

It seems like there could be a function for every backend that lets the user set GPU/CPU placement through Keras rather than through Theano/Tensorflow/etc.

@snurkabill

Contributor

snurkabill commented Mar 8, 2016

Solved by #1918.

@jfsantos

Contributor

jfsantos commented Mar 8, 2016

@grahamannett If we design such an API, we have to keep in mind that you can allocate different parts of a model to different devices (while at the same time not making it extremely complicated). It is straightforward to do in Tensorflow, and there is a currently experimental way to do it in Theano as well.

@fchollet

Collaborator

fchollet commented Mar 8, 2016

Any PR to add such functionality to the backend would be really welcome!


@jfsantos

Contributor

jfsantos commented Mar 8, 2016

@fchollet Do you think we should specify devices on a per-layer basis? The API would get a little bloated, though. What I would really like to do is something similar to what is done in Tensorflow, where you can specify a "device context" and anything declared within that scope is allocated to that device. Any ideas?

@fchollet

Collaborator

fchollet commented Mar 8, 2016

Why not just do what TF is doing? It seems like a good API.

By the way, note that the variables used by a layer are instantiated when that layer gets connected to the next one. Something to keep in mind with regard to scopes.


@jfsantos

Contributor

jfsantos commented Mar 8, 2016

I think we could do something similar to what Tensorflow is doing and adapt the Theano backend to do the same thing. My idea is to add an implementation for the Theano backend that replaces the device parameter in Theano calls with the device specified by our scopes.

@leocnj

leocnj commented Dec 28, 2016

@snurkabill If possible, could you please post a recipe showing how to use this newly added function? Thanks.

@snurkabill

Contributor

snurkabill commented Jan 3, 2017

@leocnj Just use the with tf.device("/gpu:x"): statement syntax in your Keras code. It's the same as what TF has, but propagated via Keras.
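
A minimal recipe sketch (the device id, layer sizes, and session config are placeholders; on Keras 2 the public set_session is the one to use, as noted further down the thread):

import keras.backend.tensorflow_backend as K
from keras.models import Sequential
from keras.layers import Dense

with K.tf.device('/gpu:1'):
    K.set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True)))
    model = Sequential()
    model.add(Dense(10, input_shape=(20,), activation='relu'))
    model.compile(loss='mse', optimizer='adam')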

@wolfrevoda

wolfrevoda commented Mar 15, 2017

Use the CUDA_VISIBLE_DEVICES environment variable to choose which GPUs are visible to the process; this indirectly decides which GPU gets used.
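
For example (a minimal sketch; the script name is a placeholder):

CUDA_VISIBLE_DEVICES=1 python train_script.py

or, from Python, before tensorflow/keras are imported:

import os

# Expose only the second physical GPU; inside the process it shows up as /gpu:0.
os.environ['CUDA_VISIBLE_DEVICES'] = '1'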

@ktamiola

ktamiola commented Mar 23, 2017

@jfsantos I have tried to follow:

import keras.backend.tensorflow_backend as K
from keras.models import Sequential

with K.tf.device('/gpu:1'):
    K._set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)))
    model = Sequential()
    # define the rest of your model here...
    model.compile(loss='mse', optimizer='adam')

however I get the following error:

  2
      3 with K.tf.device('/gpu:1'):
----> 4     K._set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)))
      5     model = Sequential()
      6    # define the rest of your model here...

AttributeError: 'module' object has no attribute '_set_session'

Keras 2.0.2 / Tensorflow 0.12.1

@jfsantos

Contributor

jfsantos commented Mar 27, 2017

@ktamiola Keras 2.0 was rewritten, so that API has probably changed. If you want to use only one GPU, use the CUDA_VISIBLE_DEVICES approach instead.

@tanjosh

tanjosh commented Apr 14, 2017

@jfsantos I am getting the same error as @ktamiola, and I have to use the 2nd GPU.

@Vishruit

Vishruit commented May 1, 2017

@ktamiola @tanjosh You can use K.set_session(...). I have tried it and it works!

But I am still looking for "data parallelism" in Keras... :/ Any luck?
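
Newer Keras releases (2.0.9+) provide keras.utils.multi_gpu_model for simple data parallelism; a minimal sketch (the model and data are placeholders):

from keras.utils import multi_gpu_model

# Replicates the model on 2 GPUs, splitting each input batch between them.
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(loss='categorical_crossentropy', optimizer='adam')
parallel_model.fit(x_train, y_train, epochs=10, batch_size=256)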

@skjerns

skjerns commented Jun 7, 2017

I am trying to run two models on two GPUs at the same time inside the same script.

It seems like CUDA_VISIBLE_DEVICES only works when set before starting my script, but I want to switch GPUs during training/validation.

I've tried

with K.tf.device('/gpu:{}'.format(gpu)):
    K.set_session(K.tf.Session(config=K.tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)))
    model = Sequential(name='ann')
    model.add(Dense(50, input_shape=input_shape, activation='elu', kernel_initializer='he_normal'))
    model.add(BatchNormalization())
    model.add(Dense(n_classes, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=[keras.metrics.categorical_accuracy])

But I get a lot of warnings:

WARNING:tensorflow:Tried to colocate 
gradients_1/batch_normalization_4/moments/sufficient_statistics/count_grad/Const_1 with an op 
batch_normalization_4/moments/sufficient_statistics/count that had a different device: /device:CPU:0 vs 
/device:GPU:0. Ignoring colocation property.

Shouldn't the soft placement prevent those warnings? Can I just ignore them? The model seems to run fine.

@raginisharma14

raginisharma14 commented Aug 28, 2017

@Vishruit Could you please paste the code you tried? K.set_session is not working for me.

@stale

stale bot commented Nov 26, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

@stale stale bot added the stale label Nov 26, 2017

@RaphaelRoyerRivard


RaphaelRoyerRivard commented Jul 24, 2018

I tried with K.tf.device('/cpu:0'): but got an AttributeError: module 'keras.backend.tensorflow_backend' has no attribute 'device'. Any chance this feature is still available in the latest version of Keras?
I would like to be able to run predictions on the CPU and fit on the GPU.
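
One workaround sketch, assuming the TensorFlow backend and using tf.device directly (build_model is a hypothetical model-building helper; device strings are placeholders). Note that device scopes apply at graph-construction time, so each copy of the model must be built inside the scope that should own its ops:

import tensorflow as tf

# Build and fit on the GPU.
with tf.device('/gpu:0'):
    model = build_model()  # hypothetical helper that returns a compiled-ready model
    model.compile(loss='mse', optimizer='adam')
    model.fit(x_train, y_train)

# For CPU-only inference, build a second copy of the graph and share weights.
with tf.device('/cpu:0'):
    cpu_model = build_model()
    cpu_model.set_weights(model.get_weights())
    predictions = cpu_model.predict(x_test)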
