Does Keras support using multiple GPUs? #2436
Comments
Yes, you can run Keras models on multiple GPUs. This is only possible with the TensorFlow backend for the time being, because the Theano feature is still rather new. We are looking at adding support for multi-GPU in Theano in the near future (it should be fairly straightforward). With the TensorFlow backend, you can achieve this the same way as you would in pure TensorFlow: by using the `tf.device()` scope.
I'm looking forward to it 😃

The `tf.device()` scope?

Any example of using multiple GPUs with TF?
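For illustration, here is a minimal sketch of the `tf.device()` placement approach (TF1-era API; the toy ops are placeholders, not code from this thread):

```python
import tensorflow as tf

# Ops created inside a tf.device() scope are pinned to that device.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0]) * 2.0  # runs on GPU 0
with tf.device('/gpu:1'):
    b = tf.constant([3.0, 4.0]) * 2.0  # runs on GPU 1
with tf.device('/cpu:0'):
    c = a + b                          # results combined on the CPU

# allow_soft_placement falls back to another device if a GPU is missing.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(c))
```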
Hm. Theano has libgpuarray, which allows one to push shared variables to different devices. It will not do all the work of recombining weight matrices for you, but with a little effort you could use multiple GPUs.
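A minimal sketch along the lines of the Theano 0.8 multi-GPU documentation (the context names `dev0`/`dev1` are assumptions and must be mapped to GPUs via Theano flags, e.g. `THEANO_FLAGS="contexts=dev0->cuda0;dev1->cuda1"`):

```python
import numpy as np
import theano
import theano.tensor as T

# Push one shared variable to each GPU context.
v0 = theano.shared(np.random.rand(1024, 1024).astype('float32'), target='dev0')
v1 = theano.shared(np.random.rand(1024, 1024).astype('float32'), target='dev1')

# Each dot product runs on the device holding its operand.
f = theano.function([], [T.dot(v0, v0), T.dot(v1, v1)])
r0, r1 = f()
```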
There is Platoon, a project on top of Theano for data parallelism.
Fred

I have looked into Platoon and it seemed like it was pretty much compatible.
The way libgpuarray works is by mapping variables to different GPUs.
What's the priority of adding multi-GPU support for the Theano backend?

I think it would expand the user base for Keras. I have several Titan Xs in the same machine.
How does this actually work in TensorFlow? There is a brief tutorial here: http://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html. I understand the concept of running the model replicas on separate GPU devices and then merging the weights, but how do we actually run this? Instead of model.fit, do we call merged.fit on the result of the merged models?
@tetmin I have the same confusion as you. Although the blog shows how to run model prediction on different GPUs, it is still unclear how to train the same model across different GPUs on a single machine, i.e. I need data parallelism and don't know how to implement it in Keras with TensorFlow as the backend.

Agreed with @pengpaiSH and @tetmin. Hope there will be more details.

@rudaoshi Well, I know this may not be proper to suggest since we are in the Keras community, and personally I am a Big Big Big fan of Keras! We know TensorFlow can utilize multiple GPUs by computing averaged gradients across different devices; however, I am expecting Keras to provide a simple and unified API (in Keras's style) that lets me focus on the big picture and hides those I/O and parallel-computing details. For the time being, in order to make good use of multiple GPUs, I am writing my deep learning programs with MXNet, where I only specify the GPU IDs and the lib does everything it needs under the hood.
@fchollet I saw your blog post about multi-GPU training, thanks for pointing out the way to do it, but I would really appreciate it if, say, model.fit() had a gpu=n option. I'm willing to implement my own version of that, may I ask for suggestions? Or I'm willing to contribute multi-GPU training to Keras with more abstraction for end users. Thanks in advance!

@WenchenLi +1

@WenchenLi did you create a PR for the multi-GPU abstraction?
Hope someone can contribute multi-GPU training to Keras. Thanks in advance. I have two GPUs. I did not do anything to set which GPU would be used for training, but when I checked with nvidia-smi, I found both of them were in use.
@anewlearner apparently this is the intended functionality of TF. See tensorflow/tensorflow#5066 for details. Looking forward to a simplified version of multi-GPU :)
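A common workaround, for illustration: hide all but one GPU from TensorFlow before it initializes.

```python
import os

# Must be set before TensorFlow is imported/initialized; otherwise TF
# allocates memory on every visible GPU.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # expose only GPU 0
```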
For data parallelization in Keras, you can use this approach:

```python
import tensorflow as tf
from keras import backend as K
from keras.models import Model
from keras.layers import Input, merge
from keras.layers.core import Lambda


def slice_batch(x, n_gpus, part):
    # Divide the batch into n_gpus slices and return slice number `part`.
    sh = K.shape(x)
    L = sh[0] // n_gpus
    if part == n_gpus - 1:
        return x[part * L:]
    return x[part * L:(part + 1) * L]


def to_multi_gpu(model, n_gpus=2):
    # Feed each GPU a slice of the batch through a copy of the model,
    # then concatenate the tower outputs on the CPU.
    with tf.device('/cpu:0'):
        x = Input(model.input_shape[1:], name=model.input_names[0])

    towers = []
    for g in range(n_gpus):
        with tf.device('/gpu:' + str(g)):
            slice_g = Lambda(slice_batch, lambda shape: shape,
                             arguments={'n_gpus': n_gpus, 'part': g})(x)
            towers.append(model(slice_g))

    with tf.device('/cpu:0'):
        merged = merge(towers, mode='concat', concat_axis=0)

    return Model(input=[x], output=merged)
```

To use, just take any model and set `model = to_multi_gpu(model)`. `model.fit()` and `model.predict()` should work without any change.
@jonilaserson, looks great! Does this work with the Theano backend or only TF?

@jonilaserson Could you please provide more detailed comments for the code?

I tested the code provided by @jonilaserson and got an error.

@anewlearner Have you solved the problem you met with before?
There was an indentation error in the code I posted. Here is a piece of code that should work:

```python
import tensorflow as tf
from keras import backend as K
from keras.models import Model
from keras.layers import Input, merge
from keras.layers.core import Lambda


def slice_batch(x, n_gpus, part):
    """Divide the input batch into [n_gpus] slices, and obtain slice no.
    [part]. E.g. if len(x) = 10 and n_gpus = 2, then slice_batch(x, 2, 1)
    returns x[5:].
    """
    sh = K.shape(x)
    L = sh[0] // n_gpus
    if part == n_gpus - 1:
        return x[part * L:]
    return x[part * L:(part + 1) * L]


def to_multi_gpu(model, n_gpus=2):
    """Given a keras [model], return an equivalent model which parallelizes
    the computation over [n_gpus] GPUs.

    Each GPU gets a slice of the input batch, applies the model on that slice,
    and later the outputs of the models are concatenated to a single tensor,
    hence the user sees a model that behaves the same as the original.
    """
    with tf.device('/cpu:0'):
        x = Input(model.input_shape[1:], name=model.input_names[0])

    towers = []
    for g in range(n_gpus):
        with tf.device('/gpu:' + str(g)):
            slice_g = Lambda(slice_batch, lambda shape: shape,
                             arguments={'n_gpus': n_gpus, 'part': g})(x)
            towers.append(model(slice_g))

    with tf.device('/cpu:0'):
        merged = merge(towers, mode='concat', concat_axis=0)

    return Model(input=[x], output=merged)
```

To use, just take any model and set `model = to_multi_gpu(model)`. `model.fit()` and `model.predict()` should work without any change. Example:

```python
import numpy as np
from keras.layers.convolutional import Convolution2D
from keras.layers.core import Activation


def get_model():
    x = Input((96, 96, 1), name="input1")
    output = Convolution2D(64, 5, 5, border_mode='same', name="conv1")(x)
    output = Activation('relu', name="relu1")(output)
    # [More layers...]
    model = Model(input=x, output=output)
    return model


model = to_multi_gpu(get_model())

x = np.random.rand(1000, 96, 96, 1)
y = model.predict(x, verbose=True)
```
@jonilaserson Thank you for the update! Would you please comment on the code snippets?
@jonilaserson Two GPUs: …
One GPU: …

I think you should …
FYI - we just added an example of data-parallel distributed training with Keras using Horovod - https://github.com/uber/horovod/blob/master/examples/keras_mnist.py. It works both for multiple GPUs within the server, and across servers. Hope it helps.
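A hedged sketch of the Horovod + Keras pattern (the toy model and random data are placeholders, not the linked example; run under MPI, e.g. `mpirun -np 4 python train.py`):

```python
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
import horovod.keras as hvd

hvd.init()

# Pin each MPI process to a single local GPU.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())
K.set_session(tf.Session(config=config))

model = keras.models.Sequential([
    keras.layers.Dense(10, activation='softmax', input_shape=(784,))])

# Wrap the optimizer so gradients are averaged across all workers.
opt = hvd.DistributedOptimizer(keras.optimizers.Adam(0.001 * hvd.size()))
model.compile(loss='categorical_crossentropy', optimizer=opt)

x = np.random.rand(1000, 784).astype('float32')
y = keras.utils.to_categorical(np.random.randint(10, size=1000), 10)

# Broadcast initial weights from rank 0 so all workers start identical.
callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
model.fit(x, y, batch_size=128, epochs=2, callbacks=callbacks)
```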
I used the code from @jonilaserson, and it works. However, it seems that multi-GPU training converged more slowly than single-GPU. Has anyone else observed the same?

@michelleowen you typically want to adjust the learning rate to the total number of GPUs across all the servers; here's an example of very simple scaling (see the sketch below). Facebook published a paper with a more sophisticated strategy that works for a large number of GPUs.
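Illustrative only, assuming Horovod: the simple rule is to multiply the base learning rate by the number of workers.

```python
import keras
import horovod.keras as hvd

base_lr = 0.01
# The effective batch size grows with the number of workers, so scale
# the learning rate proportionally.
opt = keras.optimizers.SGD(lr=base_lr * hvd.size(), momentum=0.9)
```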
@alsrgv, thank you. This is very helpful. I will do some experiments to see how it works in my case.
I guess the function previously mentioned by @avolkov1 is finally coming into Keras: |
@fernandoandreotti Yes and no. It's a cleaned-up variant of the function from kuza55. It has nice documentation and grabs the list of devices via device_lib instead of CUDA_VISIBLE_DEVICES. On the other hand, it's missing some stuff from avolkov1: slicing on the CPU, and save/load of the parameters of the original serial model. Since there's no wrapper class, the latter is not strictly necessary, but it should at least be documented.
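For illustration, this is how a device list can be grabbed via device_lib rather than by parsing CUDA_VISIBLE_DEVICES:

```python
from tensorflow.python.client import device_lib

# Lists the devices TensorFlow can see, e.g. ['/cpu:0', '/gpu:0', '/gpu:1'].
print([d.name for d in device_lib.list_local_devices()])
```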
Keras v2.0.9 now includes it (release notes). Despite the improvements that could still be made, I guess this issue should be closed.
Any example of how to use this in the docs?
Yes: https://keras.io/utils/#multi_gpu_model. You can also check out Horovod, which seems nice.
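A usage sketch in the spirit of the keras.utils.multi_gpu_model docs (the toy model and random data are placeholders; assumes a machine with 2 GPUs):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

model = Sequential([Dense(10, activation='softmax', input_shape=(100,))])

# Replicates the model on 2 GPUs; each incoming batch is split between
# them and the results are concatenated back on the CPU.
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

x = np.random.random((1000, 100))
y = np.random.random((1000, 10))
parallel_model.fit(x, y, epochs=2, batch_size=256)
```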
Is there any intention of making it work with CNTK too?
@avolkov1 @jonilaserson Is there an issue with saving models via ModelCheckpoint when using a multi_gpu model? I used a few other callbacks and they worked fine, but ModelCheckpoint is the one that fails to save the model, throwing an error after an epoch. Code (abridged):

```python
class MyCallBack(keras.callbacks.Callback):
    ...

parallel_model = multi_gpu_model(model, gpus=2)

# Adding data augmentation provided by the Keras module
datagen.fit(x_train)
```
I had this same problem. ModelCheckpoint will not work with a multi-GPU model. You can change the parameter save_weights_only to True and this will work fine; HOWEVER, if you then want to do inference on a SINGLE GPU, the model will not load the weights properly, even if you load the checkpointed weights by name. Kind of an urgent question: is there a way to train on multiple GPUs but save the weights in such a way that I can do inference on only a single GPU? I am not sure how to get this to work properly.
mmmm since it's urgent, maybe a dirty patch will do: couldn't you save the weights as matrices and then load them directly into the weights of the layers of a new (single-GPU) model, as sketched below? edit: does saving/loading the weights of the example from the docs not work? https://keras.io/utils/#multi_gpu_model
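A minimal sketch of that idea (the two-model setup is illustrative; it only assumes both models share the same architecture):

```python
from keras.models import Sequential
from keras.layers import Dense


def build():
    # Identical architecture for both models.
    return Sequential([Dense(4, input_shape=(8,))])


trained_model = build()  # stands in for the model trained on multiple GPUs
single_model = build()   # fresh single-GPU model

# get_weights()/set_weights() move plain numpy matrices between models.
single_model.set_weights(trained_model.get_weights())
single_model.save('single_gpu_model.h5')
```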
Thanks for the quick response. I believe I have tried that. My weights were saved via the ModelCheckpoint callback for a multi-GPU model. When I re-instantiate the model, I cannot load the weights into my single-GPU model, because I get an error stating that I am trying to load weights into a model with one layer when it expects four layers (4 is the number of GPUs I was using).

That is correct. It does not work. Although I haven't tried the cpu device scope. Will try and let you know. I've only used the ModelCheckpoint callback with save_weights_only=True and model.load_weights.
Did you double-check that you are saving with the template model, not the multi_gpu one? From the docs, on model saving: to save the multi-GPU model, use .save(fname) or .save_weights(fname) with the template model (the argument you passed to multi_gpu_model), rather than the model returned by multi_gpu_model.

edit: sorry, I just re-read that you are saving through the callback... how are you doing that? Is each GPU saving a different file (or overwriting it)?
@pGit1 Take a look at my example: … Run it like this to save weights: …

You can then run it again and it will load the checkpoint file and continue training. This will work with a single GPU also.
I have a slightly different implementation for multi-GPU, but you can use the multi-GPU implementation from Keras. Just wrap it in a class so that the non-multi-GPU model is used for saving and loading weights. The essence of the wrapper class for saving/loading weights is sketched below.

This works with the ModelCheckpoint callback.
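A minimal sketch of such a wrapper, assuming keras.utils.multi_gpu_model (the class name ModelMGPU is illustrative):

```python
from keras.models import Model
from keras.utils import multi_gpu_model


class ModelMGPU(Model):
    def __init__(self, ser_model, gpus):
        pmodel = multi_gpu_model(ser_model, gpus)
        # Behave like the parallel model for compile/fit/predict...
        self.__dict__.update(pmodel.__dict__)
        self._smodel = ser_model

    def __getattribute__(self, attrname):
        # ...but delegate every save/load attribute (save, save_weights,
        # load_weights, ...) to the original serial model.
        if 'load' in attrname or 'save' in attrname:
            return getattr(self._smodel, attrname)
        return super(ModelMGPU, self).__getattribute__(attrname)
```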
I've confirmed that the example from the docs will not work with the ModelCheckpoint callback either.
Your example seems like it may work, but I'm having trouble thinking of a simple example of how to use it. Your guidance would be much appreciated. Is something like this feasible? Would something like this work? Actually, I need this. THANK YOU!

EDIT: Looking at your CIFAR-10 example, it looks like something like this would work. I'm in a crunch, so I don't want to embark on the above journey if I am missing something glaring.
In general, I think this line from the docs in your code explains it all: …

In general, one should easily be able to train in parallel on multiple GPUs, use callbacks to save weights from the parallel run, and load those saved weights back into the serial model that was parallelized in the first place (without having to re-instantiate the serial model as a parallel model). I think your code allows one to train on 8 GPUs but then load weights and infer on one. Perhaps it should be an option in the >=2.0.9 implementation? Training with keras.utils.multi_gpu_model() works great and definitely provides a speed-up; it just doesn't play nice with ModelCheckpoint or weight saving/loading.
@pGit1 Yea, what you have there should work. Or you can use the wrapper class described above. Then you can use your example above with this new class.
THANK YOU!! Your code works. To test, I bypassed multi_gpu_model altogether. After training on a simple dummy data set, I call a function that returns two models (serial and parallel) and choose only the serial model. Keep in mind that during training I call the fit function on the parallel model, not the serial model. I also feed my best-weight callback to the parallel model during training. Once this is done, I load the learned weights into the serial model and get the expected results without any errors. I am not entirely sure why this works, but it does. I confirmed multi-GPU training and single-GPU inference. Now I am going to clean up my code to do something like you outline above. Thanks again for your help!! EDIT: The cleaned-up version, where you wrap the multi_gpu_model class, works flawlessly. This is definitely my preferred method. Thanks again for all of your help. Your code is an extremely valuable contribution.
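Putting the workflow above together, a sketch with illustrative names, using the ModelMGPU-style wrapper from earlier (assumes a machine with 4 GPUs):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint

serial_model = Sequential([Dense(1, input_shape=(16,))])
parallel_model = ModelMGPU(serial_model, gpus=4)  # wrapper sketched above
parallel_model.compile(loss='mse', optimizer='adam')

x = np.random.rand(2048, 16)
y = np.random.rand(2048, 1)

# The wrapper delegates save_weights to the serial model, so the
# checkpoint is written in single-GPU form.
ckpt = ModelCheckpoint('best_weights.h5', save_weights_only=True,
                       save_best_only=True)
parallel_model.fit(x, y, epochs=5, batch_size=512,
                   validation_split=0.2, callbacks=[ckpt])

# The layers are shared, so the checkpoint loads straight back into the
# serial model for single-GPU inference.
serial_model.load_weights('best_weights.h5')
preds = serial_model.predict(x)
```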
EDIT on Jan 11, 2019: I've tried to customize …
@fchollet @pGit1 @nicolefinnie @avolkov1, I solved the problem in the following way: I changed some lines in the Keras internals (particularly in topology.py or network.py, and callbacks.py). Here is what I modified.

callbacks.py, in class ModelCheckpoint(Callback): …

topology.py / network.py: …
Theano has supported multiple GPUs since v0.8.0 (cf. Using multiple GPUs — Theano 0.8.0 documentation). Does Keras also support using multiple GPUs? For example, can I run the below task?