This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

mxnet not using the GPU with mxnet-cu90 on Windows #106

Open
roboserg opened this issue May 26, 2018 · 11 comments
roboserg commented May 26, 2018

I was using the latest Keras with the TensorFlow GPU backend. I installed MXNet as follows:

pip install keras-mxnet
pip install mxnet-cu90

I changed "backend": "mxnet" in the Keras config file. In Jupyter I see "Using MXNet backend", but during training only the CPU is utilized, not the GPU.
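For reference, a minimal keras.json with the backend switched might look like the following sketch; the other fields shown are just the stock Keras defaults:

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "mxnet",
    "image_data_format": "channels_last"
}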

Any advice?


roywei commented May 26, 2018

Hi, which CUDA version did you install? mxnet-cu90 works fine with CUDA 9.0; you can try mxnet-cu80 or mxnet-cu91. Also, if you have multiple GPUs, could you try model = multi_gpu_model(model, gpus=num_gpus)?
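For reference, a minimal sketch of that call (assuming a 4-GPU machine; the model here is only a placeholder):

import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

# Placeholder model; any Keras model works here.
model = Sequential()
model.add(Dense(10, activation='softmax', input_shape=(784,)))

# Replicate the model across 4 GPUs; each batch is split between them.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer=keras.optimizers.Adadelta(),
                       metrics=['accuracy'])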


roboserg commented May 26, 2018

I have a single GTX 1070 GPU and CUDA 9.0 installed (V9.0.176); it was working with the TensorFlow backend built for CUDA 9.0.

Should I still try cu80 or cu91?


roywei commented May 26, 2018

Let me try to reproduce and get back to you. Thanks


roywei commented May 27, 2018

Hi @roboserg, I was not able to reproduce the error. Could you provide a minimal reproducible example? How are you building the model?

I am testing on an AWS p3.8xlarge instance with 4 GPUs, and I tested a few scripts in the examples folder: cifar10_cnn, mnist_cnn, and lstm_text_generation.
Each utilizes 1 GPU by default without any multi_gpu_model, which is the same behavior as the TensorFlow backend.

Pip package versions:

keras-mxnet                        2.1.6.1    
mxnet-cu90                         1.2.0  

Running python mnist_cnn.py gives the following GPU utilization (same result when running in a Jupyter notebook):

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.111                Driver Version: 384.111                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:1B.0 Off |                  Off |
| N/A   50C    P0   138W / 300W |   1056MiB / 16152MiB |     58%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:00:1C.0 Off |                  Off |
| N/A   44C    P0    35W / 300W |     10MiB / 16152MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:00:1D.0 Off |                  Off |
| N/A   43C    P0    38W / 300W |     10MiB / 16152MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:00:1E.0 Off |                  Off |
| N/A   45C    P0    40W / 300W |     10MiB / 16152MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     21083      C   python                                      1046MiB |
+-----------------------------------------------------------------------------+


roboserg commented May 27, 2018

Thanks for checking it out. I tried this example https://github.com/awslabs/keras-apache-mxnet/blob/master/examples/mnist_cnn.py

It still utilizes my GPU at 0-1% and my CPU at 60+%, so I guess it's running on the CPU only.

I am sure it's MXNet, since it says "Using MXNet backend" after importing Keras. For the Keras config, I was editing C:\Users\Roboserg\.keras\keras.json

I have mxnet-cu90 version 1.2 and keras-mxnet version 2.1.6.1. I am testing on my local Windows 10 machine in an Anaconda environment.

I don't know what else to try; any help would be appreciated. Is there a way to force MXNet to use the GPU? Note that I don't have the non-GPU version of mxnet installed, only the GPU one.

P.S. Would you mind telling me how to print the same table as you did?
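As a quick sanity check that the mxnet-cu90 build can see the GPU at all, independent of Keras, one minimal sketch is to allocate an array directly on the GPU; if CUDA is not set up correctly, this raises an MXNetError:

import mxnet as mx

try:
    # Allocation is lazy; asnumpy() forces it to actually run on gpu(0).
    mx.nd.zeros((1,), ctx=mx.gpu(0)).asnumpy()
    print('MXNet can use the GPU')
except mx.base.MXNetError as err:
    print('MXNet cannot use the GPU:', err)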


roboserg commented May 27, 2018

Using incubator-mxnet directly does use my GPU, i.e. running python example/image-classification/train_mnist.py --network lenet --gpus 0

But how do I run MXNet with the GPU in a Jupyter notebook with Keras, for example using the examples/mnist_cnn.py from keras-apache-mxnet linked above?


roywei commented May 27, 2018

On Linux it's nvidia-smi. On Windows, run nvidia-smi.exe from your NVIDIA installation folder. Let me test this on Windows.
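On a typical Windows driver install the executable usually lives at the path below, though the exact location can vary by driver version:

"C:\Program Files\NVIDIA Corporation\NVSMI\nvidia-smi.exe"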

You can specify which device to run on by passing a context param to compile; for example, in mnist_cnn.py change line 59 to:

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'], context=["gpu(0)", "gpu(1)"])

We hide this from the user, as it's MXNet-only and not part of the Keras API.
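Putting this together for the single-GPU case in this thread, a minimal self-contained sketch (the model is a placeholder):

import keras
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, activation='softmax', input_shape=(784,)))

# context is a keras-mxnet-only extension to compile();
# pass one entry per device to train on.
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'],
              context=['gpu(0)'])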

sandeep-krishnamurthy commented

@roboserg - Were you able to train with GPUs?


roboserg commented May 31, 2018

@sandeep-krishnamurthy Yes, see below please.


roboserg commented May 31, 2018

@roywei thank you very much! context=["gpu(0)"] fixed the problem! I still think it's a bug because, as I wrote above, I did all the steps for a proper GPU installation and it was not using the GPU by default.

Thanks again, I was able to train the MNIST example in only 80 seconds on my GTX 1070! That's 60000 * 12 images, incredible!


roywei commented May 31, 2018

@roboserg Tested on Windows; indeed it does not utilize the GPU by default. I think it's a bug; for now, just use the context param. Thanks for the catch!

roywei added the bug label on May 31, 2018
roywei changed the title from "mxnet not using the GPU with mxnet-cu90" to "mxnet not using the GPU with mxnet-cu90 on Windows" on May 31, 2018