Unable to use GPU with Theano #8

Closed
Mikeprod opened this issue Jun 30, 2016 · 5 comments

Comments

@Mikeprod

I built your GPU Docker image following the instructions in the README. The container launches fine with the provided command.
In a new notebook, I simply run import theano and get the following error:

ERROR (theano.sandbox.cuda): ERROR: Not using GPU. Initialisation of device gpu failed:
initCnmem: cnmemInit call failed! Reason=CNMEM_STATUS_OUT_OF_MEMORY. numdev=1

What am I doing wrong?

I then tested the CPU version; Theano and Keras work fine there.

@saiprashanths
Collaborator

saiprashanths commented Jun 30, 2016

You probably have something else running on your GPU, which prevents Theano from allocating enough memory. You shouldn't get this error if you kill all processes using the GPU. Alternatively, you can edit /root/.theanorc to remove the line cnmem=0.95 (or change it to a lower fraction).

The cnmem parameter dictates the fraction of the total GPU memory to be reserved for the CUDA memory allocator. See http://deeplearning.net/software/theano/library/config.html#config.config.lib.cnmem for more details.
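For anyone who prefers to script that edit, here is a minimal sketch using Python's standard configparser. The helper name set_cnmem is hypothetical and not part of Theano; it assumes the file is a plain INI file with cnmem under a [lib] section, as described above, and it only accepts a fraction in (0, 1].

```python
import configparser
import os


def set_cnmem(path, fraction=None):
    """Set cnmem under [lib] in a Theano config file.

    Passing fraction=None removes the option instead (hypothetical helper,
    not a Theano API).
    """
    config = configparser.ConfigParser()
    config.optionxform = str  # keep option names such as floatX as written
    config.read(path)  # a missing file is silently treated as empty

    if fraction is None:
        # Drop the option so Theano falls back to its default behaviour.
        if config.has_section("lib"):
            config.remove_option("lib", "cnmem")
            if not config.options("lib"):
                config.remove_section("lib")
    else:
        if not 0 < fraction <= 1:
            raise ValueError("expected a fraction in (0, 1]")
        if not config.has_section("lib"):
            config.add_section("lib")
        config.set("lib", "cnmem", str(fraction))

    with open(path, "w") as handle:
        config.write(handle)
    return config


if __name__ == "__main__":
    # Inside the container the file is /root/.theanorc (path from this thread).
    set_cnmem(os.path.expanduser("~/.theanorc"), 0.5)
```

Note that configparser does not preserve comments when it rewrites the file, so editing by hand may be preferable if the config is annotated.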

@Mikeprod
Author

OK, I just deleted the [lib] section from the Theano config file and it works now. Thanks for the really quick answer.

@mahdaneh

Hi,
I've built the GPU version of your Docker image and everything runs fine. It's a really good one, thanks! But I've run into another problem with concurrently running Theano processes. When two containers are started one after the other, the one that started first prevents the other from taking GPU memory (Not using GPU. Initialisation of device x failed), even though cnmem is set to a small value (say 0.1) for both processes.
How can I run several Theano processes without getting "Not using GPU. Initialisation of device x failed"? The error occurs when several containers each run a Theano process. Is there a way to manage concurrent Theano processes in different containers?
Thanks for your reply!

@saiprashanths
Collaborator

@mahdaneh I'm not too sure about this, but this is my speculation: nvidia-docker does some GPU isolation (see https://github.com/NVIDIA/nvidia-docker/wiki/GPU-isolation). Hence, it might not be possible for 2 containers to access the same GPU. You can try running multiple Theano processes from inside the same container to see if it works.

@mahdaneh

Thanks for your suggestion. I found out why I get this error: the GPU does not have enough memory to run several Theano processes concurrently. Checking nvidia-smi is a good way to see whether the GPU has enough free memory for another Theano process. Another important point: users should set cnmem to a small value if they want to run several processes on the GPU concurrently. Otherwise, a single Theano or TensorFlow process can take all the GPU memory by default and prevent other processes from starting.
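The nvidia-smi check can be roughly automated. The sketch below assumes nvidia-smi is on the PATH and uses its CSV query mode; the function names are hypothetical. Treat the result as an estimate only, since each process also needs some GPU memory beyond the cnmem pool (for example for its CUDA context).

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs, one line per GPU, into a list of tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def free_fraction(used_mib, total_mib):
    """Fraction of the GPU's total memory that is currently unused."""
    return (total_mib - used_mib) / total_mib


def can_reserve(cnmem, used_mib, total_mib):
    """True if the requested cnmem fraction fits in the currently free memory."""
    return cnmem <= free_fraction(used_mib, total_mib)


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory (MiB) of every visible GPU."""
    output = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        universal_newlines=True,
    )
    return parse_gpu_memory(output)


if __name__ == "__main__":
    requested = 0.1  # the cnmem fraction you plan to use
    for index, (used, total) in enumerate(query_gpu_memory()):
        verdict = "fits" if can_reserve(requested, used, total) else "does not fit"
        print("GPU %d: %d/%d MiB used, cnmem=%.2f %s" % (index, used, total, requested, verdict))
```

Running this before starting another Theano process gives a quick indication of whether the planned cnmem fraction is likely to fit.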

