What is the CUDA version supported? #19

Closed
jxyin opened this issue Sep 5, 2018 · 6 comments

Comments

@jxyin

jxyin commented Sep 5, 2018

I tried to run dopamine on my GPU machine with Ubuntu 16.04.4 and CUDA 9.0, following the testing and training instructions in the provided README inside a virtualenv. Testing and training ran fine, but entirely on the CPU: CPU utilization was high and GPU utilization stayed at zero throughout, even after one iteration had finished. I'm running with "dopamine/agents/dqn/configs/dqn.gin", and the configuration uses GPU:0 as tf_device by default. Does anybody have any pointers for this kind of situation?
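
One thing worth ruling out first is the config itself. As a minimal sketch, the snippet below scans gin text for any tf_device binding, so you can confirm which device the agent is being asked to use. The function name, the regular expression, and the sample text are illustrative assumptions and not part of dopamine; in practice you would pass in the contents of your own dqn.gin.

```python
import re

# Matches lines such as:  SomeAgent.tf_device = '/gpu:0'
# The pattern is an assumption about how the binding is written in a gin file.
_TF_DEVICE_BINDING = re.compile(
    r"""^\s*([\w.]+)\.tf_device\s*=\s*['"]([^'"]*)['"]""",
    re.MULTILINE,
)


def find_tf_device_bindings(gin_text):
    """Return (configurable, device) pairs for every tf_device binding found."""
    return _TF_DEVICE_BINDING.findall(gin_text)


# Illustrative excerpt written for this example, not copied from the repository.
SAMPLE_GIN = """
# Hypothetical config fragment.
DQNAgent.gamma = 0.99
DQNAgent.tf_device = '/gpu:0'
"""

for configurable, device in find_tf_device_bindings(SAMPLE_GIN):
    print(f"{configurable} is configured to use {device}")
```

If the binding already names a GPU device, as the question says it does, the config is not the cause and the TensorFlow installation is the next place to look.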

@psc-g
Collaborator

psc-g commented Sep 6, 2018

the tests will not run on gpu by default.
did you try running the training binary to see if it uses your gpu?
CUDA support is coming from tensorflow and is not specific to dopamine.

@jxyin
Author

jxyin commented Sep 7, 2018 via email

@psc-g
Collaborator

psc-g commented Sep 7, 2018

when you run it, you should get a printout of the arguments passed to dqn, one should look like tf_device: ..., from here: https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py#L126
what does that line say in your run?

and again, tensorflow is what's handling device usage (see, e.g. https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py#L145)

are you able to run other tensorflow programs successfully using your gpu?
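
A quick way to answer that last question without involving dopamine at all is to ask TensorFlow which devices it can see. The sketch below uses TensorFlow's `device_lib.list_local_devices()`; the helper name `visible_gpu_names` is hypothetical, and returning an empty list when TensorFlow cannot be imported is a choice made for this example.

```python
def visible_gpu_names():
    """Return the names of GPU devices TensorFlow can see.

    Returns an empty list if no GPU is visible or TensorFlow is not installed.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        # TensorFlow is not importable in this environment.
        return []
    return [
        device.name
        for device in device_lib.list_local_devices()
        if device.device_type == "GPU"
    ]


gpus = visible_gpu_names()
if gpus:
    print("TensorFlow sees these GPUs:", gpus)
else:
    print("TensorFlow does not see any GPU in this environment.")
```

If this reports no GPU, the problem sits in the TensorFlow or CUDA setup rather than in dopamine's configuration.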

@inoryy

inoryy commented Sep 7, 2018

Works fine out of the box for me. Here's the output I get when I run
python -um tests.agents.rainbow.rainbow_agent_test

....
2018-09-07 12:58:44.844573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-07 12:58:44.844612: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-07 12:58:44.844619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-09-07 12:58:44.844626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-09-07 12:58:44.844802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3353 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
..
----------------------------------------------------------------------
Ran 26 tests in 5.420s

OK

@michac

michac commented Sep 13, 2018

The instructions in the README for creating the virtualenv have the line pip install absl-py atari-py gin-config gym opencv-python tensorflow which I copied and ran exactly, and I wasn't getting GPU support either. I ended up here: https://www.tensorflow.org/install/install_linux which indicated I needed to install tensorflow-gpu to get GPU support. Once I ran pip install tensorflow-gpu things worked as expected. This is probably obvious to someone with more tensorflow experience than me :). I ran the code sample here: https://www.tensorflow.org/guide/using_gpu to verify GPU support in tensorflow.
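
To see which of the two pip packages mentioned above is present in a virtualenv, a small check like the following can help. It is a sketch that assumes Python 3.8 or newer (for `importlib.metadata`); the helper name is hypothetical, and the package names are the ones used in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_distributions(names):
    """Map each installed distribution in `names` to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


# Package names as discussed in this thread.
TF_PACKAGES = ["tensorflow", "tensorflow-gpu"]

installed = installed_distributions(TF_PACKAGES)
if installed:
    for name, ver in installed.items():
        print(f"{name} {ver} is installed")
else:
    print("Neither tensorflow nor tensorflow-gpu is installed here.")
```

In the situation described in this comment, only `tensorflow` would be listed before running pip install tensorflow-gpu.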

@psc-g
Collaborator

psc-g commented Sep 13, 2018

thanks for the feedback! i've just updated the instructions in our README to reflect this.

@psc-g psc-g closed this as completed Sep 13, 2018
arthurarg added a commit to arthurarg/dopamine that referenced this issue May 5, 2019
Fix thread pool for to run iterations in async runner.