What is the CUDA version supported? #19

jxyin · 2018-09-05T15:59:57Z

I tried to run Dopamine on my GPU machine with Ubuntu 16.04.4 and CUDA 9.0. I followed the testing and training instructions in the provided README inside a virtualenv. Both testing and training ran fine, but on the CPU only (high CPU utilization and zero GPU utilization throughout, even after one iteration had finished). I'm using "dopamine/agents/dqn/configs/dqn.gin", and the configuration uses GPU:0 as tf_device by default. Does anybody have any pointers for this kind of situation?

psc-g · 2018-09-06T18:45:25Z

The tests will not run on GPU by default.
Did you try running the training binary to see if it uses your GPU?
CUDA support comes from TensorFlow and is not specific to Dopamine.

jxyin · 2018-09-07T08:38:42Z

Hi Castro, Thanks for the reply. I've tried both the tests and the training binary, and neither uses the GPU. Attached is a screenshot showing that training is running but no process is running on the GPU. Best Regards, Terry YIN

psc-g · 2018-09-07T09:28:11Z

When you run it, you should get a printout of the arguments passed to DQN; one of them should look like tf_device: ..., from here: https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py#L126
What does that line say in your run?

And again, TensorFlow is what handles device usage (see, e.g., https://github.com/google/dopamine/blob/master/dopamine/agents/dqn/dqn_agent.py#L145).

Are you able to run other TensorFlow programs successfully using your GPU?
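
One way to answer that last question is to ask TensorFlow which devices it can see. The following is only a minimal sketch: visible_devices and has_gpu are hypothetical helper names, and the sketch assumes that device_lib.list_local_devices() is available in the installed TensorFlow (it lives under a tensorflow.python module path that may change between releases).

```python
def visible_devices():
    """Return the device names TensorFlow reports, or [] if TensorFlow is not installed."""
    try:
        # Assumption: this module path exists in the installed TensorFlow version.
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [device.name for device in device_lib.list_local_devices()]


def has_gpu(device_names):
    """Return True if any reported device name mentions a GPU."""
    return any("GPU" in name.upper() for name in device_names)
```

Calling has_gpu(visible_devices()) inside the same virtualenv used for training should return True when TensorFlow can use the GPU. If it returns False, the problem is in the TensorFlow/CUDA setup rather than in Dopamine.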

inoryy · 2018-09-07T10:01:54Z

Works fine out of the box for me. Here's the output I get when I run:
python -um tests.agents.rainbow.rainbow_agent_test

....
2018-09-07 12:58:44.844573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-09-07 12:58:44.844612: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-09-07 12:58:44.844619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-09-07 12:58:44.844626: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-09-07 12:58:44.844802: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3353 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:09:00.0, compute capability: 6.1)
..
----------------------------------------------------------------------
Ran 26 tests in 5.420s

OK
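
The "Created TensorFlow device" line in the log above is the one that confirms a GPU was picked up. As a rough sketch, a saved log can be scanned for such lines. gpus_in_log is a hypothetical helper, and the pattern is based only on the single line shown in this thread, so other TensorFlow versions may word it differently.

```python
import re

# Pattern derived from the "Created TensorFlow device" line quoted in this thread.
# Assumption: other TensorFlow versions may format this line differently.
_CREATED_DEVICE = re.compile(
    r"Created TensorFlow device \((?P<device>\S+) with (?P<mem_mb>\d+) MB memory\)"
    r" -> physical GPU \(device: \d+, name: (?P<name>[^,]+),"
)


def gpus_in_log(log_text):
    """Return (device path, GPU name, memory in MB) for each GPU device created in the log."""
    return [
        (match.group("device"), match.group("name"), int(match.group("mem_mb")))
        for match in _CREATED_DEVICE.finditer(log_text)
    ]
```

An empty result for a training log would match the symptom reported at the top of this thread: TensorFlow never created a GPU device.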

michac · 2018-09-13T06:39:39Z

The README instructions for creating the virtualenv include the line pip install absl-py atari-py gin-config gym opencv-python tensorflow, which I copied and ran exactly, and I wasn't getting GPU support either. I ended up here: https://www.tensorflow.org/install/install_linux, which indicated that I needed to install tensorflow-gpu to get GPU support. Once I ran pip install tensorflow-gpu, things worked as expected. This is probably obvious to someone with more TensorFlow experience than me :). To verify GPU support in TensorFlow, I ran the code sample here: https://www.tensorflow.org/guide/using_gpu
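
To see which of the two pip packages mentioned above is present in a virtualenv, something like the following sketch may help. installed_versions is a hypothetical helper, and the sketch assumes Python 3.8 or newer for importlib.metadata, which is newer than the setups in this thread.

```python
from importlib import metadata


def installed_versions(names=("tensorflow", "tensorflow-gpu")):
    """Map each pip package name to its installed version, or None if it is absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

With the setup described in this comment, a result where "tensorflow-gpu" maps to None would point to the CPU-only package being the one installed.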

psc-g · 2018-09-13T09:56:54Z

Thanks for the feedback! I've just updated the instructions in our README to reflect this.

psc-g closed this as completed Sep 13, 2018

arthurarg added a commit to arthurarg/dopamine that referenced this issue May 5, 2019

Merge pull request google#19 from wes-turner/fix_thread_pool

1ce0d6c

Fix thread pool for to run iterations in async runner.
