Question about Zero Volatile GPU-Util #5

Closed · SannyZhou opened this issue Jan 22, 2019 · 5 comments

Comments
@SannyZhou

Hello,
I am trying to train and evaluate your LISA model on the CoNLL dataset.
When running the model on a GPU, I use the command CUDA_VISIBLE_DEVICES=0 bin/evaluate.sh config/conll05-lisa.conf --save_dir model. However, nothing seems to run on the GPU: nvidia-smi shows Volatile GPU-Util stuck at zero.
How can I make the best use of the GPU with TensorFlow Estimators?
Do you have any idea what might be causing this?

@patverga

The first thing to check is that you've installed TensorFlow with GPU support; the default tensorflow package is CPU-only.
pip3 install --user tensorflow-gpu
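
A quick way to confirm from Python which devices TensorFlow can actually see (standard TF 1.x calls, nothing LISA-specific):

import tensorflow as tf
from tensorflow.python.client import device_lib

# A working GPU install lists a /device:GPU:0 entry alongside the CPU.
print(device_lib.list_local_devices())
print(tf.test.is_gpu_available())  # True if a CUDA-capable GPU is usable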

@SannyZhou
Author

The package is tensorflow-gpu 1.9.0. @patverga

@strubell
Owner

Does tensorflow output a line like:

2019-01-22 12:22:22.434234: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1098] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11428 MB memory) -> physical GPU (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:82:00.0, compute capability: 5.2)

If so, then it's using the GPU. If not, then you likely have some kind of
configuration issue.
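
Another way to confirm placement is TensorFlow's device placement logging; this is a generic TF 1.x sketch, not LISA's code:

import tensorflow as tf

# With log_device_placement=True, TF prints the device chosen for each op.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    a = tf.constant([1.0, 2.0], name="a")
    b = tf.constant([3.0, 4.0], name="b")
    print(sess.run(a + b))  # placement lines appear on stderr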

I would expect GPU usage to fluctuate a lot during evaluation, and in fact
most of the time to be spent on the CPU, since the code calls the official
CoNLL evaluation scripts (perl). Currently I believe evaluation uses the
same batch size as training, but you could increase it, depending on your
GPU's memory, to make better use of the GPU (see the sketch below).
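
For context, with tf.estimator the evaluation batch size is whatever the eval input_fn produces, so it can be raised independently of training. A minimal generic sketch with toy data (not LISA's input pipeline; the shapes and batch sizes are illustrative):

import tensorflow as tf

def model_fn(features, labels, mode):
    logits = tf.layers.dense(features["x"], 2)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    if mode == tf.estimator.ModeKeys.TRAIN:
        train_op = tf.train.AdamOptimizer().minimize(
            loss, global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    return tf.estimator.EstimatorSpec(mode, loss=loss)

def input_fn(batch_size):
    ds = tf.data.Dataset.from_tensor_slices(
        ({"x": tf.random_normal([1024, 8])}, tf.zeros([1024], tf.int64)))
    return ds.batch(batch_size)

estimator = tf.estimator.Estimator(model_fn)
estimator.train(lambda: input_fn(32), steps=10)  # small batches for training
estimator.evaluate(lambda: input_fn(512))        # larger batches fill the GPU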

The code currently doesn't have a "predict" mode, which would simply output
predictions for sentences without evaluating. That may be closer to the
functionality you want, and I'm happy to accept pull requests :)
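
For anyone interested in contributing that, here is a hedged sketch of what a predict-only path could look like using the standard tf.estimator PREDICT mode (illustrative only, not LISA's actual interface):

import tensorflow as tf

def model_fn(features, labels, mode):
    logits = tf.layers.dense(features["x"], 2)
    if mode == tf.estimator.ModeKeys.PREDICT:
        # Short-circuit before any loss/metrics: just emit predictions.
        return tf.estimator.EstimatorSpec(
            mode, predictions={"tags": tf.argmax(logits, axis=-1)})
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    return tf.estimator.EstimatorSpec(mode, loss=loss)

def predict_input_fn():
    ds = tf.data.Dataset.from_tensor_slices({"x": tf.random_normal([16, 8])})
    return ds.batch(4)

estimator = tf.estimator.Estimator(model_fn)
for pred in estimator.predict(predict_input_fn):
    print(pred["tags"])  # one output per example; no eval scripts involved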

@SannyZhou
Author


Thanks for your patient answer. I just found that I had set the debug parameter to 1, which made evaluation on the validation set run very frequently and kept GPU usage low.

@strubell
Owner

strubell commented Feb 2, 2019 via email
