
Utilization 0% on all but one GPU using torch #772

Closed
creepyghost opened this issue May 23, 2016 · 3 comments

@creepyghost

Hi, I was trying to run the dbpedia text-classification example ( https://github.com/NVIDIA/DIGITS/tree/master/examples/text-classification ) on an Ubuntu 14.04 server with 4x Tesla K10s. The job started, but GPU utilization is at 99% on only one GPU; the others are stuck at 0%. I can see luajit processes on the other GPUs in nvidia-smi, though, and about 6-8% memory utilization on those three GPUs, as opposed to the 50%+ on the GPU that appears to be doing the work during training.

Is this expected behaviour?

creepyghost changed the title from "Utilization 0% on all but one GPU using torch via digits" to "Utilization 0% on all but one GPU using torch" on May 23, 2016
@gheinrich (Contributor)

Hi @creepyghost, thanks for the feedback. Yes, this is the expected behaviour for now, as I haven't made the couple of changes required to do multi-GPU training in the text-classification model. If you wish to try it as an exercise, you have to encapsulate the model in a DataParallelTable, as is done there.
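
For illustration only, here is a minimal Torch/Lua sketch of that kind of wrapping. It is not the actual DIGITS patch; `net` and `nGpus` are placeholder names, and the exact DataParallelTable options used in DIGITS may differ:

```lua
require 'cunn'  -- provides nn.DataParallelTable and CUDA-backed modules

-- Wrap a model so each mini-batch is split across several GPUs.
-- Assumption: the first dimension of the input is the batch dimension.
local function makeMultiGpu(net, nGpus)
   if nGpus <= 1 then
      return net:cuda()
   end
   local dpt = nn.DataParallelTable(1)            -- split along dim 1 (batch)
   dpt:add(net:cuda(), torch.range(1, nGpus):totable())  -- one replica per GPU id
   return dpt
end

-- Usage (hypothetical): model = makeMultiGpu(model, cutorch.getDeviceCount())
```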

@gheinrich (Contributor)

@creepyghost, are you willing to test #828, which I think should fix this issue?

@gheinrich (Contributor)

No feedback => closing.

This issue was closed.