Hi, I was trying to run the dbpedia text-classification example ( https://github.com/NVIDIA/DIGITS/tree/master/examples/text-classification ) on an Ubuntu 14.04 server with 4x Tesla K10s. The job started, but utilization is at 99% on only one GPU; the others are stuck at 0%. I can see luajit processes in nvidia-smi on the other GPUs, though, and they show 6-8% memory utilization, as opposed to the 50%+ on the GPU that actually appears to be in use during the training phase.
Is this expected behaviour?
creepyghost changed the title from "Utilization 0% on all but one GPU using torch via digits" to "Utilization 0% on all but one GPU using torch" on May 23, 2016
Hi @creepyghost, thanks for the feedback. Yes, this is the expected behaviour for now, as I haven't made the couple of changes required for multi-GPU training in the text-classification model. If you wish to try it as an exercise, you would need to encapsulate the model in a DataParallelTable, along the lines of the sketch below.
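A minimal sketch of the kind of change involved, not the exact DIGITS patch: it wraps an existing Torch network (assumed here to be held in a variable named `net`, which is hypothetical) in `nn.DataParallelTable`, so each minibatch is split along the batch dimension and a replica of the network runs on every visible GPU.

```lua
require 'cutorch'
require 'cunn'  -- provides nn.DataParallelTable

local nGPU = cutorch.getDeviceCount()
if nGPU > 1 then
   local gpus = torch.range(1, nGPU):totable()  -- e.g. {1, 2, 3, 4}
   -- Split inputs along dimension 1 (the batch dimension) and
   -- replicate the network on each of the listed GPUs.
   local dpt = nn.DataParallelTable(1)
   dpt:add(net, gpus)
   net = dpt:cuda()
end
```

Outputs and gradients are gathered back on the first GPU, so you would also want the batch size to be a multiple of the GPU count to keep the per-GPU shards balanced.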