Low GPU utilization #14

Closed
peto184 opened this issue Mar 29, 2019 · 2 comments

peto184 commented Mar 29, 2019

Hi,

We are trying to reproduce the results on the chairs dataset. Everything works, except that GPU utilization is very low (<5%) while the CPU sits at 100% nonstop. Is this expected? At this rate, we are looking at roughly a month of training.

I've included a short log from the beginning of the training run below.

tf-version: 1.12
os: Windows 10,
cpu: i7-7700 CPU @ 3.60 GHz
gpu: GeForce GTX 1080

(3dspacegen) D:\3dspacegen\vae_gan\3D-Reconstruction-Image>python 20-VAE-3D-IWGAN.py -n chair -d data\\voxels\\chair -i data\\overlays\\chair
Using TensorFlow backend.
[2019-03-29 14:12:51,790] [tl_logging] [WARNING]: WARNING: Function: `tensorlayer.activation.leaky_relu` (in file: D:\miniconda\envs\3dspacegen\lib\site-packages\tensorlayer\activation.py) is deprecated and will be removed after 2018-09-30.
Instructions for updating: This API is deprecated. Please use as `tf.nn.leaky_relu`

[2019-03-29 14:12:52,118] [tl_logging] [WARNING]: WARNING: Function: `tensorlayer.layers.utils.set_name_reuse` (in file: D:\miniconda\envs\3dspacegen\lib\site-packages\tensorlayer\layers\utils.py) is deprecated and will be removed after 2018-06-30.
Instructions for updating: TensorLayer relies on TensorFlow to check name reusing

[2019-03-29 14:12:52,118] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
[2019-03-29 14:12:52,352] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
[2019-03-29 14:12:52,539] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
[2019-03-29 14:12:52,711] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
[2019-03-29 14:12:52,852] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
[2019-03-29 14:12:53,820] [tl_logging] [WARNING]: WARNING: this method is DEPRECATED and has no effect, please remove it from your code.
2019-03-29 14:12:55.576779: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2019-03-29 14:12:55.717923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-29 14:12:55.721467: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]
2019-03-29 14:12:55.821546: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.60GiB
2019-03-29 14:12:55.827285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-03-29 14:12:56.189054: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-29 14:12:56.193933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0
2019-03-29 14:12:56.196421: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N
2019-03-29 14:12:56.199839: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8175 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Number of train images: 14482
Epoch: [ 0/1500] [   0/  56] time: 70.5338, d_loss: 9.9828, g_loss: 30.3922, v_loss: 0.9569, r_loss: 0.3035
Epoch: [ 0/1500] [   1/  56] time: 24.3561, d_loss: 9.9533, g_loss: 30.3922, v_loss: 0.8012, r_loss: 0.2749
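
For reference, the "one month" figure can be roughly checked from the log above. This is only a back-of-the-envelope sketch: it assumes the `time` field is seconds per batch and that the second batch (24.3561 s) is representative of the steady-state rate. The function name `estimate_training_days` is made up for illustration.

```python
def estimate_training_days(epochs, batches_per_epoch, seconds_per_batch):
    """Return a rough wall-clock estimate in days for a full training run."""
    total_seconds = epochs * batches_per_epoch * seconds_per_batch
    return total_seconds / 86400.0  # 86400 seconds in a day


# Values read from the log: 1500 epochs, 56 batches per epoch,
# and about 24.36 s for the second batch (assumed to be typical).
days = estimate_training_days(1500, 56, 24.3561)
print("Estimated training time: {:.1f} days".format(days))
```

Under those assumptions the estimate comes out a little over three weeks, which is consistent with the rough figure quoted above.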

peto184 commented Mar 29, 2019

Never mind, I've resolved the issue: the GPU device count in the session config was set to 0, and it needs to be 1.

config = tf.ConfigProto(device_count = {'GPU': 1})
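
For anyone hitting the same symptom, here is a minimal sketch of how the config above might be passed to a session and checked. It assumes the TensorFlow 1.x API (the issue reports 1.12) and is not taken from the repository's training script; `log_device_placement` and the device listing are only there for diagnosis.

```python
import tensorflow as tf  # assumes TensorFlow 1.x (the issue reports 1.12)

# device_count caps how many devices of each type the session may use.
# With {'GPU': 0} the session sees no GPU, so every op runs on the CPU.
config = tf.ConfigProto(
    device_count={'GPU': 1},
    log_device_placement=True,  # log which device each op is placed on
)

with tf.Session(config=config) as sess:
    # List the devices this session can actually see; a GPU entry
    # should appear here if the config took effect.
    for device in sess.list_devices():
        print(device.name, device.device_type)
```

If no GPU device shows up in the listing, the low utilization is more likely a driver or install problem than a config one.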

EdwardSmith1884 (Owner) commented

cool! glad everything is working!
