TensorFlow version: 0.8.0
Steps to reproduce
(tensorflow)username@pcname:~/Research/ai/tf_examples$ python digits.py
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 980 Ti
major: 5 minor: 2 memoryClockRate (GHz) 1.291
pciBusID 0000:01:00.0
Total memory: 6.00GiB
Free memory: 5.53GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980 Ti, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:932] failed to allocate 5.53G (5935898624 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Step #100, epoch #8, avg. train loss: 2.79814
Step #200, epoch #16, avg. train loss: 1.27029
Step #300, epoch #25, avg. train loss: 0.98960
Step #400, epoch #33, avg. train loss: 0.84844
Step #500, epoch #41, avg. train loss: 0.75324
Accuracy: 0.744444
Running digits.py throws the "failed to allocate 5.53G" error (the GPU has 6.00 GiB total, of which 5.53 GiB is free).
@zheng-xq Not fatal, but I don't get the CUDA_ERROR_OUT_OF_MEMORY error in other tutorials. Need to trace it in cuda_driver.cc to see how GPU memory requests are handled.
@terrytangyuan Yep. Thanks for pointing out that example. I thought maybe there was some autoscaling magic going on.
Environment info
Operating System: Ubuntu 16.04
Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):
Installed using:
Possible solution
I can restrict the amount of memory TensorFlow allocates, but I am wondering whether there is another way to handle this error.
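For reference, a minimal sketch of restricting the allocation, assuming the TensorFlow 0.8-era session-config API (tf.GPUOptions / tf.ConfigProto); the 0.5 fraction is illustrative, not a recommended value:

```python
import tensorflow as tf

# Cap the allocator at a fraction of total GPU memory instead of letting it
# try to grab nearly all free memory up front.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.5)

# Alternative: start with a small allocation and grow it on demand.
# gpu_options = tf.GPUOptions(allow_growth=True)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
```

Either option changes how the BFC allocator reserves memory at session creation; neither changes how much memory the model itself ultimately needs.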