cuDNN launch failure : input shape ([306,1,16,29]) #11

qianyunw · 2019-07-23T06:57:31Z

Hi,

Thanks for sharing! Sorry to bother you again T_T.

I am trying to run your code on my machine (Python 2.7 and TensorFlow 1.12.0).
When I run "python run_voca.py", it fails with the output below. Is there something wrong with my settings?

Thank you so much!!

python run_voca.py --tf_model_fname './model/gstep_52280.model' --ds_fname './ds_graph/output_graph.pb' --audio_fname './audio/test_sentence.wav' --template_fname './template/FLAME_sample.ply' --condition_idx 3 --out_path './animation_output'
2019-07-23 06:49:29.981799: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-07-23 06:49:34.971252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:18:00.0
totalMemory: 10.76GiB freeMemory: 1.87GiB
2019-07-23 06:49:35.165961: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 1 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:3b:00.0
totalMemory: 10.76GiB freeMemory: 1.77GiB
2019-07-23 06:49:35.287032: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 2 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:86:00.0
totalMemory: 10.76GiB freeMemory: 10.60GiB
2019-07-23 06:49:35.287328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2
2019-07-23 06:50:29.496448: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-23 06:50:29.496516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2
2019-07-23 06:50:29.496526: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N N N
2019-07-23 06:50:29.496549: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: N N N
2019-07-23 06:50:29.496557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N
2019-07-23 06:50:29.496791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1607 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5)
2019-07-23 06:50:31.759153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 1503 MB memory) -> physical GPU (device: 1, name: GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5)
2019-07-23 06:50:31.759597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10232 MB memory) -> physical GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5)
process subj - seq
2019-07-23 06:50:41.148658: W tensorflow/core/framework/allocator.cc:122] Allocation of 201326592 exceeds 10% of system memory.
2019-07-23 06:50:41.449976: W tensorflow/core/framework/allocator.cc:122] Allocation of 201326592 exceeds 10% of system memory.
2019-07-23 06:50:42.074583: W tensorflow/core/framework/allocator.cc:122] Allocation of 201326592 exceeds 10% of system memory.
2019-07-23 06:50:42.386122: W tensorflow/core/framework/allocator.cc:122] Allocation of 201326592 exceeds 10% of system memory.
2019-07-23 06:50:42.732171: W tensorflow/core/framework/allocator.cc:122] Allocation of 201326592 exceeds 10% of system memory.
2019-07-23 06:51:52.237460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2
2019-07-23 06:51:52.237883: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-23 06:51:52.237898: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2
2019-07-23 06:51:52.237909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N N N
2019-07-23 06:51:52.237916: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: N N N
2019-07-23 06:51:52.237924: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N
2019-07-23 06:51:52.238128: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1607 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:18:00.0, compute capability: 7.5)
2019-07-23 06:51:52.238453: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 1503 MB memory) -> physical GPU (device: 1, name: GeForce RTX 2080 Ti, pci bus id: 0000:3b:00.0, compute capability: 7.5)
2019-07-23 06:51:52.238653: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10232 MB memory) -> physical GPU (device: 2, name: GeForce RTX 2080 Ti, pci bus id: 0000:86:00.0, compute capability: 7.5)
2019-07-23 06:52:12.202266: E tensorflow/stream_executor/cuda/cuda_dnn.cc:373] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-07-23 06:52:12.202348: W ./tensorflow/stream_executor/stream.h:2093] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
File "run_voca.py", line 44, in <module>
inference(tf_model_fname, ds_fname, audio_fname, template_fname, condition_idx, out_path)
File "/home/wangqianyun/voca/utils/inference.py", line 83, in inference
predicted_vertices = np.squeeze(session.run(output_decoder, feed_dict))
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: cuDNN launch failure : input shape ([306,1,16,29])
[[node VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1 (defined at /home/wangqianyun/voca/utils/inference.py:65) = FusedBatchNorm[T=DT_FLOAT, data_format="NCHW", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1-0-TransposeNHWCToNCHW-LayoutOptimizer, VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1/Switch_1, VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1/Switch_2, VOCA/SpeechEncoder/batch_norm_1/cond_1/AssignMovingAvg/sub/Switch, VOCA/SpeechEncoder/batch_norm_1/cond_1/AssignMovingAvg_1/sub/Switch)]]

Caused by op u'VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1', defined at:
File "run_voca.py", line 44, in <module>
inference(tf_model_fname, ds_fname, audio_fname, template_fname, condition_idx, out_path)
File "/home/wangqianyun/voca/utils/inference.py", line 65, in inference
saver = tf.train.import_meta_graph(tf_model_fname + '.meta')
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1674, in import_meta_graph
meta_graph_or_file, clear_devices, import_scope, **kwargs)[0]
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1696, in _import_meta_graph_with_return_elements
**kwargs))
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
return_elements=return_elements)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
_ProcessNewOps(graph)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 234, in _ProcessNewOps
for new_op in graph._add_new_tf_operations(compute_devices=False): # pylint: disable=protected-access
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3440, in _add_new_tf_operations
for c_op in c_api_util.new_tf_operations(self)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3299, in _create_op_from_tf_operation
ret = Operation(c_op, self)
File "/home/wangqianyun/voca/voca/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

InternalError (see above for traceback): cuDNN launch failure : input shape ([306,1,16,29])
[[node VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1 (defined at /home/wangqianyun/voca/utils/inference.py:65) = FusedBatchNorm[T=DT_FLOAT, data_format="NCHW", epsilon=1.001e-05, is_training=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1-0-TransposeNHWCToNCHW-LayoutOptimizer, VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1/Switch_1, VOCA/SpeechEncoder/batch_norm_1/cond/FusedBatchNorm_1/Switch_2, VOCA/SpeechEncoder/batch_norm_1/cond_1/AssignMovingAvg/sub/Switch, VOCA/SpeechEncoder/batch_norm_1/cond_1/AssignMovingAvg_1/sub/Switch)]]

TimoBolkart · 2019-07-23T15:25:22Z

Hi,

Can you please tell me which CUDA and cuDNN versions you are using?

qianyunw · 2019-07-25T02:51:05Z

Hi,

Sorry for getting back to you late. I am using CUDA 9.0 and cuDNN 7. ^_^

TimoBolkart · 2019-08-05T08:32:17Z

The code was tested with CUDA 9.0 and cuDNN 7.1.
Sorry, but I don't know what could be causing your error.

qianyunw · 2019-08-08T11:02:44Z

Hi,

Thank you so much for your reply ^_^
The problem was caused by GPU memory. I added the following code to inference.py, and now it works perfectly!
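import os
import tensorflow as tf

# Expose only GPU 2 (the one with free memory) to TensorFlow.
os.environ['CUDA_VISIBLE_DEVICES'] = '2'

# Allocate GPU memory on demand instead of reserving it all up front.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)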
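
For anyone who hits the same error on a shared machine: the GPU index is hardcoded to '2' above because that was the card with free memory in my log. A rough sketch of picking the card automatically might look like the following. It assumes `nvidia-smi` is on the PATH; the helper names (`parse_free_memory`, `pick_freest_gpu`, `select_gpu`) are made up for illustration and are not part of the VOCA code.

```python
import os
import subprocess


def parse_free_memory(csv_text):
    """Parse 'index, free_MiB' CSV lines into a list of (index, free_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        index, free_mib = [field.strip() for field in line.split(",")]
        gpus.append((int(index), int(free_mib)))
    return gpus


def pick_freest_gpu(gpus):
    """Return the index of the GPU with the most free memory."""
    return max(gpus, key=lambda gpu: gpu[1])[0]


def query_nvidia_smi():
    """Ask nvidia-smi for per-GPU free memory (in MiB) as CSV text."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,memory.free",
        "--format=csv,noheader,nounits",
    ])
    return output.decode("utf-8")


def select_gpu():
    """Hypothetical helper: expose only the GPU with the most free memory."""
    best = pick_freest_gpu(parse_free_memory(query_nvidia_smi()))
    # This must happen before TensorFlow initializes its GPU devices.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(best)
    return best
```

Calling `select_gpu()` before the first `tf.Session` is created would stand in for the hardcoded `os.environ` line. The `allow_growth` setting is still needed separately.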

TimoBolkart · 2019-08-08T11:12:22Z

Hi, great that it works now, and thanks a lot for the feedback.

TimoBolkart closed this as completed Aug 8, 2019
