could not set cudnn filter descriptor: CUDNN_STATUS_BAD_PARAM #5772

yetionyo · 2016-11-22T03:12:14Z

The installed versions of CUDA and cuDNN meet the requirements, but cuDNN still cannot be used properly.

What related GitHub issues or StackOverflow threads have you found by searching the web for your problem?

Environment info

Operating System:
Linux version 3.16.0-30-generic (buildd@kissel) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #40~14.04.1-Ubuntu

Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):
-rw-r--r-- 1 root root 558720 Sep 15 07:02 /usr/local/cuda/lib64/libcudadevrt.a
lrwxrwxrwx 1 root root 16 Sep 15 07:05 /usr/local/cuda/lib64/libcudart.so -> libcudart.so.8.0
lrwxrwxrwx 1 root root 19 Sep 15 07:05 /usr/local/cuda/lib64/libcudart.so.8.0 -> libcudart.so.8.0.44
-rw-r--r-- 1 root root 415432 Sep 15 07:02 /usr/local/cuda/lib64/libcudart.so.8.0.44
-rw-r--r-- 1 root root 775162 Sep 15 07:02 /usr/local/cuda/lib64/libcudart_static.a
lrwxrwxrwx 1 root root 13 Nov 22 10:55 /usr/local/cuda/lib64/libcudnn.so -> libcudnn.so.5
lrwxrwxrwx 1 root root 17 Nov 22 10:55 /usr/local/cuda/lib64/libcudnn.so.5 -> libcudnn.so.5.1.5
-rw-r--r-- 1 root root 78065952 Nov 22 10:09 /usr/local/cuda/lib64/libcudnn.so.5.0.5
-rw-r--r-- 1 root root 79337624 Nov 22 10:17 /usr/local/cuda/lib64/libcudnn.so.5.1.5
-rw-r--r-- 1 root root 69756172 Nov 22 10:17 /usr/local/cuda/lib64/libcudnn_static.a
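The listing above shows two cuDNN builds (5.0.5 and 5.1.5) side by side, with the symlink chain ending at 5.1.5. A minimal sketch for confirming which real file each symlink resolves to is shown here; the helper names `resolve_cuda_libs` and `version_from_filename` are hypothetical, and the default directory is simply the one from the listing.

```python
import glob
import os
import re


def resolve_cuda_libs(lib_dir="/usr/local/cuda/lib64", pattern="libcud*.so*"):
    """Map each matching library path to the real file it resolves to."""
    resolved = {}
    for path in sorted(glob.glob(os.path.join(lib_dir, pattern))):
        # realpath follows the whole symlink chain, e.g.
        # libcudnn.so -> libcudnn.so.5 -> libcudnn.so.5.1.5
        resolved[path] = os.path.realpath(path)
    return resolved


def version_from_filename(path):
    """Extract a version tuple from a name like libcudnn.so.5.1.5."""
    match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", os.path.basename(path))
    if match is None:
        return None  # unversioned name such as libcudnn.so
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    for link, target in resolve_cuda_libs().items():
        print(link, "->", target, version_from_filename(target))
```

This only inspects file names on disk; it does not tell which library the TensorFlow process actually loaded.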

If installed from binary pip package, provide:

A link to the pip package you installed:
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0-cp27-none-linux_x86_64.whl
The output from python -c "import tensorflow; print(tensorflow.__version__)".
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
0.11.0

If possible, provide a minimal reproducible example (We usually don't have time to read hundreds of lines of your code)

The error occurs when calling an op that is only supported by cuDNN, for example conv2d.


prb12 · 2016-11-22T21:30:57Z

@yetionyo Could you please supply a minimal repro example?
@zheng-xq Can you think of any reason why this might happen?

yetionyo · 2016-11-23T12:45:58Z

It turns out the problem is not in conv2d itself. It may be related to the way I used conv2d, because I can run this demo without the problem:
import tensorflow as tf

# Random NHWC input batch and a 3x3 filter bank mapping 3 -> 10 channels.
my_data = tf.random_normal([20, 20, 20, 3])
my_filter = tf.random_normal([3, 3, 3, 10])
conv_result = tf.nn.conv2d(my_data, my_filter, strides=[1, 1, 1, 1], padding="VALID")
sess = tf.Session()
result = sess.run(conv_result)
print(result)

Still, it is unclear what kind of operation would lead to this problem (it looks more like a failure in the call into cuDNN).
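For reference, the output shape of the demo above can be worked out by hand. The sketch below assumes NHWC input and the standard VALID-padding rule, out = ceil((in - filter + 1) / stride); the function name `conv2d_valid_output_shape` is hypothetical and not part of TensorFlow.

```python
import math


def conv2d_valid_output_shape(input_shape, filter_shape, strides):
    """Output shape of an NHWC conv2d with padding="VALID".

    input_shape:  [batch, in_height, in_width, in_channels]
    filter_shape: [filter_height, filter_width, in_channels, out_channels]
    strides:      [1, stride_h, stride_w, 1]
    """
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    if in_c != f_in_c:
        raise ValueError("input channels %d != filter channels %d" % (in_c, f_in_c))
    _, s_h, s_w, _ = strides
    # VALID padding keeps only positions where the filter fits entirely.
    out_h = int(math.ceil((in_h - f_h + 1) / float(s_h)))
    out_w = int(math.ceil((in_w - f_w + 1) / float(s_w)))
    return [batch, out_h, out_w, out_c]


# Shapes taken from the demo: prints [20, 18, 18, 10]
print(conv2d_valid_output_shape([20, 20, 20, 3], [3, 3, 3, 10], [1, 1, 1, 1]))
```

Every dimension here is non-zero, which is consistent with the demo running cleanly.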

prb12 · 2016-11-23T18:59:32Z

Is this similar to #5476, #4909 and #4111?

All of these seem to mention passing an empty numpy array into TF. @zheng-xq Is there perhaps some input validation missing on the cuDNN ops?

yetionyo · 2016-11-24T02:06:44Z

Yeah, these problems are similar to mine. An empty numpy array may not be the main cause in my case, but some improper ops do indeed exist. Thanks :)

prb12 · 2016-11-24T18:07:22Z

I'd like to leave this open until we understand why an empty array causes a CUDA error, rather than a TensorFlow runtime InvalidArgument error status.
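Until such validation exists inside the ops, a caller-side check can turn the opaque CUDA error into a readable one. This is a minimal sketch, not TensorFlow's own validation; `check_non_empty` is a hypothetical helper, and `my_placeholder` in the comment is an assumed name.

```python
def check_non_empty(shape, name="input"):
    """Raise a readable error if any dimension of `shape` is zero."""
    dims = tuple(int(d) for d in shape)
    if any(d == 0 for d in dims):
        raise ValueError(
            "%s has an empty shape %r; refusing to pass it to a cuDNN-backed op"
            % (name, dims))
    return dims


# Intended use with a numpy batch (assumes numpy and a placeholder exist):
#   check_non_empty(batch.shape, name="my_data")
#   sess.run(conv_result, feed_dict={my_placeholder: batch})
check_non_empty((20, 20, 20, 3))  # passes silently
try:
    check_non_empty((0, 20, 20, 3), name="my_data")
except ValueError as err:
    print(err)
```

The check works on a plain shape tuple, so it can run before any data reaches the session.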

gibiansky · 2017-01-10T00:14:19Z

Looks like this is still an issue on current master. It would be nice to get this fixed! The CUDA error is quite mysterious when you run into it.

ronghanghu · 2017-02-23T05:00:20Z

This issue seems to affect TensorFlow Fold, which uses dynamic network structures and can often generate an empty tensor if a path is not used in a dynamic batch.
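One possible workaround for unused paths is to skip the op entirely when its batch is empty. The sketch below is illustrative only and is not how TensorFlow Fold works; plain lists stand in for tensors, and `apply_unless_empty` and `double_all` are hypothetical names.

```python
def apply_unless_empty(batch, op, empty_result=None):
    """Call op(batch) only when batch has at least one element."""
    if len(batch) == 0:
        # Nothing to compute: skip the op so no zero-sized input reaches it.
        return [] if empty_result is None else empty_result
    return op(batch)


def double_all(batch):
    # Stand-in for a real op such as a conv2d call.
    return [2 * x for x in batch]


print(apply_unless_empty([1, 2, 3], double_all))  # [2, 4, 6]
print(apply_unless_empty([], double_all))         # []
```

In a real graph the empty result would need the shape that downstream ops expect, which is why `empty_result` is left to the caller.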

tensorflowbutler · 2017-12-22T07:40:08Z

It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

yzhwang · 2017-12-22T19:38:18Z

There is a pull request that should handle the issue. Please check after the pull request has been approved: #15264

tensorflowbutler · 2018-01-06T18:56:39Z

Nagging Assignee: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

tensorflowbutler · 2018-01-24T13:22:14Z

Nagging Assignee: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

yzhwang · 2018-01-25T00:11:19Z

#15264 has been merged, so I believe this issue should now be fixed. Please reopen if it still exists.

drscotthawley · 2018-02-02T23:05:55Z

Just found this page. I'm seeing this error with a fresh nightly tensorflow-gpu on Ubuntu. So, despite the merge, this doesn't look resolved.

kirk86 · 2018-02-04T23:32:54Z

Same here. I get this error as well on Ubuntu with TF 1.4.1, not the nightly build.

ppwwyyxx · 2018-02-05T00:57:54Z

@drscotthawley you need to provide more details (logs, small repro code, etc.) so people can tell whether it's the same problem (empty tensors passed into cuDNN) or not. The fix above only adds support for empty tensors on certain ops, and very likely there are ops it does not cover.

yzhwang · 2018-02-06T05:34:19Z

@ppwwyyxx Thanks for the comment! @drscotthawley and @kirk86 , could you provide more info so that I can take a closer look?

drscotthawley · 2018-02-06T05:54:11Z

@ppwwyyxx @yzhwang I had just downloaded a fresh CUDA from NVIDIA, which defaults to version 9.1, not realizing that TF didn't support that yet. I resolved this problem by downgrading to CUDA 9.0. You can close this issue again.
@kirk86, try using CUDA 9.0 instead. Also, I'm using CUDNN 7.0.5 and it's working.

Might be worth noting: I've built TF from source before, but couldn't manage to do so using CUDA 9.1. I don't recall the errors, just that downgrading to 9.0 finally enabled me to "get back to work."

kirk86 · 2018-02-06T14:53:18Z

@drscotthawley Thanks for your answer, but in my case I can't do that. It's a shared system and I'm not an admin.

prb12 added the stat:awaiting tensorflower and stat:awaiting response labels Nov 22, 2016

aselle removed the stat:awaiting response label Nov 23, 2016

prb12 assigned zheng-xq Nov 23, 2016

prb12 added the bug label Nov 23, 2016

yetionyo closed this as completed Nov 24, 2016

prb12 reopened this Nov 24, 2016

aselle added the type:bug label and removed the bug label Feb 9, 2017

ronghanghu mentioned this issue Feb 27, 2017 in tensorflow/fold#23 (closed): Zero-sized batch causes operations like conv2d to crash (on GPU)

zheng-xq assigned yzhwang and unassigned zheng-xq Dec 22, 2017

yzhwang closed this as completed Jan 25, 2018
