Internal error: Blas xGEMV launch failed
on TensorFlow v2.8.0 for the same block of code that runs perfectly well on TensorFlow v2.4.1
#54463
Comments
Currently, the workaround for me is to use CUDA 11.1 with cuDNN 8.1.1. I arrived at this after finding out that Google Colab has TensorFlow 2.8.0 installed but runs on CUDA 11.1, although TensorFlow's compatibility matrix recommends CUDA 11.2. When I installed CUDA 11.1, which is usually bundled with cuDNN 8.0.x, TensorFlow threw an error saying it requires cuDNN 8.1.x; upgrading cuDNN to 8.1.1 does the trick.

Having said that, I believe the reported bug is something to be looked at and addressed. I suspect this problem appears in all TensorFlow versions that recommend CUDA 11.2 and cuDNN 8.1, i.e. TensorFlow >= 2.5.0. I say this because I was getting the same error after downgrading to TensorFlow 2.7.0 on CUDA 11.2 and cuDNN 8.1.

For those who have CUDA 11.1 installed with cuDNN 8.0.x on Ubuntu 18.04 / 20.04, the following commands would upgrade your cuDNN version from 8.0.x to 8.1.1.
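The exact commands are not shown in this thread; the following is a hedged reconstruction of what such an upgrade typically looks like, assuming NVIDIA's CUDA apt repository is already configured. The precise package version string (`8.1.1.33-1+cuda11.2`) is an assumption; check the available candidates on your machine first.

```shell
# Assumption: NVIDIA's CUDA apt repository is set up on this Ubuntu machine.
sudo apt-get update

# List the cuDNN versions the repository offers before pinning one:
apt-cache policy libcudnn8

# Pin runtime and dev packages to a cuDNN 8.1.1 build (version string assumed):
sudo apt-get install --allow-change-held-packages \
    libcudnn8=8.1.1.33-1+cuda11.2 \
    libcudnn8-dev=8.1.1.33-1+cuda11.2
```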
@arvindrajan92,
Hi @tilakrayal, thank you for getting back to me. Your gist brings me back to this issue, though; could you check your link, please? Also, this is my Google Colab notebook, which says CUDA 11.1 when I execute
@arvindrajan92, This error is due to
Can you verify the memory usage with nvidia-smi? If you have any other
Thank you for taking a look at this issue @gadagashwini. Please allow me to address your points:

- OOM error (GPU is running out of memory)
- Doesn't have enough compute capacity
- There's a driver issue

Are you able to run this on a physical machine with CUDA 11.2.1 and cuDNN 8.1 without issues?
Hi @gadagashwini, are you still looking into this issue? Thanks.
Indeed, this is expected behaviour. As per the TensorFlow documentation, CUDA 11.2 and cuDNN 8.1 are compatible versions. I could run the given code on CUDA 11.2 with cuDNN 8.1. Thanks!
It may be a bug in cuBLAS. cuBLAS 11.4 resolved an issue:
In your case, m = 1638400 > 2^20. As cuBLAS is not open source, it's unclear which versions of cuBLAS have this issue.
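For context, with n = 1 the batched matmul collapses to a single matrix-vector product, which is why the error names xGEMV. A NumPy sketch of the same shapes (values assumed, since the original tensors are not shown) runs fine on CPU; it is only the GPU launch through cuBLAS that fails:

```python
import numpy as np

# Same shapes as the failing TensorFlow op: a.shape=[1, m, 3], b.shape=[1, 3, 1].
# With n = 1, the matmul is effectively a GEMV whose row count m exceeds 2**20,
# the size range reportedly affected by the cuBLAS bug.
m = 1638400
a = np.zeros((1, m, 3), dtype=np.float32)
b = np.ones((1, 3, 1), dtype=np.float32)

out = np.matmul(a, b)        # CPU reference succeeds; the GPU kernel launch is what fails
print(out.shape, m > 2**20)  # (1, 1638400, 1) True
```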
Thank you @njzjz. From trying out different versions of CUDA, it seems the bug was introduced in CUDA 11.2 and only resolved in CUDA 11.4; I don't see TensorFlow throwing the error on CUDA 11.1.

Hi @gadagashwini, I am happy to close the issue since it is a bug in cuBLAS 11.2. I suppose this is something to keep in mind so that upcoming TensorFlow versions are not built against CUDA 11.3, which may also have the same bug in cuBLAS.
@arvindrajan92, |
`Blas xGEMV launch failed : a.shape=[1,8696332,3], b.shape=[1,3,1], m=8696332, n=1, k=3 [Op:MatMul]`

I am getting the error above. I am using CUDA 11.2 and TensorFlow version 2.11.1.
System information
Describe the current behavior
Running a block of code with TensorFlow v2.8.0 / CUDA 11.2 / cuDNN 8.1 returns an internal error
Blas xGEMV launch failed
when it runs perfectly well with TensorFlow v2.4.1 / CUDA 11.0 / cuDNN 8.0.
Describe the expected behavior
Return the same output as TensorFlow v2.4.1 / CUDA 11.0 / cuDNN 8.0.
Contributing
Standalone code to reproduce the issue
The following block of code works perfectly well with TensorFlow v2.4.1 / CUDA 11.0 / cuDNN 8.0, but not with TensorFlow v2.8.0 / CUDA 11.2 / cuDNN 8.1.
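The original snippet is not preserved in this thread; a minimal sketch consistent with the description might look like the following. The `empty_image` shape `[1280, 1280, 3]` and the grey-scale weights are assumptions chosen only so that m = 1280 * 1280 = 1638400 matches the reported error.

```python
import tensorflow as tf

# Hypothetical reproduction (names and shapes assumed, not the original code).
# 1280 * 1280 = 1638400 rows, which matches the m reported in the error.
empty_image = tf.zeros([1280, 1280, 3])
flat = tf.reshape(empty_image, [1, -1, 3])          # a.shape = [1, 1638400, 3]
weights = tf.constant([[0.299], [0.587], [0.114]])  # RGB-to-grey weights, shape [3, 1]

# On CUDA 11.2 / cuDNN 8.1 this raises InternalError: Blas xGEMV launch failed.
out = tf.matmul(flat, weights[tf.newaxis, ...])     # b.shape = [1, 3, 1]
print(out.shape)  # (1, 1638400, 1)
```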
An important point to note is that when I reduce the `shape` of `empty_image` to `[512, 512, 3]`, there is no issue. However, I believe this is not a device memory issue, as I can reproduce it with a GeForce RTX 2080 Ti 11 GB as well as a Tesla T4 16 GB.
Other info / logs