InternalError (see above for traceback): Blas SGEMM launch failed : m=802816, n=64, k=32 #224

to-be-snail · 2019-02-12T02:14:14Z

When I run channel pruning on MobileNet with the ILSVRC-12 dataset, this error occurs. Pruning on the CIFAR-10 dataset completes normally.

jiaxiang-wu · 2019-02-12T03:26:59Z

Maybe something related to the GPU memory?
https://stackoverflow.com/questions/37337728/tensorflow-internalerror-blas-sgemm-launch-failed
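
As a rough check on the memory hypothesis, the shape in the error message can be turned into operand sizes. This is a minimal sketch, assuming the failing call is a plain single-precision product C[m, n] = A[m, k] @ B[k, n] with the m, n, k values from the issue title. The helper name `gemm_operand_bytes` is hypothetical and not part of any library.

```python
# Shape reported in the error message, read as C[m, n] = A[m, k] @ B[k, n].
M, N, K = 802816, 64, 32
BYTES_PER_FLOAT32 = 4  # SGEMM is the single-precision GEMM routine


def gemm_operand_bytes(m, n, k, elem_bytes=BYTES_PER_FLOAT32):
    """Return the byte size of A (m x k), B (k x n), and C (m x n)."""
    return {
        "A": m * k * elem_bytes,
        "B": k * n * elem_bytes,
        "C": m * n * elem_bytes,
    }


sizes = gemm_operand_bytes(M, N, K)
total_mib = sum(sizes.values()) / 2**20
for name, size in sizes.items():
    print(f"{name}: {size / 2**20:.1f} MiB")
print(f"total: {total_mib:.1f} MiB")
```

Under this estimate the three operands total roughly 294 MiB, far below 8 GB. If memory is the cause, it is more likely the footprint of the whole graph or another process holding the GPU than this single product.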

to-be-snail · 2019-02-12T03:32:02Z

My GPU is a GTX 2080 with 8 GB of memory. I don't know whether that is enough to finish the pruning...

jiaxiang-wu · 2019-02-12T03:34:53Z

Could you try the solutions provided in the Stack Overflow link above, and see if anything helps?

to-be-snail · 2019-02-12T03:38:59Z

I'm sure I run only one TensorFlow program at a time, and I have reinstalled tensorflow-gpu. It didn't work.

jiaxiang-wu · 2019-02-12T03:40:54Z

Maybe this one? https://stackoverflow.com/a/43130779/10611647

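import tensorflow as tf  # TensorFlow 1.x API

# Cap this process at 30% of GPU memory and pass the options to the session.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.3)
sess = tf.Session(config=tf.ConfigProto(
  gpu_options=gpu_options, allow_soft_placement=True, log_device_placement=True))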

to-be-snail · 2019-02-12T03:41:03Z

I'm sure I run only one TensorFlow program at a time, and I have reinstalled tensorflow-gpu. It didn't work.

to-be-snail · 2019-02-12T03:43:36Z

I have tried it, although I'm not sure where to put it.

jiaxiang-wu · 2019-02-12T03:45:04Z

How many GPU cards do you have?

to-be-snail · 2019-02-12T03:46:31Z

only one...

jiaxiang-wu · 2019-02-12T04:16:57Z

Try to reduce the batch size?

to-be-snail · 2019-02-12T04:42:37Z

I have reduced the batch_size_eval to 1

jiaxiang-wu · 2019-02-12T04:44:08Z

If the error occurs in the training process, then you should reduce FLAGS.batch_size instead of FLAGS.batch_size_eval.
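
One way to check whether a changed FLAGS.batch_size actually took effect is to compare the m value in the new error message with the old one. The sketch below assumes the failing op is a 1x1 convolution executed as a single GEMM with m = batch size × height × width, and it assumes a 112×112 feature map. Both are assumptions, not facts stated in this thread, and the function names are hypothetical.

```python
M_REPORTED = 802816        # m from the error message in the issue title
FEATURE_MAP = (112, 112)   # ASSUMPTION: spatial size of the layer's input


def implied_batch_size(m, height, width):
    """Return m / (height * width) if it divides evenly, else None."""
    pixels = height * width
    return m // pixels if m % pixels == 0 else None


def gemm_rows(batch_size, height, width):
    """Return the m dimension expected for a given batch size."""
    return batch_size * height * width


batch = implied_batch_size(M_REPORTED, *FEATURE_MAP)
print("implied batch size:", batch)
for smaller in (32, 16, 8):
    print(f"batch {smaller}: m = {gemm_rows(smaller, *FEATURE_MAP)}")
```

If the error still reports m=802816 after the flag is lowered, the new batch size is probably not reaching the training graph.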

to-be-snail · 2019-02-12T04:53:01Z

It didn't work...

jiaxiang-wu · 2019-02-25T00:59:38Z

Any updates? Still not working?

ShuteLee · 2019-03-18T12:34:35Z

Hey, have you figured it out? I ran into the same issue.

0113bernoyoun · 2019-04-04T12:22:06Z

Please, if you solve this problem, let me know how you solved it.

Donald-Su · 2019-08-07T07:44:32Z

I encountered the same issue when running my code on a machine with two GTX 2080 cards (8 GB of memory per GPU). The error is as follows:

InternalError (see above for traceback): Blas SGEMM launch failed : m=53290, n=80, k=64
	 [[node while/AdvInceptionV3/AdvInceptionV3/Conv2d_3b_1x1/Conv2D (defined at /home/suy/.pyenv/versions/mypython3.6/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py:1057)  = Conv2D[T=DT_FLOAT, data_format="NHWC", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](while/AdvInceptionV3/AdvInceptionV3/MaxPool_3a_3x3/MaxPool, while/AdvInceptionV3/AdvInceptionV3/Conv2d_3b_1x1/kernel/Regularizer/l2_regularizer/L2Loss/Enter)]]
	 [[{{node while/Exit/_791}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_4223_while/Exit", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

However, the same code runs on another machine with two GTX 2080 cards (10 GB of memory per GPU).

I still don't know why.

ShuteLee · 2019-08-07T08:33:28Z

I fixed this issue just by installing the patches for the CUDA Toolkit. @Donald-Su @0113bernoyoun

Donald-Su · 2019-08-07T09:31:08Z

Hi ShuteLee, the CUDA Toolkit is installed on the machine, but the issue persists:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
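
When comparing machines, it can help to pull the toolkit release out of this banner programmatically. This is a minimal sketch that parses the text shown above. The function name `parse_nvcc_release` is hypothetical, and the regular expression assumes the banner keeps the "release X.Y, VX.Y.Z" layout seen here.

```python
import re

# Banner text copied from the comment above.
NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
"""


def parse_nvcc_release(text):
    """Return (release, full_version) from an nvcc banner, or None."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+\.\d+\.\d+)", text)
    return match.groups() if match else None


release, full_version = parse_nvcc_release(NVCC_OUTPUT)
print(release, full_version)
```

The banner identifies the toolkit release only. It may not show whether the separate patch updates are installed, so a matching release number does not by itself rule out a missing patch.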

ShuteLee · 2019-08-07T11:07:45Z

Please make sure that you have installed all four patches:

https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal

Donald-Su · 2019-08-08T06:59:50Z

There is no package for my OS, Ubuntu 18.04.

ShuteLee · 2019-08-08T09:01:06Z

So maybe CUDA Toolkit 9.0 is not fully compatible with your Ubuntu 18.04. You could choose a more recent version.

bryanbocao · 2020-12-29T23:36:55Z

Have you made sure TensorFlow is at version 1.12.0, as mentioned in main.sh?

pip install tensorflow-gpu==1.12.0
