Getting invalid device function error while trying to make runtest in Ubuntu 14.04 with Cuda 6.0 #626

Closed
dasabir opened this issue Jul 5, 2014 · 13 comments

Comments

@dasabir

dasabir commented Jul 5, 2014

Hi,

I have installed and compiled Caffe successfully (that is, 'make all' and 'make test' ran without any error). While running 'make runtest', I get an "invalid device function" error. The full log is given below. I'm using CUDA 6.0 on Ubuntu 14.04 LTS. The gcc/g++ version is 4.6, and I changed the makefiles as described by weinman in #337. I had to install gcc/g++ 4.6 because gcc/g++ 4.8 gave me the error described in that issue.
Any help will be highly appreciated.

The log is below:
[ RUN ] StochasticPoolingLayerTest/1.TestGradientGPU
F0705 07:26:44.199472 14804 pooling_layer.cu:186] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
@ 0x2ad77f5559fd google::LogMessage::Fail()
@ 0x2ad77f55789d google::LogMessage::SendToLog()
@ 0x2ad77f5555ec google::LogMessage::Flush()
@ 0x2ad77f5581be google::LogMessageFatal::~LogMessageFatal()
@ 0x610fba caffe::PoolingLayer<>::Forward_gpu()
@ 0x431c58 caffe::GradientChecker<>::CheckGradientSingle()
@ 0x474124 caffe::StochasticPoolingLayerTest_TestGradientGPU_Test<>::TestBody()
@ 0x55b30d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x553131 testing::Test::Run()
@ 0x553216 testing::TestInfo::Run()
@ 0x553357 testing::TestCase::Run()
@ 0x5536ae testing::internal::UnitTestImpl::RunAllTests()
@ 0x55ae8d testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x55278e testing::UnitTest::Run()
@ 0x4120dd main
@ 0x2ad7818ffec5 (unknown)
@ 0x416e57 (unknown)
make: *** [runtest] Aborted (core dumped)

Thanks,
Abir

@kloudkl
Contributor

kloudkl commented Jul 6, 2014

@wusx11

wusx11 commented Jul 18, 2014

@dasabir Hello dasabir, have you solved your problem yet? I'm stuck on the same problem, and I haven't found a solution on the page kloudkl pointed to. Could you please share how you solved it?

@dasabir
Author

dasabir commented Jul 18, 2014

Hi @wusx11, I think I have solved the problem. Although I did a fresh re-install of Ubuntu 14.04, I believe my earlier mistake was in the Makefile.config file. For CUDA 6.0 I was uncommenting lines 14 and 15 as instructed, but I was not putting an escape character (\) at the end of line 13. Could you please try that? If it resolves the problem, this issue can be closed.

@wusx11

wusx11 commented Jul 21, 2014

Hi @dasabir, I hadn't put an escape character at the end of line 13 either. After I did that, it works well! Thank you so much!

@dasabir
Author

dasabir commented Jul 21, 2014

Hi @wusx11, I'm glad that the issue was resolved. I think it's time to close the ticket. I would also request that the contributors add a note about this in the comment section of Makefile.config.

@dasabir dasabir closed this as completed Jul 21, 2014
@kelvinxu

kelvinxu commented Aug 7, 2014

Hi @dasabir & @wusx11,

Could you give me a bit more detail on this bug? I'm running Ubuntu 12.04 and CUDA 5.5 and getting exactly the same error. My lines 13/14 are just commented out, so I'm unclear on your fix.

@runbin

runbin commented Aug 15, 2014

Uncommenting lines 24/25 is also necessary for CUDA 6.0.

@blackCmd

blackCmd commented Aug 3, 2015

@dasabir Hmm... I'm unclear too... ㅠ_ㅠ
Could you show me an example of what you described?

Thanks

@AustinVan

I faced the same issue and solved it successfully.
First, open the file Makefile.config in your Caffe directory.
You will see lines like the ones below:

CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35
# -gencode=arch=compute_50,code=sm_50
# -gencode=arch=compute_50,code=compute_50

Uncomment the last two lines.
Then you also need to add "\" at the end of

-gencode arch=compute_35,code=sm_35

so that the CUDA_ARCH list continues onto the lines you just uncommented.
After that, run make again.
Good luck!
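
For anyone still unsure what the missing escape character looks like, here is a minimal sketch (not part of Caffe) that scans Makefile.config text and reports `-gencode` lines that lack a trailing backslash even though another active `-gencode` line follows. The function name `find_missing_continuations` and the `BROKEN`/`FIXED` excerpts are hypothetical, modelled on the snippet quoted above; your own file may use different line numbers and flags.

```python
def find_missing_continuations(config_text):
    """Return 1-based numbers of -gencode lines that do not end with a
    backslash although the next line is another uncommented -gencode line."""
    lines = config_text.splitlines()
    missing = []
    for i, line in enumerate(lines[:-1]):
        current = line.strip()
        following = lines[i + 1].strip()
        is_flag_line = "-gencode" in current and not current.startswith("#")
        next_is_flag = following.startswith("-gencode")
        if is_flag_line and next_is_flag and not current.endswith("\\"):
            missing.append(i + 1)
    return missing


# Hypothetical excerpt: the *_50 lines are uncommented, but the sm_35 line
# has no trailing backslash, so the list stops there.
BROKEN = """CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \\
    -gencode arch=compute_20,code=sm_21 \\
    -gencode arch=compute_30,code=sm_30 \\
    -gencode arch=compute_35,code=sm_35
    -gencode=arch=compute_50,code=sm_50 \\
    -gencode=arch=compute_50,code=compute_50
"""

# Same excerpt with the backslash added after the sm_35 entry.
FIXED = BROKEN.replace("code=sm_35\n", "code=sm_35 \\\n")

print(find_missing_continuations(BROKEN))
print(find_missing_continuations(FIXED))
```

To check a real file, read it with `open("Makefile.config").read()` and pass the text to the function. The check is deliberately narrow: it only looks at adjacent `-gencode` lines and does not parse the makefile in general.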

@loretoparisi

I'm running into this error with

$ docker run -ti caffe:gpu caffe --version
libdc1394 error: Failed
caffe version 1.0.0-rc3

and

$ nvidia-smi
Tue Oct 25 15:08:35 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 370.28                 Driver Version: 370.28                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0      On |                  N/A |
|  0%   48C    P8     7W / 200W |     62MiB /  8105MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 0000:02:00.0     Off |                  N/A |
|  0%   38C    P8     7W / 200W |      1MiB /  8113MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1241    G   /usr/lib/xorg/Xorg                              60MiB |
+-----------------------------------------------------------------------------+

and

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Sun_Sep__4_22:14:01_CDT_2016
Cuda compilation tools, release 8.0, V8.0.44

@vamsus

vamsus commented Mar 10, 2017

I am facing a similar error. I'm using the latest CUDA version (8.0) with the GPU enabled (NVIDIA GeForce 820M) on Ubuntu 16.04. How do I change the CUDA arch?

[ RUN ] TanHLayerTest/2.TestTanH
F0310 07:19:41.605973 3025 tanh_layer.cu:26] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
@ 0x7f5cb33b75cd google::LogMessage::Fail()
@ 0x7f5cb33b9433 google::LogMessage::SendToLog()
@ 0x7f5cb33b715b google::LogMessage::Flush()
@ 0x7f5cb33b9e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f5cb162f2aa caffe::TanHLayer<>::Forward_gpu()
@ 0x481379 caffe::Layer<>::Forward()
@ 0x7b1320 caffe::TanHLayerTest<>::TestForward()
@ 0x8e1cb3 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x8db2ca testing::Test::Run()
@ 0x8db418 testing::TestInfo::Run()
@ 0x8db4f5 testing::TestCase::Run()
@ 0x8dc7cf testing::internal::UnitTestImpl::RunAllTests()
@ 0x8dcaf3 testing::UnitTest::Run()
@ 0x46693d main
@ 0x7f5cb0d3b830 __libc_start_main
@ 0x46dfd9 _start
@ (nil) (unknown)
Makefile:532: recipe for target 'runtest' failed
make: *** [runtest] Aborted (core dumped)
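
On the question of how to change the CUDA arch: in the snippet quoted earlier in this thread, the architectures are the `-gencode` entries of `CUDA_ARCH` in Makefile.config. The sketch below builds one such entry from a compute capability. The helper name `gencode_flag` is hypothetical, and whether a given toolkit still accepts a particular architecture is an assumption you should verify against your nvcc version.

```python
def gencode_flag(major, minor):
    """Build one nvcc -gencode entry for a compute capability (major, minor)."""
    real = f"sm_{major}{minor}"
    # sm_21 has no virtual architecture of its own; it is compiled from
    # compute_20, as in the CUDA_ARCH lines quoted earlier in this thread.
    if (major, minor) == (2, 1):
        virtual = "compute_20"
    else:
        virtual = f"compute_{major}{minor}"
    return f"-gencode arch={virtual},code={real}"


print(gencode_flag(2, 1))  # GeForce 820M, capability 2.1 per this thread
print(gencode_flag(6, 1))  # GeForce GTX 1080, capability 6.1
```

The resulting string would be added as one more line of `CUDA_ARCH`, with a trailing backslash on the line before it, followed by a rebuild.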

@TheShadow29

@vamsus have you been able to solve the problem?

@vamsus

vamsus commented Aug 15, 2017

@TheShadow29 I solved it on the CUDA 8.0 installation by disabling cuDNN support. The NVIDIA GeForce 820M has compute capability 2.1, and cuDNN requires compute capability 3.0 or higher.

You can check your GPU's compute capability at https://developer.nvidia.com/cuda-gpus. Disable cuDNN by commenting out the corresponding line in the makefile.

If you still face the same error, follow this installation guide: http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#axzz4ajfl49uf
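
Once you know the compute capability, a rough way to see whether a build can run on the card is to compare it with the `code=` targets in `CUDA_ARCH`. The sketch below assumes the usual CUDA compatibility rules: `sm_XY` binary code runs on devices with the same major version and an equal or higher minor version, and `compute_XY` PTX can be JIT-compiled for devices of equal or higher capability. The function names and the sample flag string are hypothetical, and the parser does not handle quoted multi-target forms such as `code=\"sm_35,compute_35\"`.

```python
import re


def parse_targets(cuda_arch):
    """Extract (kind, major, minor) from code=sm_XY / code=compute_XY entries."""
    targets = []
    for kind, digits in re.findall(r"code=(sm|compute)_(\d+)", cuda_arch):
        targets.append((kind, int(digits[:-1]), int(digits[-1])))
    return targets


def device_is_covered(cuda_arch, major, minor):
    """Return True if any target in cuda_arch should be usable on the device."""
    for kind, t_major, t_minor in parse_targets(cuda_arch):
        # Binary code: same major version, minor no newer than the device.
        if kind == "sm" and t_major == major and t_minor <= minor:
            return True
        # PTX: can be JIT-compiled for equal or newer capabilities.
        if kind == "compute" and (t_major, t_minor) <= (major, minor):
            return True
    return False


# Hypothetical flag string, for illustration only.
SAMPLE_ARCH = (
    "-gencode arch=compute_30,code=sm_30 "
    "-gencode arch=compute_35,code=sm_35 "
    "-gencode arch=compute_50,code=sm_50 "
    "-gencode arch=compute_50,code=compute_50"
)

print(device_is_covered(SAMPLE_ARCH, 2, 1))  # capability 2.1
print(device_is_covered(SAMPLE_ARCH, 3, 5))  # capability 3.5
print(device_is_covered(SAMPLE_ARCH, 6, 1))  # capability 6.1, via PTX
```

A result of False is consistent with an "invalid device function" failure at the first kernel launch, though it is not the only possible cause. A result of True does not guarantee success either, since libraries such as cuDNN impose their own minimum capability.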

This issue was closed.
10 participants