syncedmem.cpp:57] Check failed: error == cudaSuccess (11 vs. 0) #1600

Closed
jwessnit opened this issue Dec 19, 2014 · 5 comments

Comments

@jwessnit

Hi,

Does anyone know what the problem might be here? cudaMalloc is failing at iteration 0 when I try to train my own network. make runtest passed except for 12 unit tests (see below).

I have seen CUDA error 11 a few times in other submitted issues, but none of them were caused by syncedmem.cpp.

I1219 14:44:31.277175 4727 net.cpp:208] This network produces output accuracy
I1219 14:44:31.277303 4727 net.cpp:208] This network produces output loss
I1219 14:44:31.277470 4727 net.cpp:467] Collecting Learning Rate and Weight Decay.
I1219 14:44:31.277608 4727 net.cpp:219] Network initialization done.
I1219 14:44:31.277739 4727 net.cpp:220] Memory required for data: 5493236
I1219 14:44:31.278095 4727 solver.cpp:41] Solver scaffolding done.
I1219 14:44:31.278239 4727 solver.cpp:160] Solving testnet
I1219 14:44:31.278445 4727 solver.cpp:247] Iteration 0, Testing net (#0)
F1219 14:44:31.733762 4727 syncedmem.cpp:57] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***
@ 0xb2619060 (unknown)
@ 0xb2618f5c (unknown)
@ 0xb2618b78 (unknown)
@ 0xb261af98 (unknown)
Aborted
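
When comparing reports like this one, it can help to pull the source file, line number, error code, and message out of the fatal log line. The following is a minimal sketch, assuming the line has the shape shown above; the function name `parse_cuda_check_failure` and the regular expression are illustrative and not part of Caffe or glog.

```python
import re

# Matches a glog fatal line for a failed CUDA check, in the shape seen above:
# "F<mmdd> <time> <thread id> <file>:<line>] Check failed: error == cudaSuccess (<code> vs. 0) <message>"
CHECK_RE = re.compile(
    r"^F\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+\s+"
    r"(?P<file>[\w.]+):(?P<line>\d+)\] "
    r"Check failed: error == cudaSuccess "
    r"\((?P<code>\d+) vs\. 0\)\s*(?P<message>.*)$"
)


def parse_cuda_check_failure(log_line):
    """Return file, line, code, and message of a failed CUDA check, or None."""
    match = CHECK_RE.match(log_line.strip())
    if match is None:
        return None
    return {
        "file": match.group("file"),
        "line": int(match.group("line")),
        "code": int(match.group("code")),
        "message": match.group("message").strip(),
    }


if __name__ == "__main__":
    fatal = (
        "F1219 14:44:31.733762 4727 syncedmem.cpp:57] "
        "Check failed: error == cudaSuccess (11 vs. 0) invalid argument"
    )
    print(parse_cuda_check_failure(fatal))
```

For the line above, the extracted code is 11 and the message is "invalid argument", which is the text the CUDA runtime attached to the error in this log.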

make runtest errors:

[----------] Global test environment tear-down
[==========] 838 tests from 169 test cases ran. (911468 ms total)
[ PASSED ] 826 tests.
[ FAILED ] 12 tests, listed below:
[ FAILED ] NetTest/0.TestParamPropagateDown, where TypeParam = caffe::FloatCPU
[ FAILED ] NetTest/1.TestParamPropagateDown, where TypeParam = caffe::DoubleCPU
[ FAILED ] NetTest/2.TestParamPropagateDown, where TypeParam = caffe::FloatGPU
[ FAILED ] NetTest/3.TestParamPropagateDown, where TypeParam = caffe::DoubleGPU
[ FAILED ] MathFunctionsTest/0.TestSgnbitCPU, where TypeParam = float
[ FAILED ] MathFunctionsTest/0.TestSignCPU, where TypeParam = float
[ FAILED ] MathFunctionsTest/1.TestSignCPU, where TypeParam = double
[ FAILED ] MathFunctionsTest/1.TestSgnbitCPU, where TypeParam = double
[ FAILED ] HingeLossLayerTest/0.TestGradientL1, where TypeParam = caffe::FloatCPU
[ FAILED ] HingeLossLayerTest/1.TestGradientL1, where TypeParam = caffe::DoubleCPU
[ FAILED ] HingeLossLayerTest/2.TestGradientL1, where TypeParam = caffe::FloatGPU
[ FAILED ] HingeLossLayerTest/3.TestGradientL1, where TypeParam = caffe::DoubleGPU

12 FAILED TESTS
YOU HAVE 2 DISABLED TESTS
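
The failure list above is easier to read when grouped by test case and test name, ignoring the type-parameter variants. Here is a small sketch that does the grouping, assuming the gtest summary format shown above; `summarize_failures` is a hypothetical helper, not something shipped with Caffe or Google Test.

```python
import re
from collections import Counter

# Matches lines such as
# "[ FAILED ] NetTest/0.TestParamPropagateDown, where TypeParam = caffe::FloatCPU"
FAILED_RE = re.compile(r"^\[\s*FAILED\s*\]\s+(\w+)/\d+\.(\w+),")

REPORT = """\
[ FAILED ] 12 tests, listed below:
[ FAILED ] NetTest/0.TestParamPropagateDown, where TypeParam = caffe::FloatCPU
[ FAILED ] NetTest/1.TestParamPropagateDown, where TypeParam = caffe::DoubleCPU
[ FAILED ] NetTest/2.TestParamPropagateDown, where TypeParam = caffe::FloatGPU
[ FAILED ] NetTest/3.TestParamPropagateDown, where TypeParam = caffe::DoubleGPU
[ FAILED ] MathFunctionsTest/0.TestSgnbitCPU, where TypeParam = float
[ FAILED ] MathFunctionsTest/0.TestSignCPU, where TypeParam = float
[ FAILED ] MathFunctionsTest/1.TestSignCPU, where TypeParam = double
[ FAILED ] MathFunctionsTest/1.TestSgnbitCPU, where TypeParam = double
[ FAILED ] HingeLossLayerTest/0.TestGradientL1, where TypeParam = caffe::FloatCPU
[ FAILED ] HingeLossLayerTest/1.TestGradientL1, where TypeParam = caffe::DoubleCPU
[ FAILED ] HingeLossLayerTest/2.TestGradientL1, where TypeParam = caffe::FloatGPU
[ FAILED ] HingeLossLayerTest/3.TestGradientL1, where TypeParam = caffe::DoubleGPU
"""


def summarize_failures(report):
    """Count failed tests per (test case, test name), merging type variants."""
    counts = Counter()
    for line in report.splitlines():
        match = FAILED_RE.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (case, name), count in sorted(summarize_failures(REPORT).items()):
        print(f"{case}.{name}: {count} variant(s) failed")
```

Grouped this way, the 12 failures reduce to four distinct tests, and each one fails for every type parameter it was run with, including the CPU-only variants.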

@dreadlord1984

Have you solved this problem? Or do you know what causes it?
I hit the same error when fine-tuning the trained model as described here: http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html

I am using Caffe version 0.99. Other training tasks run successfully; only fine-tuning the trained model with a different set of output classes fails.
My error:

math_functions.cpp:90] Check failed: error == cudaSuccess (11 vs. 0) invalid argument
*** Check failure stack trace: ***
@ 0x7fca72b8edaa (unknown)
@ 0x7fca72b8ece4 (unknown)
@ 0x7fca72b8e6e6 (unknown)
@ 0x7fca72b91687 (unknown)
@ 0x4b6dec caffe::caffe_copy<>()
@ 0x49ff4f caffe::SGDSolver<>::ComputeUpdateValue()
@ 0x49b6d3 caffe::Solver<>::Solve()
@ 0x41046e caffe::Solver<>::Solve()
@ 0x40de4d train()
@ 0x40f508 main
@ 0x7fca6ff3bec5 (unknown)
@ 0x40d6e9 (unknown)
Aborted (core dumped)

@mingtop

mingtop commented Jan 22, 2015

I got this problem when using a GT9500 GPU, but I fixed it by switching to a GTX750.
Caffe supports compute capability >= 2.0, so you should change your GPU.

@jwessnit
Author

I have not yet solved this problem. Unfortunately, I am using a Jetson TK1, so I can't change GPU.
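
The compute capability comparison raised in the previous comment can be sketched as below. This is an illustration only: the table lists the compute capabilities NVIDIA publishes for the three GPUs named in this thread (assuming "GT9500" means the GeForce 9500 GT), and the 2.0 minimum is the figure stated in the comment above, not one checked against the Caffe documentation.

```python
# Compute capability as (major, minor), per NVIDIA's published tables.
# Assumption: "GT9500" in the comment above refers to the GeForce 9500 GT.
COMPUTE_CAPABILITY = {
    "GeForce 9500 GT": (1, 1),
    "GeForce GTX 750": (5, 0),
    "Jetson TK1": (3, 2),
}

# Minimum as stated in the comment above; treated here as an assumption.
MIN_REQUIRED = (2, 0)


def meets_minimum(gpu_name):
    """Return True if the GPU's compute capability is at least MIN_REQUIRED."""
    return COMPUTE_CAPABILITY[gpu_name] >= MIN_REQUIRED


if __name__ == "__main__":
    for name in COMPUTE_CAPABILITY:
        status = "meets" if meets_minimum(name) else "is below"
        print(f"{name} {status} the stated minimum")
```

Under that stated minimum the Jetson TK1 passes, so a too-low compute capability alone would not explain the failure on this board.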

@shelhamer
Member

The latest release, rc2, includes the Jetson compatibility changes, so it should work fine. See this blog post for details: http://jetsonhacks.com/2015/01/17/nvidia-jetson-tk1-caffe-deep-learning-framework/

Please ask about hardware on the caffe-users group.

@leejiajun

@jwessnit Have you solved this problem?
