in debug mode, got Assertion `cudaGetLastError() == cudaSuccess' failed #6285

Closed
aseuteurideu opened this issue Dec 13, 2016 · 8 comments

@aseuteurideu

Environment: Ubuntu 14.04, CUDA 8, cuDNN 5.1, TensorFlow r0.12
Build command:
bazel build -c opt --config cuda -c dbg --strip=never //tensorflow/tools/pip_package:build_pip_package

Previously, with TensorFlow installed normally (not in debug mode), my code worked fine. But after rebuilding with the command above, this error appears in the middle of running the session:

I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcupti.so.8.0 locally
python: external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h:262: static void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable>::run(const Expression&, const Eigen::GpuDevice&) [with Expression = const Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long int>, 16, Eigen::MakePointer>, const Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<const float, const float>, const Eigen::TensorMap<Eigen::Tensor<const float, 1, 1, long int>, 16, Eigen::MakePointer>, const Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, const Eigen::TensorMap<Eigen::Tensor<const float, 1, 1, long int>, 16, Eigen::MakePointer> > > >; bool Vectorizable = true]: Assertion `cudaGetLastError() == cudaSuccess' failed.

The gdb backtrace when the error occurs (it appears to come from the ReLU operator):

(gdb) bt
#0 0x00007ff450df6c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007ff450dfa028 in __GI_abort () at abort.c:89
#2 0x00007ff450defbf6 in __assert_fail_base (
fmt=0x7ff450f403b8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x7ff42dc57260 "cudaGetLastError() == cudaSuccess",
file=file@entry=0x7ff42dc57210 "external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h", line=line@entry=262,
function=function@entry=0x7ff42dc58f80 <Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const> const, Eigen::GpuDevice, true>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const> const&, Eigen::GpuDevice const&)::PRETTY_FUNCTION> "static void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable>::run(const Expression&, const Eigen::GpuDevice&) [with Expression = const Eigen::TensorAssignOp<Eigen::TensorMap"...) at assert.c:92
#3 0x00007ff450defca2 in __GI___assert_fail (
assertion=0x7ff42dc57260 "cudaGetLastError() == cudaSuccess",
file=0x7ff42dc57210 "external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h", line=262,
function=0x7ff42dc58f80 <Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const> const, Eigen::GpuDevice, true>::run(Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const> const&, Eigen::GpuDevice const&)::PRETTY_FUNCTION> "static void Eigen::internal::TensorExecutor<Expression, Eigen::GpuDevice, Vectorizable>::run(const Expression&, const Eigen::GpuDevice&) [with Expression = const Eigen::TensorAssignOp<Eigen::TensorMap"...) at assert.c:101
#4 0x00007ff42ad39672 in Eigen::internal::TensorExecutor<Eigen::TensorAssignOp<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> const> const, Eigen::GpuDevice, true>::run (expr=..., device=...)
at external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorExecutor.h:260
#5 0x00007ff42ad37a6b in Eigen::TensorDevice<Eigen::TensorMap<Eigen::Tensor<float, 1, 1, long>, 16, Eigen::MakePointer>, Eigen::GpuDevice>::operator=<Eigen::TensorCwiseBinaryOp<Eigen::internal::scalar_max_op<float const, float const>, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const, Eigen::TensorCwiseNullaryOp<Eigen::internal::scalar_constant_op, Eigen::TensorMap<Eigen::Tensor<float const, 1, 1, long>, 16, Eigen::MakePointer> const> const> > (
this=0x7ff35b7fcfe0, other=...)
at external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorDevice.h:35
#6 0x00007ff42ad36cb0 in tensorflow::functor::Relu<Eigen::GpuDevice, float>::operator() (
this=0x7ff35b7fd04f, d=..., features=..., activations=...)
at ./tensorflow/core/kernels/relu_op_functor.h:35
#7 0x00007ff42acf48c1 in tensorflow::ReluOp<Eigen::GpuDevice, float>::Operate (this=0x59fde00,
context=0x7ff35b7fd8f0, input=..., output=0x7ff23ce68390)
at ./tensorflow/core/kernels/relu_op.h:40
#8 0x00007ff42acee479 in tensorflow::UnaryElementWiseOp<float, tensorflow::ReluOp<Eigen::GpuDevice, float> >::Compute (this=0x59fde00, context=0x7ff35b7fd8f0)
at ./tensorflow/core/framework/numeric_op.h:62
#9 0x00007ff42c18523c in tensorflow::BaseGPUDevice::Compute (this=0x5964560, op_kernel=0x59fde00,
context=0x7ff35b7fd8f0) at tensorflow/core/common_runtime/gpu/gpu_device.cc:385
#10 0x00007ff42c402c39 in tensorflow::(anonymous namespace)::ExecutorState::Process (
this=0x3d553c80, tagged_node=..., scheduled_usec=1481634999542636)
at tensorflow/core/common_runtime/executor.cc:1364
#11 0x00007ff42c404881 in tensorflow::(anonymous namespace)::ExecutorState::__lambda4::operator() (
__closure=0x3d557a70) at tensorflow/core/common_runtime/executor.cc:1746
#12 0x00007ff42c40ada4 in std::_Function_handler<void(), tensorflow::(anonymous namespace)::ExecutorState::ScheduleReady(const TaggedNodeSeq&, tensorflow::(anonymous namespace)::ExecutorState::TaggedNodeReadyQueue*)::__lambda4>::_M_invoke(const std::_Any_data &) (__functor=...)
at /usr/include/c++/4.8/functional:2071
#13 0x00007ff428bcc7a8 in std::function<void ()>::operator()() const (this=0x3d557ab0)
at /usr/include/c++/4.8/functional:2471
#14 0x00007ff42c6f0b80 in tensorflow::thread::EigenEnvironment::ExecuteTask (this=0x59921a8, t=...)
at tensorflow/core/lib/core/threadpool.cc:82
#15 0x00007ff42c6f29fb in Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop (this=0x59921a0, thread_id=7)
at external/eigen_archive/unsupported/Eigen/CXX11/src/ThreadPool/NonBlockingThreadPool.h:167
#16 0x00007ff42c6f1508 in Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::NonBlockingThreadPoolTempl(int, tensorflow::thread::EigenEnvironment)::{lambda()#1}::operator()() const
() at external/eigen_archive/unsupported/Eigen/CXX11/src/ThreadPool/NonBlockingThreadPool.h:58
#17 0x00007ff42c6f3927 in std::_Function_handler<void (), Eigen::NonBlockingThreadPoolTempl<tensorflow::thread::EigenEnvironment>::NonBlockingThreadPoolTempl(int, tensorflow::thread::EigenEnvironment)::{lambda()#1}>::_M_invoke(std::_Any_data const&) (__functor=...) at /usr/include/c++/4.8/functional:2071
#18 0x00007ff428bcc7a8 in std::function<void ()>::operator()() const (this=0x59c3bf0)

I haven't tried anything yet, since I have no idea what causes this error. I tried another architecture and it worked; the difference is that the failing architecture has a depthwise layer while the working one does not. That doesn't seem related to an error in the ReLU layer, so I'm at a loss. Both architectures work in non-debug mode.
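For anyone trying to reproduce this, a minimal sketch of the kind of graph involved (a depthwise convolution followed by the ReLU that shows up in the backtrace) could look like the following; the shapes, initializer, and names are made up for illustration and are not from my actual model:

import numpy as np
import tensorflow as tf

# Arbitrary NHWC input; the real model is larger.
x = tf.placeholder(tf.float32, [1, 32, 32, 8])
w = tf.Variable(tf.truncated_normal([3, 3, 8, 1], stddev=0.1))
y = tf.nn.depthwise_conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
y = tf.nn.relu(y)  # the op that appears in the backtrace

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(y, feed_dict={x: np.random.rand(1, 32, 32, 8).astype(np.float32)})

Running something like this against the debug build should exercise the same ReLU code path seen in the stack trace above.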

@andydavis1
Contributor

Just adding some information (we don't currently have a solution or cycles to look at this): we have seen issues like this in the past where it was either an ODR violation or a bug in nvcc.

@aselle
Contributor

aselle commented Dec 29, 2016

@zheng-xq, do you have any other thoughts? If we can't reproduce this, I will need to close it as unreproducible. Sorry :(.

@aselle aselle added type:bug Bug stat:awaiting tensorflower Status - Awaiting response from tensorflower labels Dec 29, 2016
@endernewton

@kalkaneus Did you resolve the problem? I have a similar issue here.

@aseuteurideu
Author

@endernewton unfortunately not :(

@endernewton

Oh, I actually got it working here. The reason is that I am compiling on my CentOS server without sudo access, so I have to use a custom gcc, and I hadn't set up Bazel to use it correctly. Once it was compiled with the right gcc, the problem was gone.
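For reference, one way to point the TensorFlow build at a specific gcc (the path below is only a placeholder for whatever toolchain your server provides; adjust as needed) is to export it before running ./configure, for example:

export CC=/path/to/custom/gcc
export GCC_HOST_COMPILER_PATH=/path/to/custom/gcc
./configure
bazel build -c opt --config cuda //tensorflow/tools/pip_package:build_pip_package

GCC_HOST_COMPILER_PATH is the variable ./configure uses to tell nvcc which host compiler to invoke, so it is worth double-checking when the system gcc is not the one you intend to use.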

@aselle aselle removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Feb 27, 2017
@tatatodd
Contributor

tatatodd commented Mar 1, 2017

@kalkaneus A few suggestions:

Perhaps you can try TensorFlow 1.0? Or take @endernewton's advice and make sure your build environment is set up to use the right gcc.

Note that the build command you listed has both -c opt and -c dbg. That is an uncommon usage:
bazel build -c opt --config cuda -c dbg --strip=never //tensorflow/tools/pip_package:build_pip_package

If you just want symbols in your binary, you should drop the -c dbg:
bazel build -c opt --config cuda --strip=never //tensorflow/tools/pip_package:build_pip_package

Let us know whether that helps!

@tatatodd tatatodd added the stat:awaiting response Status - Awaiting response from author label Mar 1, 2017
@aseuteurideu
Author

@tatatodd @endernewton thank you for your suggestions. I'm busy with something else right now, so I'll try this on my machine when I have extra time and let you know the result afterwards.

And I do want the dbg option, since I want to debug this thoroughly. Without the dbg flag, everything has worked well so far.
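If mixing the two modes is the issue, one thing I can try is a plain debug build that drops -c opt entirely (same target as before):

bazel build -c dbg --config cuda --strip=never //tensorflow/tools/pip_package:build_pip_package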

@aselle aselle removed the stat:awaiting response Status - Awaiting response from author label Mar 2, 2017
@tatatodd
Contributor

tatatodd commented Mar 3, 2017

In that case I'll close this issue out. @kalkaneus feel free to file a new issue if you encounter new problems. Thanks!

@tatatodd tatatodd closed this as completed Mar 3, 2017