
CuDNNError: CUDNN_STATUS_MAPPING_ERROR #3060

Closed
machanic opened this issue Jul 26, 2017 · 12 comments
Labels
stale Not updated for a longer period of time.

Comments

@machanic

When I am training my chainercv model, it reports:

Traceback (most recent call last):

  File "chainercv/trainer/train.py", line 191, in <module>
    main()
  File "chainercv/trainer/train.py", line 187, in main
    trainer.run()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/trainer.py", line 296, in run
    update()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/updater.py", line 177, in update
    self.update_core()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/updater.py", line 306, in update_core
    loss = loss_func(*in_arrays)
  File "/home/machen/face_expr/chainercv/links/model/faster_rcnn/faster_rcnn_train_chain.py", line 77, in __call__
    features = self.faster_rcnn.extractor(imgs)
  File "/home/machen/face_expr/chainercv/links/model/faster_rcnn/faster_rcnn_vgg.py", line 288, in __call__
    h = func(h)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/links/connection/convolution_2d.py", line 154, in __call__
    x, self.W, self.b, self.stride, self.pad)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 439, in convolution_2d
    return func(x, W, b)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/function.py", line 200, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/function.py", line 327, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 151, in forward_gpu
    y_desc.value, y.data.ptr)
  File "cupy/cuda/cudnn.pyx", line 692, in cupy.cuda.cudnn.convolutionForward (cupy/cuda/cudnn.cpp:7119)
  File "cupy/cuda/cudnn.pyx", line 705, in cupy.cuda.cudnn.convolutionForward (cupy/cuda/cudnn.cpp:6931)
  File "cupy/cuda/cudnn.pyx", line 417, in cupy.cuda.cudnn.check_status (cupy/cuda/cudnn.cpp:2059)
cupy.cuda.cudnn.CuDNNError: CUDNN_STATUS_MAPPING_ERROR: b'CUDNN_STATUS_MAPPING_ERROR'
@machanic
Author

machanic commented Jul 26, 2017

When I use cuDNN v5.1 and CUDA 8.0, I encounter this problem:

Traceback (most recent call last):
  File "chainercv/trainer/train.py", line 192, in <module>
    main()
  File "chainercv/trainer/train.py", line 188, in main
    trainer.run()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/trainer.py", line 296, in run
    update()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/updater.py", line 177, in update
    self.update_core()
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/training/updater.py", line 306, in update_core
    loss = loss_func(*in_arrays)
  File "/home/machen/face_expr/chainercv/links/model/faster_rcnn/faster_rcnn_train_chain.py", line 77, in __call__
    features = self.faster_rcnn.extractor(imgs)
  File "/home/machen/face_expr/chainercv/links/model/faster_rcnn/faster_rcnn_vgg.py", line 288, in __call__
    h = func(h)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/links/connection/convolution_2d.py", line 154, in __call__
    x, self.W, self.b, self.stride, self.pad)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 439, in convolution_2d
    return func(x, W, b)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/function.py", line 200, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/function.py", line 327, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/anaconda3/lib/python3.6/site-packages/chainer/functions/connection/convolution_2d.py", line 151, in forward_gpu
    y_desc.value, y.data.ptr)
  File "cupy/cuda/cudnn.pyx", line 692, in cupy.cuda.cudnn.convolutionForward
  File "cupy/cuda/cudnn.pyx", line 705, in cupy.cuda.cudnn.convolutionForward
  File "cupy/cuda/cudnn.pyx", line 417, in cupy.cuda.cudnn.check_status
cupy.cuda.cudnn.CuDNNError: CUDNN_STATUS_MAPPING_ERROR: b'CUDNN_STATUS_MAPPING_ERROR'

@machanic
Author

Help me please

@machanic
Author

>>> import cupy
>>> cupy.__version__
'1.0.1'
>>> import chainer
>>> chainer.__version__
'2.0.1'

@machanic
Author

CUDNN_STATUS_MAPPING_ERROR:
An access to GPU memory space failed, which is usually caused by a failure to bind a texture.
To correct: prior to the function call, unbind any previously bound textures.
Otherwise, this may indicate an internal error/bug in the library.

@machanic
Author

I finally figured out what the problem is. I modified chainercv's last classification layer to num_class < 20 to fit my needs.
But then I loaded the optimizer snapshot that I had trained earlier with the old num_class = 20.
I hit the problem, which reports:

Traceback (most recent call last):
  File "elementwise_sample.py", line 61, in <module>
    )(x, y)
  File "cupy\core\elementwise.pxi", line 508, in cupy.core.core.ElementwiseKernel.__call__ (cupy\core\core.cpp:34118)
  File "cupy\core\elementwise.pxi", line 334, in cupy.core.core._broadcast (cupy\core\core.cpp:31734)
  File "cupy\core\core.pyx", line 1504, in cupy.core.core.broadcast.__init__ (cupy\core\core.cpp:50697)
ValueError: Broadcasting failed

What does chainer.serializers.load_npz('./snapshot_optimizer.npz', optimizer) actually load?
What is stored inside ./snapshot_optimizer.npz?
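
For reference, a snapshot written by chainer.serializers.save_npz is an ordinary NumPy .npz archive, so its contents can be inspected directly. A minimal sketch, assuming the snapshot path from the comment above:

import numpy as np

# A snapshot saved with chainer.serializers.save_npz is a plain .npz archive:
# each optimizer hyperparameter and per-parameter state slot (e.g. momentum
# buffers) is stored under a slash-separated, path-like key. Listing the keys
# shows what load_npz will try to restore into the optimizer.
with np.load('./snapshot_optimizer.npz') as f:
    for key in sorted(f.keys()):
        print(key, f[key].shape, f[key].dtype)

If the stored shapes for the last layer no longer match the rebuilt model, restoring that state can lead to errors like the "Broadcasting failed" above.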

@niboshi
Member

niboshi commented Jul 27, 2017

Hi, could you provide (minimal) code to reproduce the problem?

@machanic
Author

I can't provide one, but I have found two situations that will cause this problem.

  1. You have already modified the last layer (e.g. the number of classification output classes changed), but you load chainer.serializers.load_npz('./snapshot_optimizer.npz', optimizer) and then use 2 GPUs to train (see the sketch below).
  2. You use 2 GPUs to train with ParallelUpdater, but you then specify model.to_gpu(0) to move the model to the main GPU.
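
A minimal sketch of situation 1, with a hypothetical network and class counts (not the actual training code):

import chainer
import chainer.functions as F
import chainer.links as L
from chainer import optimizers, serializers

# Hypothetical two-layer classifier; assume the snapshot below was saved
# while the last layer still had 20 output classes.
class Net(chainer.Chain):
    def __init__(self, n_class):
        super(Net, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(32, 100)
            self.l2 = L.Linear(100, n_class)

    def __call__(self, x):
        return self.l2(F.relu(self.l1(x)))

model = Net(n_class=7)                # last layer rebuilt with fewer classes
optimizer = optimizers.MomentumSGD()
optimizer.setup(model)

# Restoring optimizer state that was saved with the old 20-class layer:
# the stored per-parameter slots no longer match the new parameter shapes,
# and the mismatch can surface later as "Broadcasting failed" or as GPU
# memory errors such as CUDNN_STATUS_MAPPING_ERROR during training.
serializers.load_npz('./snapshot_optimizer.npz', optimizer)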

@msakai
Contributor

msakai commented Sep 4, 2017

I also encountered the same error. Here's a partial stack trace:

  File "/usr/local/lib/python3.5/dist-packages/chainer/links/connection/dilated_convolution_2d.py", line 133, in __call__
    x, self.W, self.b, self.stride, self.pad, self.dilate)
  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/connection/dilated_convolution_2d.py", line 392, in dilated_convolution_2d
    return func(x, W, b)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 200, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 327, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/connection/dilated_convolution_2d.py", line 153, in forward_gpu
    workspace_size, one.data, y_desc.value, y.data.ptr)
  File "cupy/cuda/cudnn.pyx", line 692, in cupy.cuda.cudnn.convolutionForward (cupy/cuda/cudnn.cpp:7119)
  File "cupy/cuda/cudnn.pyx", line 705, in cupy.cuda.cudnn.convolutionForward (cupy/cuda/cudnn.cpp:6931)
  File "cupy/cuda/cudnn.pyx", line 417, in cupy.cuda.cudnn.check_status (cupy/cuda/cudnn.cpp:2059)
cupy.cuda.cudnn.CuDNNError: CUDNN_STATUS_MAPPING_ERROR: b'CUDNN_STATUS_MAPPING_ERROR'

I'm using Python 3.5.2, chainer-2.0.2, cupy-1.0.1, cudnn-5.1 and cuda-8.0.

I'm NOT using chainercv, load_npz, nor multiple GPUs.

@msakai
Contributor

msakai commented Sep 4, 2017

I repeated the same experiment multiple times, and I also got a CUDA_ERROR_ILLEGAL_ADDRESS error:

  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/activation/tanh.py", line 86, in tanh
    return Tanh()(x)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 200, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 327, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/activation/tanh.py", line 37, in forward_gpu
    cuda.cupy.tanh(x[0], out=y)
  File "/usr/local/lib/python3.5/dist-packages/cupy/core/fusion.py", line 705, in __call__
    return self._cupy_op(*args, **kwargs)
  File "cupy/core/elementwise.pxi", line 780, in cupy.core.core.ufunc.__call__ (cupy/core/core.cpp:48290)
  File "cupy/cuda/function.pyx", line 143, in cupy.cuda.function.Function.linear_launch (cupy/cuda/function.cpp:4267)
  File "cupy/cuda/function.pyx", line 111, in cupy.cuda.function._launch (cupy/cuda/function.cpp:3653)
  File "cupy/cuda/driver.pyx", line 127, in cupy.cuda.driver.launchKernel (cupy/cuda/driver.cpp:2547)
  File "cupy/cuda/driver.pyx", line 62, in cupy.cuda.driver.check_status (cupy/cuda/driver.cpp:1452)
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
  File "/usr/local/lib/python3.5/dist-packages/chainer/links/connection/dilated_convolution_2d.py", line 133, in __call__
    x, self.W, self.b, self.stride, self.pad, self.dilate)
  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/connection/dilated_convolution_2d.py", line 392, in dilated_convolution_2d
    return func(x, W, b)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 200, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python3.5/dist-packages/chainer/function.py", line 327, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/lib/python3.5/dist-packages/chainer/functions/connection/dilated_convolution_2d.py", line 128, in forward_gpu
    W[:, :, j:j + 1, i:i + 1])
  File "/usr/local/lib/python3.5/dist-packages/cupy/creation/from_data.py", line 77, in ascontiguousarray
    return core.ascontiguousarray(a, dtype)
  File "cupy/core/core.pyx", line 1825, in cupy.core.core.ascontiguousarray (cupy/core/core.cpp:59041)
  File "cupy/core/core.pyx", line 1836, in cupy.core.core.ascontiguousarray (cupy/core/core.cpp:58923)
  File "cupy/core/core.pyx", line 1593, in cupy.core.core.elementwise_copy (cupy/core/core.cpp:57853)
  File "cupy/core/elementwise.pxi", line 780, in cupy.core.core.ufunc.__call__ (cupy/core/core.cpp:48290)
  File "cupy/cuda/function.pyx", line 143, in cupy.cuda.function.Function.linear_launch (cupy/cuda/function.cpp:4267)
  File "cupy/cuda/function.pyx", line 111, in cupy.cuda.function._launch (cupy/cuda/function.cpp:3653)
  File "cupy/cuda/driver.pyx", line 127, in cupy.cuda.driver.launchKernel (cupy/cuda/driver.cpp:2547)
  File "cupy/cuda/driver.pyx", line 62, in cupy.cuda.driver.check_status (cupy/cuda/driver.cpp:1452)
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered

@msakai
Contributor

msakai commented Sep 6, 2017

It turns out that my cases were caused by passing out-of-bounds values to the EmbedID link.
So it might be different from the original problem reported by sharpstill.

@sharpstill
Running the code with the environment variables CHAINER_DEBUG=1 and CUDA_LAUNCH_BLOCKING=1, and/or under the cuda-memcheck command, may help in debugging the problem.
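
A minimal sketch of that failure mode, with hypothetical sizes:

import numpy as np
import chainer.links as L

# EmbedID(in_size=10, ...) only accepts IDs 0..9. On the CPU an out-of-range
# ID typically raises an IndexError, but on the GPU the lookup reads outside
# the embedding matrix, and the corruption often surfaces later as
# CUDNN_STATUS_MAPPING_ERROR or CUDA_ERROR_ILLEGAL_ADDRESS in an unrelated call.
embed = L.EmbedID(in_size=10, out_size=8)
ids = np.array([3, 12], dtype=np.int32)   # 12 is out of bounds
y = embed(ids)

Running with CHAINER_DEBUG=1, CUDA_LAUNCH_BLOCKING=1, or under cuda-memcheck, as suggested above, makes the fault show up at the actual call site instead of a later, unrelated one.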

@stale

stale bot commented Dec 5, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs. Thank you for your contributions.

stale bot added the 'stale' label on Dec 5, 2017
@stale

stale bot commented Jan 4, 2018

This issue is closed as announced. Feel free to re-open it if needed.

stale bot closed this as completed on Jan 4, 2018