
CUDA_ERROR_MISALIGNED_ADDRESS on MNIST example #2810

Closed
johnfrombluff opened this issue Jun 11, 2016 · 21 comments

@johnfrombluff

johnfrombluff commented Jun 11, 2016

Summary

What might be causing this error when running python tensorflow/models/image/mnist/convolutional.py?

E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS

Environment info

Operating System:
Linux Lounge 4.5.6-200.fc23.x86_64 #1 SMP Wed Jun 1 21:28:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Installed version of CUDA and cuDNN:
(please attach the output of ls -l /path/to/cuda/lib/libcud*):
ls -l /usr/local/cuda-7.5/lib64/libcud*
-rw-r--r--. 1 root root 322936 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudadevrt.a
lrwxrwxrwx. 1 root root 16 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so -> libcudart.so.7.5
lrwxrwxrwx. 1 root root 19 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so.7.5 -> libcudart.so.7.5.18
-rwxr-xr-x. 1 root root 383336 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart.so.7.5.18
-rw-r--r--. 1 root root 720192 Aug 16 2015 /usr/local/cuda-7.5/lib64/libcudart_static.a
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.4
-rwxr-xr-x. 1 root root 61453024 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.4.0.7
-rwxr-xr-x. 1 root root 59909104 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.5
-rwxr-xr-x. 1 root root 59909104 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn.so.5.0.5
-rw-r--r--. 1 root root 62025862 Jun 11 12:35 /usr/local/cuda-7.5/lib64/libcudnn_static.a

If installed from binary pip package, provide:

1. Which pip package you installed.

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL

2. The output from python -c "import tensorflow; print(tensorflow.__version__)".
python -c "import tensorflow; print(tensorflow.__version__)"
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally

If installed from sources, provide the commit hash:

Steps to reproduce

1. Run python tensorflow/models/image/mnist/convolutional.py
2. Observe the error CUDA_ERROR_MISALIGNED_ADDRESS
3. Scratch head

What have you tried?

  1. Searching the internet for clues; none found

Logs or other output that would be helpful

(If logs are large, please upload as attachment).
Results of cuda-memcheck and dmesg
error.txt

@zheng-xq
Contributor

Swapping my action with Benoit. All the misaligned memory reads came from Eigen kernels in the attached error messages.

@johnfrombluff
Author

Thanks for your help, but I'm still lost. I'm a user, not a programmer. I'm a statistician. So I'm not sure what to do to fix this problem. Does "All the misaligned memory reads came from Eigen kernels" mean I've done something wrong? If so, what have I done wrong? If not, then what can I do to get tensorflow working on my machine?

@bryonglodencissp
Contributor

@johnfrombluff Greetings. I don't know why the misaligned memory reads came from Eigen kernels.

Do you have a basic block size such as 256, 512, or 1024?

@johnfrombluff
Author

Sorry, I'm still confused. What block size are you referring to? Which file(s) should I look at to find what you're talking about?

I'm trying to run example code that comes with the tensorflow distribution. Shouldn't that code run on all supported architectures? Maybe GNU/Linux or my GPU is not supported, but I haven't seen that mentioned in the documentation.

And thank you for your attempt to help me!

@zheng-xq
Contributor

@johnfrombluff, my earlier comment was only meant to point out to my colleague which part of the program is triggering the error. It didn't imply you did something wrong.

Your GPU is a GTX 750 Ti, which is gm107. It is supported in theory, but it is a low-end GPU, so you might not see a very big speedup.

Since we have never seen this problem before and were unable to reproduce it, the only way to root-cause it is to ask you to run experiments. However, some of the steps are not easy for users who are not familiar with GPU programming.

Alternatively, you can try a different GPU. Both the Titan X and the GTX 1080 are very popular choices. If you still see the same problem with the latest Cuda driver, Cuda SDK, and a more recent GPU, we would definitely like to investigate.

@tsitsilin

@zheng-xq, I have a similar setup on Ubuntu 14.04.4 LTS: Cuda v7.5, Cudnn v4, /gpu/tensorflow-0.9.0rc0-cp27-none-linux_x86_64.whl with a GTX 750 Ti, and I see the same issue on the mnist example:
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS.

I went one step further and tried other TF examples, such as alexnet, imagenet, and cifar10_multi_gpu_train. They seem to run OK (see the attached log), so the problem appears to be in the code of convolutional.py.

CUDA_ERROR_MISALIGNED_ADDRESS.txt

@acowlikeobject

@zheng-xq I see the same error when running the MNIST test.
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS

Also on Ubuntu 14.04, Cuda v7.5, Cudnn v4, using nvidia-docker with this image.

This is on a GTX 960M (I use it for sanity checks before spinning up servers).

I'm calling this via the Keras MNIST example. The same example works fine with the Theano backend (switched via the Keras configuration), as sketched below.
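For anyone who wants to reproduce the comparison, a minimal sketch of the backend switch (mnist_cnn.py is just a placeholder for whichever Keras example you run):

# Run the example once against the Theano backend instead of TensorFlow.
# KERAS_BACKEND overrides the "backend" field in ~/.keras/keras.json for this run only.
KERAS_BACKEND=theano python mnist_cnn.py
# To switch permanently, edit ~/.keras/keras.json and set "backend": "theano".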

cuda-memcheck.txt
environment.txt

@kalleknast

I get a similar error from running tf.nn.softmax().
I can give more details if this seems relevant.

To reproduce:

import tensorflow as tf

logits = tf.random_normal((10, 2))
y = tf.nn.softmax(logits)

with tf.Session() as sess:
    print(y.eval())

Result:

I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: Quadro K2200
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:03:00.0
Total memory: 3.99GiB
Free memory: 3.20GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Quadro K2200, pci bus id: 0000:03:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1140] could not synchronize on CUDA context: CUDA_ERROR_MISALIGNED_ADDRESS :: No stack trace available
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
F tensorflow/core/common_runtime/gpu/gpu_util.cc:370] GPU sync failed
Aborted (core dumped)

@prb12 prb12 added bug labels Jun 20, 2016
@dzupin

dzupin commented Jun 28, 2016

Same error when trying to run the basic MNIST example.
Linux Y15MATE 4.4.0-24-generic #43-Ubuntu SMP Wed Jun 8 19:27:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

OUTPUT:
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce 840M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:06:00.0
Total memory: 2.00GiB
Free memory: 1.84GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:06:00.0)
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1

Process finished with exit code 134

@floringogianu

floringogianu commented Jun 29, 2016

I tried running the mnist example after installing TensorFlow in a virtualenv and got the same error. Ubuntu 16, gcc 5.3.1, python 3.5.1, Driver Version: 361.42, cuda 7.5, this time with a GTX 960 with 4GiB, which should be more than enough for this network model:

python -m tensorflow.models.image.mnist.convolutional

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:924] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:01:00.0
Total memory: 4.00GiB
Free memory: 3.33GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:806] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0)
Initialized!
E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_MISALIGNED_ADDRESS
F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:198] Unexpected Event status: 1
[1]    25066 abort (core dumped)  python -m tensorflow.models.image.mnist.convolutional

edit: Running the cifar10 model seems to work just fine...

@MartianWearables

I've run into exactly the same problem as described by floringogianu, except with Ubuntu 16.04 and gcc 4.9. Also, I used the --override flag when installing the CUDA toolkit via the .run script, which may or may not be relevant. cifar10 runs fine.

@MartianWearables

I've been able to circumvent the issue by following this setup, which includes switching to Python 3.x and the associated binary.

I also noted that the default build does not target NVIDIA compute capability 5.0 (that of the GTX 960M), but it works anyway. I attempted to build from source myself but ran into some linking errors which I don't have time to address at the moment. Others have had trouble building for Ubuntu 16.04, as documented in this thread, with some success. I'd be interested to know if anyone else succeeds at this and whether that issue gets closed.

@Waffleboy

Same problem here. I downgraded to gcc 4.9 to use Theano, and now TensorFlow is broken with CUDA_ERROR_MISALIGNED_ADDRESS (bottleneck generation is much faster, though).

@zheng-xq
Contributor

zheng-xq commented Jul 7, 2016

@johnfrombluff, @tsitsilin, @acowlikeobject, @kalleknast, @dzupin, @floringogianu, @MartianWearables, sorry that we cannot reproduce this problem on our side. I will try to guess where the problem is and see whether it could be fixed.

What is common among the folks who encountered this problem is that all used gm107- or gm108-based GPUs, i.e. compute capability 5.0. The TensorFlow binary by default carries compute capabilities 3.5 and 5.2. The Cuda driver will extract the compute 3.5 PTX and JIT-compile it into compute 5.0 SASS on the first run. Given that the error message is "Invalid local read of size 16", my current guess is that the JIT compiler in the Cuda driver is generating wrong code for tf.nn.softmax on GPUs with compute capability 5.0.
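If you are not sure which compute capability your card has, one quick check is the deviceQuery sample that ships with the Cuda toolkit (the paths below assume a default Cuda 7.5 install; adjust as needed):

# Copy the samples somewhere writable, build deviceQuery, and print the compute capability.
cp -r /usr/local/cuda-7.5/samples ~/cuda-samples
cd ~/cuda-samples/1_Utilities/deviceQuery
make
./deviceQuery | grep "CUDA Capability"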

Here are a number of things to try:

  1. Enable compute capability 5.0 directly when building from source. It is part of the "configure" step. This would generate SASS 5.0 with the static Cuda compiler and bypass the JIT compiler in the Cuda driver (see the sketch below).
  2. Install the latest driver from NVIDIA.

If #1 still fails, we can dump the SASS code from your binary and see what goes wrong.
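A rough sketch of option 1, for reference; the branch name, paths, and bazel invocation below are typical for a 0.9-era source build and may need adjusting to your setup:

# Build a pip package from source with SASS for compute capability 5.0.
git clone -b r0.9 https://github.com/tensorflow/tensorflow.git
cd tensorflow
# When "configure" asks for compute capabilities, answer 5.0; setting
# TF_CUDA_COMPUTE_CAPABILITIES beforehand pre-seeds that answer.
TF_CUDA_COMPUTE_CAPABILITIES=5.0 ./configure
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install --upgrade /tmp/tensorflow_pkg/tensorflow-*.whl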

@zheng-xq
Contributor

From an offline conversation, we can confirm that this problem goes away:

  1. Build from source while explicitly setting 5.0 build target.
  2. Or install the latest graphics driver 367.27.

So it does seem like a JIT compiler issue that goes away with the latest driver.

@MartinThoma

MartinThoma commented Jul 15, 2016

To expand on @zheng-xq's fix:

edit: Updating the driver seems not to be that easy (see ask.SE question). @zheng-xq Could you please add some details on how to build tensorflow with the compute capability set explicitly to 5.0? Is it possible to build tensorflow when CUDA was installed via apt-get (and thus there is no single cuda folder)?

@zheng-xq
Contributor

@MartinThoma, you can set the compute version to 5.0 via "configure".

Please specify a list of comma-separated Cuda compute capabilities you want to
build with. You can find the compute capability of your device at:
https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your
build time and binary size. [Default is: "3.5,5.2"]: 3.5

The same thing applies to Cuda and Cudnn paths.

Please specify the location where CUDA 7.5 toolkit is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda

What does the structure of the Cuda binaries look like when you do apt-get? You can download and install directly from NVIDIA. If that is not possible and the file layout is similar, you can pass that directory to "configure". If the layout is completely different for some reason, I guess you can build another directory of symlinks that mimics the downloaded Cuda directory, roughly as sketched below.

https://developer.nvidia.com/cuda-downloads
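Purely a sketch of the symlink idea; the apt-get paths below are guesses and need to be adjusted to wherever your distribution actually put the files:

# Build a fake CUDA root whose layout matches what "configure" expects.
mkdir -p ~/cuda-fake/bin ~/cuda-fake/lib64 ~/cuda-fake/include
ln -s /usr/bin/nvcc ~/cuda-fake/bin/nvcc
ln -s /usr/lib/x86_64-linux-gnu/libcudart.so ~/cuda-fake/lib64/libcudart.so
ln -s /usr/lib/x86_64-linux-gnu/libcudadevrt.a ~/cuda-fake/lib64/libcudadevrt.a
ln -s /usr/include/cudnn.h ~/cuda-fake/include/cudnn.h
# ...repeat for the remaining libcuda*/libcudnn* files and headers, then point
# "configure" at ~/cuda-fake when it asks for the CUDA toolkit location.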

Hope that helps.

@MartinThoma

What does the structure of the Cuda binaries look like when you do apt-get?

Here are some of the files:

/usr/bin/nvcc
/usr/bin/nvidia-smi
/usr/lib/x86_64-linux-gnu/libcudadevrt.a
/usr/include/cudnn.h

When TensorFlow's configure asked me where the cuda folder is, I pointed it to /usr/lib/x86_64-linux-gnu/. But then it complained that there is no /usr/lib/x86_64-linux-gnu/lib64 (or something similar).

I think I'll just install it manually, because I run into those problems quite regularly.

@zheng-xq
Contributor

If those are not symlinks, I would recommend manually reinstalling it.

@ra9hur

ra9hur commented Jul 20, 2016

I was facing the same problem, getting "CUDA_ERROR_MISALIGNED_ADDRESS" with the MNIST samples. Below are the environment versions.
Ubuntu - 16.04
Driver - 361.45 or 364.19
CUDA - 7.5
CUDNN - 4.0
TF - 0.9
gcc - 4.9

I downgraded TensorFlow to 0.8 and this error no longer shows up. However, I started facing a new problem: TF would just hang, nvidia-smi showed the temperature at 68 C, and the process stopped responding. It was probably a driver issue. I installed all the latest versions (except TF) and it's all fine now.
Ubuntu - 16.04
Driver - 367.35
CUDA - 8.0 RC
CUDNN - 5.0
TF - 0.8
gcc - 5.41 (default that comes with Ubuntu 16.04 install)

@aselle aselle added type:bug Bug and removed bug labels Feb 9, 2017
@zhanglistar

I have the same error:
2017-09-07 11:20:05.454046: E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2017-09-07 11:20:05.454119: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:203] Unexpected Event status: 1
2017-09-07 11:20:05.454124: E tensorflow/stream_executor/cuda/cuda_blas.cc:365] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2017-09-07 11:20:05.454183: W tensorflow/stream_executor/stream.cc:1601] attempting to perform BLAS operation using StreamExecutor without BLAS support
tools/jumbo/bin/python4.8: line 11: 32638 Aborted /opt/compiler/gcc-4.8.2/lib/ld-linux-x86-64.so.2 --library-path $SCRIPTPATH/../lib:/opt/compiler/gcc-4.8.2/lib:$LD_LIBRARY_PATH $SCRIPTPATH/python "$@"

Env:
GPU: Tesla P40
Driver: 375.20
cudnn: 5.1.10
gcc: 4.8.2
os: centos 4.3
