segmentation fault #2034
Hi, What is the hardware configuration of your machine and what versions of the NVIDIA software do you have installed? Thanks, |
I got the same problem with:
    sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl
CUDA: /usr/local/cuda-7.5/
If I import numpy or matplotlib before tensorflow, it won't crash. If I import tensorflow at the very beginning, it segfaults. I guess your "engineers" simply forgot to import some libs in this tensorflow release 😆
    ~/>python3.4
    Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
    [GCC 5.2.1 20151010] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tensorflow
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
    Segmentation fault (core dumped)
    ~/>python3.4
    Python 3.4.3+ (default, Oct 14 2015, 16:03:50)
    [GCC 5.2.1 20151010] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import numpy
    >>> import tensorflow
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
    I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
    >>> #Everything good
|
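For anyone who wants to compare import orders without crashing their own interpreter, here is a minimal sketch. It assumes Python 3.5+ (for `subprocess.run`), and the helper name `try_import_order` is made up for illustration. Each import order runs in a fresh child process, so a segfault in the child only shows up as a return code in the parent.

```python
import subprocess
import sys


def try_import_order(modules, python=sys.executable):
    """Import `modules` in order in a fresh interpreter and return its exit code.

    On POSIX, a negative return code means the child was killed by a signal
    (for example -11 for SIGSEGV), so the calling process survives the crash.
    """
    code = "; ".join("import " + name for name in modules)
    result = subprocess.run(
        [python, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    # The two orders reported in this thread: tensorflow first, then numpy first.
    for order in (["tensorflow"], ["numpy", "tensorflow"]):
        print(order, "->", try_import_order(order))
```

A return code of 0 means the imports succeeded, 1 usually means an ordinary Python exception such as a missing module, and a negative value on Linux points to a signal such as the segfault described above.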
Yes I've got the same problem, and as @zhang8473 said, if I import numpy and matplotlib first, it won't crash. Python 2.7.6 (default, Jun 22 2015, 17:58:13)
|
same problem here |
I've got the same issue with the CPU version: All I have is imports in my test script, and I get the segfault:
No errors if I skip import of tensorflow, or if I put it as the final import instead of the first. I ran it through gdb and got the following backtrace:
The issue seems very similar to one that used to exist in numpy |
I'm looking into this further and there seems to be a known issue related to this somehow in the tf_session_helper.h
Note the error check: Numpy cannot be included before tf_session_helper.h. Could be that we need something similar for |
Oh wait.. the problem appears if we put TensorFlow before numpy... maybe not it |
The error sounds like "import_array" is not getting run. That's a function that sets up some global state and must be run before using numpy C API. We used to have the following in tf_session.i
It looks like it has since been enhanced with some logic which I don't fully understand. @girving -- do you see any scenarios where "import_array()" isn't going to run? |
@tmsimont: I'm confused by your stacktrace. Why is scipy involved? Is it possible to get a reproduction case that doesn't involve scipy being the culprit? |
Ah, looks like there are a few tests that do touch scipy, including one broken one that requires it or fails. I will get the culprit to fix that one, but it's unrelated to this issue. |
It seems that in my case it wasn't just numpy... This works:
This fails
However, this does not fail:
So it seems there is something in This does not fail
Also -- FYI, I'm using scipy here as this comes from the Mandelbrot example. After running some more tests and comparing my issue to the OP, I'm afraid I may have a different problem than what others in this thread have encountered. @mouendless, @chengdianxuezi and @zhang8473 all seem to have a problem with just the single |
@vrv or @martinwicke: Do you know what version of numpy we're building against? If these versions don't match, numpy might decide to crash. |
@caisq for pip build info |
@girving I am using |
I noticed something potentially relevant here:
There's a warning about the numpy include using deprecated NumPy API. I tried to use
Is TensorFlow trying to use two different versions of the NumPy API? |
It does look like an old numpy is creeping in from somewhere.
|
@tmsimont: Can you remove |
I'm fixing |
With |
That's "tensorflow::ConvertTensorToNdarray(tensorflow::Tensor const&, _object**)" (demangled with c++filt). So the library is being included through "py_func_lib". Could you remove the "py_func_lib" dependency here? "https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/BUILD#L1010" |
#2114 fixes the |
@caisq: Do you know what numpy version we're building against for the pip packages? |
I faced the same segmentation fault after compiling the 0.8 GPU version of TensorFlow. The workaround mentioned by @zhang8473 works perfectly, i.e. import numpy, then import tensorflow. In addition to that workaround, I found something interesting. When building from source, if I skip the first step and go directly to the build with the GPU option, I get a segmentation fault.
    $ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
    $======To build with GPU support:=======
    $ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    $ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-linux_x86_64.whl
On the other hand, if I run all of the above commands in order, I do not get any segmentation fault.
O/P if we skip the first step:
Pardon my ignorance, but can somebody tell me the correct way to compile the 0.8 GPU version? |
@kanwar2preet: Is it possible to get a stack trace from that crash by running Python inside gdb? Also, unfortunately I didn't quite follow the set of commands that work and do not work from the above description. Can you say which commands you mean again? |
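Where running under gdb is not convenient, the standard-library `faulthandler` module (Python 3.3+; Python 2.7 would need the separate backport package) can at least record the Python-level traceback at the moment of the crash. The sketch below is not what anyone in this thread used, and the log path is a made-up example. It only shows which Python line triggered the fault, not the C stack, so it complements a gdb backtrace rather than replacing it.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path would do.
LOG_PATH = os.path.join(tempfile.gettempdir(), "tf_import_crash.log")

# The file object must stay open for as long as the handler is installed,
# because faulthandler writes to its file descriptor from the signal handler.
_crash_log = open(LOG_PATH, "w")

# Install handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL before
# importing the modules under suspicion.
faulthandler.enable(file=_crash_log, all_threads=True)

# The imports being investigated would follow here, for example:
# import tensorflow
```

The same handler can be switched on without editing any script by running `python3 -X faulthandler script.py`, in which case the traceback goes to stderr.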
@girving I posted a copy of my GDB run in #2129. Unfortunately, without debug symbols it's not much help. I did try compiling TF with "-g" on (manually removed the -g0 in the BUILD file), but the debug symbols still didn't seem to be there, so I didn't bother. Here are some outputs of some strace runs though.
|
In those outputs, my python installation is |
@girving @vrv The numpy version we use for current nightly builds and 0.8 release builds are: 1.8.2 is the version that comes with apt-get on ubuntu:14.04. See: |
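To compare a local environment against the 1.8.2 mentioned above, a small sketch like the following lists the installed versions of the packages discussed in this thread. It assumes Python 3.8+ (for `importlib.metadata`), and `installed_version` is an illustrative helper rather than anything from TensorFlow. It reads package metadata only, so it does not import tensorflow and cannot trigger the crash.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages named in this thread; None means the package is not installed.
    for name in ("numpy", "scipy", "matplotlib", "tensorflow"):
        print(name, installed_version(name))
```

On older interpreters, `import numpy; print(numpy.__version__)` gives the same information for numpy alone.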
@gladys0313 I'm not sure if it is because of readline. But when I reinstalled readline-6.0.0 and recompiled Python against it, the error was gone. I didn't do anything else, so I suspect it is related to readline. In my case, the error is when I run |
@zszhong Okay I see..I will try later, I was just wondering how you realized that you should reinstall readline? For me it seems a very random discovery... |
@gladys0313, I can't remember the details. Probably I got some hints after some googling, tried it, and it worked. |
Confirming that importing numpy before tensorflow fixes it. (ok, it's more of a workaround than a fix) For me it started seg faulting after I installed scipy in an otherwise clean virtualenv on Ubuntu 15.04. |
In my case, |
Probably because that updated scipy or numpy as a side effect.
|
A quick-fix for this might be to install version 0.7:
or any other version found here: https://storage.googleapis.com/tensorflow/ This worked for me, that is on 0.8.0 |
For anyone who finds this thread: TensorFlow 0.9 has the workaround. |
Actually, I have the same issue in Tensorflow 0.9 👎 |
@amineHorseman Are you sure you have the latest version ( |
@MartinThoma yes, I'm sure about the version. |
@amineHorseman If importing numpy doesn't fix it, please file a different issue. It is probably unrelated, unless you have reason to believe otherwise. |
Is there any update now? I found there was no such problem in pyCharm IDE (with sklearn installed)
But the same code doesn't work in ipython, and I don't know why |
For everyone using RHEL/CentOS 6.x, you might have to build TensorFlow from source by installing Bazel first.
TensorFlow (Installation on RHEL/CentOS 6.x):
1) Enable Repos for Installing python2.7
CentOS 6
Install Software Collections in Scientific Linux 6
Set
2) Install python2.7
3) Install Developer Toolset
4) Build and Install Bazel
Enter Software Collection environment with Developer Toolset.
select version
compile
exit from Software Collection environment
5) Build TensorFlow with Bazel
Enter Build Environment
select version
6) Since the GNU C library version in CentOS 6 is less than 2.17, a slight modification needs to be applied before compilation.
Modify the tf_extension_linkopts function in tensorflow/tensorflow.bzl FROM
TO
7) Build
configure workspace, just leave everything as default
exit from Software Collection environment
8) Install tensorflow via pip_package
9) Test tensorflow
Hope this helps!!! |
I had a similar problem. And I've solved that problem by upgrading matplotlib.
|
How can I solve a similar problem, shown below? I tried upgrading matplotlib and so on, but that failed.
|
…upstream-py-dev Add some extra pkgs to tensorflow-build so it builds python properly
I installed tensorflow using: sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl
The Linux version is: Linux version 3.10.0-229.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Fri Mar 6 11:36:42 UTC 2015
When I run python and import tensorflow, I get a segmentation fault.