segmentation fault #2034

Closed
chengdianxuezi opened this issue Apr 20, 2016 · 79 comments

@chengdianxuezi

I installed TensorFlow with:

sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl

The Linux version is: Linux version 3.10.0-229.el7.x86_64 (builder@kbuilder.dev.centos.org) (gcc version 4.8.2 20140120 (Red Hat 4.8.2-16) (GCC) ) #1 SMP Fri Mar 6 11:36:42 UTC 2015

While learning Python, I ran import tensorflow and got a segmentation fault:

import tensorflow
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
Segmentation fault (core dumped)

@prb12
Member

prb12 commented Apr 20, 2016

Hi,
Could you please provide a little more information:

What is the hardware configuration of your machine and what versions of the NVIDIA software do you have installed?

Thanks,
Paul

@zhang8473

zhang8473 commented Apr 21, 2016

I got the same problem under:

sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl

Cuda: /usr/local/cuda-7.5/
GPU: NVIDIA Geforce GTX 970M
CUDNN: cudnn-7.0-linux-x64-v4.0-prod.tgz
Linux: 4.2.0-30-generic #35-Ubuntu

If I import numpy or matplotlib before tensorflow, it doesn't crash. If I import tensorflow first, it segfaults. I guess your "engineers" simply forgot to import some libs in this tensorflow release 😆

~/>python3.4
Python 3.4.3+ (default, Oct 14 2015, 16:03:50) 
[GCC 5.2.1 20151010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
Segmentation fault (core dumped)
~/>python3.4
Python 3.4.3+ (default, Oct 14 2015, 16:03:50) 
[GCC 5.2.1 20151010] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
>>> # Everything good
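
The order dependence reported above can be checked without crashing your own session by running each import sequence in a fresh interpreter. This is a minimal sketch, not part of TensorFlow: `run_imports` and `describe` are hypothetical helper names, and it assumes Python 3.5+ on a POSIX system, where a child killed by a signal is reported with a negative return code.

```python
import signal
import subprocess
import sys


def run_imports(modules, python=sys.executable):
    """Import `modules` in order in a fresh interpreter and return its exit code."""
    code = "; ".join("import " + name for name in modules)
    return subprocess.run([python, "-c", code]).returncode


def describe(returncode):
    """Turn a subprocess return code into a short label."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return "segmentation fault"
    return "failed with exit code %d" % returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_imports.py numpy tensorflow
    order = sys.argv[1:]
    print(" -> ".join(order), ":", describe(run_imports(order)))
```

Running it once with `tensorflow` alone and once with `numpy tensorflow` reproduces the two sessions above as a pair of one-line results, which is easier to paste into a bug report.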

@mouendless

Yes, I've got the same problem, and as @zhang8473 said, it doesn't crash if I import numpy and matplotlib first.
I'm using the version without GPU support.

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow
Segmentation fault (core dumped)

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

>>> import numpy
>>> import matplotlib
>>> import tensorflow

@fxia22

fxia22 commented Apr 22, 2016

Same problem here:
CUDA 7.5 with cuDNN v5 (build 5004)
Python 2.7.5 (default, Nov 20 2015, 02:00:19)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux2

@tmsimont

I've got the same issue with the CPU version:
pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

All I have is imports in my test script, and I get the segfault:

# Import libraries for simulation
import tensorflow as tf
import numpy as np
import scipy.ndimage as nd

No errors if I skip import of tensorflow, or if I put it as the final import instead of the first.

I ran it through gdb and got the following backtrace:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff2d0e220 in PyArray_API () from /media/local/usr/lib64/python2.7/site-packages/numpy/core/multiarray.so
(gdb) backtrace
#0  0x00007ffff2d0e220 in PyArray_API () from /media/local/usr/lib64/python2.7/site-packages/numpy/core/multiarray.so
#1  0x00007fffe0ed8e54 in _import_array ()
    at /opt/_internal/cpython-2.7.11-ucs4/lib/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1633
#2  initspecfun () at build/src.linux-x86_64-2.7/scipy/special/specfunmodule.c:5830
#3  0x00007ffff7b2fc0e in _PyImport_LoadDynamicModule () from /usr/lib64/libpython2.7.so.1.0
#4  0x00007ffff7abf06c in ?? () from /usr/lib64/libpython2.7.so.1.0
#5  0x00007ffff7abf2a4 in ?? () from /usr/lib64/libpython2.7.so.1.0
#6  0x00007ffff7b01756 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#7  0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#8  0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#9  0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#10 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#11 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#12 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#13 0x00007ffff7b2efb0 in PyImport_ExecCodeModuleEx () from /usr/lib64/libpython2.7.so.1.0
#14 0x00007ffff7b2f1ca in ?? () from /usr/lib64/libpython2.7.so.1.0
#15 0x00007ffff7b0104f in ?? () from /usr/lib64/libpython2.7.so.1.0
#16 0x00007ffff7b018d4 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#17 0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#18 0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#19 0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#20 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#21 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#22 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#23 0x00007ffff7b2efb0 in PyImport_ExecCodeModuleEx () from /usr/lib64/libpython2.7.so.1.0
#24 0x00007ffff7b2f1ca in ?? () from /usr/lib64/libpython2.7.so.1.0
#25 0x00007ffff7b2f870 in ?? () from /usr/lib64/libpython2.7.so.1.0
#26 0x00007ffff7b0104f in ?? () from /usr/lib64/libpython2.7.so.1.0
#27 0x00007ffff7b01624 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#28 0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#29 0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#30 0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#31 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#32 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#33 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#34 0x00007ffff7b2efb0 in PyImport_ExecCodeModuleEx () from /usr/lib64/libpython2.7.so.1.0
#35 0x00007ffff7b2f1ca in ?? () from /usr/lib64/libpython2.7.so.1.0
#36 0x00007ffff7b2f870 in ?? () from /usr/lib64/libpython2.7.so.1.0
#37 0x00007ffff7b0104f in ?? () from /usr/lib64/libpython2.7.so.1.0
#38 0x00007ffff7b01624 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#39 0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#40 0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#41 0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#42 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#43 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#44 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#45 0x00007ffff7b2efb0 in PyImport_ExecCodeModuleEx () from /usr/lib64/libpython2.7.so.1.0
#46 0x00007ffff7b2f1ca in ?? () from /usr/lib64/libpython2.7.so.1.0
#47 0x00007ffff7b0104f in ?? () from /usr/lib64/libpython2.7.so.1.0
#48 0x00007ffff7b018d4 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#49 0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#50 0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#51 0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#52 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#53 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#54 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#55 0x00007ffff7b2efb0 in PyImport_ExecCodeModuleEx () from /usr/lib64/libpython2.7.so.1.0
#56 0x00007ffff7b2f1ca in ?? () from /usr/lib64/libpython2.7.so.1.0
#57 0x00007ffff7b2f870 in ?? () from /usr/lib64/libpython2.7.so.1.0
#58 0x00007ffff7b0104f in ?? () from /usr/lib64/libpython2.7.so.1.0
#59 0x00007ffff7b01624 in PyImport_ImportModuleLevel () from /usr/lib64/libpython2.7.so.1.0
#60 0x00007ffff7af276b in ?? () from /usr/lib64/libpython2.7.so.1.0
#61 0x00007ffff7ad96f6 in PyObject_Call () from /usr/lib64/libpython2.7.so.1.0
#62 0x00007ffff7af2c80 in PyEval_CallObjectWithKeywords () from /usr/lib64/libpython2.7.so.1.0
#63 0x00007ffff7af48ed in PyEval_EvalFrameEx () from /usr/lib64/libpython2.7.so.1.0
#64 0x00007ffff7afa33e in PyEval_EvalCodeEx () from /usr/lib64/libpython2.7.so.1.0
#65 0x00007ffff7b27142 in PyEval_EvalCode () from /usr/lib64/libpython2.7.so.1.0
#66 0x00007ffff7b338ad in ?? () from /usr/lib64/libpython2.7.so.1.0
#67 0x00007ffff7ac16ad in PyRun_FileExFlags () from /usr/lib64/libpython2.7.so.1.0
#68 0x00007ffff7ac2294 in PyRun_SimpleFileExFlags () from /usr/lib64/libpython2.7.so.1.0
#69 0x00007ffff7ac9e63 in Py_Main () from /usr/lib64/libpython2.7.so.1.0
#70 0x00007ffff7488b05 in __libc_start_main () from /lib64/libc.so.6
#71 0x000000000040078e in _start ()
(gdb)

The issue seems very similar to one that used to exist in numpy
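
gdb shows the C-level stack. On Python 3.3 or later, the standard-library faulthandler module can also dump the Python-level stack at the moment of the crash, which shows which import statement was executing. The following is a minimal sketch under that assumption (the backtrace above comes from Python 2.7, where faulthandler is not in the standard library); `import_with_faulthandler` is a hypothetical helper name.

```python
import subprocess
import sys


def import_with_faulthandler(modules, python=sys.executable):
    """Import `modules` in a child interpreter started with -X faulthandler.

    Returns (returncode, stderr). If the child dies from a fatal signal such
    as SIGSEGV, stderr contains the Python traceback that faulthandler dumped.
    """
    code = "; ".join("import " + name for name in modules)
    proc = subprocess.run(
        [python, "-X", "faulthandler", "-c", code],
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python fh_imports.py tensorflow scipy.ndimage
    returncode, stderr = import_with_faulthandler(sys.argv[1:])
    print("exit code:", returncode)
    print(stderr)
```

The Python traceback complements the gdb output rather than replacing it: it names the failing import, while gdb names the failing C function.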

@tmsimont

Looking into this further, there seems to be a known, possibly related issue that tf_session_helper.h guards against:


#ifdef PyArray_Type
#error "Numpy cannot be included before tf_session_helper.h."
#endif

// Disallow Numpy 1.7 deprecated symbols.
#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION

// We import_array in the tensorflow init function only.
#define PY_ARRAY_UNIQUE_SYMBOL _tensorflow_numpy_api
#ifndef TF_IMPORT_NUMPY
#define NO_IMPORT_ARRAY
#endif

Note the error check: Numpy cannot be included before tf_session_helper.h

Could it be that we need a similar check for #ifdef PyArray_API?

@tmsimont

Oh wait, the problem appears when TensorFlow is imported before numpy, so maybe that's not it.

@yaroslavvb
Contributor

The error sounds like "import_array" is not getting run. That's a function that sets up some global state and must be run before using numpy C API.

We used to have the following in tf_session.i

%include "tensorflow/python/platform/numpy.i"
%init %{
import_array();
%}

It looks like this has since been extended with some logic that I don't fully understand. @girving -- do you see any scenarios where "import_array()" isn't going to run?

@girving
Contributor

girving commented Apr 25, 2016

@tmsimont: I'm confused by your stacktrace. Why is scipy involved? Is it possible to get a reproduction case that doesn't involve scipy being the culprit?

@girving
Contributor

girving commented Apr 25, 2016

Ah, looks like there are a few tests that do touch scipy, including one broken one that requires it or fails. I will get the culprit to fix that one, but it's unrelated to this issue.

@tmsimont

It seems that in my case it wasn't just numpy...

This works:

import tensorflow as tf
import numpy as np

This fails:

import tensorflow as tf
import scipy.ndimage as nd

However, this does not fail:

import tensorflow as tf
import scipy

So it seems there is something in scipy.ndimage that causes a segmentation fault after tensorflow is loaded.

This does not fail:

import scipy.ndimage as nd
import tensorflow as tf

Also -- FYI, I'm using scipy here as this comes from the Mandelbrot example

After running some more tests and comparing my issue to the OP, I'm afraid I may have a different problem from the one others in this thread have encountered.

@mouendless, @chengdianxuezi and @zhang8473 all seem to have a problem with just a bare import tensorflow. I apparently only have the issue when scipy.ndimage is imported after tensorflow...

@girving
Contributor

girving commented Apr 26, 2016

@vrv or @martinwicke: Do you know what version of numpy we're building against?
@chengdianxuezi, @mouendless, @fxia22: What versions of numpy do you have?

If these versions don't match, numpy might decide to crash.

@vrv

vrv commented Apr 26, 2016

@caisq for pip build info

@fxia22

fxia22 commented Apr 26, 2016

@girving I am using numpy==1.11.0

@tmsimont

I noticed something potentially relevant here:

INFO: From Compiling tensorflow/python/lib/core/py_func.cc:
In file included from third_party/py/numpy/numpy_include/numpy/ndarraytypes.h:1777:0,
                 from third_party/py/numpy/numpy_include/numpy/ndarrayobject.h:18,
                 from third_party/py/numpy/numpy_include/numpy/arrayobject.h:4,
                 from tensorflow/python/lib/core/py_func.cc:19:
third_party/py/numpy/numpy_include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
 #warning "Using deprecated NumPy API, disable it by " \
  ^

There's a warning about the numpy include using deprecated NumPy API.

I tried to use #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION as it is used in other TensorFlow headers, but then the project fails to compile:

tensorflow/python/lib/core/py_func.cc: In function 'tensorflow::Status tensorflow::{anonymous}::ConvertNdarrayToTensor(PyObject*, tensorflow::Tensor*)':
tensorflow/python/lib/core/py_func.cc:235:50: error: 'PyArrayObject' has no member named 'data'
       memcpy(const_cast<char*>(p.data()), input->data, p.size());
                                                  ^
tensorflow/python/lib/core/py_func.cc: In function 'tensorflow::Status tensorflow::ConvertTensorToNdarray(const tensorflow::Tensor&, PyObject**)':
tensorflow/python/lib/core/py_func.cc:330:61: error: 'PyArrayObject' has no member named 'data'
     PyObject** out = reinterpret_cast<PyObject**>(np_array->data);
                                                             ^
tensorflow/python/lib/core/py_func.cc:345:22: error: 'PyArrayObject' has no member named 'data'
     memcpy(np_array->data, p.data(), p.size());
                      ^

Is TensorFlow trying to use two different versions of the NumPy API?

@martinwicke
Member

It does look like an old numpy is creeping in from somewhere.

@girving
Contributor

girving commented Apr 26, 2016

@tmsimont: Can you remove py_func.cc entirely (or comment out the whole file) and see if that fixes the problem? py_func should definitely be fixed, but I'm skeptical that it's the problem here (hopefully I'm wrong!).

@girving
Contributor

girving commented Apr 26, 2016

I'm fixing py_func to not roll its own numpy import logic now in case that is the problem.

@tmsimont

With py_func.cc commented out the compiler warning goes away and so does the segmentation fault, but import tensorflow throws an error: undefined symbol: _ZN10tensorflow22ConvertTensorToNdarrayERKNS_6TensorEPP7_object.

@yaroslavvb
Contributor

yaroslavvb commented Apr 26, 2016

That's "tensorflow::ConvertTensorToNdarray(tensorflow::Tensor const&, _object**)" (demangled with c++filt).

So the library is being included through "py_func_lib". Could you remove the "py_func_lib" dependency here? "https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/BUILD#L1010"
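
For anyone who wants to repeat the demangling step from a script, here is a minimal sketch that shells out to c++filt. It assumes the c++filt binary (part of GNU binutils, also shipped with LLVM) is on PATH and returns None otherwise; `demangle` is a hypothetical helper name.

```python
import shutil
import subprocess


def demangle(symbol):
    """Demangle a C++ symbol with c++filt; return None if the tool is missing."""
    tool = shutil.which("c++filt")
    if tool is None:
        return None
    result = subprocess.run(
        [tool, symbol],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The undefined symbol quoted in the comment above.
    print(demangle("_ZN10tensorflow22ConvertTensorToNdarrayERKNS_6TensorEPP7_object"))
```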

@girving
Contributor

girving commented Apr 26, 2016

#2114 fixes the py_func weirdness, but I still don't think it's related to the original issue.

@girving
Contributor

girving commented Apr 26, 2016

@caisq: Do you know what numpy version we're building against for the pip packages?

@kanwar2preet

I hit the same segmentation fault after compiling the 0.8 GPU version of TensorFlow.

The workaround mentioned by @zhang8473 works perfectly, i.e. import numpy, then import tensorflow.

In addition to that workaround, I found something interesting. When building from source, if I skip the first step and go directly to the build with the GPU option, I get a segmentation fault.

$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

# To build with GPU support:
$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

$ pip install /tmp/tensorflow_pkg/tensorflow-0.8.0-py2-none-linux_x86_64.whl

On the other hand, if I run all of the commands above in order, I do not get a segmentation fault.

Output if the first step is skipped:

[Screenshot: screen shot 2016-04-28 at 2 51 29 pm]

Pardon my ignorance, but can somebody tell me the correct way to compile the 0.8 GPU version?

@girving
Contributor

girving commented Apr 28, 2016

@kanwar2preet: Is it possible to get a stack trace from that crash by running Python inside gdb? Also, unfortunately I didn't quite follow the set of commands that work and do not work from the above description. Can you say which commands you mean again?

@stephenroller
Contributor

@girving I posted a copy of my GDB run in #2129. Unfortunately, without debug symbols it's not much help. I did try compiling TF with "-g" (I manually removed the -g0 in the BUILD file), but the debug symbols still didn't seem to be there, so I didn't pursue it.

Here is the output of two strace runs, though:

$ strace python -c "import numpy; import tensorflow" 2> tfnosegfault.txt
$ strace python -c "import tensorflow" 2> tfsegfault.txt 

tfnosegfault.txt
tfsegfault.txt
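
Two strace logs like these are easiest to compare by looking at which files each run opened successfully. The sketch below is one possible way to do that, assuming plain strace output (no -f PID prefixes) with lines of the form `open("path", ...) = N` or `openat(AT_FDCWD, "path", ...) = N`; `opened_paths` and `only_in_first` are hypothetical helper names, and the attached logs themselves are not reproduced here.

```python
import re
import sys

# Captures the quoted path of an open()/openat() call at the start of a line.
_OPEN_RE = re.compile(r'^open(?:at)?\((?:[^"]*, )?"([^"]+)"')


def opened_paths(strace_text):
    """Return the set of paths that were opened successfully in a strace log."""
    paths = set()
    for line in strace_text.splitlines():
        match = _OPEN_RE.match(line)
        # Failed calls end in "= -1 ERRNO (...)"; skip them.
        if match and ") = -1" not in line:
            paths.add(match.group(1))
    return paths


def only_in_first(first_log, second_log):
    """Paths opened in the first log but not in the second, sorted."""
    return sorted(opened_paths(first_log) - opened_paths(second_log))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python strace_diff.py tfnosegfault.txt tfsegfault.txt
    with open(sys.argv[1]) as first, open(sys.argv[2]) as second:
        for path in only_in_first(first.read(), second.read()):
            print(path)
```

Files that appear only in the working run (for example an extra numpy shared object) are candidates for what the crashing run loads differently.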

@stephenroller
Contributor

(In those outputs, my Python installation is /work/01813/roller/maverick/packages/python/, which has lib and such under it.) What's interesting is all the calls to scipy, pandas and the like. From what I can tell, they're picked up by the skflow module, which is imported somewhere...

@caisq
Contributor

caisq commented Apr 29, 2016

@girving @vrv The numpy versions we use for the current nightly builds and the 0.8 release builds are:
On Mac: 1.11.0
On Linux (in ubuntu:14.04 Docker images): 1.8.2

1.8.2 is the version that comes with apt-get on ubuntu:14.04. See:
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/ci_build/install/install_deb_packages.sh#L41
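
To see how a local numpy relates to the build-time version quoted above, a comparison along these lines may help. It is a minimal sketch: `version_tuple` and `compare_to_build` are hypothetical helper names, the default of 1.8.2 is simply the Linux value from this comment, and a mismatch is only a hint, not proof of the cause.

```python
def version_tuple(version):
    """'1.11.0' -> (1, 11, 0); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def compare_to_build(installed, built_against="1.8.2"):
    """Return 'same', 'newer' or 'older' for installed vs. build-time numpy."""
    a, b = version_tuple(installed), version_tuple(built_against)
    if a == b:
        return "same"
    return "newer" if a > b else "older"


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed")
    else:
        print("installed numpy", numpy.__version__, "is",
              compare_to_build(numpy.__version__), "relative to 1.8.2")
```

Comparing tuples of integers rather than strings matters here: as strings, "1.11.0" would sort before "1.8.2".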

@bug-fixed

@gladys0313 I'm not sure whether it is because of readline. But when I reinstalled readline-6.0.0 and used it to recompile Python, the error went away. I didn't do anything else, so I suspect it is related to readline. In my case, the error was that running python in a terminal printed a segmentation fault; after I recompiled, it ran OK. I use readline-6.0. Did you try readline-6.0 rather than readline-6.3? My setup: Red Hat 6.5, gcc-4.8.4, python 2.7, readline-6.0.

@gladys0313

@zszhong Okay, I see. I will try later. I was just wondering how you realized that you should reinstall readline? To me it seems like a very random discovery...

@bug-fixed

@gladys0313, I can't remember the details. I probably got some hints after some googling, tried it, and it worked.

@apkuhar

apkuhar commented May 24, 2016

Confirming that importing numpy before tensorflow fixes it. (ok, it's more of a workaround than a fix)

For me it started seg faulting after I installed scipy in an otherwise clean virtualenv on Ubuntu 15.04.

@davebs

davebs commented May 29, 2016

In my case, pip install scikit-image fixed it.

@martinwicke
Member

Probably because that updated scipy or numpy as a side effect.

@prhbrt

prhbrt commented May 29, 2016

A quick-fix for this might be to install version 0.7:

pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.7.1-cp34-none-linux_x86_64.whl

or any other version found here: https://storage.googleapis.com/tensorflow/

This worked for me: on 0.8.0, python3 -c 'import tensorflow' resulted in a segmentation fault, whereas on 0.7.1 it did not.

@girving
Contributor

girving commented Jun 7, 2016

For anyone who finds this thread: TensorFlow 0.9 has the workaround.

@amineHorseman

Actually, I have the same issue in Tensorflow 0.9 👎

@MartinThoma

@amineHorseman Are you sure you have the latest version (print(tensorflow.__version__))?
For me, it works. I can now import tf without importing np before.
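
When the import itself segfaults, print(tensorflow.__version__) never gets a chance to run. One way around that is to read the installed distribution's metadata without importing the package. This is a minimal sketch that assumes Python 3.8+ (importlib.metadata did not exist for the interpreters used in this thread, where pip show tensorflow gives the same information); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed.

    Only the package metadata is read; the package itself is not imported.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("tensorflow:", installed_version("tensorflow"))
```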

@amineHorseman

amineHorseman commented Jun 9, 2016

@MartinThoma yes, I'm sure about the version.
I installed tensorflow-0.9.0rc0 from source in conda environment and uninstalled the older version. The current configuration is: python 2.7.11, cuda 7.5, cudnn 5, OS X 11.10.5.
EDIT: importing numpy or not didn't fix the problem

@girving
Contributor

girving commented Jun 9, 2016

@amineHorseman If importing numpy doesn't fix it, please file a different issue. It is probably unrelated, unless you have reason to believe otherwise.

@countkisg

countkisg commented Jul 19, 2016

Is there any update now? I found there was no such problem in the PyCharm IDE (with sklearn installed):

import tensorflow as tf
import numpy as np

But the same code doesn't work in IPython, and I don't know why.

@varundmishra

For everyone using RHEL/Centos 6.x, you might have to build tensorflow from source by installing bazel first.
Here's how I got past this issue on RHEL 6.8:

TensorFlow (Installation on RHEL/Centos 6.x):

1) Enable Repos for Installing python2.7

CentOS 6:
yum install centos-release-scl-rh

RHEL 6:
yum-config-manager --enable rhel-server-rhscl-6-rpms

Scientific Linux 6 (install Software Collections):
yum install "http://ftp.scientificlinux.org/linux/scientific/6/external_products/softwarecollections/yum-conf-softwarecollections-1.0-1.el6.noarch.rpm"

Set gpgcheck=0 in /etc/yum.repos.d/softwarecollections.repo

2) Install python2.7

yum install python27 python27-numpy swig python27-python-devel python27-python-wheel python27-pip

3) Install Developer Toolset

yum install devtoolset-4

4) Build and Install Bazel

Enter the Software Collection environment with the Developer Toolset:

scl enable devtoolset-4 bash

Prepare the source code by cloning the repository:

git clone https://github.com/bazelbuild/bazel.git
cd bazel

Select the version:

git checkout 0.3.1

Compile:

./compile.sh

Install:

mkdir -p ~/bin
cp output/bazel ~/bin/

Exit from the Software Collection environment:

exit

5) Build TensorFlow with Bazel

Enter the build environment:

scl enable devtoolset-4 python27 bash

Prepare the source code by cloning the repository:

git clone https://github.com/tensorflow/tensorflow.git
cd tensorflow

Select the version:

git checkout v0.10.0

6) Since the GNU C library version in CentOS 6 is older than 2.17, a slight modification needs to be applied before compilation.

Modify the tf_extension_linkopts function in tensorflow/tensorflow.bzl from:

def tf_extension_linkopts():
  return []  # No extension link opts

to:

def tf_extension_linkopts():
  return ["-lrt"]

7) Build

Configure the workspace, leaving everything at its default:

./configure

Build:

~/bin/bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

Exit from the Software Collection environment:

exit

8) Install tensorflow via pip_package

cd /tmp/tensorflow_pkg
pip install --upgrade tensorflow-0.10.0-cp27-none-linux_x86_64.whl

9) Test tensorflow

python2.7
Python 2.7.12 (default, Jun 28 2016, 17:49:40)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> exit()

Hope this helps!!!
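
Step 6 above depends on the glibc version. Before glibc 2.17, clock_gettime and related functions lived in librt rather than libc, which is presumably why the extra -lrt link option is needed on CentOS 6 (the post itself does not give the reason). A minimal sketch for checking whether a machine falls into that case, using the standard-library platform.libc_ver(); `needs_librt` is a hypothetical helper name.

```python
import platform


def needs_librt(glibc_version):
    """Return True if `glibc_version` (e.g. "2.12") is older than 2.17."""
    major, minor = (int(part) for part in glibc_version.split(".")[:2])
    # Compare as integers: as strings, "2.5" would wrongly sort after "2.17".
    return (major, minor) < (2, 17)


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        print("glibc", version, "-> add -lrt:", needs_librt(version))
    else:
        print("Could not detect glibc on this platform.")
```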

@ryuseonghan

I had a similar problem. And I've solved that problem by upgrading matplotlib.

sudo pip install --upgrade matplotlib

@TaihuLight

How can I solve the similar problem shown below? I tried upgrading matplotlib and so on, but that did not help.
[llliao@GPU-1-10 ~]$ python3
Python 3.5.4 (default, Aug 8 2017, 11:09:21)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow as tf
>>> sess=tf.Session()
2018-01-13 20:00:56.735985: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2018-01-13 20:00:57.053726: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:04:00.0
totalMemory: 11.17GiB freeMemory: 4.29GiB
2018-01-13 20:00:57.350308: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 1 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:05:00.0
totalMemory: 11.17GiB freeMemory: 2.58GiB
2018-01-13 20:00:57.658419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 2 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:83:00.0
totalMemory: 11.17GiB freeMemory: 6.35GiB
Segmentation fault (core dumped)

fsx950223 pushed a commit to fsx950223/tensorflow that referenced this issue Nov 28, 2023
…upstream-py-dev

Add some extra pkgs to tensorflow-build so it builds python properly