Illegal instruction (core dumped) after running import tensorflow #17411

Closed
konnerthg opened this Issue Mar 4, 2018 · 82 comments

konnerthg commented Mar 4, 2018

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below):
    1.6.0-cp27-cp27mu-manylinux1_x86_64 (can only guess since python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)" gives me an error already)
  • Python version: Python 2.7.12
  • Exact command to reproduce: import tensorflow
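Because the import itself crashes, the installed version has to be read without loading TensorFlow's native library. A minimal sketch, assuming Python 3.8 or newer for `importlib.metadata` (the interpreters in this thread are older, where `pip show tensorflow` serves the same purpose); `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(distribution="tensorflow"):
    """Return the installed version string, or None if it is not installed.

    Only package metadata is read, so the package's shared libraries
    are never loaded and cannot crash the interpreter.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_version()` in the affected virtualenv should report the wheel version even when `import tensorflow` dies.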

I created a fresh virtual environment: virtualenv -p python2 test_venv/
And installed tensorflow: pip install --upgrade --no-cache-dir tensorflow
import tensorflow gives me Illegal instruction (core dumped)

Please help me understand what's going on and how I can fix it. Thank you.

CPU information:

-cpu
          description: CPU
          product: Intel(R) Core(TM) i3 CPU       M 330  @ 2.13GHz
          bus info: cpu@0
          version: CPU Version
          capabilities: x86-64 fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm arat cpufreq

EDIT
Stacktrace obtained with gdb:

#0  0x00007fffe5793880 in std::pair<std::__detail::_Node_iterator<std::pair<tensorflow::StringPiece const, std::function<bool (tensorflow::Variant*)> >, false, true>, bool> std::_Hashtable<tensorflow::StringPiece, std::pair<tensorflow::StringPiece const, std::function<bool (tensorflow::Variant*)> >, std::allocator<std::pair<tensorflow::StringPiece const, std::function<bool (tensorflow::Variant*)> > >, std::__detail::_Select1st, std::equal_to<tensorflow::StringPiece>, tensorflow::StringPieceHasher, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_emplace<std::pair<tensorflow::StringPiece, std::function<bool (tensorflow::Variant*)> > >(std::integral_constant<bool, true>, std::pair<tensorflow::StringPiece, std::function<bool (tensorflow::Variant*)> >&&) ()
   from /media/gerry/hdd_1/ws_hdd/test_venv/local/lib/python2.7/site-packages/tensorflow/python/../libtensorflow_framework.so
#1  0x00007fffe5795735 in tensorflow::UnaryVariantOpRegistry::RegisterDecodeFn(std::string const&, std::function<bool (tensorflow::Variant*)> const&) () from /media/gerry/hdd_1/ws_hdd/test_venv/local/lib/python2.7/site-packages/tensorflow/python/../libtensorflow_framework.so
#2  0x00007fffe5770a7c in tensorflow::variant_op_registry_fn_registration::UnaryVariantDecodeRegistration<tensorflow::Tensor>::UnaryVariantDecodeRegistration(std::string const&) ()
   from /media/gerry/hdd_1/ws_hdd/test_venv/local/lib/python2.7/site-packages/tensorflow/python/../libtensorflow_framework.so
#3  0x00007fffe56ea165 in _GLOBAL__sub_I_tensor.cc ()
   from /media/gerry/hdd_1/ws_hdd/test_venv/local/lib/python2.7/site-packages/tensorflow/python/../libtensorflow_framework.so
#4  0x00007ffff7de76ba in call_init (l=<optimized out>, argc=argc@entry=2, argv=argv@entry=0x7fffffffd5c8, env=env@entry=0xa7b4d0)
    at dl-init.c:72
#5  0x00007ffff7de77cb in call_init (env=0xa7b4d0, argv=0x7fffffffd5c8, argc=2, l=<optimized out>) at dl-init.c:30
#6  _dl_init (main_map=main_map@entry=0xa11920, argc=2, argv=0x7fffffffd5c8, env=0xa7b4d0) at dl-init.c:120
#7  0x00007ffff7dec8e2 in dl_open_worker (a=a@entry=0x7fffffffb5c0) at dl-open.c:575
#8  0x00007ffff7de7564 in _dl_catch_error (objname=objname@entry=0x7fffffffb5b0, errstring=errstring@entry=0x7fffffffb5b8, 
    mallocedp=mallocedp@entry=0x7fffffffb5af, operate=operate@entry=0x7ffff7dec4d0 <dl_open_worker>, args=args@entry=0x7fffffffb5c0)
    at dl-error.c:187
#9  0x00007ffff7debda9 in _dl_open (
    file=0x7fffea7cbc34 "/media/gerry/hdd_1/ws_hdd/test_venv/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so", mode=-2147483646, caller_dlopen=0x51ad19 <_PyImport_GetDynLoadFunc+233>, nsid=-2, argc=<optimized out>, argv=<optimized out>, env=0xa7b4d0)
    at dl-open.c:660
#10 0x00007ffff75ecf09 in dlopen_doit (a=a@entry=0x7fffffffb7f0) at dlopen.c:66
#11 0x00007ffff7de7564 in _dl_catch_error (objname=0x9b1870, errstring=0x9b1878, mallocedp=0x9b1868, operate=0x7ffff75eceb0 <dlopen_doit>, 
    args=0x7fffffffb7f0) at dl-error.c:187
#12 0x00007ffff75ed571 in _dlerror_run (operate=operate@entry=0x7ffff75eceb0 <dlopen_doit>, args=args@entry=0x7fffffffb7f0) at dlerror.c:163
#13 0x00007ffff75ecfa1 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#14 0x000000000051ad19 in _PyImport_GetDynLoadFunc ()
#15 0x000000000051a8e4 in _PyImport_LoadDynamicModule ()
#16 0x00000000005b7b1b in ?? ()
#17 0x00000000004bc3fa in PyEval_EvalFrameEx ()
#18 0x00000000004c136f in PyEval_EvalFrameEx ()
#19 0x00000000004b9ab6 in PyEval_EvalCodeEx ()
#20 0x00000000004b97a6 in PyEval_EvalCode ()
#21 0x00000000004b96df in PyImport_ExecCodeModuleEx ()
#22 0x00000000004b2b06 in ?? ()
#23 0x00000000004a4ae1 in ?? ()

EDIT 2
Bazel version: N/A
CUDA/cuDNN version: N/A
GPU model and memory: N/A

After downgrading to an older version of tensorflow the error goes away. I've been advised that my CPU (see information above) might not work with some improvements in the new API. If this is the case, I suppose there's no solution for my problem. Therefore, I will close this thread. Feel free to correct me though. Thank you for your support

@tensorflowbutler

Member

tensorflowbutler commented Mar 4, 2018

Thank you for your post. We noticed you have not filled out the following field in the issue template. Could you update them if they are relevant in your case, or leave them as N/A? Thanks.
Bazel version
CUDA/cuDNN version
GPU model and memory

konnerthg closed this Mar 4, 2018

@tianyang-li


tianyang-li commented Mar 4, 2018

I'm having the same (or similar) "illegal instruction" problem when I run

import tensorflow as tf

I'm only using the CPU 1.6 version on 64 bit Ubuntu Linux.

After downgrading to the CPU 1.5 version, it doesn't have this problem.

@LukianBat


LukianBat commented Mar 5, 2018

How can I downgrade to the CPU 1.5 version?

@konnerthg

Author

konnerthg commented Mar 5, 2018

Try running
pip uninstall tensorflow
And then
pip install tensorflow==1.5

EDIT
just to give credit, solution is from here:
https://stackoverflow.com/questions/49094597/illegal-instruction-core-dumped-after-running-import-tensorflow

@priyablue


priyablue commented Mar 5, 2018

Thanks konnerthg, I was having the same problem too. Your command helped me sort out this issue. Thanks again.

@royyannick


royyannick commented Mar 5, 2018

Same here.
With the latest wheel, I had the illegal instruction problem on Ubuntu 16.04, however I downgraded to tensorflow-gpu==1.5 and it works!

@SirPadric


SirPadric commented Mar 7, 2018

downgrade to 1.5 worked for me, too

@jo7ueb


jo7ueb commented Mar 8, 2018

@konnerthg Downgrading to 1.5 is just a workaround; this issue is not solved yet.
Which commit/PR solved this issue?

@QuantumTradingGroup


QuantumTradingGroup commented Mar 8, 2018

I am also getting this error in python 3.6

@nayzen


nayzen commented Mar 8, 2018

Hey!
Thank you for your solution! Really. I have had this problem for a week now and I was starting to go crazy! Thanks!

@eelectron


eelectron commented Mar 11, 2018

Thanks for the solution. It worked on my Ubuntu 16.04, 64-bit, Python 3.5.

@jmoeyersons


jmoeyersons commented Mar 12, 2018

Thanks for the solution! Downgrading to version 1.5 fixed the issue. Tested on a Ubuntu 16.04 server with python 2.7

@Bauxitedev


Bauxitedev commented Mar 15, 2018

Same issue, downgrading from Tensorflow 1.6 to 1.5 solved it. Running Xubuntu 16.04 64-bit, Python 3.5.

@NinemillaKA


NinemillaKA commented Mar 16, 2018

Thanks, all. This solved my issue on Python 3.6:

(tensorflow) naniny@Aspire-E5-573:~$ pip uninstall tensorflow
(tensorflow) naniny@Aspire-E5-573:~$ pip install tensorflow==1.5
(tensorflow) naniny@Aspire-E5-573:~$ python
>>> import tensorflow as tf

Now it works without any problem.

@RylanSchaeffer


RylanSchaeffer commented Mar 16, 2018

This is really weird. Does anyone know what causes the issue? I'm surprised that TensorFlow 1.6 would have a bug this big.

@nacl


nacl commented Mar 17, 2018

I am encountering this issue as well with tensorflow-gpu 1.6.0, on linux, using python 3.6.4. I have installed tensorflow using pip itself. Simply running this produces a SIGILL:

$ python3 -m tensorflow
zsh: illegal hardware instruction  python3 -m tensorflow

I get stack traces similar to what is mentioned in this ticket's description.

This seems to be occurring due to the use of AVX instructions in the latest Tensorflow packages uploaded to pip. Running python3 through GDB and disassembling the crashing function points to this instruction:

=> 0x00007fffb9689660 <+80>:    vmovdqu 0x10(%r13),%xmm1

Which is an AVX instruction not supported on older or less-featureful CPUs that do not have AVX support. The tensorflow(-gpu) 1.5.0 pip packages do not use AVX instructions, and thus there are no problems using it with these CPUs.

The solution would be for a build of tensorflow(-gpu) that is not compiled with AVX instructions to be published (or to build a copy locally). The provided installation instructions do not mention any specific CPU requirements nor how to determine compatibility with the provided binaries.

In the meantime, reverting to tensorflow(-gpu) 1.5.0 using something like what @NinemillaKA mentioned above is an effective workaround.
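One way to confirm that an installed wheel hits an illegal instruction, without killing the interpreter you are working in, is to attempt the import in a child process and inspect how it ended. This is a rough sketch rather than anything TensorFlow provides; `probe_import` is a hypothetical helper, and the signal check relies on the POSIX convention that `subprocess` reports death by signal as a negative return code:

```python
import signal
import subprocess
import sys


def probe_import(statement="import tensorflow"):
    """Run `statement` in a child interpreter and describe how it ended."""
    result = subprocess.run(
        [sys.executable, "-c", statement],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return "ok"
    # On POSIX, a negative return code means the child was killed by a signal.
    if result.returncode == -signal.SIGILL:
        return "illegal instruction"
    return "failed with return code %d" % result.returncode
```

A result of "illegal instruction" points at an unsupported instruction such as the AVX one above, while any other failure (for example, the package not being installed) shows up as an ordinary non-zero return code.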

@deadpyxel


deadpyxel commented Mar 19, 2018

I have the same issue and, as many have commented, downgraded from 1.6.0 to 1.5.0.

For the record, I tried running tensorflow (CPU-only version) on 2 different computers:

Computer 1:

OS = Ubuntu 16.04 x64 LTS
Python = Python 3.6
pip version = 9.0.1
tensorflow version = TensorFlow 1.6.0
CPU = Intel Core 2 Quad Q6600  @2.40GHz

Computer 2:

OS = Ubuntu 16.04 x64 LTS
Python = Python 3.6
pip version = 9.0.1
tensorflow version = TensorFlow 1.6.0
CPU = Intel Celeron N2820 @2.413GHz

I agree with @nacl that the instruction set requirements should be stated more clearly and, if possible, there should be a separate, up-to-date build for processors that don't support AVX instructions. To be honest, I find it a bit discouraging to have to work with an outdated version of any technology, and I think many feel the same.

@yaroslavvb

Contributor

yaroslavvb commented Mar 19, 2018

The alternative to having a different build for each architecture type is to use dynamic dispatch. For example, PyTorch has one binary for all architectures and selects the most efficient ops at runtime. @caisq
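To illustrate the idea of dynamic dispatch in general terms, here is a toy sketch. It is not how PyTorch or TensorFlow implement it; all names are hypothetical, and `dot_wide` merely stands in for a kernel that would be compiled with AVX in a real library. `cpu_flags` is assumed to be a set of flag names such as those listed in /proc/cpuinfo:

```python
def dot_baseline(xs, ys):
    """Portable implementation that any CPU can run."""
    return sum(x * y for x, y in zip(xs, ys))


def dot_wide(xs, ys):
    """Stand-in for an AVX-compiled kernel; plain Python in this sketch."""
    return sum(x * y for x, y in zip(xs, ys))


def select_dot(cpu_flags):
    """Pick an implementation once, based on the flags the CPU reports.

    `cpu_flags` must be a set (or list) of flag names, not one long string,
    so that "avx" does not accidentally match inside "avx2".
    """
    return dot_wide if "avx" in cpu_flags else dot_baseline
```

The point of the design is that both variants ship in the same binary and the choice is made at runtime, so a CPU without AVX never executes the AVX path.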

@Djiky


Djiky commented Mar 22, 2018

Thanks

@jaeseung16


jaeseung16 commented Mar 23, 2018

I also encounter the same issue. I tried it on two machines, and it works on one of them.

First, I installed it on my MacBook Pro. And I did not have any issues.

MacBook Pro (Retina, Mid 2012)
CPU = 2.3 GHz Intel Core i7
OS = MacOS 10.13.3
Python = Python 3.6.4
pip version = 9.0.3
TensorFlow version = 1.6.0

So I upgraded my MacPro. But this time, I am getting Illegal instruction: 4 when I try to import tensorflow.

Mac Pro (Mid 2010)
CPU = 2 x 2.4 GHz Quad-Core Intel Xeon
OS = MacOS 10.13.3
Python = Python 3.6.4
pip version = 9.0.3
TensorFlow version = 1.6.0

(Update on 3/30/2018)
The same problem with TensorFlow 1.7. So I guess I use TensorFlow 1.5.

@spinorx


spinorx commented Mar 25, 2018

This is still an issue in 1.6 and potentially in 1.7. Why is this closed? @yaroslavvb 's solution seems reasonable. I have downgraded to 1.5 for now.

@captainst


captainst commented Mar 27, 2018

Not sure, but according to this link, Intel CPU instruction optimizations were introduced to TensorFlow in version 1.6.0. I think this is probably the cause.
https://software.intel.com/en-us/articles/intel-optimized-tensorflow-wheel-now-available

@yaroslavvb

Contributor

yaroslavvb commented Mar 29, 2018

@captainst that's an Intel-specific release, different from the official release that you get by doing pip install. SIGILL issues after the 1.6 upgrade are likely caused by the addition of AVX.

@avpdiver


avpdiver commented Mar 30, 2018

I have the same issue.
Ubuntu 18.04 x64
Python 3.6.5rc1
TensorFlow 1.7.0

@laurentS


laurentS commented Aug 9, 2018

I posted some links to a few community builds of tensorflow here which might help avoid having to build from source.

@AlexisWilke


AlexisWilke commented Aug 23, 2018

Indeed, I followed the instructions found at https://www.tensorflow.org/install/install_linux and get nothing more than the "Illegal instruction (core dumped)" when testing as requested on that installation page.

This issue should either not be closed or have an actual solution that makes sense (i.e. not installing version 1.5)

It worked on Ubuntu 18.04. So again, mentioning the version on the installation page, to show what works and what doesn't, may be a good idea.

ben243871694 added a commit to HPI-Information-Systems/Quagga that referenced this issue Aug 24, 2018

Update setup.py
use tensorflow 1.5 because of tensorflow/tensorflow#17411

ben243871694 added a commit to HPI-Information-Systems/QuaggaLib that referenced this issue Aug 24, 2018

Update setup.py
use tensorflow==1.5 because of tensorflow/tensorflow#17411
@moctardiallo


moctardiallo commented Aug 28, 2018

@mmistele


mmistele commented Aug 31, 2018

Root problem may have to do with protobuf and incompatibility around pthread_once and std::call_once. I ran into a segfault myself when importing tensorflow right after another google package called sentencepiece, and the maker of sentencepiece fixed it by making a patch for protobuf that replaces the std::call_once implementation with another.

google/sentencepiece#186

@mitar


mitar commented Aug 31, 2018

To make this issue more constructive, I think it would be useful if TensorFlow checked for the CPU instructions it requires first, and printed an error if they are missing, similar to how it currently reports that some instructions are available but were not compiled in. Then it would be easier to differentiate between bugs and simply not using the correct binary for a given CPU.

@ghost


ghost commented Sep 2, 2018

Mr royyannick, in fact I have searched Google many times, and this problem has affected my study of both Keras and TensorFlow, but today you have made my day. You are great.
Thanks

@alxfed


alxfed commented Sep 9, 2018

Tried both virtual environment and (f...ing) conda (you, dude upstream, go f y s!) on the tensorflow 1.10.1 (latest) in Ubuntu 16.04 with the same error. Switched to the previous version 1.9 - everything works fine.

@mikaelfs


mikaelfs commented Sep 11, 2018

I happened to reproduce this issue on a machine running an old CPU. Here is the article that explains the possible options to resolve the issue.

Those who want to install the latest TensorFlow on an old CPU without AVX support but do not have the time to build from source can also download the WHL file from this Github repository.

@alxfed


alxfed commented Sep 11, 2018

This is BS. I rolled my installation back to 1.9 (not 'before 1.6' as you say in this article) and the binary worked (the day before yesterday).

@mikaelfs


mikaelfs commented Sep 12, 2018

If you run this on command line:

$ lsb_release -a| grep "Release" | awk '{print $2}'
$ grep flags -m1 /proc/cpuinfo | cut -d ":" -f 2 | tr '[:upper:]' '[:lower:]' | { read FLAGS; OPT="-march=native"; for flag in $FLAGS; do case "$flag" in "sse4_1" | "sse4_2" | "ssse3" | "fma" | "cx16" | "popcnt" | "avx" | "avx2") OPT+=" -m$flag";; esac; done; MODOPT=${OPT//_/\.}; echo "$MODOPT"; }

and see 16.04 for 1) and -mavx or -mavx2 for 2) in the output, it may be another problem that is unrelated to AVX support.

If those flags are not there, that's something that I should add into my note, thanks to you.
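For readers who find the shell one-liner hard to follow, here is a rough Python rendering of the same logic: read the first flags line, keep only the listed instruction-set flags, and turn each into a `-m` compiler option with underscores replaced by dots. The function names are hypothetical and this is a sketch of the one-liner above, not part of any TensorFlow tooling:

```python
# The flags the one-liner above looks for.
WANTED = {"sse4_1", "sse4_2", "ssse3", "fma", "cx16", "popcnt", "avx", "avx2"}


def compiler_options(flags_line):
    """Build the option string from the text after 'flags :' in /proc/cpuinfo."""
    options = ["-march=native"]
    for flag in flags_line.lower().split():
        if flag in WANTED:
            # sse4_1 becomes -msse4.1, matching the ${OPT//_/\.} substitution.
            options.append("-m" + flag.replace("_", "."))
    return " ".join(options)


def first_flags_line(path="/proc/cpuinfo"):
    """Return the value of the first 'flags' line, or '' if there is none."""
    with open(path) as handle:
        for line in handle:
            if line.startswith("flags"):
                return line.split(":", 1)[1]
    return ""
```

On a Linux machine, `compiler_options(first_flags_line())` gives the same kind of string as the shell version; if `-mavx` is missing from it, the CPU does not report AVX.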

@mtian2018


mtian2018 commented Sep 12, 2018

Same error here,

CentOS 7, Python 3.6.5, Intel CPU core2 duo e8500. pip install.

Version 1.9 doesn't work. Version 1.5 imports OK.

version 1.10 seems ok on my laptop which has Ubuntu 18.04 and Intel i5-6200U.

@failure-to-thrive


failure-to-thrive commented Sep 12, 2018

This is stated at https://www.tensorflow.org/install/install_sources
Note: Starting from 1.6 release, our prebuilt binaries will use AVX instructions. Older CPUs may not be able to execute these binaries.

I think that should have been mentioned in a much more prominent location!

@zeroows


zeroows commented Sep 18, 2018

This solved my issue:
After installing NVIDIA driver, CUDA Toolkit, and CUDNN.
First uninstall tensorflow-gpu:

$ pip uninstall tensorflow-gpu

Then install tensorflow-gpu using Anaconda:

$ conda create -n tensorflow
$ conda install tensorflow-gpu -n tensorflow
@techemayo


techemayo commented Sep 24, 2018

Try running
pip uninstall tensorflow
And then
pip install tensorflow==1.5

EDIT
just to give credit, solution is from here:
https://stackoverflow.com/questions/49094597/illegal-instruction-core-dumped-after-running-import-tensorflow

Thanks it works

@icyhearts


icyhearts commented Oct 3, 2018

Maybe related to AVX instructions. The prebuilt pip packages for tensorflow 1.6 and higher are built with AVX instructions, and some CPUs don't support AVX. The prebuilt pip package for tensorflow 1.5 is not built with AVX instructions.
Suggestions: 1) use a lower version of tensorflow
2) compile a higher version of tensorflow from source

@AlexisWilke


AlexisWilke commented Oct 3, 2018

Yes. Indeed. It would be better, though, if the software would tell me rather than just crash. I don't have a problem with the requirement, just the way it is handled... On Linux, it would be very easy to check in /proc/cpuinfo for the flags line where avx would need to appear. If not, generate an error and exit(1).

Here are the flags on my old computer without AVX:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ssbd ibrs ibpb stibp kaiser tpr_shadow vnmi flexpriority ept vpid dtherm ida flush_l1d
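The check described above could look roughly like this. It is a sketch of the suggestion, not something TensorFlow does; `cpu_flags` and `require_flag` are hypothetical names, and the input is assumed to be the text of /proc/cpuinfo on Linux:

```python
import sys


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            return set(value.split())
    return set()


def require_flag(cpuinfo_text, flag="avx"):
    """Exit with status 1 and a readable message if `flag` is missing."""
    if flag not in cpu_flags(cpuinfo_text):
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit("This build needs the '%s' CPU flag, which this machine does not report." % flag)
```

A launcher script could read /proc/cpuinfo, pass its contents to `require_flag`, and only then run `import tensorflow`, so a machine like the one above would get a message instead of a core dump.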

@bandarikanth


bandarikanth commented Dec 29, 2018

System information

  • Lenovo G500, 8 GB RAM
  • CPU:
      description: CPU
      product: Intel(R) Core(TM) i3 CPU M 330 @ 2.13GHz
      bus info: cpu@0
      version: CPU Version
  • OS: Ubuntu-16.05
  • pip: version 18 (latest)
  • GPU: none

I am also getting "Illegal instruction (core dumped)". TensorFlow 1.5 works for me, but I need to install TensorFlow 1.10 or the latest version for my project.

I tried to install TensorFlow in different ways:

  1. without Anaconda, Python 2.7, using pip: pip install --upgrade tensorflow
  2. without Anaconda, Python 3.5, using pip (same command)
  3. without Anaconda, Python 3.6, using pip (same command)
  4. with Anaconda, Python 2.7, using conda: conda install -c conda-forge tensorflow
  5. without Anaconda, Python 2.7, using pip (same command)
  6. without Anaconda, Python 2.7, using pip (same command)

None of them worked for me. What is the issue?

@dstine


dstine commented Dec 29, 2018

@bandarikanth

The manner in which you install tensorflow shouldn’t matter. The problem is that the tensorflow 1.6+ prebuilt binaries require the AVX instruction set extensions, and your processor doesn’t support AVX. You can either build from source, move to a computer with a new-enough processor, or stick with 1.5.

@bandarikanth


bandarikanth commented Jan 14, 2019

@erdeq-upenn


erdeq-upenn commented Feb 8, 2019

Works for me if I downgrade to 1.5 (pip install tensorflow==1.5).
