Instructions for x86-64 install? Getting Illegal instruction #6

Closed
edalquist opened this issue Jan 25, 2021 · 11 comments

Comments

@edalquist
Contributor

I'm trying to get this running in an LXC container running Ubuntu Server 20.04.

The one change I made from the instructions is to install TF via:

pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow_cpu-2.4.0-cp38-cp38-manylinux2010_x86_64.whl

When running stream.py I get:

(argos-venv) edalquist@argos:~/argos$ python stream.py --ip 0.0.0.0 --port 8080 --config configs.driveway_stream
INFO:__main__:package import START
INFO:__main__:package import END
INFO:notifier:mqtt init
INFO:__main__:flask init..
INFO:__main__:start reading video file
INFO:__main__:TFObjectDetector init START
 * Serving Flask app "stream" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
INFO:input.rtmpstream:rtmp capture init START
Illegal instruction

Here is my OpenCV build info dump:

(argos-venv) edalquist@argos:~/argos$ python -c "import cv2; print(cv2.getBuildInformation())"

General configuration for OpenCV 4.5.1 =====================================
  Version control:               4.5.1-dirty

  Platform:
    Timestamp:                   2021-01-02T13:00:02Z
    Host:                        Linux 4.15.0-1077-gcp x86_64
    CMake:                       3.18.4
    CMake generator:             Unix Makefiles
    CMake build tool:            /bin/gmake
    Configuration:               Release

  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (15 files):         + SSSE3 SSE4_1
      SSE4_2 (1 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (0 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (4 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (29 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (4 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?:      NO
    C++ standard:                11
    C++ Compiler:                /usr/lib/ccache/compilers/c++  (ver 9.3.1)
    C++ flags (Release):         -Wl,-strip-all   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -Wl,-strip-all   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/lib/ccache/compilers/cc
    C flags (Release):           -Wl,-strip-all   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -Wl,-strip-all   -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Wno-comment -Wno-strict-overflow -fdiagnostics-show-option -Wno-long-long -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections  -msse -msse2 -msse3 -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -L/root/ffmpeg_build/lib  -Wl,--gc-sections -Wl,--as-needed
    Linker flags (Debug):        -Wl,--exclude-libs,libippicv.a -Wl,--exclude-libs,libippiw.a -L/root/ffmpeg_build/lib  -Wl,--gc-sections -Wl,--as-needed
    ccache:                      YES
    Precompiled headers:         NO
    Extra dependencies:          ade Qt5::Core Qt5::Gui Qt5::Widgets Qt5::Test Qt5::Concurrent /lib64/libpng.so /lib64/libz.so dl m pthread rt
    3rdparty dependencies:       ittnotify libprotobuf libjpeg-turbo libwebp libtiff libopenjp2 IlmImf quirc ippiw ippicv

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc ml objdetect photo python3 stitching video videoio
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 java python2 ts
    Applications:                -
    Documentation:               NO
    Non-free algorithms:         NO

  GUI:
    QT:                          YES (ver 5.15.0)
      QT OpenGL support:         NO
    GTK+:                        NO
    VTK support:                 NO

  Media I/O:
    ZLib:                        /lib64/libz.so (ver 1.2.7)
    JPEG:                        libjpeg-turbo (ver 2.0.6-62)
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         /lib64/libpng.so (ver 1.5.13)
    TIFF:                        build (ver 42 - 4.0.10)
    JPEG 2000:                   build (ver 2.3.1)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES
      avcodec:                   YES (58.109.100)
      avformat:                  YES (58.61.100)
      avutil:                    YES (56.60.100)
      swscale:                   YES (5.8.100)
      avresample:                NO
    GStreamer:                   NO
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            pthreads

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP:                   2020.0.0 Gold [2020.0.0]
           at:                   /tmp/pip-req-build-ms668fyv/_skbuild/linux-x86_64-3.8/cmake-build/3rdparty/ippicv/ippicv_lnx/icv
    Intel IPP IW:                sources (2020.0.0)
              at:                /tmp/pip-req-build-ms668fyv/_skbuild/linux-x86_64-3.8/cmake-build/3rdparty/ippicv/ippicv_lnx/iw
    Lapack:                      NO
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.5.1)

  OpenCL:                        YES (no extra features)
    Include path:                /tmp/pip-req-build-ms668fyv/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 /opt/python/cp38-cp38/bin/python (ver 3.8.6)
    Libraries:                   libpython3.8.a (ver 3.8.6)
    numpy:                       /tmp/pip-build-env-qm375ina/overlay/lib/python3.8/site-packages/numpy/core/include (ver 1.17.3)
    install path:                python

  Python (for build):            /bin/python2.7

  Java:
    ant:                         NO
    JNI:                         NO
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    /tmp/pip-req-build-ms668fyv/_skbuild/linux-x86_64-3.8/cmake-install
-----------------------------------------------------------------
@unclebacon-live

I was actually just working on trying to build this from the ground up for amd64. Would be nice to see

@edalquist
Contributor Author

Note if I uninstall tensorflow it doesn't crash, just complains it couldn't find it:

(argos-venv) edalquist@argos:~/argos$ python stream.py --ip 0.0.0.0 --port 8080 --config configs.driveway_stream
INFO:__main__:package import START
INFO:__main__:package import END
INFO:notifier:mqtt init
INFO:__main__:flask init..
INFO:__main__:start reading video file
INFO:__main__:TFObjectDetector init START
 * Serving Flask app "stream" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:werkzeug: * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
Exception in thread Thread-9:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
INFO:input.rtmpstream:rtmp capture init START
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/edalquist/argos/detection/detect_base.py", line 145, in detect_continuously
    self.initialize_tf_model()
  File "/home/edalquist/argos/detection/detect_base.py", line 40, in initialize_tf_model
    from tflib.tflite_util import DetectorTFLite
  File "/home/edalquist/argos/tflib/tflite_util.py", line 10, in <module>
    from tensorflow.lite.python.interpreter import Interpreter
ModuleNotFoundError: No module named 'tensorflow'
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:input.rtmpstream:rtmp capture init END
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
^CTraceback (most recent call last):
  File "stream.py", line 248, in <module>
    t = sd.start()
  File "stream.py", line 63, in start
    self.od.wait_for_ready()
  File "/home/edalquist/argos/detection/detect_base.py", line 49, in wait_for_ready
    self.__cv.wait()
  File "/usr/lib/python3.8/threading.py", line 302, in wait
    waiter.acquire()
KeyboardInterrupt
FATAL: exception not rethrown
Aborted

Makes me think I'm not installing the right TF wheel.

@angadsingh
Owner

@edalquist even if tensorflow isn't installed at all, you'll just see a ModuleNotFoundError and argos will keep running, since the object detector runs in a separate thread, so only that thread crashes (my bad: i haven't handled killing the whole process when the tensorflow thread crashes)
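
One way to avoid the hang seen in the log above (the main thread waits forever once the detector thread has died) is to catch the worker's exception and always wake the waiter. This is only a minimal sketch of that idea, not how argos is implemented; `run_worker` and `load_model` are hypothetical names introduced for illustration.

```python
import threading


def run_worker(target, timeout=None):
    """Run `target` in a thread and return the exception it raised, or None.

    Hypothetical helper: the Event is set whether the worker succeeds or
    fails, so a waiting main thread is always woken up.
    """
    result = {}
    finished = threading.Event()

    def wrapper():
        try:
            target()
        except BaseException as exc:  # includes a failed import inside the thread
            result["error"] = exc
        finally:
            finished.set()

    threading.Thread(target=wrapper, daemon=True).start()
    finished.wait(timeout)
    return result.get("error")


def load_model():
    # Stand-in for the detector's model initialization; the module name is
    # made up so that the import fails the same way a missing package would.
    import module_that_is_not_installed_xyz  # noqa: F401


error = run_worker(load_model)
print("detector thread error:", repr(error))
```

With the error returned to the main thread, the caller can decide whether to log it, retry, or exit the process.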

it doesn't matter that your underlying OS is ubuntu. it could have been OSX and you'd have faced the same error above. what matters is the cpu architecture (armhf, amd64, x86). pip looks for architecture-specific wheels. for mainstream architectures (amd64, x86), you don't need to install a specific wheel of tensorflow. that workaround was the only way for raspberry pi armv6/7 architectures, whose pip repositories did not contain tensorflow 2.x wheels yet.

you may just 'pip install tensorflow==2.4.0' on a mainstream architecture machine (or docker container). (e.g. that's what you need to do on your macbook pro as well)

also, the current Dockerfile in the repo is based on arm32v7/python:3.7-slim-buster and the instructions are specific to that. let me put together an x86_64 docker image.

@angadsingh
Owner

FWIW the wheel should have been tensorflow-2.4.0-cp37-cp37m-manylinux2010_x86_64.whl, since we're using python3.7
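
The `cp37-cp37m` and `cp38-cp38` parts of these wheel filenames are the interpreter and ABI tags, and they have to match the Python that runs pip. The sketch below derives them for a CPython 3 version, assuming a default (non-debug) build; `cpython_wheel_tags` is a hypothetical helper, not part of pip.

```python
import platform
import sys


def cpython_wheel_tags(major, minor):
    """Return the (python tag, ABI tag) pair used in CPython 3 wheel filenames."""
    python_tag = f"cp{major}{minor}"
    # CPython 3.7 and earlier carry an "m" ABI suffix; 3.8 dropped it.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    return python_tag, abi_tag


py_tag, abi_tag = cpython_wheel_tags(*sys.version_info[:2])
print(f"interpreter tags: {py_tag}-{abi_tag}, machine: {platform.machine()}")
```

Running it inside the virtualenv shows which wheel pip will consider compatible, which is why Python 3.8 ended up with a `cp38` wheel here.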

@edalquist
Contributor Author

Interesting, somehow I ended up on Python 3.8; I guess that is the default "python3" for Ubuntu 20.04.

I did pip install tensorflow==2.4.0 which ended up installing the same wheel I manually found and I get the same Illegal instruction error.

I'll try recreating the venv but explicitly using 3.7

@edalquist
Contributor Author

No luck with a Python 3.7 install either:

  Python 3:
    Interpreter:                 /opt/python/cp37-cp37m/bin/python (ver 3.7.9)
    Libraries:                   libpython3.7m.a (ver 3.7.9)
    numpy:                       /tmp/pip-build-env-7d0lu0w8/overlay/lib/python3.7/site-packages/numpy/core/include (ver 1.14.5)
    install path:                python

Same Illegal Instruction error. Any tips on how to better debug where that error might be coming from?
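
One possible way to narrow this down is to import each native package in a child interpreter with Python's `faulthandler` enabled, so a fatal signal prints the Python traceback and the parent can report which signal killed the child. This is a sketch under the assumption that the crash happens at import time; `probe` is a hypothetical helper.

```python
import signal
import subprocess
import sys


def probe(code):
    """Run `code` in a child interpreter and classify how it exited."""
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        status = "ok"
    elif proc.returncode < 0:
        # Negative return code: the child was killed by that signal number.
        status = signal.Signals(-proc.returncode).name
    else:
        status = "python-error"
    return status, proc.stderr


for module in ("cv2", "tensorflow"):
    status, stderr = probe(f"import {module}")
    print(module, status)
```

A `SIGILL` status for one module and `ok` for the other points at the package whose binary uses an instruction the CPU does not support; the captured `stderr` holds the traceback that `faulthandler` printed.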

@angadsingh
Owner

alright, i just pushed 2 new x86_64 docker images - angadsingh/argos:x86_64 and angadsingh/argos:x86_64_gpu. both use ubuntu as the base (they build on the tensorflow docker image, which is itself based on ubuntu). they work fine on my macbook. try using them (one is a cpu version and one supports using an nvidia GPU). updated the README.

https://hub.docker.com/repository/docker/angadsingh/argos/tags

example runs:

docker run --rm -p8081:8081 -v "/Users/asingh/workspace/pi object detection/argos/configs:/configs" -v "/Users/asingh/workspace/pi object detection/argos/detections:/output_detections" -v ~/.ssh:/root/.ssh angadsingh/argos:x86_64 /usr/src/argos/stream.py --ip 0.0.0.0 --port 8081 --config configs.config_tflite_ssd

INFO:__main__:package import START
INFO:__main__:package import END
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.9p1)
INFO:paramiko.transport:Authentication (publickey) successful!
INFO:__main__:flask init..
INFO:__main__:start reading video file
INFO:__main__:TFObjectDetector init START
 * Serving Flask app "stream" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:input.rtmpstream:rtmp capture init START
INFO:werkzeug: * Running on http://0.0.0.0:8081/ (Press CTRL+C to quit)
2021-01-25 05:46:00.521889: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-01-25 05:46:00.521976: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:input.rtmpstream:rtmp capture init END
INFO:__main__:TFObjectDetector init END
INFO:__main__:detect_objects init..
INFO:detection.door_detect:door state changed: DoorStates.DOOR_CLOSED
INFO:detection.door_detect:motion state changed: MotionStates.NO_MOTION
INFO:lib.ha_webhook:DoorStates.DOOR_CLOSED
INFO:__main__:od=0.00/md=0.00/st=0.00 fps
INFO:detection.door_detect:stateHistory: [DoorStates.DOOR_CLOSED[0], MotionStates.NO_MOTION[0]]
INFO:lib.ha_webhook:MotionStates.NO_MOTION
INFO:detection.door_detect:stateHistory: [DoorStates.DOOR_CLOSED[1], MotionStates.NO_MOTION[1]]
INFO:detection.door_detect:stateHistory: [DoorStates.DOOR_CLOSED[2], MotionStates.NO_MOTION[2]]
INFO:__main__:od=0.00/md=5.00/st=76.00 fps
INFO:detection.door_detect:stateHistory: [DoorStates.DOOR_CLOSED[3], MotionStates.NO_MOTION[3]]
Pinsights-MacBook-Pro:argos asingh$ docker run --rm -p8081:8081 -v "/Users/asingh/workspace/pi object detection/argos/configs:/configs" -v "/Users/asingh/workspace/pi object detection/argos/detections:/output_detections" -v ~/.ssh:/root/.ssh angadsingh/argos:x86_64_gpu /usr/src/argos/stream.py --ip 0.0.0.0 --port 8081 --config configs.config_tflite_ssd
INFO:__main__:package import START
INFO:__main__:package import END
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.9p1)
INFO:paramiko.transport:Authentication (publickey) successful!
INFO:__main__:flask init..
INFO:__main__:start reading video file
INFO:__main__:TFObjectDetector init START
 * Serving Flask app "stream" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
INFO:input.rtmpstream:rtmp capture init START
INFO:werkzeug: * Running on http://0.0.0.0:8081/ (Press CTRL+C to quit)
2021-01-25 05:52:31.608788: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:detection.door_detect:stateHistory: []
INFO:input.rtmpstream:rtmp capture init END
INFO:__main__:TFObjectDetector init END
INFO:__main__:detect_objects init..
INFO:detection.door_detect:door state changed: DoorStates.DOOR_CLOSED
INFO:detection.door_detect:motion state changed: MotionStates.NO_MOTION
INFO:lib.ha_webhook:DoorStates.DOOR_CLOSED
INFO:__main__:od=0.00/md=0.00/st=0.00 fps
INFO:lib.ha_webhook:MotionStates.NO_MOTION
INFO:detection.door_detect:stateHistory: [DoorStates.DOOR_CLOSED[0], MotionStates.NO_MOTION[0]]

@edalquist
Contributor Author

Thanks! I'll get a docker vm setup tomorrow and give it a try.

@angadsingh
Owner

regarding your issue, i think it's this:
tensorflow/tensorflow#17411
https://stackoverflow.com/questions/49094597/illegal-instruction-core-dumped-after-running-import-tensorflow

what machine are you running this on? is it a NAS? your CPU might not have AVX instructions. tensorflow apparently uses them since 1.6 (but we need 2.x so we can't downgrade to 1.5), and in that case you'll have to build tensorflow from source.

try out the different tensorflow docker versions from here and see which one works on your machine: https://hub.docker.com/r/tensorflow/tensorflow/tags (install an image and then just run python and do import tensorflow)
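
The AVX question above can be checked directly on Linux by reading the `flags` line of `/proc/cpuinfo`. The sketch below assumes an x86 Linux host or container; `parse_cpu_flags` and `read_cpu_flags` are hypothetical helpers, and `avx2` and `fma` are listed only for comparison, since the thread names AVX alone.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def read_cpu_flags(path="/proc/cpuinfo"):
    # Linux on x86 only; other platforms expose CPU features differently.
    try:
        with open(path) as f:
            return parse_cpu_flags(f.read())
    except OSError:
        return set()


flags = read_cpu_flags()
# "avx" is the feature named in the thread; the other two are assumptions
# added for comparison.
for feature in ("avx", "avx2", "fma"):
    print(feature, "yes" if feature in flags else "no")
```

Running this inside the container shows the flags it actually sees, which can differ from what the physical CPU supports when a virtualization layer masks features.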

@edalquist
Contributor Author

Ah, this is an old Xeon E5 server. I can try moving the container over to a Ryzen machine and see if it is happier.

@edalquist
Contributor Author

That was it! Moved from a Xeon E5-2670 to a Ryzen 9 3900X and it works!

3 participants