This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

can not install mxnet on jetson nano, jetpack 4.3, mxnet 1.4 #17923

Closed
JustinhoCHN opened this issue Mar 27, 2020 · 7 comments

@JustinhoCHN

JustinhoCHN commented Mar 27, 2020

Description

I tried to install mxnet on a Jetson Nano following the mxnet jetson_setup guide. The installation raised no errors, but when I import mxnet in python3 I get "Segmentation fault (core dumped)".

Error Message

root@jetbot:/home/jetbot/Downloads# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Segmentation fault (core dumped)
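
A segfault on import kills the interpreter before any Python traceback is printed. One way to get more detail is to run the import in a child interpreter with the standard-library faulthandler enabled. The following is a minimal sketch, not part of the original report; the helper name `try_import` is hypothetical.

```python
import subprocess
import sys


def try_import(module_name, timeout=120):
    """Import `module_name` in a child interpreter and report how it exited."""
    # -X faulthandler asks CPython to dump the Python-level stack on fatal
    # signals such as SIGSEGV, so the crash site inside the import is visible.
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    code, err = try_import("mxnet")
    # On POSIX, a negative return code means the child was terminated by that
    # signal number (for example -11 for SIGSEGV).
    print("exit code:", code)
    print(err)
```

Because the import runs in a subprocess, the calling session survives the crash and can still print the captured stderr.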

To Reproduce

My installation procedure:

  1. install dependencies:
sudo apt update
sudo apt -y install \
                        build-essential \
                        git \
                        graphviz \
                        libatlas-base-dev \
                        libopencv-dev \
                        python3-pip

sudo pip3 install --upgrade \
                        pip \
                        setuptools

sudo pip3 install \
                        graphviz==0.8.4 \
                        jupyter \
                        numpy==1.18.2

This differs slightly from the official guide: I use pip3 because I need python3, and the numpy version is 1.18.2 rather than 1.15.2 because a later pip install suggested it.

  2. clone the repo
git clone --recursive https://github.com/apache/incubator-mxnet.git mxnet
  3. environment
export PATH=/usr/local/cuda/bin:$PATH
export MXNET_HOME=$HOME/mxnet/
export PYTHONPATH=$MXNET_HOME/python:$PYTHONPATH
  4. check cuda version
nvcc --version

we can get:

Copyright (c) 2005-2019 NVIDIA Corporation
Built on Mon_Mar_11_22:13:24_CDT_2019
Cuda compilation tools, release 10.0, V10.0.326
  5. download the whl file and libmxnet.so
wget -c https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.0/mxnet-1.4.0-cp36-cp36m-linux_aarch64.whl  # I'm using python3.6

wget -c https://s3.us-east-2.amazonaws.com/mxnet-public/install/jetson/1.4.1/libmxnet.so
mkdir mxnet/lib
mv libmxnet.so mxnet/lib  # put this file in lib
  6. install the whl
sudo pip install mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl
  7. try to import
root@jetbot:/home/jetbot/Downloads# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Segmentation fault (core dumped)
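
Note that the download step fetches a cp36 wheel while the install step names a cp27-cp27mu file and uses pip rather than pip3. The thread does not say whether that is a typo in the report or what was actually run. The sketch below shows one way to check a wheel's Python tag against an interpreter version before installing. It reads only the filename, following the PEP 427 naming scheme, and it ignores the ABI and platform tags that pip also checks. The function names are hypothetical.

```python
def wheel_python_tag(filename):
    """Return the Python tag from a wheel filename (PEP 427 naming)."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    parts = stem.split("-")
    # Layout: name-version[-build]-python-abi-platform, so the last three
    # dash-separated fields are the tags.
    if len(parts) < 5:
        raise ValueError("not a wheel filename: " + filename)
    return parts[-3]


def matches_interpreter(filename, major, minor):
    """Rough check: does the wheel's Python tag cover Python major.minor?"""
    candidates = {
        "cp%d%d" % (major, minor),
        "py%d%d" % (major, minor),
        "py%d" % major,
    }
    # Compressed tag sets such as "py2.py3" are separated by dots.
    return any(tag in candidates for tag in wheel_python_tag(filename).split("."))


if __name__ == "__main__":
    for name in (
        "mxnet-1.4.0-cp36-cp36m-linux_aarch64.whl",
        "mxnet-1.4.0-cp27-cp27mu-linux_aarch64.whl",
    ):
        print(name, matches_interpreter(name, 3, 6))
```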

What have you tried to solve it?

  1. Tried to build from source; it took 2 hours and exited with an error (sorry, I forgot to log the error output).
  2. Reinstalled the OS, reflashed the image, and tried the jetson_setup guide again.

Environment

We recommend using our script for collecting the diagnostic information. Run the following command and paste the outputs below:

curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python

outputs here

----------Python Info----------
Version      : 3.6.9
Compiler     : GCC 8.3.0
Build        : ('default', 'Nov  7 2019 10:44:02')
Arch         : ('64bit', 'ELF')
------------Pip Info-----------
Version      : 20.0.2
Directory    : /usr/local/lib/python3.6/dist-packages/pip
----------MXNet Info-----------
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Linux-4.9.140-tegra-aarch64-with-Ubuntu-18.04-bionic
system       : Linux
node         : jetbot
release      : 4.9.140-tegra
version      : #1 SMP PREEMPT Mon Dec 9 22:47:42 PST 2019
----------Hardware Info----------
machine      : aarch64
processor    : aarch64
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
Vendor ID:           ARM
Model:               1
Model name:          Cortex-A57
Stepping:            r1p1
CPU max MHz:         1479.0000
CPU min MHz:         102.0000
BogoMIPS:            38.40
L1d cache:           32K
L1i cache:           48K
L2 cache:            2048K
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0066 sec, LOAD: 1.6708 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0013 sec, LOAD: 1.3129 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0048 sec, LOAD: 0.5813 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0032 sec, LOAD: 0.3769 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0027 sec, LOAD: 0.2090 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0038 sec, LOAD: 0.3148 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0026 sec, LOAD: 4.6375 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0038 sec, LOAD: 3.1302 sec.

Additional info:

  • image version: jetpack 4.3
  • cuda: 10.0
  • python: 3.6
@JustinhoCHN
Author

2020.03.30 update:
I followed the Docker cross-compile method, but the build failed. Error log:

aarch64-unknown-linux-gnueabi-g++: internal compiler error: Killed (program cc1plus)
0x55a661ee7e24 execute
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:2854
0x55a661ee8174 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:4658
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661ee870b do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5427
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
0x55a661ee88c9 do_spec_1
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5312
0x55a661eeabc3 process_brace_body
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5941
0x55a661eeabc3 handle_braces
	/dockcross/crosstool/toolchain/.build/src/gcc-4.9.4/gcc/gcc.c:5855
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
make: *** [build/src/operator/numpy/np_percentile_op.o] Error 4
make: *** Waiting for unfinished jobs....
Makefile:569: recipe for target 'build/src/operator/numpy/np_percentile_op.o' failed
2020-03-30 11:15:50,474 - root - INFO - Waiting for status of container 7ee922ca33a2 for 600 s.
2020-03-30 11:15:50,659 - root - INFO - Container exit status: {'Error': None, 'StatusCode': 2}
2020-03-30 11:15:50,659 - root - ERROR - Container exited with an error 
2020-03-30 11:15:50,659 - root - INFO - Executed command for reproduction:

ci/build.py -p jetson

2020-03-30 11:15:50,659 - root - INFO - Stopping container: 7ee922ca33a2
2020-03-30 11:15:50,660 - root - INFO - Removing container: 7ee922ca33a2
2020-03-30 11:15:50,723 - root - INFO - Other running containers: ['e9ad8fc19642', '6acad0d82e1c', '5132086269ff', 'a6249b881f24', 'f0ad24aeb876', '6bb047a698b7', '9e6f0d904a4b', '241f1bc44984', '19ff0cea2fa5', '868864fd7dd7']
2020-03-30 11:15:50,723 - root - CRITICAL - Execution of ['/work/mxnet/ci/docker/runtime_functions.sh', 'build_jetson'] failed with status: 2

Build platform: x86 PC, i7-9700, RTX 2080 Ti
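
The line "internal compiler error: Killed (program cc1plus)" means the compiler process was killed from outside rather than crashing by itself. A common cause is the kernel's out-of-memory killer during a highly parallel build, but the thread does not confirm that this is what happened here. Assuming memory is the limit, the sketch below estimates a conservative number of parallel compile jobs from /proc/meminfo. The 2 GiB per job figure is an assumed budget, not a measured one, and how the job count is passed to the build is deliberately left out because the log does not show it.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo text into {field: int} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def suggest_jobs(mem_available_kb, cpu_count, kb_per_job=2 * 1024 * 1024):
    """Cap parallel jobs so each gets about kb_per_job of memory (assumed budget)."""
    by_memory = mem_available_kb // kb_per_job
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    path = "/proc/meminfo"  # Linux only
    if os.path.exists(path):
        with open(path) as handle:
            info = parse_meminfo(handle.read())
        available = info.get("MemAvailable", info.get("MemFree", 0))
        print("suggested jobs:", suggest_jobs(available, os.cpu_count() or 1))
```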

@JustinhoCHN
Author

Guys, I solved the problem, following advice from the NVIDIA Jetson forums:

  1. Download the mxnet-1.6 whl for jetson: link

sudo pip3 install numpy==1.16
sudo pip3 install mxnet-1.6.0-py3-none-any.whl
  2. Switch to the root account! That was important for me:
sudo su
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/mxnet
root@jetbot:/home/jetbot# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
>>> mx.__version__
'1.6.0'

If I run python3 under the jetbot account, it throws a segmentation fault; it seems I had forgotten to switch to root.
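
The export above only affects the shell it is run in and the processes started from it, so a python3 launched from another account's shell will not see it. Whether that fully explains the segfault under the jetbot account is not confirmed in the thread. As a quick check, this sketch lists the LD_LIBRARY_PATH entries visible to the current process and reports which of them contain libmxnet.so. The helper names are hypothetical.

```python
import os


def library_dirs(env):
    """Split LD_LIBRARY_PATH from a mapping like os.environ; empty entries are skipped."""
    raw = env.get("LD_LIBRARY_PATH", "")
    return [entry for entry in raw.split(":") if entry]


def dirs_containing(env, filename="libmxnet.so"):
    """Return the LD_LIBRARY_PATH directories that contain `filename`."""
    return [
        directory
        for directory in library_dirs(env)
        if os.path.isfile(os.path.join(directory, filename))
    ]


if __name__ == "__main__":
    print("LD_LIBRARY_PATH entries:", library_dirs(os.environ))
    print("entries with libmxnet.so:", dirs_containing(os.environ))
```

Running it once as jetbot and once as root should show whether /usr/local/mxnet is on the path in both sessions.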

@bhavitvyamalik

bhavitvyamalik commented May 14, 2020

@JustinhoCHN what was the CUDA version of your Jetson Nano image? It gives me the following error:

>>> import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/mxnet/__init__.py", line 24, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/usr/local/lib/python3.6/dist-packages/mxnet/context.py", line 24, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/usr/local/lib/python3.6/dist-packages/mxnet/base.py", line 214, in <module>
    _LIB = _load_lib()
  File "/usr/local/lib/python3.6/dist-packages/mxnet/base.py", line 205, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcudart.so.10.0: cannot open shared object file: No such file or directory

My Jetson Nano image has CUDA 10.2
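
The OSError shows that this mxnet build asks the dynamic loader for libcudart.so.10.0 specifically. To see which CUDA runtime the loader can actually find on an image, one option is to probe candidate sonames with ctypes, as in this sketch. The name libcudart.so.10.0 comes from the traceback; libcudart.so.10.2 is assumed by analogy for a CUDA 10.2 image, and the helper names are hypothetical.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic loader can open `lib_name`."""
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


def probe(names):
    """Map each candidate library name to whether it could be loaded."""
    return {name: can_load(name) for name in names}


if __name__ == "__main__":
    candidates = ["libcudart.so.10.0", "libcudart.so.10.2"]
    for name, ok in probe(candidates).items():
        print(name, "loadable" if ok else "not found")
```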

@JustinhoCHN
Author

Mine is CUDA 10.0, JetPack 4.3.

@bhavitvyamalik

bhavitvyamalik commented May 14, 2020

Do you still have the image which you used on your Jetson Nano? Nvidia keeps on updating their images regularly and I ended up with CUDA 10.2 :/

Edit: This works?
[image]

@JustinhoCHN
Author

yes it is

@JustinhoCHN
Author

I changed the OS to JetPack 4.4 DP and hit the same issue as you. I tried installing CUDA 10.0 on JetPack 4.4, but importing mxnet just quits with "Segmentation fault". I'm working on compiling a CUDA 10.2 Python whl for Jetson. I'll let you know if I succeed.
