🐛 [Bug] Segmentation Fault when running on Jetson Orin NX #1891

@janblumenkamp

Bug Description

I built Torch-TensorRT for the Jetson Orin NX. I followed the instructions here and am building on the Jetson from the pyt2.0 branch, which uses the Torch-TensorRT 1.4.0 RC.

To Reproduce

I use the test script from here. When I try to run it, I get the following output:

/usr/local/lib/python3.8/dist-packages/torch_tensorrt/fx/tracer/acc_tracer/acc_ops.py:840: UserWarning: Unable to import torchvision related libraries.: No module named 'torchvision'. Please install torchvision lib in order to lower stochastic_depth
  warnings.warn(
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Unable to determine GPU memory usage
WARNING: [Torch-TensorRT TorchScript Conversion Context] - Unable to determine GPU memory usage
WARNING: [Torch-TensorRT TorchScript Conversion Context] - CUDA initialization failure with error: 222. Please check your CUDA installation:  http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
Segmentation fault (core dumped)
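For reference, the CUDA error code 222 in the log above can be decoded to a symbolic name. A minimal sketch, assuming CUDA 11.x error numbering (the mapping and the helper name are assumptions; verify against your local CUDA headers):

```python
# Hypothetical helper to decode the CUDA error code reported in the log.
# Codes taken from CUDA 11.x headers (assumption; check your local
# /usr/local/cuda/include/driver_types.h to confirm).
CUDA_ERROR_NAMES = {
    100: "cudaErrorNoDevice",
    221: "cudaErrorJitCompilerNotFound",
    222: "cudaErrorUnsupportedPtxVersion",
    223: "cudaErrorJitCompilationDisabled",
}

def decode_cuda_error(code: int) -> str:
    """Return the symbolic name for a CUDA error code, if known."""
    return CUDA_ERROR_NAMES.get(code, f"unknown error ({code})")

print(decode_cuda_error(222))  # cudaErrorUnsupportedPtxVersion
```

If the code really maps to cudaErrorUnsupportedPtxVersion, that would point at a mismatch between the CUDA toolkit the extension was compiled against and the driver/runtime present on the Jetson.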

Expected behavior

The test model is converted successfully.

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 1.4.0 (pyt2.0 branch ec06d6f)
  • PyTorch Version (e.g. 1.0): 2.0.0a0+8aa34602.nv23.03 (installed from NVIDIA as instructed here)
  • CPU Architecture: aarch64
  • OS (e.g., Linux): Linux-5.10.104-tegra-aarch64-with-glibc2.29
  • How you installed PyTorch (conda, pip, libtorch, source): NVIDIA-compiled wheel; torch works on CUDA.
  • Build command you used (if compiling from source):
    • bazel build //:libtorchtrt --platforms //toolchains:jetpack_5.0
    • python3 setup.py install --use-cxx11-abi --jetpack-version 5.0
  • Are you using local sources or building from archives: Local sources
  • Python version: 3.8.10 (default, Mar 13 2023, 10:26:41) [GCC 9.4.0] (64-bit runtime)
  • CUDA version: 11.4 (from the collect_env output below)
  • GPU models and configuration: Jetson Orin NX
  • Any other relevant information: N/A
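Since the failure happens at CUDA initialization, it is worth checking which CUDA/TensorRT shared libraries the Python process would resolve. A stdlib-only diagnostic sketch (the library names are the usual Jetson ones and may differ on other setups):

```python
import ctypes.util

# Probe the dynamic loader for the CUDA and TensorRT shared libraries.
# find_library returns the resolved soname string, or None if the
# library cannot be found on the loader's search path.
def probe_libraries(names):
    results = {}
    for name in names:
        resolved = ctypes.util.find_library(name)
        results[name] = resolved
        print(f"lib{name}: {resolved if resolved else 'not found on loader path'}")
    return results

probe_libraries(("cudart", "cuda", "nvinfer"))
```

On a healthy JetPack 5.x install, libcudart and libnvinfer should both resolve; a missing or mismatched soname here would be consistent with the initialization failure above.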

Additional context

Output of python3 -m torch.utils.collect_env:

nvidia@tegra-ubuntu:~$ python3 -m torch.utils.collect_env
Collecting environment information...
PyTorch version: 2.0.0a0+8aa34602.nv23.03
Is debug build: False
CUDA used to build PyTorch: 11.4
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.5 LTS (aarch64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.8.10 (default, Mar 13 2023, 10:26:41)  [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.10.104-tegra-aarch64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/aarch64-linux-gnu/libcudnn.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_adv_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_cnn_train.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_infer.so.8.6.0
/usr/lib/aarch64-linux-gnu/libcudnn_ops_train.so.8.6.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: False

CPU:
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          8
On-line CPU(s) list:             0-7
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       2
Vendor ID:                       ARM
Model:                           1
Model name:                      ARMv8 Processor rev 1 (v8l)
Stepping:                        r0p1
CPU max MHz:                     1984.0000
CPU min MHz:                     115.2000
BogoMIPS:                        62.50
L1d cache:                       512 KiB
L1i cache:                       512 KiB
L2 cache:                        2 MiB
L3 cache:                        4 MiB
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp uscat ilrcpc flagm

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.19.4
[pip3] torch==2.0.0a0+8aa34602.nv23.3
[pip3] torch-tensorrt==1.3.0+3d6a1ba5
[conda] Could not collect

Labels

bug, platform: aarch64
