Checklist
I've searched other issues and no duplicate issues were found.
I'm convinced that this is not my fault but a bug.
Description
Inside the ghcr.io/autowarefoundation/autoware-openadk:latest-devel-cuda container, I'm trying to use the tensorrt_yolox package. The package includes some CUDA kernels, which fail to build, and the build shows the following warning:
--- stderr: tensorrt_yolox
CMake Warning at CMakeLists.txt:19 (message):
CUDA is not found. preprocess acceleration using CUDA will not be
available.
It seems that the CMake variable CMAKE_CUDA_COMPILER is not set.
Then, when running tensorrt_yolox for object detection, the node crashes with the following error:
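A quick way to narrow this down is to check whether a CUDA compiler is visible inside the container at all. The sketch below is only a diagnostic and assumes the toolkit, if installed, lives under /usr/local/cuda (NVIDIA's default prefix); the `nvcc_status` variable is a name made up for this example.

```shell
# Look for nvcc on PATH first, then under the assumed default prefix.
nvcc_path="$(command -v nvcc || true)"
if [ -z "$nvcc_path" ] && [ -x /usr/local/cuda/bin/nvcc ]; then
  nvcc_path=/usr/local/cuda/bin/nvcc
fi

if [ -n "$nvcc_path" ]; then
  nvcc_status=found
  echo "nvcc found at $nvcc_path"
  # Print the toolkit version so it can be added to the report.
  "$nvcc_path" --version | tail -n 2
else
  nvcc_status=missing
  echo "nvcc not found on PATH or under /usr/local/cuda/bin"
fi
```

If nvcc is reported as found and the warning still appears, the compiler itself is probably not the missing piece, and something it depends on may be.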
[tensorrt_yolox_node_exe-2] /home/os/elm/autoware/install/tensorrt_yolox/lib/tensorrt_yolox/tensorrt_yolox_node_exe: symbol lookup error: /home/os/elm/autoware/install/tensorrt_yolox/lib/libtensorrt_yolox.so: undefined symbol: _ZN14tensorrt_yolox50resize_bilinear_letterbox_nhwc_to_nchw32_batch_gpuEPfPhiiiiiiifP11CUstream_st
[ERROR] [tensorrt_yolox_node_exe-2]: process has died [pid 977, exit code 127, cmd '/home/os/elm/autoware/install/tensorrt_yolox/lib/tensorrt_yolox/tensorrt_yolox_node_exe --ros-args -r __node:=tensorrt_yolox --params-file /tmp/launch_params_d1ll7q3z --params-file /tmp/launch_params_cq_ya7ic -r ~/in/image:=/fr_camera/image_rect -r ~/out/objects:=roi0'].
The missing symbol belongs to the CUDA preprocessing code that was skipped during the build, so the library references it without defining it.
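To confirm which function the mangled name refers to, and that the installed library really leaves it undefined, something like the following can be run. It is a sketch that assumes binutils (`c++filt`, `nm`) is installed; the library path is the one from the log above and can be overridden through the `LIB` variable, which is a name chosen for this example.

```shell
# Mangled symbol copied from the runtime error above.
sym='_ZN14tensorrt_yolox50resize_bilinear_letterbox_nhwc_to_nchw32_batch_gpuEPfPhiiiiiiifP11CUstream_st'
# Path taken from the error log; adjust it to your workspace.
lib="${LIB:-/home/os/elm/autoware/install/tensorrt_yolox/lib/libtensorrt_yolox.so}"

# Demangle the name into a readable C++ signature.
if command -v c++filt >/dev/null 2>&1; then
  printf '%s\n' "$sym" | c++filt
fi

# List dynamic symbols the library references but does not define,
# and look for the one from the error message.
if [ -f "$lib" ] && command -v nm >/dev/null 2>&1; then
  nm -D --undefined-only "$lib" | grep -F "$sym" \
    || echo "symbol is not listed as undefined in $lib"
else
  echo "skipping nm check (library or nm not available): $lib"
fi
```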
Expected behavior
The OpenADK Docker image should include CUDA support and be able to build tensorrt_yolox properly. With the CUDA kernels built, the runtime error about the missing symbol would no longer occur.
Actual behavior
tensorrt_yolox builds with a warning and skips building the CUDA kernels, which leads to a runtime crash later.
Steps to reproduce
Inside the ghcr.io/autowarefoundation/autoware-openadk:latest-devel-cuda container, build the package:
colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-select tensorrt_yolox
You should notice the CMake warning mentioned above.
Then launch an object detection model, for example:
ros2 launch tensorrt_yolox yolox_s_plus_opt.launch.xml input/image:=/img output/objects:=/roi0
Once you subscribe to the output topic, you should get the runtime error mentioned above:
ros2 topic echo /roi0
Versions
No response
Possible causes
After some investigation, I tried building the official CUDA Samples to track down the issue. The link step showed that some CUDA libraries are missing:
/usr/bin/ld: cannot find -lcudadevrt
/usr/bin/ld: cannot find -lcudart_static
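The two linker errors can be checked directly by searching for the corresponding static archives. This is a diagnostic sketch only, not the patch referred to in this report. It assumes the toolkit prefix is /usr/local/cuda unless `CUDA_ROOT` is set, and `check_cuda_static_libs` is a helper name invented for this example.

```shell
# Report which of the two static archives exist under a CUDA prefix.
# The exit status is the number of archives that are missing (0, 1 or 2).
check_cuda_static_libs() {
  cuda_root="$1"
  missing=0
  for lib in libcudart_static.a libcudadevrt.a; do
    # -L follows symlinks such as /usr/local/cuda -> /usr/local/cuda-X.Y
    hit="$(find -L "$cuda_root" -name "$lib" 2>/dev/null | head -n 1)"
    if [ -n "$hit" ]; then
      echo "found:   $hit"
    else
      echo "missing: $lib"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

check_cuda_static_libs "${CUDA_ROOT:-/usr/local/cuda}" \
  || echo "at least one static CUDA library is missing"
```

A "missing" line for either archive matches the `cannot find -lcudadevrt` and `cannot find -lcudart_static` errors above.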
After applying the following patch and rebuilding the Docker image, the CUDA kernels were built and the object detection model ran correctly.
Additional context
No response