segfault on re-init pipeline #182

Open
xback opened this issue Jun 1, 2022 · 11 comments

Comments

@xback

xback commented Jun 1, 2022

Hi,
I've written a few C functions that construct a pipeline using nvinfer and this YOLO lib.

When I:

  • Create a pipeline
  • Run it (the engine gets created the first time and everything works)
  • Clean up the pipeline
  • Re-create the pipeline
  • Run it (crash!)

the following crash happens 100% of the time:

image

Disassembly:

image

@marcoslucianops
Owner

Did you change anything in the nvdsinfer_custom_impl_Yolo files? Which Ubuntu, CUDA, TensorRT and DeepStream versions are you using? Which model are you using?

@xback
Author

xback commented Jun 7, 2022

Nothing was changed.

  • Ubuntu 18.04 (customized)
  • JetPack 4.6.2 (L4T 32.7.2)
  • CUDA: 10.2
  • DeepStream 6.0.1
  • Jetson TX2
  • CTI Spacely carrier board using their BSP

@xback
Author

xback commented Jun 7, 2022

It looks like there are lots of other open issues as well, regarding the Argus daemon and similar components.
Most people run a single process that creates a single pipeline once, and restart the process if something goes wrong.
And as a bonus, NVIDIA staff on the forum kind of expect it to be used this way 🙄

In my case, I create one process and create/clean up multiple pipelines in parallel (one thread per pipeline) without ever killing the process to start over.

Having ancient kernel and GStreamer versions on NVIDIA's side doesn't help either.

@marcoslucianops
Owner

I use a dynamic pipeline for my commercial projects without any issue. But I keep the pipeline running and change the input and output elements (adding or removing them).

@marmikshah

marmikshah commented Oct 20, 2022

Hey, I am facing the exact same issue. Was there any fix for this?
I have a YOLOv4 model that I need to run on about 2k video files, so I clean up and re-init the pipeline for each file.
The segfault happens after about 40 files. (If this helps, each file is about 30-45 seconds long.)

When I switch to https://github.com/NVIDIA-AI-IOT/yolov4_deepstream, it seems to work without any issues.

My setup is:

Ubuntu 20.04
DeepStream 6.0.1
CUDA: 11.4
GPU: RTX 2080Ti (And also tested on RTX 3090)

Thanks.
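
For a batch job like the one described above, one possible workaround (in line with the "restart the process if something occurs" approach mentioned earlier in the thread) is to run each file in its own child process, so a segfault inside the inference libraries only kills that child. The sketch below is a minimal illustration using only the Python standard library; the worker script name `run_pipeline.py` and the helper names `run_isolated` and `process_all` are hypothetical and not part of this repository or DeepStream. It does not fix the underlying crash.

```python
import signal
import subprocess
import sys


def run_isolated(cmd, timeout=None):
    """Run one job in a child process and report how it ended.

    Returns (ok, detail). On POSIX, a negative return code means the
    child was terminated by a signal (e.g. -11 for SIGSEGV).
    """
    proc = subprocess.run(cmd, timeout=timeout)
    if proc.returncode == 0:
        return True, "ok"
    if proc.returncode < 0:
        return False, f"killed by {signal.Signals(-proc.returncode).name}"
    return False, f"exit code {proc.returncode}"


def process_all(video_paths, worker_script="run_pipeline.py"):
    """Process each file in a fresh interpreter; collect the failures.

    `worker_script` is a hypothetical script that builds the pipeline,
    processes the single file given as its argument, and exits.
    """
    failed = []
    for path in video_paths:
        ok, detail = run_isolated([sys.executable, worker_script, path])
        if not ok:
            failed.append((path, detail))
    return failed
```

The trade-off is that every child pays the full start-up cost (including engine deserialization), so this may only be acceptable when each file takes much longer to process than the pipeline takes to initialize.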

@marcoslucianops
Owner

Can you check the RAM usage while your code is running? If possible, run it under gdb to debug the segmentation fault.
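
One way to check host RAM usage from inside a Python application is to log the process's resident set size between pipeline re-initializations. The sketch below is a minimal, Linux-only illustration that reads `/proc/self/status`; the helper names `rss_kib` and `log_rss` are hypothetical. It covers host memory only, not GPU memory.

```python
def rss_kib(pid="self"):
    """Return the resident set size of a process in KiB (Linux only).

    Reads the VmRSS field from /proc/<pid>/status; returns None if the
    field is missing.
    """
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                # Line format: "VmRSS:     123456 kB"
                return int(line.split()[1])
    return None


def log_rss(label, out=print):
    """Report the current RSS with a label, e.g. once per processed file."""
    kib = rss_kib()
    out(f"[{label}] RSS = {kib} KiB")
    return kib
```

Calling `log_rss(f"file {i}")` after each cleanup would show whether the value climbs steadily across re-inits or stays flat.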

@marmikshah

Hey,
RAM usage is constant throughout the run until the segfault occurs. GPU memory usage is also constant and does not increase, so it does not really look like a memory leak.

I ran it under gdb, and it segfaults with this error:

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x00007fb818d550fc in ?? () from /usr/lib/x86_64-linux-gnu/libnvinfer.so.8

Are there any specific gdb flags you want me to try?
In case this is of any importance, my code is in Python and not C++.

Thanks.
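
Since the application above is written in Python, the standard library's `faulthandler` module can complement gdb: it dumps the Python-level traceback of every thread when the process receives SIGSEGV, which shows which Python call was active at the time of the crash. The sketch below is a minimal illustration; the helper name `enable_crash_traceback` and the log file name are hypothetical. It does not show native frames inside `libnvinfer.so`, so a gdb backtrace is still needed for those.

```python
import faulthandler


def enable_crash_traceback(log_path="crash_traceback.log"):
    """Write Python tracebacks of all threads to `log_path` on a fatal signal.

    faulthandler handles SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    The returned file object must stay open (keep a reference to it)
    for as long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this once at program start, before the first pipeline is built, is enough; the handler stays installed across pipeline re-inits.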

@thunder95

@marmikshah I ran into the same problem. Did you solve it?

CUDA: 11.7
DeepStream: 6.1

@marcoslucianops this is my log:
#0 0x00007fffee05b470 in ?? () from /lib/x86_64-linux-gnu/libnvinfer.so.8
#1 0x00007fffc0252d91 in getNumChannels(nvinfer1::ITensor*) () from /home/nvidia/dps-app/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
#2 0x00007fffc02564e3 in Yolo::buildYoloNetwork(std::vector<float, std::allocator<float> >&, nvinfer1::INetworkDefinition&) ()
from /home/nvidia/dps-app/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
#3 0x00007fffc0255ce9 in Yolo::parseModel(nvinfer1::INetworkDefinition&) () from /home/nvidia/dps-app/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
#4 0x00007fffc0255906 in Yolo::createEngine(nvinfer1::IBuilder*, nvinfer1::IBuilderConfig*) () from /home/nvidia/dps-app/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
#5 0x00007fffc02690ca in NvDsInferYoloCudaEngineGet () from /home/nvidia/dps-app/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
#6 0x00007fffc1c22603 in ?? () from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#7 0x00007fffc1c29a6b in ?? () from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#8 0x00007fffc1c00463 in nvdsinfer::NvDsInferContextImpl::buildModel(_NvDsInferContextInitParams&) () from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#9 0x00007fffc1c00d63 in nvdsinfer::NvDsInferContextImpl::generateBackendContext(_NvDsInferContextInitParams&) () from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#10 0x00007fffc1c057cd in nvdsinfer::NvDsInferContextImpl::initialize(_NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) ()
from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#11 0x00007fffc1c060ce in createNvDsInferContext(INvDsInferContext**, _NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*)) ()
from ///opt/nvidia/deepstream/deepstream-6.1/lib/libnvds_infer.so
#12 0x00007fffc1ca57a3 in ?? () from /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so
#13 0x00007fffd3e9f931 in ?? () from /lib/x86_64-linux-gnu/libgstbase-1.0.so.0
#14 0x00007fffd3e9fbb5 in ?? () from /lib/x86_64-linux-gnu/libgstbase-1.0.so.0
#15 0x00007ffff32c46fb in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#16 0x00007ffff32c4e18 in gst_pad_set_active () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#17 0x00007ffff32a1d25 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#18 0x00007ffff32b4d3c in gst_iterator_fold () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#19 0x00007ffff32a2546 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#20 0x00007ffff32a455e in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#21 0x00007ffff32a47e8 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#22 0x00007ffff32a69d2 in gst_element_change_state () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#23 0x00007ffff32a7119 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#24 0x00007ffff32831b8 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#25 0x00007ffff32a69d2 in gst_element_change_state () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#26 0x00007ffff32a7119 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#27 0x00007ffff32831b8 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#28 0x00007ffff32a69d2 in gst_element_change_state () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#29 0x00007ffff32a6a1b in gst_element_change_state () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#30 0x00007ffff32a7119 in ?? () from /lib/x86_64-linux-gnu/libgstreamer-1.0.so.0
#31 0x00005555555641df in main ()

@marmikshah

@thunder95 I have not been able to find a solution, so for now I have reverted to using DetectNet instead of YOLO.

@marcoslucianops
Owner

Try to use the new ONNX implementation.

@Beking1949

Hi, has this problem been solved?
