[Question] Zed_wrapper shuts down after some time #674

Closed
NikoBach2 opened this issue Feb 22, 2021 · 10 comments

@NikoBach2

Hi

We are running 7 ZED2 cameras on 7 Jetson TX2 boards with JetPack 4.3. The ROS master is running on an external machine with Ubuntu 16.04.
Lately we have seen the zed_wrapper nodes shut down after some time. The nodes do not necessarily shut down at the same time. It can take anywhere from less than an hour to a couple of days after the last reboot.
It also varies whether it is the publish_state node or the zed_node that shuts down.

We get the following message:

[roslaunch][ERROR] 2021-02-22 12:37:40,620: ================================================================================REQUIRED process [portal_1_camera_0/zed_node-2] has died!
process has died [pid 7475, exit code -9, cmd /home/nvidia/catkin_ws/devel/lib/zed_wrapper/zed_wrapper_node __name:=zed_node __log:=/home/nvidia/.ros/log/a6d0f202-7509-11eb-94e4-98eecb97be3c/portal_1_camera_0-zed_node-2.log].
log file: /home/nvidia/.ros/log/a6d0f202-7509-11eb-94e4-98eecb97be3c/portal_1_camera_0-zed_node-2*.log
Initiating shutdown!
================================================================================ 

Is there a way to tell why the process has died?

Please ask if you need further information.

Best regards, Nikolaj

@NikoBach2 NikoBach2 changed the title [Question] [Question] Zed_wrapper shuts down after some time Feb 22, 2021
@Myzhar
Member

Myzhar commented Feb 22, 2021

Hi @NikoBach2
Is there any information in the log files?
log file: /home/nvidia/.ros/log/a6d0f202-7509-11eb-94e4-98eecb97be3c/portal_1_camera_0-zed_node-2*.log

@NikoBach2
Author

Hi @Myzhar
The only file in that folder is the roslaunch.log file

@Myzhar
Member

Myzhar commented Feb 22, 2021

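As far as I know, exit code -9 means "Out of memory": the process was killed with signal 9 (SIGKILL), which is what the kernel sends when the system runs out of memory.
What kind of task are the ZED nodes performing? Which modules have you activated?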
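
One way to confirm that the kernel's out-of-memory killer was responsible is to search the kernel log for its report lines. The sketch below is only an illustration: it assumes the log contains lines of the form "Killed process <pid> (<name>)", which is how Linux kernels commonly report an OOM kill, and the helper name find_oom_kills is made up for this example.

```python
import re
import subprocess

# Assumed format of the kernel's OOM report line:
#   "Killed process <pid> (<name>) total-vm:..."
_KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]*)\)")


def find_oom_kills(kernel_log_text):
    """Return a list of (pid, process_name) for processes the OOM killer ended."""
    return [(int(pid), name) for pid, name in _KILLED_RE.findall(kernel_log_text)]


if __name__ == "__main__":
    # Reading the kernel log may require root on some systems;
    # treat a missing or failing dmesg as an empty log.
    try:
        proc = subprocess.run(["dmesg"], stdout=subprocess.PIPE,
                              stderr=subprocess.DEVNULL, universal_newlines=True)
        log_text = proc.stdout
    except OSError:
        log_text = ""
    for pid, name in find_oom_kills(log_text):
        print("OOM killer terminated pid %d (%s)" % (pid, name))
```

If the pid from the roslaunch error shows up in this output, the -9 exit was almost certainly caused by memory exhaustion. Note that the kernel truncates process names, so the name may not match the executable exactly.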

@NikoBach2
Author

We are running the zed_wrapper zed2.launch.
Once in a while we start and stop an SVO recording through the zed_node/start_svo_recording and zed_node/stop_svo_recording services. At the moment we record every hour, and in between the zed node just runs without us writing anything.

@Myzhar
Member

Myzhar commented Feb 22, 2021

I suggest you monitor the memory usage to see if the crashes are in some way related to the status of the recording services.
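
A minimal sketch of such monitoring, assuming a Linux system where /proc is available (as on the Jetson boards). It logs the resident memory of one process next to the system-wide available memory, so the samples can be compared with the times at which recordings start and stop. The function names and the 60-second interval are illustrative choices, not part of the ZED wrapper.

```python
import time


def read_kb_fields(path, wanted):
    """Parse 'Key:   12345 kB' lines from a /proc file into {key: kilobytes}."""
    values = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in wanted:
                values[key] = int(rest.split()[0])
    return values


def sample(pid):
    """Return (process resident memory in kB, system MemAvailable in kB)."""
    rss = read_kb_fields("/proc/%d/status" % pid, {"VmRSS"}).get("VmRSS")
    avail = read_kb_fields("/proc/meminfo", {"MemAvailable"}).get("MemAvailable")
    return rss, avail


def monitor(pid, interval_s=60.0, count=None):
    """Print one timestamped sample per interval until the process exits."""
    taken = 0
    while count is None or taken < count:
        try:
            rss, avail = sample(pid)
        except OSError:  # /proc/<pid> disappears when the process dies
            print("process %d is gone" % pid)
            break
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print("%s rss_kb=%s mem_available_kb=%s" % (stamp, rss, avail))
        taken += 1
        time.sleep(interval_s)
```

Calling monitor(pid) with the pid of zed_wrapper_node and redirecting the output to a file gives a simple time series. A steadily growing rss_kb value would point to a leak in the node, while a falling mem_available_kb with a stable rss_kb would point to something else on the board.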

@NikoBach2
Author

I will try to do that.

@NikoBach2
Author

NikoBach2 commented Feb 24, 2021

You were right, it was a memory issue when I ran the zed_wrapper node in debugger mode.
I was using debugger mode to find out why the nodes broke down. Now I have a new error message, with exit code -11.

[roslaunch][ERROR] 2021-02-24 04:00:08,366: ================================================================================REQUIRED process [portal_1_camera_0/zed_node-2] has died!
process has died [pid 3844, exit code -11, cmd /home/nvidia/catkin_ws/devel/lib/zed_wrapper/zed_wrapper_node __name:=zed_node __log:=/home/nvidia/.ros/log/b7cdfa5e-75c3-11eb-927c-98eecb97be3c/portal_1_camera_0-zed_node-2.log].
log file: /home/nvidia/.ros/log/b7cdfa5e-75c3-11eb-927c-98eecb97be3c/portal_1_camera_0-zed_node-2*.log
Initiating shutdown!
================================================================================

And another camera has this error:

[roslaunch][ERROR] 2021-02-23 18:00:02,582: ================================================================================REQUIRED process [portal_1_camera_2/zed_node-2] has died!
process has died [pid 21830, exit code -6, cmd /home/nvidia/catkin_ws/devel/lib/zed_wrapper/zed_wrapper_node __name:=zed_node __log:=/home/nvidia/.ros/log/b7cdfa5e-75c3-11eb-927c-98eecb97be3c/portal_1_camera_2-zed_node-2.log].
log file: /home/nvidia/.ros/log/b7cdfa5e-75c3-11eb-927c-98eecb97be3c/portal_1_camera_2-zed_node-2*.log
Initiating shutdown!
================================================================================

What does that mean?

@NikoBach2
Author

@Myzhar do you have a clue?

@Myzhar
Member

Myzhar commented Mar 2, 2021

The -11 code means "Invalid access to storage" (signal 11, SIGSEGV, i.e. a segmentation fault). Check whether the ROS log folder has filled up the TX2 system storage.
The -6 code means "Abnormal termination" (signal 6, SIGABRT). There can be many causes for this, so you should investigate the status of the TX2 board to better understand what is happening.
I suggest you try jtop, a very useful tool for monitoring the status of NVIDIA Jetson boards.
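
To check whether the log folder is eating the system storage, a small script can compare the size of the folder with the free space on its filesystem. This is only a sketch: the path ~/.ros/log matches the log paths in the error messages above, and the helper names are made up for the example.

```python
import os
import shutil


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished between listing and stat
    return total


def disk_report(path):
    """Return (bytes used by files under path, free bytes on its filesystem)."""
    return dir_size_bytes(path), shutil.disk_usage(path).free


if __name__ == "__main__":
    log_dir = os.path.expanduser("~/.ros/log")
    if os.path.isdir(log_dir):
        used, free = disk_report(log_dir)
        print("%s: %.1f MiB of logs, %.1f MiB free on the filesystem"
              % (log_dir, used / 2.0 ** 20, free / 2.0 ** 20))
```

ROS also ships the rosclean tool, where rosclean check reports the log disk usage and rosclean purge deletes old logs.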

A tip: to learn more about the meaning of the exit codes, you can look at the files signum.h and signum-generic.h in the usr folder.
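
The same lookup can be sketched in Python without opening the header files. roslaunch is written in Python, and Python's subprocess module reports a child killed by signal N as return code -N, so a negative exit code can be mapped back to a signal name with the standard signal module. The function name describe_exit_code is made up for this example.

```python
import signal


def describe_exit_code(code):
    """Translate a subprocess-style return code into a readable description."""
    if code < 0:
        try:
            # Negative codes mean "terminated by signal number -code".
            return "killed by %s" % signal.Signals(-code).name
        except ValueError:
            return "killed by unknown signal %d" % -code
    return "exited with status %d" % code


if __name__ == "__main__":
    for code in (-9, -11, -6):
        print("%d: %s" % (code, describe_exit_code(code)))
```

For the three codes in this thread this gives SIGKILL for -9, SIGSEGV for -11, and the abort signal for -6.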

@NikoBach2
Author

Thank you very much. I will try looking at it. 👍
