Multiple RS (D435) cameras couldn't work consistently every time when launched by roslaunch #1734

Closed
Austin-Hu opened this issue Mar 3, 2021 · 15 comments

@Austin-Hu

Hardware Configuration:

  • Intel Tiger Lake (with i5-1135G7)
  • RealSense D435 x 2

Software Configuration:

How to reproduce:

  1. roslaunch realsense2_camera rs_multiple_devices.launch serial_no_camera1:=$SN_1 serial_no_camera2:=$SN_2;
  2. Ctrl + C to terminate;
  3. Repeat Step 1 & 2.

The issue occurs randomly, with the following log:

[ERROR] [1614725788.584507682]: An exception has been thrown: failed to set power state
[ERROR] [1614725788.584532752]: Exception: failed to set power state
 02/03 22:56:28,584 ERROR [140219894437632] (handle-libusb.h:95) failed to claim usb interface: 0, error: RS2_USB_STATUS_BUSY
 02/03 22:56:28,584 ERROR [140219919615744] (sensor.cpp:523) acquire_power failed: failed to set power state
 02/03 22:56:28,584 WARNING [140219919615744] (rs.cpp:372) null pointer passed for argument "list"
@Austin-Hu
Author

BTW, I also got the random issue, with the latest development commit of realsense-ros:

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Mar 3, 2021

Hi @Austin-Hu I note that you are using kernel 5.9. The librealsense SDK is currently only validated to use Ubuntu kernels up to 5.4. So there may be unpredictable consequences from using an officially unsupported newer kernel such as 5.9.


An alternative approach to ROS multicam that you could try in order to test whether it produces improved reliability for you is to launch each camera from a separate terminal using roslaunch realsense2_camera rs_camera.launch camera:=cam_1 serial_no:=$SN_1 for the first camera and roslaunch realsense2_camera rs_camera.launch camera:=cam_2 serial_no:=$SN_2 for the second camera in the other terminal.

https://github.com/IntelRealSense/realsense-ros/wiki/Showcase-of-using-2-cameras

You could also try adding the command initial_reset:=true to the end of the cam1 and cam2 roslaunch statements to perform a reset of the camera at launch to see if that helps with the Failed to set power state error.
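The per-camera launch suggested above could be scripted along these lines. This is only a sketch: the roslaunch arguments come from the comment above, while the helper names `camera_launch_cmd` and `launch_sequentially` and the 5-second delay are hypothetical choices for illustration, not part of realsense-ros, and the script has not been run against real cameras.

```python
import subprocess
import time


def camera_launch_cmd(index, serial_no, initial_reset=False):
    """Build the roslaunch argv for one camera (cam_1, cam_2, ...)."""
    cmd = [
        "roslaunch", "realsense2_camera", "rs_camera.launch",
        f"camera:=cam_{index}",
        f"serial_no:={serial_no}",
    ]
    if initial_reset:
        cmd.append("initial_reset:=true")
    return cmd


def launch_sequentially(serials, delay_s=5.0, initial_reset=False,
                        spawn=subprocess.Popen, sleep=time.sleep):
    """Start one roslaunch process per camera, pausing between starts.

    delay_s is an assumed value; spawn and sleep are injectable so the
    function can be exercised without ROS installed.
    """
    procs = []
    for i, serial in enumerate(serials, start=1):
        if procs:
            sleep(delay_s)  # wait before starting the next camera
        procs.append(spawn(camera_launch_cmd(i, serial, initial_reset)))
    return procs
```

Calling `launch_sequentially([sn_1, sn_2], initial_reset=True)` would start both cameras one after the other from a single script instead of two terminals.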

@Austin-Hu
Author

Hi @MartyG-RealSense,

Thanks for your quick reply!

I'm using Linux kernel v5.9.0 because its i915 driver has supported Tiger Lake (with the Intel Gen12 GPU) since v5.8.0.

As you suggested, launching the 2 cameras separately (sequentially) works for me across several rounds of termination and relaunching. However, we'd prefer not to rely on this workaround in our application based on realsense-ros (sorry about that).

Adding the initial_reset:=true option when launching rs_multiple_devices.launch doesn't work either, whether via:

roslaunch realsense2_camera rs_multiple_devices.launch serial_no_camera1:=$SN_1 serial_no_camera2:=$SN_2 initial_reset:=true

or:

roslaunch realsense2_camera rs_multiple_devices.launch serial_no_camera1:=$SN_1 initial_reset:=true serial_no_camera2:=$SN_2 initial_reset:=true

with the error log:

[ INFO] [1614784254.865876839]: Device with physical ID 4-1-2 was found.
[ INFO] [1614784254.865889735]: Device with name Intel RealSense D435 was found.
[ INFO] [1614784254.866255873]: Device with port number 4-1 was found.
[ INFO] [1614784254.866279055]: Device USB type: 3.2
[ INFO] [1614784254.866307950]: Resetting device...
[ INFO] [1614784254.867881351]:  
[ WARN] [1614784254.872024380]: Device 1/2 failed with exception: failed to set power state
 03/03 15:10:54,871 ERROR [140542567671552] (handle-libusb.h:95) failed to claim usb interface: 0, error: RS2_USB_STATUS_BUSY
 03/03 15:10:54,871 ERROR [140542592849664] (sensor.cpp:523) acquire_power failed: failed to set power state
 03/03 15:10:54,872 WARNING [140542592849664] (rs.cpp:306) null pointer passed for argument "device"
[ INFO] [1614784255.042615462]: Device with serial number 044322073015 was found.

[ INFO] [1614784255.042660902]: Device with physical ID 4-2-3 was found.
[ INFO] [1614784255.042676841]: Device with name Intel RealSense D435 was found.
[ INFO] [1614784255.043137360]: Device with port number 4-2 was found.
[ INFO] [1614784255.043164531]: Device USB type: 3.2
[ INFO] [1614784255.043208669]: Resetting device...

[ INFO] [1614784260.886245691]:  
[ WARN] [1614784260.900962508]: Device 1/2 failed with exception: failed to set power state
 03/03 15:11:00,899 ERROR [140477758531328] (handle-libusb.h:51) failed to open usb interface: 0, error: RS2_USB_STATUS_NO_DEVICE
 03/03 15:11:00,899 ERROR [140477775316736] (sensor.cpp:523) acquire_power failed: failed to set power state
 03/03 15:11:00,900 WARNING [140477775316736] (rs.cpp:306) null pointer passed for argument "device"
 03/03 15:11:00,918 ERROR [140477357610752] (handle-libusb.h:51) failed to open usb interface: 0, error: RS2_USB_STATUS_NO_DEVICE
[ WARN] [1614784260.919259550]: Device 2/2 failed with exception: failed to set power state
[ERROR] [1614784260.919330908]: The requested device with serial number 044322070603 is NOT found. Will Try again.
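To compare runs, the USB errors in a log like the one above can be tallied by action and status code. The following is a minimal sketch; the regular expression is inferred only from the log lines in this thread, and the helper name `usb_error_counts` is hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the librealsense log lines quoted in this thread.
USB_ERR_RE = re.compile(
    r"\((?P<src>[\w.-]+):(?P<line>\d+)\) failed to (?P<action>claim|open) "
    r"usb interface: (?P<iface>\d+), error: (?P<status>RS2_USB_STATUS_\w+)"
)

SAMPLE_LOG = """\
 03/03 15:10:54,871 ERROR [140542567671552] (handle-libusb.h:95) failed to claim usb interface: 0, error: RS2_USB_STATUS_BUSY
 03/03 15:11:00,899 ERROR [140477758531328] (handle-libusb.h:51) failed to open usb interface: 0, error: RS2_USB_STATUS_NO_DEVICE
 03/03 15:11:00,918 ERROR [140477357610752] (handle-libusb.h:51) failed to open usb interface: 0, error: RS2_USB_STATUS_NO_DEVICE
"""


def usb_error_counts(log_text):
    """Count (action, status) pairs for libusb interface errors in a log."""
    return Counter(
        (m["action"], m["status"]) for m in USB_ERR_RE.finditer(log_text)
    )


if __name__ == "__main__":
    for (action, status), n in usb_error_counts(SAMPLE_LOG).items():
        print(f"{action:5s} {status}: {n}")
```

On the three sample lines this yields one claim failure with RS2_USB_STATUS_BUSY and two open failures with RS2_USB_STATUS_NO_DEVICE, which matches the sequence in the log: the first device is busy, then both devices disappear after the reset.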

BTW, given the Linux kernel support list of librealsense, should I submit this issue to the librealsense community?

Thanks!

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Mar 3, 2021

Submitting a request for support of newer kernels would likely not accelerate the implementation schedule. The librealsense SDK can be built from source code in RSUSB backend mode to bypass dependence on kernel versions and Linux versions and also avoid the need for patching, though this type of build is suited to single-camera applications.

As the 400 Series cameras can work with any Intel or Arm processor, the Tiger Lake processor is unlikely to be a factor in the problem that you are experiencing with Failed to set power state.

Would it be practical to use usb_port_id in the roslaunch instruction to access a particular USB port that a camera is attached to instead of accessing the camera by its serial number? Or does the camera need to work no matter which USB port it is inserted into because it cannot be known in advance which port will be used, in which case the serial number would be the correct approach?

https://github.com/IntelRealSense/realsense-ros#launch-parameters


@ngaloppo

ngaloppo commented Mar 4, 2021

Is a race in the realsense2_camera wrapper or in the librealsense library a potential culprit, given that starting two nodes sequentially with the single-device launch file seems to work consistently, but starting multiple nodes at once with a single launch file seems to fail intermittently? Perhaps a system call that should be protected by a critical section?

Note that we've been observing the following kernel error (as reported by dmesg) when the reported issue occurs:

[ 4158.270972] traps: nodelet[15778] general protection fault ip:7ff3b9f7b03e sp:7ff3a5bd4c28 error:0 in librealsense2.so.2.42.0[7ff3b9ab2000+128d000]
[ 5418.013977] teams[17813]: segfault at 7f018f5ae9d0 ip 00007f01c14a6bd8 sp 00007ffe54d907b0 error 4 in libpthread-2.27.so[7f01c149e000+1a000]
[ 5418.013981] Code: 00 00 41 56 41 55 41 54 55 53 48 83 ec 40 64 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 85 ff 0f 84 d3 00 00 00 48 89 fb <8b> bf d0 02 00 00 85 ff 0f 88 c2 00 00 00 48 39 9b 28 06 00 00 0f
[ 5437.091467] perf: interrupt took too long (3148 > 3131), lowering kernel.perf_event_max_sample_rate to 63500
[ 5539.934678] teams[17854]: segfault at 7f018f5ae9d0 ip 00007f01c14a6bd8 sp 00007ffe54d907b0 error 4 in libpthread-2.27.so[7f01c149e000+1a000]
[ 5539.934683] Code: 00 00 41 56 41 55 41 54 55 53 48 83 ec 40 64 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 85 ff 0f 84 d3 00 00 00 48 89 fb <8b> bf d0 02 00 00 85 ff 0f 88 c2 00 00 00 48 39 9b 28 06 00 00 0f
[ 5559.849949] traps: nodelet[18073] general protection fault ip:7f4d4e77c03e sp:7f4d42532c28 error:0 in librealsense2.so.2.42.0[7f4d4e2b3000+128d000]
[ 5642.780361] traps: nodelet[18616] general protection fault ip:7f0990828eac sp:7f09669b9218 error:0 in libpthread-2.27.so[7f099081b000+1a000]
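One way to check whether these faults hit the same code location is to subtract the mapping base from the instruction pointer on each `traps:` line. The sketch below does that for the lines quoted above; the helper name `fault_offsets` is hypothetical, and the result is an offset relative to the mapping start printed by the kernel, which may need adjusting before being used with a symbolizer if that mapping does not begin at file offset 0.

```python
import re

# Matches e.g. "ip:7ff3b9f7b03e ... in librealsense2.so.2.42.0[7ff3b9ab2000+128d000]"
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+) .* in (?P<lib>[^\s\[]+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

SAMPLE_DMESG = """\
[ 4158.270972] traps: nodelet[15778] general protection fault ip:7ff3b9f7b03e sp:7ff3a5bd4c28 error:0 in librealsense2.so.2.42.0[7ff3b9ab2000+128d000]
[ 5559.849949] traps: nodelet[18073] general protection fault ip:7f4d4e77c03e sp:7f4d42532c28 error:0 in librealsense2.so.2.42.0[7f4d4e2b3000+128d000]
[ 5642.780361] traps: nodelet[18616] general protection fault ip:7f0990828eac sp:7f09669b9218 error:0 in libpthread-2.27.so[7f099081b000+1a000]
"""


def fault_offsets(dmesg_text):
    """Return (library, hex offset of ip within its mapping) per GPF line."""
    results = []
    for line in dmesg_text.splitlines():
        if "general protection fault" not in line:
            continue
        m = TRAP_RE.search(line)
        if m:
            offset = int(m["ip"], 16) - int(m["base"], 16)
            results.append((m["lib"], hex(offset)))
    return results


if __name__ == "__main__":
    for lib, offset in fault_offsets(SAMPLE_DMESG):
        print(lib, offset)
```

For the lines above, both librealsense2.so.2.42.0 faults resolve to the same offset, 0x4c903e, despite different load addresses, which suggests the crash happens at one consistent location in the library.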

@Austin-Hu
Author

Austin-Hu commented Mar 4, 2021

@MartyG-RealSense, thanks for your suggestion! But using usb_port_id doesn't resolve this random issue:

roslaunch realsense2_camera rs_multiple_devices.launch usb_port_id:=4-1 usb_port_id:=4-2

Both RS cameras always work after a system reboot, but after terminating roslaunch with Ctrl + C and launching again, one of them randomly fails with the RS2_USB_STATUS_BUSY error. If I repeat the terminate-and-launch steps later, both cameras occasionally work again.

I'm looking at the context of Issue #1187 where you talked about the same problem.

@Austin-Hu
Author

Once the random RS2_USB_STATUS_BUSY error happens for USB port 4-2, it keeps happening for that specific port only, no matter which RS camera is attached to it (I swapped the 2 RS cameras and then rebooted the OS):

[ INFO] [1614825441.741981594]: Device with serial number 044322070603 was found.
[ INFO] [1614825441.742014822]: Device with physical ID 4-1-2 was found.
[ INFO] [1614825441.742025623]: Device with name Intel RealSense D435 was found.
[ INFO] [1614825441.742464931]: Device with port number 4-1 was found.
[ INFO] [1614825441.742485847]: Device USB type: 3.2

BTW, as @ngaloppo mentioned above, I also saw the same kind of dmesg error while reproducing the issue:

[   52.717878] traps: nodelet[4064] general protection fault ip:7f488077803e sp:7f485fffdc28 error:0 in librealsense2.so.2.42.0[7f48802af000+128d000]

@doronhi
Contributor

doronhi commented Mar 4, 2021

The issue seems to me not to be a ROS issue but a librealsense2 issue.
I created an internal request to try reproducing it as a librealsense2 unit test (DSO-16699). Relates to #1346.

@Austin-Hu
Author

@doronhi, thanks for forwarding the context of #1346, and I agree with your comment:

Events when multiple process try simultaneously to access the same device should not be handled by a script, and not by realsense2_camera but by the driver layer, i.e. by librealsense2 library.

As for your suggestion that "it's best to use the build that uses the v4l backend", were you referring to the steps under "Use the V4L Native backend by applying the kernel patching" here (although I'm not using Jetson)? Thanks!

@leopck

leopck commented Mar 4, 2021

@doronhi reading from librealsense installation readme.md they explained that

 Note: Linux build configuration is presently configured to use the V4L2 backend by default.

It looks like librealsense in Linux is already using V4L2 backend.

Although realsense-ros readme.md states otherwise,

librealsense2 is not built to use native v4l2 driver but the less stable RS-USB protocol. That is because the last is more general and operational on a larger variety of platforms.

Is there a difference between the 2 librealsense2 SDK that are mentioned in these readme.md?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Mar 4, 2021

Hi @leopck It is the same SDK in both cases but it will have different behaviors depending on which build method was used.

RSUSB backend (a build method previously known as libuvc backend) has the advantage when set to True of installing librealsense over an internet connection without dependence on Linux versions or kernel versions and without the need for patching. This can be useful in situations where some element of the RealSense user's Linux configuration such as the kernel version conflicts with librealsense.

It does have some limitations though and is not recommended for use in commercial projects. For more information about this, please visit the link below and scroll down through the linked-to comment to the section headed What are the advantages and disadvantages of using libuvc vs patched kernel modules?

IntelRealSense/librealsense#5212 (comment)

V4L2 backend can be considered to be a build mode where libuvc backend (installation without kernel patching) is set to False.

IntelRealSense/librealsense#6841 (comment)

@leopck

leopck commented Mar 4, 2021

@MartyG-RealSense @doronhi thank you very much for your V4L2 backend explanation and suggestion. I've tested by recompiling librealsense with the V4L2 backend, and it now runs without crashing in 10 out of 10 runs of the dual-camera setup.

@ngaloppo @Austin-Hu this solution works for me, if you don't see the dual camera setup crashing issue anymore then it's great.

 04/03 17:06:19,889 WARNING [139904398915328] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
 04/03 17:06:19,889 WARNING [139904398915328] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
 04/03 17:06:19,911 WARNING [139904376588032] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
 04/03 17:06:19,911 WARNING [139904376588032] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
 04/03 17:06:19,911 WARNING [139904376588032] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
 04/03 17:06:19,911 WARNING [139904376588032] (backend-v4l2.cpp:1775) Invalid md size: bytes used =  0 ,start offset=10
[ INFO] [1614877580.091994826]: RealSense Node Is Up!
[ WARN] [1614877580.097508556]: 
[ WARN] [1614877580.097994150]: frame's time domain is HARDWARE_CLOCK. Timestamps may reset periodically.

@MartyG-RealSense
Collaborator

Excellent news @leopck - thanks very much for the information about your success!

@Austin-Hu
Author

@MartyG-RealSense, @doronhi, @leopck, @ngaloppo, I appreciate your comments and suggestions! Turning off FORCE_LIBUVC when building librealsense works for me, and I can successfully launch the 2 RS cameras together every time (tried > 20 times).

@MartyG-RealSense, please help forward this information (the workaround of using the V4L backend, Linux only) to other issues with similar problems.

Thanks again; I'm closing this issue.

@MartyG-RealSense
Collaborator

Great news @Austin-Hu that -DFORCE_LIBUVC=OFF solved your multiple camera problem with Failed to set power state. I have added notes to my research records so that they come up when researching future cases similar to this one. Thanks!
