Baseline calculation problem in rtabmap stereo with camera info from D435i #1310

Closed
BarzelS opened this issue Jul 28, 2020 · 26 comments
@BarzelS

BarzelS commented Jul 28, 2020

Hi,
I'm getting this error in rtabmap stereo odometry:
The stereo baseline (0.0) should be positive (baseline=-Tx/fx). We assume a horizontal left/right stereo setup where the Tx (or P(0,3)) is negative in the right camera info msg.
It takes data from both /camera/infra1/camera_info and /camera/infra2/camera_info and calculates the baseline with this equation:
baseline = right_.fx() != 0.0 && left_.fx() != 0.0 ? left_.Tx() / left_.fx() - right_.Tx() / right_.fx() : 0.0;
I found that left_.Tx and right_.Tx in the camera info topics are both 0.0.
How can that be, when the baseline should be 0.05 m (the distance between the two infrared cameras)?

@doronhi
Contributor

doronhi commented Jul 28, 2020

The information regarding the baseline does not appear in the camera info as each cameraInfo topic describes a single sensor. It appears in the tf messages.
This is changed in #1242 . Does that PR match your needs?

@BarzelS
Author

BarzelS commented Jul 28, 2020

If I understand the PR correctly, the data in "/camera/infra1/camera_info" and "/camera/infra2/camera_info" is currently not correct (RealSense ROS v2.2.15) and the mentioned PR will fix it?
If that's the case, then yes, the PR matches my needs.

@doronhi
Contributor

doronhi commented Jul 28, 2020

This issue has been under discussion for a long time now. The main reason is that I wasn't convinced what "correct" is, nor how to check it.
Therefore I would greatly appreciate it if you could give it a try and see whether it works well with rtabmap stereo odometry.

@BarzelS
Author

BarzelS commented Jul 28, 2020

Aren't these parameters that should be obtained from the Intel RealSense hardware team?
I will try to compile the PR and check it.

@doronhi
Contributor

doronhi commented Jul 30, 2020

I'd like to know if that PR works for you.
In any case, the information published on the /tf topic is based on the extrinsics given by the device.

@BarzelS
Author

BarzelS commented Jul 30, 2020

I've updated to the PR you mentioned. The baseline is no longer 0.0, but I'm still getting this error:
[ERROR] [1596108006.583834357]: The stereo baseline (-0.049855) should be positive (baseline=-Tx/fx). We assume a horizontal left/right stereo setup where the Tx (or P(0,3)) is negative in the right camera info msg.

I've tried to switch between the cameras topics but the stereo_odometry node crashed with a segmentation fault from Eigen.

@RealSenseSupport
Collaborator

@DronistB Thanks for your verification feedback! We've pointed the contributor of PR #1242 to this issue to see if they have any input on it. Thanks!

@mindThomas

mindThomas commented Aug 26, 2020

@DronistB Personally, I have been struggling to get a rectification process working with the T265 camera. I don't know whether you are struggling with similar issues.
Since I was unable to get either rtabmap's stereo rectification or the stereo_image_proc node to work, I moved to the one from ethz-asl: https://github.com/ethz-asl/image_undistort/
Would you mind giving this one a try combined with my PR #1242 ?
Alternatively, you can also try the simpler PR #901 , which just adds the missing translational part to the P matrix.

@BarzelS
Author

BarzelS commented Aug 27, 2020

Hi,
As you can see above I've tried PR #1242 and got the issues as described.
I'm talking about the stereo_odometry node and not the stereo_image_proc. Have you managed to get a stable stereo odometry using your updates?
Thanks.

@mindThomas

I am not sure that rtabmap, and thus the stereo_odometry node, supports the fisheye calibration used for the RealSense cameras.
That's why I suggested trying the image_undistort or stereo_image_proc nodes and feeding the rectified output into rtabmap.
See more here:

@BarzelS
Author

BarzelS commented Aug 30, 2020

Thanks @mindThomas
Can you elaborate on which D435i topics I need to feed into the image_undistort node, and which of its outputs I need to use in the stereo odometry node?

@RealSenseSupport
Collaborator

@DronistB Did you get it working? Thanks!

@RealSenseSupport
Collaborator

@DronistB Do you still have questions about this? Looking forward to your update. Thanks!

@RealSenseSupport
Collaborator

@DronistB Sorry, we haven't heard from you for weeks, so we will close this issue at this point. Please feel free to create a new one if you still have questions or issues. Thanks!

@TouchDeeper

I have the same issue when I run rtabmap with VINS-Fusion support:
The stereo baseline (-0.049985) should be positive (baseline=-Tx/fx). We assume a horizontal left/right stereo setup where the Tx (or P(0,3)) is negative in the right camera info msg.
Mathieu, the author of rtabmap, said in this issue that the error is caused by the positive P(0,3) in the projection matrix of the right camera_info. Is there a quick fix to negate P(0,3)?
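
One possible stopgap, sketched here under assumptions rather than taken from rtabmap or realsense-ros, is to flip the sign of P(0,3) on the right camera_info before it reaches the stereo node, for example in a small relay. `CameraInfoLike` and `forceNegativeTx` are hypothetical names. The struct only mimics the row-major 12-element `P` field of sensor_msgs/CameraInfo, where index 3 is P(0,3).

```cpp
#include <array>
#include <cmath>

// Hypothetical stand-in for the P field of a sensor_msgs/CameraInfo message.
struct CameraInfoLike {
    std::array<double, 12> P;  // row-major 3x4 projection matrix
};

// Makes P(0,3) non-positive, as the error message requires for the right camera.
// Assumes a horizontal left/right stereo pair; the magnitude is left untouched.
void forceNegativeTx(CameraInfoLike& rightInfo) {
    rightInfo.P[3] = -std::fabs(rightInfo.P[3]);
}
```

This only changes the sign, so it is a workaround for the convention mismatch and not a substitute for correcting the published value at the source.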

@BarzelS
Author

BarzelS commented Jan 4, 2021

I would be really happy to get a solution to this problem as well.
Is the problem related to this usage:

void BaseRealSenseNode::updateStreamCalibData(const rs2::video_stream_profile& video_profile)
{
    stream_index_pair stream_index{video_profile.stream_type(), video_profile.stream_index()};
    auto intrinsic = video_profile.get_intrinsics();

Does the get_intrinsics function return incorrect data?
Thanks

@doronhi
Contributor

doronhi commented Jan 12, 2021

The value regarding the distance between the 2 cameras is not coming from video_profile.get_intrinsics(); but from auto LEFT_T_RIGHT = right_video_profile.get_extrinsics_to(left_video_profile);
The value there is correct.
There seems to me to be a disagreement regarding the sign of the baseline. I'll have to look into it.
If I understand correctly, the sign of the given baseline (~0.0499) is opposite to what rtabmap requires (according to @SBarzz , @TouchDeeper ) but complies with image_undistort (according to @mindThomas ).
(created internal report: DSO-16391)

@pavloblindnology
Contributor

pavloblindnology commented Feb 24, 2021

According to stereo convention:
P(0,3) = -fx * B (according to sensor_msgs/CameraInfo)
or
P(0,3) = fx * Tx, where Tx stands for the left camera optical center relative to the right camera optical center, so
Tx = -B (according to OpenCV).
You can get the intuition from the fact that P is the projection matrix of the right camera in rectified coordinates, i.e. it projects points given in the rectified left camera coordinate system into the rectified right camera's image. A point's pixel coordinates in the right camera always appear shifted to the left compared to the pixel coordinates of the same point in the left camera. So P(0,3) should be negative.

@mindThomas

mindThomas commented Mar 30, 2021

@pavloblindnology I think the documentation of the stereo convention is contradictory.

For a stereo pair, the fourth column [Tx Ty 0]' is related to the position of the optical center of the second (right) camera in the first (left) camera's frame.

Given that the z-axis is pointing in the camera viewing direction and the y-axis is pointing down, then the x-axis is pointing right such that the optical center of the right camera must be at a positive x-axis value relative to the left camera. Hence Tx should be positive for the right camera.

The first camera always has Tx = Ty = 0. For the right (second) camera of a horizontal stereo pair, Ty = 0 and Tx = -fx' * B, where B is the baseline between the cameras.

However this line in the documentation tends to suggest the opposite, namely that Tx of the right camera should have negative sign.

@pavloblindnology
Contributor

pavloblindnology commented Apr 1, 2021

@mindThomas It's actually not contradictory. "is related" doesn't mean "is identical". It's related, sure, because it's a negative value proportional to the baseline. Anyway, we have an exact formula for calculating Tx, which always takes priority over a somewhat unclear description. And it gives a negative value. Besides, the logic of the P matrix that I described above assumes Tx is negative for the right camera.

@mindThomas

@pavloblindnology The sentence specifically says that Tx and Ty are the position of the optical center of the second (right) camera in the first (left) camera's frame.
I don't see how this can be interpreted other than that Tx should be positive for the right camera.

@pavloblindnology
Contributor

pavloblindnology commented Apr 8, 2021

@mindThomas Where did you get that sentence? From sensor_msgs/CameraInfo we read

For a stereo pair, the fourth column [Tx Ty 0]' IS RELATED to the
position of the optical center of the second camera in the first
camera's frame.
So, once more

  1. "is related" is not identical to "is equal".
  2. An explicit formula follows right after it.
  3. The basic ROS stereo packages image_geometry and stereo_image_proc assume it's negative.
  4. Just do the math. P is a projection matrix from optical XYZ to pixel
    coordinates:
    [u v w]' = P * [X Y Z 1]'
    x = u / w
    y = v / w,
    So, for left camera we get xL = (X*fx + Z*cx)/Z
    for right camera we get xR = (X*fx + Z*cx + Tx)/Z
    which gives
    disparity = xL-xR = -Tx/Z
    Tx = -disparity*Z
    Disparity is a positive number (same object has larger x pixel coordinate in left frame than in right frame), Z is a positive number, which gives us no choice other than Tx being negative.

@mindThomas

mindThomas commented Apr 27, 2021

@pavloblindnology I will give you that the wording in the documentation, i.e. the use of 'related', is ambiguous. Saying something is related can basically mean anything. A function argument, x, is also 'related' to the affine relationship a*x+b and also 'related' to the exponential relationship a*exp(-b*x).

In any case I do agree with your math and the interpretation that you highlight.

Proof

Another way to look at it, in my opinion, is to say that the fourth column of the projection matrix for the RIGHT camera ([Tx Ty 0]') is, up to scaling by the focal lengths, exactly the optical center of the LEFT camera frame expressed in the RIGHT camera frame. This can also be derived by defining the transformation matrices:

LEFT_T_WORLD = [LEFT_R_WORLD, LEFT_O_WORLD;
                  0, 0, 0, 1]
RIGHT_T_LEFT = [RIGHT_R_LEFT, RIGHT_O_LEFT;
                  0, 0, 0, 1]
RIGHT_O_LEFT = [Tx_world, Ty_world, 0]'

With the convention that LEFT_T_WORLD is the transformation matrix that transforms a point from the WORLD frame to the LEFT camera frame, i.e. p_LEFT = LEFT_T_WORLD * p_WORLD.
And where LEFT_O_WORLD is the origin of the WORLD frame in the LEFT camera frame. Furthermore RIGHT_O_LEFT is the origin of the LEFT camera frame in the RIGHT camera frame which is related exactly to the translation that we are talking about, [Tx, Ty, 0]' and the images are assumed to be rectified such that RIGHT_R_LEFT = eye(3).
Now, to project a point defined in the WORLD frame into the LEFT camera image, we have:

[uL vL wL]' = P * LEFT_T_WORLD * [xW yW zW 1]'

Where:

P = [fx, 0, cx, 0;
     0, fy, cy, 0;
     0,  0,  1, 0]

And to project from WORLD into the RIGHT camera frame we would add the transformation matrix between the frames:

[uR vR wR]' = P * RIGHT_T_LEFT * LEFT_T_WORLD * [xW yW zW 1]'

Comparing this to the previous LEFT projection equation leaves us with:

P_right = P * RIGHT_T_LEFT
        = [fx, 0, cx, fx*Tx_world;
           0, fy, cy, fy*Ty_world;
           0,  0,  1, 0]

Which concludes the derivation by stating that:

Tx = Tx_world * fx
Ty = Ty_world * fy

Where Tx_world is the x-coordinate of the LEFT camera origin expressed in the RIGHT camera frame, with x positive towards the right. Since the left camera sits to the left of the right camera, Tx_world and Tx are negative values.

Q.E.D.

@pavloblindnology
Contributor

@mindThomas Sure. Your derivation is correct. My analysis wasn't a derivation - just a quick analysis of the final result.

@doronhi
Contributor

doronhi commented Apr 29, 2021

Thank you @pavloblindnology and @mindThomas for your deep and elaborate discussion and the time and effort you put into it.
The conclusion, if I understand correctly, is that the sign of the baseline should be reversed. Would you take a look at #1832, please?
Also, @mindThomas , do you believe a similar correction should be made to the image_undistort repository? If so, would you care to open an issue there?

@mindThomas

@doronhi I have created the following PR, which would correct this baseline issue and fix the projection matrix according to our discussion.
#1840
I do not have a sensor on hand at the moment, so I have been unable to verify the change, nor have I verified whether it works with the image_undistort repository.
