
Merging with vision landing project #1

Open
kripper opened this issue Mar 16, 2023 · 14 comments

Comments

@kripper

kripper commented Mar 16, 2023

Hi @chobitsfan,

I'm the current maintainer of RosettaDrone, and we are looking forward to contributing to an open-source vision-based precision landing project.

I reviewed your implementation and understood everything (the code is clean).

Your implementation has these pros:

  • Uses Apriltag
  • Computes a landing point (offset) independent of the marker centers.
  • Sends out the data needed to compute the yaw.

Vision Landing's track_targets has these pros:

  • GStreamer support... but probably slower than just working with the latest decoded frame.
  • Uses an input video queue to reduce latency (new frames are received in a background thread while image processing runs in another)... but I believe the V4L implementation does something similar under the hood (?). It's important to always have the next frame decoded in memory, ready to be processed as soon as the previous detection finishes. Unnecessary frames are dropped (wasted resources), but the overall latency is optimized. See the sketch after this list.
  • Outputs AR images/video for debugging.
  • A threshold filter based on historical detections (tells whether we can trust a detection). Does the AprilTag library ever output wrong detections? If not, maybe we can just trust all detections and skip this filter. In any case, the filters should be applied across the results of multiple markers.
  • Filters out wrong pose estimations, for example when a marker is detected with a negative z value, i.e. supposedly behind the camera.
  • Documentation (how to calibrate the camera for correct pose estimation).
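
About the input-queue pro above: here is a minimal sketch of the single-slot, keep-only-the-latest-frame pattern I have in mind. Frame and the thread roles are placeholders, not code from either project.

#include <condition_variable>
#include <mutex>
#include <optional>

struct Frame { /* pixel data + capture timestamp */ };

// Single-slot "latest frame" buffer: the capture thread overwrites the slot,
// the detection thread always takes the newest frame, and stale frames are
// silently dropped, which keeps end-to-end latency low.
class LatestFrame {
public:
    void put(Frame f) {                      // called by the capture thread
        std::lock_guard<std::mutex> lock(mutex_);
        slot_ = std::move(f);                // overwrite: the stale frame is dropped
        ready_.notify_one();
    }
    Frame take() {                           // called by the detection thread
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return slot_.has_value(); });
        Frame f = std::move(*slot_);
        slot_.reset();
        return f;
    }
private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::optional<Frame> slot_;
};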

I believe both projects should be merged somehow, so that a maintainer community can be built around them.
Of course, they are completely different implementations, but the final goal is exactly the same.

Before proposing a project merge strategy, I would like to see the code of the "frontend" where you process the marker data sent via IPC from the backend "capture.c".

You are probably doing things there like filtering out detection errors, generating AR images for debugging, and other work that is done in the equivalent Vision Landing Python script.

About the scope:

We are also interested in doing some extrapolation to:

  • Compute the absolute target position to deal with latencies: we need to track the drone's position at the moment the image was captured and add the drone's displacement since that timestamp.
  • Correct the absolute target position considering the last computed absolute positions: since the target could be moving, we could do a linear extrapolation based on the last two positions, or a b-spline extrapolation considering three or more points. See the sketch after this list.
  • A PID controller could also be used to smooth the drone's landing motion.
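
A minimal sketch of the linear extrapolation from the second bullet, assuming we keep the last two timestamped absolute target positions; TimedPos and extrapolate are placeholder names:

// Extrapolate the target position to t_now from the last two observations.
// A b-spline over three or more points would smooth this further.
struct TimedPos { double t; double x, y, z; };   // seconds, meters

TimedPos extrapolate(const TimedPos& p0, const TimedPos& p1, double t_now) {
    double dt = p1.t - p0.t;
    if (dt <= 0.0) return p1;                // degenerate case: reuse the last fix
    double k = (t_now - p1.t) / dt;          // how far we are past the last fix
    return { t_now,
             p1.x + k * (p1.x - p0.x),
             p1.y + k * (p1.y - p0.y),
             p1.z + k * (p1.z - p0.z) };
}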

Anyway, I believe the merged project should just focus on returning the position of the target relative to the camera, including the error filters required to compute a robust and consistent landing target position based on all markers and maybe also considering previous computations.

The rest I mentioned above could be implemented in the flight controller.

BTW, do you know if this has been implemented?

For RosettaDrone, we will need an independent implementation (apart from the FC), so it would be ideal to use a shared library for this (a second layer, separated from the flight controller and from the target tracking/capture layer).

@kripper
Author

kripper commented Mar 16, 2023

Vision Landing users/devs:

Please check this:
https://discuss.ardupilot.org/t/precision-landing-with-multiple-apriltag/89911

@chobitsfan
Owner

chobitsfan commented Mar 17, 2023

Hi @kripper, thank you.

Before proposing a project merge strategy, I would like to see the code of the "frontend" where you process the marker data sent via IPC from the backend "capture.c".

I did not do much in the frontend; it just packs the data into a MAVLink packet. It is located in https://github.com/chobitsfan/mavlink-udp-proxy/tree/new_main. In the latest commit I moved to https://github.com/chobitsfan/libcamera-apps/tree/pr_apriltag instead, because Raspberry Pi moved from V4L2 to libcamera.

@kripper
Author

kripper commented Mar 17, 2023

Ok, I'll take a look.

I just finished moving your code into track_targets.cpp in order to be able to debug the output with AR drawings.

I'm testing with my OpenGL simulator that generates and sends the video to track_targets.cpp.

Here is a preview of two tagStandard41h12 tags placed in the simulator (I'm testing with low-quality video and it works stably).

[image: two tagStandard41h12 tags detected in the simulator]

Any particular reason to use this tag family?

I'm now struggling to project the (x, y, z) coordinates returned by estimate_tag_pose() onto the camera image, using the existing drawARLandingCube() method from Vision Landing:

...
// Transform the landing-point offset from the tag frame into the
// camera frame: p_cam = R * offset + t
matd_t* m1 = matd_multiply(pose.R, tgt_offset);
matd_t* m2 = matd_add(m1, pose.t);
x = m2->data[0];
y = m2->data[1];
z = m2->data[2];
...
drawARLandingCube(img, m, CamParam);

I'm specifically trying to figure out how to use the same aruco::CameraParameters with your code, which has only this:

apriltag_detection_info_t det_info = {.tagsize = 0.113, .fx = 978.0558315419056, .fy = 980.40099676993566, .cx = 644.32270873931213, .cy = 377.51661754419627};
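
For reference, I expect the mapping to be just the entries of the intrinsics matrix. A sketch, assuming CamParam.CameraMatrix is the usual 3x3 CV_32F matrix loaded from the Vision Landing calibration file (use at<double>() if it is stored as CV_64F):

// Fill apriltag_detection_info_t from the aruco calibration instead of
// hard-coding the intrinsics.
apriltag_detection_info_t det_info;
det_info.det = det;          // the apriltag_detection_t* to estimate the pose for
det_info.tagsize = 0.113;    // marker edge length in meters
det_info.fx = CamParam.CameraMatrix.at<float>(0, 0);
det_info.fy = CamParam.CameraMatrix.at<float>(1, 1);
det_info.cx = CamParam.CameraMatrix.at<float>(0, 2);
det_info.cy = CamParam.CameraMatrix.at<float>(1, 2);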

@kripper
Author

kripper commented Mar 17, 2023

it just packs the data into a MAVLink packet.

OK, so that means the flight controller receives multiple landing targets (one for each detected marker) and selects which one to use (or decides what to do with this redundant information)?

I believe this filtering would be better done before sending the MAVLink messages to the FC, since at that point we have more information about the markers and their confidence levels.

@kripper
Author

kripper commented Mar 17, 2023

BTW, while I was coding I managed to wake up @fnoop from his hibernation process.
I suspect he swapped his drone for a girlfriend... but he will be following our progress.
He commented about an issue with latency, which we will have to address next:
goodrobots/vision_landing#123 (comment)

@chobitsfan
Owner

Hi @kripper

OK, so that means the flight controller receives multiple landing targets (one for each detected marker) and selects which one to use (or decides what to do with this redundant information)?

No. The Raspberry Pi computes the landing point based on which marker it detects. There is only one landing point; the Raspberry Pi knows the offset from each marker to the landing point.

@chobitsfan
Owner

Here is a preview of two tagStandard41h12 tags placed in the simulator (I'm testing with low-quality video and it works stably). Any particular reason to use this tag family?

The AprilTag dev team recommends it; see https://github.com/AprilRobotics/apriltag/wiki/AprilTag-User-Guide#choosing-a-tag-family

@kripper
Author

kripper commented Mar 17, 2023

There is only one landing point

Oh, right. I forgot you were only working with the first detected marker.
I will visually check whether the projected landing point from one marker is consistent with the projected landing points from the other markers. If the error is big, I was thinking of using the centroid; if not, using the first one is fine. See the sketch below.
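
A sketch of the consistency check I have in mind, with placeholder names (Vec3, fuse_landing_points); the per-marker landing points are in camera-frame coordinates:

#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

// Fuse per-marker landing-point estimates: if all markers agree within `tol`
// (meters), the first estimate is fine; otherwise fall back to the centroid.
// Assumes pts is non-empty.
Vec3 fuse_landing_points(const std::vector<Vec3>& pts, double tol = 0.05) {
    Vec3 c{0.0, 0.0, 0.0};
    for (const Vec3& p : pts) { c.x += p.x; c.y += p.y; c.z += p.z; }
    c.x /= pts.size(); c.y /= pts.size(); c.z /= pts.size();
    for (const Vec3& p : pts)
        if (std::hypot(p.x - c.x, p.y - c.y, p.z - c.z) > tol)
            return c;                    // markers disagree: trust the centroid
    return pts.front();                  // consistent: the first marker is fine
}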

@kripper
Author

kripper commented Mar 17, 2023

OK, I just found the way to use the aruco::CameraParameters with AprilTag.
This is needed to project the exact 3D coordinates of the landing position, which is important for dealing with extrapolation and latency issues.

Preview:

[image: landing point projected onto the camera image]
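
For reference, the projection itself is just the pinhole model with the same fx/fy/cx/cy that are passed to estimate_tag_pose(); a minimal sketch, ignoring lens distortion:

// Project a camera-frame point (x, y, z) to pixel coordinates (u, v).
// Assumes z > 0, i.e. the point is in front of the camera.
void project(double x, double y, double z,
             const apriltag_detection_info_t& info, double& u, double& v) {
    u = info.fx * x / z + info.cx;
    v = info.fy * y / z + info.cy;
}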

@chobitsfan
Owner

BTW, I believe you are not using camera calibration parameters

Focal length is used, but lens distortion is not. For the Raspberry Pi camera and my application, it is accurate enough even without lens distortion correction.
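
For cameras where distortion does matter (e.g. wide-angle lenses), one way to bolt it onto this pipeline would be to undistort the detected tag corners before calling estimate_tag_pose(), so the pinhole model stays valid. A sketch, assuming OpenCV and the aruco calibration data (note that aruco spells the distortion member "Distorsion"):

#include <opencv2/calib3d.hpp>
#include <vector>

// Replace the four detected corners with their undistorted positions.
// Passing cameraMatrix as P keeps the output in pixel coordinates.
void undistort_corners(apriltag_detection_t* det,
                       const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs) {
    std::vector<cv::Point2f> in, out;
    for (int i = 0; i < 4; i++)
        in.emplace_back((float)det->p[i][0], (float)det->p[i][1]);
    cv::undistortPoints(in, out, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
    for (int i = 0; i < 4; i++) {
        det->p[i][0] = out[i].x;
        det->p[i][1] = out[i].y;
    }
}

Usage would be something like undistort_corners(det, CamParam.CameraMatrix, CamParam.Distorsion) before filling in det_info.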

@kripper
Author

kripper commented Mar 19, 2023

Merge is ready. I'm doing tests before releasing.

I also started addressing the latency drift problem.

In our case, we will also have to implement the motion control on our own:
RosettaDrone/rosettadrone#132

What is your experience with latency drift? (Pose estimation is never current, so whatever motion instruction you send will always carry some error.)

Please comment there.

@kripper
Author

kripper commented Mar 20, 2023

I published the result of the "merge" here:
https://github.com/kripper/vision-landing-2

I included your "IPC communication protocol" (the pose values you were sending to your "frontend") in apriltag-detector.cpp, so you can easily enable it and switch to vision-landing-2 in your projects.

You might also be interested in the alternative input source "pipe-buffer", which passes raw images with less latency.

See more details in the README.

@fnoop

fnoop commented Mar 23, 2023

@chobitsfan, could you please declare here/in the code the licensing of this code? There is a kind of public-domain declaration at the top of capture.c, but I think that might be from the V4L2 project?

@chobitsfan
Owner
