Out-of-Box Stereo Neural Inference Support #216

Closed · Luxonis-Brandon opened this issue Oct 2, 2020 · 2 comments
Labels: enhancement (New feature or request)

Luxonis-Brandon (Contributor) commented Oct 2, 2020

Start with the why:

As it stands now, using stereo neural inference requires a decent knowledge of homography to rectify the metadata results. Although such adjustment of the results is a very low load on the host (and even a low load for a microcontroller), it is quite a high mental load for the programmer to implement and to verify that the implementation is correct.

And without these corrections, stereo neural inference is not very accurate (see e.g. https://github.com/luxonis/depthai-experiments/tree/master/triangulation-3D-visualizer).

If the stereo neural inference is performed directly on the rectified_left and rectified_right images, then no adjustment of the metadata results (e.g. the Key-Points) is necessary on the host, and getting 3D position results is trivial.

Move to the how:

Implement the capability to run stereo neural inference on the rectified_left and rectified_right streams, such that the metadata results (Key-Points) are already stereo-rectified and can be used directly to triangulate the physical location of the Key-Points, in meters.

Use the onboard EEPROM data for the camera extrinsics to triangulate the 3D position of the Key-Points, and return the XYZ position of each as part of the metadata. Return both the pixel-space metadata and the 3D XYZ positions alongside it, allowing the host to also do its own calculation if desired.
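On rectified streams, this triangulation reduces to the standard disparity relations. A minimal sketch in plain Python (not the DepthAI API; the function name and arguments are illustrative, and it assumes matched Key-Points on the same scanline of the rectified pair):

```python
def triangulate_keypoint(u_left, u_right, v, fx, fy, cx, cy, baseline_m):
    """Triangulate one matched Key-Point from a rectified stereo pair.

    u_left / u_right: x pixel coordinate of the Key-Point in rectified_left
    and rectified_right (same row v, since the images are rectified).
    fx, fy, cx, cy: intrinsics of the rectified cameras (from calibration).
    baseline_m: stereo baseline in meters (stored in the device EEPROM).
    Returns (X, Y, Z) in meters, in the left rectified camera frame.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        return None  # point at infinity, or a bad match
    Z = fx * baseline_m / disparity
    X = (u_left - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return (X, Y, Z)
```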

Implement in 3 stages:

  1. On-device decoded object-detector support (MobileNetv2-SSD, tiny-YOLOv3, YOLOv3)
    In terms of onboard processing, MobileNet/YOLO results are parsed internally to DepthAI. So when implementing this we can do bounding-box triangulation directly on DepthAI (probably based on the center of the box?).

  2. Host-side decoding of networks that are not decoded directly on DepthAI.
    After that is working, we can provide examples of host-side triangulation for Key-Point networks that don't yet have on-device decoding (see the pairing sketch after this list). Examples are facial landmarks and human pose estimation, both of which we have host-side decoding for, but not yet on-device decoding.

  3. microPython neural network decoding on-device
    Once microPython (Scripting Support on DepthAI #207) is out, we can replicate the same, but on-device with microPython. With this functionality, even networks that the Luxonis team is unaware of can be supported, with completely on-device neural decoding and stereo neural inference.
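To make stage 2 concrete, here is a rough host-side sketch of the pairing step. The network-specific decoding is omitted; kps_left/kps_right stand for already-decoded Key-Point lists, triangulate_keypoint is the sketch above, and the calibration calls (readCalibration, getCameraIntrinsics, getBaselineDistance) should be checked against the installed depthai version:

```python
import depthai as dai

def triangulate_pairs(kps_left, kps_right, fx, fy, cx, cy, baseline_m):
    # Pair Key-Points by index (landmark networks emit them in a fixed
    # order) and triangulate each pair with the sketch above. The right
    # row coordinate is unused: rectified pairs share the same scanline.
    points_3d = []
    for (ul, vl), (ur, _vr) in zip(kps_left, kps_right):
        points_3d.append(
            triangulate_keypoint(ul, ur, vl, fx, fy, cx, cy, baseline_m))
    return points_3d

with dai.Device(pipeline) as device:  # pipeline with left/right NNs, built elsewhere
    calib = device.readCalibration()
    # Intrinsics of the left camera, scaled to the NN input resolution.
    M = calib.getCameraIntrinsics(dai.CameraBoardSocket.LEFT, 640, 400)
    fx, fy, cx, cy = M[0][0], M[1][1], M[0][2], M[1][2]
    baseline_m = calib.getBaselineDistance() / 100.0  # depthai reports cm
```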

Move to the what:

Support "Out-of-Box" stereo neural inference, where the XYZ locations of the Key-Points are returned alongside the pixel coordinates.

Luxonis-Brandon (Contributor, Author) commented:

We implemented this with the new Script node, such that the stereo neural inference metadata results are processed directly on DepthAI. Importantly, the code that does the processing on DepthAI can be modified or customized by any developer (so filters can be added, other or different code can be run, etc.).

luxonis/depthai-experiments#110
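For reference, a pipeline using the Script node looks roughly like the sketch below. This is not the code from that experiment; the stream names and the on-device script body are illustrative, and only Pipeline, create(dai.node.Script), setScript, and the on-device node.io interface are actual depthai APIs:

```python
import depthai as dai

pipeline = dai.Pipeline()

# The Script node runs user-editable Python on the device itself.
script = pipeline.create(dai.node.Script)
script.setScript("""
while True:
    # Receive decoded NN results from the left and right rectified streams.
    # The names 'nn_left' / 'nn_right' / 'out' are placeholders and must
    # match the links created on the host side when building the pipeline.
    left = node.io['nn_left'].get()
    right = node.io['nn_right'].get()
    # ... pair the Key-Points, triangulate with the EEPROM calibration,
    # and attach the XYZ results to the metadata here ...
    node.io['out'].send(left)
""")
```

Because the processing lives in an editable script rather than in firmware, filters or different pairing logic can be dropped in without any firmware change, which is the customization point mentioned above.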


jdavidberger pushed a commit to constructiverealities/depthai that referenced this issue May 26, 2022