A camera-only technique to correlate landmarks across frames and determine the distance to them over time as the vehicle carrying the camera moves.
Available in this repo as a Java implementation (in 2D and 3D), as well as a 2D C++ implementation with headers.
C++ implementation by Sahan Reddy and Ethan Leitner
After substantial testing with an OAK-D Pro stereoscopic depth camera, we found that its depth values were wildly inaccurate beyond about 10 ft. This is likely because the distance between the two "eyes" of the stereoscopic camera (its baseline) is so small that it becomes negligible relative to longer distances.
We would like to be able to determine the distance to landmarks at greater distances while our vehicle is moving.
While it is difficult to determine accurate depth in a static scene, the location of a landmark's bounding box in the picture frame (generated by YOLOv7 object detection) may provide a very accurate measure of the landmark's bearing relative to the camera over time.
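As a rough illustration of how a bounding box can yield a bearing, the sketch below converts the horizontal center of a box into an angle relative to the camera's optical axis. It assumes an ideal pinhole camera with a known horizontal field of view and no lens distortion. `BoundingBox`, `bearingFromBox`, and the parameter names are hypothetical and are not taken from this repo's code.

```cpp
#include <cmath>

// Hypothetical detection box in pixel coordinates (not a type from this repo).
struct BoundingBox {
    double xMin;
    double xMax;
};

// Bearing (radians) of the box center relative to the optical axis.
// Positive values are to the right of the image center.
// Assumption: ideal pinhole camera, no lens distortion.
double bearingFromBox(const BoundingBox& box, double imageWidthPx,
                      double horizontalFovRad) {
    const double centerX = 0.5 * (box.xMin + box.xMax);
    // Focal length in pixels, derived from the horizontal field of view.
    const double focalPx =
        (0.5 * imageWidthPx) / std::tan(0.5 * horizontalFovRad);
    return std::atan2(centerX - 0.5 * imageWidthPx, focalPx);
}
```

With a 640 px wide image and a 90 degree field of view, a box centered on the right edge of the frame maps to a bearing of about 45 degrees, and a box centered in the frame maps to 0.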
This repo describes a technique, called MotionParallax, that triangulates the distance to a landmark using just:
- the change in location and orientation of our camera
- the change in its bearing to the landmark
In addition, this technique employs an accurate algorithm to correlate landmark instances from one frame to the next. This may be useful for de-duplication of landmark instances which are detected multiple times.
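To make the idea of frame-to-frame correlation concrete, here is a minimal sketch that pairs bearings from the current frame with bearings from the previous frame. It is a simplified greedy nearest-bearing matcher, used here as a stand-in for a full assignment algorithm, and it is not the implementation in this repo. The names `angleDiff`, `correlate`, and `maxDelta` are hypothetical, and the previous bearings are assumed to already be expressed in the same reference frame as the current ones.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Signed difference a - b between two angles, wrapped to [-pi, pi].
double angleDiff(double a, double b) {
    const double kPi = std::acos(-1.0);
    double d = std::fmod(a - b, 2.0 * kPi);
    if (d > kPi) d -= 2.0 * kPi;
    if (d < -kPi) d += 2.0 * kPi;
    return d;
}

// Candidate pairing of one current landmark with one previous landmark.
struct Candidate {
    double delta;      // absolute change in bearing (radians)
    std::size_t cur;   // index into the current frame
    std::size_t prev;  // index into the previous frame
};

// For each current bearing, returns the index of the matched previous
// bearing, or -1 when no previous landmark lies within maxDelta.
std::vector<int> correlate(const std::vector<double>& previous,
                           const std::vector<double>& current,
                           double maxDelta) {
    std::vector<Candidate> pairs;
    for (std::size_t i = 0; i < current.size(); ++i) {
        for (std::size_t j = 0; j < previous.size(); ++j) {
            const double d = std::fabs(angleDiff(current[i], previous[j]));
            if (d <= maxDelta) pairs.push_back(Candidate{d, i, j});
        }
    }
    // Accept the closest pairs first; each landmark is used at most once.
    std::sort(pairs.begin(), pairs.end(),
              [](const Candidate& a, const Candidate& b) {
                  return a.delta < b.delta;
              });
    std::vector<int> match(current.size(), -1);
    std::vector<bool> used(previous.size(), false);
    for (const Candidate& c : pairs) {
        if (match[c.cur] != -1 || used[c.prev]) continue;
        match[c.cur] = static_cast<int>(c.prev);
        used[c.prev] = true;
    }
    return match;
}
```

A current landmark left at -1 would be treated as newly discovered. A greedy pass like this can be suboptimal when several landmarks have similar bearings, which is one reason a true assignment algorithm may be preferable.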
If a detected landmark and the angle to it are known, and the displacement and rotation between camera frames are known, then in the triangle formed by the two camera positions and the landmark, side "b" (the distance to the landmark) can be calculated using the law of sines. The steps are:
1. Find the bearing to each landmark in the current picture frame.
2. Let the vehicle travel some distance, keeping track of its displacement and rotation.
3. Find the bearing to each landmark in the current picture frame again, and compare it to the previous frame(s).
4. Use an assignment algorithm to correlate landmarks in the current frame with those in the previous frame. Do not attempt a correlation for landmarks whose change in bearing exceeds a specified threshold `T1`.
5. For every landmark in the current frame, where possible, search backwards in time for that landmark until the change in the reverse bearing (from the landmark to the vehicle) is large enough to perform triangulation. Given the threshold `T2` and the change in bearing `C`, this requires abs(`C`) ≥ `T2`. If a landmark does not have enough history to satisfy this, treat it as a newly discovered landmark and do not attempt to calculate its distance.
6. Use the law of sines to triangulate the location of a given landmark in the current frame. Calculate the distance `b` (in meters) from the camera's displacement `c` (in meters), the change in bearing from the vehicle to the landmark `B`, and the change in bearing from the landmark to the vehicle `C`:

   `b = c * sin(B) / sin(C)`

   This value `b` is the distance from the location of the camera in the current frame to the landmark.
7. Repeat steps 5 and 6 for the full history of a given landmark, triangulating it multiple times. The center of these points will reliably represent the true location of the landmark.
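
The law-of-sines calculation and the final averaging step can be sketched as follows. This is an illustrative 2D sketch, not the code in this repo: `Point2D`, `distanceToLandmark`, `landmarkPoint`, and `centerOf` are hypothetical names, `B` and `C` are treated as magnitudes of the triangle's interior angles, and "center" is assumed to mean the arithmetic mean of the triangulated points.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2D {
    double x;
    double y;
};

// Law of sines: b = c * sin(B) / sin(C).
// Returns false when |C| < t2, meaning the change in bearing is too
// small to triangulate. Assumption: t2 > 0, all angles in radians.
bool distanceToLandmark(double c, double B, double C, double t2, double& b) {
    if (std::fabs(C) < t2) return false;
    b = c * std::sin(std::fabs(B)) / std::sin(std::fabs(C));
    return true;
}

// Landmark position from the current camera position, the world-frame
// bearing to the landmark, and the triangulated distance b.
Point2D landmarkPoint(const Point2D& camera, double worldBearing, double b) {
    return Point2D{camera.x + b * std::cos(worldBearing),
                   camera.y + b * std::sin(worldBearing)};
}

// Arithmetic mean of repeated triangulations of the same landmark.
// Precondition: points is non-empty.
Point2D centerOf(const std::vector<Point2D>& points) {
    Point2D sum{0.0, 0.0};
    for (std::size_t i = 0; i < points.size(); ++i) {
        sum.x += points[i].x;
        sum.y += points[i].y;
    }
    const double n = static_cast<double>(points.size());
    return Point2D{sum.x / n, sum.y / n};
}
```

For example, with a displacement of 1 m, `B` of 60 degrees, and `C` of 30 degrees, the distance `b` comes out to about 1.73 m. Because `sin(C)` is in the denominator, small values of `C` amplify any error in the measured bearings, which is why a threshold on `C` matters.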


