A deep dive into classical and modern 3D Computer Vision concepts and techniques, applied here to the calibrated, synchronized stereo camera imagery provided by the "KITTI Dataset".
DLT Triangulation (described by MVG) of corresponding image point-feature pairs.
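As a rough illustration of the idea, the homogeneous DLT step for a single correspondence can be sketched in a few lines of numpy. This is a minimal sketch, not the code in main_stereo_dlt_triangulation.py: the function name `triangulate_dlt`, the helper `project`, and the camera values (focal length, principal point, baseline) are hypothetical and only loosely KITTI-like.

```python
import numpy as np

def triangulate_dlt(P_left, P_right, x_left, x_right):
    """Triangulate one 3D point from a pixel correspondence via homogeneous DLT."""
    u1, v1 = x_left
    u2, v2 = x_right
    # Each view contributes two rows of the homogeneous system A X = 0.
    A = np.stack([
        u1 * P_left[2] - P_left[0],
        v1 * P_left[2] - P_left[1],
        u2 * P_right[2] - P_right[0],
        v2 * P_right[2] - P_right[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix and return pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical rectified stereo rig (illustrative values, not read from calibration files).
f, cx, cy, baseline = 700.0, 600.0, 180.0, 0.54
K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-baseline], [0.0], [0.0]])])

# Round trip: project a known point into both views, then triangulate it back.
X_true = np.array([2.0, -1.0, 15.0])
X_est = triangulate_dlt(P_left, P_right, project(P_left, X_true), project(P_right, X_true))
print(X_est)
```

With noise-free pixels the recovered point matches the input up to floating-point error. With real feature matches the four equations are no longer exactly consistent, and the SVD returns the algebraic least-squares solution.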
Please watch: Full Video Recordings on YouTube
Note how the triangulation resolution diminishes rapidly along the Line-of-Sight (Z axis).
Point color represents Z-Axis depth.
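The loss of resolution along Z follows from the rectified stereo relation Z = f * b / d: a fixed disparity error maps to a depth error that grows with the square of the depth. The sketch below only illustrates that relation; the focal length and baseline are hypothetical values, and `depth_error` is a first-order approximation.

```python
f_px = 700.0       # hypothetical focal length in pixels
baseline_m = 0.54  # hypothetical stereo baseline in metres

def depth_from_disparity(d_px):
    """Depth of a point in a rectified stereo pair: Z = f * b / d."""
    return f_px * baseline_m / d_px

def depth_error(Z_m, disparity_error_px=1.0):
    """First-order depth uncertainty: |dZ| ~= Z^2 / (f * b) * |dd|."""
    return Z_m ** 2 / (f_px * baseline_m) * disparity_error_px

for Z in (5.0, 10.0, 20.0, 40.0):
    print(f"Z = {Z:5.1f} m -> about {depth_error(Z):.2f} m per pixel of disparity error")
```

Doubling the depth quadruples the depth uncertainty, which is why distant points spread out along the line of sight.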
Optical Flow (described by Ma et al.) of a fixed grid of points over a camera image sequence. It is Multi-Scale (i.e. it recursively downsamples the images into a pyramid) and applies Gradient-based (Lucas and Kanade) computations.
Full Video Recordings on YouTube
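The gradient-based step can be sketched for a single window at a single pyramid level as follows. This is an illustrative sketch rather than the code in main_optical_flow.py: the function name `lucas_kanade_window`, the window size, and the synthetic test image are assumptions. A multi-scale version would typically repeat this step from the coarsest pyramid level to the finest, using the upsampled flow from the previous level as the starting estimate.

```python
import numpy as np

def lucas_kanade_window(img0, img1, row, col, half=7):
    """Estimate the flow (u, v) of the window centred at (row, col) in one Lucas-Kanade step."""
    Iy, Ix = np.gradient(img0)  # spatial gradients (axis 0 = y, axis 1 = x)
    It = img1 - img0            # temporal gradient
    win = (slice(row - half, row + half + 1), slice(col - half, col + half + 1))
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    # Least-squares solution of the brightness-constancy equations Ix*u + Iy*v = -It.
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # (u, v) in pixels

def texture(x, y):
    """Smooth synthetic image used only for this demonstration."""
    return np.sin(0.4 * x) + np.cos(0.3 * y)

# Shift the texture by a known sub-pixel motion and try to recover it.
y, x = np.mgrid[0:64, 0:64].astype(float)
u_true, v_true = 0.3, -0.2
img0 = texture(x, y)
img1 = texture(x - u_true, y - v_true)

flow = lucas_kanade_window(img0, img1, row=32, col=32)
print(flow)
```

The estimate is only approximate because the brightness-constancy equation is a first-order linearization, which is also why large motions need the coarse-to-fine pyramid.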
Only numpy, matplotlib, scipy, scikit-image, pytest packages are required for this script.
To run within a virtual environment, create a separate virtual environment for the new project:
python3 -m venv .venv   # Create the virtual environment in the .venv directory
source .venv/bin/activate   # Activate it by sourcing the activate script
pip install -r requirements.txt   # Install required packages
pytest   # Run Unit Tests
python3 main_stereo_dlt_triangulation.py   # Run DLT Triangulation over example image pair
python3 main_optical_flow.py   # Run Optical Flow over example image pair
deactivate   # Deactivate the virtual environment before closing the terminal
- Original Project Page: https://webdiis.unizar.es/~raulmur/orbslam/
- Original Code: https://github.com/raulmur/ORB_SLAM2
Please watch: My ORB-SLAM2 Stereo-KITTI Outputs on YouTube
Several minor tweaks were made to run on:
- Ubuntu 24.04.4, C++ 14, opencv 4.6.0, eigen3 3.4.0, Pangolin 0.9.5
Apply refactor_for_upgraded_deps_ORBSLAM2.patch to your cloned ORB-SLAM2 repo to run locally.
ORB-SLAM computes in real-time the camera trajectory and sparse 3D scene reconstruction for Monocular, Stereo, and RGB-D Cameras. It is Keypoint (ORB feature) based, and employs Bundle-Adjustment to close large loops.
Unlike Monocular-SLAM, the Stereo-SLAM estimates the map and trajectory with metric scale and does not suffer from scale drift.
See below how Stereo ORBSLAM arrives at the Loop Closure point with almost no visible drift, while the Monocular ORBSLAM has accumulated significant drift.
Both Mono and Stereo detect the Loop Closure and refine the total Map via Bundle Adjustment upon detection.
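The reason a single camera cannot recover metric scale can be checked numerically: scaling the whole scene and the camera translation by the same factor leaves every pixel unchanged. The sketch below is a generic pinhole-camera illustration, not ORB-SLAM2 code; the intrinsics, the motion, and the scale factor are hypothetical.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N, 3) into pixel coordinates (N, 2)."""
    x_cam = X @ R.T + t
    x_img = x_cam @ K.T
    return x_img[:, :2] / x_img[:, 2:3]

# Hypothetical intrinsics and a random scene in front of the camera.
K = np.array([[700.0, 0.0, 600.0], [0.0, 700.0, 180.0], [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
X = rng.uniform([-5.0, -2.0, 8.0], [5.0, 2.0, 30.0], size=(20, 3))

R = np.eye(3)
t = np.array([-0.54, 0.0, 0.0])  # hypothetical motion between two views
s = 3.7                          # arbitrary scale factor

# Scaling the scene and the translation together gives identical images in both views.
first_view_same = np.allclose(project(K, np.eye(3), np.zeros(3), X),
                              project(K, np.eye(3), np.zeros(3), s * X))
second_view_same = np.allclose(project(K, R, t, X), project(K, R, s * t, s * X))
print(first_view_same, second_view_same)
```

In a calibrated stereo rig the translation between the two cameras is the known baseline, so the factor `s` is fixed and the reconstruction is metric.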
Direct-Sparse-Odometry with Loop Closure (LDSO) from the Technical University of Munich (Gao, Wang, Demmel, Cremers)
- Original Project Page: https://cvg.cit.tum.de/research/vslam/ldso
- Original Code: https://github.com/tum-vision/LDSO
Please watch: My LDSO KITTI Outputs on YouTube
A few minor tweaks were made to run on:
- Ubuntu 24.04.4, C++ 14, opencv 4.6.0, eigen3 3.4.0, Pangolin 0.9.5
Apply refactor_for_upgraded_deps_LDSO.patch to your cloned LDSO repo to run locally.
LDSO employs
- the original DSO (Engel, Koltun, Cremers) as a camera tracking front-end
- an additional Loop-Closure-Detection and Pose-Graph Optimization as a back-end.
See below the difference between the Trajectory before (red line) and after (yellow line) Loop-Detection & Global Optimization.
As a monocular SLAM, it accumulates drift (error) in the unobservable degrees of freedom, i.e. global translation, rotation and scale. Upon Loop Closure detection, this accumulated error is resolved by a global Pose-Graph Optimization. Nonetheless, as a Monocular SLAM, the absolute scale remains unobservable. This is not the case in Stereo ORB-SLAM2 shown previously.
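How a loop-closure constraint redistributes accumulated drift can be illustrated with a deliberately simplified pose graph. The sketch below is a translation-only 2D toy solved by linear least squares; it ignores rotation and scale and is not LDSO's optimizer. The function name `optimize_pose_graph`, the edge weights, and the constant drift model are assumptions made for illustration.

```python
import numpy as np

def optimize_pose_graph(odometry, loop_pairs, loop_weight=10.0):
    """Translation-only pose graph: solve for 2D positions from relative measurements.

    odometry:   (N-1, 2) measured displacement between consecutive poses
    loop_pairs: list of (i, j, delta) constraints p_j - p_i = delta
    """
    n = len(odometry) + 1
    rows, rhs = [], []

    def add_edge(i, j, delta, w):
        row = np.zeros(n)
        row[i], row[j] = -w, w
        rows.append(row)
        rhs.append(w * np.asarray(delta, dtype=float))

    for k, d in enumerate(odometry):
        add_edge(k, k + 1, d, 1.0)
    for i, j, delta in loop_pairs:
        add_edge(i, j, delta, loop_weight)

    # Gauge fixing: pin the first pose at the origin with a strong prior.
    anchor = np.zeros(n)
    anchor[0] = 1e3
    rows.append(anchor)
    rhs.append(np.zeros(2))

    positions, *_ = np.linalg.lstsq(np.vstack(rows), np.vstack(rhs), rcond=None)
    return positions

# Square loop of 40 unit steps, corrupted by a small constant drift per step.
steps = np.repeat(np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]), 10, axis=0)
odometry = steps + np.array([0.02, 0.01])

dead_reckoned = np.vstack([np.zeros(2), np.cumsum(odometry, axis=0)])
optimized = optimize_pose_graph(odometry, [(0, 40, (0.0, 0.0))])

gap_before = np.linalg.norm(dead_reckoned[-1] - dead_reckoned[0])
gap_after = np.linalg.norm(optimized[-1] - optimized[0])
print(gap_before, gap_after)
```

The single loop edge pulls the end of the trajectory back onto its start and spreads the correction over all odometry edges, which mirrors the red-to-yellow change in the trajectory plot. Real drift is not constant, so the corrected path is only as good as the relative measurements it is built from.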
Geiger A, Lenz P, Stiller C, Urtasun R, Vision meets Robotics: The KITTI Dataset, International Journal of Robotics Research (IJRR), 2013, https://www.cvlibs.net/datasets/kitti/raw_data.php
Hartley R, Zisserman A, Multiple View Geometry in Computer Vision, 2003, Cambridge University Press, 2nd edition
Ma Y, Soatto S, Košecká J, Sastry S S, An Invitation to 3-D Vision: From Images to Geometric Models, 2004, Springer-Verlag
Qian-Yi Zhou, Jaesik Park and Vladlen Koltun, Open3D: A Modern Library for 3D Data Processing, arXiv:1801.09847, 2018
Raúl Mur-Artal and Juan D. Tardós. ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras. ArXiv preprint arXiv:1610.06475, 2016.
Raúl Mur-Artal, J. M. M. Montiel and Juan D. Tardós. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, October 2015. (2015 IEEE Transactions on Robotics Best Paper Award)
J. Engel, V. Koltun, and D. Cremers, Direct Sparse Odometry, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 3, pp. 611–625, 2018
X. Gao, R. Wang, N. Demmel, and D. Cremers, LDSO: Direct Sparse Odometry with Loop Closure, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 2018
Torralba, A. and Isola, P. and Freeman, W.T. Foundations of Computer Vision, 2024, Adaptive Computation and Machine Learning series, MIT Press, https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/

