sanandeesh/Photogrammetric_ComputerVision

Photogrammetric ComputerVision

A deep dive into classical and modern 3D Computer Vision concepts and techniques, applied here to the calibrated and synchronized stereo camera imagery provided by the "KITTI Dataset".

I. DLT Triangulation of Point Pairs from KITTI Stereo Cameras

Direct Linear Transform (DLT) Triangulation, as described in Multiple View Geometry (MVG, Hartley and Zisserman), of corresponding image point-feature pairs.
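As a minimal sketch of the technique (not the code in main_stereo_dlt_triangulation.py), DLT triangulation stacks two linear constraints per view and solves the homogeneous system with an SVD. The function names, the rig parameters, and the test point below are illustrative assumptions, not values taken from this repository.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Triangulate one 3D point from a pixel correspondence x1 <-> x2.

    P1, P2 are 3x4 projection matrices; x1, x2 are (u, v) pixel coordinates.
    """
    # Each view contributes two rows of A X = 0, derived from x cross (P X) = 0.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # de-homogenize

def project(P, X):
    """Project a 3D point X with projection matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical rectified stereo rig (assumed values, roughly KITTI-like).
f, cx, cy, B = 721.5, 609.6, 172.9, 0.54
K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-B], [0.0], [0.0]])])

# Synthetic check: project a known point into both views, then recover it.
X_true = np.array([2.0, -1.0, 15.0])
X_est = triangulate_dlt(P_left, P_right,
                        project(P_left, X_true), project(P_right, X_true))
print(X_est)
```

With noise-free correspondences the recovered point matches the original up to floating-point error. With real feature matches the SVD gives the algebraic least-squares solution instead.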

Please watch: Full Video Recordings on YouTube

Triangulation Results

Note how the triangulation resolution diminishes rapidly along the Line-of-Sight (Z axis). Point color represents Z-Axis depth.
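The loss of resolution along Z follows from the rectified-stereo relation Z = f B / d. A fixed disparity error of δd pixels maps to a depth error of roughly Z² / (f B) · δd, which grows quadratically with depth. The sketch below evaluates this. The focal length and baseline are assumed, approximately KITTI-like values, not read from this repository's calibration files.

```python
import numpy as np

# Assumed, approximate rig parameters (illustrative only).
focal_px = 721.5     # focal length in pixels
baseline_m = 0.54    # stereo baseline in meters

def depth_error(depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty for a given disparity error."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px

depths = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
errors = depth_error(depths)

for z, dz in zip(depths, errors):
    print(f"Z = {z:5.1f} m  ->  depth error per pixel of disparity ~ {dz:.2f} m")
```

Doubling the depth quadruples the error, which is why distant points spread out along the Line-of-Sight.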

II. Optical Flow from KITTI Camera Sequence

Optical Flow (as described by Ma et al.) of a fixed grid of points over a camera image sequence. It is multi-scale (i.e. it recursively downsamples the images into a pyramid) and applies gradient-based (Lucas and Kanade) computations.
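To illustrate the gradient-based step, here is a minimal single-level Lucas-Kanade sketch for one grid point. It is not the code in main_optical_flow.py: the function name, window size, and synthetic test pattern are assumptions, and the pyramid is omitted. A multi-scale version would repeat this solve from the coarsest level to the finest, warping by the current estimate at each level.

```python
import numpy as np

def lucas_kanade_step(img0, img1, center, half_window=7):
    """Estimate flow (u, v) at center=(row, col) with one least-squares solve."""
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (cols, x).
    Iy, Ix = np.gradient(img0)
    It = img1 - img0  # temporal derivative
    r, c = center
    win = (slice(r - half_window, r + half_window + 1),
           slice(c - half_window, c + half_window + 1))
    # Brightness constancy linearized: Ix*u + Iy*v = -It for every window pixel.
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # (u, v) in pixels along x (columns) and y (rows)

def pattern(xx, yy):
    """Smooth synthetic texture with gradients in both directions."""
    return np.sin(0.2 * xx) + np.cos(0.15 * yy)

# Synthetic pair: the second image is the first shifted by a sub-pixel motion.
yy, xx = np.mgrid[0:64, 0:64].astype(float)
true_flow = np.array([0.3, -0.2])
img0 = pattern(xx, yy)
img1 = pattern(xx - true_flow[0], yy - true_flow[1])

flow = lucas_kanade_step(img0, img1, center=(32, 32))
print(flow)
```

The linearization only holds for small motions, which is the reason for the coarse-to-fine pyramid: large displacements shrink to sub-pixel size at the coarsest level.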

Full Video Recordings on YouTube

Optical Flow Results

Installation:

Only the numpy, matplotlib, scipy, scikit-image, and pytest packages are required for these scripts.

To run within a virtual environment, create a separate virtual environment for the new project

python3 -m venv .venv Specifying .venv as the directory for it.

source .venv/bin/activate Activate the virtual environment by sourcing the activate script.

pip install -r requirements.txt Install required packages

Usage:

pytest Run Unit Tests

python3 main_stereo_dlt_triangulation.py Run DLT Triangulation over example image pair

python3 main_optical_flow.py Run Optical Flow over example image pair

deactivate Deactivate Virtual Environment before closing terminal.

ORB-SLAM2 from Universidad de Zaragoza (Raúl Mur-Artal et al)

Please watch: My ORB-SLAM2 Stereo-KITTI Outputs on YouTube

Several minor tweaks were made to run on:

  • Ubuntu 24.04.4, C++ 14, opencv 4.6.0, eigen3 3.4.0, Pangolin 0.9.5

Apply refactor_for_upgraded_deps_ORBSLAM2.patch to your cloned ORB-SLAM2 repo to run locally.

ORB-SLAM2 computes, in real time, the camera trajectory and a sparse 3D scene reconstruction for Monocular, Stereo, and RGB-D Cameras. It is Keypoint (ORB feature) based, and employs Bundle-Adjustment to close large loops.

Unlike Monocular SLAM, Stereo SLAM estimates the map and trajectory with metric scale and does not suffer from scale drift. See below how Stereo ORB-SLAM2 reaches the Loop Closure point with negligible visible error, while Monocular ORB-SLAM2 has accumulated significant drift.

ORB-SLAM2 Monocular vs Stereo

Both Mono and Stereo detect the Loop Closure and refine the total Map via Bundle Adjustment upon detection.

Direct-Sparse-Odometry with Loop Closure (LDSO) from Technical University Munich (Gao, Engel, Cremers et al)

Please watch: My LDSO KITTI Outputs on YouTube

A few minor tweaks were made to run on:

  • Ubuntu 24.04.4, C++ 14, opencv 4.6.0, eigen3 3.4.0, Pangolin 0.9.5

Apply refactor_for_upgraded_deps_LDSO.patch to your cloned LDSO repo to run locally.

LDSO employs

  • the original DSO (Engel, Koltun, and Cremers) as a camera tracking front-end
  • an additional Loop-Closure-Detection and Pose-Graph Optimization as a back-end.

See below the difference between the Trajectory before (red line) and after (yellow line) Loop-Detection & Global Optimization.

LDSO Outputs

As a monocular SLAM, it accumulates drift and error in the unobservable degrees of freedom, i.e. global translation, rotation, and scale. Upon Loop Closure detection, this accumulated error is resolved by a global Pose-Graph Optimization. Nonetheless, as a Monocular SLAM, scale is still ambiguous and unobservable. This is not the case in Stereo ORB-SLAM2 shown previously.
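The scale ambiguity can be checked numerically: scaling every 3D point and the camera translation by the same factor leaves all image projections unchanged, so a single moving camera cannot tell the two scenes apart. The sketch below is illustrative only. The intrinsics, pose, and scale factor are made-up assumptions and do not come from LDSO or ORB-SLAM2.

```python
import numpy as np

def project(K, R, t, points):
    """Project Nx3 world points into a camera with pose (R, t); returns Nx2 pixels."""
    cam = R @ points.T + t[:, None]   # 3xN points in the camera frame
    pix = K @ cam
    return (pix[:2] / pix[2]).T

# Hypothetical intrinsics and second-camera pose (assumed values).
K = np.array([[721.5, 0.0, 609.6], [0.0, 721.5, 172.9], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.5, 0.0, 0.1])

rng = np.random.default_rng(0)
points = rng.uniform([-5.0, -2.0, 8.0], [5.0, 2.0, 30.0], size=(20, 3))

scale = 3.7  # arbitrary global scale factor
pix_original = project(K, R, t, points)
pix_scaled = project(K, R, scale * t, scale * points)

# Identical pixels: the images carry no information about the global scale.
print(np.max(np.abs(pix_original - pix_scaled)))
```

A stereo rig breaks this ambiguity because the baseline between the two cameras is a known metric length, so it cannot be rescaled along with the scene.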

Resources:

Geiger A, Lenz P, Stiller C, Urtasun R, Vision meets Robotics: The KITTI Dataset, International Journal of Robotics Research (IJRR), 2013, https://www.cvlibs.net/datasets/kitti/raw_data.php

Hartley R, Zisserman A, Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press, 2003

Ma Y, Soatto S, Kosecká J, Sastry S S, An Invitation to 3-D Vision: From Images to Geometric Models, Springer-Verlag, 2004

Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun, Open3D: A Modern Library for 3D Data Processing, arXiv:1801.09847, 2018

Raúl Mur-Artal and Juan D. Tardós. ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras. arXiv preprint arXiv:1610.06475, 2016.

Raúl Mur-Artal, J. M. M. Montiel and Juan D. Tardós. ORB-SLAM: A Versatile and Accurate Monocular SLAM System. IEEE Transactions on Robotics, vol. 31, no. 5, pp. 1147-1163, October 2015. (2015 IEEE Transactions on Robotics Best Paper Award)

J. Engel, V. Koltun, and D. Cremers, Direct Sparse Odometry, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 3, pp. 611–625, 2018

X. Gao, R. Wang, N. Demmel, and D. Cremers, LDSO: Direct Sparse Odometry with Loop Closure, IROS, October 2018

Torralba, A. and Isola, P. and Freeman, W.T. Foundations of Computer Vision, 2024, Adaptive Computation and Machine Learning series, MIT Press, https://mitpress.mit.edu/9780262048972/foundations-of-computer-vision/
