A Python implementation of classical monocular Visual Odometry using feature-based tracking and geometric pose estimation.
Visual Odometry (VO) estimates the trajectory of a camera by analyzing the motion of features in sequential images. This project demonstrates the fundamental pipeline used in robotics, autonomous vehicles, and AR/VR systems.
Key Features:
- FAST corner detection for robust feature extraction
- KLT optical flow for efficient feature tracking
- Essential Matrix decomposition for camera pose estimation
- Real-time trajectory visualization
┌──────────────────┐
│   Video/Image    │
│     Sequence     │
└────────┬─────────┘
         │
         v
┌──────────────────┐
│ Feature Detection│  ← FAST Algorithm
│   (Frame N-1)    │
└────────┬─────────┘
         │
         v
┌──────────────────┐
│ Feature Tracking │  ← Lucas-Kanade Optical Flow
│    (Frame N)     │
└────────┬─────────┘
         │
         v
┌──────────────────┐
│ Pose Estimation  │  ← Essential Matrix + RANSAC
│      (R, t)      │
└────────┬─────────┘
         │
         v
┌──────────────────┐
│    Trajectory    │  ← Integrate Motion
│      Update      │
└──────────────────┘
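The stages above map onto a simple per-frame loop. A minimal sketch of that loop is shown below; the structure and thresholds (e.g. re-detecting below 500 tracked points) are illustrative, not the project's actual API:

```python
import cv2
import numpy as np

def run_vo(cap, K):
    """Minimal monocular VO loop over a cv2.VideoCapture; K is the 3x3 intrinsic matrix."""
    detector = cv2.FastFeatureDetector_create(threshold=20)
    ok, frame = cap.read()
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    p0 = np.float32([kp.pt for kp in detector.detect(prev)]).reshape(-1, 1, 2)
    R_g, t_g = np.eye(3), np.zeros((3, 1))

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Track features, then estimate relative pose from the inlier correspondences
        p1, st, _ = cv2.calcOpticalFlowPyrLK(prev, curr, p0, None)
        good0, good1 = p0[st.flatten() == 1], p1[st.flatten() == 1]
        E, _ = cv2.findEssentialMat(good1, good0, K, method=cv2.RANSAC)
        _, R, t, _ = cv2.recoverPose(E, good1, good0, K)

        # Integrate motion (scale fixed at 1.0: monocular scale is ambiguous)
        t_g = t_g + R_g @ t
        R_g = R_g @ R

        # Re-detect when tracks run low, then advance to the next frame
        if len(good1) < 500:
            good1 = np.float32([kp.pt for kp in detector.detect(curr)]).reshape(-1, 1, 2)
        prev, p0 = curr, good1.reshape(-1, 1, 2)
```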
- Python 3.8+
- pip
```bash
# Clone the repository
git clone https://github.com/yourusername/VisualOdometry.git
cd VisualOdometry

# Install dependencies
pip install -r requirements.txt
```

Run on your own video:

```bash
python main.py --video path/to/your/video.mp4
```

Run on a KITTI sequence:

- Download a sequence from the KITTI Odometry Dataset
- Run:

```bash
python main.py --kitti path/to/dataset/sequences/00
```

Controls:

- ESC: Exit the application
Detects corner features in the image using the FAST (Features from Accelerated Segment Test) algorithm:

```python
detector = cv2.FastFeatureDetector_create(threshold=20)
keypoints = detector.detect(image)
```
Tracks features from frame N-1 to frame N using Lucas-Kanade optical flow:

```python
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_img, curr_img, p0, None)
```
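A slightly fuller sketch with explicit tracker parameters. The 21×21 window matches the configuration table below; the pyramid depth and termination criteria are illustrative choices:

```python
import cv2

lk_params = dict(
    winSize=(21, 21),  # search window per pyramid level
    maxLevel=3,        # number of pyramid levels
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01),
)
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_img, curr_img, p0, None, **lk_params)

# Keep only the successfully tracked correspondences
good_old = p0[status.flatten() == 1]
good_new = p1[status.flatten() == 1]
```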
Computes the Essential Matrix E, which encodes the relative camera motion:

E = K^T @ F @ K

Where:

- `K` = camera intrinsic matrix
- `F` = fundamental matrix

In practice, OpenCV estimates E directly from point correspondences with RANSAC:

```python
E, mask = cv2.findEssentialMat(points1, points0, focal, pp, method=cv2.RANSAC)
```
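The same call with the RANSAC settings from the configuration table (0.999 confidence, 1.0 px threshold); `focal` and `pp` are the focal length and principal point from the camera intrinsics:

```python
E, mask = cv2.findEssentialMat(
    points1, points0,
    focal=focal, pp=pp,
    method=cv2.RANSAC, prob=0.999, threshold=1.0,
)
```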
Decomposes E into rotation (R) and translation (t):

```python
_, R, t, mask = cv2.recoverPose(E, points1, points0, focal, pp)
```

Updates the global pose by integrating the relative motion:

```python
t_global = t_global + R_global @ (scale * t)
R_global = R_global @ R
```
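A minimal, self-contained version of this integration step in NumPy; `scale` defaults to 1.0 here since monocular VO only recovers relative scale (see Limitations):

```python
import numpy as np

R_global = np.eye(3)         # accumulated rotation
t_global = np.zeros((3, 1))  # accumulated camera position

def integrate(R, t, scale=1.0):
    """Compose the relative motion (R, t) from recoverPose into the global pose."""
    global R_global, t_global
    t_global = t_global + R_global @ (scale * t)
    R_global = R_global @ R
```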
The epipolar constraint states that for corresponding normalized image points p1 and p2:

p2^T @ E @ p1 = 0

Properties of E:

- Rank-2 matrix (singular)
- 5 degrees of freedom
- Decomposition:
E = [t]_x @ R
Where [t]_x is the skew-symmetric matrix of the translation vector t.
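These properties are easy to check numerically. A self-contained sanity check, building E = [t]_x @ R from an arbitrary rotation and translation (values illustrative):

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

theta = 0.1  # small rotation about the z-axis
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 0.2, 0.0])

E = skew(t) @ R
print(np.linalg.matrix_rank(E))            # 2 -> rank-deficient, as expected
print(np.linalg.svd(E, compute_uv=False))  # singular values ~ (s, s, 0)
```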
| Parameter | Value | Purpose |
|---|---|---|
| FAST threshold | 20 | Corner detection sensitivity |
| KLT window size | 21×21 | Optical flow search area |
| RANSAC probability | 0.999 | Outlier rejection confidence |
| RANSAC threshold | 1.0 px | Inlier distance threshold |
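Collecting these values in one place makes tuning easier; a hypothetical `config.py` mirroring the table above:

```python
# Hypothetical central configuration (values from the table above)
FAST_THRESHOLD = 20      # corner detection sensitivity
KLT_WIN_SIZE = (21, 21)  # optical flow search window
RANSAC_PROB = 0.999      # outlier rejection confidence
RANSAC_THRESHOLD = 1.0   # inlier distance threshold, pixels
```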
- Scale Ambiguity: Monocular VO recovers the trajectory only up to an unknown global scale factor; absolute distances cannot be determined from images alone (a common workaround is sketched after this list)
- Drift: Without loop closure, position error accumulates over time
- Lighting Sensitivity: FAST features may fail in extreme lighting conditions
- Rotation-Only Motion: Fails when camera only rotates (no translation)
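On KITTI, a common workaround for the scale ambiguity is to take the per-frame scale from the ground-truth poses. An illustrative sketch, assuming `gt` is an N×12 array of flattened 3×4 pose matrices loaded from a KITTI poses file:

```python
import numpy as np

def get_absolute_scale(gt, frame_id):
    """Distance between consecutive ground-truth camera positions."""
    prev_xyz = gt[frame_id - 1].reshape(3, 4)[:, 3]
    curr_xyz = gt[frame_id].reshape(3, 4)[:, 3]
    return np.linalg.norm(curr_xyz - prev_xyz)
```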
- Add depth estimation using pre-trained neural networks
- Implement bundle adjustment for trajectory optimization
- Add loop closure detection
- Replace FAST with learned features (SuperPoint)
- Add IMU fusion (Visual-Inertial Odometry)
- Implement local mapping (Visual SLAM)
- Nistér, D. (2004). "An Efficient Solution to the Five-Point Relative Pose Problem"
- Scaramuzza, D., & Fraundorfer, F. (2011). "Visual Odometry: Part I & II"
- Hartley, R., & Zisserman, A. (2003). *Multiple View Geometry in Computer Vision*
Contributions are welcome! Feel free to open issues or submit pull requests.
MIT License - see LICENSE file for details
- KITTI Vision Benchmark Suite for datasets
- OpenCV community for excellent documentation
- NASA JPL for inspiring Mars rover navigation work
Built for learning Visual Odometry fundamentals and preparing for Computer Vision interviews in Robotics and AR/VR.