Welcome to the 3D Vision Playground repository! This repository is a hub for my explorations and experiments in 3D computer vision. Here you'll find a collection of projects, code, and resources covering a range of topics in 3D vision.
This repository houses projects related to point clouds, epipolar geometry, Structure from Motion (SfM), Monocular Depth Estimation, Feature Tracking, Optical and Scene Flow, Neural Radiance Fields, and other exciting topics in 3D computer vision. Whether you're a beginner or an experienced practitioner, you'll find something intriguing to explore and learn.
- Gain a deep understanding of fundamental concepts in 3D computer vision.
- Implement algorithms and models from scratch to solidify understanding.
- Explore modern techniques and applications in the field.
- Contribute to open-source projects and collaborate with the community.
Point Cloud Processing:
- Explore point cloud data structures and visualization techniques.
- Implement algorithms for point cloud registration and alignment.
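As a small illustration of a point cloud data structure, the sketch below stores a cloud as a list of `(x, y, z)` tuples and downsamples it with a voxel grid, replacing all points that fall in the same voxel with their centroid. The function name `voxel_downsample` and the parameter `voxel_size` are hypothetical names chosen for this example, not part of any project in this repository; libraries such as Open3D offer optimized equivalents.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace all points that fall in the same voxel with their centroid.

    points: iterable of (x, y, z) tuples. voxel_size: edge length of a voxel.
    """
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")

    # Group points by the integer index of the voxel that contains them.
    buckets = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)

    # Emit one centroid per occupied voxel, in sorted voxel order so the
    # output is deterministic.
    downsampled = []
    for key in sorted(buckets):
        pts = buckets[key]
        n = len(pts)
        downsampled.append(tuple(sum(axis) / n for axis in zip(*pts)))
    return downsampled
```

A dictionary keyed by voxel index keeps this sketch simple; for large clouds a spatial index such as a k-d tree or octree is the more common choice.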
Epipolar Geometry:
- Understand the principles of stereo vision and epipolar geometry.
- Implement stereo matching algorithms for depth estimation.
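For a rectified stereo pair, corresponding pixels lie on the same image row, so matching reduces to a 1D search, and depth follows from disparity as Z = f * B / d. The sketch below shows both steps on a single scanline using a sum-of-absolute-differences (SAD) cost. The function names, the window size, and the search range are assumptions made for this example; a practical matcher would add subpixel refinement and consistency checks.

```python
def match_scanline(left_row, right_row, x, half_window, max_disparity):
    """Return the disparity d that minimizes the SAD cost between the window
    centered at x in the left row and the window centered at x - d in the
    right row. Returns None if no candidate window fits inside the rows.
    Assumes both rows have the same length."""
    best_d, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        # Skip candidates whose window would leave the image.
        if x - d - half_window < 0 or x + half_window >= len(left_row):
            continue
        cost = sum(
            abs(left_row[x + k] - right_row[x - d + k])
            for k in range(-half_window, half_window + 1)
        )
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified pair: Z = f * B / d.
    Returns None for non-positive disparity (no valid depth)."""
    if disparity_px is None or disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px
```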
Structure from Motion (SfM):
- Learn about bundle adjustment and 3D reconstruction from images.
- Implement SfM pipelines for creating 3D models from image sequences.
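Bundle adjustment refines camera poses and 3D points by minimizing the sum of squared reprojection errors. The sketch below computes that cost for a single pinhole camera; it only evaluates the objective and does not perform the optimization. The function names and the convention that `R` and `t` map world coordinates into the camera frame are assumptions made for this example.

```python
def project(point_w, R, t, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates with a pinhole model.

    R is a 3x3 rotation (list of rows) and t a 3-vector, both mapping world
    coordinates into the camera frame: X_cam = R @ X_world + t.
    """
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)


def reprojection_error(points_w, observations, R, t, fx, fy, cx, cy):
    """Sum of squared pixel distances between projected and observed points."""
    total = 0.0
    for point_w, (u_obs, v_obs) in zip(points_w, observations):
        u, v = project(point_w, R, t, fx, fy, cx, cy)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total
```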
Monocular Depth Estimation:
- Dive into deep learning architectures for predicting depth from single images.
- Train and evaluate models on benchmark datasets like KITTI and NYU Depth.
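Depth models are commonly compared using absolute relative error, RMSE, and the threshold accuracy δ < 1.25. The sketch below computes these three metrics over flat lists of predicted and ground-truth depths. The function name `depth_metrics` and the choice to skip pairs with non-positive values (treated here as missing measurements) are assumptions made for this example; benchmark-specific details such as depth caps and crops are not modeled.

```python
import math


def depth_metrics(pred, gt, threshold=1.25):
    """Compute AbsRel, RMSE, and threshold accuracy over valid depth pairs.

    Pairs where either value is non-positive are skipped (assumed invalid).
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose prediction is within a factor of `threshold`
    # of the ground truth, in either direction.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < threshold) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}
```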
Feature Tracking:
- Explore feature detection and description algorithms such as ORB and SIFT, and track the resulting features across frames.
- Implement feature-based SLAM systems for real-time tracking and mapping.
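ORB produces binary descriptors, which are typically compared with the Hamming distance. The sketch below shows brute-force nearest-neighbor matching, with each descriptor represented as a Python integer whose bits are the descriptor bits. The function names and the `max_distance` threshold of 64 are assumptions made for this example, not values taken from this repository.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(desc_a, desc_b, max_distance=64):
    """For each descriptor in desc_a, find its nearest neighbor in desc_b.

    Returns a list of (index_a, index_b, distance) tuples, keeping only
    matches whose Hamming distance is at most max_distance.
    """
    matches = []
    for i, da in enumerate(desc_a):
        best_j, best_d = None, max_distance + 1
        for j, db in enumerate(desc_b):
            d = hamming(da, db)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j, best_d))
    return matches
```

Brute-force search is quadratic in the number of descriptors; it is shown here only because it makes the distance computation easy to follow.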
Optical and Scene Flow:
- Study methods for estimating motion and scene flow from image sequences.
- Implement algorithms for dense optical flow and scene flow estimation.
Neural Radiance Fields (NeRF):
- Investigate volume rendering techniques using neural networks.
- Implement NeRF and its variants for synthesizing novel views of 3D scenes.
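NeRF renders a pixel by compositing color samples along a camera ray: each sample contributes with weight T_i * (1 - exp(-σ_i * δ_i)), where T_i is the transmittance accumulated before sample i. The sketch below implements that quadrature for one ray in plain Python. The densities and colors would normally come from the network; here they are passed in directly, and the function name `composite_ray` is an assumption made for this example.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, ordered front to back.

    densities: volume density (sigma) at each sample.
    colors:    (r, g, b) at each sample.
    deltas:    distance between each sample and the next one.
    Returns the rendered (r, g, b) and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that reaches the current sample
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for k in range(3):
            rgb[k] += weight * color[k]
        transmittance *= 1.0 - alpha
    return tuple(rgb), weights
```

The weights sum to at most 1; the remainder is the light that passes through every sample, which implementations often assign to a background color.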
3D Object Detection and Recognition:
- Explore methods for detecting and recognizing objects in 3D point clouds.
- Implement deep learning-based approaches such as PointRCNN and VoteNet.
Semantic 3D Scene Understanding:
- Dive into techniques for understanding the semantics of 3D scenes.
- Implement models for semantic and instance segmentation of point clouds.
3D Reconstruction from Multiple Views:
- Learn about multi-view stereo reconstruction techniques.
- Implement algorithms for dense reconstruction from calibrated image sequences.
Depth Completion and Surface Normal Estimation:
- Explore methods for completing sparse depth maps and estimating surface normals.
- Implement deep learning models for predicting dense depth and surface normal maps.
3D Object Pose Estimation:
- Study techniques for estimating the 3D pose of objects from images or point clouds.
- Implement pose estimation algorithms using geometric and learning-based approaches.
3D Registration and Alignment:
- Explore algorithms for aligning and registering 3D scans or point clouds.
- Implement ICP (Iterative Closest Point) and variants for rigid and non-rigid registration.
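ICP alternates between two steps: pair each source point with its nearest target point, then solve for the rigid transform that best aligns those pairs. The sketch below shows that loop in 2D, where the optimal rotation has a closed form via `atan2`; this is a deliberate simplification, since 3D implementations usually solve the alignment step with an SVD. The function names, the brute-force nearest-neighbor search, and the fixed iteration count are assumptions made for this example.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired 2D points
    src[i] onto dst[i]."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy  # centered source point
        bx, by = dx - cdx, dy - cdy  # centered target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def icp_2d(src, dst, iterations=20):
    """Rigidly align src to dst; returns the transformed copy of src."""
    cur = list(src)
    for _ in range(iterations):
        # Step 1: nearest-neighbor correspondences (brute force).
        pairs = [
            min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in cur
        ]
        # Step 2: best rigid transform for the current correspondences.
        theta, tx, ty = best_rigid_2d(cur, pairs)
        c, s = math.cos(theta), math.sin(theta)
        cur = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in cur]
    return cur
```

Like any ICP variant, this sketch only converges to the correct alignment when the initial misalignment is small enough for the nearest-neighbor pairs to be mostly right.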
3D Reconstruction from Single Images:
- Investigate methods for reconstructing 3D shapes from single images.
- Implement shape-from-X techniques such as shape-from-shading and shape-from-texture.
3D Scene Understanding and Interaction:
- Study methods for understanding and interacting with 3D scenes in real time.
- Implement interactive applications for virtual reality, augmented reality, and gaming.
- An Invitation to 3D Vision: A Tutorial for Everyone by Sunglok Choi and JunHyeok Choi
- 3D-Computer-Vision-Research
- 3D-Machine-Learning
- Stanford CS231A: Computer Vision, From 3D Reconstruction to Recognition
- NUS CS5477 3D Computer Vision
- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
- PyTorch Geometric Tutorials
- IITB CS749: Digital Geometry Processing, Spring 2017 (C++)
- D. A. Forsyth and J. Ponce. Computer Vision: A Modern Approach (2nd Edition). Prentice Hall, 2011.
- R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.
- S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. The MIT Press, 2005.
- R. Szeliski. Computer Vision: Algorithms and Applications (2nd Edition).
- W. L. Hamilton. Graph Representation Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, Vol. 14, No. 3, pp. 1-159, 2020.
- "A Comparative Analysis of SIFT, Harris, FAST, and ORB" by Amit Kumar Gupta, Ajay Sharma, and Anupam Agrawal.
- "A Survey of 3D Object Representation Techniques" by Anil K. Jain and Klaus Schmid.
- "Review of 3D Shape Representation Techniques" by Angeliki Skoura and Theoharis Theoharis.
- "Deep Learning for 3D Computer Vision: A Survey" by Matthias Nießner, Michael Zollhöfer, and Shahram Izadi.
- "3D Object Recognition: A Contemporary Survey" by Ahmet Ekin, Aytül Erçil, and Tarkan Aydın.
- Open3D: A modern library for 3D data processing.
- Point Cloud Library (PCL): A large-scale, open project for point cloud processing.
- TensorFlow 3D: A modular and efficient library designed to bring 3D deep learning capabilities into TensorFlow.
- PyTorch3D: A library for deep learning with 3D data.
Contributions and feedback are welcome! If you're passionate about 3D, 4D, or 5D computer vision or have ideas for improvement, feel free to open an issue or a pull request.
Let's explore the fascinating world of 3D vision together!