Alexander Veicht · Felix Yang · Andri Horat · Deep Desai · Philipp Lindenberger
Our reimplemented and improved pipeline for MegaDepth generates high-quality depth maps
and camera poses for unstructured images of popular tourist landmarks.
MegaDepth, a dataset of unstructured images of popular tourist landmarks, was introduced in 2018. By leveraging structure-from-motion (SfM) and multi-view stereo (MVS) techniques along with data-cleaning methods, MegaDepth provides camera poses and depth maps for each image. Nonetheless, the results suffer from limitations such as degenerate camera poses, incomplete depth maps, and inaccuracies caused by unregistered images or noise in the pipeline. Despite these flaws, MegaDepth has become a standard dataset for training a wide range of computer vision models, including single-view depth estimation, local features, feature matching, and multi-view refinement. This is primarily due to the diversity of scenes, occlusions, and appearance changes captured in the dataset, which helps these models generalize well. Our project systematically addresses these problems to establish a refined MegaDepth ground-truth (GT) pipeline using recent methods such as hloc and Pixel-Perfect Structure-from-Motion.
Clone and install the repository by running the following commands:
git clone https://github.com/fyangch/RefinedMegaDepth.git
cd RefinedMegaDepth
pip install -e .
Download the South Building dataset and extract it to the data folder:
mkdir data
wget https://demuc.de/colmap/datasets/south-building.zip -O data/south-building.zip
unzip data/south-building.zip -d data
rm -rf data/south-building.zip data/south-building/sparse data/south-building/database.db
Run the following command to start the pipeline:
python -m megadepth.reconstruction scene=south-building
The images are expected to be grouped by scene and stored in the following layout:
data
├── scene_1
│ ├── images
│ │ ├── 00000000.jpg
│ │ ├── 00000001.jpg
│ │ ├── ...
├── scene_2
│ ├── images
│ │ ├── 00000000.jpg
│ │ ├── 00000001.jpg
│ │ ├── ...
├── ...
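Given this layout, a small helper can list the scenes that are ready for reconstruction. This is only a sketch under the layout assumptions above; find_scenes is a hypothetical helper, not part of the repository:

```python
from pathlib import Path

def find_scenes(data_root):
    """Return the names of scene folders under data_root that contain an images/ subfolder."""
    root = Path(data_root)
    if not root.is_dir():
        return []
    # A folder only counts as a scene if it follows the expected <scene>/images layout.
    return sorted(d.name for d in root.iterdir() if (d / "images").is_dir())
```

On the data folder sketched above, this would return the scene names (e.g. scene_1, scene_2) while skipping any folder without an images subdirectory.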
You can simply run the reconstruction pipeline by specifying the scene name:
python -m megadepth.reconstruction scene=scene_1
The pipeline will read the images from the images folder and create the following folders for the outputs:
data
├── scene_1
│ ├── images
│ ├── features
│ ├── matches
│ ├── sparse
│ ├── dense
│ ├── metrics
│ ├── results
├── scene_2
│ ├── ...
├── ...
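Based on the folder names above, one can quickly check which pipeline outputs already exist for a scene, e.g. to see whether a run completed. This is a sketch under the assumed output layout; output_status is a hypothetical helper, not part of the repository:

```python
from pathlib import Path

# Output folders created by the pipeline, as listed above.
OUTPUT_DIRS = ("features", "matches", "sparse", "dense", "metrics", "results")

def output_status(scene_dir):
    """Map each expected output folder of a scene to whether it exists yet."""
    scene = Path(scene_dir)
    return {name: (scene / name).is_dir() for name in OUTPUT_DIRS}
```

A scene with only sparse and dense present, for instance, would report those two as existing and the remaining four as missing.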
- Fix rotations
- Test MVS
- Remove dependencies (xarray, mit_semseg, ...)
- Check licenses (segmentation models etc.)