
Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization (3DV 2024)


🚨 This repository contains download links to our code and trained deep stereo models for our work "Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization", 3DV 2024

by Luca Bartolomei¹,², Matteo Poggi¹,², Andrea Conti², Fabio Tosi², and Stefano Mattoccia¹,²

¹ Advanced Research Center on Electronic Systems (ARCES), ² University of Bologna

Note: 🚧 Kindly note that this repository is currently in the development phase. We are actively working to add and refine features and documentation. We apologize for any inconvenience caused by incomplete or missing elements and appreciate your patience as we work towards completion.

We would also like to point you to our previous work, Active Stereo Without Pattern Projector, from which we took inspiration for this work.

📑 Table of Contents

  • 🎬 Introduction
  • 📥 Pretrained Models
  • 📝 Code
  • 🛠️ Setup Instructions
  • 💾 Datasets
  • 🚀 Test
  • 🎨 Qualitative Results
  • ✉️ Contacts
  • 🙏 Acknowledgements

🎬 Introduction

This paper proposes a new framework for depth completion that is robust against domain-shifting issues. It exploits the generalization capability of modern stereo networks to tackle depth completion by processing fictitious stereo pairs obtained through a virtual pattern projection paradigm. Any stereo network or traditional stereo matcher can be seamlessly plugged into our framework, allowing for the deployment of a virtual stereo setup that is future-proof against advancements in the stereo field.
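In a nutshell, a single image and its sparse depth hints are turned into a fictitious stereo pair by painting a consistent virtual pattern on both views at the hint locations; any stereo matcher then recovers dense disparity, which is triangulated back to depth. The snippet below is only a conceptual sketch of this idea, not the repository's implementation: the helper names, the random-color patterning, and the reuse of the same image for both virtual views are simplifying assumptions.

import numpy as np

def paint_virtual_pattern(rgb, sparse_disp, rng=None):
    # Toy version of virtual pattern projection: stamp a random color at each
    # hinted pixel of the reference view and at the disparity-shifted column of
    # a fictitious right view (nearest pixel, no occlusion handling).
    rng = rng or np.random.default_rng(0)
    left = rgb.astype(np.float32).copy()
    right = rgb.astype(np.float32).copy()
    ys, xs = np.nonzero(sparse_disp > 0)
    for y, x in zip(ys, xs):
        color = rng.uniform(0, 255, size=3)
        xr = int(round(x - sparse_disp[y, x]))  # matching column in the right view
        if 0 <= xr < rgb.shape[1]:
            left[y, x] = color
            right[y, xr] = color
    return left, right

def complete_depth(rgb, sparse_depth, fx, baseline, stereo_matcher):
    # Depth completion cast as virtual stereo matching (conceptual only):
    # hints -> virtual pattern -> any stereo backbone -> disparity -> depth.
    disp_hints = np.where(sparse_depth > 0,
                          fx * baseline / np.maximum(sparse_depth, 1e-6), 0.0)
    left, right = paint_virtual_pattern(rgb, disp_hints)
    disparity = stereo_matcher(left, right)  # e.g., RAFT-Stereo or SGM
    return np.where(disparity > 0, fx * baseline / np.maximum(disparity, 1e-6), 0.0)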


Contributions:

  • We cast depth completion as a virtual stereo correspondence problem, where two appropriately patterned virtual images enable us to tackle depth completion with robust stereo-matching algorithms or networks.

  • Extensive experimental results with multiple datasets and networks demonstrate that our proposal vastly outperforms the state of the art in terms of generalization capability.

If you find this code useful in your research, please cite:

@inproceedings{bartolomei2023revisiting,
      title={Revisiting Depth Completion from a Stereo Matching Perspective for Cross-domain Generalization}, 
      author={Luca Bartolomei and Matteo Poggi and Andrea Conti and Fabio Tosi and Stefano Mattoccia},
      year={2024},
      booktitle={2024 International Conference on 3D Vision (3DV)},
      organization={IEEE}
}

📥 Pretrained Models

Here, you can download the weights of the RAFT-Stereo architecture.

  • Vanilla Models: these models are pretrained on SceneFlow vanilla images and Middlebury vanilla images.
    • RAFT-Stereo vanilla model (raft-stereo/sceneflow-raftstereo.tar)
  • Fine-tuned Models: starting from the vanilla models, these models (sceneflow-*-raftstereo.tar) are fine-tuned on a specific real domain.
  • Models trained from scratch: these models (*-raftstereo.tar) are trained from scratch using our framework.

To use these weights, please follow these steps:

  1. Install the gdown Python package: pip install gdown
  2. Download all weights from our drive: gdown --folder https://drive.google.com/drive/folders/1AZRHzCn7K7HiPQZocfxWplYHo3WhI8lm?usp=sharing

📝 Code

The Test section provides scripts to evaluate depth estimation models on datasets like VOID, NYU, DDAD and KITTIDC. It helps assess the accuracy of the models and saves predicted depth maps.

Please refer to each section for detailed instructions on setup and execution.

Warning:

  • Please be aware that we will not be releasing the training code for deep models. The provided code focuses on evaluation and demonstration purposes only.
  • With the latest updates in PyTorch, slight variations in the quantitative results compared to the numbers reported in the paper may occur.

🛠️ Setup Instructions

Ensure that you have installed all the necessary dependencies. The list of dependencies can be found in the ./requirements.txt file.

You can also follow this script to create a virtual environment and install all the dependencies:

$ conda create -n "vppdc" python
$ conda activate vppdc
$ python -m pip install -r requirements.txt

💾 Datasets

We used the following datasets for training and evaluation.

NYU Depth V2 (NYUv2)

We used the preprocessed NYUv2 HDF5 dataset provided by Andrea Conti.

$ cd PATH_TO_DOWNLOAD
$ wget https://github.com/andreaconti/sparsity-agnostic-depth-completion/releases/download/v0.1.0/nyu_img_gt.h5
$ wget https://github.com/andreaconti/sparsity-agnostic-depth-completion/releases/download/v0.1.0/nyu_pred_with_500.h5

After that, you will get a data structure as follows:

nyudepthv2
├── nyu_img_gt.h5
└── nyu_pred_with_500.h5

Note that the original full NYUv2 dataset is available at the official website.
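To quickly inspect the two downloaded archives, the snippet below simply lists their contents with h5py; the internal key layout is not documented here, so treat any assumption about it as something to verify against what the script prints.

import h5py

# Minimal sketch: print every group/dataset name stored in the two NYUv2 archives.
for path in ("nyudepthv2/nyu_img_gt.h5", "nyudepthv2/nyu_pred_with_500.h5"):
    print(f"== {path} ==")
    with h5py.File(path, "r") as f:
        f.visit(print)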

VOID

You can download the VOID dataset with different amounts of sparse points (i.e., 150, 500, 1500) using this script:

$ cd PATH_TO_DOWNLOAD
$ ./download_void.sh

After that, you will get a data structure as follows:

void
├── 150
│    ├── void_150
│    │    └── data
│    │          ├── birthplace_of_internet
│    │          └── ...
│    ├── test_absolute_pose.txt
│    └── ...
├── 500
│    ├── void_500
│    │    └── data
│    │          ├── birthplace_of_internet
│    │          └── ...
│    ├── test_absolute_pose.txt
│    └── ...
...

Note that the script deletes the zip files after extraction. The raw VOID dataset is available at the official website.

KITTIDC

You can download KITTIDC validation split from the official website. You can also directly download it:

$ cd PATH_TO_DOWNLOAD
$ mkdir kitti_dc
$ cd kitti_dc
$ wget https://s3.eu-central-1.amazonaws.com/avg-kitti/data_depth_selection.zip
$ unzip data_depth_selection.zip
$ rm data_depth_selection.zip
$ ln -s depth_selection data_depth_selection

After that, you will get a data structure as follows:

kitti_dc
├── val_selection_cropped
│    ├── groundtruth_depth
│    ├── image
│    ├── intrinsics
│    └── velodyne_raw
├── test_depth_completion_anonymous
└── test_depth_prediction_anonymous
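As a side note on the file format: KITTI depth maps (groundtruth_depth and velodyne_raw alike) are 16-bit PNGs where depth in meters equals the pixel value divided by 256, and zero marks pixels without a measurement. A minimal reading sketch (the filename below is purely illustrative) could be:

import numpy as np
from PIL import Image

# Minimal sketch: read a KITTI DC 16-bit depth PNG (value / 256 = meters, 0 = invalid).
# The filename is illustrative only; pick any file from velodyne_raw or groundtruth_depth.
png = np.array(Image.open("kitti_dc/val_selection_cropped/velodyne_raw/0000000000.png"),
               dtype=np.uint16)
depth = png.astype(np.float32) / 256.0
valid = png > 0
print(f"valid points: {valid.sum()}, mean depth: {depth[valid].mean():.2f} m")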

DDAD

First of all, please install the Dataset Governance Policy (DGP) library by following the official guide.

Then, you can download and extract the full dataset:

$ cd PATH_TO_DOWNLOAD
$ wget https://tri-ml-public.s3.amazonaws.com/github/DDAD/datasets/DDAD.tar
$ tar -xvf DDAD.tar

After that, you will get a data structure as follows:

ddad_train_val
├── 000000
├── 000001
├── ...
├── ddad.json
└── LICENSE.md

Finally, you can use our script to generate the validation data from the front camera:

python convert_ddad.py -i /path/to/ddad_train_val -o /your/output/path/sampled_ddad [--seed]

After that, you will get a data structure as follows:

sampled_ddad
└── val
     ├── gt
     │    ├── 0000000000.png
     │    └── ...
     ├── hints
     │    ├── 0000000000.png
     │    └── ...
     ├── intrinsics
     │    ├── 0000000000.txt
     │    └── ...
     └── rgb
          ├── 0000000000.png
          └── ...

🚀 Test

This code snippet allows you to evaluate the depth maps on various datasets, including KITTIDC, NYU, VOID and DDAD. By executing the provided script, you can assess the accuracy of depth completion models on these datasets.

To run the test.py script with the correct arguments, follow the instructions below:

  1. Run the test:

    • Open a terminal or command prompt.
    • Navigate to the directory containing the test.py script.
  2. Execute the command: Run the following command, replacing the placeholders with the actual values for your images and model:

export CUDA_VISIBLE_DEVICES=0 
python test.py  --datapath <path_to_dataset> --dataset <dataset_type> --model <model_name> \
  --loadmodel <path_to_pretrained_model> --maxdisp 192 --outdir <output_directory> \
  --wsize 5 --guideperc 1 --blending 1 --interpolate --filling --leftpadding --filterlidar  \
  --maskocc --iscale <input_image_scale>

Replace the placeholders (<path_to_dataset>, <dataset_type>, <model_name>, etc.) with the actual values for your setup.

The available arguments are:

  • --maxdisp: Maximum disparity range for SGM (default 256).
  • --model: Stereo model type. Options: raft-stereo, sgm
  • --datapath: Specify the dataset path.
  • --dataset: Specify the dataset type. Options: kittidcval, nyudepthv2, void, myddad
  • --outdir: Output directory to save the disparity maps.
  • --loadmodel: Path to the pretrained model file.
  • --iscale: Rescale input images before applying the virtual pattern projection and stereo matching. The original size is restored before evaluation. Example: --iscale 1 equals full resolution, --iscale 2 equals half resolution.
  • --guideperc: Simulate depth seeds using a certain percentage of randomly sampled GT points. Used only if raw depth seeds do not exist.
  • --uniform_color: Use the uniform patch strategy.
  • --wsize: Pattern patch size (e.g., 1, 3, 5, 7, ...).
  • --blending: Alpha-blending between the original images and the virtual pattern.
  • --maskocc: Use occlusion handling.
  • --filterlidar: Filter depth hints (for DDAD and KITTIDC only).
  • --filling: Use our proposed hints filling strategy.
  • --leftpadding: Add left padding to handle left-border occlusions.
  • --interpolate: Splat the virtual projection according to sub-pixel values.

For more details, please refer to the test.py script.
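As a concrete, hypothetical example (all paths and the checkpoint name are placeholders to adapt to your setup), evaluating a RAFT-Stereo model on the VOID split with 500 points could look like:

export CUDA_VISIBLE_DEVICES=0
python test.py --datapath /path/to/void/500 --dataset void --model raft-stereo \
  --loadmodel /path/to/weights/void-raftstereo.tar --maxdisp 192 --outdir ./output/void500 \
  --wsize 5 --blending 1 --interpolate --filling --leftpadding --maskocc --iscale 1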

🎨 Qualitative Results

In this section, we present illustrative examples that demonstrate the effectiveness of our proposal.


Synth-to-real generalization. Given an NYU Depth V2 frame and 500 sparse depth points (a), our framework with RAFT-Stereo trained only on the SceneFlow synthetic dataset (e) generalizes better than the state-of-the-art depth completion networks NLSPN (b), SpAgNet (c), and CompletionFormer (d), all trained on the same synthetic dataset.


From indoor to outdoor. When models are pre-trained on SceneFlow, trained on indoor data, and then run outdoors, a significant domain shift occurs. NLSPN and CompletionFormer seem unable to generalize to outdoor data, while SpAgNet can produce somewhat meaningful depth maps, yet far from accurate ones. Finally, VPP4DC improves the results even further thanks to the pre-training process.


From outdoor to indoor. We consider the case complementary to the previous one, i.e., models pre-trained on SceneFlow, trained outdoors, and then tested indoors. NLSPN, CompletionFormer, and SpAgNet predict depth maps that are reasonable to some extent. Our approach instead predicts very accurate results in regions covered by depth hints, yet fails where these are absent.

✉️ Contacts

For questions, please send an email to luca.bartolomei5@unibo.it

🙏 Acknowledgements

We would like to extend our sincere appreciation to the authors of the following projects for making their code available, which we have utilized in our work:

  • We would like to thank the authors of RAFT-Stereo for providing their code, which has been instrumental in our depth completion experiments.

We also deeply appreciate the authors of the competing methods for providing their code and model weights, which greatly aided accurate comparisons.
