

This is the project page of the paper "Flow-Motion and Depth Network for Monocular Stereo and Beyond" (Paper Link), published in RA-L 2020 and to be presented at ICRA 2020.

This project page contains:

  • the implementation of the method,

  • the GTA-SfM tools and generated dataset.

All components will be open source once the paper is accepted.

The proposed method

In this work, we propose a method that solves monocular stereo and can further fuse depth information from multiple target images. The inputs and outputs of the method are illustrated in the figure below. Given a source image and one or more target images, the proposed method estimates the optical flow and relative pose between each source-target pair. The depth map of the source image is then estimated by fusing the optical flow and pose information.
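To make the relationship between optical flow, relative pose, and depth concrete, here is a minimal geometric sketch. It is not the network from the paper: it is a classical per-pixel least-squares triangulation, assuming a pinhole camera with known intrinsics `K` and a relative pose `(R, t)` such that `X_target = R @ X_source + t`. The function name `depth_from_flow_and_pose` and all parameter names are hypothetical and chosen only for illustration.

```python
import numpy as np


def depth_from_flow_and_pose(flow, K, R, t):
    """Triangulate source-view depth from flow and relative pose (illustrative only).

    Assumptions (not taken from the paper's code):
      flow : (H, W, 2) source-to-target optical flow in pixels
      K    : (3, 3) pinhole intrinsics shared by both views
      R, t : relative pose with X_target = R @ X_source + t
    """
    h, w = flow.shape[:2]
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)  # homogeneous source pixels

    # A source pixel with depth d projects into the target as d * a + b.
    a = pix @ (K @ R @ np.linalg.inv(K)).T  # (H, W, 3)
    b = K @ t                               # (3,)

    # Target pixel locations predicted by the flow.
    u2 = u + flow[..., 0]
    v2 = v + flow[..., 1]

    # Each pixel gives two linear equations A * d = B in the unknown depth d.
    A = np.stack([u2 * a[..., 2] - a[..., 0],
                  v2 * a[..., 2] - a[..., 1]], axis=-1)
    B = np.stack([b[0] - u2 * b[2],
                  b[1] - v2 * b[2]], axis=-1)

    # Closed-form least squares per pixel; pixels with no parallax are ill-conditioned.
    denom = np.maximum(np.sum(A * A, axis=-1), 1e-12)
    return np.sum(A * B, axis=-1) / denom
```

For example, with an identity rotation, a translation of 0.5 along x, a focal length of 100 pixels, and a uniform horizontal flow of 10 pixels, this sketch recovers a depth of 5 at every pixel. A learned method such as the one proposed here fuses this kind of information across several target images rather than solving each pixel independently.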


  • Code

Here, we provide the code to aid understanding of the paper. Following BANet and LS-Net, the trained models are not open-sourced. Please go to flow-motion-depth for more details.

The proposed dataset and tools

Training and evaluating neural networks requires large-scale, high-quality data. In addition to the widely used dataset from DeMoN, we propose rendering a dataset in GTA5 as a supplement. A similar dataset, MVS-Synth, was proposed in DeepMVS. Cameras in the MVS-Synth dataset usually move with small translations. In contrast, the proposed GTA-SfM dataset contains images with much larger view angle changes, which is closer to structure-from-motion (SfM) applications. Below is a comparison of the proposed GTA-SfM (left) and MVS-Synth (right).

  • Extracted dataset

Please visit the extracted_dataset folder for more details.

