
HenryLee0314/ECCV2022-MPF-net


Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion

This project is the official repository of the ECCV2022 paper: Deep 360° Optical Flow Estimation Based on Multi-Projection Fusion.

Authors: Yiheng Li, Connelly Barnes, Kun Huang, and Fang-Lue Zhang

paper, dataset, video

Abstract

Optical flow computation is essential in the early stages of the video processing pipeline. This paper focuses on a less explored problem in this area, the 360° optical flow estimation using deep neural networks to support increasingly popular VR applications. To address the distortions of panoramic representations when applying convolutional neural networks, we propose a novel multi-projection fusion framework that fuses the optical flow predicted by the models trained using different projection methods. It learns to combine the complementary information in the optical flow results under different projections. We also build the first large-scale panoramic optical flow dataset to support the training of neural networks and the evaluation of panoramic optical flow estimation methods. The experimental results on our dataset demonstrate that our method outperforms the existing methods and other alternative deep networks that were developed for processing 360° content.
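The fusion idea described in the abstract — combining the flow fields predicted under different projections using learned per-pixel weights — can be illustrated with a minimal NumPy sketch. This is a generic weighted-fusion illustration under assumed inputs, not the paper's actual network:

```python
# Minimal sketch of per-pixel flow fusion (an illustration, NOT the
# paper's exact fusion module): blend two (H, W, 2) flow predictions
# from different projections using per-pixel confidence weights.
import numpy as np

def fuse_flows(flow_a, flow_b, logits_a, logits_b):
    """Blend two (H, W, 2) flow fields with pixel-wise softmax weights."""
    # Numerically stable softmax over the two confidence maps, per pixel.
    m = np.maximum(logits_a, logits_b)
    wa = np.exp(logits_a - m)
    wb = np.exp(logits_b - m)
    total = wa + wb
    wa, wb = wa / total, wb / total
    # Broadcast the scalar weights over the (u, v) flow channels.
    return wa[..., None] * flow_a + wb[..., None] * flow_b

# Toy example: equal confidence simply averages the two predictions.
fa = np.zeros((2, 2, 2))
fb = np.ones((2, 2, 2))
fused = fuse_flows(fa, fb, np.zeros((2, 2)), np.zeros((2, 2)))
print(fused[0, 0])  # [0.5 0.5]
```

In the paper this weighting is learned by a fusion network rather than supplied by hand; the sketch only shows how complementary predictions can be merged per pixel.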

Poster

Video

A video presentation is available at Vimeo. Please have a look.

For those who cannot watch the video due to network issues, a compressed copy is available at poster_and_video/video.mp4.

Citation

@inproceedings{li2022deep,
  title={Deep 360∘ Optical Flow Estimation Based on Multi-projection Fusion},
  author={Li, Yiheng and Barnes, Connelly and Huang, Kun and Zhang, Fang-Lue},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXV},
  pages={336--352},
  year={2022},
  organization={Springer}
}

Requirements

The code is largely based on the PWC-Net optical flow network, updated to a recent PyTorch version. In addition, to accelerate conversion between the different projections, a C++ and OpenCL environment is required for parallel computing.

Python side:

Please refer to the requirements.txt. It requires:

  - Python 3.9
  - PyTorch 1.12
  - CUDA 11.6
  - OpenCV
  - NumPy
  - the nvcc toolchain

C++/OpenCL side:

Please refer to the CMakeLists.txt. It requires:

  - CMake >= 3.4
  - a C/C++ compiler
  - OpenCV 4.6
  - OpenCL 2.2
  - pybind11

Installation

  1. On Ubuntu 20.04, enter the project folder and run the install.sh script; it installs the CUDA operators.

  2. Build the OpenCL code and copy the resulting shared library into the project folder:

mkdir build && cd build && cmake .. && make -j8
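The build produces a pybind11 shared library that must sit next to the Python scripts to be importable. A small sketch for checking this from Python — note the module name `projection_ocl` is hypothetical; the actual name is set by `PYBIND11_MODULE` in the C++ source:

```python
# Check whether the compiled pybind11 extension is importable.
# "projection_ocl" is a hypothetical name — substitute the module name
# declared via PYBIND11_MODULE in this repo's C++ code.
import importlib.util

def module_available(name):
    """Return True if a module with this name can be found on sys.path."""
    return importlib.util.find_spec(name) is not None

print(module_available("projection_ocl"))  # True once the .so is in place
```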

Inference

Please update the input arguments at the end of end_to_end_inference.py to point to your model path and dataset path, then run:

python end_to_end_inference.py
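As a hedged sketch of the kind of paths to configure inside end_to_end_inference.py — the variable names and paths below are hypothetical; check the script itself for the actual argument names:

```python
# Hypothetical example of the inputs to set before running
# end_to_end_inference.py; the real variable names may differ.
model_path = "./checkpoints/mpf_net.pth"   # trained fusion model weights
dataset_path = "./data/panoramic_flow/"    # panoramic optical flow dataset
```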

Contact

Please feel free to contact me (Yiheng) at leehenry0314@me.com, or open a GitHub issue. :)

Known issues

The projection algorithm was ported from pure C++ to OpenCL for speed, which introduced a small loss of precision. Future work includes adding an end-to-end training script and fine-tuning some of the fusion models.
