danqu130/RPEFlow

RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation (ICCV 2023)

Project Page, arXiv, Supp.

Environment

CUDA 11.7
CUDNN 8.5.0
torch 1.13.0
torchvision 0.14.0
pip install opencv-python tensorboard h5py imageio omegaconf
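
A quick way to compare an installed environment against the versions listed above is sketched below. It is a minimal check and not part of the repository; the helper name `check_versions` and the rule of ignoring a local build suffix such as `+cu117` are assumptions made for illustration.

```python
from importlib import metadata

# Versions listed in the Environment section above.
EXPECTED = {"torch": "1.13.0", "torchvision": "0.14.0"}


def check_versions(expected, lookup=metadata.version):
    """Return {package: (expected, found)} for every mismatch.

    `found` is None when the package is not installed. A local build tag
    such as "1.13.0+cu117" is treated as matching "1.13.0".
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None
        if found is None or found.split("+")[0] != want:
            mismatches[name] = (want, found)
    return mismatches
```

Calling `check_versions(EXPECTED)` returns an empty dict when both packages match; otherwise each entry shows the expected and installed version.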

Compile CUDA extensions from CamLiFlow for faster training and evaluation:

cd models/csrc
python setup.py build_ext --inplace

Dataset

FlyingThings3D

To run the project as-is, download our pre-processed files here.

dataset/FlyingThings3D_subset_pc
├── train_preprocess_ev10_1
├── val_preprocess_ev10_1
Optional: if the pre-processed files do not meet your needs, first download the raw FlyingThings3D_subset dataset, then run the event simulation (with esim_py, currently not available) and point cloud generation steps. Point cloud generation follows CamLiFlow.
python preprocess_flyingthings3d_subset.py # from https://github.com/MCG-NJU/CamLiFlow/blob/main/preprocess_flyingthings3d_subset.py
python scripts/convert_flyingthings3d_subset_hdf5.py --input_dir dataset/FlyingThings3D_subset_pc
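
Before training or evaluation, it can help to confirm that the pre-processed folders sit where the directory tree above shows them. The following sketch only checks that the directories exist; the helper `missing_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Sub-directories shown in the FlyingThings3D tree above.
THINGS_SPLITS = ["train_preprocess_ev10_1", "val_preprocess_ev10_1"]


def missing_dirs(root, names):
    """Return the entries of `names` that are not directories under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]
```

For example, `missing_dirs("dataset/FlyingThings3D_subset_pc", THINGS_SPLITS)` returns an empty list when both splits are in place. The same helper can be pointed at the EKubric and DSEC folders with their own lists of names.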

EKubric

To run the project as-is, download our pre-processed files here.

dataset/ekubric/
├── sf_preprocess
Optional: if the pre-processed files do not meet your needs, first download the raw EKubric dataset, then run the point cloud generation and preprocessing step.
dataset/ekubric/
├── backward_flow
├── depth
├── events_i50_c0.15
├── forward_flow
├── metadata
├── rgba
├── segmentation
python convert_kubric_hdf5.py --input_dir dataset/ekubric

DSEC

To run the project as-is, download our pre-processed files here.

dataset/DSEC/
├── train_preprocess_pc

Because the official test set has no ground-truth flow, we split the official training set into a train set and a val set. See TRAIN_SEQUENCE for details.

Optional: if the pre-processed files do not meet your needs, first download the raw DSEC dataset, then generate the disparity maps and the aligned images (event-frame alignment). Note that the disparity used here is not the official sparse disparity; it is computed with the stereo matching model CFNet.
dataset/DSEC/
├── train
│   ├── thun_00_a 
│       ├── calibration
│       ├── disparity
│       ├── events
│       ├── flow
│       └── images
...
├── train_events.zip
├── train_images.zip
├── train_optical_flow.zip
├── train_calibration.zip
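
Turning a disparity map into a point cloud usually relies on the standard rectified-stereo relation Z = f * B / d followed by pinhole back-projection. The sketch below illustrates that relation in plain Python, assuming this is how the DSEC point clouds are derived. It is not the repository's preprocessing code, and the function name and all argument names (`fx`, `fy`, `cx`, `cy`, `baseline`) are placeholders chosen for illustration.

```python
def disparity_to_points(disparity, fx, fy, cx, cy, baseline):
    """Back-project a disparity map (list of rows, in pixels) to 3D points.

    Uses Z = fx * baseline / d, X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with non-positive disparity are skipped as invalid.
    """
    points = []
    for v, row in enumerate(disparity):
        for u, d in enumerate(row):
            if d <= 0:
                continue
            z = fx * baseline / d
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

The intrinsics and baseline would come from the per-sequence `calibration` folder; how they are stored there is not covered by this sketch.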

Evaluation

Weights

First download our pre-trained model weights here and place them in the checkpoints folder.

Things

python eval_withocc.py --config ./conf/test/things.yaml --weights ./checkpoints/RPEFlow_things.pt
Results
#### 2D Metrics ####
EPE: 1.402
1px: 86.22%
Fl:  5.75%
#### 3D Metrics ####
EPE: 0.042
5cm: 88.00%
10cm: 93.08%
#### 3D Metrics (Non-occluded) ####
EPE: 0.024
5cm: 93.14%
10cm: 96.72%
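
For readers unfamiliar with these numbers, the sketch below shows how the 2D metrics are commonly defined: EPE is the mean end-point error in pixels, 1px is the share of pixels with an error below one pixel, and Fl is the outlier rate. The Fl rule used here (error above 3 px and above 5% of the ground-truth magnitude) follows the KITTI convention and is an assumption; the repository's evaluation scripts remain the authority on the exact definitions.

```python
import math


def flow_metrics_2d(pred, gt):
    """Compute EPE, 1px accuracy (%) and Fl outlier rate (%) for 2D flow.

    `pred` and `gt` are equal-length lists of (u, v) flow vectors in pixels.
    """
    errors = [math.hypot(pu - gu, pv - gv) for (pu, pv), (gu, gv) in zip(pred, gt)]
    mags = [math.hypot(gu, gv) for gu, gv in gt]
    n = len(errors)
    epe = sum(errors) / n
    acc_1px = 100.0 * sum(e < 1.0 for e in errors) / n
    # Assumed KITTI-style outlier rule: error > 3 px AND error > 5% of |gt|.
    outliers = sum(e > 3.0 and e > 0.05 * m for e, m in zip(errors, mags))
    return epe, acc_1px, 100.0 * outliers / n
```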

EKubric

python eval_withocc.py --config ./conf/test/ekubric.yaml --weights ./checkpoints/RPEFlow_ekubric.pt
Results
#### 2D Metrics ####
EPE: 0.439
1px: 95.99%
Fl:  1.48%
#### 3D Metrics ####
EPE: 0.027
5cm: 95.33%
10cm: 96.32%
#### 3D Metrics (Non-occluded) ####
EPE: 0.007
5cm: 98.66%
10cm: 99.19%
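
The 3D metrics above are reported once over all points and once over non-occluded points only. A minimal sketch of that masking, assuming plain absolute thresholds of 0.05 m and 0.10 m and a per-point boolean occlusion flag (both are assumptions; see `eval_withocc.py` for the actual computation):

```python
import math


def flow_metrics_3d(pred, gt, occluded=None):
    """Return (EPE in m, % under 5 cm, % under 10 cm) for 3D scene flow.

    `pred` and `gt` are lists of (x, y, z) vectors in metres. When `occluded`
    is given, points flagged True are excluded from the statistics.
    """
    if occluded is None:
        occluded = [False] * len(gt)
    errors = [math.dist(p, g) for p, g, occ in zip(pred, gt, occluded) if not occ]
    n = len(errors)
    epe = sum(errors) / n
    acc_5cm = 100.0 * sum(e < 0.05 for e in errors) / n
    acc_10cm = 100.0 * sum(e < 0.10 for e in errors) / n
    return epe, acc_5cm, acc_10cm
```

Calling the function without `occluded` corresponds to the "3D Metrics" block, and passing the mask corresponds to the "Non-occluded" block.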

DSEC

python eval_noocc.py --config ./conf/test/dsec.yaml --weights ./checkpoints/RPEFlow_DSEC.pt
Results
#### 2D Metrics ####
EPE: 0.326
1px: 95.28%
Fl:  1.15%
#### 3D Metrics ####
EPE: 0.103
5cm: 60.81%
10cm: 74.97%

Training

Training requires four 24 GB GPUs (we use four RTX 3090s). We first pre-train on FlyingThings3D, then fine-tune on EKubric and DSEC separately. Note that the pre-training stage on FlyingThings3D may take more than 8 days.

export CUDA_VISIBLE_DEVICES=0,1,2,3
python train.py --config ./conf/train/pretrain.yaml
python train.py --config ./conf/train/kubric.yaml --weights ./outputs/RPEFlow_pretrain_gpu4xbs4/best.pt
python train.py --config ./conf/train/dsec.yaml --weights ./outputs/RPEFlow_pretrain_gpu4xbs4/best.pt
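
Because both fine-tuning runs start from the same pre-training checkpoint, it may be convenient to generate the three commands from one place. The sketch below only builds the argument lists shown above; the helper name `training_commands` is hypothetical, and launching the commands (for example with `subprocess.run`) is left to the reader.

```python
# Checkpoint path taken from the fine-tuning commands above.
PRETRAIN_CKPT = "./outputs/RPEFlow_pretrain_gpu4xbs4/best.pt"


def training_commands(pretrain_ckpt=PRETRAIN_CKPT):
    """Return the three training invocations, pre-training first."""
    base = ["python", "train.py", "--config"]
    commands = [base + ["./conf/train/pretrain.yaml"]]
    for dataset in ("kubric", "dsec"):
        commands.append(
            base + [f"./conf/train/{dataset}.yaml", "--weights", pretrain_ckpt]
        )
    return commands
```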

Citation

@InProceedings{Wan_RPEFlow_ICCV_2023,
  author    = {Wan, Zhexiong and Mao, Yuxin and Zhang, Jing and Dai, Yuchao},
  title     = {RPEFlow: Multimodal Fusion of RGB-PointCloud-Event for Joint Optical Flow and Scene Flow Estimation},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year      = {2023},
}

Acknowledgments

This research was sponsored by Zhejiang Lab.

We thank the ACs and the reviewers for their comments, which were very helpful in improving our paper.

Our project is based on CamLiFlow. We also thank the following helpful open-source projects: CamLiFlow, RAFT, RAFT-3D, kubric, esim_py, E-RAFT, DSEC, CFNet.
