[ICCV 2023] Minimum Latency Deep Online Video Stabilization (Paper)

Zhuofan Zhang¹,*, Zhen Liu²,*, Ping Tan³, Bing Zeng¹, Shuaicheng Liu¹,²,†

1. University of Electronic Science and Technology of China, 2. Megvii Research

3. The Hong Kong University of Science and Technology

*Equal contribution, †Corresponding author

Abstract

We present a novel camera path optimization framework for the task of online video stabilization. Typically, a stabilization pipeline consists of three steps: motion estimation, path smoothing, and novel view rendering. Most previous methods concentrate on motion estimation, proposing various global or local motion models. In contrast, path optimization has received relatively little attention, especially in the important online setting, where no future frames are available. In this work, we adopt recent off-the-shelf high-quality deep motion models for motion estimation to recover the camera trajectory, and we focus on the latter two steps. Our network takes a short 2D camera path in a sliding window as input and outputs the stabilizing warp field of the last frame in the window, which warps the incoming frame to its stabilized position. A hybrid loss is defined to enforce spatial and temporal consistency. In addition, we build a motion dataset that contains stable and unstable motion pairs for training. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art online methods both qualitatively and quantitatively and achieves performance comparable to offline methods.
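To make the online constraint concrete, the following is a minimal sketch, not the method from the paper. It assumes the camera motion is a single 2D translation per frame, and it replaces the network with a plain causal moving average over the sliding window. The names `accumulate_path` and `online_stabilize` and the `window` parameter are hypothetical. The only property it is meant to show is that the correction for frame t depends on frames up to t and never on later ones.

```python
def accumulate_path(motions):
    """Turn inter-frame 2D translations into an absolute camera path."""
    path = [(0.0, 0.0)]
    for dx, dy in motions:
        x, y = path[-1]
        path.append((x + dx, y + dy))
    return path


def online_stabilize(path, window=5):
    """Return one corrective offset per frame using only past positions.

    Stand-in for the network: a causal moving average over the last
    `window` path positions (an assumption made for illustration only).
    """
    offsets = []
    for t in range(len(path)):
        # Sliding window that ends at the current frame: no future frames.
        past = path[max(0, t - window + 1): t + 1]
        mean_x = sum(p[0] for p in past) / len(past)
        mean_y = sum(p[1] for p in past) / len(past)
        # Offset that moves the current position onto the smoothed path.
        offsets.append((mean_x - path[t][0], mean_y - path[t][1]))
    return offsets
```

Because each window ends at the current frame, appending new frames never changes the offsets already produced for earlier frames, which is what allows frame-by-frame output.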

Pipeline

MotionStab Dataset

The MotionStab dataset and the synthesized stable/unstable videos can be downloaded from Google Drive. The dataset is organized as follows:

MotionStab
|--Regular
|  |--0-fi.npy
|  |--0-bi.npy
|  |--0-unstab.mp4
|  |--0-stab.mp4
|  |--...
|--QuickRotation
|  |--0-fi.npy
|  |--0-bi.npy
|  |--0-unstab.mp4
|  |--0-stab.mp4
|  |--...
|--Crowd
|  |--0-fi.npy
|  |--0-bi.npy
|  |--0-unstab.mp4
|  |--0-stab.mp4
|  |--...
...

For each synthesized video, xx-fi.npy, xx-bi.npy, xx-unstab.mp4, and xx-stab.mp4 are the inter-frame motions, the ground truth warp fields, the unstable video, and the stable video, respectively.
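The layout above can be indexed with a few lines of Python. This is a minimal sketch that relies only on the file naming shown in the tree. The helper names `list_samples` and `load_arrays` are hypothetical, and the array shapes and dtypes inside the .npy files are not stated here, so the sketch does not assume any.

```python
from pathlib import Path


def list_samples(root):
    """Group the four files of every synthesized video under `root`."""
    samples = []
    # Each category folder (Regular, QuickRotation, Crowd, ...) holds
    # files named <index>-fi.npy, <index>-bi.npy, and so on.
    for fi_path in sorted(Path(root).glob("*/*-fi.npy")):
        index = fi_path.name[: -len("-fi.npy")]
        folder = fi_path.parent
        samples.append({
            "category": folder.name,
            "index": index,
            "motion": fi_path,                           # inter-frame motions
            "warp": folder / f"{index}-bi.npy",          # ground truth warp fields
            "unstable": folder / f"{index}-unstab.mp4",  # unstable video
            "stable": folder / f"{index}-stab.mp4",      # stable video
        })
    return samples


def load_arrays(sample):
    """Load the two .npy files of one sample with NumPy."""
    # Imported here so that listing files works without NumPy installed.
    import numpy as np
    return np.load(sample["motion"]), np.load(sample["warp"])
```

For example, `list_samples("MotionStab")` returns one dictionary per video, and `load_arrays(samples[0])` returns the motion and warp arrays for the first of them. Printing their `.shape` is a quick way to check the actual layout before writing a data loader.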

Usage

Requirements

Python 3.7.13
PyTorch 1.9.0
Torchvision 0.10.0
CUDA 10.2 on Ubuntu 18.04

Install the required dependencies:

conda create -n nndvs python=3.7
conda activate nndvs
pip install -r requirements.txt

Evaluation

Download the reorganized NUS dataset from Google Drive and place it in the ./data folder.
Conduct full evaluation by running:
```
bash eval_nus.sh
```

Citation

If you find this work helpful, please cite our paper:

@InProceedings{Zhang_2023_ICCV,
    author    = {Zhang, Zhuofan and Liu, Zhen and Tan, Ping and Zeng, Bing and Liu, Shuaicheng},
    title     = {Minimum Latency Deep Online Video Stabilization},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {23030-23039}
}

Contact

If you have any questions, feel free to contact Zhen Liu at liuzhen03@megvii.com.
