Group-based Bi-Directional Recurrent Wavelet Neural Network for Efficient Video Super-Resolution (VSR)

Young-Ju Choi, Young-Woon Lee, and Byung-Gyu Kim

Intelligent Vision Processing Lab. (IVPL), Sookmyung Women's University, Seoul, Republic of Korea

This repository is the official PyTorch implementation of the paper published in Pattern Recognition Letters (Elsevier).

Summary of paper

Abstract

Video super-resolution (VSR) is an important technology for enhancing the quality of video frames. The recurrent neural network (RNN)-based approach is suitable for sequential data because it can use accumulated temporal information. However, since existing methods tend to capture only slow and symmetrical motion at a low frame rate, they are still limited in restoring the missing details of more dynamic motion. Most previous methods that use spatial information treat different types of spatial features identically, which makes it hard to obtain meaningful information and enhance fine details. We propose a group-based bi-directional recurrent wavelet neural network (GBR-WNN) to exploit spatio-temporal information effectively. The proposed group-based bi-directional RNN (GBR) framework is built on a well-structured process with the group of pictures (GOP). In a GOP, we resolve the low-resolution (LR) frames from the border frames to the center target frame. Because super-resolved features in a GOP are cumulative, neighboring features are improved progressively and asymmetrical motion can be dealt with. We also propose a temporal wavelet attention (TWA) module that applies attention to both spatial and temporal features simultaneously, based on the discrete wavelet transform. Experiments show that the proposed scheme achieves superior performance compared with state-of-the-art methods.
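
The TWA module is described above as being based on the discrete wavelet transform. As an illustration only, the sketch below computes a single-level 2D Haar DWT in plain Python. The choice of the Haar wavelet, the function name `haar_dwt2`, and the subband naming are assumptions made for this example; the paper and the code in this repository may use a different wavelet or implementation.

```python
def haar_dwt2(img):
    """Single-level 2D orthonormal Haar DWT (illustrative sketch, not the repo code).

    img: 2D list of numbers with even height and width.
    Returns four half-resolution subbands (ll, lh, hl, hh).
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, w, 2):
            # 2x2 block laid out as:  a b
            #                         c d
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)  # low-pass: scaled block sum
            row_lh.append((a - b + c - d) / 2)  # difference between columns
            row_hl.append((a + b - c - d) / 2)  # difference between rows
            row_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


frame = [[1, 2], [3, 4]]
ll, lh, hl, hh = haar_dwt2(frame)
print(ll, lh, hl, hh)  # [[5.0]] [[-1.0]] [[-2.0]] [[0.0]]
```

The `ll` subband keeps the coarse structure and the other three hold directional detail. With this scaling the transform preserves energy (here 1 + 4 + 9 + 16 = 25 + 1 + 4 + 0).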

Network Architecture

Experimental Results

Getting Started

Dependencies and Installation

Anaconda3
Python == 3.6
```
conda create --name gbrwnn python=3.6
```

PyTorch (NVIDIA GPU + CUDA)

Trained with PyTorch 1.8.1 and CUDA 10.2

conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=10.2 -c pytorch

tqdm, pyyaml, tensorboard, opencv-python, lmdb

conda install -c conda-forge tqdm pyyaml tensorboard
pip install opencv-python
pip install lmdb
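
After installation, it can help to confirm that the packages listed above are visible in the active environment. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list uses import names (`opencv-python` is imported as `cv2`, `pyyaml` as `yaml`).

```python
import importlib.util

# Import names of the packages listed above.
REQUIRED_MODULES = ["torch", "torchvision", "tqdm", "yaml", "tensorboard", "cv2", "lmdb"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that the modules can be located; it does not verify versions or that CUDA is usable.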

Dataset Preparation

We used the Vimeo90K dataset for training and the Vid4, REDS4, SPMCS, and DAVIS-2019 datasets for testing.

Prepare for Vimeo90K
1. Please refer to Dataset.md in our Deep-Video-Super-Resolution repository for more details.
2. Download dataset from the official website.
3. Put the dataset in ./datasets/
4. Generate LR data
  
  Run in ./codes/data_processing_scripts/
```
python generate_LR_Vimeo90K.py
```
5. Generate LMDB
  
  Run in ./codes/data_processing_scripts/
```
python generate_lmdb_Vimeo90K.py
```
Prepare for Vid4
1. Please refer to Dataset.md in our Deep-Video-Super-Resolution repository for more details.
2. Download dataset from here.
3. Put the dataset in ./datasets/
4. Generate LR data
  
  Run in ./codes/data_processing_scripts/
```
python generate_LR_Vid4.py
```
Prepare for REDS4
1. Please refer to Dataset.md in our Deep-Video-Super-Resolution repository for more details.
2. Download dataset from the official website.
3. Put the dataset in ./datasets/
Prepare for SPMCS
1. Download dataset from here.
2. Put the dataset in ./datasets/
3. Generate LR data
  
  Run in ./codes/data_processing_scripts/
```
python generate_LR_SPMCS.py
```
Prepare for DAVIS-2019
1. Download dataset from the official website.
2. Put the dataset in ./datasets/
3. Generate LR data
  
  Run in ./codes/data_processing_scripts/
```
python generate_LR_DAVIS.py
```
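The preparation scripts named above can also be run in sequence from a small driver. The sketch below is a convenience wrapper and not part of this repository; the helper names `build_commands` and `run_all` are hypothetical, and it assumes the datasets are already downloaded into ./datasets/ and that it is called from the repository root. No script is listed above for REDS4, so none is included.

```python
import subprocess
import sys
from pathlib import Path

# Script names taken from the steps above, in the order the steps list them
# (for Vimeo90K, LR generation comes before LMDB generation).
PREP_SCRIPTS = [
    "generate_LR_Vimeo90K.py",
    "generate_lmdb_Vimeo90K.py",
    "generate_LR_Vid4.py",
    "generate_LR_SPMCS.py",
    "generate_LR_DAVIS.py",
]


def build_commands(scripts=PREP_SCRIPTS, python=sys.executable):
    """Return one command (argument list) per preparation script."""
    return [[python, name] for name in scripts]


def run_all(script_dir="./codes/data_processing_scripts"):
    """Run each script from inside script_dir, stopping at the first failure."""
    script_dir = Path(script_dir)
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run(cmd, cwd=str(script_dir), check=True)
```

Calling `run_all()` uses the current Python interpreter, so it should be run inside the `gbrwnn` conda environment.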

Model Zoo

Pre-trained models are available at the link below.

Training

Run in ./codes/

GBR-WNN-L

Using single GPU
```
python train.py -opt options/train/train_GBRWNN_L.yml
```
Using multiple GPUs (nproc_per_node is the number of GPUs), with CUDA_VISIBLE_DEVICES set in the .yml file

For example, set 'gpu_ids: [0,1,2,3,4,5,6,7]' in the .yml file for 8 GPUs
```
python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 train.py -opt options/train/train_GBRWNN_L.yml --launcher pytorch
```
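The relation between 'gpu_ids', CUDA_VISIBLE_DEVICES, and nproc_per_node can be sketched as follows. This assumes EDVR-style option handling, where the 'gpu_ids' list from the .yml file is exported as CUDA_VISIBLE_DEVICES; the exact code in this repository was not checked, and the helper name `apply_gpu_ids` is hypothetical.

```python
import os


def apply_gpu_ids(gpu_ids, environ=None):
    """Export gpu_ids as CUDA_VISIBLE_DEVICES and return the GPU count.

    Hypothetical helper: mirrors what an EDVR-style option parser is
    assumed to do with the 'gpu_ids' entry of the .yml file.
    """
    if environ is None:
        environ = os.environ
    if not gpu_ids:
        raise ValueError("gpu_ids must list at least one GPU")
    environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return len(gpu_ids)


env = {}  # use a plain dict here so the real environment is left untouched
n = apply_gpu_ids([0, 1, 2, 3, 4, 5, 6, 7], env)
print(env["CUDA_VISIBLE_DEVICES"], n)  # 0,1,2,3,4,5,6,7 8
```

The returned count is the value that should match `--nproc_per_node` in the launch command.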
GBR-WNN-M

Using single GPU
```
python train.py -opt options/train/train_GBRWNN_M.yml
```
Using multiple GPUs (nproc_per_node is the number of GPUs), with CUDA_VISIBLE_DEVICES set in the .yml file

For example, set 'gpu_ids: [0,1,2,3,4,5,6,7]' in the .yml file for 8 GPUs
```
python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 train.py -opt options/train/train_GBRWNN_M.yml --launcher pytorch
```
GBR-WNN-S

Using single GPU
```
python train.py -opt options/train/train_GBRWNN_S.yml
```
Using multiple GPUs (nproc_per_node is the number of GPUs), with CUDA_VISIBLE_DEVICES set in the .yml file

For example, set 'gpu_ids: [0,1,2,3,4,5,6,7]' in the .yml file for 8 GPUs
```
python -m torch.distributed.launch --nproc_per_node=8 --master_port=4321 train.py -opt options/train/train_GBRWNN_S.yml --launcher pytorch
```
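The three variants share the same command pattern and differ only in the option file. As a convenience, the commands above can be generated by a small helper; this is a sketch rather than part of the repository, and the function name `build_train_command` is hypothetical.

```python
VARIANTS = ("L", "M", "S")


def build_train_command(variant, num_gpus=1, master_port=4321):
    """Return the training command (argument list) for one GBR-WNN variant."""
    if variant not in VARIANTS:
        raise ValueError("variant must be one of {}".format(VARIANTS))
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    opt = "options/train/train_GBRWNN_{}.yml".format(variant)
    if num_gpus == 1:
        # Single-GPU form from the section above.
        return ["python", "train.py", "-opt", opt]
    # Multi-GPU form: one process per GPU through torch.distributed.launch.
    return [
        "python", "-m", "torch.distributed.launch",
        "--nproc_per_node={}".format(num_gpus),
        "--master_port={}".format(master_port),
        "train.py", "-opt", opt, "--launcher", "pytorch",
    ]


print(" ".join(build_train_command("M", num_gpus=8)))
```

The printed line reproduces the multi-GPU command for GBR-WNN-M shown above. The 'gpu_ids' entry in the matching .yml file still has to list the same number of GPUs.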

Testing

Run in ./codes/

python test.py

You can test the GBR-WNN-L, GBR-WNN-M, and GBR-WNN-S models on the Vid4, REDS4, SPMCS, and DAVIS-2019 test datasets by modifying 'model_mode' and 'data_mode' in the source code.
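
Since each run of test.py covers one model and one dataset, a full evaluation means going through every combination. The sketch below simply enumerates them as a checklist. The strings are the model and dataset names from this README and are not necessarily the literal values that 'model_mode' and 'data_mode' accept in test.py; the helper name `evaluation_runs` is hypothetical.

```python
from itertools import product

# Names as written in this README (assumed labels, not verified test.py values).
MODELS = ["GBR-WNN-L", "GBR-WNN-M", "GBR-WNN-S"]
DATASETS = ["Vid4", "REDS4", "SPMCS", "DAVIS-2019"]


def evaluation_runs(models=MODELS, datasets=DATASETS):
    """Return every (model, dataset) pair to evaluate."""
    return list(product(models, datasets))


runs = evaluation_runs()
print(len(runs))  # 12
for model, dataset in runs:
    print(model, "on", dataset)
```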

Citation

@article{choi2022group,
  title={Group-based bi-directional recurrent wavelet neural network for efficient video super-resolution (VSR)},
  author={Choi, Young-Ju and Lee, Young-Woon and Kim, Byung-Gyu},
  journal={Pattern Recognition Letters},
  volume={164},
  pages={246--253},
  year={2022},
  publisher={Elsevier}
}

Acknowledgement

The code is heavily based on EDVR and BasicSR. We thank the authors for their awesome work.

EDVR:
Wang, Xintao, et al. "EDVR: Video restoration with enhanced deformable convolutional networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2019.

BasicSR:
@misc{basicsr,
  author =       {Xintao Wang and Liangbin Xie and Ke Yu and Kelvin C.K. Chan and Chen Change Loy and Chao Dong},
  title =        {{BasicSR}: Open Source Image and Video Restoration Toolbox},
  howpublished = {\url{https://github.com/XPixelGroup/BasicSR}},
  year =         {2022}
}
