
SVCNet


Official PyTorch Implementation of the SVCNet Paper

Project | arXiv | IEEE Xplore

1 Introduction

SVCNet is an architecture for scribble-based video colorization, which includes two sub-networks: CPNet and SSNet. This repo contains training and evaluation code for the following paper:

SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation
Yuzhi Zhao¹, Lai-Man Po¹, Kangcheng Liu², Xuehui Wang³, Wing-Yin Yu¹, Pengfei Xian¹, Yujia Zhang⁴, Mengyang Liu⁴
¹City University of Hong Kong, ²Nanyang Technological University, ³Shanghai Jiao Tong University, ⁴Tencent Video
IEEE Transactions on Image Processing (TIP), 2023

[Figure: SVCNet pipeline]

2 Preparation

2.1 Environment

We tested the code on CUDA 10.0 (higher versions are also compatible). The basic requirements are as follows:

  • pytorch==1.2.0
  • torchvision==0.4.0
  • cupy-cuda100
  • opencv-python
  • scipy
  • scikit-image

If you use conda, you can create and activate the environment with:

conda env create -f environment.yaml
conda activate svcnet
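
After activating the environment, it can help to confirm that the packages listed above are actually importable. The snippet below is a minimal sketch and is not part of this repo; it assumes the usual import names for these packages (torch, torchvision, cupy, cv2, scipy, skimage) and only looks the modules up without importing them.

```python
import importlib.util

# Import names assumed for the requirements listed above
# (package name -> module name); adjust if your setup differs.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "cupy": "cupy",
    "opencv": "cv2",
    "scipy": "scipy",
    "scikit-image": "skimage",
}

def find_missing(modules=REQUIRED_MODULES):
    """Return the package names whose module cannot be located."""
    missing = []
    for package, module in modules.items():
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(module) is None:
            missing.append(package)
    return missing

if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

An empty result only means the modules were found; it does not verify the pinned versions or the CUDA setup.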

2.2 Pre-trained models

We provide the pre-trained SVCNet modules (CPNet and SSNet) and other public pre-trained models (PWCNet and VGG-16). By default, we put all of these files under the trained_models root directory.

All the pre-trained model files can be downloaded at this link.

Alternatively, you can download the following files if you only want to run inference:

2.3 Dataset

We use the ImageNet, DAVIS, and Videvo datasets for training. Please cite the original papers if you use these datasets. We release zip files that contain the images. By default, we put all of these files under the data root directory.

We generate saliency maps as pseudo segmentation labels for images in the ImageNet and Videvo datasets. Note that images in the DAVIS dataset already have segmentation labels. The saliency detection method is Pyramid Feature Attention Network for Saliency Detection. The generated saliency maps are also released.

All the ImageNet files can be downloaded at this link. All the DAVIS-Videvo files can be downloaded at this link. Alternatively, you can find each separate file below:

2.3.1 Training set of ImageNet (256x256 resolution, 1281167 files)

2.3.2 Validation set of ImageNet (256x256 resolution, 50000 files)

2.3.3 Training set of DAVIS-Videvo dataset (156 video clips)

2.3.4 Validation set of DAVIS-Videvo dataset (50 video clips)
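
The file counts given in the headings above can serve as a sanity check after unzipping. The following is a minimal sketch, not part of this repo; the directory path and the image extensions are assumptions, so adjust them to match where and how you extracted the files.

```python
import os

# Expected counts taken from the section titles above.
EXPECTED_IMAGENET_TRAIN = 1281167
EXPECTED_IMAGENET_VAL = 50000

def count_files(root, extensions=(".jpg", ".jpeg", ".png")):
    """Count files under root (recursively) whose extension matches."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += sum(1 for name in filenames if name.lower().endswith(extensions))
    return total

if __name__ == "__main__":
    # Hypothetical path: replace with wherever you unzipped the training images.
    train_root = os.path.join("data", "ImageNet", "train")
    if os.path.isdir(train_root):
        found = count_files(train_root)
        print(f"{found} files found, {EXPECTED_IMAGENET_TRAIN} expected")
    else:
        print(f"{train_root} not found; adjust the path")
```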

3 Arrangement

  • CPNet: includes scripts and code for training and validating CPNet

  • SSNet: includes scripts and code for training SSNet and validating SVCNet

  • Evaluation: includes code for evaluation (e.g., Tables II, IV, and V in the paper)

  • GCS: includes code for generating validation color scribbles

4 Fast inference

4.1 Demo

We include a legacy video segment along with its corresponding color scribble frames in 4 different styles. The input grayscale frames are also included. The code for generating these color scribbles is in the GCS sub-folder. You can reproduce the following results by running:

cd SSNet
python test.py

[GIFs: colorized demo results]

4.2 Test on user data

  • Create your own scribbles (see the GCS sub-folder). You first need to provide the color scribble for the first frame; you can then use the generate_color_scribbles_video.py script to obtain the scribbles for the subsequent frames based on the optical flows of your own grayscale video.

  • Run inference with your generated scribbles (see the SSNet sub-folder). Please follow the guide in the README file there, e.g., by running test.py.
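
To illustrate the idea of carrying a first-frame scribble forward with optical flow, here is a minimal, dependency-free sketch. It is not the implementation in generate_color_scribbles_video.py; the function name, the sparse point representation, and the flow layout (`flow[y][x] = (dx, dy)`) are assumptions made for this example.

```python
def propagate_scribble(points, flow, width, height):
    """Move sparse scribble points along a dense forward optical flow.

    points: list of (x, y, color) with integer pixel coordinates
    flow:   flow[y][x] = (dx, dy), displacement from this frame to the next
    Returns the points for the next frame; points leaving the frame are dropped.
    """
    moved = []
    for x, y, color in points:
        dx, dy = flow[y][x]
        nx, ny = int(round(x + dx)), int(round(y + dy))
        if 0 <= nx < width and 0 <= ny < height:
            moved.append((nx, ny, color))
    return moved

# Toy example: a 4x4 frame where everything moves one pixel to the right.
width, height = 4, 4
flow = [[(1.0, 0.0) for _ in range(width)] for _ in range(height)]
first_scribble = [(0, 1, (255, 0, 0)), (3, 2, (0, 0, 255))]
next_scribble = propagate_scribble(first_scribble, flow, width, height)
print(next_scribble)  # [(1, 1, (255, 0, 0))]
```

The second point is dropped because the flow pushes it outside the frame. Applying the function repeatedly, frame by frame, yields a scribble sequence for the whole clip.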

5 Visualization

A few video samples on the validation dataset are illustrated below:

[GIFs: colorized video samples from the validation dataset]

6 Acknowledgement

Some code is borrowed from the PyTorch-PFAN, SCGAN, VCGAN, PyTorch-PWC, and DEVC projects. Thanks to the authors for their awesome work.

7 Citation

If you find this work helpful, please consider citing:

@article{zhao2023svcnet,
  title={SVCNet: Scribble-based Video Colorization Network with Temporal Aggregation},
  author={Zhao, Yuzhi and Po, Lai-Man and Liu, Kangcheng and Wang, Xuehui and Yu, Wing-Yin and Xian, Pengfei and Zhang, Yujia and Liu, Mengyang},
  journal={IEEE Transactions on Image Processing},
  volume={32},
  pages={4443-4458},
  year={2023}
}
