UniST: Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers (ICCV 2023)

Authors: Bohai Gu, Heng Fan, Libo Zhang
This repository is the official PyTorch implementation of Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers.
The paper proposes a unified style transfer framework, dubbed UniST, for arbitrary image and video style transfer, in which the two tasks benefit from each other to improve performance.

Overview

The proposed network jointly leverages local and long-range dependencies. More specifically, UniST first applies CNNs to generate tokens, and then models long-range dependencies with a domain interaction transformer (DIT) to excavate domain-specific information. Afterwards, the DIT sequentially interacts the contextualized domain information for joint learning.
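
To make the data flow above more concrete, here is a minimal PyTorch sketch of the general pattern: a CNN stem produces tokens, self-attention models long-range dependencies inside each domain, and cross-attention lets the two domains exchange information. This is not the DIT implemented in ./models; the class names `ConvTokenizer` and `InteractionSketch`, the layer sizes, and the way clip frames are concatenated into one token sequence are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ConvTokenizer(nn.Module):
    """Hypothetical CNN stem: maps (B, 3, H, W) images to a token sequence."""

    def __init__(self, dim=64, stride=8):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=3, padding=1),             # local features
            nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, kernel_size=stride, stride=stride),  # downsample to tokens
        )

    def forward(self, x):
        feat = self.stem(x)                      # (B, dim, H/stride, W/stride)
        return feat.flatten(2).permute(2, 0, 1)  # (N, B, dim), sequence-first


class InteractionSketch(nn.Module):
    """Assumed pattern: self-attention per domain, then cross-attention between domains."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.img_self = nn.MultiheadAttention(dim, heads)
        self.vid_self = nn.MultiheadAttention(dim, heads)
        self.img_cross = nn.MultiheadAttention(dim, heads)
        self.vid_cross = nn.MultiheadAttention(dim, heads)

    def forward(self, img_tok, vid_tok):
        # Long-range dependencies within each domain (residual self-attention).
        img = img_tok + self.img_self(img_tok, img_tok, img_tok)[0]
        vid = vid_tok + self.vid_self(vid_tok, vid_tok, vid_tok)[0]
        # Each domain then queries the other domain's contextualized tokens.
        img_out = img + self.img_cross(img, vid, vid)[0]
        vid_out = vid + self.vid_cross(vid, img, img)[0]
        return img_out, vid_out


def run_demo():
    """Push one random image batch and one random clip batch through the sketch."""
    torch.manual_seed(0)
    tokenizer, block = ConvTokenizer(), InteractionSketch()
    image = torch.randn(2, 3, 64, 64)     # 2 images
    clip = torch.randn(2, 3, 3, 64, 64)   # 2 clips of 3 frames each: (B, T, 3, H, W)
    img_tok = tokenizer(image)            # (N, B, dim)
    b, t = clip.shape[:2]
    frame_tok = tokenizer(clip.flatten(0, 1))  # (N, B*T, dim)
    dim = frame_tok.shape[-1]
    # Regroup per clip and concatenate the frames along the sequence axis: (T*N, B, dim).
    vid_tok = frame_tok.reshape(-1, b, t, dim).permute(2, 0, 1, 3).reshape(-1, b, dim)
    return block(img_tok, vid_tok)
```

Calling `run_demo()` returns one token tensor per domain with the same shapes as the inputs to the interaction block, so a decoder could consume either of them. The sequence-first layout is used because it is the default for `nn.MultiheadAttention`.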

Results

Style transfer results at the finest granularity are provided for both images and videos.

Image Style Transfer

Video Style Transfer

Application

Besides arbitrary image and video style transfer, UniST also supports multi-granularity style transfer. The style resolutions are 1024x1024, 512x512, and 256x256, respectively.
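
As a rough illustration of the three style resolutions, the sketch below resizes one style image to each of them with PIL. It is not taken from the scripts in ./application: the helper name `style_pyramid` is hypothetical, and squashing the image to a square with bicubic resampling is an assumption about the preprocessing.

```python
from PIL import Image

# Resolutions quoted in the text above.
STYLE_RESOLUTIONS = (1024, 512, 256)


def style_pyramid(style, resolutions=STYLE_RESOLUTIONS):
    """Return {resolution: square RGB copy of `style` resized to that resolution}."""
    rgb = style.convert("RGB")
    # Assumption: the aspect ratio is not preserved; each copy is r x r pixels.
    return {r: rgb.resize((r, r), Image.BICUBIC) for r in resolutions}
```

Each entry of the returned dictionary can then be used as the style input for one granularity level.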

Compared with some state-of-the-art algorithms, our method has a strong ability to generate finest-granularity results with better feature representation. (Some of the SOTA methods do not support multi-granularity style transfer.)

Experiment

Requirements

python 3.6
pytorch 1.6.0
torchvision 0.4.2
PIL, einops, matplotlib
tqdm
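
A quick way to see which of these packages are importable in the current environment is sketched below. The helper names `installed_versions` and `format_report` are hypothetical and not part of this repository; the module names assume that pytorch is imported as `torch` and that PIL is provided by Pillow.

```python
import importlib

# Import names for the packages listed above.
REQUIRED_MODULES = ["torch", "torchvision", "PIL", "einops", "matplotlib", "tqdm"]


def installed_versions(module_names):
    """Map each module name to its version string, or None if it cannot be imported."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Not every package exposes __version__, so fall back to a placeholder.
        report[name] = getattr(module, "__version__", "unknown")
    return report


def format_report(report):
    """Render the report as aligned text, one line per module."""
    width = max(len(name) for name in report)
    return "\n".join(
        name.ljust(width) + "  " + (version or "NOT INSTALLED")
        for name, version in report.items()
    )
```

Running `print(format_report(installed_versions(REQUIRED_MODULES)))` lists each package with its version, which can then be compared by hand against the versions above.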

Testing

Please download the pretrained models and put them into the folder ./weight.

Please configure the parameters in ./option/test_options.py, and set the pretrained checkpoint in ./models/model.py.

For multi_granularity and single_modality, please refer to the scripts in ./application.

python scripts/inference.py

Training

Pretrained models: vgg_r41.pth, dec_r41.pth, vgg_r51.pth. Please download them and put them into the folder ./weight.
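
Before starting a training run, it can help to confirm that the three files are where the code expects them. The following is a small pre-flight sketch using only the standard library; the helper `missing_weights` is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names and folder are the ones listed above.
EXPECTED_WEIGHTS = ("vgg_r41.pth", "dec_r41.pth", "vgg_r51.pth")


def missing_weights(weight_dir="./weight", expected=EXPECTED_WEIGHTS):
    """Return the expected checkpoint files that are absent from weight_dir."""
    root = Path(weight_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing from ./weight: " + ", ".join(missing))
    else:
        print("All pretrained weights found.")
```

Run it from the repository root so that the relative path ./weight resolves correctly.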

The style dataset is WikiArt, collected from WIKIART.
The content dataset is COCO for images, and the MPI dataset or DAVIS for videos.

Please configure the parameters in ./option/train_options.py.

python scripts/train.py

BibTeX

If this repo is useful to you, please cite our technical paper.

@InProceedings{Gu_2023_ICCV,
    author    = {Gu, Bohai and Fan, Heng and Zhang, Libo},
    title     = {Two Birds, One Stone: A Unified Framework for Joint Learning of Image and Video Style Transfers},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2023},
    pages     = {23545-23554}
}

Acknowledgments

We would like to express our gratitude for the contributions of several previous works to the implementation of UniST. These include, but are not limited to, pixel2style2pixel and attention-is-all-you-need.
