MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer


This repo contains the PyTorch code for our AAAI 2019 paper.

MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer
Chi Zhang, Yixin Zhu, Song-Chun Zhu
To appear in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2019.

In this paper, we propose to combine neural style transfer with bilevel optimization to trade off speed, flexibility, and quality. In contrast to previous methods, our approach handles arbitrary artistic styles (flexibility) in real time (speed) while achieving image quality comparable to the impressive but slow iterative-optimization-based method of Gatys et al. (quality). We instantiate the model as an image transformation network and optimize it with Adam. The bilevel optimization encourages the model to first find a style-free representation of images (hence the name); after a very fast adaptation phase, we obtain a model tailored to a specific style. For further details, please refer to our paper.
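To give an intuition for the bilevel optimization, here is a minimal toy sketch of the idea on a scalar problem rather than an image transformation network (all names and numbers below are illustrative, not part of the actual implementation): a meta parameter w is trained so that a single inner gradient step adapts it well to any "style" s, analogous to the style-free representation.

```python
# Toy sketch of the bilevel (meta-learning) idea behind MetaStyle,
# on a scalar problem instead of an image transformation network.
# For each "style" s, the inner loss is L_s(w) = (w - s)^2; the meta
# parameter w is trained so that ONE inner gradient step adapts it
# well to any style.

def inner_grad(w, s):
    return 2.0 * (w - s)                   # dL_s/dw

def adapt(w, s, alpha):
    return w - alpha * inner_grad(w, s)    # one fast-adaptation step

def meta_grad(w, styles, alpha):
    # Exact gradient of the post-adaptation loss w.r.t. w:
    # d/dw (w' - s)^2 with w' = w - alpha*2*(w - s), so dw'/dw = 1 - 2*alpha.
    # Differentiating THROUGH the inner step is the "second-order" part.
    g = 0.0
    for s in styles:
        w_adapted = adapt(w, s, alpha)
        g += 2.0 * (w_adapted - s) * (1.0 - 2.0 * alpha)
    return g / len(styles)

styles = [-1.0, 0.0, 3.0]
w, alpha, meta_lr = 10.0, 0.1, 0.3
for _ in range(200):
    w -= meta_lr * meta_grad(w, styles, alpha)

# w converges toward the point from which one step reaches any style best
print(round(w, 3))  # → 0.667, the mean of the toy styles
```

In the paper the inner loss is a full style-transfer loss and w is the transformation network's weights, but the structure of the update is the same: an inner adaptation step nested inside an outer meta update.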



We show some samples below. The left column shows the content image and its style-free representation. The rest of the figure displays different stylized content images.


The following figure shows a comparison with other methods.


A video demo can be found here.



Prerequisites

  • PyTorch (>= 0.4.0)
  • CUDA and cuDNN

See requirements.txt for a full list of packages required.



Training

To train a model of your own:

  1. First download a content image dataset and a style image dataset. In this paper, we use MS-COCO and WikiArt.
  2. Run
python src/ train --content-dataset <path-to-your-content-dataset> --style-dataset <path-to-your-style-dataset> --cuda 1

Usually, the default parameters should work, but you are welcome to tune them yourself. The training process can be monitored using Tensorboard. Since bilevel optimization requires "second-order" gradient computation, training may take a long time depending on your GPU, and GPU memory consumption is substantial.
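The memory cost comes from differentiating through the inner adaptation step, which forces the framework to retain the inner computation graph. On a toy scalar problem (illustrative only, not the paper's network) one can see the extra term explicitly, and compare it with the common first-order approximation that simply drops it (a general meta-learning trick, not what this repo does):

```python
# Toy scalar illustration of the "second-order" term. The exact
# meta-gradient differentiates through the inner update
# w' = w - alpha * dL/dw, which introduces the factor dw'/dw; for a
# real network this means keeping the inner-step graph in GPU memory.

def exact_meta_grad(w, s, alpha):
    # inner loss L_s(w) = (w - s)^2, one adaptation step
    w_adapted = w - alpha * 2.0 * (w - s)
    return 2.0 * (w_adapted - s) * (1.0 - 2.0 * alpha)  # chain rule through w'

def first_order_meta_grad(w, s, alpha):
    # cheaper approximation: treat w_adapted as constant w.r.t. w
    w_adapted = w - alpha * 2.0 * (w - s)
    return 2.0 * (w_adapted - s)

g_exact = exact_meta_grad(1.0, 3.0, 0.1)
g_fo = first_order_meta_grad(1.0, 3.0, 0.1)
print(g_exact, g_fo)  # the exact gradient carries the extra (1 - 2*alpha) factor
```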

We provide our pre-trained model here.

Fast Training

To adapt the model to a new style, run

python src/ train --content-dataset <path-to-your-content-dataset> --style-image <path-to-your-style-image> --model <path-to-your-trained-model> --cuda 1

Usually, fast adaptation requires only 100 to 200 post-update steps and can be done in less than 30 seconds, depending on your GPU.


Testing

To stylize a content image, run

python src/ test --content-image <path-to-your-content-image> --output-image <path-to-your-output-image> --model <path-to-your-trained-model> --cuda 1


Citation

If you find the paper and the code helpful, please cite us:

@inproceedings{zhang2019metastyle,
    author={Zhang, Chi and Zhu, Yixin and Zhu, Song-Chun},
    title={MetaStyle: Three-Way Trade-Off Among Speed, Flexibility, and Quality in Neural Style Transfer},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence (AAAI)},
    year={2019}
}


Acknowledgements

This project would not have been possible without the help of my colleagues and the following open-source implementations.


License

MetaStyle is freely available for non-commercial use and may be redistributed under these conditions. Please see the LICENSE file for further details. For a commercial license, please contact the authors.