Fast Style Transfer in TensorFlow
Add styles from famous paintings to any photo in a fraction of a second! You can even style videos!
Tested with an NVIDIA Tesla V100 (Volta) GPU accelerator with 16 GB of memory. It takes 100 ms on a 2015 Titan X to style the MIT Stata Center (1024x680) like Udnie, by Francis Picabia.
This implementation is based on a combination of Gatys' A Neural Algorithm of Artistic Style, Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution, and Ulyanov's Instance Normalization.
Copyright (c) 2016 Logan Engstrom. Contact me for commercial use (or rather any use that is not academic research) (email: engstrom at my university's domain dot edu). Free for research use, as long as proper attribution is given and this copyright notice is retained.
Here I've transformed every frame in a video, then combined the results. The style here is Udnie, as above.
See how to generate these videos here!
We added styles from various paintings to a photo of Chicago. Click on thumbnails to see full applied style images.
Our implementation uses TensorFlow to train a fast style transfer network. We use roughly the same transformation network as described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, and the scaling/offset of the output tanh layer is slightly different. We use a loss function close to the one described in Gatys, using VGG19 instead of VGG16 and typically using "shallower" layers than in Johnson's implementation (e.g. we use relu1_1 rather than relu1_2). Empirically, this results in larger scale style features in transformations.
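For concreteness, here is a rough sketch of the two ideas above: instance normalization and a Gram-matrix style loss on VGG feature maps. It is written against TF 1.x-style calls rather than the exact TensorFlow 0.11 API this repository targets, and the function names are illustrative rather than the repository's own.

# Illustrative sketch only; not the repository's actual code.
import tensorflow as tf

def instance_norm(x, epsilon=1e-3):
    # x: [batch, height, width, channels]. Statistics are computed per image
    # and per channel (Ulyanov et al.), unlike batch normalization, which
    # pools statistics over the whole batch.
    mu, var = tf.nn.moments(x, axes=[1, 2], keep_dims=True)
    channels = int(x.get_shape()[-1])
    shift = tf.Variable(tf.zeros([channels]))
    scale = tf.Variable(tf.ones([channels]))
    return scale * (x - mu) / tf.sqrt(var + epsilon) + shift

def gram_matrix(features):
    # Channel-to-channel correlations of a VGG feature map; Gatys-style losses
    # compare these Gram matrices between the stylized output and the style image.
    b, h, w, c = [int(d) for d in features.get_shape()]
    flat = tf.reshape(features, [b, h * w, c])
    return tf.matmul(flat, flat, transpose_a=True) / float(h * w * c)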
Use style.py to train a new style transfer network. Run python style.py to view all the possible parameters. Training takes 4-6 hours on a Maxwell Titan X. More detailed documentation here. Before you run this, you should run setup.sh. Example usage:
python style.py --style path/to/style/img.jpg \
--checkpoint-dir checkpoint/path \
--test path/to/test/img.jpg \
--test-dir path/to/test/dir \
--content-weight 1.5e1 \
--checkpoint-iterations 1000 \
--batch-size 20
Use evaluate.py to evaluate a style transfer network. Run python evaluate.py to view all the possible parameters. Evaluation takes 100 ms per frame (when batch size is 1) on a Maxwell Titan X, and several seconds per frame on a CPU. More detailed documentation here. Models for evaluation are located here. Example usage:
python evaluate.py --checkpoint path/to/style/model.ckpt \
--in-path dir/of/test/imgs/ \
--out-path dir/for/results/
Use transform_video.py to transfer style into a video. Run python transform_video.py to view all the possible parameters. Requires ffmpeg. More detailed documentation here. Example usage:
python transform_video.py --in-path path/to/input/vid.mp4 \
--checkpoint path/to/style/model.ckpt \
--out-path out/video.mp4 \
--device /gpu:0 \
--batch-size 4
You will need the following to run the above:
- TensorFlow 0.11.0
- Python 2.7.9, Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2
- If you want to train (and don't want to wait for 4 months):
  - A decent GPU
  - All the required NVIDIA software to run TF on a GPU (CUDA, etc.)
- ffmpeg 3.1.3 if you want to stylize video
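One possible way to set up the Python dependencies is sketched below. It is illustrative only: TensorFlow 0.11 was distributed as platform-specific wheels rather than a plain PyPI package, so consult the TensorFlow 0.11 install instructions for the correct wheel URL for your system.

# Illustrative only; exact steps depend on your platform and CUDA setup.
pip install Pillow==3.4.2 scipy==0.18.1 numpy==1.11.2
pip install <tensorflow-0.11.0 wheel URL for your platform>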
@misc{engstrom2016faststyletransfer,
author = {Logan Engstrom},
title = {Fast Style Transfer},
year = {2016},
howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
note = {commit xxxxxxx}
}