
Exploring Artistic Style Transfer 🎨 🖼️

PyTorch-based implementations (from scratch) of several distinct deep learning approaches [1-3] to a popular problem in computer vision called style transfer. Put simply, the task in style transfer is to generate an image that preserves the content of image x (i.e. semantics, shapes, edges, etc.) while matching the style of image y (i.e. textures, patterns, color, etc.). One may ask: what is the correct balance between content and style? As it turns out, the answer is more subjective than in typical optimization/ML problems - "beauty is in the eye of the beholder", as they say.
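The perceptual-loss-based methods [1-2] control this balance with two scalar weights. A minimal sketch (the helper name is hypothetical; the defaults mirror the CLI options documented below):

import torch

# Hypothetical helper illustrating the content/style trade-off. The balance is
# controlled entirely by two scalar weights; there is no single "correct" ratio.
def total_loss(content_loss: torch.Tensor, style_loss: torch.Tensor,
               content_weight: float = 1.0, style_weight: float = 1e6) -> torch.Tensor:
    return content_weight * content_loss + style_weight * style_loss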

Background

Colab Notebook

An interactive notebook can be accessed here.

Requirements

Installation

Clone the repo to install:

$ git clone https://github.com/kianzohoury/style_transfer.git  

and install the package along with its dependencies using pip:

$ pip install style_transfer/.

Usage

Method I: Slow optimization

Unlike the other methods, which require training a network, Gatys et al. [1] proposed optimizing directly on the image itself. In this manner, the pixels of the image are the parameters, and "training" involves updating pixel values rather than a neural network's weights. As seen below, the stylized images are visually pleasing and preserve content quite well, but are not efficient to generate (~75 seconds for 150 L-BFGS iterations on an NVIDIA V100 GPU).

Stylized images generated with (1) Starry Night by Van Gogh, (2) Girl with a Mandolin by Pablo Picasso, and (3) Sky and Water by MC Escher, using a content/style ratio of 1e-6.
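At its core, the method treats the image as the object being optimized. A minimal sketch of the idea, with a stand-in objective in place of the real VGG-based perceptual loss (names here are illustrative, not this repo's internals):

import torch
import torch.nn.functional as F

# Stand-in for the real objective: a weighted content + style loss computed
# from pretrained VGG feature maps (elided here for brevity).
def perceptual_loss(img: torch.Tensor) -> torch.Tensor:
    return F.mse_loss(img, torch.zeros_like(img))

content_img = torch.rand(1, 3, 256, 256)              # placeholder content image
generated = content_img.clone().requires_grad_(True)  # pixels are the parameters

# Defaults mirror the CLI options below: lr=1e-3, 10 L-BFGS iterations per step.
optimizer = torch.optim.LBFGS([generated], lr=1e-3, max_iter=10)

def closure():
    optimizer.zero_grad()
    loss = perceptual_loss(generated)
    loss.backward()
    return loss

for _ in range(50):  # --num-steps
    optimizer.step(closure)

Note that L-BFGS requires a closure that re-evaluates the loss, which is why the objective is wrapped in closure() rather than called inline.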

To run this method from the command line, cd into style_transfer/ and execute the following:

python -m stylize gatys --content-src <content path> --style-src <style path> 

Alternatively, the same can be achieved by calling stylize.run_gatys_optimization():

from stylize import run_gatys_optimization

stylized_img = run_gatys_optimization(
    content_src="examples/content/tuebingen_neckarfront.jpeg",
    style_src="examples/style/van_gogh_starry_night.jpeg",
    ...
)

Options

Some of the important options are described below:

--content-weight (float): Content loss weight. Default: 1.0.

--style-weight (float): Style loss weight. Default: 1e6.

--tv-weight (float): Total variation regularization weight. Default: 1e-6.

--lbfgs-iters (int): Max number of L-BFGS iterations per optimization step. Default: 10.

--num-steps (int): Number of optimization steps, resulting in a maximum of lbfgs_iters * num_steps total L-BFGS iterations. Default: 50.

--lr (float): Learning rate for L-BFGS optimizer. Default: 1e-3.

--init-noise (bool): Initializes generated image with noise. Default: False.

--save-fp (str, optional): Path to save generated image, using a valid format (e.g. jpg, tiff). Default: None.

--save-gif (bool): If True, saves a .gif version of the image saved under save_fp. Default: False.

Refer to the method signature of stylize.run_gatys_optimization() for the full list of options.
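For instance, a run that saves the result as a JPEG might look like the following (the example paths are those used in the Python snippet above; the output path is a placeholder):

python -m stylize gatys \
    --content-src examples/content/tuebingen_neckarfront.jpeg \
    --style-src examples/style/van_gogh_starry_night.jpeg \
    --style-weight 1e6 \
    --num-steps 50 \
    --save-fp stylized.jpg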

Feature Inversion

As an aside, feature inversion can be performed by optimizing only the style terms of the perceptual loss objective. Starting from a noise image, the Gram matrix G = FF^T of each style layer's activations is the signal that guides the inversion of the pretrained features. The resulting image resembles a texture. For more details, refer to Gatys et al.'s texture synthesis paper [4].

Reconstructed features from VGG16 activation layers (1) conv_1_2, (2) conv_2_2, (3) conv_3_3, and (4) conv_4_4. The images illustrate that deeper layers capture style more globally.
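The Gram matrix itself is cheap to compute from a layer's activations. A minimal sketch (the normalization shown is a common convention and may differ from this repo's implementation):

import torch

def gram_matrix(features: torch.Tensor) -> torch.Tensor:
    # features: (batch, channels, height, width) activations from a VGG layer.
    b, c, h, w = features.shape
    F = features.reshape(b, c, h * w)   # flatten the spatial dimensions
    G = F @ F.transpose(1, 2)           # G = F F^T, shape (batch, channels, channels)
    return G / (c * h * w)              # normalize by layer size

Because G records only correlations between channels, the spatial arrangement of the content is discarded, which is why optimizing against it alone yields a texture.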

Method II: Transformation networks

Johnson et al. [2] proposed a transformation network, which is significantly faster than Method I (~1000x for 256 x 256 images). However, the major drawback of this method is that a separate transformation network must be trained for each style.
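Once trained, stylization reduces to a single forward pass. A sketch of the inference step, with a toy network standing in for the real architecture (which uses downsampling, residual blocks, and upsampling):

import torch
import torch.nn as nn

# Toy stand-in for a trained image transformation network [2].
transformer = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=9, padding=4),
    nn.ReLU(),
    nn.Conv2d(32, 3, kernel_size=9, padding=4),
)

content_img = torch.rand(1, 3, 256, 256)  # placeholder content image
with torch.no_grad():
    stylized = transformer(content_img)   # one forward pass, no per-image optimization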

Method III: CycleGAN

References

[1] L. A. Gatys, A. S. Ecker, and M. Bethge, "Image Style Transfer Using Convolutional Neural Networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2414-2423. DOI

[2] J. Johnson, A. Alahi, and L. Fei-Fei, "Perceptual Losses for Real-Time Style Transfer and Super-Resolution," in Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 694-711. DOI

[3] J.-Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks," in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2223-2232. DOI

[4] L. A. Gatys, A. S. Ecker, and M. Bethge, "Texture Synthesis Using Convolutional Neural Networks," in Advances in Neural Information Processing Systems (NeurIPS), 2015.
