# Neural style transfer

Neural style transfer is a technique based on deep convolutional neural networks that applies the style of one image to the content of another. Among other things, this allows one to take one's favorite pictures and reimagine them as creations of one's favorite painter, or, for example, to imagine what the Eiffel Tower would look like if it were covered in leaves.

This notebook makes the implementation of neural style transfer simple. By changing the content and style image paths in [Section 1](#1) one can quickly set up a new case with custom images and see the progress in the output of [Section 4](#4).

This notebook is based on notes from the [Coursera](https://www.coursera.org/) deep learning specialization and uses the VGG-19 network. 

The notebook is organized as follows:

- [0. Imports](#0)
- [1. Load content and style images](#1)
- [2. Load pre-trained CNN](#2)
- [3. Build the model](#3)
- [4. Train the model](#4)

<a name='0'></a>
## 0. Imports
This notebook requires `scipy`, `numpy`, `matplotlib`, and `tensorflow`. Some supporting functions are defined in the provided module `utils`.

In [None]:
import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
import numpy as np
import tensorflow as tf
from tensorflow.python.framework.ops import EagerTensor
from PIL import Image  # Image.open is used in Section 1
from utils import *
%matplotlib inline

<a name='1'></a>
## 1. Load content and style images

The loaded content and style images are made square (1:1 aspect ratio) using `make_square` and then resized to `img_size` × `img_size` pixels.

In [None]:
img_size = 400

content_image = np.array(make_square(Image.open("images/escher_portrait.jpg")).resize((img_size, img_size)))
content_image = tf.constant(np.reshape(content_image, ((1,) + content_image.shape)))

style_image =  np.array(make_square(Image.open("images/escher.jpg")).resize((img_size, img_size)))
style_image = tf.constant(np.reshape(style_image, ((1,) + style_image.shape)))

ax1 = plt.subplot(1,2,1)
imshow(content_image[0])
ax1.title.set_text('Content image')

ax2 = plt.subplot(1,2,2)
imshow(style_image[0])
ax2.title.set_text('Style image')
plt.show()

<a name='2'></a>
## 2. Load pre-trained CNN
The VGG-19 model and weights are not included in the GitHub repository; they need to be downloaded first and saved in the `pretrained-model` folder. [Download VGG-19 weights](https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5).


In [None]:
tf.random.set_seed(10)
vgg = tf.keras.applications.VGG19(include_top=False,
                                  input_shape=(img_size, img_size, 3),
                                  weights='pretrained-model/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5')

vgg.trainable = False

The layers of the loaded VGG-19 model are printed below:

In [None]:
for layer in vgg.layers:
    print(layer.name)

<a name='3'></a>
## 3. Build the model

#### 3.1 Define style cost weights
Choose the layers of the VGG-19 network that are used to compute the style cost. Each tuple contains the name of a layer and the weight $\lambda^{[l]}$ with which that layer enters the total style cost.

In [None]:
STYLE_LAYERS = [
    ('block1_conv1', 0.05),
    ('block2_conv1', .1),
    ('block3_conv1', 0.15),
    ('block4_conv1', 0.3),
    ('block5_conv1', 0.3),
]

#### 3.2 Initialize generated image
Add uniform random noise to the content image to initialize the generated image.

In [None]:
generated_image = tf.Variable(tf.image.convert_image_dtype(content_image, tf.float32))
noise = tf.random.uniform(tf.shape(generated_image), -0.25, 0.25)
generated_image = tf.add(generated_image, noise)
generated_image = tf.clip_by_value(generated_image, clip_value_min=0.0, clip_value_max=1.0)

imshow(generated_image.numpy()[0])
plt.show()

#### 3.3 Compute content and style layer outputs

First, define the content layer and add it to the model outputs.

In [None]:
content_layer = [('block5_conv4', 1)]

vgg_model_outputs = get_layer_outputs(vgg, STYLE_LAYERS + content_layer)

Compute the hidden layer activations for the content and style images and store them in `a_C` and `a_S`.

In [None]:
preprocessed_content =  tf.Variable(tf.image.convert_image_dtype(content_image, tf.float32))
a_C = vgg_model_outputs(preprocessed_content)

preprocessed_style =  tf.Variable(tf.image.convert_image_dtype(style_image, tf.float32))
a_S = vgg_model_outputs(preprocessed_style)

#### 3.4 Define the neural style transfer model

Given the activations in the content layer for the content image $a^{(C)}$ and the generated image $a^{(G)}$, the content cost is defined as


$$J_{content}(C,G) =  \frac{1}{4 \times n_H \times n_W \times n_C}\sum _{ \text{all entries}} (a^{(C)} - a^{(G)})^2\tag{1} $$

where $n_H$, $n_W$, and $n_C$ are the height, width, and number of channels of the chosen content layer.
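As a rough illustration of Eq. (1), the content cost can be sketched in plain NumPy. The function name `content_cost_np` and the random activations are assumptions made for this example only; the notebook itself calls `compute_content_cost` from `utils.py`, whose implementation may differ.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """Content cost of Eq. (1) for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_C.shape
    return np.sum((a_C - a_G) ** 2) / (4.0 * n_H * n_W * n_C)

# Hypothetical activations standing in for the content-layer outputs
rng = np.random.default_rng(0)
a_C_example = rng.normal(size=(1, 4, 4, 3))
a_G_example = a_C_example + 0.5  # every entry differs by 0.5

content_cost_example = content_cost_np(a_C_example, a_G_example)
print(content_cost_example)
```

Because every entry differs by 0.5, each squared difference is 0.25 and the normalization by $4 \times n_H \times n_W \times n_C$ gives a cost of about 0.0625, independent of the layer size.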

For the style cost, one needs to compute the Gram matrices $G^{(S)}_{(gram)}=a^{(S)} (a^{(S)})^T$ and $G^{(G)}_{(gram)}=a^{(G)} (a^{(G)})^T$ of the style and generated images, respectively, where the activations of a layer are unrolled into a matrix of shape $n_C \times (n_H \times n_W)$. The style cost for layer $l$ can then be defined as

$$J_{style}^{[l]}(S,G) = \frac{1}{4 \times {n_C}^2 \times (n_H \times n_W)^2} \sum _{i=1}^{n_C}\sum_{j=1}^{n_C}(G^{(S)}_{(gram)i,j} - G^{(G)}_{(gram)i,j})^2\tag{2} $$
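A minimal NumPy sketch of the Gram matrix and of Eq. (2) is given here, assuming activations of shape `(1, n_H, n_W, n_C)` that are unrolled so that each row holds one channel. The names `gram_matrix_np` and `layer_style_cost_np` are hypothetical and are not the helpers from `utils.py`.

```python
import numpy as np

def gram_matrix_np(a):
    """Gram matrix (n_C x n_C) of activations with shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a.shape
    # Unroll to shape (n_C, n_H * n_W): one row per channel
    unrolled = a.reshape(n_H * n_W, n_C).T
    return unrolled @ unrolled.T

def layer_style_cost_np(a_S, a_G):
    """Style cost of Eq. (2) for a single layer."""
    _, n_H, n_W, n_C = a_S.shape
    gram_S = gram_matrix_np(a_S)
    gram_G = gram_matrix_np(a_G)
    return np.sum((gram_S - gram_G) ** 2) / (4.0 * n_C**2 * (n_H * n_W) ** 2)

# Hypothetical activations standing in for one style layer
rng = np.random.default_rng(1)
a_S_example = rng.normal(size=(1, 4, 4, 3))
a_G_example = rng.normal(size=(1, 4, 4, 3))

gram_example = gram_matrix_np(a_S_example)
style_cost_example = layer_style_cost_np(a_S_example, a_G_example)
print(gram_example.shape, style_cost_example)
```

Entry $(i, j)$ of the Gram matrix is the inner product of channels $i$ and $j$ over all spatial positions, so the matrix is symmetric and its size depends only on the number of channels.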

The total style cost is the sum over all style layers, with each layer's style cost weighted by the corresponding weight defined in `STYLE_LAYERS`:

$$J_{style}(S,G) = \sum_{l} \lambda^{[l]} J^{[l]}_{style}(S,G)\tag{3}$$

Finally, the content and style costs are weighted by the parameters $\alpha$ and $\beta$ and added to give the total cost:


$$J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)\tag{4}$$
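To make the weighting in Eqs. (3) and (4) concrete, here is a small worked example in plain Python. The per-layer style costs and the content cost are made-up numbers chosen only for illustration, and the function names `weighted_style_cost` and `total_cost_sketch` are hypothetical; the layer weights and the values $\alpha = 5$, $\beta = 100$ are the ones used in this notebook.

```python
def weighted_style_cost(layer_costs, layer_weights):
    """Eq. (3): weighted sum of the per-layer style costs."""
    return sum(w * c for w, c in zip(layer_weights, layer_costs))

def total_cost_sketch(J_content, J_style, alpha, beta):
    """Eq. (4): weighted sum of the content and style costs."""
    return alpha * J_content + beta * J_style

layer_weights = [0.05, 0.1, 0.15, 0.3, 0.3]  # weights from STYLE_LAYERS
layer_costs = [1.0, 2.0, 3.0, 4.0, 5.0]      # made-up per-layer style costs
J_content_example = 0.2                      # made-up content cost

J_style_example = weighted_style_cost(layer_costs, layer_weights)
J_example = total_cost_sketch(J_content_example, J_style_example, alpha=5, beta=100)
print(J_style_example, J_example)
```

With these numbers the style cost is about 3.4 and the total cost about 341, which shows how strongly the choice of $\beta$ relative to $\alpha$ lets the style term dominate.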

Most of these functions are implemented in `utils.py`. What remains is to define the training step in `tensorflow`:

In [None]:
optimizer = tf.keras.optimizers.Adam(0.01)

alpha = 5
beta = 100

@tf.function()  # Use decorator to make train_step a tensorflow Function
def train_step(generated_image):
    with tf.GradientTape() as tape:
        a_G = vgg_model_outputs(generated_image)
        J_style = compute_style_cost(a_S, a_G, STYLE_LAYERS)
        J_content = compute_content_cost(a_C, a_G)
        J = total_cost(J_content, J_style, alpha, beta)

    # Compute the gradient of the total cost with respect to the generated image
    grad = tape.gradient(J, generated_image)

    # Update the pixels of generated_image using the computed gradient
    optimizer.apply_gradients([(grad, generated_image)])

    # Clip the pixel values with clip_0_1 and assign them back to the tf.Variable
    generated_image.assign(clip_0_1(generated_image))


<a name='4'></a>
## 4. Train the model

All that is left to do is run the model to train the generated image. Training can take quite long: with a learning rate of 0.001, around 10000 epochs need to be run. For quicker results, increase the learning rate (the optimizer defined above uses 0.01) and run for fewer epochs.
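The key idea of this step is that the pixels of the generated image are themselves the trainable parameters. The toy sketch below illustrates this with plain gradient descent in NumPy on a simple quadratic cost; it is not the Adam optimizer or the style transfer cost used in this notebook, and the names `toy_cost` and `toy_train_step` are hypothetical.

```python
import numpy as np

def toy_cost(image, target):
    """Toy quadratic cost standing in for the total cost J(G)."""
    return 0.5 * np.sum((image - target) ** 2)

def toy_train_step(image, target, learning_rate):
    """One plain gradient-descent update on the pixels, followed by clipping."""
    grad = image - target                 # gradient of toy_cost w.r.t. the pixels
    image = image - learning_rate * grad  # move the pixels down the gradient
    return np.clip(image, 0.0, 1.0)       # keep pixel values in [0, 1]

rng = np.random.default_rng(0)
target = rng.uniform(0.0, 1.0, size=(8, 8, 3))
noise = rng.uniform(-0.25, 0.25, size=target.shape)
image = np.clip(target + noise, 0.0, 1.0)  # noisy starting image

initial_cost = toy_cost(image, target)
for _ in range(100):
    image = toy_train_step(image, target, learning_rate=0.1)
final_cost = toy_cost(image, target)
print(initial_cost, final_cost)
```

The clipping after each update plays the same role as `clip_0_1` in `train_step`: it keeps the updated pixels within the valid image range.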

In [None]:
generated_image = tf.Variable(generated_image)

# Make sure the output folder exists before saving images
os.makedirs("output", exist_ok=True)

# Show and save the generated image every 20 epochs
epochs = 1001
for i in range(epochs):
    train_step(generated_image)
    if i % 20 == 0:
        print(f"Epoch {i}")
        image = tensor_to_image(generated_image)
        imshow(image)
        image.save(f"output/image_{i}.jpg")
        plt.show()