# Neural Style Transfer 🎨

![](https://images.unsplash.com/photo-1461344577544-4e5dc9487184?ixlib=rb-1.2.1&ixid=eyJhcHBfaWQiOjEyMDd9&auto=format&fit=crop&w=1050&q=80)

Photo by [Alice Achterhof](https://unsplash.com/photos/FwF_fKj5tBo)


---

In this exercise, you'll get a chance to play around with **Neural Style Transfer algorithms**. This is more of a guided tutorial than a real exercise: all the code you need is provided, and you just have to run it yourself.

> Fair warning: even on Colab, the image generation **will take close to forever** to run (about 3 minutes per iteration, and we need around 100 iterations...).

# I. What is Neural Style Transfer ?

## I.1. Objective

These algorithms are specialized for style transfer between:
- a base image, called content image, from which we would like to keep the content
- a style image from which we would like to take the style and apply it to the content

<img src='../images/nst.png' width="600px" />

## I.2. Intuition

The principle of neural style transfer is to **define two distance functions**:
- one that describes **how different the content of two images is**, `Lcontent`,
- and one that describes the **difference between the two images in terms of their style**, `Lstyle`.

Then, given three images :
- a **desired style image** (S)
- a **desired content image** (C)
- and the **generated image** (G) (initialized with the content image)

we try to transform the generated image to **minimize its content distance to the content image and its style distance to the style image**.

# II. Loss computation

## II.1. Overall loss

The **overall loss** (or total loss) is given by :

<img src='../images/loss.png' width="600px" />

> 🔦 **Hint**: The coefficients associated with each type of loss are hyperparameters.
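
As a minimal sketch of this weighted sum, assuming the formula pictured above has the usual form of a content term plus a style term, each scaled by its own coefficient. The helper name `total_loss` and the sample loss values are hypothetical; the default coefficients mirror `content_weight` and `style_weight` from the Basic Settings further down.

```python
def total_loss(content_loss_value, style_loss_value,
               content_weight=0.025, style_weight=1.0):
    """Weighted sum of the two loss terms (hypothetical helper)."""
    return content_weight * content_loss_value + style_weight * style_loss_value


# Made-up loss values, only to show how the weights trade off the two terms
print(total_loss(content_loss_value=200.0, style_loss_value=3.0))
```

Raising `content_weight` relative to `style_weight` keeps the result closer to the content image, and lowering it lets the style dominate.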

During each iteration, all three images (the **content image**, the **style image** and the **generated image**) are passed through the VGG19 model.

The activations of the hidden units at certain layers, which encode a feature representation of each image, are then used as inputs to these loss functions.

<img src='../images/loss_inp.png' width="600px" />

## II.2. Content loss

The **content loss** is simply the L2 loss between the activations of the content image and those of the generated image at a chosen layer.

<img src='../images/loss_c.png' width="600px" />

> 🔦 **Hint**: We denote the activations of layer `L` for the content image as `a(L)(C)` and those for the generated image as `a(L)(G)`.
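
The content loss can be sketched in NumPy as follows. This is an illustration under assumptions: the formula pictured above may include a normalization constant, whereas this sketch follows the unnormalized sum of squares used by `content_loss` in the Keras code further down. The name `content_loss_np` and the toy activations are hypothetical.

```python
import numpy as np


def content_loss_np(a_content, a_generated):
    """Sum of squared differences between two activation volumes."""
    return float(np.sum((a_generated - a_content) ** 2))


# Toy activations of shape (height, width, channels); real ones come from VGG19
rng = np.random.default_rng(0)
a_C = rng.normal(size=(4, 4, 8))
a_G = a_C + 0.1  # shift every activation by 0.1

print(content_loss_np(a_C, a_C))  # identical activations give a loss of zero
print(content_loss_np(a_C, a_G))  # close to 4 * 4 * 8 * 0.1**2 = 1.28
```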

## II.3. Style loss

The **style loss** is more complex. It requires computing the Gram matrix of the activations, then the loss between the Gram matrices of the style image and the generated image at each layer, and finally the weighted sum of these per-layer style losses.
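
These three steps can be sketched in NumPy. This is a sketch under assumptions, not the exact formula of this exercise: the per-layer loss is normalized by the number of channels and spatial positions, which is a common convention, while the Keras code further down uses fixed constants instead. All function names and the toy activations are hypothetical.

```python
import numpy as np


def gram_matrix_np(activations):
    """Gram matrix of a (height, width, channels) activation volume."""
    h, w, c = activations.shape
    features = activations.reshape(h * w, c)  # one row per spatial position
    return features.T @ features              # shape (channels, channels)


def layer_style_loss_np(a_style, a_generated):
    """Normalized squared distance between the two Gram matrices of one layer."""
    h, w, c = a_style.shape
    diff = gram_matrix_np(a_style) - gram_matrix_np(a_generated)
    return float(np.sum(diff ** 2) / (4.0 * c ** 2 * (h * w) ** 2))


def style_loss_np(style_acts, generated_acts, layer_weights):
    """Weighted sum of the per-layer style losses."""
    return sum(w * layer_style_loss_np(s, g)
               for w, s, g in zip(layer_weights, style_acts, generated_acts))


rng = np.random.default_rng(0)
a_S = rng.normal(size=(4, 4, 8))
# Shuffling the spatial positions changes the layout but not the Gram matrix
shuffled = rng.permutation(a_S.reshape(16, 8)).reshape(4, 4, 8)

print(layer_style_loss_np(a_S, shuffled))             # numerically zero
print(style_loss_np([a_S], [a_S + 1.0], [1.0]) > 0)   # different statistics
```

The shuffled example illustrates why the Gram matrix is a reasonable proxy for style: it records which features occur together, not where they occur in the image.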

---

# III. Implementation in Keras 

In [None]:
!pip install tensorflow==2.0

In [2]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import load_img, save_img, img_to_array
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time
import matplotlib.pyplot as plt  # needed for plt.imshow / plt.show in the optimization loop

from tensorflow.keras.applications import vgg19
from tensorflow.keras import backend as K
tf.compat.v1.disable_eager_execution()


### Basic Settings

In [None]:
# Choose the content image, the style image and the result folder
base_image_path = './trump.jpg'   # You can choose any image of your preference
style_reference_image_path = './lisa.jpg'  # You can choose any image of your preference
result_prefix = 'drive/My Drive/Results_Iterations/'

# Pick the number of iterations
iterations=100

# Weights of the different loss components
content_weight=0.025
style_weight=1.0
total_variation_weight=1.0

# Dimensions of the generated picture
width, height = load_img(base_image_path).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)
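
The last two lines fix the height at 400 rows and scale the width to keep the aspect ratio. A small worked example of that arithmetic, where the helper name `scaled_width` and the 1050x700 image size are hypothetical:

```python
def scaled_width(width, height, target_rows=400):
    """Width that keeps the aspect ratio when the height becomes target_rows."""
    return int(width * target_rows / height)


# Hypothetical content image of 1050 x 700 pixels (width x height)
print(scaled_width(1050, 700))
```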

### Image pre-processing

In [None]:
# Open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
    # Load the image
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    # Convert to array
    img = img_to_array(img)
    # Expand dimensions
    img = np.expand_dims(img, axis=0)
    # Use the VGG19 input pre-processing 
    img = vgg19.preprocess_input(img)
    
    return img

# Convert a tensor into a valid image
def deprocess_image(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((3, img_nrows, img_ncols))
        x = x.transpose((1, 2, 0))
    else:
        x = x.reshape((img_nrows, img_ncols, 3))
    # Remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x
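
To see why `deprocess_image` adds these constants and reverses the channels, here is a NumPy-only sketch of the round trip. It assumes `vgg19.preprocess_input` applies its default behaviour (RGB to BGR, then subtraction of the ImageNet channel means, no scaling). The names `preprocess_np` and `deprocess_np` are hypothetical stand-ins, and the rounding step is an addition to absorb floating-point error.

```python
import numpy as np

# ImageNet channel means in BGR order, the same constants as in deprocess_image
BGR_MEAN = np.array([103.939, 116.779, 123.68])


def preprocess_np(rgb):
    """RGB image (rows, cols, 3) -> zero-centered BGR float array."""
    bgr = rgb[:, :, ::-1].astype('float64')
    return bgr - BGR_MEAN


def deprocess_np(x):
    """Inverse of preprocess_np: add the means back, BGR -> RGB, cast to uint8."""
    bgr = x + BGR_MEAN
    rgb = bgr[:, :, ::-1]
    # Rounding (not present in deprocess_image) absorbs floating-point error
    return np.clip(np.rint(rgb), 0, 255).astype('uint8')


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(2, 3, 3), dtype=np.uint8)  # tiny random RGB image
restored = deprocess_np(preprocess_np(image))
print(np.array_equal(image, restored))
```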

In [None]:
# Get tensor representations of our images
base_image = K.variable(preprocess_image(base_image_path))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))

# This will contain our generated image
if K.image_data_format() == 'channels_first':
    combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
    combination_image = K.placeholder((1, img_nrows, img_ncols, 3))

# Combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([base_image, style_reference_image, combination_image], axis=0)

### Build the model

In [None]:
# Build the VGG19 network with our 3 images as input
# The model will be loaded with pre-trained ImageNet weights
model = vgg19.VGG19(input_tensor=input_tensor, weights='imagenet', include_top=False)
print('Model loaded.')

# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

### Loss functions

In [None]:
# Compute the neural style loss
# First we need to define 4 util functions


# The gram matrix of an image tensor (feature-wise outer product)
def gram_matrix(x):
    assert K.ndim(x) == 3
    if K.image_data_format() == 'channels_first':
        features = K.batch_flatten(x)
    else:
        features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram

  
# The "style loss" is designed to maintain the style of the reference image in the generated image.
# It is based on the gram matrices (which capture style) of feature maps from the style reference image
# and from the generated image

def style_loss(style, combination):
    assert K.ndim(style) == 3
    assert K.ndim(combination) == 3
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_nrows * img_ncols
    return K.sum(K.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))

  
# An auxiliary loss function designed to maintain the "content" of the
# base image in the generated image

def content_loss(base, combination):
    return K.sum(K.square(combination - base))

  
# The 3rd loss function, total variation loss, designed to keep the generated image locally coherent

def total_variation_loss(x):
    assert K.ndim(x) == 4
    if K.image_data_format() == 'channels_first':
        a = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, 1:, :img_ncols - 1])
        b = K.square(x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, :img_nrows - 1, 1:])
    else:
        a = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, 1:, :img_ncols - 1, :])
        b = K.square(x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, :img_nrows - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
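
The total variation loss penalizes differences between neighbouring pixels. A NumPy sketch of the same computation on a single channels-last image, with the hypothetical name `total_variation_np` and toy inputs:

```python
import numpy as np


def total_variation_np(img):
    """Total variation of a (rows, cols, channels) image, exponent 1.25 as above."""
    a = (img[:-1, :-1, :] - img[1:, :-1, :]) ** 2  # vertical neighbour differences
    b = (img[:-1, :-1, :] - img[:-1, 1:, :]) ** 2  # horizontal neighbour differences
    return float(np.sum((a + b) ** 1.25))


flat = np.full((8, 8, 3), 0.5)                      # constant image
noisy = np.random.default_rng(0).random((8, 8, 3))  # uniform noise

print(total_variation_np(flat))                              # 0.0, no variation
print(total_variation_np(noisy) > total_variation_np(flat))  # noise is penalized
```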

In [None]:
# combine these loss functions into a single scalar
loss = K.variable(0.0)
layer_features = outputs_dict['block5_conv2']
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss = loss + (content_weight * content_loss(base_image_features, combination_features))

feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']

for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features, combination_features)
    loss = loss + ((style_weight / len(feature_layers)) * sl)

loss = loss + (total_variation_weight * total_variation_loss(combination_image))

In [None]:
# Get the gradients of the loss with respect to the generated image
grads = K.gradients(loss, combination_image)

outputs = [loss]
if isinstance(grads, (list, tuple)):
    outputs += grads
else:
    outputs.append(grads)

f_outputs = K.function([combination_image], outputs)


# Evaluate loss and gradients
def eval_loss_and_grads(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((1, 3, img_nrows, img_ncols))
    else:
        x = x.reshape((1, img_nrows, img_ncols, 3))
    outs = f_outputs([x])
    loss_value = outs[0]
    if len(outs[1:]) == 1:
        grad_values = outs[1].flatten().astype('float64')
    else:
        grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values

In [None]:
# This Evaluator class makes it possible to compute loss and gradients in one pass
# while retrieving them via two separate functions, "loss" and "grads". 
# This is done because scipy.optimize requires separate functions for loss and gradients,
# but computing them separately would be inefficient.

class Evaluator(object):

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
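
The same caching pattern can be tried on a toy problem that runs in well under a second. This is a sketch under assumptions: `target`, `toy_loss_and_grads` and `ToyEvaluator` are hypothetical names, and the assertions of `Evaluator` are replaced by a fallback so the example does not depend on the exact order in which SciPy calls the two functions.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

target = np.array([1.0, -2.0, 3.0])  # hypothetical optimum of the toy problem


def toy_loss_and_grads(x):
    """Loss and gradient of sum((x - target)**2), computed in one pass."""
    diff = x - target
    return float(np.sum(diff ** 2)), 2.0 * diff


class ToyEvaluator:
    """Caches the gradient computed alongside the loss, like Evaluator above."""

    def __init__(self):
        self.grad_values = None

    def loss(self, x):
        loss_value, self.grad_values = toy_loss_and_grads(x)
        return loss_value

    def grads(self, x):
        if self.grad_values is None:
            # Fall back to a fresh evaluation if loss() was not called first
            _, self.grad_values = toy_loss_and_grads(x)
        grad_values = np.copy(self.grad_values)
        self.grad_values = None
        return grad_values


toy = ToyEvaluator()
x_opt, min_val, info = fmin_l_bfgs_b(toy.loss, np.zeros(3),
                                     fprime=toy.grads, maxfun=20)
print(x_opt, min_val)
```

In the style transfer loop, `x` plays the role of the flattened generated image and the loss is the combined style transfer loss instead of this quadratic.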

### Run the model

In [None]:
# Run scipy-based optimization (L-BFGS) over the pixels of the generated image
# so as to minimize the neural style loss
x = preprocess_image(base_image_path)

for i in range(iterations):
    start_time = time.time()
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=20)
    # Convert the current result back into a displayable image
    img = deprocess_image(x.copy())
    # Display and save the generated image every 10 iterations
    if i % 10 == 0:
        print('Start of iteration', i)
        plt.imshow(img)
        plt.show()
        fname = result_prefix + '_at_iteration_%d.png' % i
        save_img(fname, img)
        end_time = time.time()
        print('Current loss value:', min_val)
        print('Image saved as', fname)
        print('Iteration %d completed in %ds' % (i, end_time - start_time))

---

# Exercise 🎓

It is now your turn!

**Q1**. Use the two images given in the image folder of this exercise (`lisa.jpg` and `trump.jpg`) to apply style transfer from Lisa to Trump, using the code given above. Alternatively, you can start directly with your own images.

> 🔦**Hint**: It is strongly recommended to do this exercise in Colab using GPUs.

**Q2**. Once this is done, use any content and style image you'd like, and try your own style transfer !