# Neural Style Transfer

This is an implementation of Neural Style Transfer (NST), first described in 2015 by Gatys et al.<sup>[[1]](#References)</sup>. The mathematical descriptions are informed by Andrew Ng's lectures on YouTube<sup>[[2, 3]](#References)</sup>, and the implementation borrows heavily from the official Keras<sup>[[4]](#References)</sup> NST example. The total variation loss and an alternative optimisation method draw on the TensorFlow documentation<sup>[[5, 6]](#References)</sup>. A natural extension is to perform NST in real time, as described in 2016 by Johnson et al.<sup>[[7]](#References)</sup>.


<hr>

In [1]:
from __future__ import print_function
from keras.preprocessing.image import load_img, save_img, img_to_array
import numpy as np
from scipy.optimize import fmin_l_bfgs_b
import time

from keras.applications import vgg16
from keras import backend as K

import tensorflow as tf

Using TensorFlow backend.


In [2]:
content_image_path = "./content_image_1.jpg"
style_image_path = "./style_image_1.jpg"
iterations = 20

# these are the weights of the different loss components
total_variation_weight = 1.0 # 1.0
style_weight = 1.0 # 1.0
content_weight = 0.025 # 0.025

# dimensions of the generated picture.
img_nrows = 224
img_ncols = 224

In [3]:
# util function to open, resize and format pictures into appropriate tensors
def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg16.preprocess_input(img)
    return img

# util function to convert a tensor into a valid image
def deprocess_image(x):
    x = x.reshape((img_nrows, img_ncols, 3))
    # Remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x
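
As a quick sanity check, the round trip between the two helpers can be reproduced with NumPy alone. This is only a sketch: it assumes VGG16's default preprocessing amounts to an RGB to BGR swap followed by subtracting the channel means that `deprocess_image` adds back, and `preprocess_like_vgg16` is a hypothetical stand-in for `vgg16.preprocess_input`, not the Keras function itself. It also rounds before casting, which the helper above does not do.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by deprocess_image above
BGR_MEANS = np.array([103.939, 116.779, 123.68])

def preprocess_like_vgg16(rgb):
    """Hypothetical stand-in: RGB -> BGR, then subtract the channel means."""
    bgr = rgb[..., ::-1].astype("float64")
    return bgr - BGR_MEANS

def deprocess(x):
    """Undo the steps above: add the means back, BGR -> RGB, cast to uint8."""
    rgb = (x + BGR_MEANS)[..., ::-1]
    # Round first so floating-point error cannot truncate e.g. 254.9999 to 254
    return np.clip(np.rint(rgb), 0, 255).astype("uint8")

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # tiny fake RGB image
restored = deprocess(preprocess_like_vgg16(image))
round_trip_ok = np.array_equal(image, restored)
print("Round trip recovers the image:", round_trip_ok)
```

If the round trip did not recover the original pixels, the saved images would show shifted colours or swapped channels.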

To improve computational efficiency, we store the content, style and generated images as tensors and concatenate them into a single batch that is fed to the network in one pass.

In [4]:
# get tensor representations of our images
content_image = K.variable(preprocess_image(content_image_path), name="Content")
style_image = K.variable(preprocess_image(style_image_path), name="Style")

# this will contain our generated image
generated_image = K.placeholder((1, img_nrows, img_ncols, 3), name="Generated")

# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([content_image,
                              style_image,
                              generated_image], axis=0)

# Create index dict
index = {"C" : 0, "S" : 1, "G" : 2}



In [5]:
# build the VGG16 network with our 3 images as input
# the model will be loaded with pre-trained ImageNet weights
model = vgg16.VGG16(input_tensor=input_tensor,
                    weights='imagenet', include_top=False)
print('Model loaded.')

Model loaded.


In [6]:
# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

In [7]:
outputs_dict

{'input_1': <tf.Tensor 'concat:0' shape=(3, 224, 224, 3) dtype=float32>,
 'block1_conv1': <tf.Tensor 'block1_conv1/Relu:0' shape=(3, 224, 224, 64) dtype=float32>,
 'block1_conv2': <tf.Tensor 'block1_conv2/Relu:0' shape=(3, 224, 224, 64) dtype=float32>,
 'block1_pool': <tf.Tensor 'block1_pool/MaxPool:0' shape=(3, 112, 112, 64) dtype=float32>,
 'block2_conv1': <tf.Tensor 'block2_conv1/Relu:0' shape=(3, 112, 112, 128) dtype=float32>,
 'block2_conv2': <tf.Tensor 'block2_conv2/Relu:0' shape=(3, 112, 112, 128) dtype=float32>,
 'block2_pool': <tf.Tensor 'block2_pool/MaxPool:0' shape=(3, 56, 56, 128) dtype=float32>,
 'block3_conv1': <tf.Tensor 'block3_conv1/Relu:0' shape=(3, 56, 56, 256) dtype=float32>,
 'block3_conv2': <tf.Tensor 'block3_conv2/Relu:0' shape=(3, 56, 56, 256) dtype=float32>,
 'block3_conv3': <tf.Tensor 'block3_conv3/Relu:0' shape=(3, 56, 56, 256) dtype=float32>,
 'block3_pool': <tf.Tensor 'block3_pool/MaxPool:0' shape=(3, 28, 28, 256) dtype=float32>,
 'block4_conv1': <tf.Tensor

### Compute the Content Loss

The content loss <sup>[[2](#References)]</sup> is designed to maintain the "content" of the **Content Image** $(C)$ in the **Generated Image** $(G)$. Using the pre-trained network, we can get the activations of both images from different layers, i.e.

$$
a^{[l](C)} \text{ and } a^{[l](G)}
$$

If these activations are similar, both images have similar content. To compare them we simply take the sum of the squared element-wise difference between the activations of our layers of interest, and optionally use a normalisation constant.

$$
J_{content}(C, G) = \dfrac{1}{2} {|| a^{[l](C)} - a^{[l](G)} ||}^2
$$

Because the content of an image tends to consist of large, abstract structures, the content loss is usually computed from a single deep layer of the network rather than from all layers. However, you could experiment with other layers to see the effect.

In [8]:
def content_loss(content, generated):
    return K.sum(K.square(generated - content))
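
To make the behaviour of this loss concrete, here is a minimal NumPy sketch of the same computation. The array shape and the names `content_loss_np`, `a_C` and `a_G` are assumptions for illustration only; random values stand in for real network activations.

```python
import numpy as np

def content_loss_np(content, generated):
    """Sum of squared element-wise differences between two activation maps."""
    return np.sum(np.square(generated - content))

rng = np.random.default_rng(0)
a_C = rng.normal(size=(14, 14, 512))  # stand-in for the activations of C
a_G = a_C.copy()                      # start with identical activations for G

loss_identical = content_loss_np(a_C, a_G)  # identical activations: zero loss
a_G[0, 0, 0] += 2.0                         # perturb a single activation by 2
loss_perturbed = content_loss_np(a_C, a_G)  # only that element contributes: 2^2

print(loss_identical, loss_perturbed)
```

The loss is zero only when the activations match exactly, and every mismatch contributes its squared difference.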

### Compute the Style Loss

To compute the style loss<sup>[[3](#References)]</sup>, we must first develop a way to describe style. This is done with the Gram matrix. The Gram matrix of an image tensor (a feature-wise outer product) measures the correlation between filters in a layer. We are asking: if filter 1 is firing, is filter 2 firing too? We want the correlation between all pairs of filters, so the Gram matrix is a (feature $\times$ feature) matrix. Each entry is computed by multiplying the activations of two channels element-wise and summing the products over all spatial positions. We want to calculate the *style* of both the **Style Image** $(S)$ and the **Generated Image** $(G)$.


$$
\text{Let } a_{i, j, k}^{[l]} = \text{ activation at } (i, j, k), \text{ where } i = h, j = w, k = c. \quad G^{[l](S)} \text{ is } n_c^{[l]} \times n_c^{[l]}
$$

$$
G_{kk'}^{[l](S)} = \sum_{i=1}^{n_h^{[l]}} \sum_{j=1}^{n_w^{[l]}} a_{i, j, k}^{[l](S)} \cdot a_{i, j, k'}^{[l](S)}
$$

$$
G_{kk'}^{[l](G)} = \sum_{i=1}^{n_h^{[l]}} \sum_{j=1}^{n_w^{[l]}} a_{i, j, k}^{[l](G)} \cdot a_{i, j, k'}^{[l](G)}
$$


$$
G_{kk'}^{[l]}, \text{ where } k, k' = 1, ..., n_{c}^{[l]}
$$

From this we can see that the gram matrix can be computed using the dot product of the channels.

In [13]:
def gram_matrix(x):
    assert K.ndim(x) == 3
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    print("Gram matrix features:", features.shape)
    gram = K.dot(features, K.transpose(features))
    return gram
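
The equivalence between the double sum and a matrix product can be checked on a small array. The following is a NumPy sketch, not the Keras code: `gram_matrix_np` is a hypothetical helper that mirrors `gram_matrix`, and the random array stands in for one layer's activations.

```python
import numpy as np

def gram_matrix_np(a):
    """Gram matrix of an (n_h, n_w, n_c) activation map, shape (n_c, n_c)."""
    n_h, n_w, n_c = a.shape
    # Move channels first and flatten the spatial axes: one row per channel
    features = a.transpose(2, 0, 1).reshape(n_c, n_h * n_w)
    return features @ features.T

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 4, 3))  # toy activations with n_h=5, n_w=4, n_c=3

# Explicit double sum over spatial positions, as in the formula above
explicit = np.zeros((3, 3))
for k in range(3):
    for k2 in range(3):
        explicit[k, k2] = np.sum(a[:, :, k] * a[:, :, k2])

gram = gram_matrix_np(a)
matches = np.allclose(gram, explicit)
print("Matrix product matches the double sum:", matches)
```

The result is symmetric and has one row and one column per channel, whatever the spatial size of the layer.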

We want to create a loss that captures the difference in style, and of course we want to minimise it. The **style loss** is therefore designed to maintain the style of the reference image in the generated image. It is based on the Gram matrices (which capture style) of feature maps from the style reference image and from the generated image. We want to minimise the (scaled) sum of squared errors (differences).

$$
\begin{align}
J_{style}^{[l]}(S, G) &= \lambda {|| G^{[l](S)} - G^{[l](G)} ||}^2 \\
                      &= \lambda \sum_{k}\sum_{k'} (G_{kk'}^{[l](S)} - G_{kk'}^{[l](G)})^2
\end{align}
$$

$$
\lambda = \dfrac{1}{(2 \cdot n_h^{[l]} n_w^{[l]}n_c^{[l]})^2}
$$

Notice the normalisation constant $\lambda$. Andrew Ng says it doesn't matter much, because we scale the loss with a hyperparameter $\beta$ later, but we include it anyway.

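The overall style loss is computed as follows:

$$
J_{style}(S, G) = \sum_l \lambda^{[l]} J_{style}^{[l]}(S, G)
$$

Here $\lambda^{[l]}$ is a per-layer weight (distinct from the normalisation constant $\lambda$ above), which lets us weight the style contribution of each layer differently when computing the overall style loss. In the code below, every layer receives the same weight.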

In [14]:
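def style_loss(style, generated):
    assert K.ndim(style) == 3
    assert K.ndim(generated) == 3
    S = gram_matrix(style)
    G = gram_matrix(generated)
    # Following the Keras example, the normalisation uses the input image's
    # channel count and size rather than the layer's own n_c, n_h and n_w,
    # so it differs from lambda above by a constant factor per layer.
    channels = 3
    size = img_nrows * img_ncols
    return K.sum(K.square(S - G)) / (4.0 * (channels ** 2) * (size ** 2))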
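
For comparison, here is a NumPy sketch of the per-layer style loss that uses the normalisation constant $\lambda$ exactly as written in the formula above, computed from the layer's own dimensions. The helper names and the toy shapes are assumptions for illustration. It also shows why the Gram matrix describes style rather than content: shuffling the spatial positions leaves the loss at (numerically) zero.

```python
import numpy as np

def gram_matrix_np(a):
    """Gram matrix of an (n_h, n_w, n_c) activation map, shape (n_c, n_c)."""
    n_h, n_w, n_c = a.shape
    f = a.reshape(n_h * n_w, n_c)  # one row per spatial position
    return f.T @ f

def layer_style_loss_np(a_S, a_G):
    """Scaled sum of squared differences between the two Gram matrices."""
    n_h, n_w, n_c = a_S.shape
    lam = 1.0 / (2.0 * n_h * n_w * n_c) ** 2
    return lam * np.sum(np.square(gram_matrix_np(a_S) - gram_matrix_np(a_G)))

rng = np.random.default_rng(0)
n_h, n_w, n_c = 6, 6, 4
a_S = rng.normal(size=(n_h, n_w, n_c))

# Same activations, but with the spatial positions randomly rearranged
perm = rng.permutation(n_h * n_w)
a_shuffled = a_S.reshape(-1, n_c)[perm].reshape(a_S.shape)

loss_same = layer_style_loss_np(a_S, a_S)
loss_shuffled = layer_style_loss_np(a_S, a_shuffled)
loss_other = layer_style_loss_np(a_S, rng.normal(size=a_S.shape))
print(loss_same, loss_shuffled, loss_other)
```

Only a change in how the channels co-activate, as in `loss_other`, produces a noticeable style loss.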

### Compute the Total Variation Loss

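The **Total Variation Loss** is an additional loss term included in [4](#References) and detailed in [5](#References). It is designed to keep the generated image locally coherent; in other words, it can be used to suppress noise in images. According to TensorFlow, the total variation is the sum of the **absolute differences** for neighboring pixel values in the input images. It measures how much noise is in the images and, writing $G_{i, j, k}$ for the pixel of the generated image at row $i$, column $j$ and channel $k$, is defined as:

$$
J_{\text{total variation}}(G) = \sum_{i, j, k} \left| G_{i+1, j, k} - G_{i, j, k} \right| + \sum_{i, j, k} \left| G_{i, j+1, k} - G_{i, j, k} \right|
$$

The Keras code uses a different variant: it squares the differences between each pixel and its neighbours below and to the right, adds the two squares, raises the result to the power $1.25$, and sums over all pixels.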

In [15]:
def total_variation_loss(x):
    a = K.square(
            x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, 1:, :img_ncols - 1, :])
    b = K.square(
            x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, :img_nrows - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
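
The two formulations can be compared side by side in NumPy. This is a sketch under stated assumptions: `tv_abs` implements the sum of absolute differences described above, `tv_keras_style` mirrors `total_variation_loss` for a single image without the batch axis, and both names are hypothetical.

```python
import numpy as np

def tv_abs(img):
    """Sum of absolute differences between neighbouring pixels of (H, W, C)."""
    dh = np.abs(img[1:, :, :] - img[:-1, :, :])  # vertical neighbours
    dw = np.abs(img[:, 1:, :] - img[:, :-1, :])  # horizontal neighbours
    return dh.sum() + dw.sum()

def tv_keras_style(img):
    """Mirror of total_variation_loss above for a single (H, W, C) image."""
    a = np.square(img[:-1, :-1, :] - img[1:, :-1, :])  # squared vertical diff
    b = np.square(img[:-1, :-1, :] - img[:-1, 1:, :])  # squared horizontal diff
    return np.sum(np.power(a + b, 1.25))

rng = np.random.default_rng(0)
flat = np.full((8, 8, 3), 0.5)                          # perfectly smooth image
noisy = flat + rng.normal(scale=0.1, size=flat.shape)   # same image plus noise

tv_flat = (tv_abs(flat), tv_keras_style(flat))
tv_noisy = (tv_abs(noisy), tv_keras_style(noisy))
print("flat:", tv_flat, "noisy:", tv_noisy)
```

Both versions are zero for a constant image and grow as neighbouring pixels disagree, which is why either can act as a noise penalty.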

### Compute the Overall Cost

Finally, we can then combine the three losses to get:

$$
J(G) = \alpha \cdot J_{\text{content}}(C, G) + \beta \cdot J_{\text{style}}(S, G) + \gamma \cdot J_{\text{total variation}}(G)
$$

In [16]:
loss = K.variable(0.0)

# Select layer features for Content Loss
layer_features = outputs_dict['block5_conv2']
# As batch process, we can grab the specific features in batch
content_features = layer_features[index["C"], :, :, :]
generated_features = layer_features[index["G"], :, :, :]

# 1. Add the content loss
loss += content_weight * content_loss(content_features, generated_features)
# Select layers for Style Loss
feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']

# 2. Add each layer to the total loss
for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]
    style_features = layer_features[index["S"], :, :, :]
    generated_features = layer_features[index["G"], :, :, :]
    # Get style loss for layer and weight it
    sl = style_loss(style_features, generated_features)
    loss += (style_weight / len(feature_layers)) * sl

# 3. Add the total variation loss from the generated image    
loss += total_variation_weight * total_variation_loss(generated_image)

# IMPORTANT! Note the deprecation warning for "+=". If you use variable.assign_add instead, the K.gradients function fails.

Gram matrix features: (64, 50176)
Gram matrix features: (64, 50176)
Gram matrix features: (128, 12544)
Gram matrix features: (128, 12544)
Gram matrix features: (256, 3136)
Gram matrix features: (256, 3136)
Gram matrix features: (512, 784)
Gram matrix features: (512, 784)
Gram matrix features: (512, 196)
Gram matrix features: (512, 196)


To optimise the generated image $(G)$, we update its pixel values via gradient descent, where $\eta$ is the learning rate (step size):

$$
G := G - \eta \dfrac{\partial}{\partial G} J(G)
$$
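
The update rule can be tried on a toy problem before applying it to images. In this sketch the cost is a simple quadratic with a known minimum, standing in for $J(G)$; the names `J_toy`, `grad_J_toy` and `G_toy` and the step size `learning_rate` are assumptions for illustration.

```python
import numpy as np

target = np.array([1.0, -2.0, 3.0])  # known minimiser of the toy cost

def J_toy(G):
    """Toy cost: half the squared distance to the target."""
    return 0.5 * np.sum(np.square(G - target))

def grad_J_toy(G):
    """Gradient of the toy cost with respect to G."""
    return G - target

G_toy = np.zeros(3)
learning_rate = 0.1
losses = []
for _ in range(100):
    losses.append(J_toy(G_toy))
    # Gradient descent step: move against the gradient
    G_toy = G_toy - learning_rate * grad_J_toy(G_toy)

print("Final G:", G_toy, "final loss:", losses[-1])
```

With this step size the loss shrinks on every iteration and `G_toy` approaches the target; a step size that is too large would make the iterates diverge instead.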



In [17]:
# get the gradients of the loss wrt the generated image
grads = K.gradients(loss, generated_image)

# Create a function to retrieve both the loss (first) and gradients (second)
outputs = [loss]
if isinstance(grads, (list, tuple)):
    outputs += grads
else:
    outputs.append(grads)

# Function
f_outputs = K.function([generated_image], outputs)

To make things easier to deal with, we create a wrapper function and a wrapper class for the optimisation process.

In [18]:
def eval_loss_and_grads(x):
    x = x.reshape((1, img_nrows, img_ncols, 3))
    outs = f_outputs([x])
    loss_value = outs[0]
    if len(outs[1:]) == 1:
        grad_values = outs[1].flatten().astype('float64')
    else:
        grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values

class Evaluator(object):
    """
    This Evaluator class makes it possible to compute loss and gradients in one pass
    while retrieving them via two separate functions, "loss" and "grads". This is done
    because scipy.optimize requires separate functions for loss and gradients,
    but computing them separately would be inefficient.
    """
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values
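
The caching idea can be demonstrated without the network. In this sketch `toy_loss_and_grads` is a hypothetical stand-in for `eval_loss_and_grads` that counts how often it runs, and `CachingEvaluator` is a simplified copy of the class above without the assertions.

```python
import numpy as np

calls = {"count": 0}

def toy_loss_and_grads(x):
    """Hypothetical stand-in for eval_loss_and_grads: one 'expensive' pass."""
    calls["count"] += 1
    return float(np.sum(np.square(x))), 2.0 * x

class CachingEvaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # Compute both values once and keep the gradients for later
        self.loss_value, self.grad_values = toy_loss_and_grads(x)
        return self.loss_value

    def grads(self, x):
        # Hand back the cached gradients, then clear the cache
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

toy_evaluator = CachingEvaluator()
x0 = np.array([1.0, 2.0, 3.0])
toy_loss = toy_evaluator.loss(x0)    # runs the expensive pass
toy_grads = toy_evaluator.grads(x0)  # reuses the cached gradients
print(toy_loss, toy_grads, "passes:", calls["count"])
```

The pattern relies on the optimiser calling `loss` before `grads` for each point, which is also what the assertions in the `Evaluator` class above check.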

We are now ready to test it out. Following [[4]](#References), this example uses the SciPy-based optimisation method, [L-BFGS](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html). We will later do the same thing using TensorFlow and the Adam optimiser.

In [19]:
# Create the evaluator
evaluator = Evaluator()

# Initialise the generated image - starting from the content image speeds up the process
x = preprocess_image(content_image_path)

for i in range(iterations):
    print('Start of iteration', i)
    start_time = time.time()
    
    # Optimize the input
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(), fprime=evaluator.grads, maxfun=20)
    print('Current loss value:', min_val)
    
    # save current generated image
    img = deprocess_image(x.copy())
    
    fname = "./generated/{:03d}_generated.png".format(i)
    save_img(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))

Start of iteration 0
Current loss value: 2826180400.0
Image saved as ./generated/000_generated.png
Iteration 0 completed in 6s
Start of iteration 1
Current loss value: 1480848900.0
Image saved as ./generated/001_generated.png
Iteration 1 completed in 2s
Start of iteration 2
Current loss value: 1174970500.0
Image saved as ./generated/002_generated.png
Iteration 2 completed in 2s
Start of iteration 3
Current loss value: 1048482300.0
Image saved as ./generated/003_generated.png
Iteration 3 completed in 3s
Start of iteration 4
Current loss value: 975706100.0
Image saved as ./generated/004_generated.png
Iteration 4 completed in 3s
Start of iteration 5
Current loss value: 933364500.0
Image saved as ./generated/005_generated.png
Iteration 5 completed in 3s
Start of iteration 6
Current loss value: 907924740.0
Image saved as ./generated/006_generated.png
Iteration 6 completed in 3s
Start of iteration 7
Current loss value: 890081900.0
Image saved as ./generated/007_generated.png
Iteration 7 comp

Here is the final result:


<br>
<div style="text-align: center; font-size: 17px; line-height:20px;">
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Content Image</p>
        <img style="margin: 0; padding:0;" src="./content_image_1.jpg" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Style Image</p>
        <img style="margin: 0; padding:0;" src="./style_image_1.jpg" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Generated Image</p>
        <img style="margin: 0; padding:0;" src="./generated/019_generated.png" width=224>
    </div>
</div>


A `gif` of the optimisation process can be created using [ImageMagick](https://www.tecmint.com/install-imagemagick-on-debian-ubuntu/) with help available [here](https://askubuntu.com/questions/648244/how-do-i-create-an-animated-gif-from-still-images-preferably-with-the-command-l).

```bash
convert -loop 0 -delay 20 ./generated/*.png generated.gif
```
<div style="text-align: center; font-size: 40px;">
    <img src="./generated_1.gif" width=224 style="display:inline-block">
</div>

Instead of using an external library to optimise the image, we can write our own Adam optimiser.

In [64]:
class Adam(object):
    
    """Adam optimizer.
    Default parameters follow those provided in the original paper.
    # Arguments
        lr: float >= 0. Learning rate.
        beta_1: float, 0 < beta < 1. Generally close to 1.
        beta_2: float, 0 < beta < 1. Generally close to 1.
        epsilon: float >= 0. Fuzz factor.
        decay: float >= 0. Learning rate decay over each update.
    # References
        - [Adam - A Method for Stochastic Optimization](http://arxiv.org/abs/1412.6980v8)
    """

    def __init__(self, lr=0.001, beta_1=0.9, beta_2=0.999,
                 epsilon=1e-8, decay=0., **kwargs):
        
        allowed_kwargs = {'clipnorm', 'clipvalue'}
        for k in kwargs:
            if k not in allowed_kwargs:
                raise TypeError('Unexpected keyword argument '
                                'passed to optimizer: ' + str(k))
        self.__dict__.update(kwargs)
        self.iterations = 0
        self.lr = lr
        self.beta_1 = beta_1
        self.beta_2 = beta_2
        self.decay = decay
        self.epsilon = epsilon
        self.initial_decay = decay

    def get_update(self, params, grads):
        """ params and grads are list of numpy arrays
        """
        original_shapes = [x.shape for x in params]
        params = [x.flatten() for x in params]
        grads = [x.flatten() for x in grads]
        
        lr = self.lr
        if self.initial_decay > 0:
            lr *= (1. / (1. + self.decay * self.iterations))

        t = self.iterations + 1
        lr_t = lr * (np.sqrt(1. - np.power(self.beta_2, t)) /
                     (1. - np.power(self.beta_1, t)))

        if not hasattr(self, 'ms'):
            self.ms = [np.zeros(p.shape) for p in params]
            self.vs = [np.zeros(p.shape) for p in params]
    
        ret = [None] * len(params)
        for i, p, g, m, v in zip(range(len(params)), params, grads, self.ms, self.vs):
            m_t = (self.beta_1 * m) + (1. - self.beta_1) * g
            v_t = (self.beta_2 * v) + (1. - self.beta_2) * np.square(g)
            p_t = p - lr_t * m_t / (np.sqrt(v_t) + self.epsilon)
            self.ms[i] = m_t
            self.vs[i] = v_t
            ret[i] = p_t
        
        self.iterations += 1
        
        for i in range(len(ret)):
            ret[i] = ret[i].reshape(original_shapes[i])
        
        return ret
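
Before pointing the optimiser at an image, the update rule can be checked on a toy problem. The following is a compact, functional restatement of a single Adam step rather than the class above; `adam_step`, the quadratic cost and the learning rate of `0.1` are assumptions chosen for illustration.

```python
import numpy as np

def adam_step(p, g, m, v, t, lr=0.1, beta_1=0.9, beta_2=0.999, epsilon=1e-8):
    """One Adam update for a single parameter array; t is the 1-based step."""
    m = beta_1 * m + (1.0 - beta_1) * g              # first moment estimate
    v = beta_2 * v + (1.0 - beta_2) * np.square(g)   # second moment estimate
    # Bias correction folded into the step size
    lr_t = lr * np.sqrt(1.0 - beta_2 ** t) / (1.0 - beta_1 ** t)
    p = p - lr_t * m / (np.sqrt(v) + epsilon)
    return p, m, v

target = np.array([1.0, -2.0, 3.0])  # minimiser of 0.5 * ||p - target||^2
p = np.zeros(3)
m = np.zeros(3)
v = np.zeros(3)
for t in range(1, 501):
    g = p - target                   # gradient of the toy cost
    p, m, v = adam_step(p, g, m, v, t)

print("Parameters after 500 steps:", p)
```

Because each step is scaled by the running gradient magnitude, the early updates have roughly the size of the learning rate regardless of how large the raw gradients are.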

In this case our code is much simpler. The optimisation process behaves differently, but nonetheless produces a pleasing result.

In [68]:
# Get the input image
x =  preprocess_image("./jensen.png")

# Create optimiser - Note: very sensitive to learning rate
optimiser = Adam(lr=20)

# Optimise the input
for i in range(30):
    print('Start of iteration', i)
    start_time = time.time()

    # Compute loss and gradients
    loss_value, grads = f_outputs([x])
    print('Current loss value:', loss_value)
    
    # Update the Image
    x = optimiser.get_update(params=[x],
                             grads=[grads.reshape(1, 224, 224, 3)])[0]
                   
    # Save current generated image
    img = deprocess_image(x.copy())

    fname = "./generated_adam/{:03d}_generated.png".format(i)
    save_img(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))                              
                               

Start of iteration 0
Current loss value: 98689384000.0
Image saved as ./generated_tf/000_generated.png
Iteration 0 completed in 0s
Start of iteration 1
Current loss value: 39688020000.0
Image saved as ./generated_tf/001_generated.png
Iteration 1 completed in 0s
Start of iteration 2
Current loss value: 49559130000.0
Image saved as ./generated_tf/002_generated.png
Iteration 2 completed in 0s
Start of iteration 3
Current loss value: 16009423000.0
Image saved as ./generated_tf/003_generated.png
Iteration 3 completed in 0s
Start of iteration 4
Current loss value: 13658618000.0
Image saved as ./generated_tf/004_generated.png
Iteration 4 completed in 0s
Start of iteration 5
Current loss value: 9087184000.0
Image saved as ./generated_tf/005_generated.png
Iteration 5 completed in 0s
Start of iteration 6
Current loss value: 13640884000.0
Image saved as ./generated_tf/006_generated.png
Iteration 6 completed in 0s
Start of iteration 7
Current loss value: 9029980000.0
Image saved as ./generated_tf/

Here is the final result:

<br>
<div style="text-align: center; font-size: 17px; line-height:20px;">
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Content Image</p>
        <img style="margin: 0; padding:0;" src="./content_image_2.png" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Style Image</p>
        <img style="margin: 0; padding:0;" src="./style_image_1.jpg" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Generated Image</p>
        <img style="margin: 0; padding:0;" src="./generated_adam/029_generated.png" width=224>
    </div>
</div>

<br>

<div style="text-align: center; font-size: 40px;">
    <img src="./generated_2.gif" width=224 style="display:inline-block">
</div>

If we want to do the whole thing with TensorFlow, we need to change a couple of steps. The main change is defining the generated image as a `variable` rather than a `placeholder`.

In [239]:
K.clear_session()

In [240]:
content_image_path = "./content_image_3.jpg"
style_image_path = "./style_image_2.jpg"

# get tensor representations of our images
content_image = K.variable(preprocess_image(content_image_path), name="Content")
style_image = K.variable(preprocess_image(style_image_path), name="Style")

# this will contain our generated image - start from the content image
generated_image = K.variable(preprocess_image(content_image_path), name="Generated")

# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([content_image,
                              style_image,
                              generated_image], axis=0)

# Create index dict
index = {"C" : 0, "S" : 1, "G" : 2}

model = vgg16.VGG16(input_tensor=input_tensor,
                    weights='imagenet', include_top=False)
print('Model loaded.')


# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

Model loaded.


Calculate the loss again!

In [241]:
loss = K.variable(0.0)

# Select layer features for Content Loss
layer_features = outputs_dict['block5_conv2']
# As batch process, we can grab the specific features in batch
content_features = layer_features[index["C"], :, :, :]
generated_features = layer_features[index["G"], :, :, :]

# 1. Add the content loss
loss += content_weight * content_loss(content_features, generated_features)
# Select layers for Style Loss
feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']

# 2. Add each layer to the total loss
for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]
    style_features = layer_features[index["S"], :, :, :]
    generated_features = layer_features[index["G"], :, :, :]
    # Get style loss for layer and weight it
    sl = style_loss(style_features, generated_features)
    loss += (style_weight / len(feature_layers)) * sl

# 3. Add the total variation loss from the generated image    
loss += total_variation_weight * total_variation_loss(generated_image)

Gram matrix features: (64, 50176)
Gram matrix features: (64, 50176)
Gram matrix features: (128, 12544)
Gram matrix features: (128, 12544)
Gram matrix features: (256, 3136)
Gram matrix features: (256, 3136)
Gram matrix features: (512, 784)
Gram matrix features: (512, 784)
Gram matrix features: (512, 196)
Gram matrix features: (512, 196)


We are now ready to optimise!

In [242]:
# Create optimizer
optimiser = tf.train.AdamOptimizer(learning_rate=10).minimize(loss, var_list=[generated_image])

# Using the current Keras session
sess = K.get_session()

# Optimise the input
for i in range(iterations):
    print('Start of iteration', i)
    start_time = time.time()

    # Compute the loss, apply the gradient update & fetch the updated image
    loss_value, _, raw_img = sess.run([loss, optimiser, generated_image])
    print('Current loss value:', loss_value)

    # Save current generated image
    img = deprocess_image(raw_img)
    
    fname = "./generated_tf/{:03d}_generated.png".format(i)
    save_img(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))

Start of iteration 0
Current loss value: 277206760000.0
Image saved as ./generated_tf/000_generated.png
Iteration 0 completed in 0s
Start of iteration 1
Current loss value: 203848430000.0
Image saved as ./generated_tf/001_generated.png
Iteration 1 completed in 0s
Start of iteration 2
Current loss value: 111929210000.0
Image saved as ./generated_tf/002_generated.png
Iteration 2 completed in 0s
Start of iteration 3
Current loss value: 49100190000.0
Image saved as ./generated_tf/003_generated.png
Iteration 3 completed in 0s
Start of iteration 4
Current loss value: 61783548000.0
Image saved as ./generated_tf/004_generated.png
Iteration 4 completed in 0s
Start of iteration 5
Current loss value: 75000420000.0
Image saved as ./generated_tf/005_generated.png
Iteration 5 completed in 0s
Start of iteration 6
Current loss value: 57795236000.0
Image saved as ./generated_tf/006_generated.png
Iteration 6 completed in 0s
Start of iteration 7
Current loss value: 42482230000.0
Image saved as ./generate

Here is the final result:


<br>
<div style="text-align: center; font-size: 17px; line-height:20px;">
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Content Image</p>
        <img style="margin: 0; padding:0;" src="./content_image_3.jpg" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Style Image</p>
        <img style="margin: 0; padding:0;" src="./style_image_2.jpg" width=224>
    </div>
    <div style="display:inline-block; text-align: center;">
        <p style="background: yellow; margin:0">Generated Image</p>
        <img style="margin: 0; padding:0;" src="./generated_tf/019_generated.png" width=224>
    </div>
</div>

<br>

<div style="text-align: center; font-size: 40px;">
    <img src="./generated_3.gif" width=224 style="display:inline-block">
</div>

## References

1. Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). A neural algorithm of artistic style. arXiv preprint [arXiv:1508.06576](https://arxiv.org/abs/1508.06576).
2. Content Cost Function - Andrew Ng on [YouTube](https://www.youtube.com/watch?v=b1I5X3UfEYI&list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF&index=40)
3. Style Cost Function - Andrew Ng on [YouTube](https://www.youtube.com/watch?v=QgkLfjfGul8&list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF&index=41)
4. Neural Style Transfer with [Keras](https://keras.io/examples/neural_style_transfer/).
5. Total Variation Loss with [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/image/total_variation).
6. Neural Style Transfer with [TensorFlow](https://www.tensorflow.org/beta/tutorials/generative/style_transfer).
7. Johnson, J., Alahi, A., & Fei-Fei, L. (2016, October). Perceptual losses for real-time style transfer and super-resolution. [In European conference on computer vision (pp. 694-711). Springer, Cham.](https://cs.stanford.edu/people/jcjohns/papers/eccv16/JohnsonECCV16.pdf)