
# Style Transfer Tutorial
Style transfer is, hands down, one of my favorite things. I think the first time it was explained to me was early on when I was starting to learn about machine learning and deep learning. I was hooked. Today, we are going to walk through the style transfer script that I adapted and talk about some of what is going on with it. By the end of it, we should be able to get something like this:
![Example city generation](styletransfer.gif "City generated by this script")


## Getting Started
Okay cool! Now that we know what we are aiming for, let's dive in. To get started, we are going to need the following import statements to work, so go ahead and download the packages if you don't already have them.

In [0]:
import keras.backend as K
from keras.applications import VGG16

from PIL import Image # This is actually installed using `pip install Pillow` 

import numpy as np
import time

from scipy.optimize import fmin_l_bfgs_b
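try:
    from scipy.misc import imsave
except ImportError:
    # scipy.misc.imsave was removed in SciPy 1.2; fall back to Pillow's own save method
    def imsave(filename, image):
        image.save(filename)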

import argparse

Awesome! So let's talk a bit about what's going on here before we dive in. At a very high level, the way that style transfer works is a little something like this:

- Select an image that you want to preserve the content of (a person, a city, a landscape, etc.)
- Select an image that you want to transfer the style of (a painting or something like that)
- Take a pre-trained convolutional NN (we'll be using VGG16) and use its outputs at specific layers as feature representations of these images
- Generate a random noise image of the same size
- Create loss functions that represent the difference of the noise and the content, and the noise and the style
- Minimize the loss between the two for optimal agreement
- Output your image!

Okay, so that doesn't seem too bad. 

For this tutorial, we'll be using this image as our content image:
![Philly scene](images/city.png)

And this for our style image:
![](images/style1.png)


## Content Loss
Let's start off by talking about content loss and how we define it. It's fairly simple actually!

If you think about a CNN, the base layers start with low-level representations. Maybe lines, then parts, then entire entities. It builds from the bottom up. If you have a CNN that recognizes faces, maybe it starts with borders first, and learns higher representations like noses, eyes, and mouths in subsequent layers. The final CNN layers are what actually recognize faces.

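We'll use that same logic to define our content loss. We'll take the output of one convolutional layer of the VGG16 model and use it for our content loss. Deeper layers capture more of the high-level content; the script below uses `block2_conv2`, a fairly early layer that keeps more of the fine detail. How does our loss function actually look? Here is how the equation is defined: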
![Content loss equation](http://ankitmathur.me/classes/final_files/image02.jpg)

Looks menacing! But worry not. It reads like this:
- We have a function that we call content loss 
- That function takes in 3 parameters
    - p -> the content target
    - x -> the generated image
    - l -> the layer in question 
- We construct 2 feature representations with this information
    - F -> the feature representation of the generated image
        - In English, this is what the layer outputs to us when we run our image through the network
    - P -> the feature representation of the content target 
- We take the element-wise difference (subtract each index value by the matching one between the two matrices)
- We take the element-wise square of this (square each index value)
- We sum all those values together 
- We divide by 2 (or multiply by 0.5)

And that's our loss function! We'll actually define our code to take in two feature representations and just do the operations because it's simpler that way. Here's how we'll write the function:

In [0]:
def content_loss(content_features, generated_features):
    """
    Computes the content loss
    :param content_features: The features of the content image
    :param generated_features: The features of the generated image
    :return: The content loss
    """
    return 0.5 * K.sum(K.square(generated_features - content_features))
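
To make the arithmetic concrete, here is a small NumPy sketch of the same computation on made-up feature values. The function name `content_loss_np` and the arrays are just for illustration; in the real script the features come from a VGG16 layer.

```python
import numpy as np

def content_loss_np(content_features, generated_features):
    """NumPy version of the content loss: half the sum of squared differences."""
    return 0.5 * np.sum(np.square(generated_features - content_features))

# Toy feature maps (hypothetical values, not real VGG16 outputs)
P = np.array([[1.0, 2.0], [3.0, 4.0]])   # content features
F = np.array([[1.0, 0.0], [3.0, 6.0]])   # generated features

# Differences are 0, -2, 0, 2 -> squares sum to 8 -> half of that is 4
toy_content_loss = content_loss_np(P, F)
print(toy_content_loss)  # 4.0
```

When the generated features match the content features exactly, the loss is zero, which is what the optimizer is pushing towards.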

## Style Loss
So the style loss is a little more involved than the content loss. This is because while content is pretty easy to define (just what is the subject of the image), style is not quite so obvious. What we'll do is use something called the gram matrix. It's a sort of correlation between the feature maps of a layer, computed at various layers, that allows us to numerically represent some features of what we naturally call style.

The computation for that is defined as this:
![Gram matrix calculation](http://ankitmathur.me/classes/final_files/image04.jpg)
This is simply the dot product between the feature matrix and its transpose. Not too bad.

Next the original paper gives us these two equations to define the loss:

![](http://ankitmathur.me/classes/final_files/image05.jpg)
![](http://ankitmathur.me/classes/final_files/image06.jpg)

We'll come back to the weighting aspect later on, but the first equation is essentially taking a squared L2 norm (a squared Frobenius norm) of the difference between the two gram matrices.
- Subtract element-wise the gram matrices
- Square element-wise
- Sum them together
- Multiply by our factor 1/(4 * (color channels)^2 * (total pixel values)^2)

And then we multiply by weighting based on layers, but we'll return to that in a bit. We'll actually use that first equation to define our style loss.
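
For reference, since the equations above are shown as images, here they are written out following the notation of the original paper. $F^l$ is the feature matrix of the generated image at layer $l$, $G^l$ and $A^l$ are the gram matrices of the generated and style images, $N_l$ is the number of feature maps, $M_l$ is the size (height times width) of each feature map, and $w_l$ is the per-layer weight:

$$G^l_{ij} = \sum_k F^l_{ik} F^l_{jk}$$

$$E_l = \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} \left( G^l_{ij} - A^l_{ij} \right)^2$$

$$\mathcal{L}_{style} = \sum_l w_l E_l$$

Note that the script below plugs in the image's channel count and pixel count for $N_l$ and $M_l$ rather than each layer's own dimensions.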

**Note:** `img_channels` and `img_size` are global parameters that we'll define later.

In [0]:
def gram_matrix(features):
    """
    Calculates the gram matrix of the feature representation matrix
    :param features: The feature matrix that is used to calculate the gram matrix
    :return: The gram matrix
    """
    return K.dot(features, K.transpose(features))


def style_loss(style_matrix, generated_matrix):
    """
    Computes the style loss of the transfer
    :param style_matrix: The style representation from the target style image
    :param generated_matrix: The style representation from the generated image
    :return: The loss from the style content
    """
    # Permute the matrix to calculate proper covariance
    style_features = K.batch_flatten(K.permute_dimensions(style_matrix, (2, 0, 1)))
    generated_features = K.batch_flatten(K.permute_dimensions(generated_matrix, (2, 0, 1)))

    # Get the gram matrices
    style_mat = gram_matrix(style_features)
    generated_mat = gram_matrix(generated_features)

    return K.sum(K.square(style_mat - generated_mat)) / (4.0 * (img_channels ** 2) * (img_size ** 2))
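
If you want to poke at this without building a Keras graph, here is a rough NumPy sketch of the same steps. The `_np` function names and the random "feature maps" are made up for illustration; real feature maps would come from VGG16.

```python
import numpy as np

def gram_matrix_np(features):
    """features has shape (channels, height * width); result is (channels, channels)."""
    return features @ features.T

def style_loss_np(style_maps, generated_maps, channels, size):
    """style_maps and generated_maps have shape (height, width, channels)."""
    # Move channels first, then flatten each channel into one row
    s = np.transpose(style_maps, (2, 0, 1)).reshape(style_maps.shape[2], -1)
    g = np.transpose(generated_maps, (2, 0, 1)).reshape(generated_maps.shape[2], -1)
    diff = gram_matrix_np(s) - gram_matrix_np(g)
    return np.sum(np.square(diff)) / (4.0 * channels ** 2 * size ** 2)

rng = np.random.default_rng(0)
style_maps = rng.normal(size=(4, 4, 3))       # hypothetical 4x4 feature map, 3 channels
generated_maps = rng.normal(size=(4, 4, 3))

flat = np.transpose(style_maps, (2, 0, 1)).reshape(3, -1)
gram = gram_matrix_np(flat)
print(gram.shape)                                                   # (3, 3)
print(style_loss_np(style_maps, style_maps, channels=3, size=16))   # 0.0

# Shuffling pixel positions leaves the gram matrix unchanged: it ignores spatial layout
shuffled = flat[:, rng.permutation(flat.shape[1])]
print(np.allclose(gram, gram_matrix_np(shuffled)))                  # True
```

That last check is a nice way to see why the gram matrix captures style and not content: it only cares about which channels fire together, not where in the image they fire.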

## Variation Loss
Now we'll talk about a term called variation loss. Simply speaking, it's a regularization term. It encourages smoothness and discourages noise. This leads to cleaner images that aren't as pixelated. No fancy math here, just comparing each pixel with its neighbors one step down and one step to the right, so we'll drop the code here:

In [0]:
def variation_loss(generated_matrix):
    """
    Computes the variation loss metric (used for normalization)
    :param generated_matrix: The generated matrix
    :return: The variation loss term for normalization
    """
    a = K.square(generated_matrix[:, :img_height-1, :img_width-1, :] - generated_matrix[:, 1:, :img_width-1, :])
    b = K.square(generated_matrix[:, :img_height-1, :img_width-1, :] - generated_matrix[:, :img_height-1, 1:, :])

    return K.sum(K.pow(a + b, 1.25))
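
To see the smoothing effect in numbers, here is a NumPy sketch of the same formula applied to a flat image and to random noise. The function name `variation_loss_np` and the 8x8 test images are assumptions made for this example only.

```python
import numpy as np

def variation_loss_np(img):
    """img has shape (1, height, width, channels)."""
    h, w = img.shape[1], img.shape[2]
    a = np.square(img[:, :h-1, :w-1, :] - img[:, 1:, :w-1, :])   # neighbor one row down
    b = np.square(img[:, :h-1, :w-1, :] - img[:, :h-1, 1:, :])   # neighbor one column right
    return np.sum(np.power(a + b, 1.25))

rng = np.random.default_rng(0)
flat_img = np.full((1, 8, 8, 3), 100.0)              # perfectly smooth image
noisy_img = rng.uniform(0, 255, size=(1, 8, 8, 3))   # random noise

flat_loss = variation_loss_np(flat_img)
noisy_loss = variation_loss_np(noisy_img)
print(flat_loss)               # 0.0
print(noisy_loss > flat_loss)  # True
```

A perfectly flat image scores zero, and the noisier the image, the bigger the penalty.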

## Total Loss
So now it's time to merge all the loss functions together. We'll assign each of them a weight that determines how much each will influence the overall cost. Assigning a heavier weight to the style loss will result in more style, whereas a heavier content weight will result in the output being more true to the original content.

Here is the code:

In [0]:
def total_loss(c_layer, s_layers, generated):
    """
    Computes the total loss of a given iteration
    :param c_layer: The layer used to compute the content loss
    :param s_layers: The layer(s) used to compute the style loss
    :param generated: The generated image
    :return: The total loss
    """

    content_weight = args.content_weight
    style_weight = args.style_weight
    variation_weight = args.var_weight

    # Content loss
    content_features = c_layer[CONTENT_IMAGE_POS, :, :, :]
    generated_features = c_layer[GENERATED_IMAGE_POS, :, :, :]
    c_loss = content_loss(content_features, generated_features)

    # Style loss
    s_loss = None
    for layer in s_layers:
        style_features = layer[STYLE_IMAGE_POS, :, :, :]
        generated_features = layer[GENERATED_IMAGE_POS, :, :, :]
        if s_loss is None:
            s_loss = style_loss(style_features, generated_features) * (style_weight / len(s_layers))
        else:
            s_loss += style_loss(style_features, generated_features) * (style_weight / len(s_layers))

    # Variation loss (for regularization)
    v_loss = variation_loss(generated)

    return content_weight * c_loss + s_loss + variation_weight * v_loss

What we have here is a function that is passed 3 variables. 
- c_layer -> The layer that determines content loss
- s_layers -> The list of layers that factor into the style loss 
- generated -> The current generated image 

The arg parameters will be something we talk about towards the end, but these are basically calls to set the weights we wish to see given to these various loss factors. 

Content loss just pulls the content and generated features out of the content layer and calculates the content loss. Style loss iterates through the style layers and multiplies each layer's loss by the per-layer weighting. Finally, we get the variation loss and add the 3 loss terms together using the proper weighting.

That's the value we return.
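
Stripped of the Keras tensors, the combination is just a weighted sum. Here is a plain Python sketch; the function name `combine_losses` and the loss values are hypothetical, and the weights are the script's defaults.

```python
def combine_losses(c_loss, s_losses, v_loss,
                   content_weight=0.025, style_weight=1.0, variation_weight=1.0):
    """Weighted sum of the three loss terms; the style weight is split evenly across layers."""
    per_layer_weight = style_weight / len(s_losses)
    s_loss = sum(loss * per_layer_weight for loss in s_losses)
    return content_weight * c_loss + s_loss + variation_weight * v_loss

# Hypothetical loss values, chosen only to make the arithmetic easy to follow
total = combine_losses(c_loss=400.0, s_losses=[10.0, 20.0, 30.0, 40.0, 50.0], v_loss=5.0)
print(total)  # about 45.0, i.e. 0.025 * 400 + 150 / 5 + 5
```

Raising `style_weight` or lowering `content_weight` shifts the balance towards the style image, and the other way around.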

## Extracting Layers from the Model
So we already mentioned that we are using the VGG16 model. You can read about the model [here](https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3) but the TL;DR of it is that it's a 16-layer model that was used in ILSVRC competitions. We download a pre-trained version so that we can get straight into style transfer.

We'll write a quick function to extract what we want from this model:

In [0]:
def get_layers(content_matrix, style_matrix, generated_matrix):
    """
    Returns the content and style layers we need for the transfer
    :param content_matrix: The feature matrix of the content image
    :param style_matrix:  The feature matrix of the style image
    :param generated_matrix:  The feature matrix of the generated image
    :return: A tuple of content layers and style layers
    """
    # Prep the model for our new input sizes
    input_tensor = K.concatenate([content_matrix, style_matrix, generated_matrix], axis=0)
    model = VGG16(input_tensor=input_tensor, weights='imagenet', include_top=False)

    # Convert layers to dictionary
    layers = dict([(layer.name, layer.output) for layer in model.layers])

    # Pull the specific layers we want
    c_layers = layers['block2_conv2']
    s_layers = ['block1_conv2', 'block2_conv2', 'block3_conv3', 'block4_conv3', 'block5_conv3']
    s_layers = [layers[layer] for layer in s_layers]

    return c_layers, s_layers

By passing in our 3 matrix representations we can get the model we need. We concatenate our matrices to make 1 input tensor so that we can quickly get our values later. You'll see above how we access those specific matrices through our global variables like 'STYLE_IMAGE_POS.' 

After building our input tensor, we grab our model. We give it our input tensor, preparing it for our specific input size. We ask for the imagenet weights. We also scrap the top layers. The top layers are the layers that involve flattening the network so that we can create a couple dense layers that output classification. We don't need these here, we only need layers involving convolutions, so we toss those top layers away.

We can do a quick dict conversion so that we can look up our layers by name. Then we pull the layers we want, and return a tuple of the layers back to the calling function.
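
The batch trick is easier to see with plain arrays. In this NumPy sketch, three stand-in "images" (filled with 0, 1, and 2 so we can tell them apart) are stacked along the first axis and then picked back out by position. The arrays are placeholders, not real images.

```python
import numpy as np

CONTENT_IMAGE_POS, STYLE_IMAGE_POS, GENERATED_IMAGE_POS = 0, 1, 2

# Stand-in images of shape (1, height, width, channels)
content = np.zeros((1, 4, 4, 3))
style = np.ones((1, 4, 4, 3))
generated = np.full((1, 4, 4, 3), 2.0)

# Stack along the batch axis, the same idea as K.concatenate(..., axis=0)
batch = np.concatenate([content, style, generated], axis=0)
print(batch.shape)            # (3, 4, 4, 3)

# Layer outputs keep the batch axis, so the same positions pick out each image
style_slice = batch[STYLE_IMAGE_POS, :, :, :]
print(style_slice.shape)      # (4, 4, 3)
print(style_slice[0, 0, 0])   # 1.0
```

Because all three images go through the network as one batch, every layer output can be sliced the same way.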

## Let's Talk Some Args and Globals

So just be warned, if you're viewing this in the python notebook form, the parser and the arguments are not going to work... It'd be better to set these to globals and just modify the code where you see some of those pop up.

But anyway, most of this is self-explanatory. We are setting the arguments needed to do our style transfer. We make note of the position of our matrices in our input tensor for later usage as well.

In [0]:
parser = argparse.ArgumentParser(description='Image neural style transfer implemented with Keras')
parser.add_argument('content_img', metavar='content', type=str, help='Path to target content image')
parser.add_argument('style_img', metavar='style', type=str, help='Path to target style image')
parser.add_argument('result_img_prefix', metavar='res_prefix', type=str, help='Name of generated image')
parser.add_argument('--iter', type=int, default=10, required=False, help='Number of iterations to run')
parser.add_argument('--content_weight', type=float, default=0.025, required=False, help='Content weight')
parser.add_argument('--style_weight', type=float, default=1.0, required=False, help='Style weight')
parser.add_argument('--var_weight', type=float, default=1.0, required=False, help='Total Variation weight')
parser.add_argument('--height', type=int, default=512, required=False, help='Height of the images')
parser.add_argument('--width', type=int, default=512, required=False, help='Width of the images')

args = parser.parse_args()

# Params #

img_height = args.height
img_width = args.width
img_size = img_height * img_width
img_channels = 3

content_path = args.content_img
style_path = args.style_img
target_path = args.result_img_prefix
target_extension = '.png'

CONTENT_IMAGE_POS = 0
STYLE_IMAGE_POS = 1
GENERATED_IMAGE_POS = 2

# Params #
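
If you are in a notebook and still want to keep the parser, one option is to hand `parse_args` an explicit list so it doesn't try to read the kernel's own command line. This is a trimmed-down sketch with only a few of the arguments, and the `'result'` prefix is a placeholder.

```python
import argparse

parser = argparse.ArgumentParser(description='Image neural style transfer implemented with Keras')
parser.add_argument('content_img', type=str, help='Path to target content image')
parser.add_argument('style_img', type=str, help='Path to target style image')
parser.add_argument('result_img_prefix', type=str, help='Name of generated image')
parser.add_argument('--iter', type=int, default=10, help='Number of iterations to run')
parser.add_argument('--content_weight', type=float, default=0.025, help='Content weight')

# Passing a list here replaces sys.argv, so this also works inside a notebook cell
args = parser.parse_args(['images/city.png', 'images/style1.png', 'result', '--iter', '5'])
print(args.content_img, args.iter, args.content_weight)  # images/city.png 5 0.025
```

Anything you leave out of the list falls back to its default, just like on the command line.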

## Processing Our Images
So the natural format of our images isn't going to work. We need to modify them so that they play nice with our network. Here's the function we'll use to do so:

In [0]:
def process_img(path):
    """
    Function for processing images to the format we need
    :param path: The path to the image
    :return: The image as a data array, scaled and reflected
    """
    # Open image and resize it
    img = Image.open(path)
    img = img.resize((img_width, img_height))

    # Convert image to data array
    data = np.asarray(img, dtype='float32')
    data = np.expand_dims(data, axis=0)
    data = data[:, :, :, :3]

    # Apply pre-process to match VGG16 we are using
    data[:, :, :, 0] -= 103.939
    data[:, :, :, 1] -= 116.779
    data[:, :, :, 2] -= 123.68

    # Flip from RGB to BGR
    data = data[:, :, :, ::-1]

    return data

- We open the image using PIL and resize it
- Then we convert it to a numpy array and expand it so that it follows the format we need
    - Slicing to the first 3 channels cuts off the alpha channel, which matters if you're using PNGs
- We then pre-process the data by subtracting the per-channel mean values that VGG16 was trained with
    - The three constants are the ImageNet means in B, G, R order, and Keras's own `preprocess_input` flips to BGR before subtracting them; this script subtracts first, and `save_image` later undoes the steps in the matching order
- VGG16 also expects the data in BGR order instead of RGB, so we do that flip here as well
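
Here is a small NumPy sketch of those array steps and their inverse on a tiny 1x2 image, just to show that the round trip gets the original pixels back. The function names `preprocess` and `deprocess` are made up for this example, and it rounds before converting to `uint8` to absorb floating-point error.

```python
import numpy as np

MEANS = np.array([103.939, 116.779, 123.68], dtype='float32')

def preprocess(rgb):
    """Same array steps as above: add a batch axis, subtract the means, reverse the channels."""
    data = np.expand_dims(rgb.astype('float32'), axis=0)
    data = data - MEANS            # broadcasts over the last (channel) axis
    return data[:, :, :, ::-1]

def deprocess(data):
    """Inverse: reverse the channels back, re-add the means, clip to the valid pixel range."""
    img = data[0, :, :, ::-1] + MEANS
    return np.clip(np.rint(img), 0, 255).astype('uint8')

rgb = np.array([[[255, 128, 0], [10, 20, 30]]], dtype='uint8')  # a 1x2 RGB image
processed = preprocess(rgb)
restored = deprocess(processed)
print(processed.shape)                # (1, 1, 2, 3)
print(np.array_equal(restored, rgb))  # True
```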

All this prepares our data so that we can build out our 3 images:

In [0]:
# Prepare the generated image
generated_img = np.random.uniform(0, 255, (1, img_height, img_width, 3)) - 128.

# Load the respective content and style images
content = process_img(content_path)
style = process_img(style_path)

## Defining our Variables
As mentioned in the beginning, this is just an optimization problem. In order to make this work, we'll define some variables using our above images.

We'll also go ahead and grab our layers and define our loss and gradients. This will allow us to run our optimizations when we go through our iterations.

In [0]:
# Prepare the variables for the flow graph
content_image = K.variable(content)
style_image = K.variable(style)
generated_image = K.placeholder((1, img_height, img_width, 3))
loss = K.variable(0.)

# Grab the layers needed to prepare the loss metric
content_layer, style_layers = get_layers(content_image, style_image, generated_image)

# Define loss and gradient
loss = total_loss(content_layer, style_layers, generated_image)
grads = K.gradients(loss, generated_image)

# Define the output
outputs = [loss]
outputs += grads
f_outputs = K.function([generated_image], outputs)

Looking pretty good! We've almost got everything in order now!

## Building an Evaluator
To optimize, we'll be using an algorithm called L-BFGS, which is a little bit better for this task than just simple gradient descent. The problem is that we'll be using the SciPy implementation, and we hand it separate functions for the loss and the gradients that work on flat arrays, whereas Keras computes both in a single call. To rectify this, we construct an Evaluator class that computes both once and caches the gradients for the follow-up call:

In [0]:
class Evaluator(object):
    """
    Evaluator class used to track gradients and loss values together
    """

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

def eval_loss_and_grads(generated):
    """
    Computes the loss and gradients
    :param generated: The generated image
    :return: The loss and the gradients
    """
    generated = generated.reshape((1, img_height, img_width, 3))
    outs = f_outputs([generated])
    loss_value = outs[0]
    grad_values = outs[1].flatten().astype('float64')
    return loss_value, grad_values

You'll also notice that we wrote a function that returns the loss and the gradients together as a tuple. This is what lets our evaluator class work as intended.
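
As an aside, `fmin_l_bfgs_b` can also take a single function that returns both the loss and the gradient when `fprime` is left out. That is an alternative to the Evaluator pattern, not what this script does. A minimal sketch on a toy objective (the quadratic here is made up for illustration):

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def loss_and_grads(x):
    """Toy objective: f(x) = sum((x - 3)^2), with gradient 2 * (x - 3)."""
    diff = x - 3.0
    return np.sum(diff ** 2), 2.0 * diff

x0 = np.zeros(5)  # flat float64 starting point

# With fprime omitted (and approx_grad left at its default), func must return (loss, gradient)
x_opt, min_val, info = fmin_l_bfgs_b(loss_and_grads, x0, maxfun=20)
print(np.round(x_opt, 3))  # close to [3. 3. 3. 3. 3.]
```

Either way, the optimizer only ever sees a flat vector, which is why the script reshapes back to image dimensions inside `eval_loss_and_grads`.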

## Saving Images
For actually saving output, here's a quick function that lets us save the images that we generate:

In [0]:
def save_image(filename, generated):
    """
    Saves the generated image
    :param filename: The filename that the image is saved to
    :param generated: The image that we want saved
    :return: Nothing
    """
    # Copy first so the in-place mean shift below can't modify the caller's array,
    # then reshape the image and flip from BGR to RGB
    generated = np.copy(generated).reshape((img_height, img_width, 3))
    generated = generated[:, :, ::-1]

    # Re-apply the mean shift
    generated[:, :, 0] += 103.939
    generated[:, :, 1] += 116.779
    generated[:, :, 2] += 123.68

    # Clip values to 0-255
    generated = np.clip(generated, 0, 255).astype('uint8')

    imsave(filename, Image.fromarray(generated))
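
One NumPy detail worth keeping in mind with functions like this: `reshape` and slicing usually return views, so an in-place `+=` on them can change the array that was passed in. A small sketch with a stand-in array (the 12-element array is just a placeholder for a flattened image):

```python
import numpy as np

flat = np.zeros(12)  # stands in for the flattened generated image

# Reshaping and reversing the channel axis gives a view onto the same memory
view = flat.reshape((2, 2, 3))[:, :, ::-1]
print(np.shares_memory(flat, view))  # True
view[:, :, 0] += 100.0
print(flat.max())                    # 100.0, the original array changed too

# Copying first keeps the original untouched
before = flat.copy()
safe = np.copy(flat).reshape((2, 2, 3))[:, :, ::-1]
safe[:, :, 0] += 100.0
print(np.array_equal(flat, before))  # True
```

This matters when the same array is saved and then fed back into the optimizer on the next iteration.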

## Our Optimization Loop
Time to put it all together and finish up our code. Here's our last little snippet:

In [0]:
evaluator = Evaluator()
iterations = args.iter

name = '{}-{}{}'.format(target_path, 0, target_extension)
save_image(name, generated_img)

for i in range(iterations):
    print('Iteration:', i)
    start_time = time.time()
    generated_img, min_val, info = fmin_l_bfgs_b(evaluator.loss, generated_img.flatten(),
                                                 fprime=evaluator.grads, maxfun=20)
    print('Loss:', min_val)
    end_time = time.time()
    print('Iteration {} took {} seconds'.format(i, end_time - start_time))
    name = '{}-{}{}'.format(target_path, i+1, target_extension)
    save_image(name, generated_img)
    print('Saved image to: {}'.format(name))

- We construct our evaluator and grab our iteration count
- We save the random noise as a baseline for what our generated image starts as
- We loop through our iterations, timing as we go
    - First we use our optimization algorithm to optimize
    - Then we save our image

By default, this runs for 10 iterations and seems to work pretty well. And that's all there is to it! By running this, you should have a pretty fantastic script that you can play with, modify, and create your own style transfers!

Here is the entire script put together:

In [0]:
#!/usr/bin/env python3
"""
style-transfer.py - An implementation of the style transfer algorithm. It's a synthesis of the original paper, combined
                    with the adaption to the loss function that adds in the variation loss factor for normalization.
                    Components have been synthesized together.

For reference:
    - https://arxiv.org/pdf/1508.06576.pdf (original style loss paper)
    - https://arxiv.org/pdf/1412.0035.pdf (explains the ideas behind variation loss)
    - https://github.com/keras-team/keras/blob/master/examples/neural_style_transfer.py
      (style transfer as given by the keras team)
    - https://harishnarayanan.org/writing/artistic-style-transfer/ (longer tutorial that walks through convolutions)

"""

import keras.backend as K
from keras.applications import VGG16

from PIL import Image

import numpy as np
import time

from scipy.optimize import fmin_l_bfgs_b
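try:
    from scipy.misc import imsave
except ImportError:
    # scipy.misc.imsave was removed in SciPy 1.2; fall back to Pillow's own save method
    def imsave(filename, image):
        image.save(filename)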

import argparse


parser = argparse.ArgumentParser(description='Image neural style transfer implemented with Keras')
parser.add_argument('content_img', metavar='content', type=str, help='Path to target content image')
parser.add_argument('style_img', metavar='style', type=str, help='Path to target style image')
parser.add_argument('result_img_prefix', metavar='res_prefix', type=str, help='Name of generated image')
parser.add_argument('--iter', type=int, default=10, required=False, help='Number of iterations to run')
parser.add_argument('--content_weight', type=float, default=0.025, required=False, help='Content weight')
parser.add_argument('--style_weight', type=float, default=1.0, required=False, help='Style weight')
parser.add_argument('--var_weight', type=float, default=1.0, required=False, help='Total Variation weight')
parser.add_argument('--height', type=int, default=512, required=False, help='Height of the images')
parser.add_argument('--width', type=int, default=512, required=False, help='Width of the images')

args = parser.parse_args()

# Params #

img_height = args.height
img_width = args.width
img_size = img_height * img_width
img_channels = 3

content_path = args.content_img
style_path = args.style_img
target_path = args.result_img_prefix
target_extension = '.png'

CONTENT_IMAGE_POS = 0
STYLE_IMAGE_POS = 1
GENERATED_IMAGE_POS = 2

# Params #


def process_img(path):
    """
    Function for processing images to the format we need
    :param path: The path to the image
    :return: The image as a data array, scaled and reflected
    """
    # Open image and resize it
    img = Image.open(path)
    img = img.resize((img_width, img_height))

    # Convert image to data array
    data = np.asarray(img, dtype='float32')
    data = np.expand_dims(data, axis=0)
    data = data[:, :, :, :3]

    # Apply pre-process to match VGG16 we are using
    data[:, :, :, 0] -= 103.939
    data[:, :, :, 1] -= 116.779
    data[:, :, :, 2] -= 123.68

    # Flip from RGB to BGR
    data = data[:, :, :, ::-1]

    return data


def get_layers(content_matrix, style_matrix, generated_matrix):
    """
    Returns the content and style layers we need for the transfer
    :param content_matrix: The feature matrix of the content image
    :param style_matrix:  The feature matrix of the style image
    :param generated_matrix:  The feature matrix of the generated image
    :return: A tuple of content layers and style layers
    """
    # Prep the model for our new input sizes
    input_tensor = K.concatenate([content_matrix, style_matrix, generated_matrix], axis=0)
    model = VGG16(input_tensor=input_tensor, weights='imagenet', include_top=False)

    # Convert layers to dictionary
    layers = dict([(layer.name, layer.output) for layer in model.layers])

    # Pull the specific layers we want
    c_layers = layers['block2_conv2']
    s_layers = ['block1_conv2', 'block2_conv2', 'block3_conv3', 'block4_conv3', 'block5_conv3']
    s_layers = [layers[layer] for layer in s_layers]

    return c_layers, s_layers


def content_loss(content_features, generated_features):
    """
    Computes the content loss
    :param content_features: The features of the content image
    :param generated_features: The features of the generated image
    :return: The content loss
    """
    return 0.5 * K.sum(K.square(generated_features - content_features))


def gram_matrix(features):
    """
    Calculates the gram matrix of the feature representation matrix
    :param features: The feature matrix that is used to calculate the gram matrix
    :return: The gram matrix
    """
    return K.dot(features, K.transpose(features))


def style_loss(style_matrix, generated_matrix):
    """
    Computes the style loss of the transfer
    :param style_matrix: The style representation from the target style image
    :param generated_matrix: The style representation from the generated image
    :return: The loss from the style content
    """
    # Permute the matrix to calculate proper covariance
    style_features = K.batch_flatten(K.permute_dimensions(style_matrix, (2, 0, 1)))
    generated_features = K.batch_flatten(K.permute_dimensions(generated_matrix, (2, 0, 1)))

    # Get the gram matrices
    style_mat = gram_matrix(style_features)
    generated_mat = gram_matrix(generated_features)

    return K.sum(K.square(style_mat - generated_mat)) / (4.0 * (img_channels ** 2) * (img_size ** 2))


def variation_loss(generated_matrix):
    """
    Computes the variation loss metric (used for normalization)
    :param generated_matrix: The generated matrix
    :return: The variation loss term for normalization
    """
    a = K.square(generated_matrix[:, :img_height-1, :img_width-1, :] - generated_matrix[:, 1:, :img_width-1, :])
    b = K.square(generated_matrix[:, :img_height-1, :img_width-1, :] - generated_matrix[:, :img_height-1, 1:, :])

    return K.sum(K.pow(a + b, 1.25))


def total_loss(c_layer, s_layers, generated):
    """
    Computes the total loss of a given iteration
    :param c_layer: The layer used to compute the content loss
    :param s_layers: The layer(s) used to compute the style loss
    :param generated: The generated image
    :return: The total loss
    """

    content_weight = args.content_weight
    style_weight = args.style_weight
    variation_weight = args.var_weight

    # Content loss
    content_features = c_layer[CONTENT_IMAGE_POS, :, :, :]
    generated_features = c_layer[GENERATED_IMAGE_POS, :, :, :]
    c_loss = content_loss(content_features, generated_features)

    # Style loss
    s_loss = None
    for layer in s_layers:
        style_features = layer[STYLE_IMAGE_POS, :, :, :]
        generated_features = layer[GENERATED_IMAGE_POS, :, :, :]
        if s_loss is None:
            s_loss = style_loss(style_features, generated_features) * (style_weight / len(s_layers))
        else:
            s_loss += style_loss(style_features, generated_features) * (style_weight / len(s_layers))

    # Variation loss (for regularization)
    v_loss = variation_loss(generated)

    return content_weight * c_loss + s_loss + variation_weight * v_loss


def eval_loss_and_grads(generated):
    """
    Computes the loss and gradients
    :param generated: The generated image
    :return: The loss and the gradients
    """
    generated = generated.reshape((1, img_height, img_width, 3))
    outs = f_outputs([generated])
    loss_value = outs[0]
    grad_values = outs[1].flatten().astype('float64')
    return loss_value, grad_values


def save_image(filename, generated):
    """
    Saves the generated image
    :param filename: The filename that the image is saved to
    :param generated: The image that we want saved
    :return: Nothing
    """
    # Copy first so the in-place mean shift below can't modify the caller's array,
    # then reshape the image and flip from BGR to RGB
    generated = np.copy(generated).reshape((img_height, img_width, 3))
    generated = generated[:, :, ::-1]

    # Re-apply the mean shift
    generated[:, :, 0] += 103.939
    generated[:, :, 1] += 116.779
    generated[:, :, 2] += 123.68

    # Clip values to 0-255
    generated = np.clip(generated, 0, 255).astype('uint8')

    imsave(filename, Image.fromarray(generated))


class Evaluator(object):
    """
    Evaluator class used to track gradients and loss values together
    """

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values


if __name__ == '__main__':
    # Prepare the generated image
    generated_img = np.random.uniform(0, 255, (1, img_height, img_width, 3)) - 128.

    # Load the respective content and style images
    content = process_img(content_path)
    style = process_img(style_path)

    # Prepare the variables for the flow graph
    content_image = K.variable(content)
    style_image = K.variable(style)
    generated_image = K.placeholder((1, img_height, img_width, 3))
    loss = K.variable(0.)

    # Grab the layers needed to prepare the loss metric
    content_layer, style_layers = get_layers(content_image, style_image, generated_image)

    # Define loss and gradient
    loss = total_loss(content_layer, style_layers, generated_image)
    grads = K.gradients(loss, generated_image)

    # Define the output
    outputs = [loss]
    outputs += grads
    f_outputs = K.function([generated_image], outputs)

    evaluator = Evaluator()
    iterations = args.iter

    name = '{}-{}{}'.format(target_path, 0, target_extension)
    save_image(name, generated_img)

    for i in range(iterations):
        print('Iteration:', i)
        start_time = time.time()
        generated_img, min_val, info = fmin_l_bfgs_b(evaluator.loss, generated_img.flatten(),
                                                     fprime=evaluator.grads, maxfun=20)
        print('Loss:', min_val)
        end_time = time.time()
        print('Iteration {} took {} seconds'.format(i, end_time - start_time))
        name = '{}-{}{}'.format(target_path, i+1, target_extension)
        save_image(name, generated_img)
        print('Saved image to: {}'.format(name))