# STYLE TRANSFER

##### By: (The one and only) James Bartlett, Edited by Ashley Chien

In this project we will use Convolutional Neural Networks to implement Neural Style Transfer, a technique for creating a new image that has the content of one input image and the style of another. The idea is as follows. We work with three images: a style image, a content image, and an output image. The output image starts as random noise (in this notebook we initialize it with the content image instead) and is updated iteratively until it shows the content of the content image rendered in the style of the style image.

To do this, we run all three images through a VGG16 model pretrained to classify images. Then, for selected convolutional or pooling layers of the model, we compare the activations (the values of the neurons) at those layers for the three images. Specifically, a feature reconstruction loss compares the activations of the current output image with those of the content image, and a style loss compares the activations of the current output image with those of the style image. We then use the gradient of these losses with respect to the output image to update it. Hopefully that gives you an overview of what we will be doing; you should gain a more in-depth understanding as we go along.
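
To make the "update the output image using the gradient of a loss" idea concrete, here is a minimal toy sketch in NumPy. It is not the method used in this notebook: it compares raw pixels rather than VGG16 activations and uses plain gradient descent rather than the optimizer used later. The names `toy_loss_and_grad` and `step_size` are made up for this illustration.

```python
import numpy as np

def toy_loss_and_grad(output, target):
    """Return 0.5 * sum((output - target)^2) and its gradient w.r.t. output."""
    diff = output - target
    return 0.5 * np.sum(diff ** 2), diff

rng = np.random.default_rng(0)
target = rng.uniform(size=(4, 4, 3))  # stand-in for what we want to match
output = rng.normal(size=(4, 4, 3))   # the image being optimized, starting from noise

step_size = 0.1  # hypothetical value, chosen only for this toy problem
losses = []
for _ in range(50):
    loss, grad = toy_loss_and_grad(output, target)
    losses.append(loss)
    output = output - step_size * grad  # gradient descent step on the pixels

final_loss, _ = toy_loss_and_grad(output, target)
print(losses[0] > final_loss)
```

Each step moves the pixels a little in the direction that lowers the loss, which is the same pattern the real style transfer loop follows with a more interesting loss.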

### The paper we will be implementing is found here: https://arxiv.org/pdf/1508.06576.pdf

# Part 1: Build Model and Define Losses

#### First we want to initialize a VGG16 model we can use for style transfer.  

In [1]:
from keras.applications import vgg16
from keras.layers import Input, Concatenate
from keras.models import Model, Sequential
from keras import backend as K
import tensorflow as tf

Using TensorFlow backend.


In [None]:
K.clear_session()
content_input = Input(batch_shape=(1, 224, 224, 3))
style_input = Input(batch_shape=(1, 224, 224, 3))
output_tensor = tf.get_variable("output_tensor", [1, 224, 224, 3])
output_input = Input(tensor=output_tensor)
## TODO: use a Concatenate layer to join the three inputs along the batch axis (axis 0),
## in the order content, style, output; later cells index them as 0, 1, and 2.
input_tensor = ???
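
As a rough illustration of what concatenating along the first axis does, the NumPy sketch below stacks three placeholder arrays into one batch. It is only an analogy for the Keras layer you need above, and the array contents are dummy values chosen so each slot is easy to recognize.

```python
import numpy as np

content = np.zeros((1, 224, 224, 3))     # placeholder content image
style = np.ones((1, 224, 224, 3))        # placeholder style image
output = np.full((1, 224, 224, 3), 2.0)  # placeholder output image

# Joining along axis 0 turns three batches of size 1 into one batch of size 3.
batch = np.concatenate([content, style, output], axis=0)
print(batch.shape)  # (3, 224, 224, 3)

# Each image can then be recovered by indexing the batch axis.
content_slot, style_slot, output_slot = batch[0], batch[1], batch[2]
```

This is why the order of the inputs matters: whichever image is passed first ends up at index 0 of every layer's activations.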

If the cell below fails with an error about SSL protocol versions or something similar, you can try downloading this file
https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
and putting it in the folder `~/.keras/models` on your computer. The cell below should work after that.

In [None]:
# We now create a pretrained VGG 16 model, which is really easy to do in Keras
# include_top=False ensures we don't use the fully connected layers.
vgg_model = vgg16.VGG16(input_tensor=input_tensor, weights='imagenet', include_top=False)
# We can now look at the structure of this model
vgg_model.summary()

In [None]:
print([layer.name for layer in vgg_model.layers])

In [None]:
# Now select one of the layers listed above to use for content information,
# and select a few of them (maybe 2 or 3) to use for style information.
# Layers closer to the input capture simpler, lower-level features; layers
# closer to the end capture more complex, abstract features.
content_layer = ???
style_layers = [
    ???,
    ???
]
# You can also adjust the content and style loss weights if you want to. They control the
# trade-off between how stylized the output looks and how closely it resembles the content image.
content_loss_weight = 5.0
style_loss_weight = 500.0

In [None]:
layers_dict = dict([(layer.name, layer.output) for layer in vgg_model.layers])

### Loss Functions
We now want to define our style transfer losses. First, we define a feature reconstruction loss based on the content features and the output features. Using TensorFlow functions, implement the following loss function: $$\frac{1}{2} \sum_{i,j, k} (F_{ijk} - P_{ijk})^2$$ where $F$ is the 3D tensor of content image features and $P$ is the 3D tensor of output image features. HINT: `tf.reduce_sum` and `tf.square` will be helpful here.
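
To check your understanding of the formula before writing the TensorFlow version, here is a small worked example in NumPy. The function name `feature_loss_np` and the tiny arrays are made up for illustration; your solution should operate on TensorFlow tensors instead.

```python
import numpy as np

def feature_loss_np(content_features, output_features):
    """NumPy version of 0.5 * sum over i, j, k of (F_ijk - P_ijk)^2."""
    return 0.5 * np.sum(np.square(content_features - output_features))

F = np.array([[[1.0, 2.0], [3.0, 4.0]]])  # toy "content" features, shape (1, 2, 2)
P = np.zeros_like(F)                      # toy "output" features
example_loss = feature_loss_np(F, P)
print(example_loss)  # 15.0, i.e. 0.5 * (1 + 4 + 9 + 16)
```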

In [None]:
def feature_reconstruction_loss(content_img_features, output_img_features):
    """Takes a tensor representing a layer of VGG features from the content image
    and a tensor representing a layer of VGG features from the current output image and returns a loss value.
    """
    # TODO: YOUR CODE HERE

Now we wish to define our style loss function. First, we take our features and represent them as a Gram matrix; for more information on Gram matrices and this loss function, you can read the paper. Then we implement the loss function:
$$ \frac{1}{4H^2W^2C^2} \sum_{ij} (G_{ij} - A_{ij})^2 $$ where $G$ is the Gram matrix of the output image features, $A$ is the Gram matrix of the style image features, and $H$, $W$, and $C$ are the height, width, and number of channels given by the `img_shape` argument. Note that we have written a Gram matrix function for you, so you only need to call it.
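
For intuition, the sketch below computes a Gram matrix and the style loss formula in NumPy on a tiny feature map. The names `gram_matrix_np` and `style_loss_np` are hypothetical, and here the `shape` argument is simply the shape of the toy features; treat it as an illustration of the arithmetic rather than the TensorFlow solution.

```python
import numpy as np

def gram_matrix_np(features):
    """Gram matrix of an (H, W, C) feature map: (C, C) channel inner products."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)  # one row per spatial position
    return flat.T @ flat

def style_loss_np(style_features, output_features, shape):
    """Squared Gram-matrix difference scaled by 1 / (4 H^2 W^2 C^2)."""
    h, w, c = shape
    A = gram_matrix_np(style_features)
    G = gram_matrix_np(output_features)
    return np.sum(np.square(G - A)) / (4.0 * h**2 * w**2 * c**2)

x = np.arange(8, dtype=float).reshape(2, 2, 2)  # toy features with H = W = C = 2
G = gram_matrix_np(x)
print(G)  # [[56. 68.] [68. 84.]]
example_style_loss = style_loss_np(np.zeros_like(x), x, (2, 2, 2))
print(example_style_loss)  # 75.9375, i.e. 19440 / 256
```

Entry $(i, j)$ of the Gram matrix is the inner product of channels $i$ and $j$ over all spatial positions, so it records which features tend to fire together while discarding where in the image they occur.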

In [None]:
def gram_matrix(x):
    # make channels first dimension
    x = tf.transpose(x, (2, 0, 1))
    # flatten everything but channels so x is now (C, H*W)
    x = tf.reshape(x, tf.stack([-1, tf.reduce_prod(tf.shape(x)[1:])]))
    return tf.matmul(x, tf.transpose(x))

In [None]:
def style_loss(style_img_features, output_img_features, img_shape):
    """Takes a tensor representing a layer of VGG features from the style image and a tensor
    representing a layer of VGG features from the current output image and returns 
    the style loss for these features.
    """
    # TODO: YOUR CODE HERE

In [None]:
content_features = layers_dict[content_layer]
content_img_features = content_features[0, :, :, :]
output_content_features = content_features[2, :, :, :]
content_loss = feature_reconstruction_loss(content_img_features, output_content_features)

In [None]:
total_style_loss = tf.zeros(1)
weight = 1.0 / len(style_layers)
for style_layer in style_layers:
    style_features = layers_dict[style_layer]
    style_img_features = style_features[1, :, :, :]
    output_img_features = style_features[2, :, :, :]
    total_style_loss += weight * style_loss(style_img_features, output_img_features, (224, 224, 3))

Now we need to combine our two loss functions using the weightings we defined earlier. 

HINT: Don't overthink this; it should be a very simple operation.

In [None]:
total_loss = ???

In [None]:
optimize = tf.train.AdamOptimizer(learning_rate=10).minimize(total_loss, var_list=[output_tensor])

# Part 2: Feeding in Images

We now want to load and preprocess our images. Keras provides a `load_img` function that conveniently loads an image and resizes it to our target size. Keras also provides `vgg16.preprocess_input`, which converts images into the format VGG16 expects. Use these two functions to write the `load_image` function below.
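
Judging from the constants in `deprocess_image` below, the VGG16 preprocessing amounts to reordering the channels from RGB to BGR and subtracting a per-channel mean. The NumPy sketch below mimics that under this assumption; `preprocess_like_vgg` and `undo_preprocess` are hypothetical stand-ins for illustration, and in the exercise you should still call the Keras function.

```python
import numpy as np

# Per-channel means in BGR order, copied from the constants in deprocess_image.
BGR_MEANS = np.array([103.939, 116.779, 123.68])

def preprocess_like_vgg(rgb_batch):
    """Hypothetical stand-in: RGB -> BGR, then subtract the channel means."""
    bgr = rgb_batch[..., ::-1].astype("float64")
    return bgr - BGR_MEANS

def undo_preprocess(bgr_batch):
    """Inverse of the function above: add the means back, then BGR -> RGB."""
    rgb = (bgr_batch + BGR_MEANS)[..., ::-1]
    # Round before casting so tiny floating point errors do not change pixel values.
    return np.clip(np.rint(rgb), 0, 255).astype("uint8")

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(1, 4, 4, 3)).astype("float64")
processed = preprocess_like_vgg(rgb)
restored = undo_preprocess(processed)
print(np.array_equal(restored, rgb.astype("uint8")))  # True
```

Note that both functions return new arrays instead of modifying their inputs, which keeps the original image data safe to reuse.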

In [None]:
from keras.preprocessing.image import load_img, img_to_array
import numpy as np
def load_image(img_path):
    img = #YOUR CODE HERE: call load_img and set target_size to (224, 224)
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = #YOUR CODE HERE
    return img

def deprocess_image(x):
    # Work on a copy so the in-place additions below do not modify the caller's array
    x = x.reshape((224, 224, 3)).copy()
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x

In [None]:
content_img_path = 'images/campanile.jpg'
style_img_path = 'images/monet_style.jpg'

In [None]:
content_img = load_image(content_img_path)
style_img = load_image(style_img_path)

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
plt.imshow(deprocess_image(content_img))

In [None]:
plt.imshow(deprocess_image(style_img))

# Part 3: Stylize Images

In [None]:
assign_var = tf.assign(output_tensor, content_img)
sess = K.get_session()
var = sess.run(assign_var)

Running the cell below updates the output image 10 times. Because the initialization code (which sets the output image to the content image) is in the cell above, you can run another 10 iterations simply by rerunning the cell below if your output isn't great yet.

In [None]:
n_iterations = 10
for i in range(n_iterations):
    _, output_val, loss = sess.run(
        [optimize, output_tensor, total_loss],
        feed_dict={content_input: content_img, style_input: style_img})
    # Report the loss so you can see whether the optimization is making progress
    print("Iteration {}: loss = {}".format(i, loss))

In [None]:
output_img = deprocess_image(output_val)

In [None]:
plt.imshow(output_img)

# Part 4: Style Transfer Writeup

Now you need to write up your project. First, write a short paragraph explaining your understanding of how style transfer works. Feel free to refer to the paper if it helps, but your paragraph needs to be in your own words.

Then attach 3 sets of images to your writeup. For each set, show the original content image, the original style image, and the style transfer result. One set should use the images we provided here; include the content and style layers and the content and style weights you used. Another set should use the same provided images but with different content layers, style layers, and content and style weights; again, include your choices for the layers and weights in your writeup. Finally, include a set based on a new content image and a new style image that you choose yourself. There will be an award for the group with the coolest style transfer result.