## Neural style transfer

- content image (C), style image (S), generated image (G)
- cost function $J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$
- initialize $G$ randomly
- use gradient descent to minimize $J(G)$
    - $G := G - \eta\,\dfrac{\partial}{\partial G}J(G)$, where $\eta$ is the learning rate
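
The update rule can be tried on a toy problem. The sketch below is not the real style-transfer cost: as a stand-in, it assumes both cost terms are plain squared pixel distances, so the gradient has a closed form and the minimizer is known. The helper `toy_cost_and_grad`, the step size `learning_rate`, and the weight values are illustrative assumptions.

```python
import numpy as np

def toy_cost_and_grad(G, C, S, alpha, beta):
    """Toy stand-in for J(G): squared pixel distances instead of ConvNet features."""
    J = alpha * 0.5 * np.sum((G - C) ** 2) + beta * 0.5 * np.sum((G - S) ** 2)
    dJ_dG = alpha * (G - C) + beta * (G - S)
    return J, dJ_dG

rng = np.random.default_rng(0)
C = rng.uniform(0, 1, size=(4, 4, 3))   # pretend content image
S = rng.uniform(0, 1, size=(4, 4, 3))   # pretend style image
G = rng.uniform(0, 1, size=(4, 4, 3))   # initialize G randomly

alpha, beta, learning_rate = 1.0, 3.0, 0.1  # hypothetical values
costs = []
for _ in range(200):
    J, dJ_dG = toy_cost_and_grad(G, C, S, alpha, beta)
    costs.append(J)
    G = G - learning_rate * dJ_dG       # gradient descent step on the pixels of G

# Exact minimizer of the toy cost, for comparison with the iterate G.
G_opt = (alpha * C + beta * S) / (alpha + beta)
```

With these values the iterates approach the weighted average $(\alpha C + \beta S)/(\alpha + \beta)$. With real ConvNet-based costs there is no closed form, and the gradient with respect to the pixels comes from backpropagation.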
    
## Content cost function

- say you use hidden layer $l$ to compute content cost
- use a pre-trained ConvNet (e.g. the VGG network)
- let $a^{[l](C)}$ and $a^{[l](G)}$ be the activations of layer $l$ on the two images
- if $a^{[l](C)}$ and $a^{[l](G)}$ are similar, both images have similar content
    - $J_{content}(C,G) = \dfrac{1}{2}||a^{[l](C)} - a^{[l](G)}||^{2}$
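
As a small numeric illustration of this formula, the sketch below evaluates the content cost with NumPy on random arrays standing in for layer activations. No real ConvNet is involved, and the helper name `content_cost` is an assumption.

```python
import numpy as np

def content_cost(a_C, a_G):
    """0.5 * squared L2 distance between two activation volumes of equal shape."""
    return 0.5 * np.sum((a_C - a_G) ** 2)

rng = np.random.default_rng(1)
a_C = rng.normal(size=(4, 4, 8))                      # stand-in activations for C
a_G_near = a_C + 0.01 * rng.normal(size=a_C.shape)    # almost the same content
a_G_far = rng.normal(size=a_C.shape)                  # unrelated content

print(content_cost(a_C, a_C))       # identical activations give zero cost
print(content_cost(a_C, a_G_near))  # small perturbation, small cost
print(content_cost(a_C, a_G_far))   # unrelated activations, larger cost
```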
    
## Style cost function

- style matrix
    - let $a_{i,j,k}^{[l]}$ = activation at $(i,j,k)$ (height, width, channel)
    - $G^{[l]}$ is the $n_{c}^{[l]} \times n_{c}^{[l]}$ style matrix of layer $l$ (not to be confused with the generated image $G$)
    - $G_{kk'}^{[l]} = \displaystyle\sum_{i=1}^{n_{H}^{[l]}}\displaystyle\sum_{j=1}^{n_{W}^{[l]}}a_{ijk}^{[l]}a_{ijk'}^{[l]}$ (do this for both the style and the generated image)
    - $J_{style}^{[l]}(S,G) = ||G^{[l](S)} - G^{[l](G)}||^{2}_{F} = \displaystyle\sum_{k}\displaystyle\sum_{k'}(G_{kk'}^{[l](S)} - G_{kk'}^{[l](G)})^{2}$ 
    - $J_{style}(S,G) = \displaystyle\sum_{l}\lambda^{[l]}J_{style}^{[l]}(S,G)$
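
The three formulas above can be sketched in NumPy as follows. Random arrays stand in for the activations of two layers; the function names and the layer weights in `lambdas` are illustrative assumptions, and no normalization constant is applied, matching the formulas as written here.

```python
import numpy as np

def gram_matrix(a):
    """Style matrix of activations a with shape (n_H, n_W, n_C).

    Entry [k, k'] is the sum over all positions (i, j) of a[i, j, k] * a[i, j, k'].
    """
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C)    # one row per spatial position
    return A.T @ A                   # shape (n_C, n_C)

def layer_style_cost(a_S, a_G):
    """Squared Frobenius norm of the difference between the two style matrices."""
    return np.sum((gram_matrix(a_S) - gram_matrix(a_G)) ** 2)

def style_cost(acts_S, acts_G, lambdas):
    """Weighted sum of the per-layer style costs."""
    return sum(lam * layer_style_cost(a_S, a_G)
               for a_S, a_G, lam in zip(acts_S, acts_G, lambdas))

rng = np.random.default_rng(2)
# Two pretend layers with different spatial sizes and channel counts.
acts_S = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 6))]
acts_G = [rng.normal(size=(8, 8, 4)), rng.normal(size=(4, 4, 6))]
lambdas = [0.5, 0.5]  # hypothetical layer weights

J_style = style_cost(acts_S, acts_G, lambdas)
```

Because the style matrix sums over all positions, it records which channels fire together but not where they fire, which is why it captures texture rather than layout.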

## Example

- Merges two images (content and style) to create a new (generated) image.

<img src="img/louvre_generated.png" style="width:750px;height:200px;">

Neural style transfer in three steps:
- Build the content cost function $J_{content}(C,G)$
- Build the style cost function $J_{style}(S,G)$
- Put it together to get $J(G) = \alpha J_{content}(C,G) + \beta J_{style}(S,G)$. 

### Packages

In [None]:
import os
import sys
import scipy.io
import scipy.misc
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
from PIL import Image
import numpy as np
import tensorflow as tf
import pprint
%matplotlib inline

In [None]:
class CONFIG:
    IMAGE_WIDTH = 400
    IMAGE_HEIGHT = 300
    COLOR_CHANNELS = 3
    NOISE_RATIO = 0.6
    MEANS = np.array([123.68, 116.779, 103.939]).reshape((1,1,1,3)) 
    VGG_MODEL = 'pretrained-model/imagenet-vgg-verydeep-19.mat' # VGG 19-layer model from the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".
    STYLE_IMAGE = 'images/stone_style.jpg' # Style image to use.
    CONTENT_IMAGE = 'images/content300.jpg' # Content image to use.
    OUTPUT_DIR = 'output/'
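
The cells shown here do not use `MEANS` or `NOISE_RATIO`. A common way to use such values, assumed here rather than taken from this notebook, is to subtract the per-channel means before feeding an image to VGG and to start `G` from the content image blended with uniform noise. The helper names and the noise range of ±20 are illustrative assumptions.

```python
import numpy as np

MEANS = np.array([123.68, 116.779, 103.939]).reshape((1, 1, 1, 3))
NOISE_RATIO = 0.6

def normalize_image(image):
    """Add a batch dimension and subtract the per-channel means (assumed preprocessing)."""
    return image[np.newaxis, ...].astype(np.float32) - MEANS

def noisy_start(content, noise_ratio=NOISE_RATIO, seed=0):
    """Blend uniform noise with the preprocessed content image to initialize G."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-20, 20, size=content.shape).astype(np.float32)  # assumed range
    return noise * noise_ratio + content * (1 - noise_ratio)

# Pretend 300x400 RGB content image with pixel values in [0, 255].
raw = np.random.default_rng(1).integers(0, 256, size=(300, 400, 3))
content = normalize_image(raw)   # shape (1, 300, 400, 3)
G0 = noisy_start(content)        # starting point for the generated image
```

Starting from a noisy copy of the content image, rather than from pure noise, is one way to make the generated image converge toward the content more quickly.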

In [None]:
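# The model is stored in a Python dictionary.
# The dictionary contains key-value pairs for each layer:
# the key is the variable name and the value is a tensor for that layer.
# load_vgg_model is not defined in this notebook; it is assumed to come from the course's helper code.
# The path below differs from CONFIG.VGG_MODEL; point it at wherever the weights are stored.
pp = pprint.PrettyPrinter(indent=4)
model = load_vgg_model("data/imagenet-vgg-verydeep-19.mat")
pp.pprint(model)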

### Compute content cost

- We want "G" to be similar to "C".
- Choosing a layer in the middle of the network usually gives the best results in practice.

#### Forward prop "C"

- Set "C" as the input to pretrained VGG, and run forward prop.
- Let $a^{(C)}$ be the activation in the chosen layer (an $n_H \times n_W \times n_C$ tensor).

#### Forward prop "G"

- Set "G" as the input to pretrained VGG, and run forward prop.
- Let $a^{(G)}$ be the corresponding activation. 

$$J_{content}(C,G) =  \frac{1}{4 \times n_H \times n_W \times n_C}\sum _{ \text{all entries}} (a^{(C)} - a^{(G)})^2$$

<img src="img/NST_LOSS.png" style="width:800px;height:400px;">

In [5]:
def compute_content_cost(a_C, a_G):
    """
    Computes the content cost.

    Arguments:
    a_C -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image C
    a_G -- tensor of dimension (1, n_H, n_W, n_C), hidden layer activations representing content of the image G

    Returns:
    J_content -- scalar tensor, the content cost given by the formula above.
    """

    # Retrieve the static dimensions from a_G
    m, n_H, n_W, n_C = a_G.get_shape().as_list()

    # Unroll the spatial dimensions: (m, n_H, n_W, n_C) -> (m, n_H * n_W, n_C).
    # The sum below runs over all entries, so this step does not change the result.
    a_C_unrolled = tf.reshape(a_C, [m, n_H * n_W, n_C])
    a_G_unrolled = tf.reshape(a_G, [m, n_H * n_W, n_C])

    # Sum of squared differences, normalized by 4 * n_H * n_W * n_C
    J_content = tf.reduce_sum(tf.square(a_C_unrolled - a_G_unrolled)) / (4 * n_H * n_W * n_C)

    return J_content
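
To sanity-check the function without TensorFlow, the same formula can be evaluated in NumPy. This is a minimal sketch on random stand-in activations; `content_cost_np` is a hypothetical helper, not part of the notebook. It also shows that unrolling does not affect the value, since the sum runs over all entries either way.

```python
import numpy as np

def content_cost_np(a_C, a_G):
    """NumPy version of J_content for activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

rng = np.random.default_rng(3)
a_C = rng.normal(size=(1, 4, 4, 3))
a_G = rng.normal(size=(1, 4, 4, 3))

J_direct = content_cost_np(a_C, a_G)

# Same computation on the unrolled (1, n_H * n_W, n_C) arrays.
a_C_unrolled = a_C.reshape(1, 4 * 4, 3)
a_G_unrolled = a_G.reshape(1, 4 * 4, 3)
J_unrolled = np.sum((a_C_unrolled - a_G_unrolled) ** 2) / (4 * 4 * 4 * 3)

print(np.isclose(J_direct, J_unrolled))
```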

### Compute style cost

