# 1. Introduction

Neural Style Transfer (NST) is a technique in which the style of a piece of artwork is transferred onto a picture. It merges two images, a "content" image (C) and a "style" image (S), to create a "generated" image (G) that combines the content of C with the style of S.

Unlike ordinary training, the neural network itself is not trained to do anything. Instead, backpropagation is used to minimize two loss values, the content loss $J_c$ and the style loss $J_s$, both defined below. Put another way: when training a neural network we update its weights and biases, whereas in style transfer we keep the weights and biases constant and update the image instead.

The principle is simple: we define two losses, one for the content ($J_c$) and one for the style ($J_s$). $J_c$ measures how different the content of two images is, while $J_s$ measures how different their style is. We then take a third image, the input (e.g. an image of white noise), and transform it to minimize both its content loss with respect to the content image and its style loss with respect to the style image.
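To make the idea of updating the image rather than the weights concrete, here is a minimal pure-Python sketch. Everything in it is an illustrative assumption: the three-pixel "images", the frozen per-pixel `WEIGHTS` standing in for a pretrained network, the scalar `gram` statistic standing in for style, and the finite-difference gradient used in place of backpropagation.

```python
# Toy illustration: the "network" weights stay fixed; only the image changes.
WEIGHTS = (0.5, -1.0, 2.0)  # frozen, hypothetical "network": one weight per pixel


def features(image):
    """Fixed feature extractor: elementwise product with the frozen weights."""
    return [w * p for w, p in zip(WEIGHTS, image)]


def gram(feats):
    """Scalar stand-in for a Gram matrix: mean squared feature value."""
    return sum(f * f for f in feats) / len(feats)


def total_loss(image, content, style):
    f_img, f_con = features(image), features(content)
    j_c = sum((a - b) ** 2 for a, b in zip(f_img, f_con))  # content loss
    j_s = (gram(f_img) - gram(features(style))) ** 2       # style loss
    return j_c + j_s


def numerical_grad(fn, x, eps=1e-6):
    """Central finite-difference gradient of fn with respect to the list x."""
    grad = []
    for k in range(len(x)):
        hi = x[:k] + [x[k] + eps] + x[k + 1:]
        lo = x[:k] + [x[k] - eps] + x[k + 1:]
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad


content = [0.2, 0.8, 0.5]
style = [0.9, 0.1, 0.4]
image = [0.5, 0.5, 0.5]  # the only quantity being optimized

loss_fn = lambda img: total_loss(img, content, style)
initial_loss = loss_fn(image)
for _ in range(200):
    g = numerical_grad(loss_fn, image)
    image = [p - 0.05 * gk for p, gk in zip(image, g)]  # update pixels only
final_loss = loss_fn(image)

print(initial_loss, final_loss)
```

The loop never touches `WEIGHTS`; the loss goes down only because the pixel values move.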

## 2. Define content and style loss

### 2.1. Content loss

The content loss is the squared *Euclidean distance* between the intermediate representations of the **pastiche** (generated) image, $P_{ij}$, and of the content image, $C_{ij}$, defined by:


\begin{align}J_c =  \sum_l \sum_{i,j} (\alpha C_{i,j}^l - \alpha P_{i,j}^l)^2\end{align}

Up to a constant normalization factor (it averages the squared differences rather than summing them), this can be computed with the criterion `nn.MSELoss()`.

In practice, you first make a list of layers at which to compute the content loss, then pass the content and pastiche images through the network up to each layer in that list. At each such layer you take the two outputs, square the difference between corresponding values, and sum the results.

This is repeated for every layer in the list, and the per-layer sums are added together. One thing to note: we multiply each representation by a value $\alpha$ (the content weight) before taking the difference and squaring it, whereas the original equation applies the weight after squaring. In practice, the former works much better than the latter, as it produces appealing stylizations much more quickly.
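A one-line identity shows what moving the weight inside the square does. This is a worked algebra step added for illustration, using the same symbols as the equation for $J_c$:

\begin{align}(\alpha C_{i,j}^l - \alpha P_{i,j}^l)^2 = \alpha^2 (C_{i,j}^l - P_{i,j}^l)^2\end{align}

So weighting the representations by $\alpha$ before squaring is equivalent to weighting each squared difference by $\alpha^2$.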


### 2.2 Style loss
For the style loss, we first need to define a module that computes the style matrix, also called a "Gram matrix" $G$, from the feature maps $P$ produced at layer $l$ by the neural network for a given input image. In linear algebra, the Gram matrix $G$ of a set of vectors $(v_{1},\dots ,v_{n})$ is the matrix of dot products, whose entries are $$G_{ij} = v_{i}^T v_{j} = \text{np.dot}(v_{i}, v_{j})$$ In other words, $G_{ij}$ measures how similar $v_i$ is to $v_j$: if they are highly similar, you would expect them to have a large dot product.

The Gram matrix contains non-localized information about the image, such as texture, shapes, and weights; in other words, its style. The PyTorch implementation of the Gram matrix is shown below.

```python
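class GramMatrix(nn.Module):

    def forward(self, input):
        # a: batch size, b: number of feature maps, (c, d): spatial dimensions
        a, b, c, d = input.size()
        # Flatten each feature map into one row
        features = input.view(a * b, c * d)
        # Dot products between every pair of rows
        G = torch.mm(features, features.t())

        # Normalize by the total number of elements
        return G.div(a * b * c * d)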
```
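To make the dot-product definition concrete, here is a small pure-Python sketch of a Gram matrix for three hand-picked vectors. The helper `gram_matrix` and the vectors are illustrative assumptions, not part of the PyTorch module.

```python
def gram_matrix(vectors):
    """Matrix of pairwise dot products: G[i][j] = v_i . v_j."""
    return [[sum(a * b for a, b in zip(u, v)) for v in vectors]
            for u in vectors]


v1 = [1.0, 2.0, 0.0]
v2 = [2.0, 4.0, 0.0]  # parallel to v1
v3 = [0.0, 0.0, 3.0]  # orthogonal to v1 and v2

G = gram_matrix([v1, v2, v3])
for row in G:
    print(row)
```

The entry for the parallel pair `v1`, `v2` is large, while the entries pairing either of them with the orthogonal `v3` are zero.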

Now that we have defined the Gram matrix as carrying information about style, the style loss is the squared Euclidean distance between the Gram matrices of the intermediate representations of the pastiche $P$ and the style image $S$:

\begin{align}J_s =  \sum_l \sum_{i,j} (\beta G_{i,j}^{s,l} - \beta G_{i,j}^{p,l})^2\end{align}

As with the content loss, we take the squared difference between each corresponding pair of values in the Gram matrices computed at each layer in a predefined list of layers, with each Gram matrix multiplied by a value $\beta$ (the style weight).
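The following pure-Python sketch illustrates why the Gram-based loss captures non-localized information. It uses two tiny hand-made "feature maps" (an assumption for illustration) and compares them with a copy whose spatial positions are reversed: the style loss is zero, while the content loss is not.

```python
def gram(feature_maps):
    """feature_maps: list of channels, each a flat list of activations."""
    n = len(feature_maps) * len(feature_maps[0])
    return [[sum(a * b for a, b in zip(u, v)) / n for v in feature_maps]
            for u in feature_maps]


def style_loss(maps_s, maps_p, beta=1.0):
    """Sum of squared differences between the weighted Gram matrices."""
    g_s, g_p = gram(maps_s), gram(maps_p)
    return sum((beta * s - beta * p) ** 2
               for row_s, row_p in zip(g_s, g_p)
               for s, p in zip(row_s, row_p))


def content_loss(maps_c, maps_p, alpha=1.0):
    """Sum of squared differences between the weighted activations."""
    return sum((alpha * c - alpha * p) ** 2
               for ch_c, ch_p in zip(maps_c, maps_p)
               for c, p in zip(ch_c, ch_p))


original = [[1.0, 2.0, 3.0, 4.0],
            [0.0, 1.0, 0.0, 1.0]]
# Same activations, with the spatial positions reversed in every channel.
shuffled = [list(reversed(channel)) for channel in original]

js = style_loss(original, shuffled)
jc = content_loss(original, shuffled)
print(js, jc)
```

Rearranging where the activations occur changes the content loss but leaves the Gram matrix, and therefore the style loss, unchanged.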

### 2.3 Total Loss

The content loss ($J_c$) tells us how close the pastiche is in content to the content image, and the style loss ($J_s$) tells us how close the pastiche is in style to the style image. We can now add them together to get the total loss ($J$).

$$
J = J_c + J_s
$$



We then backpropagate through the network to reduce this loss by getting a gradient on the pastiche image and iteratively changing it to make it look more and more like a stylized content image.

The style loss module is implemented in the same way as the content loss, except that the input is first passed through `GramMatrix`; the target is expected to be the Gram matrix of the style features:

```python
class StyleLoss(nn.Module):

    def forward(self, input, target):
        # Compare the Gram matrix of the input features with the target Gram matrix
        out = nn.MSELoss()(GramMatrix()(input), target)
        return out
```

## 3. PyTorch implementation

### 3.1 Import the required modules

First, install PyTorch 0.4.0 on Colaboratory:

```python
from os import path
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

accelerator = 'cu80' if path.exists('/opt/bin/nvidia-smi') else 'cpu'

#!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.0-{platform}-linux_x86_64.whl torchvision
import torch
print(torch.__version__)
print(torch.cuda.is_available())
```

Then load all the required modules:

```python
from __future__ import print_function

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from PIL import Image
import scipy
# scipy.misc.imsave only exists in older SciPy releases (before 1.2)
from scipy.misc import imsave
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
import torchvision.models as models
import copy
```


```python
class GramMatrix(nn.Module):

    def forward(self, input):
        a, b, c, d = input.size()
        features = input.view(a * b, c * d)
        G = torch.mm(features, features.t())

        return G.div(a * b * c * d)
```

### 3.2 Define the network

We first create a class that initializes the important variables:
- the style, content, and pastiche images;
- the layers at which we compute the content and style losses, as well as the alpha and beta weights by which we multiply the representations;
- the pretrained network that we use to get the intermediate representations (we use VGG-19);
- the loss criterion used for both the style loss and the content loss; and
- the optimizer we use.

We also want to make use of a GPU if we have one on our machine. Now for the network's "training" regime. We pass the images through the network one layer at a time and check whether each layer is one at which we compute a content or style loss. If it is, we compute the appropriate loss at that layer. Finally, we add the content and style losses together, call `backward` on that total, and take an update step.

To use the L-BFGS optimizer, we must pass to its `step` function a closure that "reevaluates the model and returns the loss". Most other optimizers do not require this, because they evaluate the loss only once per step.
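The need for a closure is easier to see with a toy optimizer that, like L-BFGS, may evaluate the loss several times within one step. The sketch below is pure Python and entirely hypothetical: `Param`, `ToyLineSearchOptimizer`, and its halving line search are illustrative stand-ins and do not reproduce how `optim.LBFGS` works internally.

```python
class Param:
    """Minimal stand-in for a tensor with a value and a gradient."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0


class ToyLineSearchOptimizer:
    """Hypothetical optimizer that re-evaluates the loss during one step."""
    def __init__(self, param, lr=1.0):
        self.param = param
        self.lr = lr

    def step(self, closure):
        loss = closure()  # loss and gradient at the starting point
        start, grad = self.param.value, self.param.grad
        if grad == 0.0:
            return loss   # already at a stationary point
        step_size = self.lr
        for _ in range(20):
            self.param.value = start - step_size * grad
            new_loss = closure()  # re-evaluate at the trial point
            if new_loss < loss:
                return new_loss
            step_size *= 0.5      # step too long: halve it and retry
        self.param.value = start  # no improvement found, restore
        return loss


calls = {"n": 0}
x = Param(5.0)


def closure():
    """Recompute the loss (x - 3)^2 and its gradient for the current x."""
    calls["n"] += 1
    x.grad = 2.0 * (x.value - 3.0)
    return (x.value - 3.0) ** 2


opt = ToyLineSearchOptimizer(x, lr=1.0)
for _ in range(5):
    opt.step(closure)

print(x.value, calls["n"])
```

Because the optimizer decides how many trial points to evaluate, it has to be handed a function it can call repeatedly rather than a single precomputed loss.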


```python
class StyleCNN(object):

    def __init__(self, style, content, pastiche):
        super(StyleCNN, self).__init__()
        self.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
        self.style = style.to(self.device)
        self.content = content.to(self.device)
        # The pastiche is the only tensor the optimizer updates
        self.pastiche = nn.Parameter(pastiche.to(self.device).data)

        self.content_layers = ['conv_4']
        self.style_layers = ['conv_1', 'conv_2', 'conv_3', 'conv_4', 'conv_5']
        self.content_weight = 1
        self.style_weight = 1000

        # Pretrained VGG-19; the optimizer never receives its weights
        self.model = models.vgg19(pretrained=True)

        self.mse = nn.MSELoss()
        self.gram = GramMatrix()
        self.optimizer = optim.LBFGS([self.pastiche])
        self.model.to(self.device)
        self.mse.to(self.device)
        self.gram.to(self.device)

    def train(self):

        def closure():
            self.optimizer.zero_grad()
            pastiche = self.pastiche.clone()
            pastiche.data.clamp_(0, 1)
            content = self.content.clone()
            style = self.style.clone()

            content_loss = 0
            style_loss = 0

            i = 1
            # An in-place ReLU would modify the conv outputs that the losses
            # are computed on, so we swap in an out-of-place version.
            not_inplace = lambda layer: nn.ReLU(inplace=False) if isinstance(layer, nn.ReLU) else layer

            for layer in list(self.model.features):
                layer = not_inplace(layer)

                # Push all three images through the same layer
                pastiche, content, style = layer(pastiche), layer(content), layer(style)

                if isinstance(layer, nn.Conv2d):
                    name = "conv_" + str(i)

                    if name in self.content_layers:
                        content_loss += self.mse(pastiche * self.content_weight,
                                                 content.detach() * self.content_weight)

                    if name in self.style_layers:
                        pastiche_g, style_g = self.gram(pastiche), self.gram(style)
                        style_loss += self.mse(pastiche_g * self.style_weight,
                                               style_g.detach() * self.style_weight)

                # The conv counter advances after each ReLU
                if isinstance(layer, nn.ReLU):
                    i += 1

            total_loss = content_loss + style_loss
            total_loss.backward()

            return total_loss

        self.optimizer.step(closure)
        return self.pastiche
```
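The layer-naming scheme used in `closure` can be traced without loading VGG-19. The sketch below is pure Python and uses a list of strings as a hypothetical stand-in for the first twelve entries of `self.model.features`; it records at which positions a content or style loss would be computed.

```python
# Hypothetical stand-in for the first layers of the VGG-19 feature stack.
layer_kinds = ["conv", "relu", "conv", "relu", "pool",
               "conv", "relu", "conv", "relu", "pool",
               "conv", "relu"]

content_layers = {"conv_4"}
style_layers = {"conv_1", "conv_2", "conv_3", "conv_4", "conv_5"}

i = 1
schedule = []  # (position in the stack, layer name, losses computed there)
for index, kind in enumerate(layer_kinds):
    if kind == "conv":
        name = "conv_" + str(i)
        losses = []
        if name in content_layers:
            losses.append("content")
        if name in style_layers:
            losses.append("style")
        schedule.append((index, name, losses))
    if kind == "relu":
        i += 1  # the counter advances after each ReLU, as in closure()

for entry in schedule:
    print(entry)
```

Pooling layers do not advance the counter, so `conv_4` is the fourth convolution in the stack rather than the layer at position 4.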

### 3.3 Functions to load and process images

One more step before we can start transferring some style: we need to write a couple of convenience functions to load and visualize images.

```python
imsize = 128

loader = transforms.Compose([
    transforms.Resize((imsize, imsize)),
    transforms.ToTensor()
])

unloader = transforms.ToPILImage()

def image_loader(image_name):
    # Force three channels so grayscale or RGBA files also work
    image = Image.open(image_name).convert("RGB")
    # Add a batch dimension: (3, H, W) -> (1, 3, H, W)
    image = loader(image).unsqueeze(0)
    return image

def save_image(input, path):
    image = input.data.clone().cpu()
    image = image.view(3, imsize, imsize)
    image = unloader(image)
    # A PIL image can be written to disk directly
    image.save(path)

def imshow(tensor, title=None):
    plt.ion()
    image = tensor.clone()
    image = image.squeeze(0)
    image = unloader(image)
    plt.imshow(image)
    if title is not None:
        plt.title(title)
    plt.pause(0.001)
```


### 3.4 Load content and style images

Let us write a function to upload images from the local computer. You only need this on Colaboratory.

```python
def upload_files():
    from google.colab import files
    uploaded = files.upload()
    for k, v in uploaded.items():
        # Write each uploaded file to the working directory
        with open(k, 'wb') as f:
            f.write(v)
    return list(uploaded.keys())
```

```python
upload_files()
```

Load the content and style images:

```python
style_img = image_loader("../Data/bread.jpg")
content_img = image_loader("../Data/bondia.jpg")
```

```python
plt.figure()
imshow(style_img, title='Style Image')
```

```python
plt.figure()
imshow(content_img, title='Content Image')
```

Define a function to run the optimization. It runs for `num_epochs` iterations and saves the pastiche as *pastiche_n.png* every 10 iterations, and once more after the final iteration, where $n$ is the iteration number.

```python
def main(style, content, pastiche, num_epochs):

    style_cnn = StyleCNN(style, content, pastiche)

    for i in range(num_epochs):
        pastiche = style_cnn.train()

        # Save every 10th iteration and always after the last one
        if i % 10 == 0 or i == num_epochs - 1:
            print("Iteration: %d, saving image" % (i + 1))
            pastiche.data.clamp_(0, 1)
            path = "pastiche_{}.png".format(i + 1)
            save_image(pastiche, path)
```
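It helps to know which file names this loop produces. The pure-Python sketch below mirrors only the save conditions of `main`; the helper name `saved_paths` and its `every` parameter are assumptions introduced for illustration.

```python
def saved_paths(num_epochs, every=10):
    """File names written by a loop that saves when i % every == 0 or at the end."""
    paths = []
    for i in range(num_epochs):
        if i % every == 0 or i == num_epochs - 1:
            paths.append("pastiche_{}.png".format(i + 1))
    return paths


paths = saved_paths(100)
print(paths)
```

With 100 iterations, the saved files are numbered 1, 11, 21, and so on up to 91, plus the final image numbered 100.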

### 3.5 Train the model
Before we run the model, let us define the initial pastiche image. To simplify the code, we take an image with the same dimensions as the content image. This image can be white noise, or it can be a copy of the content image.

```python
pastiche_img = content_img.clone()
```

```python
main(style_img, content_img, pastiche_img, 100)
```

```python
# Open the image saved after the final (100th) iteration
img = Image.open("pastiche_100.png")
img
```