## Introduction 

CNNs pass an input image through filters, which extract certain features from the input image.

Extracting features is useful for more than image classification: it can also be used to create and change images!

Style transfer applies the style of one image (the style image) to another image (the content image), producing a new image that merges the content of one with the style of the other.

## Style Transfer 

Adding the style of one image to the content of another can be done with the help of the convolutional and max pooling layers of a pretrained network.

* **Max pooling layers**: Discard information that is not useful, so that mostly the content representation is left (e.g. edges and shapes).
* **Feature space**: Consists of texture and color information, used to represent the style of an image.
* **Feature maps**: The output after applying a filter to an image. 

## Extracting features

The content image and the style image are both passed through the pretrained `vgg19` network.

**Getting the features** 
1. Feed the content image forward through the network until it reaches a deep convolutional layer.
2. The output of that layer is the content representation of the input image.
3. Send the style image through the network.
4. Extract features from multiple layers; together these represent the style of the image.
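
The four steps above can be sketched as one helper that runs an image through a network and keeps the outputs of chosen layers. This is only a minimal sketch: `get_features`, the tiny stand-in network and the names `"shallow"` and `"deep"` are assumptions made for illustration, so that the example runs without downloading any weights. With the real model, the `features` part of `vgg19` would be passed in instead.

```python
import torch
import torch.nn as nn


def get_features(image, model, layers):
    """Run `image` through `model` and keep the outputs of selected layers.

    `layers` maps a module index (as a string) to a readable name.
    """
    features = {}
    x = image
    # nn.Sequential stores its children in order under the names "0", "1", ...
    for index, layer in model.named_children():
        x = layer(x)
        if index in layers:
            features[layers[index]] = x
    return features


# Hypothetical stand-in for a pretrained network, small enough to run anywhere.
torch.manual_seed(0)
stand_in = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),   # "0"
    nn.ReLU(),                                   # "1"
    nn.MaxPool2d(2),                             # "2"
    nn.Conv2d(8, 16, kernel_size=3, padding=1),  # "3"
)

content_image = torch.rand(1, 3, 32, 32)
style_image = torch.rand(1, 3, 32, 32)

# One deep layer for the content, several layers for the style.
content_features = get_features(content_image, stand_in, {"3": "deep"})
style_features = get_features(style_image, stand_in, {"0": "shallow", "3": "deep"})
```

Each entry of the returned dictionary is a stack of feature maps with shape `(batch, depth, height, width)`.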


The next step is to create a target image, but how is this done?

1. While forming the new image, compare its content with the content of the original content image using a content loss.
2. The content loss measures the difference between the content representation of the content image (Cc) and that of the target image (Tc). The aim is to minimize this loss so that both images have the same content.
3. Concretely, the content loss is the mean squared difference between the two representations.
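
As a small worked example, the content loss can be written as a mean squared difference in PyTorch. The function name `content_loss` and the toy values are assumptions; in practice `Cc` and `Tc` would be the outputs of a deep convolutional layer.

```python
import torch


def content_loss(target_features, content_features):
    """Mean squared difference between two feature representations."""
    return torch.mean((target_features - content_features) ** 2)


# Toy representations standing in for the output of a deep convolutional layer.
Cc = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
Tc = torch.tensor([[1.0, 2.0], [3.0, 6.0]])

# Only one of the four entries differs, by 2, so the loss is (2 ** 2) / 4 = 1.
loss = content_loss(Tc, Cc)
```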

### Note that this is not the whole picture yet and there is still a style loss to define

## Gram Matrix 

The style representation of an image is based on correlations between features in individual layers. 

Using several layers gives a multi-scale style representation, which captures style features of different sizes.

A gram matrix is created by flattening the feature maps of a layer into a 2D matrix of values and multiplying that matrix by its transpose (the matrix flipped over its diagonal).

The gram matrix holds the similarities (correlations) between each pair of feature maps in the layer. Comparing the gram matrices of two images shows how close the style of one image is to the style of the other.

**Quiz**: When the height and the width of the feature maps are flattened, the new matrix has height × width columns, with one row per feature map.

**Quiz 2**: The gram matrix has a width and height equal to the depth of the layer (its number of feature maps).
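
The two quiz answers can be checked on a tiny example. The values below are made up for illustration: one layer output with 2 feature maps (depth 2), each of size 2 x 2.

```python
import torch

# One image, 2 feature maps (depth), each 2 x 2 (height x width).
feature_maps = torch.tensor([[[[1.0, 2.0], [3.0, 4.0]],
                              [[0.0, 1.0], [0.0, 1.0]]]])

_, d, h, w = feature_maps.size()
flattened = feature_maps.view(d, h * w)    # shape (2, 4): depth x (height * width)
gram = torch.mm(flattened, flattened.t())  # shape (2, 2): depth x depth

# Rows of `flattened` are [1, 2, 3, 4] and [0, 1, 0, 1], so
# gram = [[30, 6],
#         [ 6, 2]]
```

The off-diagonal entry (6) measures how strongly the two feature maps fire together; the diagonal entries compare each map with itself.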

The style loss is calculated by finding the mean squared difference between the gram matrices of the style and the target images. 
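
A minimal sketch of this calculation is shown below. The names `gram_matrix` and `style_loss`, the layer names, and the per-layer weights are assumptions for illustration; the notes only specify a mean squared difference between gram matrices.

```python
import torch


def gram_matrix(tensor):
    """Gram matrix of a (1, depth, height, width) feature tensor."""
    _, d, h, w = tensor.size()
    flat = tensor.view(d, h * w)
    return torch.mm(flat, flat.t())


def style_loss(target_features, style_features, layer_weights):
    """Weighted sum over layers of the mean squared gram matrix difference."""
    loss = torch.zeros(())
    for name, weight in layer_weights.items():
        target_gram = gram_matrix(target_features[name])
        style_gram = gram_matrix(style_features[name])
        loss = loss + weight * torch.mean((target_gram - style_gram) ** 2)
    return loss


# Random tensors standing in for the feature maps of two layers.
torch.manual_seed(0)
style_features = {"layer_a": torch.rand(1, 4, 8, 8), "layer_b": torch.rand(1, 8, 4, 4)}
target_features = {name: f + 0.1 for name, f in style_features.items()}
layer_weights = {"layer_a": 1.0, "layer_b": 0.5}  # hypothetical per-layer weights

loss = style_loss(target_features, style_features, layer_weights)
```

The loss is zero when the target has exactly the same gram matrices as the style image, and grows as they drift apart.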

Because the style loss and the content loss are calculated differently, their values will be on different scales, yet the target image should take both into account about equally. To deal with this, the content and style losses are multiplied by constant weights, alpha and beta.

The style loss must be multiplied by a much larger constant for the image to change its style. If the style weight is too small relative to the content weight, the target image will look like the **content** image.
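
The weighting can be illustrated with plain numbers. The function name `total_loss`, the default values `alpha=1.0` and `beta=1e6`, and the two loss values are assumptions chosen only to show the effect of the ratio.

```python
def total_loss(content_loss, style_loss, alpha=1.0, beta=1e6):
    """Weighted sum of the two losses: alpha weights content, beta weights style."""
    return alpha * content_loss + beta * style_loss


# Made-up loss values on very different scales.
c_loss, s_loss = 2.0, 3e-6

# With a large style weight both terms contribute: 1.0 * 2.0 + 1e6 * 3e-6, roughly 5.0.
balanced = total_loss(c_loss, s_loss)

# With a small style weight the style term almost vanishes, so optimizing
# this total would mostly preserve the content image.
content_dominated = total_loss(c_loss, s_loss, beta=1.0)
```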

## Hyperparameters Summary

1. Which content layers to use for content calculation - note that this doesn't make as much difference as choosing style layers 
2. Which style layers to use for style calculation  - in the lesson this was done using the research paper, but different layers can be used! 
3. Content weight (constant, multiply with the content loss)
4. Style weight (constant, multiply with the style loss)
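
To show where each of the four hyperparameters enters, here is a rough sketch of an optimization loop. Everything concrete in it is an assumption made so the example runs quickly without pretrained weights: the tiny random stand-in network, the random images, the layer choices, the weights, the Adam optimizer, its learning rate and the number of steps.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the pretrained feature extractor (random, frozen weights).
net = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),  # index "0"
    nn.ReLU(),                                  # index "1"
    nn.Conv2d(4, 4, kernel_size=3, padding=1),  # index "2"
)
for param in net.parameters():
    param.requires_grad_(False)

# The four hyperparameters (all values are illustrative assumptions).
content_layer = "2"                  # 1. layer used for the content loss
style_layers = {"0": 1.0, "2": 0.5}  # 2. layers used for the style loss, with per-layer weights
content_weight = 1.0                 # 3. alpha
style_weight = 1e6                   # 4. beta


def get_features(image):
    """Collect the output of every layer, keyed by its index."""
    feats, x = {}, image
    for index, layer in net.named_children():
        x = layer(x)
        feats[index] = x
    return feats


def gram_matrix(tensor):
    _, d, h, w = tensor.size()
    flat = tensor.view(d, h * w)
    return torch.mm(flat, flat.t())


# Random images standing in for real ones; the style image is darker on purpose.
content = torch.rand(1, 3, 16, 16)
style = 0.25 * torch.rand(1, 3, 16, 16)

content_feats = get_features(content)
style_feats = get_features(style)
style_grams = {k: gram_matrix(style_feats[k]) for k in style_layers}

# Start the target from the content image and update its pixels directly.
target = content.clone().requires_grad_(True)
optimizer = torch.optim.Adam([target], lr=0.005)

losses = []
for step in range(30):
    target_feats = get_features(target)
    c_loss = torch.mean((target_feats[content_layer] - content_feats[content_layer]) ** 2)
    s_loss = sum(
        weight * torch.mean((gram_matrix(target_feats[k]) - style_grams[k]) ** 2)
        for k, weight in style_layers.items()
    )
    loss = content_weight * c_loss + style_weight * s_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

Only `target` is updated; the network weights stay frozen, so the loop changes the image rather than the model.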

## Coding notes 

When loading the neural network, only the convolutional and pooling layers need to be loaded. The following code does this; note that it will not work for a different type of network.

In [None]:
from torchvision import models
vgg = models.vgg19(pretrained=True).features  # keep only the convolutional and pooling layers

**Exercise 1**: To choose the layers, create a dictionary that maps the required content and style layers to names. The layer numbers must be the same as they are in the vgg network.
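
One way to find the right numbers is to walk through the layers and name each convolution by block, starting a new block after every max pooling layer. The sketch below is an assumption-laden illustration: `name_conv_layers` is a hypothetical helper, and the small stand-in network only copies the layer pattern of the first VGG blocks so that the example runs without loading the real model.

```python
import torch.nn as nn


def name_conv_layers(features):
    """Map each conv layer's index in `features` to a 'conv<block>_<n>' name.

    A new block starts after every max pooling layer.
    """
    names = {}
    block, conv_in_block = 1, 0
    for index, layer in features.named_children():
        if isinstance(layer, nn.Conv2d):
            conv_in_block += 1
            names[index] = f"conv{block}_{conv_in_block}"
        elif isinstance(layer, nn.MaxPool2d):
            block += 1
            conv_in_block = 0
    return names


# Stand-in with the same layer pattern as the first blocks of a VGG network.
stand_in = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),   # "0", "1"
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),   # "2", "3"
    nn.MaxPool2d(2),                            # "4"
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),  # "5", "6"
)

all_names = name_conv_layers(stand_in)

# Keep only the layers wanted for the content and style representations.
wanted = {"conv1_1", "conv2_1"}
layers = {index: name for index, name in all_names.items() if name in wanted}
```

In torchvision's `vgg19` `features` module, this naming puts conv1_1 at `'0'`, conv2_1 at `'5'`, conv3_1 at `'10'`, conv4_1 at `'19'`, conv4_2 at `'21'` and conv5_1 at `'28'`.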

**Exercise 2**: The gram matrix function takes the output of a convolutional layer, reads its shape and reshapes it to be 2-dimensional. Finally, it multiplies the matrix by its transpose, as seen in the lesson.

In [None]:
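import torch

def gram_matrix(tensor):
    _, d, h, w = tensor.size()  # batch size (assumed to be 1), depth, height, width
    tensor = tensor.view(d, h * w)  # converting to 2-D: one row per feature map
    return torch.mm(tensor, tensor.t())  # tensor.t() gives us the transposed matrix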