This repository contains my implementation of the Neural Style Transfer paper. The code is based on the explanation given in Andrew Ng's Deep Learning Specialization.
This algorithm transfers the style of one image (the style image) onto another (the content image). The generated image is initialized as random noise, sampled from a uniform or Gaussian distribution and blended with the content image. Rather than learning network parameters, the algorithm directly updates the pixel values of the generated image to minimize a loss computed against the content and style images. This implementation uses a pre-trained 19-layer VGG network with frozen weights; the generated image is optimized for 20,000 iterations.
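A minimal sketch of this initialization, assuming NumPy; the `noise_ratio` blending knob and the noise range are illustrative, not the repo's actual values:

```python
import numpy as np

def init_generated_image(content, noise_ratio=0.6):
    """Blend random noise with the content image to initialize G.

    `noise_ratio` is a hypothetical knob; this implementation's actual
    value and noise range may differ.
    content: float array of shape (H, W, 3).
    """
    noise = np.random.uniform(-0.25, 0.25, content.shape).astype(np.float32)
    # Mixing in the content image correlates the starting point with the
    # content layout, so optimization converges faster.
    return noise_ratio * noise + (1 - noise_ratio) * content
```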
The content cost is computed as the squared difference between the activation maps of the content and generated images at a chosen layer:

$$J_{content}(C, G) = \frac{1}{4 \, n_H n_W n_C} \sum \left(a^{(C)} - a^{(G)}\right)^2$$
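This cost can be sketched directly from the formula, assuming NumPy arrays of activations from the chosen VGG layer:

```python
import numpy as np

def content_cost(a_C, a_G):
    """Squared-difference content cost between activations at one layer.

    a_C, a_G: arrays of shape (n_H, n_W, n_C).
    The 1 / (4 * n_H * n_W * n_C) normalization follows the course convention.
    """
    n_H, n_W, n_C = a_C.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)
```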
The style cost is defined via the unnormalized cross-covariance between activation maps across channels. The Gram matrix used to compute the style of a feature map at layer $l$ is

$$G^{[l]}_{kk'} = \sum_{i=1}^{n_H} \sum_{j=1}^{n_W} a^{[l]}_{ijk} \, a^{[l]}_{ijk'}$$

for each pair of channels $(k, k')$. The style cost at layer $l$ is then computed as the squared difference between the Gram matrices of the style and generated images:

$$J^{[l]}_{style}(S, G) = \frac{1}{4 \, n_C^2 \, (n_H n_W)^2} \sum_{k} \sum_{k'} \left(G^{(S)}_{kk'} - G^{(G)}_{kk'}\right)^2$$
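A NumPy sketch of the Gram matrix and the single-layer style cost, following the formulas above (the unrolling to an `(n_H * n_W, n_C)` matrix is one common way to compute all channel pairs at once):

```python
import numpy as np

def gram_matrix(a):
    """Unnormalized cross-covariance of channels: returns (n_C, n_C)."""
    n_H, n_W, n_C = a.shape
    flat = a.reshape(n_H * n_W, n_C)  # each column is one unrolled channel
    return flat.T @ flat

def layer_style_cost(a_S, a_G):
    """Style cost at a single layer from the Gram matrices of S and G."""
    n_H, n_W, n_C = a_S.shape
    G_S, G_G = gram_matrix(a_S), gram_matrix(a_G)
    return np.sum((G_S - G_G) ** 2) / (4 * n_C ** 2 * (n_H * n_W) ** 2)
```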
We can get better results if the style cost is computed from multiple layers and then combined. Each layer $l$ is weighted by a hyperparameter $\lambda^{[l]}$:

$$J_{style}(S, G) = \sum_{l} \lambda^{[l]} \, J^{[l]}_{style}(S, G)$$

In this implementation, all layers used for computing the style cost are assigned equal weights.
The total cost combines the content and style costs, weighting them by two additional hyperparameters $\alpha$ and $\beta$:

$$J(G) = \alpha \, J_{content}(C, G) + \beta \, J_{style}(S, G)$$
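The combination is a straightforward weighted sum; the default values below are illustrative assumptions, not necessarily this implementation's settings (the results section shows runs with different $\alpha$/$\beta$ values):

```python
def total_cost(J_content, J_style, alpha=10.0, beta=40.0):
    """Total cost J(G) = alpha * J_content + beta * J_style.

    alpha and beta trade off content fidelity against style strength;
    the defaults here are assumed for illustration only.
    """
    return alpha * J_content + beta * J_style
```

Gradient descent on this scalar then updates the pixels of the generated image directly, with the VGG weights held fixed.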
Here are some of the results after optimizing for 20,000 iterations.
It can be seen that different values of alpha and beta slightly affect the final image:
We can also see that using different types of noise leads to different results (left: uniform noise, right: Gaussian noise):
This animation illustrates how the style transfer progresses from the initial noisy content image to the final artistic version.