Neural Style Transfer

This repository contains my TensorFlow implementation of the Neural Style Transfer paper. The code is based on the explanation given in Andrew Ng's Deep Learning Specialization.

What is NST?

This algorithm transfers the style of one image (the style image) onto another (the content image). The resulting (generated) image is initialized as random noise, sampled from a uniform or Gaussian distribution and blended with the content image so the two are correlated. Rather than learning model parameters, the optimization updates the pixel values of the generated image directly, based on the loss between this image and the content and style images. This implementation uses a pre-trained 19-layer VGG network, and the generated image is optimized for 20,000 epochs.
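The initialization step can be sketched as follows (a NumPy sketch; `noise_ratio` and the noise range are illustrative assumptions, not necessarily the repository's actual values):

```python
import numpy as np

def init_generated_image(content, noise_ratio=0.6):
    """Blend uniform noise with the content image so the initial
    generated image is correlated with the content image."""
    noise = np.random.uniform(-20.0, 20.0, size=content.shape).astype(np.float32)
    return noise_ratio * noise + (1.0 - noise_ratio) * content
```

A Gaussian variant simply swaps `np.random.uniform` for `np.random.normal`.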

Loss Function

Content Cost

The content cost is computed as the squared difference between the activation maps of the content image $C$ and the generated image $G$. The feature maps are usually taken from a layer in the middle of the network.

$$J_{content}(C, G) = \frac{1}{4 \times n_{H}^{[l]} \times n_{W}^{[l]} \times n_{C}^{[l]}} \sum(a^{[l](C)} - a^{[l](G)})^2$$
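A minimal NumPy sketch of this formula, assuming activations of shape `(n_H, n_W, n_C)` from a single layer:

```python
import numpy as np

def content_cost(a_C, a_G):
    """J_content = sum((a_C - a_G)^2) / (4 * n_H * n_W * n_C)."""
    n_H, n_W, n_C = a_G.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)
```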

Style Cost

The style cost is defined in terms of the unnormalized cross-covariance between activation maps across channels.

The Gram matrix used to compute the style of a feature map at layer $l$ is given by:

$$ G_{(gram)kk'}^{[l](S)} = \sum_{i=1}^{n_{H}^{[l]}} \sum_{j=1}^{n_{W}^{[l]}} a_{i,j,k}^{[l](S)} a_{i,j,k'}^{[l](S)} $$

$$ G_{(gram)kk'}^{[l](G)} = \sum_{i=1}^{n_{H}^{[l]}} \sum_{j=1}^{n_{W}^{[l]}} a_{i,j,k}^{[l](G)} a_{i,j,k'}^{[l](G)} $$

for each pair $(k, k')$, where $a_{i,j,k}^{[l]}$ is the activation value of feature map $k$ at position $(i, j)$.

If the activations at layer $l$ are unrolled into a matrix $A$ of shape $n_{C}^{[l]} \times (n_{H}^{[l]} \times n_{W}^{[l]})$, this is equal to:

$$ G_{(gram)}^{(A)} = AA^{T} $$

The style cost is computed as follows:

$$J_{style}(S, G) = \frac{1}{(2 \times n_{H}^{[l]} \times n_{W}^{[l]} \times n_{C}^{[l]})^2} \sum_{k} \sum_{k'} (G_{(gram)kk'}^{[l](S)} - G_{(gram)kk'}^{[l](G)})^2 $$
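The Gram matrix and the per-layer style cost can be sketched together in NumPy (function names are illustrative; the normalization constant $(2 n_H n_W n_C)^2$ is one common convention, as used in Andrew Ng's course):

```python
import numpy as np

def gram_matrix(a):
    # Unroll activations (n_H, n_W, n_C) into A of shape (n_C, n_H * n_W),
    # then compute G_gram = A A^T, of shape (n_C, n_C).
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C).T
    return A @ A.T

def style_layer_cost(a_S, a_G):
    # Squared difference between the Gram matrices of the style
    # and generated activations at one layer, normalized.
    n_H, n_W, n_C = a_G.shape
    GS, GG = gram_matrix(a_S), gram_matrix(a_G)
    return np.sum((GS - GG) ** 2) / (2 * n_H * n_W * n_C) ** 2
```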

We can get better results if the style cost is computed from multiple layers and then combined. Each layer is weighted by a hyperparameter $\lambda^{[l]}$ that reflects how much it contributes to the overall style:

$$J_{style} (S,G) = \sum_{l} \lambda^{[l]} J_{style}^{[l]} (S,G)$$

In this implementation, every layer used for computing the style cost is assigned a weight of $0.2$. Note that it is important that the weights across all layers sum to $1$.
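The weighted combination can be sketched as follows (the layer names follow the Keras VGG19 naming convention and are an assumption about which five layers are used; the equal weights of $0.2$ follow the text):

```python
# Hypothetical choice of five style layers, each weighted 0.2 so the
# weights sum to 1.
STYLE_WEIGHTS = {
    "block1_conv1": 0.2,
    "block2_conv1": 0.2,
    "block3_conv1": 0.2,
    "block4_conv1": 0.2,
    "block5_conv1": 0.2,
}

def total_style_cost(layer_costs, weights):
    # layer_costs: dict mapping layer name -> J_style^[l] for that layer
    # weights:     dict mapping layer name -> lambda^[l]
    return sum(weights[name] * cost for name, cost in layer_costs.items())
```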

Total Cost

The total cost combines the content and style costs, weighting them by additional hyperparameters $\alpha$ and $\beta$:

$$J_{total} (C, S, G) = \alpha J_{content} (C, G) + \beta J_{style} (S, G)$$
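A sketch of the total cost (the default `alpha` and `beta` values below are illustrative, not necessarily the values used in this repository):

```python
def total_cost(J_content, J_style, alpha=10.0, beta=40.0):
    # J_total = alpha * J_content + beta * J_style
    return alpha * J_content + beta * J_style
```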

Results

Here are some of the results of training the algorithm over 20,000 epochs. result_1 result_2 result_3 result_4

It can be seen that different values of $\alpha$ and $\beta$ slightly affect the final image:

ab1 ab2 ab3

We can also see that using different types of noise leads to different results (left: uniform noise, right: Gaussian noise):

uniform gaussian

This animation illustrates how the style transfer proceeds, from the original image plus noise to the final artistic version.

gif
