Neural Style Transfer

This repository contains my TensorFlow implementation of the Neural Style Transfer paper. The code is based on the explanation given in Andrew Ng's Deep Learning Specialization.

What is NST?

This algorithm transfers the style of one image (the style image) onto another (the content image). The resulting (generated) image is initialized as random noise, sampled from a uniform or Gaussian distribution and blended with the content image so the two are correlated. Rather than learning model parameters, the optimization updates the pixel values of the generated image directly, based on the loss between this image and the content and style images. This implementation uses a pre-trained 19-layer VGG network, and the generated image is optimized for 20,000 epochs.
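The initialization step can be sketched as follows (a NumPy sketch; `noise_ratio` and the noise range are illustrative assumptions, not necessarily the repository's actual values):

```python
import numpy as np

def init_generated_image(content, noise_ratio=0.6):
    """Blend uniform noise with the content image so the initial
    generated image is correlated with the content image."""
    noise = np.random.uniform(-20.0, 20.0, size=content.shape).astype(np.float32)
    return noise_ratio * noise + (1.0 - noise_ratio) * content
```

A Gaussian variant simply swaps `np.random.uniform` for `np.random.normal`.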

Loss Function

Content Cost

The content cost is computed as the squared difference between the activation maps of the content image $C$ and the generated image $G$. The feature maps are usually taken from a layer in the middle of the network.

$$J_{content}(C, G) = \frac{1}{4 \times n_{H}^{[l]} \times n_{W}^{[l]} \times n_{C}^{[l]}} \sum(a^{[l](C)} - a^{[l](G)})^2$$
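A minimal NumPy sketch of this formula, assuming activations of shape `(n_H, n_W, n_C)` from a single layer:

```python
import numpy as np

def content_cost(a_C, a_G):
    """J_content = sum((a_C - a_G)^2) / (4 * n_H * n_W * n_C)."""
    n_H, n_W, n_C = a_G.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)
```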

Style Cost

The style cost is defined in terms of the unnormalized cross-covariance between activation maps across channels.

The Gram matrix used to compute the style of a feature map at layer $l$ is given by:

$$ G_{(gram)kk'}^{[l](S)} = \sum_{i=1}^{n_{H}^{[l]}} \sum_{j=1}^{n_{W}^{[l]}} a_{i,j,k}^{[l](S)} a_{i,j,k'}^{[l](S)} $$

$$ G_{(gram)kk'}^{[l](G)} = \sum_{i=1}^{n_{H}^{[l]}} \sum_{j=1}^{n_{W}^{[l]}} a_{i,j,k}^{[l](G)} a_{i,j,k'}^{[l](G)} $$

for each pair $(k, k')$, where $a_{i,j,k}^{[l]}$ is the activation value of feature map $k$ at position $(i, j)$.

If the activations at layer $l$ are unrolled into a matrix $A$ of shape $n_{C}^{[l]} \times (n_{H}^{[l]} \times n_{W}^{[l]})$, this is equal to:

$$ G_{(gram)}^{(A)} = AA^{T} $$

The style cost is computed as follows:

$$J_{style}(S, G) = \frac{1}{(2 \times n_{H}^{[l]} \times n_{W}^{[l]} \times n_{C}^{[l]})^2} \sum_{k} \sum_{k'} (G_{(gram)kk'}^{[l](S)} - G_{(gram)kk'}^{[l](G)})^2 $$
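The Gram matrix and the per-layer style cost can be sketched together in NumPy (function names are illustrative; the normalization constant $(2 n_H n_W n_C)^2$ is one common convention, as used in Andrew Ng's course):

```python
import numpy as np

def gram_matrix(a):
    # Unroll activations (n_H, n_W, n_C) into A of shape (n_C, n_H * n_W),
    # then compute G_gram = A A^T, of shape (n_C, n_C).
    n_H, n_W, n_C = a.shape
    A = a.reshape(n_H * n_W, n_C).T
    return A @ A.T

def style_layer_cost(a_S, a_G):
    # Squared difference between the Gram matrices of the style
    # and generated activations at one layer, normalized.
    n_H, n_W, n_C = a_G.shape
    GS, GG = gram_matrix(a_S), gram_matrix(a_G)
    return np.sum((GS - GG) ** 2) / (2 * n_H * n_W * n_C) ** 2
```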

We can get better results if the style cost is computed from multiple layers and then combined. Each layer is weighted by a hyperparameter $\lambda^{[l]}$ that reflects how much it contributes to the overall style:

$$J_{style} (S,G) = \sum_{l} \lambda^{[l]} J_{style}^{[l]} (S,G)$$

In this implementation, every layer used for computing the style cost is assigned a weight of $0.2$. Note that it is important that the weights across all layers sum to $1$.
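The weighted combination can be sketched as follows (the layer names follow the Keras VGG19 naming convention and are an assumption about which five layers are used; the equal weights of $0.2$ follow the text):

```python
# Hypothetical choice of five style layers, each weighted 0.2 so the
# weights sum to 1.
STYLE_WEIGHTS = {
    "block1_conv1": 0.2,
    "block2_conv1": 0.2,
    "block3_conv1": 0.2,
    "block4_conv1": 0.2,
    "block5_conv1": 0.2,
}

def total_style_cost(layer_costs, weights):
    # layer_costs: dict mapping layer name -> J_style^[l] for that layer
    # weights:     dict mapping layer name -> lambda^[l]
    return sum(weights[name] * cost for name, cost in layer_costs.items())
```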

Total Cost

The total cost combines the content and style costs, weighting them by additional hyperparameters $\alpha$ and $\beta$:

$$J_{total} (C, S, G) = \alpha J_{content} (C, G) + \beta J_{style} (S, G)$$
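A sketch of the total cost (the default `alpha` and `beta` values below are illustrative, not necessarily the values used in this repository):

```python
def total_cost(J_content, J_style, alpha=10.0, beta=40.0):
    # J_total = alpha * J_content + beta * J_style
    return alpha * J_content + beta * J_style
```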

Results

Here are some of the results of training the algorithm over 20,000 epochs. result_1 result_2 result_3 result_4

It can be seen that different values of $\alpha$ and $\beta$ slightly affect the final image:

ab1 ab2 ab3

We can also see that using different types of noise leads to different results (left: uniform noise, right: Gaussian noise):

uniform gaussian

This animation illustrates how the style transfer proceeds, from the original image plus noise to the final artistic version.

gif
