AbhiSaphire/Style-Transfer-DNN

Style Transfer Deep Neural Network

Introduction

Convolutional neural networks (CNNs) are some of the most powerful tools for image classification and analysis. They process an image in a feed-forward manner, passing the input through a series of image filters that extract features from it.

In this project I build the style transfer algorithm. Style transfer lets us apply the style of one image to another image of our choice. A trained CNN is used to extract the style of one image and apply it to the other.

Reference

Image Style Transfer Using Convolutional Neural Networks

In this paper, style transfer uses the features found in the 19-layer VGG network (VGG19). This network accepts a color image as input and passes it through a series of convolutional and pooling layers, followed by three fully connected layers. Both the content representation and the style representation are used to form the target image.

Loss Functions

Content Loss

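The content loss measures the difference between the content representations of the content image and the target image. The aim is to minimize it. Here T_c is the content representation of the target image and C_c is that of the content image.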

$$L_{content} = \frac{1}{2} \sum (T_c - C_c)^2$$
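The content loss formula can be sketched in a few lines of NumPy. This is a minimal illustration, not the repository's code: the function name `content_loss` and the small arrays standing in for feature maps are assumptions made for the example.

```python
import numpy as np


def content_loss(target_features, content_features):
    """Half the sum of squared differences between two feature arrays."""
    # Hypothetical helper: the inputs stand in for the content
    # representations T_c (target) and C_c (content image).
    diff = np.asarray(target_features, dtype=float) - np.asarray(content_features, dtype=float)
    return 0.5 * np.sum(diff ** 2)


# Toy 2x2 "feature maps" (made-up values, not real VGG19 activations).
content = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 0.0], [3.0, 2.0]])

loss = content_loss(target, content)
print(loss)  # 4.0
```

The loss is zero only when the two representations are identical, which is why minimizing it pulls the target image toward the content image.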

Gram Matrix

To make sure the target image has the same content as the content image, we calculate a content loss that compares the content representations of the two images. The same can be done for style: the style representation of an image is calculated by looking at how similar the features within a single layer are. The Gram matrix captures these similarities.

  • Vectorize (flatten) the feature maps of a layer, so that the 3D volume (depth x height x width) becomes a 2D matrix (depth x height*width).
  • Multiply the resulting 2D matrix by its transpose to get the Gram matrix.
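The two steps above can be sketched with NumPy as follows. This is an illustrative sketch only: `gram_matrix` is a hypothetical helper name, and the feature maps are assumed to be laid out as (depth, height, width).

```python
import numpy as np


def gram_matrix(feature_maps):
    """Gram matrix of a (depth, height, width) stack of feature maps."""
    depth, height, width = feature_maps.shape
    # Step 1: flatten each feature map into a row, 3D -> 2D.
    flat = feature_maps.reshape(depth, height * width)
    # Step 2: multiply the 2D matrix by its transpose.
    return flat @ flat.T


# Two toy 2x2 feature maps (made-up values).
maps = np.arange(8, dtype=float).reshape(2, 2, 2)
g = gram_matrix(maps)
print(g)
# [[ 14.  38.]
#  [ 38. 126.]]
```

The result is a depth x depth matrix, so its size depends only on the number of feature maps and not on the image resolution. Entry (i, j) is the dot product of feature maps i and j, which is large when the two features tend to fire together.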

Style Loss

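To find the style loss between the target image and the style image, we take the mean squared distance between their Gram matrices at each layer i. In the formula, T_{s,i} and S_{s,i} are the Gram matrices of the target and style images at layer i, w_i is the style weight of that layer, and a is a constant scaling factor.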

$$L_{style} = a \sum_i w_i (T_{s,i} - S_{s,i})^2$$
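A minimal NumPy sketch of the style loss, written to follow the formula term by term. The helper names, the layer name `conv1_1`, the layer weight, and the default `a=1.0` are all assumptions for illustration and are not taken from the repository.

```python
import numpy as np


def gram_matrix(feature_maps):
    """Gram matrix of a (depth, height, width) stack of feature maps."""
    depth = feature_maps.shape[0]
    flat = feature_maps.reshape(depth, -1)
    return flat @ flat.T


def style_loss(target_maps, style_maps, layer_weights, a=1.0):
    """a * sum_i w_i * sum((T_s_i - S_s_i)^2) over the chosen layers."""
    loss = 0.0
    for layer, weight in layer_weights.items():
        target_gram = gram_matrix(target_maps[layer])
        style_gram = gram_matrix(style_maps[layer])
        loss += weight * np.sum((target_gram - style_gram) ** 2)
    return a * loss


# Toy data: one hypothetical layer with two 2x2 feature maps.
target_maps = {"conv1_1": np.arange(8, dtype=float).reshape(2, 2, 2)}
style_maps = {"conv1_1": np.zeros((2, 2, 2))}
layer_weights = {"conv1_1": 0.5}

loss = style_loss(target_maps, style_maps, layer_weights)
print(loss)  # 9480.0
```

Giving earlier layers larger weights is one way to emphasize fine texture over large-scale style features, since each weight w_i controls how much that layer's Gram matrix mismatch contributes.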

Total Loss

Combining the content loss and the style loss gives the total loss. We then use standard backpropagation and optimization to reduce this loss by iteratively changing the target image until it matches the desired content and style.

$$L_{total} = \alpha L_{content} + \beta L_{style} = \alpha \left[ \frac{1}{2} \sum (T_c - C_c)^2 \right] + \beta \left[ a \sum_i w_i (T_{s,i} - S_{s,i})^2 \right]$$

α and β are weights that set the ratio of content to style you want in the target image.
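To show how the weighted total loss drives the iterative update, here is a toy sketch in NumPy. It makes several simplifying assumptions that are not from the repository: the "features" are the target array itself rather than VGG19 activations, there is a single style layer with weight 1, the gradient is written by hand, and plain gradient descent stands in for whatever optimizer the project uses. All names and constants are hypothetical.

```python
import numpy as np


def gram(features):
    """Gram matrix of a (channels, positions) feature matrix."""
    return features @ features.T


def total_loss_and_grad(target, content, style_gram, alpha, beta, a):
    """Return alpha * L_content + beta * L_style and its gradient w.r.t. target."""
    diff = target - content
    content_loss = 0.5 * np.sum(diff ** 2)
    gram_diff = gram(target) - style_gram
    style_loss = a * np.sum(gram_diff ** 2)  # single layer, w_i = 1
    loss = alpha * content_loss + beta * style_loss
    # d/dT of sum((T T^T - S)^2) is 4 (T T^T - S) T, since the difference is symmetric.
    grad = alpha * diff + beta * a * 4.0 * (gram_diff @ target)
    return loss, grad


rng = np.random.default_rng(0)
content = rng.random((3, 16))  # stand-in for content features
style = rng.random((3, 16))    # stand-in for style features
style_gram = gram(style)

alpha, beta = 1.0, 100.0       # assumed content/style weights
a = 1.0 / (3 * 16) ** 2        # assumed normalizing constant
learning_rate = 0.01

target = content.copy()        # start from the content, then add style
history = []
for step in range(200):
    loss, grad = total_loss_and_grad(target, content, style_gram, alpha, beta, a)
    history.append(loss)
    target -= learning_rate * grad

print(f"total loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

Raising β relative to α pushes the target further toward the style Gram matrix at the cost of drifting from the content, which is the trade-off the two weights control.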

Training Process

[Image: Training Content Image with Style]

Results

PS: I ran only 2000 update iterations on the content image.

[Image: Content vs Content with Poster style applied]

[Image: Content vs Content with Paint style applied]

Cheers