
Neural-Style-Transfer

Usage:

nst = NeuralStyleTransfer(contentImagePath, styleImagePath, modelPath)
	-- contentImagePath: the path to the content image you want to transform
	-- styleImagePath: the path to the image you want to take the style from
	-- modelPath: the path to the ImageNet VGG-19 model used by this repository
nst.run(num_iterations=300)

You can download the VGG-19 model from this link: https://mega.nz/#!pIFimCCA!9nFD0KJ_ysx0NWfEs90bPkjvBMUn1Y82pYF-FrWLHw8

After the Neural Style Transfer finishes its task, you will find a folder named output in the same directory; the result will be there.
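A minimal end-to-end sketch of the usage above (the class and method names come from the usage block; the module name and file paths are placeholder assumptions):

    from NeuralStyleTransfer import NeuralStyleTransfer  # module name is an assumption

    contentImagePath = "images/content.jpg"     # placeholder paths
    styleImagePath = "images/style.jpg"
    modelPath = "imagenet-vgg-verydeep-19.mat"  # the VGG-19 model downloaded above

    nst = NeuralStyleTransfer(contentImagePath, styleImagePath, modelPath)
    nst.run(num_iterations=300)                 # result is written to the output folder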

What is Neural Style Transfer?

Neural style transfer takes a content image C and combines it with a style image S to produce a generated image G, such that G is the content image drawn in the style of image S.

To understand neural style transfer, we need to look at the features a ConvNet extracts at various layers, both shallow and deep. Suppose we pick a unit in layer one and find the nine image patches that maximize that unit's activation, then repeat the process for other units in layer one. We would see that units in layer one respond to relatively simple features such as an edge or a particular shade of color.

As we go deeper into the ConvNet, each unit sees a larger region of the image, so more pixels can affect its output. Layer 2 looks for somewhat more complex shapes and patterns than layer 1, and layer 3 looks for rounded shapes and people. Layer 4 detects dogs, water, and so on, and layer 5 detects even more sophisticated things.

So we've gone a long way from detecting relatively simple things such as edges in layer 1 to textures in layer 2, up to detecting very complex objects in the deeper layers.

We now define a cost function J(G) and minimize it with gradient descent to obtain our generated image G.

The cost function has two parts. The first part measures how similar the content image C is to the generated image G; the second part measures how similar the style of image G is to the style image S. Finally, we weight these two costs with two hyperparameters, alpha and beta. The steps to find the generated image G are as follows (see the sketch after the list):

  1. Initialize G randomly with dimensions W x H x 3
  2. Use gradient descent to minimize the cost function J(G)
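A rough sketch of these two steps, assuming the content and style costs are already computed (the alpha, beta, and iteration values here are illustrative, not the repository's defaults):

    import numpy as np

    def total_cost(J_content, J_style, alpha=10, beta=40):
        # J(G) = alpha * J_content(C, G) + beta * J_style(S, G)
        return alpha * J_content + beta * J_style

    # Step 1: initialize G randomly with shape (H, W, 3).
    H, W = 300, 400
    G = np.random.uniform(0, 255, size=(H, W, 3))

    # Step 2: run gradient descent on J(G). The gradient of the cost with
    # respect to G is normally obtained by automatic differentiation in a
    # framework such as TensorFlow; grad_J below is a hypothetical placeholder.
    # for i in range(300):
    #     G = G - learning_rate * grad_J(G)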

Let's calculate the content cost:

  1. Choose a hidden layer l to compute the content cost. Usually we choose l neither too shallow nor too deep in the network: if l is too shallow, the generated image will end up looking almost exactly like the content image, and if l is too deep, G is only forced to share very high-level content with C.
  2. Use a pre-trained ConvNet (e.g. the VGG network) to calculate the activations at layer l for both the content image and the generated image.
  3. The content cost is then just a measure of how similar the content image activations and the generated image activations are (see the sketch below this list).
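A minimal numpy sketch of that content cost (the 1/(4 * n_H * n_W * n_C) normalization is one common choice and an assumption here; the repository may scale it differently):

    import numpy as np

    def content_cost(a_C, a_G):
        # a_C, a_G: activations of shape (n_H, n_W, n_C) at the chosen layer l
        # for the content image C and the generated image G.
        n_H, n_W, n_C = a_C.shape
        return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)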

To calculate the style cost, we first define the style of an image as the correlation between activations across channels.
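This cross-channel correlation is usually captured with a Gram matrix; a small sketch (the unrolling into an (n_C, n_H * n_W) matrix is the standard construction, not code taken from this repository):

    import numpy as np

    def gram_matrix(a):
        # a: activations of shape (n_H, n_W, n_C) at one layer.
        # Entry (i, j) of the result measures how correlated channel i and
        # channel j are across all spatial positions.
        n_H, n_W, n_C = a.shape
        a_unrolled = a.reshape(n_H * n_W, n_C).T   # shape (n_C, n_H * n_W)
        return a_unrolled @ a_unrolled.T           # shape (n_C, n_C)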

Then we build a style matrix for both the style image, G_s, and the generated image, G_g; the style cost for a layer is the normalized squared difference between them, as sketched below.
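A sketch of that per-layer style cost, using gram_matrix from above (the normalization constant is the usual choice and an assumption here):

    def layer_style_cost(a_S, a_G):
        # a_S, a_G: activations of shape (n_H, n_W, n_C) at the same layer
        # for the style image S and the generated image G.
        n_H, n_W, n_C = a_S.shape
        G_s = gram_matrix(a_S)   # style matrix of the style image
        G_g = gram_matrix(a_G)   # style matrix of the generated image
        return np.sum((G_s - G_g) ** 2) / (4 * (n_C ** 2) * (n_H * n_W) ** 2)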

To get a visually more pleasing result, we can sum the style cost over several layers instead of just one.
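Summing the per-layer costs with per-layer weights might look like this (the layer names and equal weights are illustrative, not the repository's configuration):

    STYLE_LAYERS = [("conv1_1", 0.2), ("conv2_1", 0.2), ("conv3_1", 0.2),
                    ("conv4_1", 0.2), ("conv5_1", 0.2)]

    def style_cost(style_activations, generated_activations):
        # Both arguments: dicts mapping a layer name to that layer's activations.
        total = 0.0
        for layer_name, weight in STYLE_LAYERS:
            total += weight * layer_style_cost(style_activations[layer_name],
                                               generated_activations[layer_name])
        return total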
