Generate novel artistic images using neural style transfer algorithm

Neural style transfer

Neural Style Transfer is an algorithm that, given a content image C and a style image S, generates a novel artistic image G combining the content of C with the style of S.

A few examples


  • The beautiful ruins of the ancient city of Persepolis (Iran) with the style of Van Gogh (The Starry Night)
  • The tomb of Cyrus the Great in Pasargadae with the style of a Ceramic Kashi from Ispahan
  • A scientific study of a turbulent fluid with the style of an abstract blue fluid painting

Transfer Learning


Neural Style Transfer (NST) uses a previously trained convolutional network, and builds on top of that. The idea of using a network trained on a different task and applying it to a new task is called transfer learning.

Following the original NST paper, I have used the VGG network, specifically VGG-19, a 19-layer version of VGG. This model has already been trained on the very large ImageNet database, and has therefore learned to recognize a variety of low-level features (in the earlier layers) and high-level features (in the deeper layers).

Cost function


Most learning algorithms optimize a cost function to obtain a set of parameter values. In NST, the cost function is optimized to obtain pixel values instead.

  1. Building the content cost function

    The "generated" image G should have content similar to that of the input image C. The most visually pleasing results are usually obtained from a layer in the middle of the network, neither too shallow nor too deep. Set the image C as the input to the pretrained VGG network and run forward propagation; let a(C) be the hidden-layer activation in the chosen layer. Then set G as the input and run forward propagation; let a(G) be the corresponding hidden-layer activation. The content cost is the normalized sum of squared differences between a(C) and a(G).

    Minimizing the content cost later helps ensure that G has content similar to C.

  2. Building the style cost function

    The Gram matrix (or style matrix), which serves as the basic building block of the style cost function, computes the correlation between filters. It has dimensions (nC, nC), where nC is the number of filters. The value G(i,j) measures how similar the activations of filter i are to the activations of filter j, so the Gram matrix captures the style of an image. The goal is to minimize the distance between the Gram matrix of the "style" image S and that of the "generated" image G.

    Here G(S) and G(G) denote the Gram matrices of the "style" image and the "generated" image respectively, computed from the activations of a particular hidden layer in the network.

    Better results are obtained when style costs from several different layers are merged. This is in contrast to the content representation, where a single hidden layer is usually sufficient. The overall style cost is a weighted sum of the per-layer style costs, where λ[l] is the weight given to layer l.

    Minimizing the style cost causes the image G to follow the style of the image S.

  3. Putting it together to get total cost

    The total cost is a linear combination of the content cost and the style cost, so minimizing it reduces both at once. α and β are hyperparameters that control the relative weighting between content and style.
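
Written out, the three costs described above take the following form. This is a sketch under assumptions: the symbols n_H and n_W (height and width of the chosen layer's activation volume) and A (that layer's activations unrolled into an n_C × n_H·n_W matrix) are introduced here for illustration, and the normalization constants follow one common convention. Other formulations of NST scale the terms differently, which only changes the effective values of α, β and λ[l].

```latex
\begin{align}
J_{\text{content}}(C, G) &= \frac{1}{4\, n_H n_W n_C}
    \sum_{i,j,k} \left( a^{(C)}_{ijk} - a^{(G)}_{ijk} \right)^2 \\
G^{(X)}_{ij} &= \sum_{k=1}^{n_H n_W} A^{(X)}_{ik} A^{(X)}_{jk},
    \qquad X \in \{S, G\} \\
J^{[l]}_{\text{style}}(S, G) &= \frac{1}{4\, n_C^2 (n_H n_W)^2}
    \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} \left( G^{(S)}_{ij} - G^{(G)}_{ij} \right)^2 \\
J_{\text{style}}(S, G) &= \sum_{l} \lambda^{[l]} J^{[l]}_{\text{style}}(S, G) \\
J(G) &= \alpha\, J_{\text{content}}(C, G) + \beta\, J_{\text{style}}(S, G)
\end{align}
```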

Minimizing this total cost with respect to the pixels of G yields a "generated" image that combines the content of the content image with the style of the style image.
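
To make the idea of "optimizing pixels" concrete, here is a minimal NumPy sketch. It is not the code in nst_main.py. It assumes a toy setup in which the pixel values themselves stand in for the VGG activations, uses a single layer for the style cost, and applies plain gradient descent with a hand-derived gradient. The function names, the normalization constants, and the values of alpha, beta and learning_rate are all illustrative assumptions.

```python
import numpy as np


def unroll(a):
    """Reshape an (n_H, n_W, n_C) volume into an (n_C, n_H * n_W) matrix."""
    n_H, n_W, n_C = a.shape
    return a.reshape(n_H * n_W, n_C).T


def gram_matrix(A):
    """Entry (i, j) is the dot product of channel i with channel j."""
    return A @ A.T


def cost_and_gradient(g, a_C, a_S, alpha, beta):
    """Total cost and its gradient with respect to the generated image g.

    Toy assumption: the 'activations' are the pixel values themselves,
    so no pretrained network is involved.
    """
    n_H, n_W, n_C = g.shape
    m = n_H * n_W
    A_G, A_S = unroll(g), unroll(a_S)

    # Content term: normalized squared difference between activations.
    diff = g - a_C
    J_content = np.sum(diff ** 2) / (4 * m * n_C)
    grad_content = diff / (2 * m * n_C)

    # Style term: normalized squared difference between Gram matrices.
    D = gram_matrix(A_G) - gram_matrix(A_S)
    J_style = np.sum(D ** 2) / (4 * n_C ** 2 * m ** 2)
    # d/dA ||A A^T - S||^2 = 4 (A A^T - S) A for symmetric S.
    grad_style = ((D @ A_G) / (n_C ** 2 * m ** 2)).T.reshape(n_H, n_W, n_C)

    J = alpha * J_content + beta * J_style
    return J, alpha * grad_content + beta * grad_style


rng = np.random.default_rng(0)
content = rng.random((4, 4, 3))    # stand-in for a(C)
style = rng.random((4, 4, 3))      # stand-in for a(S)
generated = rng.random((4, 4, 3))  # randomly initialized G

alpha, beta, learning_rate = 10.0, 40.0, 0.1  # illustrative values only
initial_cost, _ = cost_and_gradient(generated, content, style, alpha, beta)
for _ in range(200):
    _, grad = cost_and_gradient(generated, content, style, alpha, beta)
    generated -= learning_rate * grad  # update the pixels, not any weights
final_cost, _ = cost_and_gradient(generated, content, style, alpha, beta)
print(f"total cost: {initial_cost:.4f} -> {final_cost:.4f}")
```

In a real run the activations come from the pretrained network and the gradient is obtained by automatic differentiation, but the loop has the same shape: the network weights stay fixed and only the generated image changes.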

Steps


Follow these steps to synthesize new images. Each step maps onto a part of nst_main.py.

  1. Create an interactive TensorFlow session
  2. Load the content image
  3. Load the style image
  4. Randomly initialize the image to be generated
  5. Load the pretrained VGG-19 model
  6. Build the TensorFlow graph:
    • Run the content image through the VGG-19 model and compute the content cost
    • Run the style image through the VGG-19 model and compute the style cost
    • Compute the total cost
    • Define the optimizer and the learning rate
  7. Initialize the TensorFlow graph and run it for a large number of iterations (200 here), updating the generated image at every step
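
Step 4 can be sketched as follows. The README only says the generated image is randomly initialized. Blending the noise with the content image, as done here, is an assumption (a common trick so that the starting point already resembles the content). The helper name `initialize_generated_image`, the noise range and the default `noise_ratio` are hypothetical and not taken from nst_utils.py.

```python
import numpy as np


def initialize_generated_image(content_image, noise_ratio=0.6,
                               low=-20.0, high=20.0, seed=None):
    """Return a noisy starting image with the same shape as content_image.

    Hypothetical helper: noise_ratio = 1.0 gives pure uniform noise,
    noise_ratio = 0.0 returns the content image unchanged.
    """
    rng = np.random.default_rng(seed)
    noise = rng.uniform(low, high, size=content_image.shape)
    return noise_ratio * noise + (1.0 - noise_ratio) * content_image


# Example: a batch of one 300x400 RGB image (the shape is illustrative).
content_image = np.zeros((1, 300, 400, 3))
generated_image = initialize_generated_image(content_image, seed=0)
print(generated_image.shape)
```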

NOTE


  1. Download the pretrained VGG model from here and place it in the pretrained_model folder. Update config.py to point to the VGG19 model path
  2. Run nst_main.py on different style and content images placed in the images folder, updating config.py accordingly
  3. Content and style images can be found in the images directory. The corresponding output images can be found in the output directory

References

