## Chapter 5. Project 

## Neural Painter - an application to colourise sharp pencil sketches 

We have now learnt the fundamentals of Neural Style Transfer and implemented an end-to-end style transfer pipeline. 

In this unit we will apply what we learnt to build one more cool project: a **Neural Painter** that takes an uncoloured image and colours it based on the style image. Let's dive into it. 

This project takes a sharp pencil sketch of any object as the content image. I highlight the word sharp because we need the outlines to be well preserved in the outputs too. The style image will be a painted version of a similar object. It need not be the same object, though. If you want to be a bit more adventurous, try different objects to get colourful outputs. 

We are going to read images from URLs this time, which means you can provide the URL of any image of your choice from the web (preferably from a Google Images search). This also means you can run this directly on **Google Colab** (https://colab.research.google.com/notebooks/welcome.ipynb). 

What you will learn from this project:
- Building the complete style transfer pipeline
- Adapting style transfer to binary or grayscale images, so that you can apply it to any kind of image (grayscale or RGB)
- Tuning the parameters to control the nature of the output 
- Using pretrained networks like VGG-19 to extract features 

## Import libs
First, as usual, we import all the necessary libraries.

In [2]:
import warnings
warnings.filterwarnings("ignore")
import keras
import  tensorflow as tf
import numpy as np
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
import time
import os
from keras.applications import vgg19
from keras import backend as K
import cv2
import skimage
from skimage import io
print("Tensorflow version >= " , tf.__version__)
print("Keras version >= " , keras.__version__)
print("Numpy version >= " , np.__version__)
print("Matplotlib version >= " , matplotlib.__version__)
print("OpenCV version >= " , cv2.__version__)
print("Scikit-image version >= " , skimage.__version__)

def mkdir(dirpath):
    if not os.path.exists(dirpath):
        os.makedirs(dirpath)

Tensorflow version >=  1.12.0
Keras version >=  2.2.4
Numpy version >=  1.17.2
Matplotlib version >=  2.2.2
OpenCV version >=  4.1.0
Scikit-image version >=  0.15.0


### USER INPUTS
Here comes the user input section. 

Content image : Search Google Images for a pencil sketch or outline (border) image of any object of your choice and provide its URL. 

Style image : Search for a painted version of the object and provide its URL. 

Tune the parameters to play with the results. I have kept the content weight high to make sure the content image features are preserved till the end.


In [2]:
content_img_url = 'https://i1.wp.com/flowernifty.com/wp-content/uploads/images/drawn-daisy-flower-petal-pencil-and-in-color-drawn-daisy-daisy-drawing-flower.jpg'
style_img_url = 'http://2.bp.blogspot.com/-KETjBj3RQZw/USraH2mQJtI/AAAAAAAAB00/SLn5ny3v6AE/s1600/1329+Pocketful+of+Sunshine.jpg'

#### Height of generated image. 
#### This will be used to calculate the width of generated images based on Aspect Ratio of original image
img_nrows = 300

############## Folders to store content , style and output images 
content_image_path = 'images/content_imgs_1/'
style_image_path = 'images/style_imgs_1/'
out_folder = 'ST_results_1' 
mkdir(out_folder)
mkdir(content_image_path)
mkdir(style_image_path)

## Util functions

### Read image from URL  : 
To make our code user friendly, let's read images from a URL this time. Also, since our content image will mostly be a binary or grayscale (single-channel) image, we add a condition to convert it to a three-channel image.

In [3]:
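def get_ImgfrmURL(url):
  # skimage reads colour images in RGB order (grayscale images come back as 2-D arrays)
  img = io.imread(url)
  if img.ndim == 2:  # Grayscale image: replicate it into three channels
    img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
  else:  # Swap to OpenCV's BGR order, which imshow() and cv2.imwrite() below expect
    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
  return img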

def imshow(img):
    plt.imshow(cv2.cvtColor(img,cv2.COLOR_BGR2RGB))
    plt.show()
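
To make the two conversions above concrete, here is a minimal NumPy-only sketch of what they amount to. The helper names `gray_to_three_channels` and `swap_red_blue` are hypothetical and are not part of the notebook; the notebook itself relies on `cv2.cvtColor` for this.

```python
import numpy as np

def gray_to_three_channels(gray):
    """Repeat a 2-D grayscale array along a new last axis -> shape (h, w, 3)."""
    return np.stack([gray, gray, gray], axis=-1)

def swap_red_blue(img):
    """Reverse the channel axis: RGB -> BGR, or equally BGR -> RGB."""
    return img[..., ::-1]

# A tiny 2x2 grayscale "sketch"
gray = np.array([[0, 255], [128, 64]], dtype=np.uint8)
rgb = gray_to_three_channels(gray)
print(rgb.shape)  # (2, 2, 3)

# A single pixel with R=10, G=20, B=30
pixel_rgb = np.array([[[10, 20, 30]]], dtype=np.uint8)
print(swap_red_blue(pixel_rgb)[0, 0].tolist())  # [30, 20, 10]
```

Swapping the red and blue channels twice returns the original image, which is why the same operation serves for both directions.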

##  <span style="color:RED"> TASK : </span> Preprocessing and Deprocessing of the images to suit VGG-19 requirements
Here we implement the same VGG-based preprocessing functions as in the previous chapter. 

Please refer to Chapter 4 for more details on preprocessing and deprocessing. 

Create the following functions, taking reference from Chapter 4 and the function hints below.

In [4]:
from keras.preprocessing.image import load_img, save_img, img_to_array
from keras.applications import vgg19

def prepare_img_array(img_path , target_size = (224,224)):
    """ 
    TODO :
    Create the function in such a way that it :
    - loads the image and scales it to the given target size
    - converts it to an array
    - expands dimensions along axis 0  
    
    Hint : 
    from keras.preprocessing.image import load_img, save_img, img_to_array
    
    """
    # Load image from img_path and scale to specified target_size
    img_arr = 
    
    # Convert the image to array
    img_arr = 
    
    # Expand dimensions along axis = 0 
    img_arr = 
    return img_arr

def preprocess_image(img):
    """
    TODO :
    Create the function to process the image
    Hint :  vgg19.preprocess_input()
    
    """
    # Prepare a copy of the img to avoid the original array being mutated
    img_copy = np.copy(img)
    
    # Preprocess image 
    pp_img = 
    
    return pp_img

def deprocess_image(img_preprocessed , img_nrows , img_ncols):
    """
    Deprocess the preprocessed image
    
    The steps to be implemented are : 
    - Convert from BGR to RGB 
    - Remove zero-center by mean pixel based on VGG standard values 
    - Clip the image values to be between 0 and 255
    
    Hint :

    - The VGG standard values are : 
      Red_MEAN = 123.68 
      Green_MEAN = 116.779
      Blue_MEAN = 103.939 
      
    
    """
    # Prepare a copy of the img_preprocessed to avoid the original array being mutated    
    img = np.copy(img_preprocessed)
    
    # Prepare conditions to handle channels_first, channels_last format 
    if K.image_data_format() == 'channels_first':
        img = img.reshape((3, img_nrows, img_ncols))
        img = img.transpose((1, 2, 0))
    else:
        img = img.reshape((img_nrows, img_ncols, 3))
    print(img.shape)
    
    # 'BGR'->'RGB' another simple way. cv2.cvtColor(img,cv2.COLOR_BGR2RGB) can be used as well 
    img = 
    
    # Remove zero-center by mean pixel based on VGG standard values
    img[:, :, 0] += 
    img[:, :, 1] += 
    img[:, :, 2] += 
    
    # Clip the values to be between 0 and 255 and convert to uint8
    img = 
    
    return img
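
As a rough illustration of what this pair of functions does, the sketch below reproduces the VGG-style round trip in plain NumPy: swap RGB to BGR and subtract the channel means, then reverse both steps. The names `preprocess_np` and `deprocess_np` are hypothetical, the mean values are the ones given in the hint above, and the rounding step before the cast is an addition of this sketch (it guards against floating-point results such as 199.9999 being truncated). It is not the Keras-based solution expected by the task.

```python
import numpy as np

# Channel means from the hint above, in RGB order
VGG_MEAN_RGB = np.array([123.68, 116.779, 103.939])

def preprocess_np(img_rgb):
    """RGB -> BGR, then subtract the per-channel mean (zero-centring)."""
    img_bgr = img_rgb.astype('float64')[..., ::-1]
    return img_bgr - VGG_MEAN_RGB[::-1]   # means reordered to BGR

def deprocess_np(img_pp):
    """Undo preprocess_np: BGR -> RGB, add the means back, clip, cast to uint8."""
    img_rgb = img_pp[..., ::-1] + VGG_MEAN_RGB
    # Round before casting so that floating-point error does not truncate values
    return np.clip(np.rint(img_rgb), 0, 255).astype('uint8')

rng = np.random.default_rng(0)
demo_rgb = rng.integers(0, 256, size=(4, 5, 3)).astype('uint8')
round_trip = deprocess_np(preprocess_np(demo_rgb))
print(np.array_equal(round_trip, demo_rgb))  # True
```

The preprocessed array contains negative values, which is why it looks odd when displayed directly as an image.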

In [None]:
url_content_img = get_ImgfrmURL(content_img_url)
print("CONTENT IMAGE (Pencil sketch/border image) ")
imshow(url_content_img)
print("STYLE IMAGE")
url_style_img = get_ImgfrmURL(style_img_url)
imshow(url_style_img)

### Save the content and style images

In [6]:
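# os.path.join avoids the doubled slash that plain string concatenation would produce
content_image_path = os.path.join(content_image_path, 'content1.jpg')
style_image_path = os.path.join(style_image_path, 'style1.jpg')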

cv2.imwrite(content_image_path , url_content_img)
cv2.imwrite(style_image_path , url_style_img)

############## Generated (output) Image Dimensions
width, height = load_img(content_image_path).size
img_ncols = int(width * img_nrows / height)#Width of generated image
generated_imsize = (img_nrows , img_ncols)
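
The width calculation above keeps the aspect ratio of the content image while fixing the height. A small sketch with a hypothetical helper `scaled_width` and made-up image dimensions shows the arithmetic:

```python
def scaled_width(orig_width, orig_height, target_height):
    """Width that preserves the original aspect ratio at the target height."""
    return int(orig_width * target_height / orig_height)

# Hypothetical 800x600 (width x height) image resized to 300 rows
img_nrows_demo = 300
img_ncols_demo = scaled_width(800, 600, img_nrows_demo)
print((img_nrows_demo, img_ncols_demo))  # (300, 400)
```

Note that the resulting size tuple is ordered (rows, columns), i.e. (height, width).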

### Test processed and deprocessed images 
This is just a visualisation of the processed and deprocessed images. Though the processed images look weird, they are simply the mean-centred images produced by VGG preprocessing.

In [None]:
# get tensor representations of our images
content_image = prepare_img_array(content_image_path , generated_imsize)
style_image = prepare_img_array(style_image_path , generated_imsize)

content_image_pp = preprocess_image(content_image)
style_image_pp = preprocess_image(style_image)

content_image_dp = deprocess_image(content_image_pp, img_nrows , img_ncols)
style_image_dp = deprocess_image(style_image_pp, img_nrows , img_ncols)

orig_image = content_image

print("Orig Image")
plt.imshow(orig_image[0].astype('uint8'))
plt.show()
print("Preprocessed Image")
plt.imshow(content_image_pp[0].astype('uint8'))
plt.show()
print("Deprocessed Image")
plt.imshow(content_image_dp.astype('uint8'))
plt.show()
print("Orig Image")
plt.imshow(style_image[0].astype('uint8'))
plt.show()
print("Preprocessed Image")
plt.imshow(style_image_pp[0].astype('uint8'))
plt.show()
print("Deprocessed Image")
plt.imshow(style_image_dp.astype('uint8'))
plt.show()

## Loss Functions

Now let us build the loss functions: Content loss, Gram matrix, Style loss and Total variation loss. 
You can refer to Chapters 3 and 4 for more information.

###   <span style="color:RED"> TASK : </span> Content Loss (L<sub>content</sub>)
Implement Content Loss 

**Hint :** 

 Use K.sum and K.square for implementation

 The content loss can be formulated as follows : 

 ![](https://bitbucket.org/ga_learning/style_transfer/raw/48db127941e04f2f66cdfd6a3ccaca9ed888d598/markdown_images/Content_loss.png)
 
 where , 

* **L<sub>content</sub>** is the Content Loss 

* **l** is the layer from which the feature maps are obtained

* **i** refers to the index of each feature map from layer l

* **j** refers to each element in the flattened feature matrix of size h x w

* **F** is the feature map from the Content image (C) obtained at layer l

* **P** is the feature map from the Generated image (G) obtained at layer l 


In [4]:
from keras import backend as K
def content_loss(base, combination):
    content_loss = 
    return content_loss
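
To get a feel for the quantity being computed, here is a NumPy-only sketch of a sum-of-squared-differences loss between two small, made-up feature maps. The name `content_loss_np` is hypothetical; the sketch uses a plain sum, so any constant scaling factor in the formula above would be applied on top of it.

```python
import numpy as np

def content_loss_np(base, combination):
    """Sum of squared differences between two feature maps of equal shape."""
    return float(np.sum(np.square(combination - base)))

base_feat = np.array([[1.0, 2.0], [3.0, 4.0]])
comb_feat = np.array([[1.0, 0.0], [3.0, 6.0]])
print(content_loss_np(base_feat, comb_feat))  # 8.0
```

The loss is zero only when the two feature maps are identical, which is what pushes the generated image to keep the content of the sketch.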

###   <span style="color:RED"> TASK : </span>  Gram Matrix

Implement Gram matrix calculation function

**Hint :** 

Use K.dot , K.transpose

Gram matrix equation : 

![](https://bitbucket.org/ga_learning/style_transfer/raw/48db127941e04f2f66cdfd6a3ccaca9ed888d598/markdown_images/Gram_matrix.png)

where, 

* **G<sub>ij</sub>** is the Gram matrix 

* **i** and **j** index the feature maps at layer l

* **k** refers to each element in the flattened feature matrix of size h x w

* **F** is the feature map of the image (style or generated) obtained at the selected layer l


In [11]:
from keras import backend as K

def gram_matrix(x):
    assert K.ndim(x) == 3
    if K.image_data_format() == 'channels_first':
        features = K.batch_flatten(x)
    else:
        features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = 
    return gram
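
The flatten-then-multiply idea behind the Gram matrix can be sketched in NumPy as follows. This assumes a channels_last feature map of shape (h, w, c); `gram_matrix_np` is a hypothetical name used only for this illustration.

```python
import numpy as np

def gram_matrix_np(x):
    """Gram matrix of a channels_last feature map of shape (h, w, c)."""
    h, w, c = x.shape
    # Move channels first, then flatten each feature map into one row
    features = x.transpose(2, 0, 1).reshape(c, h * w)
    # Entry (i, j) is the dot product of flattened feature maps i and j
    return features @ features.T

feat = np.arange(12, dtype='float64').reshape(2, 2, 3)
gram = gram_matrix_np(feat)
print(gram.shape)  # (3, 3)
```

The result is always a symmetric c x c matrix, regardless of the spatial size of the feature map.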


###   <span style="color:RED"> TASK : </span>  Style Loss (L<sub>style</sub>)

Implement Style loss calculation function

**Hint:**

Use the previous Gram Matrix function, K.sum, K.square

Style Loss equation : 

![](https://bitbucket.org/ga_learning/style_transfer/raw/48db127941e04f2f66cdfd6a3ccaca9ed888d598/markdown_images/Style_loss.png)

where, 

* **L<sub>style</sub>** is the Style Loss 

* **l** is the layer from which the feature maps are obtained

* **i** and **j** index the feature maps from layer l (the rows and columns of the Gram matrices)

* **A<sub>ij</sub>** is the Gram matrix for the Style Image features

* **G<sub>ij</sub>** is the Gram matrix for the Generated Image features



In [5]:
from keras import backend as K

def style_loss(style, combination):
    assert K.ndim(style) == 3
    assert K.ndim(combination) == 3
    channels = 3
    size = img_nrows * img_ncols
    
    S = gram_matrix(style)
    C = gram_matrix(combination)

    style_loss = K.sum(K.square(S - C)) / (4.0 * (channels ** 2) * (size ** 2))
    return style_loss
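
The same computation can be checked by hand on tiny arrays. The sketch below mirrors the normalisation used in the function above (4 x channels² x size²) in NumPy; `gram_np` and `style_loss_np` are hypothetical names, and `size` is passed in explicitly instead of being read from global variables.

```python
import numpy as np

def gram_np(x):
    """Gram matrix of a channels_last feature map of shape (h, w, c)."""
    h, w, c = x.shape
    f = x.transpose(2, 0, 1).reshape(c, h * w)
    return f @ f.T

def style_loss_np(style, combination, size, channels=3):
    """Squared Gram-matrix difference with the normalisation used above."""
    S = gram_np(style)
    C = gram_np(combination)
    return float(np.sum(np.square(S - C)) / (4.0 * channels ** 2 * size ** 2))

style_feat = np.ones((2, 2, 3))
comb_feat = np.zeros((2, 2, 3))
print(style_loss_np(style_feat, comb_feat, size=4))  # 0.25
```

Here each Gram entry of the all-ones map equals 4, so the squared differences sum to 144 and the normaliser is 4 x 9 x 16 = 576.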

###   <span style="color:RED"> TASK : </span>   Total Variation Loss (L<sub>tv</sub>)

The total variation loss can be represented as **L<sub>tv</sub>**

**L<sub>tv</sub> = Elementwise_Sum( (G<sub>x</sub><sup>2</sup> + G<sub>y</sub><sup>2</sup>) <sup>1.25</sup> )**

where,

* G<sub>x</sub> = Gradient of Image along x-axis
* G<sub>y</sub> = Gradient of Image along y-axis

**Hint:**
Use K.square , K.sum , K.pow 

In [12]:
def total_variation_loss(x):
    assert K.ndim(x) == 4
    if K.image_data_format() == 'channels_first':
        a = K.square(
            x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, 1:, :img_ncols - 1])
        b = K.square(
            x[:, :, :img_nrows - 1, :img_ncols - 1] - x[:, :, :img_nrows - 1, 1:])
    else:
        a = K.square(
            x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, 1:, :img_ncols - 1, :])
        b = K.square(
            x[:, :img_nrows - 1, :img_ncols - 1, :] - x[:, :img_nrows - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
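
A NumPy version of the channels_last branch makes it easy to see how the loss behaves. `total_variation_loss_np` is a hypothetical name, and the input arrays are made up for the illustration.

```python
import numpy as np

def total_variation_loss_np(x):
    """Total variation loss for x of shape (1, rows, cols, channels)."""
    a = np.square(x[:, :-1, :-1, :] - x[:, 1:, :-1, :])   # difference to the pixel below
    b = np.square(x[:, :-1, :-1, :] - x[:, :-1, 1:, :])   # difference to the pixel on the right
    return float(np.sum(np.power(a + b, 1.25)))

# A perfectly flat image has no variation at all
flat = np.full((1, 4, 4, 3), 7.0)
print(total_variation_loss_np(flat))  # 0.0

# An image whose value grows by 1 from column to column
ramp = np.tile(np.arange(4.0).reshape(1, 1, 4, 1), (1, 4, 1, 3))
print(total_variation_loss_np(ramp))  # 27.0
```

Smooth images score low and noisy images score high, so this term nudges the generated image towards local smoothness.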

##  <span style="color:RED"> TASK : </span>  Create a composite tensor to hold Content Image , Style Image and Generated (Combined) Image

The preprocessed images created so far are in array format. 
Create a single tensor to hold the :
- preprocessed content image (content_image_pp)
- preprocessed style image (style_image_pp)
- Generated (output) image (combination_image)

Note that the combination image has not been generated yet, so we need to define a placeholder for it. 

Hints : 
- Use K.variable to convert numpy array to Tensor, 
- Use K.placeholder to create placeholder
- Use K.concatenate to combine the tensors 

In [8]:
from keras import backend as K

# Convert images from array to tensor
content_image_pp_tensor = 
style_image_pp_tensor = 

# Preparing the tensor placeholder for holding the generated image
if K.image_data_format() == 'channels_first':  # (1,3,img_nrows, img_ncols)
    combination_image_tensor = 
else: # (1,img_nrows, img_ncols, 3)
    combination_image_tensor = 

# Concatenate content_image_pp_tensor, style_image_pp_tensor, combination_image_tensor 
# into a single Keras tensor along axis = 0 
input_tensor = 
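
The effect of concatenating along axis 0 can be previewed with NumPy arrays. In this sketch the arrays are made-up stand-ins (a real placeholder holds no values), and the variable names are hypothetical.

```python
import numpy as np

rows, cols = 4, 6
content_batch = np.zeros((1, rows, cols, 3))
style_batch = np.ones((1, rows, cols, 3))
generated_batch = np.full((1, rows, cols, 3), 2.0)  # stand-in for the placeholder

# Stack the three images into one batch along axis 0
batch = np.concatenate([content_batch, style_batch, generated_batch], axis=0)
print(batch.shape)  # (3, 4, 6, 3)
```

After concatenation, index 0 is the content image, index 1 the style image and index 2 the generated image, which is the order used when the feature maps are sliced in the loss calculation.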

##  <span style="color:RED"> TASK : </span> VGG-19 Model 

Load the pretrained VGG-19 model (inbuilt in Keras) and print the summary

In [None]:
from keras.applications import vgg19

def load_VGG19(input_tensor , include_top=False, weights='imagenet'):
    """
     Load the vgg 19 pretrained model 
     
     include_top: whether to include the 3 fully-connected layers at the top of the network.
     weights: None (random initialization) or 'imagenet' (pre-training on ImageNet)
     input_tensor: optional Keras tensor (i.e. output of layers.Input()) to use as image input for the model
     
     More options at : 
     https://keras.io/applications/#extract-features-from-an-arbitrary-intermediate-layer-with-vgg19
    
    """
    model =  
    return model 

include_top = False
vgg19_model = load_VGG19(input_tensor,include_top,weights='imagenet')
vgg19_model.summary()

###   <span style="color:RED"> TASK : </span>  Initialize parameters for Style Transfer 

Set up the style transfer parameters. Tweak the values and observe the effects on the output

In [8]:
############## Style transfer parameters 
content_weight = 1.0 #alpha
style_weight = 0.5 #beta
total_variation_weight = 0.5  #gamma
iterations = 50 # Number of iteration to optimise the Total-Loss

###   <span style="color:RED"> TASK : </span>  Calculate losses

- Extract features from the loaded VGG-19 model and initialise the content loss, style loss and total variation loss

- Make sure you use the `content_weight`, `style_weight` and `total_variation_weight` defined above while calculating the losses

- Calculate the total loss : loss = weighted sum of content loss, style loss and total variation loss


In [None]:
# Make a dictionary of layers of the loaded VGG-19 model with the layer names as the keys 
outputs_dict = dict([(layer.name, layer.output) for layer in vgg19_model.layers])

# Calculate Content loss
layer_features = outputs_dict['block5_conv2']
base_image_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]

loss_content = 

# Calculate Style loss
feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']
loss_style = 0 #Initialise Style loss
for layer_name in feature_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    
    loss_style_layerwise = 
    
    loss_style += loss_style_layerwise

# Calculate Total Variation loss
loss_tv =    

# Calculate the total loss (loss)
loss = K.variable(0.0)
loss = 
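
As a sanity check on the arithmetic, the weighted combination can be tried out with plain numbers. This is only a sketch: the loss values are made up, `total_loss` is a hypothetical helper, and applying the style weight to the sum of the per-layer style losses is an assumption about how the weights are used.

```python
def total_loss(loss_content, layer_style_losses, loss_tv,
               content_weight=1.0, style_weight=0.5, total_variation_weight=0.5):
    """Weighted sum of the three loss terms."""
    # Assumption: per-layer style losses are summed before weighting
    loss_style = sum(layer_style_losses)
    return (content_weight * loss_content
            + style_weight * loss_style
            + total_variation_weight * loss_tv)

print(total_loss(10.0, [2.0, 4.0, 6.0], 8.0))  # 20.0
```

Raising `content_weight` relative to the other two keeps the outlines of the sketch more faithfully, while raising `style_weight` lets more colour and texture through.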

###   <span style="color:RED"> TASK : </span>   Calculate gradients 

- Calculate the gradients of the loss with respect to the generated image

Hint : 
Use K.gradients 

In [None]:
from keras import backend as K

# Get the gradients of the loss with respect to the generated image (combination_image_tensor)
grads = K.gradients(loss, combination_image_tensor)

### Create a Keras function to output loss and grads during evaluation

In [None]:
loss_n_grads = [loss]
if isinstance(grads, (list, tuple)):
    loss_n_grads += grads
else:
    loss_n_grads.append(grads)

# Initialise a function to output loss and gradient values 
f_outputs = K.function([combination_image_tensor], loss_n_grads)

### Evaluation function to calculate the gradient and losses  
Let us now write our loss evaluation function. 

It is organised in the same way as in the Chapter 4 project and can be used as a standard pattern for this kind of loss optimisation problem.

In [14]:
def eval_loss_and_grads(x):
    if K.image_data_format() == 'channels_first':
        x = x.reshape((1, 3, img_nrows, img_ncols))
    else:
        x = x.reshape((1, img_nrows, img_ncols, 3))
    outs = f_outputs([x])
    loss_value = outs[0]
    if len(outs[1:]) == 1:
        grad_values = outs[1].flatten().astype('float64')
    else:
        grad_values = np.array(outs[1:]).flatten().astype('float64')
    return loss_value, grad_values

class Evaluator(object):

    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        assert self.loss_value is None
        loss_value, grad_values = eval_loss_and_grads(x)
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value

    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values

evaluator = Evaluator()
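
The point of this class is that the loss and the gradients are computed in a single pass, while the optimiser asks for them through two separate calls. The toy sketch below shows the caching idea on a simple quadratic; `toy_loss_and_grads` and `ToyEvaluator` are hypothetical stand-ins and do not use Keras.

```python
import numpy as np

calls = {"n": 0}  # counts how many combined evaluations were performed

def toy_loss_and_grads(x):
    """Stand-in for eval_loss_and_grads: loss = sum(x^2), gradient = 2x."""
    calls["n"] += 1
    return float(np.sum(x ** 2)), 2.0 * x

class ToyEvaluator:
    def __init__(self):
        self.loss_value = None
        self.grad_values = None

    def loss(self, x):
        # One combined evaluation; the gradient is cached for grads()
        self.loss_value, self.grad_values = toy_loss_and_grads(x)
        return self.loss_value

    def grads(self, x):
        grad_values = np.copy(self.grad_values)
        self.loss_value, self.grad_values = None, None  # reset for the next step
        return grad_values

toy = ToyEvaluator()
x0 = np.array([1.0, -2.0, 3.0])
print(toy.loss(x0))            # 14.0
print(toy.grads(x0).tolist())  # [2.0, -4.0, 6.0]
print(calls["n"])              # 1
```

Only one combined evaluation is needed for each loss/gradient pair, which matters when every evaluation is a full pass through VGG-19.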

## Optimising the losses and generating the output 

The Evaluator() class defined above provides methods to access the loss and the gradients. Now we have a computation graph ready. How do we optimise it?
There are many optimisation techniques, ranging from the simplest (like gradient descent) to more sophisticated ones (like Adam). 

In this case, for ease of implementation, we can use limited-memory BFGS (L-BFGS) from the scipy.optimize package. It lets us minimise our loss function without having to implement the optimiser ourselves. Please have a look at https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html#scipy-optimize-fmin-l-bfgs-b for more details.

Wonderful! We now have everything we need to optimise the generated image and colour our uncoloured content image. Just make sure you have set all the parameters in the "User inputs" cell. 

We are all set to run the optimisation below! The results will be stored in the output folder specified in the user inputs, with the iteration number as a suffix in the file name.


###   <span style="color:RED"> TASK : </span>   Creating the style transfer pipeline
- Create the optimisation pipeline using  `fmin_l_bfgs_b()` as optimisation function

In [None]:
from scipy.optimize import fmin_l_bfgs_b
input_image = content_image

# Preprocess content image
out_image = 

generated_images = []
for i in range(iterations):
    print('Start of iteration', i)
    start_time = time.time()
    
    # TODO : Use the fmin_l_bfgs_b for optimization
    out_image, min_val, info = 
    print('Current loss value:', min_val)
    
    # deprocess image to get the styled output
    deprocessed_result = 
    
    
    fname = out_folder + '/out_at_iteration_%d.png' % i
    save_img(fname, deprocessed_result)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i, end_time - start_time))
    generated_images.append(deprocessed_result)
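
To see how `fmin_l_bfgs_b` behaves when it is called repeatedly in a loop, here is a toy sketch that minimises a simple quadratic instead of the style transfer loss. The target vector, the functions `toy_loss` and `toy_grads`, and the `maxfun=20` setting are all assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

target = np.array([1.0, -2.0, 3.0])

def toy_loss(x):
    """Squared distance to the target vector."""
    return float(np.sum((x - target) ** 2))

def toy_grads(x):
    """Gradient of toy_loss with respect to x."""
    return 2.0 * (x - target)

x = np.zeros(3)
losses = []
for i in range(3):
    # maxfun caps the number of function evaluations in each outer iteration
    x, min_val, info = fmin_l_bfgs_b(toy_loss, x, fprime=toy_grads, maxfun=20)
    losses.append(min_val)
    print('Iteration', i, 'loss:', min_val)

print(np.round(x, 3).tolist())  # close to [1.0, -2.0, 3.0]
```

Each call resumes from the result of the previous one, so the outer loop can save an intermediate result after every step, just as the pipeline above saves an image per iteration.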

### Output Visualisation
Have a look at how the generated image becomes more and more realistic as we move through the iterations

In [None]:
print("Content Image")
plt.imshow(content_image_dp.astype('uint8'))
plt.show()

print("\n\nStyle Image")
plt.imshow(style_image_dp.astype('uint8'))
plt.show()

for i,g in enumerate(generated_images):
    if i%10 == 0 :
        print("\n\n================= Iteration : ",i)
        plt.imshow(g)
        plt.show()

In [None]:
print("Content Image")
plt.imshow(content_image_dp.astype('uint8'))
plt.show()

print("\n\nStyle Image")
plt.imshow(style_image_dp.astype('uint8'))
plt.show()

print("\n\nColoured image")
plt.imshow(generated_images[-1].astype('uint8'))
plt.show()

### Well Done !

Congratulations! You have now learnt the concepts needed to build your own Neural Style Transfer pipeline. With the Neural Style Transfer project in Chapter 4, we learnt how to use a pretrained network and a combination of losses to create a beautiful fusion of images. 

With this project we demonstrated a simple yet very useful application of style transfer, i.e. colourisation of uncoloured images, just by changing our content image to a black-and-white pencil sketch and processing it appropriately. Now you can experiment even further and build cool apps as your imagination dictates. 

### Further reading

Below is a list of some interesting papers on Style Transfer and its applications:

* A Neural Algorithm of Artistic Style
* Neural Style Transfer: A Review
* Deep Photo Style Transfer
* There is a very nice collection of style transfer papers and their implementations at https://paperswithcode.com/task/style-transfer . Have a look at them if you wish to learn more about implementing advanced style transfer and its extensions.

