# Colorization of Greyscale Images

In this notebook, we explore how to train a machine learning model with ``Keras`` and ``TensorFlow`` to colorize black-and-white images. Colorization has a wide variety of applications, from restoring historical photographs and "modernizing" history to producing vibrant, colorful images for recreational or personal use.


![colorization](colorizingex.png)
*This is an example from our pre-trained model integrated into a Streamlit app (we got quite lucky with this image; as you can see, the result is almost perfect).*

Okay! So that's cool and all. But what will we actually do to achieve such an impressive result? In short, for this project, we will:
- Preprocess the image data
- Build and train a convolutional neural network model combined with a classifier
- Evaluate our model on the test set
- Check out how our model performs

**Note**: The algorithm in this notebook is based on and learned from Emil's blog post on colorization. For a more detailed and thorough explanation of the process, please take a look at his amazing work on [Medium](https://emilwallner.medium.com/colorize-b-w-photos-with-a-100-line-neural-network-53d9b4449f8d).

### Some Necessary Packages/Libraries for Data Processing and Machine Learning

In [1]:
%load_ext autoreload
%autoreload 2

import tensorflow as tf
import numpy as np

import os
import random
import keras

from keras.applications.inception_resnet_v2 import InceptionResNetV2
from keras.applications.inception_resnet_v2 import preprocess_input

from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

from keras.models import Sequential, Model
from keras.callbacks import TensorBoard 

from keras.engine import Layer
from keras.layers import Conv2D, UpSampling2D, InputLayer, Conv2DTranspose, Input, Reshape, merge, concatenate, Activation, Dense, Dropout, Flatten
from keras.layers.normalization import BatchNormalization
from keras.layers.core import RepeatVector, Permute

from skimage.color import rgb2lab, lab2rgb, rgb2gray, gray2rgb
from skimage.transform import resize
from skimage.io import imsave

Using TensorFlow backend.


### CPU vs. GPU

The difference in computational power between a CPU and a GPU is large, and this type of model is best trained on a GPU because it requires a lot of computing power. By my rough estimate, it would take days to see observable results when training on a CPU.

In [3]:
if tf.test.is_gpu_available():
    # GPU -- this takes about 1-3 hours on a Tesla K80 if the data set is about 20-50 images
    # You will see some acceptable results from a model trained with these parameters
    BATCH_SIZE = 20
    EPOCHS = 115
else:
    # CPU -- not recommended
    BATCH_SIZE = 20
    EPOCHS = 50

### Data Sets
For this project, I will be using a mix of images from the sources listed in the **Data/** folder, divided into three sets: small (50 images), medium (200 images), and big (10000 images).

**Side Note**: When we first proposed the project, we were quite ambitious and adamant that we would train this model on roughly 500k-1000k images. Looking back, given the limited resources available to us (RAM capacity and computational power <=> USD), such a task was impossible.


After loading the images, we first convert them into tensors and rescale the pixel values from [0, 255] to [0, 1].

Since there are a lot of images, I won't be uploading them to GitHub; instead, you can find their sources in the **Data/** folder on my GitHub.

In [4]:
# Add path to data sets
DS_PATH = "Data/TBA"

# Get images
X = []
for filename in os.listdir(DS_PATH):
    if os.path.isfile(os.path.join(DS_PATH, filename)):
        X.append(img_to_array(load_img(os.path.join(DS_PATH, filename))))
                      
# Normalization => Converting pixel value from [0-255] to [0,1]                      
X = np.array(X, dtype=float)
Xtrain = 1.0/255*X

Let's say we want to train our model on 20 images.

In [5]:
# check the shape
X.shape

(20, 256, 256, 3)

We can interpret the dimensions of ``X`` as follows:
- 20 : the number of images that will be used to train our model
- 256: height of the images
- 256: width of the images
- 3  : the 3 color channels in each image (RGB)
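
As a quick sanity check of this layout, here is a minimal sketch that unpacks the shape of a zero-filled stand-in array and repeats the rescaling step on a few sample values. The names `X_demo`, `pixels` and `scaled` are hypothetical and only used for illustration, and the sketch assumes the default channels-last ordering.

```python
import numpy as np

# Stand-in for the real image array: 20 RGB images of 256x256 pixels (hypothetical data).
X_demo = np.zeros((20, 256, 256, 3), dtype=float)

# Channels-last layout: (number of images, height, width, color channels).
n_images, height, width, n_channels = X_demo.shape
print(n_images, height, width, n_channels)

# Rescaling 8-bit pixel values maps [0, 255] onto [0, 1] without changing the shape.
pixels = np.array([0.0, 127.5, 255.0])
scaled = pixels / 255
print(scaled)
```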

### Data Preprocessing

We’ll use an algorithm to change the color channels from RGB to Lab. L stands for lightness, and a and b stand for the green–red and blue–yellow color spectrums.


We have a grayscale layer for input, and our ultimate goal is to predict two appropriate color layers, i.e., the ab in Lab.

![Mapping from B&W to AB](https://miro.medium.com/max/700/1*W23SQ2oEdE_PsK-HmP4cow.png)
*This picture from Emil's blog post sums up perfectly what we want to achieve.*
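
To make the input/target split concrete, here is a minimal sketch on a synthetic Lab batch. The random values are hypothetical stand-ins for what `rgb2lab` would return, and the value ranges (L in [0, 100], a and b in roughly [-128, 127]) are an assumption about the Lab encoding. It shows the lightness channel becoming the network input, the ab channels being scaled for the target, and how a prediction could be turned back into a full Lab image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic Lab batch (hypothetical values): L in [0, 100], a and b in [-128, 127].
L = rng.uniform(0, 100, size=(2, 8, 8, 1))
ab = rng.uniform(-128, 127, size=(2, 8, 8, 2))
lab_batch = np.concatenate([L, ab], axis=-1)

# Network input: the lightness channel, kept as a single-channel image.
X_batch = lab_batch[:, :, :, :1]
# Network target: the two color channels, scaled to roughly [-1, 1].
Y_batch = lab_batch[:, :, :, 1:] / 128

# After prediction, undo the scaling and stack L back on to get a full Lab image.
restored = np.concatenate([X_batch, Y_batch * 128], axis=-1)
print(X_batch.shape, Y_batch.shape, np.allclose(restored, lab_batch))
```

With real data, the restored Lab array could then be passed to `lab2rgb` to get a viewable RGB image.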

In [6]:
from support import create_inception_embedding

# Image transformer
datagen = ImageDataGenerator(
        shear_range=0.1,
        zoom_range=0.1,
        rotation_range=10,
        horizontal_flip=True)

def image_a_b_gen(batch_size):
    """
    FUNCTION
    ---------
    Converts RGB images to B&W, extracts features using Inception,
    and gets the Lab channels from the original images.
    
    INPUT
    -----
    batch_size (integer): the number of images per batch
    
    OUTPUT
    ------
    Yields ([L channel, Inception embedding], scaled ab channels)
    for training the model later on
    """
    for batch in datagen.flow(Xtrain, batch_size=batch_size):
        # RGB to B&W (stacked back to 3 channels for Inception)
        grayscaled_rgb = gray2rgb(rgb2gray(batch))
        # Feature Extraction (computed once per batch and reused below)
        embed = create_inception_embedding(inception, grayscaled_rgb)
        # RGB to LAB
        lab_batch = rgb2lab(batch)
        # The L channel is the model input
        X_batch = lab_batch[:,:,:,0]
        X_batch = X_batch.reshape(X_batch.shape+(1,))
        # Convert the ab values from [-128, 128] to [-1, 1]
        Y_batch = lab_batch[:,:,:,1:] / 128
        # The new batch: ([L channel, embedding], ab channels)
        yield ([X_batch, embed], Y_batch)
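
The `gray2rgb(rgb2gray(batch))` step can look odd at first: it throws the color away and then returns to three channels, because Inception expects a 3-channel input. Here is a rough NumPy sketch of the idea. The helper `to_gray_3ch` is hypothetical, and the luminance weights are my assumption about what scikit-image uses, so treat it as an illustration rather than a drop-in replacement.

```python
import numpy as np

def to_gray_3ch(rgb_batch):
    """Approximate gray2rgb(rgb2gray(batch)) for a float RGB batch in [0, 1]."""
    # Luminance weights; assumed to match the ones scikit-image uses for rgb2gray.
    weights = np.array([0.2125, 0.7154, 0.0721])
    gray = rgb_batch @ weights                     # shape: (N, H, W)
    return np.stack([gray, gray, gray], axis=-1)   # shape: (N, H, W, 3)

batch = np.random.default_rng(0).uniform(size=(2, 4, 4, 3))
gray3 = to_gray_3ch(batch)
print(gray3.shape)
```

All three output channels carry the same values, so the image still looks black and white even though it has an RGB shape.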

## Model

In his blog post, Emil used a model from Federico Baldassarre’s [Deep Koalarization: Image Colorization using CNNs and Inception-ResNet-v2](https://arxiv.org/abs/1712.03400) which, in principle, works as follows:

![colornet](https://raw.githubusercontent.com/baldassarreFe/deep-koalarization/master/assets/our_net.png)
*Deep Koalarization: Image Colorization using CNN and Inception-ResNet-v2 -- Image from [the paper](https://arxiv.org/abs/1712.03400)*
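
The fusion step in the middle of the diagram is the least obvious part, so here is a small NumPy sketch of what the `RepeatVector`, `Reshape` and `concatenate` layers in the model cell do for a single image. The arrays are zero-filled or counting stand-ins (hypothetical data), not real activations.

```python
import numpy as np

encoder_features = np.zeros((32, 32, 256))   # encoder output for one image (stand-in)
embedding = np.arange(1000, dtype=float)     # 1000-value Inception output (stand-in)

# RepeatVector(32 * 32) followed by Reshape((32, 32, 1000)):
# copy the same embedding to every spatial position.
tiled = np.tile(embedding, (32 * 32, 1)).reshape(32, 32, 1000)

# concatenate(..., axis=3) in the batched model corresponds to the last axis here.
fused = np.concatenate([encoder_features, tiled], axis=-1)
print(fused.shape)
```

Every one of the 32 x 32 positions ends up with its 256 local features plus the same 1000 global values, which the following 1x1 convolution then mixes back down to 256 channels.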

In [7]:
#Load weights of InceptionResNet model for embedding extraction 
inception = InceptionResNetV2(weights=None, include_top=True)
inception.load_weights('../Models/inception_resnet_v2_weights_tf_dim_ordering_tf_kernels.h5')
inception.graph = tf.get_default_graph()

In [8]:
# The Model
def conv_stack(data, filters, s):
    # utility to build convolutional layers
    output = Conv2D(filters, (3, 3), strides=s, activation='relu', padding='same')(data)
    return output

embed_input = Input(shape=(1000,))

#Encoder
encoder_input = Input(shape=(256, 256, 1,))
encoder_output = conv_stack(encoder_input, 64, 2)
encoder_output = conv_stack(encoder_output, 128, 1)
encoder_output = conv_stack(encoder_output, 128, 2)
encoder_output = conv_stack(encoder_output, 256, 1)
encoder_output = conv_stack(encoder_output, 256, 2)
encoder_output = conv_stack(encoder_output, 512, 1)
encoder_output = conv_stack(encoder_output, 512, 1)
encoder_output = conv_stack(encoder_output, 256, 1)

#Fusion
fusion_output = RepeatVector(32 * 32)(embed_input) 
fusion_output = Reshape(([32, 32, 1000]))(fusion_output)
fusion_output = concatenate([encoder_output, fusion_output], axis=3) 
fusion_output = Conv2D(256, (1, 1), activation='relu')(fusion_output) 



#Decoder
decoder_output = conv_stack(fusion_output, 128, 1)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = conv_stack(decoder_output, 64, 1)
decoder_output = UpSampling2D((2, 2))(decoder_output)
decoder_output = conv_stack(decoder_output, 32, 1)
decoder_output = conv_stack(decoder_output, 16, 1)
decoder_output = Conv2D(2, (2, 2), activation='tanh', padding='same')(decoder_output)
decoder_output = UpSampling2D((2, 2))(decoder_output)

model = Model(inputs=[encoder_input, embed_input], outputs=decoder_output)
model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_3 (InputLayer)            (None, 256, 256, 1)  0                                            
__________________________________________________________________________________________________
conv2d_204 (Conv2D)             (None, 128, 128, 64) 640         input_3[0][0]                    
__________________________________________________________________________________________________
conv2d_205 (Conv2D)             (None, 128, 128, 128 73856       conv2d_204[0][0]                 
__________________________________________________________________________________________________
conv2d_206 (Conv2D)             (None, 64, 64, 128)  147584      conv2d_205[0][0]                 
__________________________________________________________________________________________________
conv2d_207

Wow! What a massive number of parameters! Again, this reinforces what we recommended to readers earlier about training this model: **Always prioritize GPU!**
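
To see where these numbers come from, here is a short sketch that recomputes the parameter counts of the first three encoder layers shown in the summary, and follows the spatial size through the encoder. The helper `conv2d_params` is a hypothetical name; it uses the usual formula for a standard 2D convolution with one bias per filter.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a standard 2D convolution."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# First three encoder layers: (input channels, filters), all with 3x3 kernels.
layers = [(1, 64), (64, 128), (128, 128)]
counts = [conv2d_params(3, 3, cin, f) for cin, f in layers]
print(counts)  # [640, 73856, 147584]

# Spatial size: with 'same' padding, each stride-2 convolution halves the size.
size = 256
for stride in [2, 1, 2, 1, 2, 1, 1, 1]:   # strides of the eight encoder layers
    size = -(-size // stride)             # ceiling division, as 'same' padding does
print(size)  # 32
```

The final size of 32 is why the fusion step repeats the embedding `32 * 32` times, and the three `UpSampling2D((2, 2))` layers in the decoder bring the output back to 256 x 256.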

### Training Time
Before proceeding, be aware that this step requires a lot of dedication, patience and especially time. What do I mean by that? Well, it turns out that you need to fine-tune different parameters here, such as the number of epochs, the batch size, or how many images to train with, to get the desired result, and this process takes quite a lot of time and perseverance.
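
To get a feel for how these knobs interact, here is a back-of-the-envelope sketch of the training budget for the GPU settings in this notebook. It assumes every batch is full and uses the 20-image example data set; the variable names other than `BATCH_SIZE` and `EPOCHS` are hypothetical.

```python
BATCH_SIZE = 20        # value used in the GPU branch
EPOCHS = 115           # value used in the GPU branch
STEPS_PER_EPOCH = 10   # value passed to fit_generator in the training cell
N_IMAGES = 20          # size of the example data set

batches_total = EPOCHS * STEPS_PER_EPOCH
samples_total = batches_total * BATCH_SIZE
passes_over_data = samples_total / N_IMAGES
print(batches_total, samples_total, passes_over_data)
```

Because the generator augments each batch on the fly, these passes are not exact repeats of the same 20 images, but doubling any one of the three settings still roughly doubles the training time.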

In [9]:
#Train model 
tensorboard = TensorBoard(log_dir="/")
model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])
model.fit_generator(image_a_b_gen(BATCH_SIZE), 
                    callbacks=[tensorboard],
                    epochs=EPOCHS, steps_per_epoch=10, verbose=1)

Epoch 1/115
Epoch 2/115
...
Epoch 77/115
Epoch 78

<keras.callbacks.History at 0x7f40e03bf908>

### Evaluation Time

In [18]:
# Evaluate Colorization
from support import load_pretrained_model
from outputImg import color_result

(model, inception) = load_pretrained_model('../Models/inception_resnet_v2_weights_tf_dim_ordering_tf_kernels.h5',
                                           '../Models/color_tensorflow_ds_small_115.h5')

START = 0
END = 7
PATH = 'Test/'
RESULT = 'Result'

color_result(PATH, START, END, RESULT, model, inception)

Loading pre-trained model...
Model loaded!


In [21]:
# Show results
from ipywidgets import interact
from ipywidgets import widgets
from outputImg import show_img 

def show_sample(sample_n):
    """
    showcase black and white images which got colorized by our model
    """
    image_path = os.path.join(RESULT, "img_"+str(sample_n-1)+".png")
    img = image.load_img(image_path)
    img = image.img_to_array(img)/255
    ax = show_img(img, figsize=(9,9))
    ax.set_title(image_path)
    
interact(show_sample, sample_n=widgets.IntSlider(value=1, min=1, max=END-START-1, description='Results'));


We can also try the model on images from the web. In addition, we will show how our model's output compares to the original color version of an image (if available).

Some colored photos for demo:
- https://cdn.pixabay.com/photo/2017/04/07/18/23/landscape-2211587_960_720.jpg
- https://photographycourse.net/wp-content/uploads/2014/11/Landscape-Photography-steps.jpg
- https://briansmith.com/wp-content/uploads/2012/02/Brian-Smith-Samuel-L-Jackson.jpg

In [22]:
# Testing on url images
from ipywidgets import interact_manual
from ipywidgets import widgets
from support import prediction_from_url

def get_prediction(URL):
    prediction_from_url(URL, model, inception)

interact_manual(get_prediction, URL=widgets.Text(placeholder='Insert an image URL'));


### Save the Trained Model

In [13]:
model.save_weights("../Models/model_{}_epochs.h5".format(EPOCHS))