# A short introduction

We are working on a project to colorize grayscale pictures.
This short introduction covers the main decisions behind our current approach to the problem.

The Dataset:
At first we planned to use the ImageNet database, but technical difficulties on its site kept us from getting access, so we switched to our current dataset (Open Images V4), which provides the original images in a zip file. The images are not restricted to a single subject; the dataset covers a wide variety of themes. We decided to start with 100,000 images.
(https://www.figure-eight.com/dataset/open-images-annotated-with-bounding-boxes/)

The Images:
Although the images in the dataset were mostly satisfactory for our goals, we had to modify them. They were too big for us to handle and did not share the same dimensions, so we crop them to a square and scale them down to a resolution of 128x128. Some pictures, such as those that are already grayscale, were not suitable for our task, so we filter them out.
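
The square crop can be illustrated without any imaging library. The following is only a minimal sketch: the helper name `center_square_box` is hypothetical, and it assumes the crop should be the largest square centered in the image. It returns a box in the `(left, top, right, bottom)` order that `PIL.Image.crop()` expects.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# Landscape image: the left and right margins are cut off.
print(center_square_box(640, 480))  # (80, 0, 560, 480)
# Portrait image with an odd size difference: the box is still exactly square.
print(center_square_box(300, 501))  # (0, 100, 300, 400)
```

Deriving the far edge from `left + side` (or `top + side`) keeps the box square even when the difference between width and height is odd.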

The LAB color space:
We decided to convert the images from the RGB color space to LAB, which should make training easier: the L channel of the LAB color space is the grayscale (lightness) representation of the image, so the network only has to predict two channels instead of three.
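
To give a feeling for what the L channel contains, here is a small pure-Python sketch that computes only the lightness component for a single pixel. It assumes sRGB input in the range 0..1 and the D65 white point; the function name `srgb_to_lightness` is hypothetical. In the notebook itself the full conversion is left to `skimage.color.rgb2lab()`.

```python
def srgb_to_lightness(r, g, b):
    """Return CIE L* (0..100) for an sRGB pixel with components in 0..1."""
    def linearize(c):
        # Undo the sRGB gamma encoding.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    # Relative luminance Y from linear RGB (sRGB primaries, D65 white, Yn = 1).
    y = 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)

    # CIE L* from Y.
    delta = 6 / 29
    f = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f - 16


samples = [("black", (0, 0, 0)), ("mid gray", (0.5, 0.5, 0.5)),
           ("white", (1, 1, 1)), ("pure green", (0, 1, 0))]
for name, rgb in samples:
    print(name, round(srgb_to_lightness(*rgb), 1))
```

Black maps to a lightness of about 0 and white to about 100, which is why L can serve as the grayscale input while the two remaining channels carry the color.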

The Small Parts:
Even at 128x128, the whole dataset is too big to be loaded into memory as a single NumPy array, so we divide it into smaller parts and train the neural network on each part in turn.
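
As a rough estimate, assuming float32 storage and that all 100,000 images survive filtering: one 128x128 image with 3 channels takes 128 * 128 * 3 * 4 = 196,608 bytes, so the full set would need about 19.7 GB. The splitting into parts can be sketched as follows; `chunk_ranges` is a hypothetical helper, not part of the notebook code.

```python
def chunk_ranges(total, chunk_size):
    """Yield (offset, count) pairs that cover range(total) in order."""
    for offset in range(0, total, chunk_size):
        yield offset, min(chunk_size, total - offset)


# 2500 images in parts of at most 1000 images each.
print(list(chunk_ranges(2500, 1000)))  # [(0, 1000), (1000, 1000), (2000, 500)]
```

Each pair tells the loader where to start and how many images to read. When the total is an exact multiple of the chunk size, no empty trailing part is produced.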

# The preparation and preprocessing of the dataset

In [27]:
# Importing the used packages
# Numpy for arrays
import numpy as np
# requests for downloading the dataset
import requests
# PIL.Image for image processing
from PIL import Image
# keras.preprocessing.image for image processing
import keras.preprocessing.image as k_image
# os for file management
import os
# skimage.color for transforming the color model of images
import skimage.color as skcolor
# zipfile for extracting downloaded dataset 
import zipfile
# random for random number generation in grayscale check
import random

In [28]:
# Function for downloading the dataset.
# url:           String, url of the zipped dataset
# target_path:   String, intended filepath of the downloaded dataset
def download_dataset(url, target_path):
    # Downloading the file in chunks to avoid memory overrun.
    r = requests.get(url, stream=True)
    # Failing early on HTTP errors instead of saving an error page as the zip file.
    r.raise_for_status()
    with open(target_path, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            # Filtering out keep-alive new chunks.
            if chunk:
                f.write(chunk)

In [29]:
# Function for extracting zipped dataset.
# dataset_zipped_path:   String, filepath of the zipped dataset
# raw_dataset_path:      String, intended folder path of raw dataset
# return value:          String, final folder path of raw dataset
def extract_dataset(dataset_zipped_path, raw_dataset_path):
    # Creating directory for the raw dataset, if it does not exist.
    if not os.path.exists(raw_dataset_path):
        os.makedirs(raw_dataset_path)
    
    # Extracting dataset to the intended folder path (the archive is closed automatically).
    with zipfile.ZipFile(dataset_zipped_path, 'r') as zip_ref:
        zip_ref.extractall(raw_dataset_path)
    
    # Determining and returning final path of the raw dataset.
    dirlist = os.listdir(raw_dataset_path)    
    return raw_dataset_path + dirlist[0] + '/'

In [30]:
# Function for grayscale check of an image. Returns True if grayscale, False if not.
# im:            PIL.Image object, input image
# return value:  boolean, True if grayscale, False if not grayscale
def is_gray_scale(im):
    # Images with fewer than 3 bands cannot be unpacked into r,g,b; they are treated as grayscale.
    if len(im.getbands()) < 3:
        return True
    w,h = im.size
    # Sampling 10 random pixels (getpixel() needs integer coordinates).
    for i in range(10):
        pixel = im.getpixel((random.randint(0,w-1), random.randint(0,h-1)))
        r,g,b = pixel[:3]
        # One pixel with differing channel values proves that the image is not grayscale.
        if not (r == g == b): return False
    # If all of the 10 pixels have the same value on each channel, the image is regarded as grayscale.
    return True
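
A note on testing whether three channel values are equal: Python evaluates chained comparisons pairwise, so `a != b != c` means `(a != b) and (b != c)` and never compares `a` with `c`. The sketch below shows the difference on one pixel; both function names are hypothetical and exist only for this illustration.

```python
def naive_not_gray(r, g, b):
    # Chained form: evaluated as (r != g) and (g != b).
    return r != g != b


def not_gray(r, g, b):
    # True as soon as any two channels differ.
    return not (r == g == b)


pixel = (10, 10, 200)          # clearly blue, yet r == g
print(naive_not_gray(*pixel))  # False
print(not_gray(*pixel))        # True
```

The chained form misses every colored pixel in which two neighbouring channels happen to be equal, so `not (r == g == b)` is the safer test.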

In [31]:
# Function for dimension check of an image. Returns True if the image has the proper dimensions (3D, 3 channels), False if not.
# im:            PIL.Image object, input image
# return value:  boolean, True if image has the proper dimensions (3D, 3 channels), False if not
def has_proper_dim(im):
    # Get the image data to numpy array.
    im_array = np.array(im)
    shape = im_array.shape
    # The image shall have 3 dimensions, and 3 channels.
    return (len(shape) == 3) and (shape[2] == 3)

In [32]:
# Function for making the images to 1:1 ratio, and resizing them to the target size.
# im:           PIL.Image object, input image
# target_size:  tuple with 2 integer element, (width, height)
# return value: PIL.Image object, transformed image
def crop_resize_Image(im, target_size):
    # Taking out the image data (width,height).
    width,height = im.size
    # Deciding if the image is landscape or portrait.
    if(width > height):
        # Landscape.
        top     = 0
        left    = (width - height)//2
        bottom  = height
        # The crop is exactly as wide as it is high, even if (width - height) is odd.
        right   = left + height
    else:
        # Portrait.
        top     = (height - width)//2
        left    = 0
        # The crop is exactly as high as it is wide, even if (height - width) is odd.
        bottom  = top + width
        right   = width
    # Cropping the image to conform to 1:1 ratio, then resizing to target size.
    return im.crop((left,top,right,bottom)).resize(target_size)

In [33]:
# Function for standardizing the input array.
# array:          numpy array, input array
# return value:   numpy array, standardized input array
def standardize(array):
    # Calculating average on all elements of the array.
    ave = np.average(array)
    # Calculating standard deviation on all the elements of the array.
    std = np.std(array)
    # Standardizing the input array.
    new_array = (array-ave)/std
    # Returning standardized array.
    return new_array

In [34]:
# Function for transforming the images of the dataset to 1:1 ratio, and target size.
# initial_path:  String, folder path of the raw dataset
# target_path:   String, intended folder path of the transformed dataset
# image_size:    tuple with 2 integer element, (width, height)
def dataset_transform(initial_path, target_path, image_size):
    # Creating directory for the transformed dataset, if it does not exist.
    if not os.path.exists(target_path):
        os.makedirs(target_path)
    # Iterating over the raw images.
    for filename in os.listdir(initial_path):
        im = Image.open(initial_path + filename)
        # Filtering out images with improper dimensions, and grayscale images.
        # The dimension check runs first, so is_gray_scale() only sees 3-channel images.
        if has_proper_dim(im) and not is_gray_scale(im):
            # Making the images to 1:1 ratio, and resizing them to the target size.
            im = crop_resize_Image(im, image_size)
            # Saving the images to the target directory.
            im.save(target_path + filename)

In [35]:
# Function to preprocess the given number of images from the given offset to form training and validation data.
# dataset_path:  String, folder path of the dataset
# train_spl:     float, proportion of training data to all data
# valid_spl:     float, proportion of validation data to all data
# num_imgs:      int, number of images to preprocess
# offset:        int, image index offset in dataset folder
# return value:  training data input, training data output, validation data input, validation data output
def preprocess_train_valid_data(dataset_path, train_spl, valid_spl, num_imgs, offset):
    
    # Determine validation split when taking to account only training and validation data.
    valid_split = valid_spl / (train_spl + valid_spl)
    
    # Making a list of filenames of the dataset directory.
    filename_list = os.listdir(dataset_path)
    # Creating an empty list for the loaded images.
    data = []
    # Iterating over the given number of images from the given offset in the dataset.
    for i in range(offset, (offset + num_imgs)):
        # Loading the actual image, then converting that to a numpy array.
        im = k_image.img_to_array(k_image.load_img(dataset_path + filename_list[i]))
        # Scaling the rgb pixel values from 0-255 to 0-1, to comply with the input of rgb2lab() function.
        im = im * (1.0/255.0)
        # Converting the image from RGB to LAB color space. The datatype of the result is cast from float64 to float32.
        im = skcolor.rgb2lab(im).astype('float32')
        # Appending image to the list.
        data.append(im)    

    # Creating a numpy array from the list, with float32 datatype.
    data = np.asarray(data, dtype='float32')
    # Selecting the first channel of the images as input; it contains the grayscale representation of the image in LAB color space.
    X = data[:,:,:,0]
    # Selecting the second and third channels of the images as output, they contain green–red and blue–yellow color components respectively.
    Y = data[:,:,:,1:]
    
    # Standardizing input data.
    X = standardize(X)
    # Normalizing output data from range -128...+128 to -1...+1
    Y = Y/128

    # Selecting training and validation data separately.
    split_idx = int(num_imgs*(1-valid_split))
    X_train = X[:split_idx,:,:]
    Y_train = Y[:split_idx,:,:,:]
    X_valid = X[split_idx:,:,:]
    Y_valid = Y[split_idx:,:,:,:]
    
    return X_train,Y_train,X_valid,Y_valid
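
The split arithmetic used above can be checked on its own. This is a small sketch with a hypothetical helper name, `train_valid_counts`, that repeats the same formula: the validation share is first rescaled to the training-plus-validation portion, then the split index is truncated to an integer.

```python
def train_valid_counts(num_imgs, train_spl, valid_spl):
    """Return (number of training images, number of validation images)."""
    # Validation share relative to training + validation data only.
    valid_split = valid_spl / (train_spl + valid_spl)
    n_train = int(num_imgs * (1 - valid_split))
    return n_train, num_imgs - n_train


# 80% training and 10% validation of the whole dataset, 1000 images in memory.
print(train_valid_counts(1000, 0.8, 0.1))  # (888, 112)
```

With an 80/10/10 split, each part of 1000 loaded images therefore yields 888 training and 112 validation images.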

In [36]:
# Function to preprocess the given number of images from the given offset to form test data.
# dataset_path:  String, folder path of the dataset
# num_imgs:      int, number of images to preprocess
# offset:        int, image index offset in dataset folder
# return value:  test data input, test data output
def preprocess_test_data(dataset_path, num_imgs, offset):
    
    # Making a list of filenames of the dataset directory.
    filename_list = os.listdir(dataset_path)
    # Creating an empty list for the loaded images.
    data = []
    # Iterating over the given number of images from the given offset in the dataset.
    for i in range(offset, min(offset + num_imgs,len(filename_list))):
        # Loading the actual image, then converting that to a numpy array.
        im = k_image.img_to_array(k_image.load_img(dataset_path + filename_list[i]))
        # Scaling the rgb pixel values from 0-255 to 0-1, to comply with the input of rgb2lab() function.
        im = im * (1.0/255.0)
        # Converting the image from RGB to LAB color space. The datatype of the result is cast from float64 to float32.
        im = skcolor.rgb2lab(im).astype('float32')
        # Appending image to the list.
        data.append(im)

    # Creating a numpy array from the list, with float32 datatype.
    data = np.asarray(data, dtype='float32')
    # Selecting the first channel of the images as input; it contains the grayscale representation of the image in LAB color space.
    X_test = data[:,:,:,0]
    # Selecting the second and third channels of the images as output, they contain green–red and blue–yellow color components respectively.
    Y_test = data[:,:,:,1:]
    
    # Standardizing input data.
    X_test = standardize(X_test)
    # Normalizing output data from range -128...+128 to -1...+1
    Y_test = Y_test/128
    
    return X_test, Y_test

In [37]:
# Function for converting a list of RGB images to one grayscale image array.
# imageArray:    list of PIL.Image objects of identical size
# return value:  numpy array with shape (number of images, height, width)
def img_grayscale(imageArray):
    # Converting every image to a single 8-bit luminance band, then stacking the results.
    return np.stack([np.array(image.convert('L')) for image in imageArray])

In [38]:
# Function for making visualization for the images.
# transformed_dataset_path   String, folder path of transformed dataset
# width                      int, number of images on one edge
# size                       int, size of 1:1 ratio image
def image_mosaic(transformed_dataset_path, width, size):
    
    # Making a list of filenames of the dataset directory.
    filename_list = os.listdir(transformed_dataset_path)
    # Creating an empty list for the loaded images.
    imagearray = []
    # Loading the first width*width number of images, and appending them to the list.
    for i in range(width*width):
        im = Image.open(transformed_dataset_path + filename_list[i])
        imagearray.append(im)
        
    # Initializing canvas.
    canvas = np.ones((size*width,size*width*2,3))
    # Resizing images (Image.LANCZOS is the current name of the former Image.ANTIALIAS filter).
    for i in range(len(imagearray)):
        imagearray[i]=imagearray[i].resize((size,size),Image.LANCZOS)
    # Making grayscale images.
    grayimage=img_grayscale(imagearray)
    # Writing the RGB images to the left half of the canvas.
    for i in range(width):
        for j in range(width):
            canvas[i*size:(i+1)*size,j*size:(j+1)*size,:]=np.array(imagearray[i+width*j])
    # Writing the grayscale images to the right half of the canvas, copied to all three channels.
    for i in range(width):
        for j in range(width,2*width):
            gray = np.array(grayimage[i+width*(j-width)])
            canvas[i*size:(i+1)*size,j*size:(j+1)*size,0]=gray
            canvas[i*size:(i+1)*size,j*size:(j+1)*size,1]=gray
            canvas[i*size:(i+1)*size,j*size:(j+1)*size,2]=gray

    # Displaying the mosaic.
    canvas = canvas.astype(np.uint8)
    mosaic = Image.fromarray(canvas)
    mosaic.show()

In [None]:
# Specifying train, validation and test split values.
train_split = 0.8
valid_split = 0.1
test_split = 0.1
# Specifying input image size.
image_size = (128,128)
# Specifying maximal number of images that can be loaded into memory at one time.
max_loaded_imgs_num = 1000
# Specifying intended filepath for the dataset to be downloaded.
dataset_zipped_path = os.getcwd() + '/zipped_dataset.zip'
# Specifying intended folder path of raw dataset.
raw_dataset_path = os.getcwd() + '/raw_dataset/'
# Specifying intended folder path of transformed dataset.
transformed_dataset_path = os.getcwd() + '/transformed_dataset/'

# Downloading zipped dataset.
download_dataset('https://datasets.figure-eight.com/figure_eight_datasets/open-images/test_challenge.zip', dataset_zipped_path)
print('Zip file downloaded.')

# Extracting zipped dataset to the intended folder path.
raw_dataset_path = extract_dataset(dataset_zipped_path, raw_dataset_path)
print('Images are extracted.')

# Transforming raw dataset to the proper format.
dataset_transform(raw_dataset_path, transformed_dataset_path, image_size)
print('Dataset is transformed')

# Visualizing the dataset by displaying a mosaic.
image_mosaic(transformed_dataset_path, 10, 64)

# Determining the length of the dataset.
len_dataset = len(os.listdir(transformed_dataset_path))

# Since the whole dataset cannot be loaded into memory at the same time, the training of the network will be
# executed in cycles. In each cycle (except the last) a predetermined number of images (max_loaded_imgs_num) is
# loaded into memory, where they are preprocessed and split into training and validation datasets. Then the
# configured number of training and validation epochs is run on the neural network with these datasets. The images
# loaded into memory are different in every cycle.

# Determining the combined length of the training and validation data.
len_train_val_set = int(len_dataset*(train_split+valid_split))
# Determining the number of (training + validation) cycles.
num_cycles_train_val = len_train_val_set//max_loaded_imgs_num + 1
# Determining the length of the last section of (training + validation) data.
len_last_section_train_val = len_train_val_set - max_loaded_imgs_num * (num_cycles_train_val - 1)

# Since the whole dataset cannot be loaded to the memory at the same time, the network evaluation on test data
# will be executed in cycles. In each cycle (except the last) a predetermined number of images (max_loaded_imgs_num)
# are loaded to the memory, where they are preprocessed, forming the test dataset. Then the test phase is executed
# on the neural network with this dataset. The images loaded to the memory are different in every cycle.

# Determining the length of the test data.
len_test_set = len_dataset - len_train_val_set
# Determining the number of test cycles.
num_cycles_test = len_test_set//max_loaded_imgs_num + 1
# Determining the length of the last section of test data.
len_last_section_test = len_test_set - max_loaded_imgs_num * (num_cycles_test - 1)

# Execution of (num_cycles_train_val-1) training cycle.
#for i in range(num_cycles_train_val-1):
#    (X_train,Y_train,X_valid,Y_valid) = preprocess_train_valid_data(transformed_dataset_path, train_split, valid_split, max_loaded_imgs_num, i * max_loaded_imgs_num)
    # training the model

# If there are images left in the last section, execute last training cycle.
#if(len_last_section_train_val != 0):    
#    (X_train,Y_train,X_valid,Y_valid) = preprocess_train_valid_data(transformed_dataset_path, train_split, valid_split, len_last_section_train_val, (num_cycles_train_val - 1) * max_loaded_imgs_num)
    # training the model

# Execution of (num_cycles_test-1) test cycle.
#for i in range(num_cycles_test-1):
#    (X_test, Y_test) = preprocess_test_data(transformed_dataset_path, max_loaded_imgs_num, len_train_val_set + i * max_loaded_imgs_num )
    # evaluation of test data

# If there are images left in the last section, execute last test cycle.
#if(len_last_section_test != 0):
#    (X_test, Y_test) = preprocess_test_data(transformed_dataset_path, len_last_section_test, len_train_val_set + (num_cycles_test - 1) * max_loaded_imgs_num )
    # evaluation of test data

Zip file downloaded.
Images are extracted.
