# Challenge Deep Learning MDI341
## Author : Mohamed AL ANI

## Challenge rules 
For face recognition, researchers often build 'templates' (a vector of 128 float values) from high resolution images.  These templates are useful in certain applications where only low resolution images can be acquired and/or there is only low processing power (cpu/gpu) available.


The goal of this challenge is to learn a predictive system which predicts templates from **low** resolution images. The predicted templates should be as close as possible to original templates that were built from **high** resolution images.

You can use any method and any code you like (scikit-learn, TensorFlow, etc.). However, the data provider suggested seeing what a Convolutional Neural Network (CNN) could achieve. We would appreciate a working CNN solution, but any method is acceptable. However, **if you decide to use a CNN, then TensorFlow is mandatory**.



The properties of the dataset are as follows:

**Training data:**

$\textbf{X}_{train}$: size 100000 x 2304. Each row of this matrix contains a **low-resolution** image. There are 100000 images where each image is of size 48x48=2304.

$\textbf{Y}_{train}$: size 100000 x 128.  Each row of this matrix contains a template of size 128, which was previously learned from a **high-resolution** image.

**Validation data:**

$\textbf{X}_{valid}$: size 10000 x 2304. There are 10000 images in this dataset.

$\textbf{Y}_{valid}$: size 10000 x 128.  

**Test data:**

$\textbf{X}_{test}$: size 10000 x 2304. There are 10000 images in this dataset similar to the validation set.

$\textbf{Y}_{test}$: size 10000 x 128.  (You do not have access to this matrix)



### The goal and the performance criterion

The goal is to build a model, which would produce templates for the test data, given $\textbf{X}_{train}$, $\textbf{Y}_{train}$, $\textbf{X}_{valid}$, $\textbf{Y}_{valid}$, and $\textbf{X}_{test}$. 

Let us call the prediction of the model: $\hat{\textbf{Y}}_{test}$. The performance criterion is given as follows:

$\text{score} = \frac1{N}\sum_{i=1}^{10000} \sum_{j=1}^{128} \Bigl(\textbf{Y}_{test}(i,j) - \hat{\textbf{Y}}_{test}(i,j) \Bigr)^2 $

where $N$ denotes the total number of elements in $\textbf{Y}_{test}$, such that $N=128 \times 10000$.

The lower the score, the better the performance.
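
The criterion above is a mean squared error over every element of the template matrix. Below is a minimal NumPy sketch of it; the function name `challenge_score` and the toy arrays are illustrative assumptions, not part of the challenge material.

```python
import numpy as np

def challenge_score(y_true, y_pred):
    """Sum of squared differences divided by the number of elements N."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    n = y_true.size  # N = number of rows x template dimension
    return np.sum((y_true - y_pred) ** 2) / n

# Toy example: 2 templates of dimension 3 (made-up numbers).
y = np.array([[0.0, 1.0, 2.0], [1.0, 1.0, 1.0]])
y_hat = np.array([[0.0, 1.0, 1.0], [1.0, 2.0, 1.0]])
toy_score = challenge_score(y, y_hat)
print(toy_score)
```

Because the sum is divided by the total number of elements, the result is the same as `np.mean((y - y_hat) ** 2)`.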


## Introduction 
For this challenge, we use a CNN built with TensorFlow.
We tried many different filters and layers. What worked best for us is:
- No activation function
- 6 convolutional layers (three with 3x3 filters, stride 2, no padding, and three with 8x8 filters, stride 4, no padding)
- 30000 iterations of the Adam optimizer, which gave the best results among the optimizers we tried


## Imports 

In [1]:
#Inspired by this very good tutorial : https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/02_Convolutional_Neural_Network.ipynb

import tensorflow as tf
import numpy as np
import time
from os.path  import join
from datetime import timedelta
import math

This was developed using Python 3.5.2 (Anaconda) and TensorFlow version:

In [2]:
tf.__version__

'0.12.0-rc0'

## Some useful functions 

### Total parameters of the cnn 
Computes the total number of trainable parameters in the CNN. We make sure not to use more than 50K parameters.

In [3]:
def get_total_param():
    ''' Gives the total number of parameters of the CNN (should be < 50k)'''
    total_parameters = 0
    # iterating over all variables
    for variable in tf.trainable_variables():
        local_parameters = 1
        shape = variable.get_shape()  # getting shape of a variable
        for i in shape:
            local_parameters *= i.value  # multiplying dimension values
        total_parameters += local_parameters

    return total_parameters

### Variable Summaries 
Attaches summary statistics (mean, standard deviation, max, min and a histogram) to a tensor. This is useful for TensorBoard visualization.

In [4]:
def variable_summaries(var):
    '''Attach a lot of summaries to a Tensor (for TensorBoard visualization).'''
    with tf.name_scope('summaries'):
        mean = tf.reduce_mean(var)
        tf.summary.scalar('mean', mean)
        with tf.name_scope('stddev'):
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
        tf.summary.scalar('stddev', stddev)
        tf.summary.scalar('max', tf.reduce_max(var))
        tf.summary.scalar('min', tf.reduce_min(var))
        tf.summary.histogram('histogram', var)

### Weights and biases initialization
Functions for creating new TensorFlow variables in the given shape and initializing them with random values.

In [5]:
def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))

def new_biases(length):
    return tf.Variable(tf.constant(0.05, shape=[length]))
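
`tf.truncated_normal` draws from a normal distribution and re-draws any value that lies more than two standard deviations from the mean. The NumPy sketch below imitates that behaviour for illustration only; the helper name `truncated_normal` and the fixed seed are assumptions, and this is not TensorFlow's implementation.

```python
import numpy as np

def truncated_normal(shape, stddev=0.05, seed=0):
    """Normal samples, re-drawn until all lie within 2 stddev of zero."""
    rng = np.random.RandomState(seed)
    values = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(values) > 2.0 * stddev
    while out_of_range.any():
        # Replace only the samples that fell outside the allowed range.
        values[out_of_range] = rng.normal(0.0, stddev,
                                          size=int(out_of_range.sum()))
        out_of_range = np.abs(values) > 2.0 * stddev
    return values.astype(np.float32)

# Same shape as the filters of the first convolutional layer.
weights = truncated_normal((8, 8, 1, 16))
print(weights.shape, float(np.abs(weights).max()))
```

With `stddev=0.05`, every initial weight therefore lies in the interval [-0.1, 0.1].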

### New convolution layer 

This function creates a new convolutional layer in the computational graph for TensorFlow. 

Input : 4-dim tensor :
- Image number.
- Y-axis of each image.
- X-axis of each image.
- Channels of each image.

Output : 4-dim tensor :
- Image number, same as input.
- Y-axis of each image, downsampled by the stride.
- X-axis of each image, downsampled by the stride.
- Channels produced by the convolutional filters.

In [6]:
def new_conv_layer(input,              # The previous layer.
                   num_input_channels, # Num. channels in prev. layer.
                   filter_size,        # Width and height of each filter.
                   num_filters,        # Number of filters.
                   act=tf.nn.relu):

    # Shape of the filter-weights for the convolution.
    shape = [filter_size, filter_size, num_input_channels, num_filters]

    with tf.variable_scope("conv_layer"):

        with tf.name_scope('weights'):
            # Create new weights aka. filters with the given shape.
            weights = new_weights(shape=shape)
            variable_summaries(weights)
        with tf.name_scope('biases'):
            # Create new biases, one for each filter.
            biases = new_biases(length=num_filters)
            variable_summaries(biases)

        with tf.name_scope('Wx_plus_b'):
            # Create the TensorFlow operation for convolution.
            # The padding is set to 'SAME', so the input is zero-padded
            # and, with a stride of 3, each spatial dimension of the
            # output is ceil(input_size / 3).
            preactivate = tf.nn.conv2d(input=input,
                                        filter=weights,
                                        strides=[1, 3, 3, 1],
                                        padding='SAME') + biases
            tf.summary.histogram('pre_activations', preactivate)

        # Apply the activation function (ReLU by default).
        activations = act(preactivate, 'activation')
        tf.summary.histogram('activations', activations)

    return activations, weights  # act_dp
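
With `'SAME'` padding, the spatial size of a strided convolution output is `ceil(input_size / stride)`, independent of the filter size. The sketch below applies that rule to a 48-pixel side passing through three layers with the stride of 3 used in `new_conv_layer`; the helper name `same_padding_output_size` is an assumption made for illustration.

```python
import math

def same_padding_output_size(input_size, stride):
    """Output length along one spatial axis for padding='SAME'."""
    return int(math.ceil(input_size / float(stride)))

size = 48
sizes = [size]
for _ in range(3):  # three stacked convolutional layers, stride 3
    size = same_padding_output_size(size, stride=3)
    sizes.append(size)

print(sizes)  # [48, 16, 6, 2]
```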

### New Fully Connected Layer

This function creates a new fully-connected layer in the computational graph for TensorFlow.

In [7]:
def new_fc_layer(input,          # The previous layer.
                 num_inputs,     # Num. inputs from prev. layer.
                 num_outputs,    # Num. outputs.
                 use_relu=False): # Use Rectified Linear Unit (ReLU)?

    with tf.name_scope("fc_layer"):
        # Create new weights and biases.
        weights = new_weights(shape=[num_inputs, num_outputs])
        biases = new_biases(length=num_outputs)

        # Calculate the layer as the matrix multiplication of
        # the input and weights, and then add the bias-values.
        layer = tf.matmul(input, weights) + biases

        # Use ReLU?
        if use_relu:
            layer = tf.nn.relu(layer)

    return layer

### Flattening Layer Dimension

From dimension 4 to dimension 2

In [8]:
def flatten_layer(layer):

    with tf.name_scope("flatten_layer"):
        # Get the shape of the input layer.
        layer_shape = layer.get_shape()

        # The shape of the input layer is assumed to be:
        # layer_shape == [num_images, img_height, img_width, num_channels]

        # The number of features is: img_height * img_width * num_channels
        # We can use a function from TensorFlow to calculate this.
        num_features = layer_shape[1:4].num_elements()

        # Reshape the layer to [num_images, num_features].
        layer_flat = tf.reshape(layer, [-1, num_features])

        # The shape of the flattened layer is now:
        # [num_images, img_height * img_width * num_channels]

    # Return both the flattened layer and the number of features.
    return layer_flat, num_features
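
The same reshaping can be tried on a plain NumPy array. This is only an illustration of the shape arithmetic with made-up dimensions, not the TensorFlow code path.

```python
import numpy as np

# A toy batch: 4 images of 2 x 2 pixels with 3 channels.
batch = np.arange(4 * 2 * 2 * 3, dtype=np.float32).reshape(4, 2, 2, 3)

# img_height * img_width * num_channels
num_features = int(np.prod(batch.shape[1:4]))

# -1 lets NumPy infer the number of images.
flat = batch.reshape(-1, num_features)

print(flat.shape)  # (4, 12)
```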

### Next batch
Return a total of 'num' random samples from the array images and targets

In [9]:
def next_batch(num, images, targets):
    idx = np.arange(0, images.shape[0])  # get all possible indexes
    np.random.shuffle(idx)  # shuffle indexes
    idx = idx[0:num]  # use only `num` random indexes
    images_shuffle = [images[i] for i in idx]  # get list of `num` random samples
    images_shuffle = np.asarray(images_shuffle)  # get back numpy array
    targets_shuffle = [targets[i] for i in idx]
    targets_shuffle = np.asarray(targets_shuffle)

    return images_shuffle, targets_shuffle
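
An equivalent way to draw the batch is NumPy fancy indexing, which avoids building Python lists. The sketch below is an alternative under that assumption; the name `next_batch_indexed` and the optional `rng` argument are hypothetical additions, not part of the original notebook.

```python
import numpy as np

def next_batch_indexed(num, images, targets, rng=np.random):
    """Return `num` distinct random rows of images with matching targets."""
    idx = rng.permutation(images.shape[0])[:num]
    return images[idx], targets[idx]

# Toy data: row i of `images` is [2i, 2i + 1], row i of `targets` is [i].
images = np.arange(20, dtype=np.float32).reshape(10, 2)
targets = np.arange(10, dtype=np.float32).reshape(10, 1)

xs, ys = next_batch_indexed(4, images, targets,
                            rng=np.random.RandomState(0))
print(xs.shape, ys.shape)
```

Indexing both arrays with the same `idx` keeps each image aligned with its template.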

### Restoring a saved model 
Restores all saved variables from a checkpoint into a new session and returns that session.

In [10]:
def restore_session(model_path):

    saver = tf.train.Saver()

    # Keep the session open so that the caller can use it;
    # the caller is responsible for closing it.
    sess = tf.Session()
    # Restore variables from disk.
    saver.restore(sess, model_path)
    print("Model restored from " + model_path)

    return sess

### Predict result using cnn

In [11]:
def cnn_predict(image_data, template_data=None):

    # Number of images in the given set.
    num_test = len(image_data)

    # Split the set into smaller batches of this size.
    test_batch_size = 256

    # Allocate an array for the predicted templates which
    # will be calculated in batches and filled into this array.
    temp_pred = np.zeros(shape=(num_test, template_dim), dtype=np.float32)

    # The starting index for the next batch is denoted i.
    i = 0

    while i < num_test:
        # The ending index for the next batch is denoted j.
        j = min(i + test_batch_size, num_test)

        # Get the images between index i and j.
        images = image_data[i:j, :]

        # Calculate the prediction using TensorFlow.
        temp_pred[i:j, :] = session.run(layer_fc1, feed_dict={x: images})

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    if template_data is not None:
        # Regression score (MSE) against the provided templates: NumPy check.
        sc = np.mean((template_data - temp_pred) ** 2)
        msg = "Score on the provided set: {0:.4%}"
        print(msg.format(sc))

    return temp_pred
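
The batching pattern used in `cnn_predict` does not depend on TensorFlow. The sketch below shows it with a stand-in model; `predict_in_batches` and `toy_model` are hypothetical names introduced only for this illustration.

```python
import numpy as np

def predict_in_batches(predict_fn, data, out_dim, batch_size=256):
    """Apply predict_fn to consecutive slices of data and stack the results."""
    n = len(data)
    pred = np.zeros((n, out_dim), dtype=np.float32)
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)  # the last batch may be smaller
        pred[start:end] = predict_fn(data[start:end])
    return pred

def toy_model(batch):
    """Stand-in for session.run(...): doubles the first two columns."""
    return batch[:, :2] * 2.0

data = np.arange(30, dtype=np.float32).reshape(10, 3)
batched = predict_in_batches(toy_model, data, out_dim=2, batch_size=4)
print(batched.shape)
```

Processing the set in slices keeps memory use bounded by the batch size while producing the same result as a single call on the full array.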


### Initializing layers 
We use just 2 types of convolutional layers : 

- 16 filters of 8 x 8 pixels
- 20 filters of 3 x 3 pixels

The fully connected layer has 128 outputs, one per template dimension.

In [12]:
# Convolutional Layer 1.
filter_size1 = 8    # Convolution filters are 8 x 8 pixels.
num_filters1 = 16   # There are 16 of these filters.

# Convolutional Layer 2.
filter_size2 = 3    # Convolution filters are 3 x 3 pixels.
num_filters2 = 20   # There are 20 of these filters.

# Fully-connected layer
fc_size = 128             # Number of neurons in fully-connected layer.

### Loading the images

In [13]:
images_train_fname    = "data_train.bin"
templates_train_fname = "fv_train.bin"

images_valid_fname    = "data_valid.bin"
templates_valid_fname = "fv_valid.bin"

images_test_fname     = "data_test.bin"

# number of images
num_train_images = 100000
num_valid_images = 10000
num_test_images  = 10000

# We know that images are 48 pixels in each dimension.
img_size = 48

# size of the images 48*48 pixels in gray levels
img_size_flat = img_size * img_size
img_shape = (img_size, img_size)

# Number of colour channels for the images: 1 channel for gray-scale.
num_channels = 1

# dimension of the templates
template_dim = 128

# read the training files
with open(templates_train_fname, 'rb') as f:
    train_template_data = np.fromfile(f, dtype=np.float32, count=num_train_images * template_dim)
    train_template_data = train_template_data.reshape(num_train_images, template_dim)

with open(images_train_fname, 'rb') as f:
    train_image_data = np.fromfile(f, dtype=np.uint8, count=num_train_images * img_size_flat).astype(np.float32)
    train_image_data = train_image_data.reshape(num_train_images, img_size_flat)

# read the validation files
with open(templates_valid_fname, 'rb') as f:
    valid_template_data = np.fromfile(f, dtype=np.float32, count=num_valid_images * template_dim)
    valid_template_data = valid_template_data.reshape(num_valid_images, template_dim)

with open(images_valid_fname, 'rb') as f:
    valid_image_data = np.fromfile(f, dtype=np.uint8, count=num_valid_images * img_size_flat).astype(np.float32)
    valid_image_data = valid_image_data.reshape(num_valid_images, img_size_flat)

# read the test file
with open(images_test_fname, 'rb') as f:
    test_image_data = np.fromfile(f, dtype=np.uint8, count=num_test_images * img_size_flat).astype(np.float32)
    test_image_data = test_image_data.reshape(num_test_images, img_size_flat)

print("Size of:")
print("- Training-set:\n\timages={}\tlabels={}".format(train_image_data.shape,
                                                        train_template_data.shape))
print("- Validation-set:\n\timages={}\tlabels={}".format(valid_image_data.shape,
                                                   valid_template_data.shape))
print("- Test-set:\n\timages={}".format(test_image_data.shape))



Size of:
- Training-set:
	images=(100000, 2304)	labels=(100000, 128)
- Validation-set:
	images=(10000, 2304)	labels=(10000, 128)
- Test-set:
	images=(10000, 2304)
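
The loading code assumes each `.bin` file is a flat sequence of raw values with no header. The round trip below illustrates that assumption on a tiny made-up array written to a temporary file; the file name `toy_images.bin` and the sizes are hypothetical.

```python
import os
import tempfile
import numpy as np

num_images, img_size_flat = 3, 4  # tiny made-up sizes
original = np.arange(num_images * img_size_flat, dtype=np.uint8)

# Write the raw bytes, with no header, to a temporary file.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "toy_images.bin")
original.tofile(path)
file_size = os.path.getsize(path)  # one byte per uint8 value

# Read it back the same way the notebook reads the image files.
with open(path, "rb") as f:
    loaded = np.fromfile(f, dtype=np.uint8,
                         count=num_images * img_size_flat)
loaded = loaded.astype(np.float32).reshape(num_images, img_size_flat)

os.remove(path)
os.rmdir(tmp_dir)
print(loaded.shape, file_size)
```

Since the format carries no shape information, the image count and the flattened size have to be supplied by the reader, as is done above with `num_train_images` and `img_size_flat`.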


### Placeholder variables

In [14]:
x = tf.placeholder(tf.float32, shape=[None, img_size_flat], name='x')
x_image = tf.reshape(x, [-1, img_size, img_size, num_channels])
y_true = tf.placeholder(tf.float32, shape=[None, template_dim], name='y_true')

### The convolutional Network
We use 3 convolutional layers (type 1), one pooling layer, 3 more convolutional layers (type 2), a second pooling layer, and finally a flatten layer and a fully connected layer.

In [None]:
summaries_dir = ""

# CONVOLUTIONAL LAYER 1

layer_conv1, weights_conv1 = \
    new_conv_layer(input=x_image,
                   num_input_channels=num_channels,
                   filter_size=filter_size1,
                   num_filters=num_filters1)
                   # use_pooling=True)


# CONVOLUTIONAL LAYER 2

layer_conv2, weights_conv2 = \
    new_conv_layer(input=layer_conv1,
                   num_input_channels=num_filters1,
                   filter_size=filter_size1,
                   num_filters=num_filters1)

# CONVOLUTIONAL LAYER 3

layer_conv3, weights_conv3 = \
    new_conv_layer(input=layer_conv2,
                   num_input_channels=num_filters1,
                   filter_size=filter_size1,
                   num_filters=num_filters1)

# POOLING LAYER

# Use pooling to down-sample the image resolution
# This is max-pooling with a 2x2 window and a stride of 4
layer_pool1 = tf.nn.max_pool(value=layer_conv3, ksize=[1, 2, 2, 1], \
                             strides=[1, 4, 4, 1], padding='SAME')

# CONVOLUTIONAL LAYER 4

layer_conv4, weights_conv4 = \
    new_conv_layer(input=layer_pool1,
                   num_input_channels=num_filters1,
                   filter_size=filter_size2,
                   num_filters=num_filters2)


# CONVOLUTIONAL LAYER 5

layer_conv5, weights_conv5 = \
    new_conv_layer(input=layer_conv4,
                   num_input_channels=num_filters2,
                   filter_size=filter_size2,
                   num_filters=num_filters2)

# CONVOLUTIONAL LAYER 6

layer_conv6, weights_conv6 = \
    new_conv_layer(input=layer_conv5,
                   num_input_channels=num_filters2,
                   filter_size=filter_size2,
                   num_filters=num_filters2)

# POOLING LAYER

# Use pooling to down-sample the image resolution
layer_pool2 = tf.nn.max_pool(value=layer_conv6, ksize=[1, 2, 2, 1], \
                             strides=[1, 2, 2, 1], padding='SAME')

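# Dropout with keep probability 0.99. Note that `drop` is not used below:
# the flatten layer reads layer_pool2 directly.
drop = tf.nn.dropout(layer_pool2, 0.99)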

# FLATTEN LAYER

layer_flat, num_features = flatten_layer(layer_pool2)

# FULLY CONNECTED LAYER 1

layer_fc1 = new_fc_layer(input=layer_flat,
                         num_inputs=num_features,
                         num_outputs=fc_size,
                         use_relu=False)
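
As a sanity check on the 50K budget, the parameter count of this network can be worked out by hand. The sketch below assumes the code exactly as listed (stride 3 and `'SAME'` padding in every convolution, pooling strides of 4 and 2); the helper names are illustrative.

```python
import math

def conv_params(filter_size, in_channels, num_filters):
    """Weights plus one bias per filter."""
    return filter_size * filter_size * in_channels * num_filters + num_filters

def fc_params(num_inputs, num_outputs):
    """Weight matrix plus one bias per output."""
    return num_inputs * num_outputs + num_outputs

def same_out(size, stride):
    """Spatial size after a padding='SAME' op with the given stride."""
    return int(math.ceil(size / float(stride)))

# Spatial size: 3 convs (stride 3), pool (stride 4),
# 3 convs (stride 3), pool (stride 2).
side = 48
for stride in (3, 3, 3, 4, 3, 3, 3, 2):
    side = same_out(side, stride)

conv_total = (conv_params(8, 1, 16) + 2 * conv_params(8, 16, 16)
              + conv_params(3, 16, 20) + 2 * conv_params(3, 20, 20))
num_features = side * side * 20          # input size of the FC layer
fc_total = fc_params(num_features, 128)
network_total = conv_total + fc_total

print(side, conv_total, fc_total, network_total)
```

Under these assumptions the network itself has 46668 parameters. Any other variable created with `trainable=True`, such as the scalar `global_step` in the learning-rate cell, is also counted by `get_total_param()`.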

### Cost function 

We use the Euclidean loss function, computed as a mean squared error so that it matches the challenge score.

In [None]:
with tf.name_scope('euclidean_loss'):
    squared_err = tf.squared_difference(layer_fc1, y_true)
    with tf.name_scope('total'):
        cost = tf.reduce_mean(squared_err)
tf.summary.scalar('train_euclidean_loss', cost)

### Optimization method

We use an exponential decay for the learning rate and an Adam Optimizer that happened to give the best results

In [None]:
with tf.name_scope('learning_rate_definition'):
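    # As written, global_step is a trainable float that is never passed to
    # minimize(), so nothing increments it and the decay below keeps the
    # learning rate at its initial value of 0.001.
    global_step = tf.Variable(0.1, trainable=True)  # trainable=False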
    decay_epoch = 300
    learning_rt = tf.train.exponential_decay(0.001, global_step,
                                                np.floor(decay_epoch * 10),
                                                0.95, staircase=True)
tf.summary.scalar('learning_rate', learning_rt)

with tf.name_scope('train'):
    train_step = tf.train.AdamOptimizer(learning_rate=learning_rt).minimize(cost)
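
With `staircase=True`, `tf.train.exponential_decay` computes `learning_rate * decay_rate ** floor(global_step / decay_steps)`. The plain-Python sketch below evaluates that formula for the values used here (0.001, 3000 steps, 0.95), assuming a step counter that increases by one per training iteration; the function name `staircase_decay` is an assumption.

```python
import math

def staircase_decay(base_lr, step, decay_steps, decay_rate):
    """Learning rate after `step` iterations with a staircase schedule."""
    return base_lr * decay_rate ** math.floor(step / float(decay_steps))

schedule = {}
for step in (0, 2999, 3000, 6000, 30000):
    schedule[step] = staircase_decay(0.001, step, 3000, 0.95)
    print(step, schedule[step])
```

Under that assumption the rate would drop by 5% every 3000 iterations, that is ten times over a 30000-iteration run.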

### Performance

On validation set

In [None]:
with tf.name_scope('validation_score'):
    score = tf.reduce_mean(tf.squared_difference(layer_fc1, y_true))
tf.summary.scalar('test_score', score)

session = tf.Session()
# Merge all the summaries
merged = tf.summary.merge_all()
# SUMMARIES
train_writer = tf.summary.FileWriter(join(summaries_dir, 'train'), session.graph)
val_writer = tf.summary.FileWriter(join(summaries_dir, 'validation'))

### Tensorflow initialization 

In [None]:
session.run(tf.global_variables_initializer())

### Training and prediction

In [3]:
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

train_batch_size = 64
# Counter for total number of iterations performed so far.
total_iterations = 0

def feed_dict(train):
    """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
    if train:
        xs, ys = next_batch(train_batch_size,
                            train_image_data,
                            train_template_data)
        # k = 0.8
    else:
        xs, ys = valid_image_data, valid_template_data
        # k = 1.0
    return {x: xs, y_true: ys}  # , keep_prob: k}


def optimize(num_iterations):
    # Ensure we update the global variable rather than a local copy.
    global total_iterations

    # Start-time used for printing time-usage below.
    start_time = time.time()

    for i in range(total_iterations,
                   total_iterations + num_iterations):

        # Every 100 iterations:
        #   Record summaries and the validation-set score
        #   Print the score (no training step is run on these iterations)
        if i % 100 == 0:
            # Calculate the score on the validation set.
            summary, sc = session.run([merged, score], feed_dict=feed_dict(False))
            val_writer.add_summary(summary, i)
            # Message for printing.
            msg = "Optimization Iteration: {0:>6}, Test Score: {1:.4%}"
            # Print it.
            print(msg.format(i + 1, sc))

        else:
            # Run the optimizer using this batch of training data.
            # feed_dict(True) maps a random training batch onto the
            # placeholder variables, and TensorFlow then runs the optimizer.
            summary, _ = session.run([merged, train_step], feed_dict=feed_dict(True))
            train_writer.add_summary(summary, i)

    # Update the total number of iterations performed.
    total_iterations += num_iterations

    # Ending time.
    end_time = time.time()

    # Difference between start and end-times.
    time_dif = end_time - start_time

    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))


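print("Number of learnable parameters: ", get_total_param())
try:
    optimize(num_iterations=30000)
    val_pred = cnn_predict(valid_image_data, valid_template_data)
    test_pred = cnn_predict(test_image_data)

    # Write the predicted templates as raw float32 values, one row at a time.
    with open('template_pred.bin', 'wb') as f:
        for i in range(num_test_images):
            f.write(test_pred[i, :])

except KeyboardInterrupt:
    # Stop training early; the session stays open so that the
    # model can still be saved in the next cell.
    print("Training interrupted.")

finally:
    # The `with` block already closed the file; close each writer once.
    train_writer.close()
    val_writer.close()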


Number of learnable parameters:  46669
Optimization Iteration:      1, Test Score: 1.0802%
Optimization Iteration:    101, Test Score: 0.7755%
Optimization Iteration:    201, Test Score: 0.7751%
Optimization Iteration:    301, Test Score: 0.7756%
Optimization Iteration:    401, Test Score: 0.7737%
Optimization Iteration:    501, Test Score: 0.7734%
Optimization Iteration:    601, Test Score: 0.7730%
Optimization Iteration:    701, Test Score: 0.7720%
Optimization Iteration:    801, Test Score: 0.7717%
Optimization Iteration:    901, Test Score: 0.7722%
Optimization Iteration:   1001, Test Score: 0.7706%
Optimization Iteration:   1101, Test Score: 0.7708%
Optimization Iteration:   1201, Test Score: 0.7713%
Optimization Iteration:   1301, Test Score: 0.7697%
Optimization Iteration:   1401, Test Score: 0.7710%
Optimization Itera

In [4]:
# Save the variables to disk.
save_path = saver.save(session, join(summaries_dir, "tf_model", "cnn_ckpt"))
print("Model saved in file: %s" % save_path)
session.close()

Model saved in file: tf_model\cnn_ckpt
