#**Face Recognition Using Convolutional Neural Network**

- --------------

##Work FLow
- -----
###I. Import Libiaries 
You can run this notebook on databricks, after create libraries: *keras*, *opencv-python*.
Or, you can run this notebook locally. Anaconda is a good choice.

In [3]:
import os
import sys
import numpy as np
import cv2
import random
import numpy as np
import keras
from sklearn.cross_validation import train_test_split
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.models import load_model
from keras import optimizers
from keras import backend as K
import h5py 
from keras.models import model_from_json 
import matplotlib.pyplot as plt
import pandas as pd

###II. Image Processing
The input shape of a CNN has rule, so before build our model, we need to process image dataset. We want to read graphs one by one, normalization, and add different labels to each of them according to the person shows on it, like ‘0’ for A and ‘1’ for B. 

We use OpenCV, which has an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on. to adjust our images.

![image processing](https://raw.githubusercontent.com/JunxiFan/Big-Data-Systems-and-Intelligence-Analytics/master/image_portfolio/image_processing.jpg)

When we read an image, suppose we have x pixel wide and y height image. OpenCV can read an image file and coverts it to a list (width × height × 3, 3 means three channels of colors as red, green, blue). To normalize the input data shape of our neural network model, for one images, we compare the width and height, and add borders to the shorter sides. At last, we resized the picture to 32 × 32 pixels format.

In [6]:
IMAGE_SIZE = 64

def resize_image(image, height_output = IMAGE_SIZE, width_output = IMAGE_SIZE):
     # initialize border's value. default value of four borders is 0
    top, bottom, left, right = (0, 0, 0, 0)
    
    #get image's size
    width, height, _ = image.shape
    
    #find the longer side of the image
    longer_side = max(width, height)    
    
    #calculate how many we should add to the shorter side
    if width < longer_side:
        dh = longer_side - width
        top = dh // 2
        bottom = dh - top
    elif height < longer_side:
        dw = longer_side - height
        left = dw // 2
        right = dw - left
    else:
        pass 
    
    #RGB
    BLACK = [0, 0, 0]

    # add border to make two sides the same. "cv2.BORDER_CONSTANT " is the color of border, configured by "value"
    constant = cv2.copyMakeBorder(image, top , bottom, left, right, cv2.BORDER_CONSTANT, value = BLACK)
    
    return cv2.resize(constant, (height_output, width_output))

How to read hundreds of images with labels into our project is the next problem. Now we have many graphs of two people, for example, Johnny Depp and Natalie Portman, separated in two folders.  After doing this classification, we can get a list of images and a list of related labels.

In [8]:
#read data into ram
images = []
labels = []
def read_path(path_url):    
    for dir_item in os.listdir(path_url):
        full_path = os.path.abspath(os.path.join(path_url, dir_item))
        
        if os.path.isdir(full_path):    #if it is a folder, continue. (here a recursion is used)
            read_path(full_path)
        else: 
            if dir_item.endswith('.jpg'):
                image = cv2.imread(full_path)                
                image = resize_image(image, IMAGE_SIZE, IMAGE_SIZE)
                images.append(image)                
                labels.append(path_url)                                
                    
    return images,labels
    
#main function, read data
def load_dataset(path_url):
    images,labels = read_path(path_url)
  
    #change images to 4-dimensions array
    images = np.array(images)
#     print(images.shape,"total ")
    
    labels = np.array([0 if label.endswith('johnny_depp') else 1 for label in labels])    
    return images, labels

We will do cross validation after training, so we split our dataset into three parts: Training, validation, and test sets. 

Also, as it is a project about recognition, it’s necessary to do one-hot encoding to the labels. Label sets' shape will be a 2 dimention list, depends on the value of *nb_classes*. 

Next, change data type to float32 and normalize the value of RGB between 0-1, to improve network convergence speed, reduce training time, and reduce the value of training error.

For now, the preparation function before building CNN is complete.

In [10]:
class Dataset:
    def __init__(self, path_url):
        #train data
        self.train_images = None
        self.train_labels = None
        
        #validate data
        self.valid_images = None
        self.valid_labels = None
        
        #test data
        self.test_images  = None            
        self.test_labels  = None
        
        #path of dataset
        self.path_url = path_url
        
        #tensorflow (channels,rows,cols) or theano (rows,cols,channels)
        self.input_shape = None
        
    # load dataset, then seperate dataset according to cross-validation
    def load(self, nb_classes = 2):
        #load dataset to ram
        images, labels = load_dataset(self.path_url)        
        
        # split into three sets randomly
        train_images, valid_images, train_labels, valid_labels = train_test_split(images, labels, test_size = 0.3, random_state = random.randint(0, 100))

        _, test_images, _, test_labels = train_test_split(images, labels, test_size = 0.5, random_state = random.randint(0, 100))

        self.input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3)            
            
        #the amount of three data parts
        print(train_images.shape[0], 'train samples')
        print(valid_images.shape[0], 'valid samples')
        print(test_images.shape[0], 'test samples')
        
        # label ===(one-hot encoding)===> 2 dimensions data
        train_labels = np_utils.to_categorical(train_labels, nb_classes)                        
        valid_labels = np_utils.to_categorical(valid_labels, nb_classes)            
        test_labels = np_utils.to_categorical(test_labels, nb_classes)
        
        #image to float
        train_images = train_images.astype('float32')            
        valid_images = valid_images.astype('float32')
        test_images = test_images.astype('float32')
            
        # normalize the value of RGB between 0-1
        train_images /= 255
        valid_images /= 255
        test_images /= 255            
        
        self.train_images = train_images
        self.valid_images = valid_images
        self.test_images  = test_images
        self.train_labels = train_labels
        self.valid_labels = valid_labels
        self.test_labels  = test_labels

Next is load data from storage system. We have uploaded data source on DBFS (1062 jpg of Johnny Depp, 840 jpg of natalie portman), and read them here.

After creating a new dataset object, including train_images, valid_images, test_images, train_labels, valid_labels, test_labels.

In [12]:
dataset = Dataset('/dbfs/FileStore/tables/data/')    
dataset.load()

##III. Build CNN model
Keras provides plenty of API for building neural network model. We can build a sequential convolutional neural network easily. For more details you can read [Keras official document layers part](https://keras.io/layers/about-keras-layers/).

In [14]:
def build_model(dataset, nb_classes = 2):
    model = Sequential() 

    model.add(Conv2D(32, (3, 3), border_mode='same', 
                                 input_shape = dataset.input_shape))

    model.add(Activation('relu'))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))        
    # droupout: present from Overfitting
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.5))
    model.add(Dense(nb_classes))
    model.add(Activation('softmax'))
    model.summary()
    
    return model

here is an introduce of each kinds of layers in the CNN.

###Convolution layer:
The emphasis here is on the function: Conv2D(). According to the Keras official document, 2D represents a 2D convolution, whose function is to perform sliding window convolution calculation on 2D input. Our facial image size is 64 x 64 pixels, which only contain length and width, so we are using a 2D convolution function to calculate the convolution. The sliding window calculation represents uses the convolution kernel to calculate pixels one by one in order.

![conv2d img](https://raw.githubusercontent.com/JunxiFan/Big-Data-Systems-and-Intelligence-Analytics/master/image_portfolio/conv_layer.jpg)
 
First, we will focus the convolution kernel on the first pixel of the image, here the pixel with a pixel value of 237. The area covered by the convolution kernel, and all pixels below it are averaged and then added together: 

_C(1) = 0 * 0.5 + 0 * 0.5 + 0 * 0.5 + 0 * 0.5 + 237 * 0.5 + 203 * 0.5 + 0 * 0.5 + 123 * 0.5 + 112 * 0.5_

Then replace the first pixel in the image, then calculate the second, third…until we get a same size but convoluted image. As to the edge pixels, we fill over edge part with 0. We can use this setting in Conv2D() by code:

_model.add(Conv2D(64, (3, 3), border_mode='same', input_shape = dataset.input_shape))_

We also need to tell Keras the data we input, which is 64*64 in RGB color, code: input_shape(64, 64, 3)

###Activate function layer:
Relu (Rectified Linear Units) function, the input is less than 0, the output is all 0, greater than 0 is equal to the input and output. The advantage of this function is its fast convergence. The keras library also supports several other activation functions: Softplus, Softsign, Tanh, Sigmoid, Hard_sigmoid, Linear. We tried all of this activation functions and decide to use Relu.

###Max Pooling Layer:
The purpose of the pooling layer is to reduce the input feature map, simplify network computational complexity, and simultaneously compress features, highlighting key features.

 ![conv2d img](https://raw.githubusercontent.com/JunxiFan/Big-Data-Systems-and-Intelligence-Analytics/master/image_portfolio/MaxPooling_layer.jpg)
 
We establish the pooling layer by calling the MaxPooling2D() function. This function uses the maximum pooling method. This method selects the maximum value of the coverage area as the main feature of the area to compose a new reduced feature map. Thus, we will get a 32*32-pixel image after pooling.

###Dropout Layer:
The Dropout layer randomly disconnects a certain percentage of input neuron links and consciously reduces the model parameters, making the model simple to prevent overfitting. Dropout() function use float parameter from 0-1 to define drop out percentage.

###Flatten layer:
After many times of convolution, pooling, and Dropout, here you can enter the full connection layer for final processing. The fully connected layer requires that the input data must be one-dimensional, so we must “squash” the input data into one dimension.

###Dense layer:
The role here is for classification or regression, which is classification. We define dense layer through the Dense() function. One of the required parameters of this function is the number of neurons, which it is to specify how many outputs the layer has. In our code, the first dense layer specifies 512 neurons, that is, retains 512 features output to the next layer.

###Classification layer: 
The goal of the dense layer is to complete our classification requirements: 0 or 1.
self.model.add(Dense(nb_classes))
self.model.add(Activation('softmax'))
In the first row of code, we define classification requirement number of neurons, which is 2 for us. And in next layer, we use Softmax() to finish final classification. From the perspective of classification, the greater the output value of the neuron, the greater the likelihood that its corresponding category is a real category. Therefore, after the Softmax(), the upper N inputs are mapped to N probability distributions, and the sum of the probabilities is 1. The highest probability is the model predicted by the model.

Our structure of CNN model:

In [17]:
MODEL = build_model(dataset)

##IV.	Trainning
Next is let the dataset we have prepared to train this model and do cross validations. We set the cost function as ‘categorical_crossentropy’, a typical function dealing with categorical project. The optimizer is SGD (Stochastic gradient descent optimizer), at the same time, Keras includes support for momentum, learning rate decay, and Nesterov momentum, which means we can take multiple advantages from them. Using SGD only, the decent direction depends on batch data completely, so add a momentum can keep original decent direction in some degree, make the optimizer works faster, but more stable. We set the epoch as 100, and at last we get a graph using spark DataFrame.

In [19]:
def train(dataset, model, batch_size = 40, nb_epoch = 100):
    # SGD: a  compiler; lr: learning rate
    sgd = SGD(lr = 0.001, decay = 1e-6, momentum = 0.9, nesterov = True) 
    
    model.compile(loss='categorical_crossentropy',
                  optimizer=sgd,
                  metrics=['accuracy'])   

    history = model.fit(dataset.train_images, dataset.train_labels,
                   batch_size = batch_size,
                   nb_epoch = nb_epoch,
                   validation_data = (dataset.valid_images, dataset.valid_labels),
                   shuffle = True)

    res = history.history
    return res

In [20]:
res = train(dataset, MODEL)

In [21]:
i=1
temp_epoch = []

for each in range(len(res['acc'])):
#   print(i)
  temp_epoch.append(i)
  i+=1
res['epoch'] = temp_epoch

df = pd.DataFrame(res)
spark_df = spark.createDataFrame(df)
display(spark_df)

In [22]:
display(spark_df)

##Test
- ---
Before evaluate our model, we need to realize the standards, or, conditions of face recognition research area. As we know there are many out-standing solution of face recognition, like Fisher Vector Faces, DeepFace, Fusion, FaceNet, etc. 

![conv2d img](https://raw.githubusercontent.com/JunxiFan/Big-Data-Systems-and-Intelligence-Analytics/master/image_portfolio/faceRec_method_acc1.jpg)

Using test dataset we generate before to evaluate our training result. Keras provides an evaluate function, use trained model and test set as parameters, it will return the value of loss and accuracy.

In [25]:
def evaluate(model, dataset):
    score = MODEL.evaluate(dataset.test_images, dataset.test_labels, verbose = 1)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])
#         print(type(score))
    return score


In [26]:
evaluate(MODEL,dataset)

The result accuracy of test set is 0.91. Considering the amount of samples and the time we cost for training, this result is not bad..

##References

1. [http://cis.csuohio.edu/~sschung/CIS660/DeepFaceRecognition_parkhi15.pdf]

2. [https://www.microsoft.com/en-us/research/project/msra-cfw-data-set-of-celebrity-faces-on-the-web/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fprojects%2Fmsra-cfw%2F]

3.  [https://www.wikiwand.com/en/Databricks]

4.  [https://docs.opencv.org/3.4.1/d1/dfb/intro.html]

##License

####The text in the document by &lt;JUNXI FAN, KAIXIN GAO&gt; is licensed under CC BY 3.0 [https://creativecommons.org/licenses/by/3.0/us/]

THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.

BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS.

####The code in the document by &lt;JUNXI FAN, KAIXIN GAO&gt; is licensed under the MIT License [https://opensource.org/licenses/MIT]

Copyright &lt;2018&gt; &lt;JUNXI FAN, KAIXIN GAO&gt;

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.