# Face Recognition using Artificial Neural Network with PyBrain Library

This tutorial introduces the basic concepts of Artificial Neural Networks and uses a Multilayer Feedforward Network for face recognition. 
The goal is to train a Multilayer Feedforward Network on a faces dataset to recognize the "pose" of the person in each image. The network takes the grayscale value of each pixel in the image as input and outputs the direction in which the person is looking. 

In [1]:
import numpy as np
import pandas as pd
import glob
import random

## 1. Artificial Neural Network
An Artificial Neural Network (ANN) is a computational model inspired by how biological neural networks work. It is used to approximate a function after learning from a training set. The backpropagation algorithm is often used to adjust the network parameters so that they best fit the input and output training pairs. ANNs can model complicated, non-linear relationships and are fairly robust to noisy input. They are widely used in areas such as machine vision and speech recognition.  

### 1.1. Multilayer Feedforward Network
A Multilayer Feedforward Network is a typical kind of ANN, and it is the one used for face recognition in this tutorial. The network consists of several layers. The i-th layer accepts only the output of the (i-1)-th layer as its input, and there is no feedback between any neurons in the network. A Multilayer Feedforward Network usually has the following three kinds of layers: 
- **Input Layer**, where neurons accept the input vectors.
- **Output Layer**, where output vectors are formed. 
- **Hidden Layer**, which lies between the input and output layers. Adding hidden layers or hidden neurons lets the network model more complex functions, but it also makes training slower and overfitting more likely. 

<img src="Multilayer Feedforward Network Layers.jpg">
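
To make the layer-by-layer flow concrete, here is a minimal sketch of a forward pass in plain Python. It is not how PyBrain is implemented; the sigmoid activation, the function names and the toy weights are all assumptions chosen for illustration.

```python
import math

def sigmoid(x):
    """Logistic activation, squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute one layer's outputs: sigmoid(w . inputs + b) for each neuron."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = bias + sum(w * x for w, x in zip(neuron_weights, inputs))
        outputs.append(sigmoid(total))
    return outputs

def feedforward(inputs, layers):
    """Pass the inputs through each (weights, biases) layer in order."""
    activation = inputs
    for weights, biases in layers:
        # Layer i only ever sees the output of layer i-1.
        activation = layer_forward(activation, weights, biases)
    return activation

# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
toy_layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                              # output layer
]
toy_output = feedforward([1.0, 0.0, 1.0], toy_layers)
print(toy_output)
```

Each layer is just a list of per-neuron weight lists plus a list of biases, so adding a hidden layer means adding one more entry to `toy_layers`.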

## 2. Faces Data Set

### 2.1. Face Images
The dataset contains images of 20 different people with different poses and facial expressions, with or without sunglasses, and at different image resolutions. There are 20 folders in the faces directory, each containing several images of one person. The names of the .pgm image files follow the pattern below, which provides all the information we need: 

&lt;userid&gt;\_&lt;pose&gt;\_&lt;expression&gt;\_&lt;eyes&gt;[\_&lt;scale&gt;].pgm

| **Attribute** | **Explanation**             | 
|----------|-------------|
| userid      | The unique id of the person in the image; has 20 values since there are images of 20 persons in this data set |
| pose | Head position; 4 values: straight, left, right, up |
| expression | Facial expression; 4 values: neutral, happy, sad, angry |
| eyes | Wearing sunglasses or not; 2 values: open, sunglasses |
| scale | Scale of the image file; if not specified, the file has a resolution of 128x120. A value of 2 means the resolution is 64x60, and 4 means 32x30. |

(In this tutorial, we focus on the 32x30 .pgm files and the "pose" attribute.)
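
As an illustration of the naming pattern, the sketch below splits a filename into the attributes from the table. The helper `parse_face_filename` and the example filename are hypothetical, and the sketch assumes that a missing scale suffix means full resolution (recorded here as scale 1).

```python
import os

def parse_face_filename(filepath):
    """Split a face image filename into its attributes (hypothetical helper)."""
    name = os.path.basename(filepath)
    stem, _ext = os.path.splitext(name)
    parts = stem.split('_')
    return {
        'userid': parts[0],
        'pose': parts[1],
        'expression': parts[2],
        'eyes': parts[3],
        # The scale suffix is optional; we record full resolution as 1.
        'scale': int(parts[4]) if len(parts) > 4 else 1,
    }

# Made-up filename that follows the pattern; "someuser" is not a real userid.
example = parse_face_filename('./faces/someuser/someuser_left_happy_open_4.pgm')
print('{0} {1}'.format(example['pose'], example['scale']))
```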

### 2.2. Helper Functions

We will use several helper functions to handle the image files, such as extracting pixels (features) from an image, or generating the "pose" (target) attribute from the filename.

- First, we need a function that extracts the grayscale value of each pixel in the image, to be used as the network inputs. The OpenCV library handles this for us. 

In [2]:
import cv2

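def extractGrayScale(filepath):
    """
    Extract the normalized grayscale values of a given image.
    
    Args:
    filepath(str): path of the image file.
    
    Return:
    (list): list of floats, each between 0 and 1, giving the grayscale value of one pixel in the image
    """
    img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
    # cv2.imread returns None instead of raising when the file cannot be read
    if img is None:
        raise IOError("Cannot read image: " + filepath)
    # flatten the 2D pixel array row by row, then scale 0-255 down to 0-1
    grayscale = [itm for row in img for itm in row]
    normalized = [float(x)/255 for x in grayscale]
    return normalized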

- Second, we need to generate the input features, that is, the grayscale values of every image in a given file list.

In [3]:
def genFeature(files):
    """
    Generate input features from a given list of image files
    
    Args:
    files(list): list of filenames(str)
    
    Return:
    (list): 2D list, each sub-list has length 32x30=960 and stores the normalized grayscale values of one image. 
    """
    table = []
    for f in files:
        table.append(extractGrayScale(f))
    return table

- Third, we will generate targets. 

In [4]:
def genTarget(files):
    """
    Generating targets from a given image list
    
    Args:
    files(list): list of filenames(str)
    
    Return:
    (list): list of strings, each indicating the "pose" of one image
    """
    target = []
    for f in files:
        features = f.split('/')[-1].split('_')
        target.append(features[1])
    
    return target

## 3. Train and test with PyBrain library

Since we are recognizing the "pose" from a 32x30 pixel image, the network should have 32x30=960 input units and 4 output units. The Feedforward Network and Classification Dataset in the PyBrain library will help with this problem. 

### 3.1. Introduction to PyBrain

PyBrain is a machine learning library for Python. Its name is short for Python-Based Reinforcement Learning, Artificial Intelligence and Neural Network Library. As the name suggests, it contains most of the common algorithms for neural networks, for reinforcement learning (and the combination of the two), for unsupervised learning, and for evolution. It provides multiple network modules and also lets users define custom networks. It also has different kinds of datasets, such as supervised, sequential, classification and importance datasets. 

### 3.2. Classification Dataset
The classification dataset is designed for classification problems, so it is the one we use in this tutorial. 

- First import the dataset module

In [5]:
from pybrain.datasets import ClassificationDataSet

- Define some variables

In [6]:
files = glob.glob("./faces/**/*_4.pgm")

n_pixels = 960
n_input = n_pixels
n_output = 4
n_files = len(files)

category = ["straight", "left", "right", "up"]

proportion = .75

- Generate two lists: one for feature and one for target

In [7]:
featurelist = genFeature(files)
targetlist = genTarget(files)

- Create a Classification DataSet that has 960 inputs and 4 classes, since there are 4 "poses". 
- Add samples to the dataset. 

In [8]:
ds = ClassificationDataSet(n_input, nb_classes=n_output)
for i in xrange(n_files):
    ds.addSample(featurelist[i], [category.index(targetlist[i])])

- Split the dataset into two parts: the larger part as the training set and the smaller part as the testing set.

In [41]:
trainDS, testDS = ds.splitWithProportion(proportion)
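
To show roughly what a proportional split does, here is a minimal sketch on a plain list. It is not PyBrain's implementation; the function name `split_with_proportion` and the `seed` parameter are assumptions added so the example is repeatable.

```python
import random

def split_with_proportion(samples, proportion, seed=None):
    """Shuffle the samples and cut them into two lists at the given proportion."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # The first `proportion` of the shuffled samples go to the first part.
    cut = int(len(shuffled) * proportion)
    return shuffled[:cut], shuffled[cut:]

train_part, test_part = split_with_proportion(range(100), 0.75, seed=0)
print('{0} training samples, {1} testing samples'.format(len(train_part), len(test_part)))
```

Shuffling before cutting matters here because the file list is grouped by person, so an unshuffled split would put some people only in the testing set.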

- The following code only works around a bug in PyBrain: splitWithProportion() should return datasets of the same type as the dataset it is called on, but it always returns SupervisedDataSet objects. The code below converts the returned datasets back to ClassificationDataSet.

In [42]:
# refer to - http://stackoverflow.com/questions/27887936/attributeerror-using-pybrain-splitwithportion-object-type-changed
trainDS_temp = trainDS
trainDS = ClassificationDataSet(n_input, nb_classes=n_output)
for n in xrange(0, trainDS_temp.getLength()):
    trainDS.addSample( trainDS_temp.getSample(n)[0], trainDS_temp.getSample(n)[1] )

testDS_temp = testDS
testDS = ClassificationDataSet(n_input, nb_classes=n_output)
for n in xrange(0, testDS_temp.getLength()):
    testDS.addSample( testDS_temp.getSample(n)[0], testDS_temp.getSample(n)[1] )

- Convert the target classes to a 1-of-k representation (k is 4 in our case), retaining the old targets in a field named class. This step is needed because the network has one output unit per class. 

In [11]:
trainDS._convertToOneOfMany( )
testDS._convertToOneOfMany( )
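
The 1-of-k representation itself is simple to write down. The sketch below shows the idea on plain lists; `to_one_of_many` is a hypothetical helper for illustration, not the PyBrain method.

```python
def to_one_of_many(class_indices, n_classes):
    """Turn each class index into a list with a single 1 at that index."""
    encoded = []
    for idx in class_indices:
        row = [0] * n_classes
        row[idx] = 1
        encoded.append(row)
    return encoded

poses = ["straight", "left", "right", "up"]
labels = [poses.index(p) for p in ["left", "up", "straight"]]
one_hot = to_one_of_many(labels, len(poses))
print(one_hot)  # [[0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
```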

- Check the input, target and class fields of the training and testing set.

In [12]:
print "training set", trainDS
print "testing set", testDS

### 3.3. Create a Network

- Import all the modules needed. 

In [13]:
from pybrain.utilities import percentError
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import SoftmaxLayer

- Create a network with 960 input units, 4 output units, and a hidden layer with 6 units. 

There are multiple ways to do this. If we want a standard network, we can do:

In [14]:
n_hidden = 6
net = buildNetwork( n_input, n_hidden, n_output, outclass=SoftmaxLayer )

Alternatively, we can define a custom network with modules and connections:

In [15]:
from pybrain.structure import FeedForwardNetwork
from pybrain.structure import LinearLayer, SigmoidLayer, SoftmaxLayer
from pybrain.structure import FullConnection

net = FeedForwardNetwork()

inLayer = LinearLayer(n_input)
hiddenLayer = SigmoidLayer(n_hidden)
outLayer = SoftmaxLayer(n_output)

net.addInputModule(inLayer)
net.addModule(hiddenLayer)
net.addOutputModule(outLayer)

in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

net.addConnection(in_to_hidden)
net.addConnection(hidden_to_out)

net.sortModules()

- We can check the structure of the network by:

In [16]:
print net

FeedForwardNetwork-15
   Modules:
    [<LinearLayer 'LinearLayer-12'>, <SigmoidLayer 'SigmoidLayer-16'>, <SoftmaxLayer 'SoftmaxLayer-17'>]
   Connections:
    [<FullConnection 'FullConnection-13': 'LinearLayer-12' -> 'SigmoidLayer-16'>, <FullConnection 'FullConnection-14': 'SigmoidLayer-16' -> 'SoftmaxLayer-17'>]
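
As a quick sanity check on the size of this network, we can count the trainable weights by hand. This is a minimal sketch with a hypothetical helper, `count_weights`. It assumes fully connected layers and, when `with_bias` is set, one bias weight per hidden and output neuron (to our knowledge `buildNetwork` adds bias units by default, while the custom network above has none).

```python
def count_weights(layer_sizes, with_bias=False):
    """Count the weights in a fully connected feedforward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection
        if with_bias:
            total += n_out         # one bias weight per receiving neuron
    return total

sizes = [960, 6, 4]
print(count_weights(sizes))                  # 5784
print(count_weights(sizes, with_bias=True))  # 5794
```

Almost all of the weights sit between the input and the hidden layer, so the number of hidden units is the main knob for the model size.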



### 3.4. Training and Testing

- Import modules

In [17]:
from pybrain.supervised.trainers import BackpropTrainer

- Create a trainer with the training dataset, with 0.003 momentum, 0.003 learning rate, and 0.01 weight decay. 

In [18]:
trainer = BackpropTrainer( net, dataset=trainDS, momentum=0.003, learningrate=0.003 , verbose=False, weightdecay=0.01)
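
To give a feeling for what the three numeric arguments control, here is a sketch of one textbook gradient-descent step with momentum and L2 weight decay for a single weight. This is a common formulation and an assumption on our part; PyBrain's internal update may differ in its details. The function `sgd_step` and the example gradient are hypothetical.

```python
def sgd_step(weight, gradient, velocity, learningrate, momentum, weightdecay):
    """One gradient-descent step with momentum and weight decay for a single weight."""
    # Weight decay adds a pull toward zero proportional to the weight itself.
    decayed_gradient = gradient + weightdecay * weight
    # Momentum keeps a fraction of the previous step.
    velocity = momentum * velocity - learningrate * decayed_gradient
    return weight + velocity, velocity

w, v = 1.0, 0.0
w, v = sgd_step(w, gradient=2.0, velocity=v,
                learningrate=0.003, momentum=0.003, weightdecay=0.01)
print('{0:.5f} {1:.5f}'.format(w, v))
```

With a small learning rate each step moves the weight only slightly, which is why many epochs are needed before the error settles.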

- Train the network, and store the percent error on the training and testing datasets after every epoch.

In [19]:
n_epoch = 300
errs_train = []
errs_test = []
for i in xrange(n_epoch):
    trainer.train()
    err_train = percentError(trainer.testOnClassData(dataset=trainDS), trainDS['class'])
    err_test = percentError(trainer.testOnClassData(dataset=testDS), testDS['class'])
    
    errs_train.append(err_train)
    errs_test.append(err_test)
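
The error measure used in the loop can be sketched in a few lines: pick the class with the largest output for each image, then count the mismatches. The helpers `argmax` and `percent_error` below are hypothetical stand-ins written for illustration, and the output values are made up.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def percent_error(predicted, actual):
    """Percentage of positions where the predicted class differs from the actual one."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return 100.0 * wrong / len(actual)

# Made-up network outputs for three images and their true class indices.
outputs = [[0.1, 0.7, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]]
true_classes = [1, 0, 3]
predicted_classes = [argmax(o) for o in outputs]
error = percent_error(predicted_classes, true_classes)
print('{0:.2f}%'.format(error))
```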

- Print out the training error and the testing error every 20 epochs.

In [None]:
print "epoch    training err  testing err"
for i in range(0, n_epoch, 20):
    print '{0:4} {1:13.2f}% {2:12.2f}%'.format(i, errs_train[i], errs_test[i])

It should print a table similar to the following (the exact numbers vary with the random split and the initial weights):
<img src="table1.png">

## 4. Visualization

We use the matplotlib library to visualize the error percentages and get a better understanding of the training process. 

In [None]:
%matplotlib inline
import matplotlib
matplotlib.use("svg")
from matplotlib import pyplot as plt
# plt.style.use('ggplot')
matplotlib.rcParams['figure.figsize'] = (10.0, 5.0)

In [None]:
plt.plot(range(n_epoch), errs_train, label="Training Errors")
plt.plot(range(n_epoch), errs_test, label="Testing Errors")
plt.xlabel("Epochs")
plt.ylabel("Error percentage(%)")
plt.legend()
plt.gca().yaxis.grid(True)
plt.show()

<img src="graph1.png">
As we can see from the graph above, both error percentages are about 75% at epoch 0. There are 4 kinds of "poses", so a randomly initialized network guesses correctly only about 25% of the time. 

Then both the training and testing errors decrease rapidly at the very beginning. After about 120 epochs, the testing error levels off at roughly 10%. 
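
The 75% starting point can be checked with a short simulation. This sketch assumes that an untrained network behaves like a uniform random guesser over the 4 poses, which is only an approximation; `random_guess_error` is a hypothetical helper.

```python
import random

def random_guess_error(n_classes, n_samples, seed=0):
    """Percent error when every label is guessed uniformly at random."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_samples):
        truth = rng.randrange(n_classes)
        guess = rng.randrange(n_classes)
        if guess != truth:
            wrong += 1
    return 100.0 * wrong / n_samples

expected = 100.0 * (1 - 1.0 / 4)          # analytic value for 4 classes
simulated = random_guess_error(4, 100000)  # should land close to the analytic value
print('expected {0:.1f}%, simulated {1:.1f}%'.format(expected, simulated))
```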

In [None]:
for lr in np.arange(0.001, 0.01, 0.002):
    net = buildNetwork( n_input, n_hidden, n_output, outclass=SoftmaxLayer )
    trainer = BackpropTrainer( net, dataset=trainDS, momentum=0.003, learningrate=lr , verbose=False, weightdecay=0.01)
    
    errs_test = []
    for i in xrange(n_epoch):
        trainer.train()
        err_test = percentError(trainer.testOnClassData(dataset=testDS), testDS['class'])
        errs_test.append(err_test)
        
    plt.plot(range(n_epoch), errs_test, label="Learning Rate = " + str(lr))

plt.xlabel("Epochs")
plt.ylabel("Test Error percentage(%)")
plt.legend()
plt.gca().yaxis.grid(True)
plt.show()

<img src="graph2.png">
From the result, we can conclude that as the learning rate increases, fewer epochs are needed to reach the 10% error percentage. However, a learning rate of 0.009 no longer improves the situation: convergence becomes slower again because 0.009 is too high a learning rate. 

For this dataset, a learning rate between 0.003 and 0.007 seems to be good. 
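
Rather than reading the convergence speed off the plot, we could compute it. The sketch below finds the first epoch at which an error curve reaches a threshold. The helper `epochs_to_reach` is hypothetical, and the two short curves are made-up numbers for illustration, not results measured in this tutorial.

```python
def epochs_to_reach(errors, threshold):
    """Return the first epoch whose error is at or below threshold, or None."""
    for epoch, err in enumerate(errors):
        if err <= threshold:
            return epoch
    return None

# Made-up error curves keyed by learning rate, NOT measured results.
curves = {
    0.001: [75.0, 60.0, 40.0, 25.0, 15.0, 12.0],
    0.005: [75.0, 35.0, 12.0, 10.0, 9.5, 9.0],
}
for rate in sorted(curves):
    print('lr={0}: {1}'.format(rate, epochs_to_reach(curves[rate], 10.0)))
```

In the learning-rate loop above, the same function could be applied to each `errs_test` list before plotting it.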

You can also try a different number of hidden layer neurons, a different layer type, or other arguments, and compare which configuration achieves better output accuracy. 