#Face Recognition
Face Recognition is a form of biometrics, the science that deals with the automated recognition of individuals based on biological and behavioral characteristics. It belongs to physiological (or static) biometrics, which rely on data derived from measuring a part of a person's anatomy.
The main purpose is the authentication of the subject, which follows identification and consists of proving the previously declared identity.
We will later discuss the advantages and disadvantages of this mechanism by analysing the requirements for an ideal biometric identifier:
1. Universality
2. Uniqueness
3. Performance
4. Collectability
5. Acceptability

Face Recognition, or Face Identification, is the following task: given a picture of the face of an unknown person, identify the person's name by referring to a gallery of previously seen pictures of identified persons.
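
The gallery lookup described above can be sketched as a nearest-neighbour search. This is only an illustration, not the method used later in this notebook: it assumes every face has already been turned into a numeric feature vector, and the names `identify`, `euclidean` and the three-dimensional vectors are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, gallery):
    """Return the name whose gallery vector is closest to the probe.

    `gallery` is a list of (name, feature_vector) pairs.
    """
    best_name, _ = min(gallery, key=lambda item: euclidean(probe, item[1]))
    return best_name

# Hypothetical 3-dimensional feature vectors, for illustration only.
gallery = [
    ("person_a", [0.9, 0.1, 0.0]),
    ("person_b", [0.1, 0.8, 0.3]),
    ("person_c", [0.0, 0.2, 0.9]),
]
print(identify([0.2, 0.7, 0.2], gallery))  # person_b
```

In a real system the feature vectors would come from a trained model rather than being written by hand.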


###About the dataset
Labeled Faces in the Wild (LFW) is an image dataset containing face photographs, collected especially for studying the problem of unconstrained face recognition. It includes over 13,000 images of faces collected from across the web. Here are key aspects of these images:

* Each face in this dataset is labeled with the name of the person pictured.
* 1680 of the photographed persons appear in two or more distinct photos in the dataset.
* The faces in these images were detected by the Viola-Jones face detector (Paul Viola and Michael Jones, 2001).

LFW includes four different sets of images, including the original and three types of aligned images that can be used to test algorithms under different conditions. For alignment, the dataset uses funneled images (ICCV 2007), LFW-a, and deep funneled images (NIPS 2012). Deep funneled and LFW-a images produce superior results for most face verification algorithms over the funneled images and the original images.

Face Recognition is typically performed on the output of a model trained for Face Detection. A widely used classical detector is Viola-Jones, which is implemented in the OpenCV library. The LFW faces were extracted by this detector from images collected on the web.

###Purpose
Our first purpose is to train an image classifier on a dataset of well-known faces of famous people. We will then make the neural network recognize further faces by extending the dataset with new measurements linked to the proper identity labels. The final objective is to build an app that recognizes human faces after collecting data.

#Schedule
1. Exploratory Data Analysis 
2. Define the relevant metrics to be used
3. Train a first baseline algorithm as a reference
4. Prepare data where needed
5. Design experiments and define hyperparameters 
6. Repeat until performance on the test set is acceptable

###Exploratory Data Analysis 
* Regression or Classification?
We are obviously dealing with a classification problem: each face must be assigned to its identity label.

* What is the target variable?
The target variable of a dataset is the feature of a dataset about which you want to gain a deeper understanding. A supervised machine learning algorithm uses historical data to learn patterns and uncover relationships between other features of your dataset and the target.
The targets are labeled in the dataset and consist of the identities associated with the famous faces.

* Is the data unbalanced?
* What are the features? Correlation, ranges, variances, NaN, errors...
* Plot to make findings clearer
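
For the question of whether the data is unbalanced, one possible check is to count how many images each identity has. The sketch below is a minimal, framework-free example; the function name `imbalance_summary` and the toy labels are assumptions made for illustration, and in practice `labels` would be the list of identity labels read from the dataset.

```python
from collections import Counter

def imbalance_summary(labels):
    """Summarize how many images each identity has."""
    counts = Counter(labels)
    most_common_name, max_count = counts.most_common(1)[0]
    min_count = min(counts.values())
    return {
        "num_classes": len(counts),
        "most_common": most_common_name,
        "max_count": max_count,
        "min_count": min_count,
        # Ratio between the largest and the smallest class
        "imbalance_ratio": max_count / min_count,
        # Identities represented by a single image
        "singletons": sum(1 for c in counts.values() if c == 1),
    }

# Hypothetical labels, for illustration only.
labels = ["a", "a", "a", "a", "b", "b", "c"]
summary = imbalance_summary(labels)
print(summary)
```

A large imbalance ratio or many singleton identities would suggest that plain accuracy alone may be a misleading metric.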


###Define the relevant metrics to be used

###Train a first baseline algorithm as a reference
For classification, train basic models, such as a one-class classifier and a basic logistic regression, with all the features.
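
Before training the models named above, it can help to have an even simpler reference point. The sketch below is not a one-class classifier or a logistic regression; it is a majority-class baseline, shown only as an assumption about what a minimal lower bound could look like. The names `majority_baseline` and `accuracy` and the toy labels are hypothetical.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return the most frequent training label; the baseline always predicts it."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predictions, targets):
    """Fraction of predictions equal to their targets."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

# Hypothetical labels, for illustration only.
train_labels = ["a", "a", "a", "b", "c"]
test_labels = ["a", "b", "a", "c"]

guess = majority_baseline(train_labels)
baseline_acc = accuracy([guess] * len(test_labels), test_labels)
print(guess, baseline_acc)  # a 0.5
```

Any trained model should beat this number; if it does not, the pipeline probably has a problem.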

###Prepare data where needed
* Cleaning
* Normalization 
* Shuffling and train, test and validation set construction (check the statistical properties of the splits)
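
One way to keep the statistical properties of the splits comparable is a stratified split, which shuffles and divides the samples of each class separately. The following is a minimal sketch under stated assumptions: it only produces two splits (it can be applied twice to obtain train, validation and test sets), and the name `stratified_split`, the seed and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, train_fraction=0.75, seed=0):
    """Split sample indices per class so both splits keep similar label proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Keep at least one sample of every class in the training split.
        cut = max(1, int(round(len(indices) * train_fraction)))
        train_idx.extend(indices[:cut])
        val_idx.extend(indices[cut:])

    # Shuffle again so the classes are interleaved within each split.
    rng.shuffle(train_idx)
    rng.shuffle(val_idx)
    return train_idx, val_idx

# Hypothetical labels, for illustration only.
labels = ["a"] * 8 + ["b"] * 4
train_idx, val_idx = stratified_split(labels, train_fraction=0.75)
print(Counter(labels[i] for i in train_idx))
print(Counter(labels[i] for i in val_idx))
```

Comparing the two printed counters is a quick check that the class proportions survived the split.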

###Design experiments and define hyperparameters

###Repeat until performance on the test set is acceptable 
* Train the model and cross-validate the hyperparameters until acceptable performance on the training set is achieved
* Test the model with the best hyperparameters and check whether it is overfitting or underfitting



# Training an image classifier
We will do the following steps in order:
1. Load and normalize the training and test datasets using ``torchvision``
2. Define a Convolutional Neural Network
3. Define a loss function and an optimizer
4. Train the network on the training data 
5. Test the network on the test data 

In [None]:
%matplotlib inline

In [None]:
import torch 
import torchvision
import torchvision.transforms as transforms

###1. Load and normalize training and test datasets
The outputs of torchvision datasets are PILImage images in the range [0, 1]. We transform them to Tensors normalized to the range [-1, 1].
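
The mapping from [0, 1] to [-1, 1] is plain arithmetic: normalizing with a mean of 0.5 and a standard deviation of 0.5 computes `(value - 0.5) / 0.5` for every pixel. The sketch below shows this on single numbers instead of tensors; the helper names `normalize` and `unnormalize` are hypothetical.

```python
def normalize(value, mean=0.5, std=0.5):
    """Map a pixel value in [0, 1] to [-1, 1] via (value - mean) / std."""
    return (value - mean) / std

def unnormalize(value, mean=0.5, std=0.5):
    """Inverse mapping, back to [0, 1], as needed before displaying an image."""
    return value * std + mean

for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

The inverse mapping is the same operation as the `img / 2 + 0.5` step used later when displaying images.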

In [22]:
#Training hyperparameters
INIT_LR = 0.001
BATCH_SIZE = 64
EPOCHS = 10

#Train and val split
TRAIN_SPLIT = 0.75
VAL_SPLIT = 1 - TRAIN_SPLIT

#Setting the device to train the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

#Loading the dataset
print("[INFO] Loading the LFW dataset...")
#transform = transforms.Compose(
#    [transforms.ToTensor(),
#     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

#trainset = 
#trainloader =

#testset = 
#testloader =




[INFO] Loading the LFW dataset...


###2. Define a Convolutional Neural Network 
The CNN is created from scratch to take 3-channel images.

In [None]:
import torch.nn as nn
import torch.nn.functional as F

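class ConvNet(nn.Module):
    # NOTE: every `...` below is a placeholder that must be replaced with
    # concrete layer arguments before the network can be instantiated.
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(...)    # first convolution (3 input channels)
        self.pool = nn.MaxPool2d(...)  # max pooling, reused after each convolution
        self.conv2 = nn.Conv2d(...)
        self.fc1 = nn.Linear(...)
        self.fc2 = nn.Linear(...)
        self.fc3 = nn.Linear(...)      # output layer, used in forward()

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(...)                # flatten to (batch_size, num_features)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                # raw class scores (logits)
        return x

net = ConvNet()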
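
The input size of the first fully connected layer depends on how much the convolution and pooling layers shrink the image. The sketch below computes it with the standard output-size formula. All concrete numbers (a 32x32 input, 5x5 convolutions without padding, 2x2 max pooling, 16 channels after the second convolution) are assumptions chosen for illustration, not values taken from this notebook.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

# Assumed values, for illustration only.
size = 32
size = conv_output_size(size, kernel=5)            # conv1 -> 28
size = conv_output_size(size, kernel=2, stride=2)  # pool  -> 14
size = conv_output_size(size, kernel=5)            # conv2 -> 10
size = conv_output_size(size, kernel=2, stride=2)  # pool  -> 5

channels = 16  # assumed number of output channels of the second convolution
flattened = channels * size * size
print(flattened)  # 400
```

With these assumed values, the flatten step would produce 400 features per image, which would be the input size of the first linear layer.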

###3. Define a loss function and an optimizer
We use a Classification Cross-Entropy loss and SGD with momentum.
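
To make the two ingredients concrete, the sketch below computes the cross-entropy of a single sample from its raw scores and applies one SGD-with-momentum update to a single scalar parameter. It is a plain-Python illustration of the formulas, not how PyTorch implements them internally; the function names and the example numbers are hypothetical.

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

def sgd_momentum_step(param, grad, velocity, lr=0.0001, momentum=0.9):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad  # accumulate past gradients
    param = param - lr * velocity          # move against the accumulated direction
    return param, velocity

# The loss is small when the target class has the highest score.
loss = cross_entropy([2.0, 0.5, -1.0], target=0)
print(round(loss, 4))

# First update from zero velocity: the step is simply lr * grad.
param, velocity = sgd_momentum_step(1.0, grad=2.0, velocity=0.0, lr=0.1)
print(param, velocity)
```

The momentum term keeps a running direction across steps, which tends to smooth out noisy mini-batch gradients.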

In [None]:
import torch.optim as optim 

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.0001, momentum=0.9)


###4. Train the network on the training data
We loop over our data iterator, feed the inputs to the network, and optimize.

In [None]:
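for epoch in range(EPOCHS):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # Each batch is a pair of input images and their identity labels
        inputs, labels = data

        # Reset the gradients accumulated during the previous step
        optimizer.zero_grad()

        # Forward pass, loss computation, backward pass and parameter update
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # Print the average loss every 2000 mini-batches
        running_loss += loss.item()
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0
print('Finished Training')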

###5. Test the network on the test data 
Check if the network has learnt anything at all.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

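def imshow(img):
  img = img / 2 + 0.5  # undo the [-1, 1] normalization
  npimg = img.numpy()
  # Tensors are (channels, height, width); matplotlib expects (height, width, channels)
  plt.imshow(np.transpose(npimg, (1, 2, 0)))
  plt.show()

# Get one batch of training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# Show the images and print the first four labels
# (`classes` is assumed to map label indices to identity names)
imshow(torchvision.utils.make_grid(images))
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))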

In [None]:
dataiter = iter(testloader)
images, labels = next(dataiter)

imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))

Now let's see what the neural network thinks these examples above are:

In [None]:
outputs = net(images)

In [None]:
_, predicted = torch.max(outputs, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]] for j in range(4)))

Let's now check how the network performs on the whole testset.

In [None]:
correct = 0
total = 0
with torch.no_grad():
  for data in testloader:
    images, labels = data
    outputs = net(images)
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum().item()

print('Accuracy of the network on the ----- test images: %d %%' % (correct/total*100))

Check which classes performed well and which did not.

In [None]:
num_classes = len(classes)
class_correct = [0.0] * num_classes
class_total = [0.0] * num_classes
with torch.no_grad():
  for data in testloader:
    images, labels = data
    outputs = net(images)
    _, predicted = torch.max(outputs, 1)
    c = (predicted == labels)
    # Iterate over the whole batch, whatever its size
    for i in range(labels.size(0)):
      label = labels[i].item()
      class_correct[label] += c[i].item()
      class_total[label] += 1

for i in range(num_classes):
  # Skip identities that have no images in the test set
  if class_total[i] > 0:
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]
    ))
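
The per-class accuracy computed above is the recall of each class. When some identities have far more images than others, it may be useful to look at per-class precision and a macro-averaged F1 score as well. The sketch below is a framework-free illustration that works on plain lists of labels; the function names and the toy labels are hypothetical.

```python
from collections import Counter

def per_class_precision_recall(targets, predictions):
    """Return {label: (precision, recall)} computed from paired label lists."""
    true_pos = Counter()
    predicted_count = Counter(predictions)
    actual_count = Counter(targets)
    for target, prediction in zip(targets, predictions):
        if target == prediction:
            true_pos[target] += 1

    scores = {}
    for label in set(targets) | set(predictions):
        # A class that is never predicted (or never present) scores 0.0
        precision = true_pos[label] / predicted_count[label] if predicted_count[label] else 0.0
        recall = true_pos[label] / actual_count[label] if actual_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores

def macro_f1(scores):
    """Unweighted mean of per-class F1, so rare identities count as much as frequent ones."""
    f1_values = []
    for precision, recall in scores.values():
        denom = precision + recall
        f1_values.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)

# Hypothetical labels, for illustration only.
targets = ["a", "a", "a", "b", "b", "c"]
predictions = ["a", "a", "b", "b", "b", "a"]
scores = per_class_precision_recall(targets, predictions)
print(scores)
print(macro_f1(scores))
```

In this toy example class `c` is never predicted, so it pulls the macro average down even though it has a single sample.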