# Face Recognition (Part 1: Data/Image Pre-Processing)

In this tutorial, we go over some standard preprocessing steps that prepare the data for use as input to a face recognition model/network. We will use face images of a subset of 10 celebrities, out of the 2,622 identities in the full dataset.

Let's go through these step-by-step.

In [None]:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import os

DATA_ROOT = "/tmp/data/lab3"

# 0. GETTING THE DATA

The dataset that we shall be using is a subset of the [VGG-Face](http://www.robots.ox.ac.uk/~vgg/data/vgg_face/) dataset. It consists of 20 images each of 10 Indian celebrities. We have already pre-processed it, and it is available in a directory called vgg_face_indian_dataset.

**There is NO need to download any files from the [VGG-Face] link**

In [None]:
celebs = ["A.R._Rahman", "Aamir_Khan", "Amitabh_Bachchan", "A_P_J_Abdul_Kalam", "Kamal_Hassan", "Madhuri_Dixit", "Mahendra_Singh_Dhoni", "Preity_Zinta", "Vidya_Balan", "Virat_Kohli"]

# 1. Understanding the dataset

Let us see a raw image in the dataset:

In [None]:
# Directory names
dataset_dir = DATA_ROOT+'/vgg_face_indian_dataset'
raw_images_dir = DATA_ROOT+'/vgg_face_indian_dataset/raw'
face_images_dir = DATA_ROOT+'/vgg_face_indian_dataset/faces'

In [None]:
# Read an image in the "raw" directory
example_raw_image = cv2.imread(os.path.join(raw_images_dir, "A.R._Rahman_01.jpg"))
# Show it
# cv2.imread loads colour images in BGR channel order; convert to RGB for matplotlib
plt.imshow(cv2.cvtColor(example_raw_image, cv2.COLOR_BGR2RGB))
plt.axis("off")
plt.show()
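
The colour conversion in the cell above only reorders the channels. As a small illustration on a synthetic pixel (no image file or OpenCV needed), reversing the last axis of a 3-channel BGR array gives the RGB order that matplotlib expects. The array names here are made up for the example.

```python
import numpy as np

# A single "pixel" stored in OpenCV's BGR order: pure blue
bgr_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)   # shape (1, 1, 3)

# Reversing the last axis swaps the B and R channels, giving RGB order
rgb_pixel = bgr_pixel[..., ::-1]

print(rgb_pixel[0, 0])   # blue is now the last channel
```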

The dataset contains 20 images each of 10 celebrities.

The bounding boxes of the faces in these images have been found, the images have been rotated to make the faces upright, and the results have been converted to grayscale. These grayscale images have been saved in the **vgg_face_indian_dataset/faces** directory.

Let us read all the images in the faces dataset:

In [None]:
# Read all images in the "faces" directory
images = []
for celeb in celebs:
    images.append([])
    # This code inside the loop constructs the filename as Name_xx.jpg for each name
    # xx runs from 01 - 20
    for i in range(1, 21):
        filenum = "_{0:02d}".format(i)
        filename = face_images_dir + "/" + celeb + filenum + ".jpg"
        images[-1].append(cv2.imread(filename, 0))
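
The filename pattern used in the loop above can be checked on its own before any image is read. The sketch below is only an illustration: `build_face_filenames` is a hypothetical helper that is not part of this notebook, and it only builds path strings, so it runs without the dataset. This kind of check can be useful because `cv2.imread` returns `None` instead of raising an error when a file cannot be read.

```python
import os

def build_face_filenames(face_dir, celeb_names, n_per_celeb=20):
    """Return one list of paths per celebrity, named Name_01.jpg, Name_02.jpg, ..."""
    all_paths = []
    for name in celeb_names:
        paths = [os.path.join(face_dir, "{0}_{1:02d}.jpg".format(name, i))
                 for i in range(1, n_per_celeb + 1)]
        all_paths.append(paths)
    return all_paths

# Toy call with a made-up directory name and only 3 files per celebrity
example_paths = build_face_filenames("faces", ["A.R._Rahman", "Aamir_Khan"], n_per_celeb=3)
print(example_paths[0])
```

With the real directory, `os.path.exists(path)` could be applied to each entry to spot missing files before calling `cv2.imread`.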

$images$ is a list containing 10 lists, one per celebrity.

Each of the 10 lists contains 20 images, i.e. 20 images per celebrity.

In [None]:
print(len(images), type(images))
# Finding the length of each list in "images"
print(len(images[0]), type(images[0][0]))

Each of these 20 images per celebrity is a "numpy array". These images are of variable size. For example, the first two images of A.R. Rahman are of sizes: 

In [None]:
ar_rahman_index = 0
print(images[ar_rahman_index][0].shape, images[ar_rahman_index][1].shape)

Thus, the images vary in size. Let us visualize one image per class (celebrity) in the dataset, and note the sizes of the images:

In [None]:
# Function to plot 1 image per class in the given dataset
def plot_one_per_class(images):
    # One image is plotted per class (#classes=10), in a 2 x 5 grid
    # Overall figure size
    plt.figure(figsize=(15, 5))

    # For each class
    for i, celeb in enumerate(celebs):
        # Make subplot
        plt.subplot(2, 5, i+1)    # plt.subplot(n_rows, n_columns, image_position_in_plot)

        # Plot the 0th image for each class (in grayscale)
        example_face_image = images[i][0]
        plt.imshow(example_face_image, cmap="gray")

        # Turn off axis lines
        plt.axis("off")
        # (Optional) Write the size of the image as its title
        plt.title("size="+str(example_face_image.shape), size=18)

    plt.show()

plot_one_per_class(images)

# 2. Splitting the dataset into "train", "val" and "test"

## Ensuring class balance

It is important to maintain an equal (or similar) number of images per celebrity in the train, val and test datasets. This ensures that the model and the calculated accuracies are not biased towards any class. We have 20 images per celebrity, and we divide the dataset such that 60% of it is used for training, 20% for validation and 20% for testing, i.e. 12, 4 and 4 images per celebrity.

In [None]:
n_train_per_class = 12
n_val_per_class = 4
n_test_per_class = 4

Using these values, let us make the train, val and test sets:

In [None]:
# Make the train, val, and test sets
images_train = []
images_val = []
images_test = []
# For each celebrity
for i, celeb in enumerate(celebs):
    
    # Add a new empty list item
    images_train.append([])
    images_val.append([])
    images_test.append([])
    
    # Add the specified number of images to the train set
    for train_iter in range(0, n_train_per_class):
        images_train[-1].append(images[i][train_iter])
    
    for val_iter in range(n_train_per_class, n_train_per_class + n_val_per_class):
        images_val[-1].append(images[i][val_iter])
    
    for test_iter in range(n_train_per_class + n_val_per_class, n_train_per_class + n_val_per_class + n_test_per_class):
        images_test[-1].append(images[i][test_iter])
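
The three inner loops above take consecutive chunks of each celebrity's list, which can also be written with list slicing. The following is a minimal sketch on synthetic data; `split_per_class` and the toy lists are hypothetical names used only for this illustration.

```python
def split_per_class(images_per_class, n_train, n_val, n_test):
    """Split each class's list into consecutive train/val/test chunks."""
    train, val, test = [], [], []
    for class_images in images_per_class:
        if len(class_images) < n_train + n_val + n_test:
            raise ValueError("not enough images in this class for the requested split")
        train.append(class_images[:n_train])
        val.append(class_images[n_train:n_train + n_val])
        test.append(class_images[n_train + n_val:n_train + n_val + n_test])
    return train, val, test

# Synthetic stand-in: 2 classes with 20 integer "images" each
toy_images = [list(range(20)), list(range(100, 120))]
toy_train, toy_val, toy_test = split_per_class(toy_images, 12, 4, 4)

# Every class contributes the same number of items to each set
print([len(c) for c in toy_train], [len(c) for c in toy_val], [len(c) for c in toy_test])
```

Because every class is split with the same counts, the class balance discussed above is preserved in all three sets.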
    

# IMAGE MANIPULATIONS

In order to train a model, the images have to be manipulated so that they have similar properties. We shall see these manipulation tasks below.

We will transform the training set first, step-by-step and then use it as a function and apply on other sets as well.

## 1. Image Resizing

As the images can be of different sizes, we need to resize all the images to a constant size (224 x 224).

In [None]:
# Resize images
resized_images_train = np.zeros((10, int(n_train_per_class), 224, 224)) # 10 classes, 12 images per class, each image of size (224, 224)
for i in range(len(celebs)):
    for j in range(int(n_train_per_class)):
        image = images_train[i][j]
        resized_image = cv2.resize(image, (224, 224))
        resized_images_train[i][j] = resized_image
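
To get a feel for what resizing does, here is a NumPy-only sketch in which each output pixel simply copies one source pixel (nearest-neighbour style). This is a simplified stand-in and not what `cv2.resize` does by default, since OpenCV uses bilinear interpolation unless told otherwise. `resize_nearest` is a hypothetical helper written for this illustration. Note also that `cv2.resize` takes its target size as (width, height), which makes no difference for the square 224 x 224 target used here.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize a 2-D image by copying one source pixel into each output pixel."""
    in_h, in_w = image.shape
    # Source row/column index for every output row/column
    rows = (np.arange(out_h) * in_h) // out_h
    cols = (np.arange(out_w) * in_w) // out_w
    return image[rows[:, None], cols]

toy = np.arange(6).reshape(2, 3)      # a tiny 2 x 3 "image"
resized = resize_nearest(toy, 4, 6)   # upscale to 4 x 6
print(resized.shape)
print(resized)
```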

Plot one **resized_image** of each class to check if resizing worked

In [None]:
plot_one_per_class(resized_images_train)

## 2. Image Normalization

We know that image pixel values range between 0 and 255. As an example, let us see this range in the image at index 4 (i.e. the 5th image) of the class "Madhuri Dixit".

In [None]:
madhuri_dixit_index = 5
example_image = resized_images_train[madhuri_dixit_index][4]
# in plt.imshow - we specify that the min and max possible values in the image as 0 and 255
# as otherwise plt.imshow() normalizes it automatically
plt.imshow(example_image, cmap='gray', vmin=0, vmax=255)
plt.axis("off")
plt.show()
# Printing the minimum and maximum values of the image
print(resized_images_train[madhuri_dixit_index][4].min(), resized_images_train[madhuri_dixit_index][4].max())

As we can see, the maximum pixel value in this image is 171, which explains why the image appears dark. Depending on the capture settings, each image can have a different range of values, so it is important to normalize the images before building a model over them, such that the pixel values of each image are stretched out over the range 0-255.

Let us scale the pixel values in **each** image in the dataset such that the minimum pixel value within the image becomes 0, and the maximum becomes 255. This way, we ensure that the full range of values is covered in every image. This should result in the above image becoming _brighter_. This is called MinMax Scaling/Normalization.

In [None]:
# MinMax Scaling of images
minmax_scaled_images_train = np.zeros((10, int(n_train_per_class), 224, 224)) # 10 celebrities, some images per celebrity, each image of size (224, 224)
for i in range(len(celebs)):
    for j in range(int(n_train_per_class)):
        resized_image = resized_images_train[i][j]
        # normalize image using minmax scaling
        minmax_scaled_image = (resized_image - np.min(resized_image))/(np.max(resized_image) - np.min(resized_image))*255
        minmax_scaled_images_train[i][j] = minmax_scaled_image
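
The formula used in the loop above can be wrapped in a small function and tried on a toy array. This is a minimal sketch: `minmax_scale` is a hypothetical helper, and returning zeros for a constant image (where max equals min and the formula would divide by zero) is an assumed convention, not something the notebook specifies.

```python
import numpy as np

def minmax_scale(image, new_max=255.0):
    """Stretch pixel values linearly so that min -> 0 and max -> new_max."""
    image = np.asarray(image, dtype=np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:
        # Constant image: there is no range to stretch (assumed convention: all zeros)
        return np.zeros_like(image)
    return (image - lo) / (hi - lo) * new_max

# Toy "dark" image whose brightest pixel is 171
toy = np.array([[10.0, 50.0], [90.0, 171.0]])
scaled = minmax_scale(toy)
print(scaled.min(), scaled.max())
```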

Let us check if this worked.

In [None]:
# Showing image before and after min-max scaling
plt.subplot(121)
plt.imshow(resized_images_train[madhuri_dixit_index][4], cmap='gray', vmin=0, vmax=255)
plt.axis("off")
plt.subplot(122)
plt.imshow(minmax_scaled_images_train[madhuri_dixit_index][4], cmap='gray', vmin=0, vmax=255)
plt.axis("off")
plt.show()
# Printing the minimum and maximum values of the image
print(minmax_scaled_images_train[madhuri_dixit_index][4].min(), minmax_scaled_images_train[madhuri_dixit_index][4].max())

## 3. Changing the input dimensions (depends on model)

Till now, we maintained the 1st dimension as iterating through celebrities, and the 2nd dimension as iterating through the images of each celebrity. This was done only for illustrative purposes.

$$shape(minmax\_scaled\_images\_train) = num\_class \times n\_train\_per\_class \times 224 \times 224$$

But, for EigenFaces the input dimensions are as follows:

(1) the first dimension is iterating through all the samples.

(2) the second dimension is the features (or data values).

Therefore, let us combine the first and second dimensions, as together they index the data samples. Similarly, we combine the last 2 dimensions, as they represent the features (the data values). The new shape is $$(num\_class * n\_train\_per\_class)\times(224 * 224)$$ which is $120 \times 50176$ for the training set.

In [None]:
reshaped_images = np.reshape(minmax_scaled_images_train, (10*int(n_train_per_class), 224* 224))
print(reshaped_images.shape)
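
It can help to confirm which row of the reshaped array each image ends up in. The sketch below uses toy sizes (not the tutorial's 10 x 12 x 224 x 224) and made-up variable names. With NumPy's default row-major order, image `j` of class `i` lands in row `i * n_per_class + j`.

```python
import numpy as np

num_classes, n_per_class, h, w = 3, 2, 4, 4   # toy sizes, not the tutorial's
toy = np.arange(num_classes * n_per_class * h * w, dtype=np.float64)
toy = toy.reshape(num_classes, n_per_class, h, w)

# Merge (class, image) into samples and (height, width) into features
flat = toy.reshape(num_classes * n_per_class, h * w)

# Image j of class i ends up in row i * n_per_class + j
i, j = 2, 1
row = i * n_per_class + j
print(flat.shape, np.array_equal(flat[row], toy[i, j].ravel()))
```

This ordering is why labels built class by class line up with the rows of the reshaped data.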

## 4. Subtracting $mean\_image$

Since we are interested in the difference between the faces, let’s subtract the characteristics which are common between them. The common characteristic of each pixel value is its mean among all the training images. Thus, let us find the **mean_image**, and subtract it from all the images.

In [None]:
# Calculating mean image
mean_image = np.mean(reshaped_images, axis=0)
print(mean_image.shape) # should be (50176,), i.e., 224*224
final_images_train = reshaped_images - mean_image
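
The subtraction above relies on NumPy broadcasting: the mean vector of shape `(n_features,)` is subtracted from every row of the `(n_samples, n_features)` array. A minimal sketch on random toy data (all names are made up for the illustration) shows that each pixel then has zero mean across the samples.

```python
import numpy as np

rng = np.random.default_rng(0)
toy_data = rng.uniform(0, 255, size=(6, 16))   # 6 samples, 16 "pixels" each

toy_mean = toy_data.mean(axis=0)       # one mean value per pixel, shape (16,)
toy_centered = toy_data - toy_mean     # broadcast: subtracted from every row

print(toy_mean.shape, toy_centered.shape)
# After centering, every pixel has (numerically) zero mean across samples
print(np.allclose(toy_centered.mean(axis=0), 0.0))
```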

### Visualizing mean image

In [None]:
plt.imshow(np.reshape(mean_image, (224, 224)), cmap='gray')
plt.axis("off")
plt.show()

As can be seen, the mean image consists of a pair of eyes, a nose, a mouth, etc., at the right places. This is possible since all the face images in the dataset were oriented and aligned to each other.

Had the images not been aligned, the mean image would have looked like this:

<img src="/tmp/data/lab3/vgg_face_indian_dataset/hazy_mean_image.png">

# VALIDATION AND TEST DATA

## Image manipulations for validation and test data

The same image manipulations must be carried out on the validation and test images. As part of the exercise, complete the function below so that it performs all the operations step by step.

**Note that the mean image is calculated only on the training set.**

In [None]:
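def transform_data(images, n_images_per_class, mean_image):
    
    #### Your Code Here
    # Steps: (1) resize to 224 x 224, (2) min-max scale each image to 0-255,
    # (3) reshape to (n_samples, 224*224), (4) subtract the training mean_image
    
    return final_images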

Now use this function to find the final images of val and test data:

In [None]:
# Finding final images for val and test
final_images_val = transform_data(images_val, n_val_per_class, mean_image)
final_images_test = transform_data(images_test, n_test_per_class, mean_image)

# LABELS

Let us make the labels for train, val and test. Each of the 10 celebrities shall be associated with a number among {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}: 

In [None]:
labels_train = np.array([[i]*n_train_per_class for i in range(10)]).flatten()
labels_val = np.array([[i]*n_val_per_class for i in range(10)]).flatten()
labels_test = np.array([[i]*n_test_per_class for i in range(10)]).flatten()
print(labels_val)
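
An equivalent way to build such label vectors is `np.repeat`, shown here as a small sketch with toy sizes. `make_labels` is a hypothetical helper, not part of this notebook.

```python
import numpy as np

def make_labels(n_classes, n_per_class):
    """Return [0, 0, ..., 1, 1, ..., n_classes-1, ...] with n_per_class copies of each label."""
    return np.repeat(np.arange(n_classes), n_per_class)

toy_labels = make_labels(3, 4)
print(toy_labels)
```

The labels are ordered class by class, matching the row order of the reshaped image arrays.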

# SAVE DATA

Let us save the data in a file called "data.npz", so that we can load it in the next notebook.

In [None]:
# Save the train, val, and test data and labels, and the mean image
np.savez("data",
         data_train=final_images_train, labels_train=labels_train,
         data_val=final_images_val, labels_val=labels_val,
         data_test=final_images_test, labels_test=labels_test,
         mean_image=mean_image)
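
For reference, a file written by `np.savez` can be read back with `np.load`, using the keyword names as keys. The sketch below is a round trip on toy arrays in a temporary directory, so it does not touch the real "data.npz". The toy array names and the `toy_data` filename are assumptions made for the illustration.

```python
import os
import tempfile
import numpy as np

toy_data = np.zeros((4, 16))
toy_labels = np.array([0, 0, 1, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_data")   # np.savez appends ".npz" to the name
    np.savez(path, data_train=toy_data, labels_train=toy_labels)

    # np.load returns a dict-like object; each array is read when its key is accessed
    with np.load(path + ".npz") as loaded:
        keys = sorted(loaded.files)
        restored_data = loaded["data_train"]
        restored_labels = loaded["labels_train"]

print(keys, restored_data.shape, restored_labels)
```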

## Now that we have prepared the data, let's move to Part 2 of this exercise

Note that we have already saved a data.npz file in the data folder, in case you want to jump directly to the next experiment without completing the function.

## Exercises

1. Write the code for the function **transform_data()**, which performs all the image manipulations discussed above in a single function, so that the same steps can be applied to the validation and test sets.