## Face Recognition ##

To do face recognition I can't use a pre-existing model like the ones used in `dlib` or `opencv`, so I'll have to do something a little more manual. But before doing that, I need a source of photos. The best one, right now, is Instagram, and here's how I'll be getting photos from public accounts.

### 01. Lambdas & Iterators ###

First, I'll need to install [instaloader](https://instaloader.github.io/), which is a Python module that allows me to download recent posts:

In [None]:
import sys
!{sys.executable} -m pip install instaloader

Next, let me show you how to use instaloader. I'll take advantage of this to explain a little bit about iterators and lambda functions.

Let me start with the simple thing: lambda functions. A lambda function is a function without a name whose body is a single expression; the value of that expression is what the function returns. The syntax is this one:

`lambda x : some_operation_with(x)` - remember, in this example I used `x` as an arbitrary name; you can have one or more arguments (or none at all) and you can really be creative with their names.

In [None]:
# this:
f = lambda a : a + 2
print('f(2):', f(2), 'f is', type(f))
# is actually equivalent to this:
def f(a):
    return a + 2
print('f(2):', f(2), 'f is', type(f))
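
As a small illustrative sketch (the names `add` and `words` are made up for this example), here is a lambda with two arguments, and a lambda passed as the `key` of `sorted`, a place where writing a full `def` would feel heavy:

```python
# a lambda can take more than one argument
add = lambda a, b: a + b
print('add(2, 3):', add(2, 3))

# lambdas are handy as short, throw-away arguments, e.g. a sort key
words = ['banana', 'fig', 'cherry']
by_length = sorted(words, key=lambda w: len(w))
print(by_length)
```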

Then why use it? After all this is just a restricted way to define a function. Well, because it's nice to be lazy, like in this example where I'll compute a list of squares of the first 10 numbers:

In [None]:
# this:
l = list(map(lambda x : x**2, range(10)))
print(l)
# vs this:
def square(x): return x**2
l = []
for n in range(10): l.append(square(n))
print(l)

The first method can also be somewhat faster for very large lists, although how much depends on the workload. But I'm pretty sure you noticed something: I've wrapped the `map` call in `list(...)` (a *cast*, the transformation of one data type to another). This is because `map` returns an `iterator`. So what is an `iterator`? That's pretty simple:

An iterator returns the current value, with the promise that the next time it is called I'll get the next value.

In [None]:
it = map(lambda x : x**2, range(10))
print('type of it:', type(it))
print('1st call:', next(it))
print('2nd call:', next(it))
print('3rd call:', next(it))
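
An iterator can only move forward, and at some point it runs out of values. The sketch below (using a plain list as a made-up source) shows what happens then: `next` raises `StopIteration`, unless I pass a default value as its second argument.

```python
it = iter([1, 4, 9])   # build an iterator from a plain list
first = next(it)       # takes the first value
rest = list(it)        # consumes everything that is left

# the iterator is now exhausted: a bare next(it) would raise StopIteration,
# but with a default value next() returns that default instead
leftover = next(it, None)
print(first, rest, leftover)
```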

So why use it? Well, to save memory! This way, you don't need to store all the elements of a sequence, just the rule that builds it. So here's how you can create an iterator (actually, this is really called a *generator*, a particular kind of *iterator* that is really simple to construct):

In [None]:
def squares(n):
    """
    This function will return an iterator for all the squared integers less than n^2.
    @param n (int) : will return all the squared integers until n^2
    @return int : an iterator for all squared integers less than n^2.
    """
    i = 0 # start with a value
    while i < n:
        yield i**2
        i += 1

list(squares(10))

And I also can write something like this:

In [None]:
for sq in squares(10):
    print(sq)

The good thing about the above approach is that at any point only one value is stored in memory. Iterators are often consumed in list comprehensions (building a list from another iterable, much like a for loop does):

In [None]:
[ x + 2 for x in squares(10) ] # computes the list of numbers x^2 + 2
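
If I only need an aggregate and not the list itself, a generator expression keeps the memory benefit all the way through. This is a minimal sketch that repeats the `squares` generator so it runs on its own; `total` is just a name chosen for the example.

```python
def squares(n):
    """Yields i**2 for i = 0 .. n-1, one value at a time."""
    i = 0
    while i < n:
        yield i ** 2
        i += 1

# round brackets instead of square ones build a generator expression:
# values are produced one at a time and no intermediate list is stored
total = sum(x + 2 for x in squares(10))
print(total)
```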

One last thing about this: if I'd like to extract only part of the iterator (for example the squares from 3^2 to 10^2), I can use `dropwhile` and `takewhile` from the standard library module `itertools`.
- `dropwhile(predicate, iterable)` discards values from the iterator for as long as the predicate (a lambda in the examples here) returns True; from the first value for which it returns False, everything is kept. So, if the predicate is False from the beginning, nothing is discarded; also, if the predicate would return True again for some later element, that element is not discarded;
- `takewhile(predicate, iterable)` is the mirror image: it keeps values for as long as the predicate returns True and stops at the first value for which it returns False; everything after that point is ignored, even if the predicate would return True again;

And yes, of course, instead of a lambda you can use any other callable, such as a regular function or a method.

In [None]:
from itertools import dropwhile, takewhile
# Get the squares after 9:
print([ x for x in dropwhile(lambda sq : sq <= 9 , squares(10)) ])
# Get the squares before 16:
print([ x for x in takewhile(lambda sq : sq < 16, squares(10))])
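
The two can also be nested to cut a window out of the iterator, which is what the "3^2 to 10^2" example needs. The sketch below repeats the `squares` generator so it is self-contained; `window` and `stopped` are names picked for the illustration.

```python
from itertools import dropwhile, takewhile

def squares(n):
    i = 0
    while i < n:
        yield i ** 2
        i += 1

# squares from 3^2 up to and including 10^2
window = list(takewhile(lambda sq: sq <= 100,
                        dropwhile(lambda sq: sq < 9, squares(20))))
print(window)

# takewhile stops at the first False, even if later values would pass
stopped = list(takewhile(lambda x: x < 3, [1, 2, 5, 1, 2]))
print(stopped)
```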

### 02. Getting Images from Instagram ###

Putting things together now, here's how I'll be using instaloader:
- first, I'll instantiate the `Instaloader` object from `instaloader` module; this will allow me to interact with Instagram;
- I'm going to get an iterator over all the posts by first building a `Profile` object from the Instagram username and then calling its `get_posts()` method, which creates the iterator;
- I'll use `dropwhile` and `takewhile` to filter the posts for a given interval;
- and put the results in a list;

In [None]:
from datetime import datetime
from itertools import dropwhile, takewhile

import instaloader

profile_id = 'valerie.lungu' # please, add here the instagram profile

# this is me, instantiating the Instaloader object
L = instaloader.Instaloader()
# building the profile
profile = instaloader.Profile.from_username(L.context, profile_id)
# and retrieve an iterator to the posts
posts = profile.get_posts()

# these are the limits for the posts
since = datetime(2021, 3, 1)
until = datetime(2021, 4, 6)

# construct an empty list for the urls
image_urls = []

# do the dropwhile and takewhile trick:
for post in dropwhile(lambda post: post.date > until, takewhile(lambda post: post.date > since, posts)):
    # and if this post is an image
    if not post.is_video:
        # add the image url to the list
        image_urls.append(post.url)
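
To see why the `dropwhile` / `takewhile` combination selects the right interval, here is an offline sketch. It assumes, as the filter above does, that posts arrive newest first; `FakePost` is a hypothetical stand-in that only models the `date` attribute, and the dates are made up.

```python
from collections import namedtuple
from datetime import datetime
from itertools import dropwhile, takewhile

# hypothetical stand-in for a post; only `date` is modelled
FakePost = namedtuple('FakePost', ['date'])

# made-up posts, ordered newest first
fake_posts = [FakePost(datetime(2021, 4, 20)), FakePost(datetime(2021, 4, 10)),
              FakePost(datetime(2021, 4, 5)), FakePost(datetime(2021, 4, 1)),
              FakePost(datetime(2021, 3, 15)), FakePost(datetime(2021, 2, 1))]

since = datetime(2021, 3, 1)
until = datetime(2021, 4, 6)

# takewhile stops at the first post older than `since`;
# dropwhile then skips the leading posts that are newer than `until`
selected = [p.date for p in dropwhile(lambda p: p.date > until,
                                      takewhile(lambda p: p.date > since, fake_posts))]
print(selected)
```

A side effect worth noting: because `takewhile` stops at the first post that is too old, the iterator never has to walk through the rest of the profile.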

Running the cell above produces a list of image URLs that I can verify in the browser:

In [None]:
image_urls

So if I have the URL, do I need to manually download the picture to use it? No, of course not. I can use the built-in `urllib` package, which lets me open URLs with `urllib.request.urlopen(url)`. I convert the response body (a `bytes` string) to a `uint8` array and let OpenCV decode it, so I obtain a normal image matrix. I've put everything in the following short function:

In [None]:
import cv2
import urllib.request # importing plain `urllib` does not reliably expose the `request` submodule
import numpy as np

def image_from_url(url):
    """
    Read an image from a URL and return the image matrix (in OpenCV's BGR channel order).
    @param url (string) : the URL for the image;
    @return (numpy.array) : a matrix containing the image information.
    """
    response = urllib.request.urlopen(url)
    image_enc = np.asarray(bytearray(response.read()), dtype=np.uint8)
    image = cv2.imdecode(image_enc, cv2.IMREAD_COLOR)
    return image
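
The byte-handling half of this function can be tried without any network access. The sketch below is only an illustration: it relies on `urlopen` accepting a `data:` URL, uses `np.frombuffer` in place of `bytearray` plus `np.asarray`, and stops short of `cv2.imdecode`. The payload is three made-up bytes, not a real image.

```python
import base64
import urllib.request

import numpy as np

# three made-up bytes, wrapped in a data: URL so no network is needed
payload = bytes([0, 127, 255])
url = 'data:application/octet-stream;base64,' + base64.b64encode(payload).decode('ascii')

with urllib.request.urlopen(url) as response:
    raw = response.read()                    # a bytes object

buffer = np.frombuffer(raw, dtype=np.uint8)  # 1-D array, one entry per byte
print(type(raw), buffer)
```

With a real image URL, `buffer` is what would be handed to `cv2.imdecode`.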

Now I'll test the image download function:

In [None]:
import matplotlib.pyplot as plt

image = image_from_url(image_urls[1])
plt.rcParams['figure.figsize'] = (10, 10)
plt.imshow(image[:,:,::-1]) # reverse the channel order: OpenCV gives BGR, matplotlib expects RGB

### 03. Siamese Networks ###

Have you ever wondered what a Machine Learning algorithm actually is? Well, from the mathematical point of view it is just a matrix function that takes a real-valued matrix as input and produces a real-valued matrix as output.

$$Y = f_{ML}(X)$$

The sizes of the two matrices need not match; in practice they rarely do. The learning part works like this: $f_{ML}$ has an internal state that changes during training and that is then used, when predicting, to compute the output value.
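
As a toy illustration of this view (not of any real model), the sketch below uses a single weight matrix `W` as the internal state; the names `W` and `f_ml` and all the sizes are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# the internal state: a weight matrix that training would adjust
W = rng.normal(size=(4, 2))

def f_ml(X):
    """A toy 'model': maps rows with 4 features to rows with 2 outputs."""
    return X @ W

X = np.ones((3, 4))   # 3 samples, 4 features each
Y = f_ml(X)
print(X.shape, '->', Y.shape)
```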

Now to my problem: to be able to tell if two faces are the same, I have two options:
- either create this kind of *function* that takes as input two images and produces as output a single number (which is also a matrix, but with only one row and one column);
- or create a transformation *function* from an image to another structure, so that when this transformation is applied on both images, you can measure the distance between their outputs;

While the first method is possible, its applications are few, as it is harder to make general. The second one lets you learn a very general transformation that extracts *features* from a face. This second method is called a **Siamese Network**, as both images that will be compared are passed through the same **network**. This is a deep learning method, as there's usually a large number of layers involved.

So, how do I make a Siamese network? As this is a well-known and useful problem, there are plenty of Siamese networks already built and, even better, pretrained, so no other work is needed. The one I'll be using is called [FaceNet](https://github.com/davidsandberg/facenet). To download the pretrained FaceNet model, execute the following cell:
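
The second option can be sketched in a few lines. This is a toy stand-in, not FaceNet: `embed` is a hypothetical transformation with hand-picked weights, and the three input vectors are made up. The point is only that both inputs go through the very same function, and that the outputs are then compared.

```python
import numpy as np

# hypothetical shared weights (a real network would learn these)
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]]) * 0.5

def embed(x):
    """The shared transformation applied to every input."""
    return np.tanh(x @ W)

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.9, 0.1, 0.0])   # close to a
c = np.array([0.0, 1.0, 0.0])   # quite different from a

d_ab = np.linalg.norm(embed(a) - embed(b))
d_ac = np.linalg.norm(embed(a) - embed(c))
print(d_ab, d_ac)
```

Here the similar pair ends up closer together than the dissimilar one, which is the property a trained network is meant to have for faces.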

In [None]:
!gdown https://drive.google.com/uc?id=1PZ_6Zsy1Vb0s0JmjEmVd8FS99zoMCiN1

Next, to load this model I've just downloaded, I need [tensorflow](https://www.tensorflow.org/), which is Google's library for building and playing with Machine Learning methods.

After importing it, I'll disable all the INFO and WARNING messages, to keep things cleaner:

In [None]:
import tensorflow as tf
tf.get_logger().setLevel('ERROR')

But tensorflow is a low-level library. It requires you to define all the dependencies and operations inside and between the layers. Keras comes to the rescue: it is a high-level library that wraps tensorflow, meaning the nasty things are hidden inside and you only need a few calls to make things work. Keras comes with a `load_model` function that I'll use and which can import a saved model like the one I've just downloaded. The returned object exposes a method named `predict()` that can be applied on a matrix to generate the output of $f_{ML}$.

To know exactly what the model requires as input and what it will produce as output, I can use the `inputs` and `outputs` properties:

In [None]:
!{sys.executable} -m pip install keras

In [None]:
from keras.models import load_model
# load the model
model = load_model('facenet_keras.h5')
# summarize input and output shape
print('inputs:', model.inputs)
print('outputs:', model.outputs)

So cool! This model requires a 160x160x3 matrix (a square 160x160 image with 3 channels, in RGB format; this is something I know from reading the documentation, not from seeing the input) and produces as output a vector with 128 components.

But does it take long to run? I can use `time.time()` from the built-in `time` module to measure how many seconds the prediction takes on a matrix full of zeros. You'll also notice that the matrix I used for input is of size 1x160x160x3; this is because the model can be applied to many images at once:

In [None]:
import numpy as np
import time

input_shape = tuple(model.inputs[0].shape[1:])

prediction_start = time.time()
prediction = model.predict(np.zeros((1,) + input_shape))
prediction_time = time.time() - prediction_start

prediction_time
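
A single measurement can be misleading, since the first call to a model may include one-time setup work. As a hedged alternative, the sketch below uses `time.perf_counter()` (a clock meant for measuring durations) and keeps the best of a few runs. `time_call` is a hypothetical helper written for this example, and it is demonstrated on `sum` rather than on the model so that it runs anywhere.

```python
import time

def time_call(func, *args, repeats=3):
    """Hypothetical helper: best wall-clock time, in seconds, over a few runs."""
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

elapsed = time_call(sum, range(100_000))
print('%.6f seconds' % elapsed)
```

For the model this would be called as `time_call(model.predict, np.zeros((1,) + input_shape))`.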

### 04. Prepare Faces ###

But FaceNet is just for faces, not for generic images containing faces. So I need a method to extract and prepare the faces from a given image. From the previous notebook, I'll use the detector returned by `dlib.get_frontal_face_detector()`, which is specifically designed for this.

In [None]:
import dlib
detector = dlib.get_frontal_face_detector()

def detect_faces(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    return faces

The bad part is that I can't feed a detected face to the model directly: the detector only returns a bounding rectangle, and the face cut out by it doesn't have the exact size the model requires. So I'll need a little preparation first. I'll put the preparation in a function that will produce the matrix needed for the model input:

In [None]:
def prepare_faces(frame):
    """
    This function will extract the faces from a frame and will create a matrix from them,
    that can be fed through the FaceNet model. The function uses dlib to extract the
    rectangles, after that it will cut the faces, resize them and concatenate them in a
    matrix. To not lose the bounding boxes (rectangles surrounding the faces), I'll add
    those also to a second matrix.
    @param frame (numpy.array) : the image frame where I'll search for faces;
    @return tuple(numpy.array, numpy.array) : the first element will be a matrix that
    can be fed through the FaceNet model, while the second one is a matrix that has on
    each row the top, left, bottom, right coordinates of the bounding rectangle.
    """
    # this will be the output matrices
    faces = None # the model input matrix
    boxes = None # the bounding boxes matrix
    
    # go through all the detected faces
    for box in detect_faces(frame):
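        # extract the face from the image (a box may start outside the frame,
        # so negative coordinates are clamped to 0 to avoid wrap-around slicing)
        face = frame[max(box.top(), 0):box.bottom(), max(box.left(), 0):box.right(), :]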
        # resize the face to match the required size
        face = cv2.resize(face, input_shape[:2])
        # add the first axis, to be able to concatenate the outputs
        face = face.reshape(1, *input_shape)
        # if this is the first face, just put it in the output
        if faces is None:
            faces = face
            boxes = np.array([[
                box.top(),
                box.left(),
                box.bottom(),
                box.right()
            ]])
        # if this is not the first face, i'll concatenate it
        else:
            faces = np.concatenate((faces, face), axis = 0)
            boxes = np.concatenate((boxes, np.array([[
                box.top(),
                box.left(),
                box.bottom(),
                box.right()
            ]])), axis = 0)
    
    return faces, boxes
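
The batching part of this function can also be written by collecting the crops in a list and stacking them once at the end. The sketch below is an illustration under stated assumptions, not a replacement: `stack_crops` is a hypothetical helper, the boxes are made-up `(top, left, bottom, right)` tuples rather than dlib rectangles, and a simple nearest-neighbour resize done with NumPy indexing stands in for `cv2.resize`.

```python
import numpy as np

def stack_crops(frame, boxes, size):
    """Hypothetical helper: crop (top, left, bottom, right) boxes, clamp them to the
    frame and stack the non-empty crops, resized by nearest-neighbour sampling."""
    height, width = frame.shape[:2]
    crops = []
    for top, left, bottom, right in boxes:
        top, left = max(top, 0), max(left, 0)
        bottom, right = min(bottom, height), min(right, width)
        if bottom <= top or right <= left:
            continue  # nothing left of this box inside the frame
        crop = frame[top:bottom, left:right]
        # nearest-neighbour resize with plain indexing (stand-in for cv2.resize)
        rows = np.arange(size) * crop.shape[0] // size
        cols = np.arange(size) * crop.shape[1] // size
        crops.append(crop[rows][:, cols])
    return np.stack(crops) if crops else None

frame = np.zeros((100, 120, 3), dtype=np.uint8)
batch = stack_crops(frame, [(-5, 10, 40, 60), (20, 30, 80, 110)], size=160)
print(batch.shape)
```

Like `prepare_faces`, it returns `None` when no usable face is found, so the same `is None` check applies.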

Running the face extraction and preparation on the image downloaded above gives:

In [None]:
faces, boxes = prepare_faces(image)
for face in faces:
    plt.imshow(face[:,:,::-1])
faces.shape, boxes.shape

### 05. Face Signatures ###

Now that I can extract faces from a random input image and prepare them for the correct machine learning algorithm input, let's see how the image gets converted: 

In [None]:
fig, ax = plt.subplots(1, 2, figsize = (10, 2), gridspec_kw={'width_ratios': [1, 4]}) # create a 1-row, 2-column plot, with 1:4 column ratio

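# the faces are still in OpenCV's BGR order here, while the model expects RGB
signature = model.predict(np.ascontiguousarray(faces[:,:,:,::-1]))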
ax[0].imshow(faces[0][:,:,::-1])
ax[1].plot(signature[0])
signature.shape

As this works for one of the faces, the next step is to generalize to all the selected posts from the profile I chose. The next cell will go through all the images, get the faces and their signatures, and store them in the `face_signatures` and `face_images` lists.

In [None]:
# define two lists to hold the data
face_signatures = [] # one for signatures
face_images = [] # one for face images

# go through all the urls I've got from Instagram
for image_url in image_urls:
    # download and extract the image matrix
    image = image_from_url(image_url)
    # convert it to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # extract the prepared faces
    faces, boxes = prepare_faces(image)
    if faces is None:
        continue
    # and get the signatures
    signatures = model.predict(faces)
    # put the faces in their list
    for face_index, face in enumerate(faces):
        face_images.append(face)
        face_signatures.append(signatures[face_index])

The next cell will show how the signature graphs compare to each other. You can notice that the signatures are pretty similar, given the faces are from the same person. This is how face recognition works.

In [None]:
# squeeze=False keeps ax two-dimensional even when there is only one face
fig, ax = plt.subplots(len(face_images), 2, squeeze=False, gridspec_kw={'width_ratios': [1, 4]})
for face_index, face in enumerate(face_images):
    ax[face_index, 0].imshow(face)
    ax[face_index, 1].plot(face_signatures[face_index])

### 06. Cosine Distance ###

Looks good! Now, there are a lot of distances that I can use between two vectors, because this is the output of the model: a 128-component vector. For similarity, a good choice is the cosine distance, which gives me the cosine of the angle between the two vectors (strictly speaking this is the cosine *similarity*: 1 means the same direction, and lower values mean the vectors are less alike). Unlike distances based on a norm, it depends only on the direction of the vectors, regardless of their magnitude. This is based on this formula:
$$ \langle h_1, h_2 \rangle = |h_1||h_2|\cos(\sphericalangle(h_1, h_2)) $$

Which leads to this formula:

$$d_{cos}(h_1, h_2) = \frac{\langle h_1, h_2 \rangle}{|h_1||h_2|}$$

For which I've implemented this function:

In [None]:
def cos_distance(signature_A, signature_B):
    """
    Returns the cosine distance between vectors signature_A and signature_B.
    @param signature_A (np.array) : the first vector;
    @param signature_B (np.array) : the second vector;
    @return (float) : a number, representing the cosine distance.
    """
    return np.matmul(signature_A, signature_B) / (np.linalg.norm(signature_A) * np.linalg.norm(signature_B))
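
A few hand-checkable cases make the behaviour of this formula concrete. The sketch repeats the function (with `np.dot`, which is equivalent to `np.matmul` for 1-D vectors) so it runs on its own; the vectors are made up.

```python
import numpy as np

def cos_distance(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v = np.array([1.0, 2.0, 3.0])

same_direction = cos_distance(v, 10 * v)   # length differs, direction does not
orthogonal = cos_distance(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
opposite = cos_distance(v, -v)
print(same_direction, orthogonal, opposite)
```

Scaling a vector does not change the result, which is the "regardless of magnitude" property mentioned above.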

Now, I'll compute the cosine distance between every pair of faces. As this distance is symmetric and is 1.00 for an image compared with itself, all I have to do is loop through the faces and compare each face with the ones with a strictly higher index.

I'll put the results into a matrix, but using the method I described the matrix will be strictly upper-triangular, so I'll need to add the identity matrix to it (ones on the diagonal) and its transpose (a strictly lower-triangular matrix) to get the full picture.

In [None]:
faces_no = len(face_signatures)
# prepare the resulting matrix
results = np.zeros((faces_no, faces_no))
# for each face
for i in range(faces_no):
    # get each face with a higher index
    for j in range(i + 1, faces_no):
        # and compute the distance
        results[i,j] = cos_distance(face_signatures[i], face_signatures[j])
# build a matrix that is symmetric and has 1.00 on the main diagonal
results = results + np.eye(faces_no) + results.transpose()
results
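
The same matrix can be obtained without explicit loops. This is a sketch with three made-up 4-component vectors standing in for the real signatures: every row is scaled to unit length, after which a single matrix product yields all the pairwise cosines.

```python
import numpy as np

# hypothetical signatures: 3 vectors with 4 components each
signatures = np.array([[1.0, 0.0, 0.0, 0.0],
                       [1.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 2.0, 0.0]])

# scale every row to unit length, then take all pairwise dot products in one go
unit = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
similarity = unit @ unit.T
print(np.round(similarity, 3))
```

With the real data, `np.array(face_signatures)` would take the place of `signatures`.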

Or better, I can use `matshow` from `matplotlib.pyplot` to draw a nice image of the matrix: the lighter colors represent a bigger match, while the dark colors represent a lower match between the compared faces:

In [None]:
plt.matshow(results)

To see the percentages better, I can loop over the cell coordinates and put the text containing the percentage in each cell, using `matplotlib.pyplot.text`, like this:

In [None]:
plt.matshow(results)

for i in range(faces_no):
    for j in range(faces_no):
        # text() takes x (the column) first and y (the row) second
        plt.text(j, i, '%d%%' % int(100 * results[i,j]), va='center', ha='center')

Or the more advanced version, where I draw each face along the axes, to see which faces resemble which:

In [None]:
# I need to use subplots for this; this means that in a single figure I'll have a (1, 1) grid of subplots.
# The rest is actually the code to display the matrix
fig, ax = plt.subplots(1, 1)
ax.matshow(results)
# Together with getting the plot position on the canvas
pos = ax.get_position()

# Put also the percentages on the canvas, as I did before
for i in range(faces_no):
    for j in range(faces_no):
        # text() takes x (the column) first and y (the row) second
        ax.text(j, i, '%d%%' % int(100 * results[i,j]), va='center', ha='center')

# Compute the width and height of the matrix plot (the information, not the axis)
w = pos.x1 - pos.x0 # this is the width of the plot
h = pos.y1 - pos.y0 # this is the height of the plot
# Compute the width and height of a single cell:
img_w = w / faces_no # this is the width of one cell
img_h = h / faces_no # this is the height of one cell

# I'm using enumerate which provides not only the object but also an index for it
for num, image in enumerate(face_images):
    # I'm adding a new subplot, on the position starting from x0 (left) and on y1 (top) of the plot
    ax_w = fig.add_axes([pos.x0 + num * img_w, pos.y1, img_w, img_h])
    # I'll not display the number axis
    ax_w.axison = False
    # but will show the image
    ax_w.imshow(image)
    # Also, add a new subplot, on the position starting one cell left of x0 and going down from y1 (top)
    ax_h = fig.add_axes([pos.x0 - img_w, pos.y1 - (num + 1) * img_h, img_w, img_h])
    # same tricks, no number axis
    ax_h.axison = False
    # and display the image
    ax_h.imshow(image)

So if I want to use a few pictures as a reference for what the Instagram model looks like, I can do the following trick:
- take the average of the cosine distance for each image when compared to the others;
- take the standard deviation of the cosine distance for each image when compared to the others;

By computing $ \mu - 5\sigma $ I get the minimum estimated distance threshold for a face when compared to the others:

In [None]:
fig, ax = plt.subplots(1, 1)
ax.bar(range(results.shape[0]), np.mean(results, axis = 1) - 5 * np.std(results, axis = 1))
pos = ax.get_position()

margins = ax.margins()
# Compute the width and height of the matrix plot (the information, not the axis)
w = pos.x1 - pos.x0 - margins[0] # this is the width of the plot
h = pos.y1 - pos.y0 - margins[1]# this is the height of the plot
# Compute the width and height of a single cell:
img_w = w / faces_no # this is the width of one cell
img_h = h / faces_no # this is the height of one cell

# I'm using enumerate which provides not only the object but also an index for it
for num, image in enumerate(face_images):
    # I'm adding a new subplot for each face, laid out left to right near the bottom of the figure
    ax_w = fig.add_axes([pos.x0 + margins[0] * 0.5 + num * img_w, img_h + margins[1] * 0.5, img_w, img_h])
    # I'll not display the number axis
    ax_w.axison = False
    # but will show the image
    ax_w.imshow(image)
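
How such a threshold might be used is sketched below. Everything here is an assumption made for illustration: the 3x3 `results` matrix and the `new_scores` are invented numbers, and the decision rule (the new face must reach the threshold of every reference face) is only one possible choice.

```python
import numpy as np

# hypothetical 3x3 matrix of pairwise cosine values between reference faces
results = np.array([[1.00, 0.90, 0.85],
                    [0.90, 1.00, 0.88],
                    [0.85, 0.88, 1.00]])

# one threshold per reference face: mean minus 5 standard deviations
thresholds = np.mean(results, axis=1) - 5 * np.std(results, axis=1)

# hypothetical cosine values of a new face against the three reference faces
new_scores = np.array([0.87, 0.91, 0.86])

# assumed rule: accept only if every score reaches its threshold
is_match = bool(np.all(new_scores >= thresholds))
print(thresholds, is_match)
```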