# Project 7: Facial Detection with Artificial Neural Networks
- Name: Carson Stevens
- Date: November 28, 2018

## Background
Facial detection is a specific case of object-class detection, which attempts to find the locations and sizes of objects in an image that belong to a given class. We're going to train a neural network to do this by giving it a database of positive examples so it can learn what to look for and negative examples so it can learn what not to look for.

There are multiple ways to do this, which you can read more about [here](https://towardsdatascience.com/face-detection-for-beginners-e58e8f21aad9), but here we're going to take a feature-based approach: the structural features of each face will be extracted, then the model is trained to spot them and acts as a classifier.

Facial detection is very useful for camera-based apps, which need to find faces quickly. One "quick and dirty" method is to train a model with Haar-like features: differences between the sums of pixel intensities in adjacent rectangular regions of a photo. This is fast and pretty successful, but trades some accuracy for speed. For an example, look [here](https://docs.opencv.org/3.4.3/d7/d8b/tutorial_py_face_detection.html).
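
As a rough illustration of a Haar-like feature (a sketch, not OpenCV's implementation), the code below computes a two-rectangle feature: the pixel sum of one region minus the pixel sum of the region next to it. It uses an integral image so each region sum costs four lookups. The function names `integral_image`, `region_sum`, and `two_rect_feature` are made up for this example.

```python
import numpy as np

def integral_image(img):
    # ii[r, c] holds the sum of img[:r, :c]; the extra zero row and column
    # remove special cases at the image border
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def region_sum(ii, top, left, height, width):
    # sum of img[top:top+height, left:left+width] from four lookups
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(img, top, left, height, width):
    # left rectangle minus the equally sized rectangle to its right
    ii = integral_image(img)
    left_sum = region_sum(ii, top, left, height, width)
    right_sum = region_sum(ii, top, left + width, height, width)
    return left_sum - right_sum

# toy image: bright left half, dark right half (a vertical edge)
toy = np.hstack([np.ones((4, 4)), np.zeros((4, 4))])
edge_response = two_rect_feature(toy, 0, 0, 4, 4)
flat_response = two_rect_feature(np.ones((4, 8)), 0, 0, 4, 4)
print(edge_response, flat_response)
```

The feature responds strongly where the two regions differ in brightness (an edge) and gives zero on a flat region, which is why a set of such features at different positions and sizes can pick up facial structure cheaply.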

We're going to be using a technique known as Histogram of Oriented Gradients (HOG), which is described more fully below. It's an older method, but it remains a strong, well-understood baseline among hand-crafted feature techniques. It's a lot slower than Haar features (and, as used here, too slow for live camera use), but what we lose in time, we gain in predictive power.

## Setup

In [None]:
import cv2
import time
import numpy as np
import skimage.data
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
from skimage import data, color, feature, transform
from sklearn.datasets import fetch_lfw_people
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.image import PatchExtractor
from sklearn.metrics import classification_report, confusion_matrix

# ignore warnings - they're just future deprecation notices
import warnings
warnings.filterwarnings('ignore')

## Problem 1: load the dataset (25 pts)
For our faces, we're going to use the [Labeled Faces in the Wild (LFW)](http://vis-www.cs.umass.edu/lfw/) database. It contains more than 13,000 labeled face images and is usually used for facial *recognition*, which involves teaching a model names as well as the facial features that go along with each name, but we're just going to detect faces here.

- Load the faces with `faces = fetch_lfw_people()`, then save the images from `faces.images` in a variable. These 'patches' will be our positive training samples.

- Next, negative patches are loaded for you.

- Combine positive and negative patches into a single set of inputs. (Hint: consider `np.concatenate()`)

- Finally, create a set of targets with a `1` for every positive patch and a `0` for every negative. (Hint: consider `np.ones()` and `np.zeros()`) You can get the number of both sets of patches with `shape[0]`

In [None]:
# training data
## positive
faces = fetch_lfw_people()
positive_patches = faces.images

In [None]:
## negative patches
imgs_to_use = ['camera', 'text', 'coins', 'moon',
'page', 'clock', 'immunohistochemistry',
'chelsea', 'coffee', 'hubble_deep_field']

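def load_grayscale(name):
    # helper added for robustness: some sample images are already 2-D grayscale,
    # and newer scikit-image releases may reject 2-D input to rgb2gray
    img = getattr(data, name)()
    return color.rgb2gray(img) if img.ndim == 3 else img

negative_images = [load_grayscale(name) for name in imgs_to_use]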

def extract_patches(img, N, scale=1.0, patch_size=positive_patches[0].shape):
    extracted_patch_size = tuple((scale * np.array(patch_size)).astype(int))
    extractor = PatchExtractor(patch_size=extracted_patch_size,
                               max_patches=N, random_state=0)
    patches = extractor.transform(img[np.newaxis])
    if scale != 1:
        patches = np.array([transform.resize(patch, patch_size)
                            for patch in patches])
    return patches

negative_patches = np.vstack([extract_patches(im, 500, scale) for im in negative_images for scale in [0.5, 1.0, 2.0]])
negative_patches.shape

In [None]:
# combine positive and negative
# inputs
X = np.concatenate((positive_patches, negative_patches), axis=0)
# targets
y = np.concatenate((np.ones(len(positive_patches)), np.zeros(len(negative_patches))))
#print(X.shape, y.shape, positive_patches.shape, negative_patches.shape)

## Problem 2: HOG features (10 pts)
*Histogram of Oriented Gradients* (HOG) is a feature descriptor: a representation of an image that simplifies it by extracting useful information and discarding extraneous information. This feature descriptor counts occurrences of "gradient orientation" in localized portions of an image. Basically, an image is divided into regions (cells). In each region, the directions and magnitudes of the changes in light intensity (i.e., edges) are calculated, and these gradients are compiled into a histogram. The histograms are then normalized over larger blocks of neighboring regions, which makes the descriptor less sensitive to lighting and contrast. The extracted information can be used for face detection by feeding it to a learning model, which learns how light interacts with faces and what edges and shapes the parts of a face have.
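
To make the per-region histogram concrete, here is a simplified sketch for a single cell. It is only an illustration of the idea: `skimage.feature.hog()` differs in its details (cell and block layout, normalization options), and the names `cell_orientation_histogram` and `l2_normalize` are made up for this example.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    # intensity gradients along rows (gy) and columns (gx)
    gy, gx = np.gradient(cell.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    # unsigned orientation in degrees, folded into [0, 180)
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    # each pixel votes for its orientation bin, weighted by gradient magnitude
    hist, _ = np.histogram(orientation, bins=n_bins, range=(0.0, 180.0),
                           weights=magnitude)
    return hist

def l2_normalize(hist, eps=1e-5):
    # scale the histogram to (approximately) unit length
    return hist / np.sqrt(np.sum(hist ** 2) + eps ** 2)

# toy 8x8 cell whose brightness increases from left to right
ramp = np.tile(np.arange(8, dtype=np.float64), (8, 1))
hist = cell_orientation_histogram(ramp)
normalized = l2_normalize(hist)
print(hist)
```

Every pixel of the ramp has the same gradient direction, so all of the votes land in a single bin. A face patch instead spreads its votes across bins in a pattern the classifier can learn.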

Here are some more resources to learn about HOG
- [wiki](https://en.wikipedia.org/wiki/Histogram_of_oriented_gradients)
- [OpenCV blog](https://www.learnopencv.com/histogram-of-oriented-gradients/)
- [Stanford](http://vision.stanford.edu/teaching/cs231b_spring1213/slides/HOG_2011_Stanford.pdf)
- [Intel](https://software.intel.com/en-us/ipp-dev-reference-histogram-of-oriented-gradients-hog-descriptor)

**Procedure**<br>
We want to call `skimage.feature.hog()` on each image, either in a loop or a comprehension, store each in a list, then create an `np.array` from that list.

Then, split that array with `train_test_split()`

In [None]:
# hog features
hog_X = [skimage.feature.hog(face) for face in X]
hog_X = np.array(hog_X)

In [None]:
# splitting
X_train, X_test, y_train, y_test = train_test_split(hog_X, y)
#print(X_train.shape, y_train.shape)

In [None]:
# image functions - these functions are given and will be used below ... leave as is please :)
def show_img(img, interpolation='bicubic', axis=False):
    '''
    basic image plotter

    Arguments:
        img - the image (which is an ndarray)
        interpolation - interpolation method passed to plt.imshow
        axis - whether to draw the axes (hidden by default)

    Returns:
        void
    '''
    plt.imshow(img, cmap='gray', interpolation=interpolation)
    if not axis:
        plt.axis('off')
    plt.show()
    
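def convert_img_to_greyscale(img):
    '''
    convert an image to greyscale

    Arguments:
        img - the image/ndarray

    Returns:
        greyscale image (returned unchanged if it is already 2-D)
    '''
    # newer scikit-image releases may reject 2-D input to rgb2gray
    if img.ndim == 2:
        return img
    return skimage.color.rgb2gray(img)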

def rescale_img(img, r_scale=0.5, r_mode='reflect'):
    '''
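    rescale image
    
    Arguments:
        img - the image/ndarray
        r_scale - scale factor (0.5 halves each dimension)
        r_mode - how values beyond the image border are filled (passed to skimage.transform.rescale as mode)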

    Returns:
        rescaled image - skimage
    '''
    return skimage.transform.rescale(img, r_scale, mode=r_mode)

def show_hog_features(image, is_greyscale=False, is_rescaled=False):
    '''
    plot image and hog features side by side
    
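    Arguments:
       image - the image
       is_greyscale - True if the image is already greyscale (skips conversion)
       is_rescaled - True if the image is already rescaled (skips rescaling)
    '''
    if not is_greyscale:
        image = convert_img_to_greyscale(image)
    if not is_rescaled:
        image = rescale_img(image)
    # `visualize` replaces the deprecated `visualise` spelling in scikit-image
    hog_vec, hog_vis = feature.hog(image, visualize=True)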
    fig, ax = plt.subplots(1, 2, figsize=(12, 6), subplot_kw=dict(xticks=[], yticks=[]))
    ax[0].imshow(image, cmap='gray')
    ax[0].set_title('input image')

    ax[1].imshow(hog_vis, cmap='gray')
    ax[1].set_title('visualization of HOG features');
    plt.show()
    
def show_patches(img, indices, label):
    '''
    superimpose the patches predicted to contain a face on the original image
    
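    Arguments:
        img - the image
        indices - (row, column) top-left corner of each patch
        label - array of predicted labels, 1 where a patch contains a face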
        
    Returns:
        void
    '''
    fig, ax = plt.subplots()
    ax.imshow(img, cmap='gray')
    ax.axis('off')

    Ni, Nj = positive_patches[0].shape
    indices_arr = np.array(indices)

    for i, j in indices_arr[label == 1]:
         ax.add_patch(plt.Rectangle((j, i), Nj, Ni, edgecolor='red', alpha=0.3, lw=2, facecolor='none'))
    
    plt.title("Faces Found: " + str(len(indices_arr[label == 1])))
    plt.show()

### Example visualization of HOG features

In [None]:
# This code is also given to load the images we will be working with and show the HOG result
astronaut = skimage.data.astronaut()
show_hog_features(astronaut)
cropped_astronaut = rescale_img(convert_img_to_greyscale(astronaut))[0:100, 70:160]
show_hog_features(cropped_astronaut)

## Problem 3: Multilayer Perceptron (MLP) (10 pts)
A perceptron is a linear classifier: it classifies input by separating two categories with a straight line (more generally, a hyperplane), `y = w * X + b`, where the input feature vector `X` is multiplied by weights `w` and a bias `b` is added. This "neuron" works by calculating a weighted sum of its inputs, adding the bias, and deciding whether it should "fire" or not. By itself, a perceptron is just a building block, only useful when combined and expanded into larger functions, such as a multilayer perceptron.
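
As a minimal sketch of that weighted-sum-and-threshold rule, the perceptron below uses hand-picked weights (hypothetical values chosen for this example, not learned) that make it compute a logical AND of two inputs. The function name `perceptron_predict` is also made up.

```python
import numpy as np

def perceptron_predict(X, w, b):
    # weighted sum of the inputs plus the bias, then a hard threshold at 0
    return (X @ w + b > 0).astype(int)

# hand-picked (hypothetical) weights implementing a logical AND
w = np.array([1.0, 1.0])
b = -1.5
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
predictions = perceptron_predict(X, w, b)
print(predictions)  # [0 0 0 1]
```

Only the input `[1, 1]` pushes the weighted sum above zero, so only that row "fires". The line `x1 + x2 = 1.5` is the decision boundary separating the two classes.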

An MLP is a feedforward artificial neural network built from layers of these neurons. Deep MLPs differ from single-hidden-layer neural networks in their *depth*: how many layers of nodes data passes through in the process of pattern recognition. Generally, having `3` or more hidden layers qualifies a model as "deep". The hidden layers collectively form a *feature hierarchy*: advancing further in the network corresponds to increasing complexity and abstraction of the features it can recognize. This is great for large, high-dimensional data, e.g. raw media like photos.

An MLP can be created with [MLPClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html), for which you can specify hidden layer sizes, activation function, weight optimization solver, batch size, verbosity, and whether training should stop early to avoid overfitting, among other options.

For this problem, we want to create an MLP classifier with at least `3` hidden layers. We also want to choose the activation function, which does the work of deciding whether a neuron should fire (or "activate") or not. At its simplest, this is a step function that fires the neuron if `y > threshold` and does not fire otherwise. For this problem, let's choose `ReLU` (Rectified Linear Unit), which returns 0 for negative input and returns positive input unchanged. There's a lot more that can be said, so look [here](https://www.kaggle.com/dansbecker/rectified-linear-units-relu-in-deep-learning) if you're interested.
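
The two activation functions described above can be sketched in a couple of lines of NumPy. The function names `step` and `relu` are chosen for this example and are not part of sklearn's API.

```python
import numpy as np

def step(z, threshold=0.0):
    # fires (1.0) only when the input exceeds the threshold
    return (z > threshold).astype(float)

def relu(z):
    # zero for negative input, the input itself otherwise
    return np.maximum(0.0, z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
step_out = step(z)
relu_out = relu(z)
print(step_out)
print(relu_out)
```

Unlike the step function, ReLU keeps the size of a positive input, so the next layer can tell a weak activation from a strong one.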

Next is the `solver`, which decides how the weights are mathematically optimized. There are a few options which the sklearn [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html) describes, but for simplicity, just stick with the default `adam` for now. 

Next is `batch_size`, which sets the size of the minibatches used by the stochastic optimizers. The training data is split into small groups, and each weight update uses the gradient averaged over one group. Averaging over a group gives a less noisy gradient estimate than a single sample, at a lower cost than using the whole training set.
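
The minibatch idea can be sketched with plain gradient descent on a small linear model. This is an illustration only: it uses a fixed learning rate rather than `adam`, the data is synthetic, and the names `iterate_minibatches` and `mse_gradient` are made up for the example.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    # shuffle once per epoch, then yield consecutive slices of the shuffled data
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]

def mse_gradient(w, X_batch, y_batch):
    # gradient of the mean squared error, averaged over the samples in the batch
    residual = X_batch @ w - y_batch
    return 2.0 * X_batch.T @ residual / len(X_batch)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])  # synthetic "ground truth" weights
y = X @ true_w

w = np.zeros(3)
for epoch in range(50):
    for X_batch, y_batch in iterate_minibatches(X, y, batch_size=20, rng=rng):
        # one weight update per minibatch
        w -= 0.05 * mse_gradient(w, X_batch, y_batch)

print(w)
```

With 200 samples and a batch size of 20, each epoch makes 10 weight updates instead of one, which is the main practical effect of choosing a smaller `batch_size`.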

For our own sake, let's tell the model to print out how it's doing each epoch/iteration. This is just a flag we want to set with `verbose=True`. Finally, let's avoid overfitting by telling the model to stop if the validation score does not improve enough over consecutive epochs.

**Parameters**
- hidden_layer_sizes: a tuple of neurons per layer, default (100,)
- activation : {‘identity’, ‘logistic’, ‘tanh’, ‘relu’}, default ‘relu’
- solver : {‘lbfgs’, ‘sgd’, ‘adam’}, default ‘adam’
- batch_size: int, default 'auto'
- verbose: boolean (True/False), whether or not to print out progress at each stage of training
- early_stopping: boolean, whether to stop training when the validation score stops improving, which helps prevent overfitting (like we did on the gradient descent assignment)


**Procedure**
- create the model object with at least 3 hidden layers and the appropriate parameters as detailed above.
- fit the model on `X_train` and `y_train`

In [None]:
# multilayer perceptron model
model = MLPClassifier(hidden_layer_sizes=(100,100,100), activation='relu', solver='adam',verbose=True, early_stopping=True)
model.fit(X_train, y_train)

In [None]:
# now instead of getting hog features of entire image, break image into
# patches and get hog features of each
def sliding_window(img, patch_size=positive_patches[0].shape,
                   istep=2, jstep=2, scale=1.0):
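    '''
    slide a window over the image and yield each patch with its position

    Yields:
        (i, j), patch - top-left corner of the patch and the patch itself
    '''
    Ni, Nj = (int(scale * s) for s in patch_size)
    for i in range(0, img.shape[0] - Ni, istep):
        # the column range must use the patch width Nj, not the height Ni
        for j in range(0, img.shape[1] - Nj, jstep):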
            patch = img[i:i + Ni, j:j + Nj]
            if scale != 1:
                patch = transform.resize(patch, patch_size)
            yield (i, j), patch

In [None]:
def get_hog_features(images: list, scale = 1.0):
    image_hogs = []
    image_indices = []
    for image in images:
        indices, patches = zip(*sliding_window(image, scale=scale))
        image_indices.append(indices)
        patches_hog = np.array([feature.hog(patch) for patch in patches])
        image_hogs.append(patches_hog)
        print(patches_hog.shape)
    return image_hogs, image_indices

## Problem 4: Image uploading (15 pts)
You will be uploading at least `5` images in this portion - at least 1 must have a "clearly visible" face in it.
<br>For this we will be using the Open Source Computer Vision(OpenCV) library. Specifically, we will be reading images in, manipulating and displaying them, then saving the changes.
<br>The relevant functions here are `cv2.imread()`, `cv2.imshow()`, and `cv2.imwrite()`, and matplotlib's `plt.imshow()`
<br>Before uploading, be sure each photo is a `jpg` and is relatively small (use your own or the provided images to begin with).

**Procedure**
- Make a list of filenames of your images
- Create a list of targets with 1's corresponding to positive images and 0's for negatives (e.g., [1.0, 1.0, 1.0, 0.0, 0.0] would be positive for images 1, 2, and 3 and negative for the last two).
- Loop over the filename list, call `cv2.imread()` on the filename with the `cv2.IMREAD_GRAYSCALE` option as the second parameter, and store the result in another list (e.g., images). 
- If you want to display the images, create another loop to go through the images and call `plt.imshow()`

In [None]:
# image uploading
filenames = ["brain-1.jpg", "face-1.jpg", "face-2.jpg", "not-face-1.jpg", "not-face-2.jpg"]
# inputs
images = []
# targets
images_targets = [0.0, 1.0, 1.0, 0.0, 0.0]

# read
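for filename in filenames:
    image = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
    # cv2.imread returns None instead of raising when a file cannot be read
    if image is None:
        raise FileNotFoundError(filename)
    images.append(image)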

# display
for image in images:
    plt.imshow(image, cmap = 'gray', interpolation = 'bicubic')
    plt.xticks([])
    plt.yticks([])
    plt.show()

## Problem 5: Rescale and show HOG features (15 pts)
Loop over the list of images and using the helper functions in Problem 2 above, rescale each image and then display the hog features of each, again using the functions above (See the "Example visualization of HOG features" above for examples).

In [None]:
for image in images:
    # the images were read with cv2.IMREAD_GRAYSCALE, so skip the greyscale conversion;
    # show_hog_features() rescales the image itself before computing the hog features
    show_hog_features(image, is_greyscale=True)

## Problem 6: Predict detections (20 pts)
Use the MLP model to predict faces by feeding HOG patches to `predict()`.

**Procedure**
- call `get_hog_features()`, which returns two things: the HOG features of each patch and the indices of those patches.
- loop over the HOG features variable, calling `predict()` on each. This result is the predicted labels - store in a `labels` variable, then append each of those `labels` to a list (e.g., image_labels).
- print how many patches have faces in them by calling `sum()` on `labels`

In [None]:
# feed patches to model to predict positive patches
image_labels = []
image_hogs, image_indices = get_hog_features(images, 1.0)
for hog in image_hogs:
    labels = model.predict(hog)
    image_labels.append(labels)
    print(sum(labels))

## Problem 7: plotting predictions  (10 pts)
Finally, we want to visualize how the model did.
- Loop over your images. On the `ith` iteration, call `show_patches()` with `images[i]`, `indices[i]`(this is the second value returned from `get_hog_features()`), and `labels[i]`(the list created in the previous problem)

In [None]:
print(len(images), len(image_indices), len(image_labels))
for i in range(len(images)):
    show_patches(images[i], image_indices[i], image_labels[i])

## Problem 8: Improvements (10 pts)
Weird results, right? How can we improve the model's performance? There are a lot of possibilities, but consider starting with sliding box scale. You can modify this by calling `pipeline()` below, which just does all the steps above.

There isn't one right answer here, and your results don't necessarily have to improve, so explore and experiment with what affects performance.

Some possibilities:
- sliding box scale: make the sliding box/window bigger or smaller by passing different values to `pipeline()` below
- normalizing data with StandardScaler, like in lecture 23
- randomizing input data before splitting into training and testing sets. For this, make sure the targets list is shuffled in the same way so as to still correspond to the inputs. Note that `train_test_split()` already shuffles by default, so this matters mainly if you split the data yourself.
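
The last two ideas can be sketched together on toy data. This is a NumPy-only illustration under stated assumptions: `features` and `targets` are hypothetical stand-ins for the HOG array and target array, and the manual standardization mimics what `StandardScaler` does when it is fit on the training portion only.

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for the HOG features and targets (toy data, not the real features)
features = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
targets = np.arange(100) % 2

# one permutation applied to both arrays keeps rows and labels aligned
order = rng.permutation(len(features))
features_shuffled, targets_shuffled = features[order], targets[order]

# compute the statistics on the training portion only, then reuse them on the test portion
split = 75
train, test = features_shuffled[:split], features_shuffled[split:]
mean, std = train.mean(axis=0), train.std(axis=0)
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
```

If the model is trained on scaled features, the HOG features of the sliding-window patches would need the same scaling (same mean and standard deviation) before calling `predict()`.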

Some possible extensions:
- instead of plotting every box, average their locations and plot only a single "best guess" for where the face(s) is(are).
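
One simple way to get such a "best guess", sketched here under the assumption that the image contains a single face, is to average the top-left corners of all patches predicted positive. The function name `best_guess_box` and the toy inputs are made up for this example.

```python
import numpy as np

def best_guess_box(indices, labels):
    # keep the top-left corners of the patches predicted to contain a face
    corners = np.asarray(indices)[np.asarray(labels) == 1]
    if len(corners) == 0:
        return None  # no positive patches, so no guess
    # mean (row, column) of the positive corners
    return tuple(corners.mean(axis=0))

# toy patch positions and predicted labels
indices = [(0, 0), (10, 20), (12, 22), (14, 24), (50, 60)]
labels = np.array([0, 1, 1, 1, 0])
guess = best_guess_box(indices, labels)
print(guess)
```

A plain average breaks down when there are several faces or scattered false positives, since the mean can land between clusters. Grouping nearby boxes first would be a more robust starting point in that case.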

In [None]:
def plot_predictions(images: list, indices: list, labels: list):
    for i in range(len(images)):
        show_patches(images[i], indices[i], labels[i])

# full workflow
def pipeline(model, images: list, scale = 1.0):
    image_labels = []
    image_hogs, image_indices = get_hog_features(images, scale)

    for hog in image_hogs:
        labels = model.predict(hog)
        image_labels.append(labels)

    plot_predictions(images, image_indices, image_labels)
    


In [None]:
# try with scale 0.5, 0.75, 1.0, 1.25, 1.5
pipeline(model, images, 0.5)
pipeline(model, images, 0.75)
pipeline(model, images, 1.0)
pipeline(model, images, 1.25)
pipeline(model, images, 1.5)

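# Had to use the separate calls above because the kernel kept crashing with a loop.
# Note: range() only accepts integers, so loop over an explicit list of scales instead:
# for scale in [0.5, 0.75, 1.0, 1.25, 1.5]:
#     pipeline(model, images, scale)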

In [None]:
#Comments
"""
Time Spent: 2.5 hours
Enjoyed:    This was a super interesting project with a satisfying result.
Disliked:   Was hard to run on servers without kernel dying. I could get a
            result without it crashing half the time. Much of the time spent
            was waiting.
"""