# Active Appearance Models with Menpo for Infrared Images

Requirements: menpo 0.7.7, menpofit 0.4.1, menpodetect 0.4.0 (newer versions still provide the same functionality but may expose a different interface, which would require changes to this code)

Install the packages with Anaconda or check menpo.org for details.

### Note that Windows support has been suspended by the menpo developers.
We can provide you with a copy of our compatible local Windows environment, which you can use at your own risk. It is in the same cloud folder as the database and this code. Extract it into the ``envs`` subfolder of your conda installation and hope for the best.

## Train Your AAM

### Settings AAM-Training

To train an Active Appearance Model (AAM), a database of images with a consistent number of labeled points per image is required. The path to this database should be specified below in ``IMAGE_PATH``. The labeled points should be stored either in a ``.pts`` or in a ``.ljson`` file, and the file type has to be specified in ``LANDMARK_GROUP``.
Afterwards a feature descriptor has to be selected:

 * ``no_op`` uses the raw image data
 * ``dsift`` and ``fast_dsift`` compute [dense sift features](https://pdfs.semanticscholar.org/ac08/4587faa2227e8e09a0d2b7803f60f23be1c1.pdf)
 * ``hog`` will compute the [histogram of oriented gradients](http://web.eecs.umich.edu/~silvio/teaching/EECS598_2010/slides/09_28_Grace.pdf)

Finally, a scale and an image diagonal have to be chosen (we recommend starting with the default parameters).

In [6]:
import numpy as np  # needed for the float32 conversion below
from menpo.feature import hog, ndfeature # you can import no_op and dsift as well (hog is best for infrared images though)
IMAGE_PATH = "d:/downloads/FaceDB_PNG_2935" # "/home/temp/schock/Infrared/Databases/FaceDB_Snapshot"
LANDMARK_GROUP = "LJSON" #or "PTS"
# use hog or dsift if you want precision, no_op to save memory and time (but no_op performs _really_ badly on thermal data)
features = hog

# you will need at least 8, preferably 16 GB RAM to train the model with these settings
scales = 1 # use 2 if you have enough RAM (>32 GB)

# use 120 or 150 if you have enough RAM - at higher values and with dsift or hog features, this eats RAM like popcorn
# larger diagonal = higher precision. Results in the paper were with 150 iirc.
diagonal = 50 

# convert the features from 64 to 32 bit; this has no impact on fitting precision but saves 50% memory
# thanks to the menpo team for the hint
# define an equivalent wrapper if you use a feature other than hog in your code
# note: the wrapper is only used if you assign features = float32_hog after this definition
@ndfeature
def float32_hog(x):
    return hog(x).astype(np.float32)
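
To get a feeling for why the numeric precision of the features matters, the following back-of-the-envelope sketch estimates the memory needed to hold dense features for a whole training set. The function name and the example dimensions (36 feature channels, 40 x 40 pixels) are assumptions chosen for illustration; they are not values measured from menpo.

```python
def feature_memory_gib(n_images, n_channels, height, width, bytes_per_value):
    """Rough size in GiB of a dense feature stack (ignores any library overhead)."""
    n_values = n_images * n_channels * height * width
    return n_values * bytes_per_value / 2**30


# Hypothetical example: 2935 images, 36 feature channels, 40 x 40 pixels each.
as_float64 = feature_memory_gib(2935, 36, 40, 40, bytes_per_value=8)
as_float32 = feature_memory_gib(2935, 36, 40, 40, bytes_per_value=4)
print("float64: %.2f GiB, float32: %.2f GiB" % (as_float64, as_float32))
```

Whatever the real dimensions are, halving the bytes per value halves the estimate, which is the 50% saving mentioned in the comment above.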

### Loading Images

With ``mio.import_images`` we first import all images in ``IMAGE_PATH`` together with their corresponding landmarks.
Afterwards we crop every image to its respective landmarks to ensure that the resulting image contains only the face.
The last step of loading the images is to convert them to greyscale if they are not already.
> #### Note: The landmark files must have the same name as the corresponding image files but with a different file extension

In [2]:
from menpo import io as mio
from tqdm import tqdm

print("Importing images")
train_images = []

for i in tqdm(mio.import_images(IMAGE_PATH)):
    
    # Crop images to Landmarks --> only Face on resulting image
    i = i.crop_to_landmarks_proportion(0.1)
    
    # Convert multichannel images to greyscale
    if i.n_channels > 2:
        i = i.as_greyscale()
        
    train_images.append(i)

    
print("Successfully imported %d Images" % len(train_images))

Importing images

100%|██████████████████████████████████████████████████████████████████████████████| 2935/2935 [01:18<00:00, 37.19it/s]

Successfully imported 2935 Images
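
If the number of imported images is lower than expected, or a later step complains about missing landmarks, a quick check of the naming rule from the note above may help. This is a minimal sketch using only the standard library; the helper name and the default extensions (``.png`` and ``.ljson``) are assumptions and should be adapted to your database.

```python
from pathlib import Path


def images_without_landmarks(image_dir, image_ext=".png", landmark_ext=".ljson"):
    """Return the names of images that have no landmark file with the same stem."""
    image_dir = Path(image_dir)
    landmark_stems = {p.stem for p in image_dir.glob("*" + landmark_ext)}
    return sorted(p.name for p in image_dir.glob("*" + image_ext)
                  if p.stem not in landmark_stems)
```

Calling ``images_without_landmarks(IMAGE_PATH)`` should return an empty list if every image has a matching landmark file.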





### Training the AAM
Training an Active Appearance Model essentially amounts to computing a PCA for each of the two model parts (shape model and appearance model) and storing the results. With the settings defined above, the training code is therefore short:

In [7]:
from menpofit.aam import HolisticAAM as AAM

print("Training AAM")
aam = AAM(
    train_images,
    group=LANDMARK_GROUP,
    verbose=True,
    holistic_features=features,
    scales=scales,
    diagonal=diagonal
)

Training AAM
- Computing reference shape
- Computing batch 0
  - Warping images: [          ] 0% (23/2935) - 00:00:18 remaining
  - Building appearance model
  - Done

## Fitting from a trained AAM
### Creating an AAM-Fitter : LucasKanadeFitter
To fit images using the trained AAM, a fitter is necessary. The fitter is a class that carries out the whole optimization procedure.
As the compositional gradient descent algorithm, either the [Wiberg Inverse Compositional Gauss-Newton algorithm (WIC)](http://menpofit.readthedocs.io/en/stable/api/menpofit/aam/WibergInverseCompositional.html) or the [Simultaneous Inverse Compositional Gauss-Newton algorithm (SIC)](http://menpofit.readthedocs.io/en/stable/api/menpofit/aam/SimultaneousInverseCompositional.html) should be used [1](https://link.springer.com/article/10.1007%2Fs11263-016-0916-3).
The parameters ``n_shape`` and ``n_appearance`` specify how many of the PCA components are used in the fitting process. A float between zero and one is interpreted as the fraction of variance that the respective model should retain. An int is interpreted as the number of components to use.

In [8]:
from menpofit.aam import LucasKanadeAAMFitter as Fitter
from menpofit.aam import WibergInverseCompositional as WIC
from menpofit.aam import SimultaneousInverseCompositional as SIC

print("Creating Fitter")
fitter_alg = WIC # or SIC  --> Algorithm to be used by fitter
n_shape = 0.95 #  --> fraction of shape accuracy to remain (dimensionality reduction through PCA)
n_appearance = 0.95 #  --> fraction of appearance accuracy to remain (dimensionality reduction through PCA)

fitter = Fitter(aam=aam, 
                lk_algorithm_cls=fitter_alg,
                n_shape=n_shape, 
                n_appearance=n_appearance)

Creating Fitter
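
To illustrate what a fraction such as ``0.95`` for ``n_shape`` or ``n_appearance`` means, the sketch below picks the smallest number of leading PCA components whose eigenvalues add up to a given share of the total variance. The helper and the eigenvalues are made up for illustration, and menpofit may handle the boundary case slightly differently.

```python
from itertools import accumulate


def n_components_for_variance(eigenvalues, fraction):
    """Smallest number of leading components whose variance share reaches `fraction`."""
    total = sum(eigenvalues)
    for k, cumulative in enumerate(accumulate(eigenvalues), start=1):
        if cumulative / total >= fraction:
            return k
    # Guard against rounding: fall back to all components
    return len(eigenvalues)


# Made-up eigenvalues, sorted in descending order as a PCA would return them.
eigenvalues = [5.0, 3.0, 1.5, 0.4, 0.1]
print(n_components_for_variance(eigenvalues, 0.9))
```

With these numbers the first two components cover 80% of the variance and the first three cover 95%, so a fraction of 0.9 keeps three components.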


## Face Detection
To receive a good result from our LucasKanadeAAMFitter we need to give it a bounding box of the face as initialization. This bounding box defines the position and scale of the AAM's initial shape.

To do so, we need to define a function which takes a list of detected faces and returns the bounding box of the first one as a [PointDirectedGraph](http://docs.menpo.org/en/stable/api/shape/PointDirectedGraph.html):

In [9]:
import numpy as np
import menpo

def face_2_pointcloud(faces):
    # Describe the first detected face as [x, y, width, height]; menpo stores points as (y, x)
    if len(faces):
        face = np.array([faces[0].as_vector()[1], faces[0].as_vector()[0],
                         faces[0].range()[1], faces[0].range()[0]]).astype(np.uint16)
        print("Face detected. > ", face)
    else:
        # Fall back to an empty box at the origin if no face was found
        face = np.array([0, 0, 0, 0]).astype(np.uint16)
        print("NO Face detected.")

    # Corners of the bounding box as (y, x) pairs
    points = np.array([[face[1], face[0]],
                       [face[1]+face[3], face[0]],
                       [face[1]+face[3], face[0]+face[2]],
                       [face[1], face[0]+face[2]]])

    # Connect the four corners to a closed rectangle
    adjacency_matrix = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]])

    return menpo.shape.PointDirectedGraph(points, adjacency_matrix)
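
The corner ordering used in the function above can be checked without menpo. The following sketch (the helper name is hypothetical) builds the same four corners from an ``[x, y, width, height]`` description, keeping the ``(y, x)`` point order that menpo uses.

```python
def bbox_corners(x, y, width, height):
    """Corners of an axis-aligned box as (row, column) pairs, i.e. (y, x)."""
    return [
        (y, x),                    # top left
        (y + height, x),           # bottom left
        (y + height, x + width),   # bottom right
        (y, x + width),            # top right
    ]


# Example values in the [x, y, width, height] layout printed by face_2_pointcloud
corners = bbox_corners(279, 229, 445, 446)
print(corners)
```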

Afterwards we load a pretrained [hog-facedetector](http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html) to detect the faces and return their bounding boxes:

> #### Note: The face detector has been trained on infrared images. To guarantee good performance on other image types it should be retrained 

In [10]:
import menpodetect
import dlib
HOG_PATH = "./hog_detector.svm"

print("Loading Face Detector")
face_detector = menpodetect.DlibDetector(dlib.simple_object_detector(HOG_PATH))

Loading Face Detector


Next we load the test image and detect its face using the loaded face detector. The previously written function ``face_2_pointcloud`` then returns the bounding box as a PointDirectedGraph:

In [11]:
TEST_IMG_PATH = "D:/downloads/FaceDB_PNG_2935/irface_sub051_seq07_frm01053.jpg_lfb.png"#"PATH/TO/TEST_IMG" # "/home/temp/schock/Infrared/Databases/IR_HPE_Colette/001/image_00000.png" 

print("Loading Test Image")
test_img = mio.import_image(TEST_IMG_PATH)

print("Detecting Face")
test_face_bb = face_2_pointcloud(face_detector(test_img))


Loading Test Image
Detecting Face
Face detected. >  [279 229 445 446]




### Fitting
Using the detected face's bounding box, we start the fit by calling the fitter's method ``fit_from_bb`` and passing it the loaded image and the bounding box. The argument ``max_iters=25`` limits the compositional gradient descent to at most 25 iterations, which is usually a good tradeoff between runtime and accuracy.

The achieved ``fitting_result`` can be visualized with ``fitting_result.view()``. To show the created figure the pyplot ``show`` function is necessary afterwards.

In [12]:
from matplotlib import pyplot as plt

print("Fitting AAM")
fitting_result = fitter.fit_from_bb(test_img, test_face_bb, max_iters=25)

fitting_result.view()
plt.show()

Fitting AAM


AttributeError: module 'matplotlib.colors' has no attribute 'to_rgba'

## Emotion Detection
### Loading the pretrained classifier
To classify the face's emotion we provide a pretrained classifier. The following function tries to load it using ``joblib`` and falls back to loading it with ``pickle`` for backward compatibility if necessary:

In [None]:
import joblib
import pickle
def load_model(file_path: str):
    # Try joblib first; fall back to plain pickle for backward compatibility
    try:
        model = joblib.load(file_path)
    except Exception:
        with open(file_path, "rb") as f:
            model = pickle.load(f, encoding='latin1')

    return model

The classifier is now loaded with the function defined above. The file path to the classifier file has to be specified in ``EMOTION_CLF_PATH``. For the plotting step we also need to list the emotions the classifier can predict, in the order the classifier uses, as ``class_labels``:

> #### Note: The given emotion classifier has been trained on HoG-features of infrared images. They might be different from HoG-features of other image types. Therefore a retraining of the classifier is necessary to guarantee good performance on other image types.

In [None]:
EMOTION_CLF_PATH = "./classifier_29092016_neutral_freude_trauer_ueberraschung.pkl" # "/PATH/TO/CLF"
class_labels = ["neutral", "joy", "sorrow", "surprise"]
print("Loading Emotion Classifier")
emotion_clf = load_model(EMOTION_CLF_PATH)

### Extracting the relevant features
#### Specify relevant image part
To detect the emotion we use the prediction we obtained from the AAM, or more precisely its boundaries. We extract the points on the upper left and the bottom right of the bounding box and clip their values to the image range.

Afterwards we extract the part inside the bounding box and resize it to 144 x 144 pixels because our classifier was trained on HoG-Features of images which were sized like this.

The last two lines plot the extracted image part to verify that the right part was extracted.

In [None]:
import cv2

# get points of bounding_box: top left and bottom right, both as (y, x)
p_left_top, p_right_bottom = fitting_result.final_shape.bounds()

image_width, image_height = test_img.width, test_img.height

# clip values to image range and convert to int so they can be used as slice indices
y_min, x_min = np.clip(p_left_top, 0, [image_height, image_width]).astype(int)
y_max, x_max = np.clip(p_right_bottom, 0, [image_height, image_width]).astype(int)

# extract relevant image part
img_tmp = test_img.pixels.squeeze()[y_min:y_max, x_min:x_max]*255

img_tmp = cv2.resize(img_tmp, (144, 144))

plt.imshow(img_tmp, cmap='gray')
plt.show()
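
The clipping step can also be looked at in isolation. The sketch below is a plain-Python version with a hypothetical helper name and made-up bounds; it clamps a ``(y, x)`` box to the image and returns integer slice limits.

```python
def clip_box(top_left, bottom_right, image_height, image_width):
    """Clip a (y, x) box to the image and return integer slice bounds."""
    def clamp(value, upper):
        return int(min(max(value, 0), upper))

    y_min, x_min = clamp(top_left[0], image_height), clamp(top_left[1], image_width)
    y_max, x_max = clamp(bottom_right[0], image_height), clamp(bottom_right[1], image_width)
    return y_min, y_max, x_min, x_max


# Hypothetical fitted bounds that partly leave a 480 x 640 image.
box = clip_box((-12.4, 30.7), (500.2, 610.9), image_height=480, image_width=640)
print(box)
```

The returned values can be used directly as ``pixels[y_min:y_max, x_min:x_max]``. Truncating to ``int`` is one possible choice here; rounding would work as well.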

#### HoG-Features
Now we need to extract the HoG features of our image:

In [None]:
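# imported under an alias so it does not shadow menpo's hog from the training section
from skimage.feature import hog as skimage_hog

# note: newer scikit-image releases spell this keyword ``visualize``
hog_features, hog_image = skimage_hog(img_tmp, orientations=9, pixels_per_cell=(8, 8),
                                      cells_per_block=(2, 2), visualise=True)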

plt.imshow(hog_image, cmap='gray')
plt.show()
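
As a sanity check, the length of the feature vector for the parameters used above can be worked out by hand. The sketch assumes that blocks advance one cell at a time, which is our understanding of the scikit-image implementation; comparing the result with ``hog_features.shape`` is the safest way to confirm it. The helper name is hypothetical.

```python
def hog_feature_length(image_size, orientations=9, pixels_per_cell=8, cells_per_block=2):
    """Length of a HoG vector for a square image when blocks advance one cell at a time."""
    cells_per_side = image_size // pixels_per_cell
    blocks_per_side = cells_per_side - cells_per_block + 1
    return blocks_per_side ** 2 * cells_per_block ** 2 * orientations


# 144 / 8 = 18 cells per side, hence 17 block positions per side
print(hog_feature_length(144))  # 17 * 17 * 2 * 2 * 9 = 10404
```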

#### Classification
Afterwards we feed the extracted features to our classifier to get a probability for each emotion and plot the result:

In [None]:
emotion_probabilities = np.array(emotion_clf.predict_proba([hog_features])[0])

classes = np.arange(len(emotion_probabilities))
# pass the bar positions positionally; the ``left`` keyword was renamed in newer matplotlib versions
plt.bar(classes, emotion_probabilities, align='center', tick_label=class_labels)
plt.show()
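
If a single label is needed instead of a bar chart, the most probable class can be picked from the probabilities. The following is a minimal sketch with a hypothetical helper and made-up probabilities; it relies on the labels being listed in the same order as the classifier's outputs.

```python
def most_likely_label(probabilities, labels):
    """Return the label with the highest probability together with that probability."""
    if len(probabilities) != len(labels):
        raise ValueError("need exactly one label per probability")
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]


# Made-up probabilities in the order neutral, joy, sorrow, surprise.
label, probability = most_likely_label(
    [0.10, 0.65, 0.05, 0.20], ["neutral", "joy", "sorrow", "surprise"])
print("%s (%.0f%%)" % (label, probability * 100))
```

In the notebook this would be called as ``most_likely_label(emotion_probabilities, class_labels)``.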