# Machine Learning Engineer Nanodegree

## Capstone Project - Smile Detector


---
In this project, convolutional neural networks (CNNs) are used to build models that detect whether the person in an image is smiling.
The CelebA dataset is used for this purpose: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
The data for this project was downloaded from Kaggle: https://www.kaggle.com/jessicali9530/celeba-dataset

This notebook improves on the proposed benchmark model by training it with data augmentation.

In [2]:
IMG_H=218
IMG_W=178
IMG_D=3

In the code cell below, we populate a few variables through the use of the load_files function from the scikit-learn library:

- train_files, valid_files, test_files: numpy arrays containing file paths to images
- train_targets, valid_targets, test_targets: numpy arrays containing one-hot-encoded classification labels
- smile_names: list of string-valued smile categories for translating labels
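To make the one-hot encoding of the targets concrete, here is a minimal NumPy sketch of the transformation that to_categorical applies to the integer labels. The helper name one_hot and the example labels are hypothetical, and the sketch assumes, as we understand load_files to behave, that each class folder is assigned an integer index in sorted folder-name order.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical integer targets for four images spread over two class folders.
example_targets = [0, 1, 1, 0]
example_onehot = one_hot(example_targets, 2)
print(example_onehot)

# np.argmax along axis 1 recovers the original integer labels.
recovered = np.argmax(example_onehot, axis=1)
print(recovered)
```

Taking the argmax of each row undoes the encoding, which is how the targets are compared with the predictions when the test accuracy is computed later in the notebook.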

In [3]:
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path, load_content = False)
    smile_files = np.array(data['filenames'])
    smile_targets = np_utils.to_categorical(np.array(data['target']), 2)
    return smile_files, smile_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('input/dataset/train')
valid_files, valid_targets = load_dataset('input/dataset/validate')
test_files, test_targets = load_dataset('input/dataset/test')

smile_names = [item[:-1] for item in sorted(glob("input/dataset/train/*/"))]

# print statistics about the dataset
print('There are %d total smile categories.' % len(smile_names))
print('There are %s total smile images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training smile images.' % len(train_files))
print('There are %d validation smile images.' % len(valid_files))
print('There are %d test smile images.'% len(test_files))

Using TensorFlow backend.


There are 2 total smile categories.
There are 15000 total smile images.

There are 10000 training smile images.
There are 2500 validation smile images.
There are 2500 test smile images.


When using TensorFlow as the backend, Keras CNNs require a 4D array of shape (n_samples, rows, columns, channels) as input. The functions below build such an array from a list of image paths and apply preprocess_input from imagenet_utils to each image.
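As a quick illustration of the 4D layout, the following NumPy-only sketch uses zero-filled stand-in arrays instead of real images. It shows how a single (height, width, channels) image becomes a batch of one, and how several such batches stack into one array.

```python
import numpy as np

IMG_H, IMG_W, IMG_D = 218, 178, 3

# Stand-in for one decoded image: a 3D array (height, width, channels).
single_image = np.zeros((IMG_H, IMG_W, IMG_D), dtype="float32")

# Add a leading batch axis so the array becomes (1, height, width, channels).
single_batch = np.expand_dims(single_image, axis=0)

# Stacking several such arrays along axis 0 gives (n_images, height, width, channels).
batch = np.vstack([single_batch, single_batch, single_batch])

print(single_batch.shape)  # (1, 218, 178, 3)
print(batch.shape)         # (3, 218, 178, 3)
```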

In [16]:
from tqdm import tqdm
from keras.preprocessing import image
from keras.applications import imagenet_utils

def path_to_tensor(img_path):
    # load the RGB image as a PIL.Image.Image, resized to (IMG_H, IMG_W)
    img = image.load_img(img_path, target_size=(IMG_H, IMG_W))
    # convert PIL.Image.Image type to 3D tensor with shape (IMG_H, IMG_W, IMG_D)
    x = image.img_to_array(img)
    # apply the ImageNet preprocessing from imagenet_utils to the 3D tensor
    x = imagenet_utils.preprocess_input(x)
    # convert 3D tensor to 4D tensor with shape (1, IMG_H, IMG_W, IMG_D) and return 4D tensor
    return np.expand_dims(x, axis=0)
    
def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)
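For readers wondering what preprocess_input does to the pixel values, the sketch below approximates it in plain NumPy. It assumes the default "caffe" mode, which to our understanding reorders the channels from RGB to BGR and subtracts the per-channel ImageNet means without rescaling. The function name caffe_style_preprocess and the all-red test image are hypothetical. This is an approximation for illustration, not the library's implementation.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as assumed for the "caffe" mode.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype="float32")

def caffe_style_preprocess(rgb_image):
    """Approximate sketch: reorder RGB to BGR, then subtract the channel means."""
    bgr = rgb_image[..., ::-1].astype("float32")
    return bgr - IMAGENET_BGR_MEANS

# Hypothetical 2x2 image in which every pixel is pure red (R=255, G=0, B=0).
red = np.zeros((2, 2, 3), dtype="float32")
red[..., 0] = 255.0

processed = caffe_style_preprocess(red)
print(processed[0, 0])  # blue, green, red after mean subtraction
```

Under this assumption the outputs are centred around zero rather than scaled into [0, 1], so whichever preprocessing is chosen has to be applied identically to the training, validation, and test images.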

In [28]:
from PIL import ImageFile                            
ImageFile.LOAD_TRUNCATED_IMAGES = True                 

# pre-process the data for Keras
#train_tensors = paths_to_tensor(train_files).astype('float32')/255
#valid_tensors = paths_to_tensor(valid_files).astype('float32')/255
test_tensors = paths_to_tensor(test_files)



### Benchmark Model Architecture

Below is our benchmark model: a simple CNN built from a few Conv2D, MaxPooling2D, and Dropout layers, followed by global average pooling and a two-unit softmax output.

In [6]:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

model = Sequential()

model.add(Conv2D(filters=16,kernel_size=2,padding='same',activation='relu',input_shape=(IMG_H,IMG_W,IMG_D)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=32,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=64,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.25))
model.add(Conv2D(filters=128,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.25))
model.add(GlobalAveragePooling2D())
model.add(Dense(2, activation='softmax'))

model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 218, 178, 16)      208       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 109, 89, 16)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 109, 89, 32)       2080      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 54, 44, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 54, 44, 64)        8256      
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 27, 22, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 27, 22, 64)       
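The parameter counts and output shapes in the summary can be checked by hand. The sketch below is pure Python with hypothetical helper names. It assumes a Conv2D layer holds kernel_size × kernel_size × input_channels × filters weights plus one bias per filter, and that 2×2 max pooling with the default padding floors each spatial dimension after halving. The values for the fourth convolution and pooling stage are computed from the layer definitions, since the printed summary above is cut off before those rows.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (kernel_size^2 * in_channels * filters) plus one bias per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters

def pooled(size, pool_size=2):
    """Spatial size after non-overlapping max pooling with floor division."""
    return size // pool_size

# Channel counts: RGB input followed by the four Conv2D layers.
channels = [3, 16, 32, 64, 128]
param_counts = [conv2d_params(2, c_in, c_out)
                for c_in, c_out in zip(channels, channels[1:])]
print(param_counts)  # [208, 2080, 8256, 32896]

# Track (height, width) after each of the four pooling layers.
height, width = 218, 178
shapes = []
for _ in range(4):
    height, width = pooled(height), pooled(width)
    shapes.append((height, width))
print(shapes)  # [(109, 89), (54, 44), (27, 22), (13, 11)]
```

The first three parameter counts and shapes agree with the rows printed by model.summary().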

### Compile the Model

In [10]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
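To show what the categorical_crossentropy loss measures, here is a small NumPy sketch with made-up predictions. The function and the example values are illustrative assumptions, not Keras internals. For one-hot targets, the per-sample loss is the negative log of the probability assigned to the true class.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Hypothetical targets and softmax outputs for two images.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.5, 0.5]])

losses = categorical_crossentropy(y_true, y_pred)
mean_loss = losses.mean()
print(losses)     # approximately [0.105, 0.693]
print(mean_loss)
```

A two-class model that outputs 0.5 for both classes has a loss of ln 2 ≈ 0.693. This is a useful reference point: a validation loss near that value means the model is doing no better than chance.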

## Data Augmentation

Ref: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
We will now augment the training data with a number of random transformations so that the model never sees the exact same picture twice. This helps prevent overfitting and makes the model more robust, so it generalizes better. Generators of augmented image batches (and their labels) are created with either .flow(data, labels) or .flow_from_directory(directory). In this notebook, .flow_from_directory(directory) is used.
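The idea behind random augmentation can be sketched in a few lines of NumPy. This is a deliberately simplified stand-in for ImageDataGenerator, which, as far as we know, applies interpolated affine transforms. Here we only mirror the image at random and shift it by a whole number of columns, repeating the nearest edge column to fill the gap. The function name, the toy image, and the seed are all hypothetical.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift_fraction=0.2):
    """Randomly mirror an (H, W, C) image and shift it horizontally."""
    height, width, _ = image.shape
    # Mirror left-to-right with probability 0.5.
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    # Draw an integer shift of at most max_shift_fraction of the width.
    max_shift = int(max_shift_fraction * width)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    # Read each output column from a clipped source column, so columns that
    # would fall outside the image repeat the nearest edge column.
    source_columns = np.clip(np.arange(width) - shift, 0, width - 1)
    return image[:, source_columns, :]

rng = np.random.default_rng(0)
image = np.arange(4 * 10 * 3, dtype="float32").reshape(4, 10, 3)

# Each call can produce a different variant of the same source image.
augmented = [random_flip_and_shift(image, rng) for _ in range(5)]
print([a.shape for a in augmented])
```

Every variant keeps the original shape and label, so the network is shown many plausible versions of each face while the class it has to predict stays the same.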

In [19]:
from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    preprocessing_function = imagenet_utils.preprocess_input,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    zoom_range = 0.2,
    horizontal_flip = True)

train_generator = train_datagen.flow_from_directory(
        'input/dataset/train',
        target_size=(IMG_H,IMG_W),
        batch_size=32,
        class_mode='categorical')

val_datagen = ImageDataGenerator(preprocessing_function = imagenet_utils.preprocess_input)

val_generator = val_datagen.flow_from_directory(
        'input/dataset/validate',
        target_size=(IMG_H,IMG_W),
        batch_size=32,
        class_mode='categorical')

Found 10000 images belonging to 2 classes.
Found 2500 images belonging to 2 classes.
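With the image counts reported above and batch_size=32, the number of batches per pass over the data follows from simple arithmetic. The sketch below uses variable names of our own choosing and compares floor division, which counts only full batches, with rounding up, which also counts the final partial batch.

```python
import math

batch_size = 32
train_images, val_images = 10000, 2500

# Floor division counts only the full batches.
full_train_batches = train_images // batch_size
# Rounding up also counts the final, smaller batch.
all_train_batches = math.ceil(train_images / batch_size)
# Images that do not fit into the full batches.
leftover = train_images - full_train_batches * batch_size

val_batches = math.ceil(val_images / batch_size)

print(full_train_batches, all_train_batches, leftover)  # 312 313 16
print(val_batches)                                      # 79
```

With floor division, each training epoch draws 312 batches, which is 16 images short of one full pass over the 10000 training images.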


In [20]:
from keras.callbacks import ModelCheckpoint, CSVLogger
model_weights_filepath='input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5'
checkpointer = ModelCheckpoint(model_weights_filepath,verbose=1,save_best_only=True)
csvLogger = CSVLogger('logs/rmsprop_benchmark_aug',append = True)
model.fit_generator(
        train_generator,
        steps_per_epoch=10000//32, # number of training samples // batch size
        epochs=20,
        validation_data=val_generator,
        callbacks=[checkpointer])

Epoch 1/20

Epoch 00001: val_loss improved from inf to 0.69769, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 2/20

Epoch 00002: val_loss improved from 0.69769 to 0.47019, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 3/20

Epoch 00003: val_loss improved from 0.47019 to 0.43744, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 4/20

Epoch 00004: val_loss did not improve from 0.43744
Epoch 5/20

Epoch 00005: val_loss did not improve from 0.43744
Epoch 6/20

Epoch 00006: val_loss improved from 0.43744 to 0.40361, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 7/20

Epoch 00007: val_loss improved from 0.40361 to 0.22242, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 8/20

Epoch 00008: val_loss improved from 0.22242 to 0.08769, saving model to input/saved_models/rmsprop_benchmark_aug.weights.best.hdf5
Epoch 9/20

Epoch 00

<keras.callbacks.callbacks.History at 0x286c3adda08>
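The "improved" and "did not improve" messages in the log follow from the save_best_only=True rule: weights are written only when the monitored value, here val_loss, drops below the best value seen so far. The sketch below reproduces that rule in plain Python. The losses for epochs 1 to 3 and 6 to 8 are taken from the log above. The values for epochs 4 and 5 are hypothetical placeholders, because the log reports only that they did not improve.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a save-best-only checkpoint would write."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved

# Epochs 4 and 5 (0.45, 0.44) are placeholders; the log does not show their values.
val_losses = [0.69769, 0.47019, 0.43744, 0.45, 0.44, 0.40361, 0.22242, 0.08769]

saved_epochs = epochs_that_save(val_losses)
print(saved_epochs)  # [1, 2, 3, 6, 7, 8]
```

Because the file is overwritten on every improvement, the weights on disk at the end of training belong to the epoch with the lowest validation loss.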

In [21]:
# Load the weights with the best validation loss (ModelCheckpoint monitors val_loss by default)
model.load_weights(model_weights_filepath)

In [30]:
# evaluate and print the test accuracy
# get index of predicted smile detection for each image in test set
smile_prediction = [np.argmax(model.predict(np.expand_dims(test_data, axis=0))) for test_data in test_tensors]

# report test accuracy
test_accuracy = 100*np.sum(np.array(smile_prediction)==np.argmax(test_targets, axis=1))/len(smile_prediction)
print('Test accuracy: %.4f%%' % test_accuracy)

Test accuracy: 82.1200%
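The accuracy computation above reduces to comparing two argmax vectors. The toy sketch below uses invented probabilities and targets, not model outputs, and a helper name of our own. It shows the same calculation on four samples, then converts the reported 82.12% on 2500 test images back into a count of correct predictions.

```python
import numpy as np

def accuracy_percent(probabilities, onehot_targets):
    """Percentage of rows whose most probable class matches the one-hot target."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(onehot_targets, axis=1)
    return 100.0 * np.mean(predicted == actual)

# Hypothetical softmax outputs and one-hot targets for four images.
probs = np.array([[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]])
targets = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

toy_accuracy = accuracy_percent(probs, targets)
print(toy_accuracy)  # 75.0

# 82.12% of the 2500 test images corresponds to this many correct predictions.
correct_on_test_set = round(0.8212 * 2500)
print(correct_on_test_set)  # 2053
```

The same helper could be applied to the output of a single model.predict call on the whole test array. That should give the same result as predicting one image at a time, although we have not verified this against the notebook's model.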



Here, the benchmark model trained with data augmentation reaches a test accuracy of 82.12%. This is slightly higher than the accuracy of the benchmark model trained without augmentation.