# Machine Learning Engineer Nanodegree

## Capstone Project - Smile Detector

In this project, convolutional neural networks (CNNs) are used to build models that detect whether the person in an image is smiling.
The CelebA dataset is used for this purpose: http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
The data for this project was downloaded from Kaggle: https://www.kaggle.com/jessicali9530/celeba-dataset

In this notebook, a simple CNN model is developed from scratch. It serves as the benchmark model against which the improved models are compared.

In [1]:
# all imports
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob
from keras.preprocessing import image                  
from tqdm import tqdm

Using TensorFlow backend.


In [5]:
IMG_H=218
IMG_W=178
IMG_D=3

In [6]:
# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    smile_files = np.array(data['filenames'])
    smile_targets = np_utils.to_categorical(np.array(data['target']), 2)
    return smile_files, smile_targets

# load train, test, and validation datasets
train_files, train_targets = load_dataset('input/dataset/train')
valid_files, valid_targets = load_dataset('input/dataset/validate')
test_files, test_targets = load_dataset('input/dataset/test')

smile_names = [item[:-1] for item in sorted(glob("input/dataset/train/*/"))]

# print statistics about the dataset
print('There are %d total smile categories.' % len(smile_names))
print('There are %s total smile images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training smile images.' % len(train_files))
print('There are %d validation smile images.' % len(valid_files))
print('There are %d test smile images.'% len(test_files))

There are 2 total smile categories.
There are 15000 total smile images.

There are 10000 training smile images.
There are 2500 validation smile images.
There are 2500 test smile images.
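
The one-hot encoding that `np_utils.to_categorical` applies to the integer labels in `load_dataset` can be illustrated with plain NumPy. This is a minimal sketch, not the Keras implementation; the helper name `one_hot` and the sample labels are made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: map integer class labels to one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]

# Made-up labels; which index means "smiling" depends on the folder names.
sample_targets = one_hot([0, 1, 1, 0], 2)
print(sample_targets)
```

Each row has a single 1 in the column of its class, which is the target format that a two-unit softmax output layer is trained against.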


In [8]:
def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(IMG_H, IMG_W))
    # convert PIL.Image.Image type to 3D tensor with shape (IMG_H, IMG_W, IMG_D)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, IMG_H, IMG_W, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)
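
The shape bookkeeping in `path_to_tensor` and `paths_to_tensor` can be checked without reading any files. The sketch below uses arrays of zeros as stand-ins for decoded images (an assumption for illustration only) and applies the same `np.expand_dims` and `np.vstack` calls.

```python
import numpy as np

IMG_H, IMG_W, IMG_D = 218, 178, 3

# Stand-ins for three decoded images (zeros instead of real pixel data).
fake_images = [np.zeros((IMG_H, IMG_W, IMG_D), dtype='float32') for _ in range(3)]

# expand_dims adds a leading batch axis: (218, 178, 3) -> (1, 218, 178, 3).
single = [np.expand_dims(img, axis=0) for img in fake_images]

# vstack concatenates along that first axis: three (1, ...) tensors -> (3, ...).
batch = np.vstack(single)
print(single[0].shape, batch.shape)
```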

---

## Creating a CNN to detect smile (from Scratch)

### Pre-process the Data

We rescale the images by dividing every pixel in every image by 255.

In [10]:
from PIL import ImageFile                            
ImageFile.LOAD_TRUNCATED_IMAGES = True                 

# pre-process the data for Keras
train_tensors = paths_to_tensor(train_files).astype('float32')/255
valid_tensors = paths_to_tensor(valid_files).astype('float32')/255
test_tensors = paths_to_tensor(test_files).astype('float32')/255

100%|██████████| 10000/10000 [00:45<00:00, 218.41it/s]
100%|██████████| 2500/2500 [00:51<00:00, 48.43it/s] 
100%|██████████| 2500/2500 [00:34<00:00, 72.18it/s] 
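
As a small illustration of the rescaling step, the sketch below applies the same cast-then-divide expression to a few made-up 8-bit pixel values. The values are chosen for illustration and are not taken from the dataset.

```python
import numpy as np

# Made-up 8-bit pixel values covering the full range.
pixels = np.array([0, 64, 128, 255], dtype='uint8')

# Cast first, then divide, so the result is float32 rather than float64.
scaled = pixels.astype('float32') / 255
print(scaled.dtype, scaled.min(), scaled.max())
```

After rescaling, every value lies in the range [0, 1].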


### Benchmark Model Architecture
A simple CNN is built from alternating Conv2D and MaxPooling2D layers. Dropout layers are added after the third and fourth pooling stages to reduce overfitting. All Conv2D layers use ReLU activation. A GlobalAveragePooling2D layer feeds the final Dense classification layer, which uses softmax activation.


In [11]:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

model = Sequential()

model.add(Conv2D(filters=16,kernel_size=2,padding='same',activation='relu',input_shape=(IMG_H,IMG_W,IMG_D)))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=32,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Conv2D(filters=64,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.25))
model.add(Conv2D(filters=128,kernel_size=2,padding='same',activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.25))
model.add(GlobalAveragePooling2D())
model.add(Dense(2, activation='softmax'))

model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 218, 178, 16)      208       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 109, 89, 16)       0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 109, 89, 32)       2080      
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 54, 44, 32)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 54, 44, 64)        8256      
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 27, 22, 64)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 27, 22, 64)       
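
The parameter counts and output shapes listed in the summary can be checked by hand. The sketch below assumes square kernels, `padding='same'` convolutions (which keep the spatial size) and 2x2 max pooling that rounds down. The helper names `conv2d_params` and `pooled` are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def pooled(size, pool_size=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool_size

print(conv2d_params(2, 3, 16))    # first Conv2D: 3 input channels, 16 filters
print(conv2d_params(2, 16, 32))   # second Conv2D
print(conv2d_params(2, 32, 64))   # third Conv2D

# Track height and width through the first three pooling layers.
h, w = 218, 178
shapes = []
for _ in range(3):
    h, w = pooled(h), pooled(w)
    shapes.append((h, w))
print(shapes)
```

The three counts (208, 2080, 8256) and the spatial sizes (109x89, 54x44, 27x22) agree with the rows shown in the summary.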

### Compile the Model

In [12]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
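
To make the `categorical_crossentropy` loss concrete, here is a minimal per-sample sketch in plain Python. It is not the Keras implementation (which also clips probabilities and averages over the batch), and the example probabilities are made up.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Loss for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# A two-class model that outputs 50/50 for an image has loss ln(2).
chance_loss = categorical_crossentropy([0.0, 1.0], [0.5, 0.5])
# A confident, correct prediction gives a much smaller loss.
confident_loss = categorical_crossentropy([0.0, 1.0], [0.1, 0.9])
print(round(chance_loss, 4), round(confident_loss, 4))
```

A loss near ln 2 (about 0.693) therefore corresponds to a two-class model whose outputs carry almost no information, which is a useful reference point when reading the validation losses of the first epochs.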

### Train the Model

The model is trained in the code cell below. Model checkpointing saves the weights that attain the lowest validation loss.

In [13]:
from keras.callbacks import ModelCheckpoint  

EPOCHS = 20

checkpointer = ModelCheckpoint(filepath='input/saved_models/weights.best.from_scratch.hdf5', 
                               verbose=1, save_best_only=True)

model.fit(train_tensors, train_targets, 
          validation_data=(valid_tensors, valid_targets),
          epochs=EPOCHS, batch_size=20, callbacks=[checkpointer], verbose=1)

Train on 10000 samples, validate on 2500 samples
Epoch 1/20
Epoch 00001: val_loss improved from inf to 0.68873, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 2/20
Epoch 00002: val_loss improved from 0.68873 to 0.67900, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 3/20
Epoch 00003: val_loss improved from 0.67900 to 0.67473, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 4/20
Epoch 00004: val_loss improved from 0.67473 to 0.66247, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 5/20
Epoch 00005: val_loss improved from 0.66247 to 0.66089, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 6/20
Epoch 00006: val_loss improved from 0.66089 to 0.65191, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 7/20
Epoch 00007: val_loss improved from 0.65191 to 0.64142, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 8/20
Epoch 00008: val_loss did not improve from 0.64142
Epoch 9/20
Epoch 00009: val_loss did not improve from 0.64142
Epoch 10/20
Epoch 00010: val_loss improved from 0.64142 to 0.63319, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 11/20
Epoch 00011: val_loss improved from 0.63319 to 0.61822, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 12/20
Epoch 00012: val_loss improved from 0.61822 to 0.58823, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 13/20
Epoch 00013: val_loss did not improve from 0.58823
Epoch 14/20
Epoch 00014: val_loss did not improve from 0.58823
Epoch 15/20
Epoch 00015: val_loss improved from 0.58823 to 0.55004, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 16/20
Epoch 00016: val_loss did not improve from 0.55004
Epoch 17/20
Epoch 00017: val_loss improved from 0.55004 to 0.46805, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 18/20
Epoch 00018: val_loss improved from 0.46805 to 0.46000, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 19/20
Epoch 00019: val_loss improved from 0.46000 to 0.42449, saving model to input/saved_models/weights.best.from_scratch.hdf5
Epoch 20/20
Epoch 00020: val_loss improved from 0.42449 to 0.42438, saving model to input/saved_models/weights.best.from_scratch.hdf5

<keras.callbacks.callbacks.History at 0x21d27e09be0>

### Load the Model with the Best Validation Loss

In [14]:
model.load_weights('input/saved_models/weights.best.from_scratch.hdf5')
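
The weights loaded here are the ones kept by `ModelCheckpoint` with `save_best_only=True`, which by default monitors `val_loss` and overwrites the file only when that value improves. The selection rule can be sketched in a few lines of plain Python. The function name `epochs_saved` and the loss values are hypothetical and are not taken from the training run.

```python
def epochs_saved(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would save."""
    best = float('inf')
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:          # strictly lower than the best seen so far
            best = loss
            saved.append(epoch)
    return saved, best

# Hypothetical validation losses, not taken from the run above.
saved, best = epochs_saved([0.69, 0.68, 0.70, 0.66, 0.67])
print(saved, best)
```

Because the file is overwritten on each improvement, what remains on disk after training is the checkpoint from the last epoch that improved the validation loss.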

### Test the Model

The model is now evaluated on the test set images.

In [15]:
# get index of the predicted class for each image in the test set (one batched call)
smile_predictions = np.argmax(model.predict(test_tensors), axis=1)

# report test accuracy
test_accuracy = 100*np.sum(np.array(smile_predictions)==np.argmax(test_targets, axis=1))/len(smile_predictions)
print('Test accuracy: %.4f%%' % test_accuracy)

Test accuracy: 81.0400%
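
The accuracy computation compares the index of the largest softmax output with the index encoded by each one-hot label. The toy example below uses made-up probabilities and labels (assumptions for illustration, not model output) to show the same steps.

```python
import numpy as np

# Made-up softmax outputs for four images and their one-hot labels.
probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]])
targets = np.array([[1, 0], [0, 1], [0, 1], [0, 1]])

predicted = np.argmax(probs, axis=1)   # most probable class per image
actual = np.argmax(targets, axis=1)    # class index encoded by each one-hot row
accuracy = 100 * np.mean(predicted == actual)
print('Toy accuracy: %.1f%%' % accuracy)
```

With 2,500 test images, the reported 81.04% corresponds to 2,026 correct predictions.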



This simple CNN reaches a test accuracy of 81.04%. It serves as the benchmark; the other notebooks try to improve on this accuracy using augmentation, transfer learning, bottleneck features, pretuning, etc.