# CHAPTER 2.1

### Creating a binary classifier to detect smiles

In this section, we'll implement a binary classifier that tells us whether a person in a photo is smiling. We'll use the SMILEs dataset, available at https://github.com/hromi/SMILEsmileD.

In [1]:
import glob
import os
import pathlib

import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow.keras import Model
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense,
                                     Dropout, ELU, Flatten, Input,
                                     MaxPooling2D)
from tensorflow.keras.preprocessing.image import img_to_array, load_img

First, we define a function that loads the images and their labels from a list of file paths.

In [2]:
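def load_images_and_labels(image_paths):
    """Load each image as a 32x32 grayscale array and derive its label from its folder name."""
    images = []
    labels = []

    for image_path in image_paths:
        # Read the image in grayscale and resize it to 32x32 pixels.
        image = load_img(image_path, target_size=(32, 32),
                         color_mode='grayscale')
        image = img_to_array(image)

        # The parent folder name (e.g. 'positives7') encodes the class:
        # 1.0 for smiling, 0.0 for not smiling.
        label = image_path.split(os.path.sep)[-2]
        label = 'positive' in label
        label = float(label)

        images.append(image)
        labels.append(label)

    return np.array(images), np.array(labels)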

Before moving on, let's walk through these steps on a single image path to see what each call does.

In [7]:
img_pth=r'C:\Users\Zeki\.keras\datasets\SMILEsmileD-master\SMILEs\positives\positives7\260.jpg'
img = load_img(img_pth)

In [8]:
print(f'Image type: {type(img)}')
print(f'Image format: {img.format}')
print(f'Image mode: {img.mode}')
print(f'Image size: {img.size}')

Image type: <class 'PIL.Image.Image'>
Image format: None
Image mode: RGB
Image size: (64, 64)


In [10]:
img = load_img(img_pth,color_mode='grayscale')
print(f'Image type: {type(img)}')
print(f'Image format: {img.format}')
print(f'Image mode: {img.mode}')
print(f'Image size: {img.size}')

Image type: <class 'PIL.JpegImagePlugin.JpegImageFile'>
Image format: JPEG
Image mode: L
Image size: (64, 64)


In [11]:
img = load_img(img_pth, target_size=(32, 32),color_mode='grayscale')
print(f'Image type: {type(img)}')
print(f'Image format: {img.format}')
print(f'Image mode: {img.mode}')
print(f'Image size: {img.size}')

Image type: <class 'PIL.Image.Image'>
Image format: None
Image mode: L
Image size: (32, 32)


In [12]:
img_np = img_to_array(img)

In [14]:
print(f'Image type: {type(img_np)}')
print(f'Image size: {img_np.size}')

Image type: <class 'numpy.ndarray'>
Image size: 1024
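The value 1024 printed above is the array's element count, not its dimensions. Here is a minimal sketch of the difference between `.size` and `.shape`, using a zero-filled stand-in array (an assumption, since the real image is not available here) with the shape that `img_to_array` produces for a 32x32 grayscale image:

```python
import numpy as np

# Stand-in for img_to_array(img): height x width x channels, float32.
stand_in = np.zeros((32, 32, 1), dtype=np.float32)

print(stand_in.shape)  # dimensions of the array
print(stand_in.size)   # total number of elements: 32 * 32 * 1
```

When checking that an image has the layout the network expects, `.shape` is usually the more informative of the two.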


In [15]:
label = img_pth.split(os.path.sep)[-2]

In [16]:
label

'positives7'

In [17]:
label = 'positive' in label
label = float(label)

In [18]:
label

1.0

Summary: we load each image in grayscale, resize it to 32x32, and convert it to a NumPy array. We then label it 1.0 (smiling) if the name of its parent folder contains 'positive', and 0.0 otherwise.
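The labeling step above splits the path on `os.path.sep`, so it depends on the separator of the machine running the notebook. As a possible alternative, the sketch below reads the parent folder name with `pathlib`. The helper `label_from_path` and both example paths are hypothetical and only serve the illustration:

```python
from pathlib import PurePosixPath, PureWindowsPath


def label_from_path(image_path):
    """Return 1.0 if the parent folder name contains 'positive', else 0.0."""
    return float('positive' in image_path.parent.name)


# Hypothetical, shortened paths in both path flavors.
windows_path = PureWindowsPath(r'C:\data\SMILEs\positives\positives7\260.jpg')
posix_path = PurePosixPath('/data/SMILEs/negatives/negatives7/10.jpg')

print(label_from_path(windows_path))
print(label_from_path(posix_path))
```

In a real run, `pathlib.Path(image_path)` would pick the right flavor for the current operating system.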

Next, we define a function that builds the neural network. The architecture is based on LeNet.

In [19]:
def build_network():
    input_layer = Input(shape=(32, 32, 1))
    x = Conv2D(filters=20,
               kernel_size=(5, 5),
               padding='same',
               strides=(1, 1))(input_layer)
    x = ELU()(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2),
                     strides=(2, 2))(x)
    x = Dropout(0.4)(x)

    x = Conv2D(filters=50,
               kernel_size=(5, 5),
               padding='same',
               strides=(1, 1))(x)
    x = ELU()(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2),
                     strides=(2, 2))(x)
    x = Dropout(0.4)(x)

    x = Flatten()(x)
    x = Dense(units=500)(x)
    x = ELU()(x)
    x = Dropout(0.4)(x)

    output = Dense(1, activation='sigmoid')(x)

    model = Model(inputs=input_layer, outputs=output)
    return model
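To see how the tensor shapes evolve through `build_network`, we can redo the arithmetic by hand. The sketch below is plain Python, not Keras. It assumes default layer settings (biases enabled, four parameters per channel in `BatchNormalization`), and the helper names are made up for this illustration:

```python
def conv_params(kernel, in_channels, out_channels):
    # One kernel x kernel x in_channels filter per output channel, plus a bias each.
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    # Full weight matrix plus one bias per output unit.
    return in_units * out_units + out_units


side, channels = 32, 1  # input is 32x32 with one grayscale channel
param_count = 0

for filters in (20, 50):
    param_count += conv_params(5, channels, filters)  # 'same' padding keeps the side length
    param_count += 4 * filters  # gamma, beta, moving mean, moving variance
    channels = filters
    side //= 2  # 2x2 max pooling with stride 2 halves each side

flat_units = side * side * channels  # size of the Flatten output
param_count += dense_params(flat_units, 500)
param_count += dense_params(500, 1)

print(f'Feature map before Flatten: {side}x{side}x{channels}')
print(f'Flattened units: {flat_units}')
print(f'Parameters (trainable and non-trainable): {param_count}')
```

Almost all of the parameters sit in the first `Dense` layer, which is typical for LeNet-style networks. If the assumptions hold, the total should agree with what `model.summary()` reports.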

In [20]:
pathlib.Path.home()

WindowsPath('C:/Users/Zeki')

Next, we load the image paths into a list.

In [21]:
files_pattern = (pathlib.Path.home() / '.keras' / 'datasets' /
                 'SMILEsmileD-master' / 'SMILEs' / '*' / '*' /
                 '*.jpg')

In [22]:
files_pattern

WindowsPath('C:/Users/Zeki/.keras/datasets/SMILEsmileD-master/SMILEs/*/*/*.jpg')

In [23]:
files_pattern = str(files_pattern)
dataset_paths = [*glob.glob(files_pattern)]

In [25]:
files_pattern

'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\*\\*\\*.jpg'

In [24]:
dataset_paths

['C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10000.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10001.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10002.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10003.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10004.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10005.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10006.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\10008.jpg',
 'C:\\Users\\Zeki\\.keras\\datasets\\SMILEsmileD-master\\SMILEs\\negatives\\negatives7\\1001.jpg',
 'C:
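`glob.glob` returns paths in an order that depends on the file system, so sorting them can make runs easier to reproduce. The sketch below shows the same three-level pattern with `Path.glob`. It runs against a throwaway directory tree, and the file names in it are made up for the illustration:

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / 'SMILEs'

    # Build a tiny stand-in for the dataset layout: class / subfolder / file.
    for name in ('positives/positives7/260.jpg',
                 'negatives/negatives7/10.jpg',
                 'negatives/negatives7/notes.txt'):
        file_path = root / name
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.touch()

    # Match only .jpg files exactly two folders below the root, in sorted order.
    found = sorted(str(p) for p in root.glob('*/*/*.jpg'))

print(len(found))
```

The `.txt` file is skipped because the pattern ends in `*.jpg`, and the negatives come first because the list is sorted.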

Now we are ready to load the dataset, split it into training, validation, and test sets, build our model, and fit it.

In [26]:
X, y = load_images_and_labels(dataset_paths)

In [28]:
type(X)

numpy.ndarray

In [29]:
X.shape

(13165, 32, 32, 1)

In [30]:
y.shape

(13165,)

Next, we normalize X to the range [0, 1] and count the positive and negative labels:

In [33]:
X /= 255.0
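The in-place division works here because `img_to_array` returns floating-point arrays. With raw 8-bit integer pixels, NumPy would refuse to write floats back into the integer array. A small sketch with made-up pixel values shows both cases:

```python
import numpy as np

# Float pixels, like the output of img_to_array: in-place scaling is fine.
pixels_float = np.array([0.0, 127.5, 255.0], dtype=np.float32)
pixels_float /= 255.0

# Integer pixels: in-place true division cannot store floats in a uint8 array.
pixels_uint8 = np.array([0, 128, 255], dtype=np.uint8)
try:
    pixels_uint8 /= 255.0
    in_place_ok = True
except TypeError:
    in_place_ok = False

# Converting first avoids the problem.
scaled = pixels_uint8.astype(np.float32) / 255.0

print(pixels_float, in_place_ok, scaled.dtype)
```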

In [34]:
total = len(y)
print(total)

13165


In [35]:
total_positive = np.sum(y)
print(total_positive)
total_negative = total - total_positive
print(total_negative)

3690.0
9475.0


In [36]:
print(f'Total images: {total}')
print(f'Smile images: {total_positive}')
print(f'Non-smile images: {total_negative}')

Total images: 13165
Smile images: 3690.0
Non-smile images: 9475.0


Split the dataset into training, validation, and test sets:

In [37]:
(X_train, X_test, y_train, y_test) = train_test_split(X, y,
                                     test_size=0.2,
                                     stratify=y,
                                     random_state=999)
(X_train, X_val, y_train, y_val) = train_test_split(X_train, y_train,
                                    test_size=0.2,
                                    stratify=y_train,
                                    random_state=999)
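Because the second call splits what is left after the first, the two `test_size=0.2` arguments do not produce equal-sized validation and test sets. The arithmetic below works out the resulting proportions. The counts are approximate, since `train_test_split` rounds to whole samples:

```python
from fractions import Fraction

test_fraction = Fraction(1, 5)                       # first split: 20% held out for testing
val_fraction = (1 - test_fraction) * Fraction(1, 5)  # second split: 20% of the remaining 80%
train_fraction = 1 - test_fraction - val_fraction

n_images = 13165  # dataset size reported earlier
for name, fraction in (('train', train_fraction),
                       ('validation', val_fraction),
                       ('test', test_fraction)):
    print(f'{name}: {float(fraction):.0%} (about {float(fraction * n_images):.0f} images)')
```

The result is a 64/16/20 split between training, validation, and test data.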

In [38]:
model = build_network()
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])

Note: because the dataset is imbalanced, we assign each class a weight inversely proportional to its number of images, so that the rarer smiling class counts more in the loss.

In [39]:
BATCH_SIZE = 32
EPOCHS = 20
model.fit(X_train, y_train,
          validation_data=(X_val, y_val),
          epochs=EPOCHS,
          batch_size=BATCH_SIZE,
          # Keys are integer class indices; the rarer class gets the larger weight.
          class_weight={
              1: total / total_positive,
              0: total / total_negative
          })

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1ae491eb490>
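To make the class weights passed to `fit` concrete, the sketch below recomputes them from the counts printed earlier. The variable names are chosen for this illustration only:

```python
n_total, n_positive, n_negative = 13165, 3690, 9475

weight_positive = n_total / n_positive
weight_negative = n_total / n_negative

print(f'Weight for smiling images: {weight_positive:.3f}')
print(f'Weight for non-smiling images: {weight_negative:.3f}')
print(f'Ratio: {weight_positive / weight_negative:.3f}')

# Count times weight is the same for both classes, so each class
# contributes the same total weight to the loss.
print(n_positive * weight_positive, n_negative * weight_negative)
```

Each smiling image therefore counts roughly 2.6 times as much as a non-smiling one. Only the ratio between the two weights affects the balance, while their absolute scale changes the magnitude of the loss.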

In [40]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f'Loss: {test_loss}, accuracy: {test_accuracy}')

Loss: 0.27361956238746643, accuracy: 0.9137865304946899
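With an imbalanced dataset, it helps to compare the accuracy against a trivial baseline. The sketch below computes the accuracy of a hypothetical classifier that always predicts "not smiling". It uses the overall class counts from earlier and assumes that the stratified test set has about the same class ratio:

```python
n_total, n_negative = 13165, 9475

# Accuracy of always predicting the majority class (not smiling).
majority_baseline = n_negative / n_total

reported_accuracy = 0.9137865304946899  # test accuracy printed above

print(f'Majority-class baseline: {majority_baseline:.4f}')
print(f'Improvement over baseline: {reported_accuracy - majority_baseline:.4f}')
```

The reported test accuracy is well above this baseline, which suggests the model has learned more than the class prior.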
