# Real-time face mask recognition

Because of COVID-19, we need to wear a mask to protect ourselves and others. In this project, I will try to write a Python program that recognizes, in real time, whether or not someone is wearing a mask.

## Plan

### EDA :
 **Analysis of the shape :**

* Identification of the target : 

* Number of rows and columns  : 

* Variables type : 

* Identification of missing values : 
 

 **Substantive analysis :**

* Target visualization : 

* Understanding the different variables :

* Visualization of relations : features/Target :

* Identification of outliers :

### Preprocessing :

* Creation of the Train Set / Test Set
* Removal of NaN : dropna(), imputation, "empty" column
* Encoding
* Removal of outliers harmful to the model 
* Feature selection
* Feature engineering
* Feature scaling

### Modeling :

* Define an evaluation function
* Training of different models. Here I chose 3 models :
    1. The VGG architecture
    2. The ResNet architecture
    3. The MobileNet architecture
* Optimization
* Error analysis and return to Preprocessing / EDA
* Learning curve and Decision Making

### Objective :
The objective is to get high accuracy. I will compare these 3 architectures (VGG, ResNet, MobileNet) on two criteria: accuracy and training time. The final model will be the one that combines the best accuracy with the shortest training time.


## Import the libraries

In [62]:
import datetime
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
from tensorflow.keras.applications import MobileNetV2, VGG16, ResNet50V2
from tensorflow.keras.layers import AveragePooling2D, MaxPool2D, Dropout, Flatten, Dense, Input, Activation
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam, RMSprop
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.preprocessing.image import load_img 
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
from imutils import paths

Here, I will initialize a very important hyperparameter :
* the learning rate : the step size used by gradient descent


In [17]:
lr = 1e-4

In [18]:
DIRECTORY = "data/"
CATEGORIES = ["with_mask", "without_mask"]

I will now initialize two lists :
* The first one will contain the data (the images)
* The second one will contain the labels (with mask or not)

In [19]:
data = []
labels = []

In [20]:
for cat in CATEGORIES:
    path = os.path.join(DIRECTORY, cat)
    for img in os.listdir(path):
        img_path = os.path.join(path, img)
        image = load_img(img_path, target_size=(224, 224))
        image = img_to_array(image)
        image = preprocess_input(image)
        data.append(image)
        labels.append(cat)
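
The `preprocess_input` function used in the loop comes from `mobilenet_v2`, which rescales pixel values from [0, 255] to [-1, 1]. The sketch below reproduces that scaling with NumPy only, as an illustration; `scale_to_unit_range` is a hypothetical helper name, not part of Keras.

```python
import numpy as np

def scale_to_unit_range(image):
    """Map pixel values from [0, 255] to [-1, 1] (hypothetical helper)."""
    return image.astype("float32") / 127.5 - 1.0

# Three sample pixel intensities: black, mid-grey, white
pixels = np.array([0.0, 127.5, 255.0], dtype="float32")
scaled = scale_to_unit_range(pixels)
print(scaled)
```

As far as I know, `ResNet50V2` expects the same [-1, 1] scaling, while the `preprocess_input` of `VGG16` instead converts RGB to BGR and subtracts the ImageNet channel means, so the VGG16 backbone here receives inputs scaled differently from its pretraining data.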



Because the labels are strings, we need to convert them into numbers, so I will use `LabelBinarizer` followed by the function `to_categorical`.

In [21]:
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels

array([[0],
       [0],
       [0],
       ...,
       [1],
       [1],
       [1]])

As we can see, `labels` is now a single-column array of 0s and 1s.

In [22]:
labels = to_categorical(labels)
labels

array([[1., 0.],
       [1., 0.],
       [1., 0.],
       ...,
       [0., 1.],
       [0., 1.],
       [0., 1.]], dtype=float32)

Now it is a 2D array with one column per class (we have two classes), exactly like one-hot encoding.
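
To make the two encoding steps concrete, here is a NumPy-only sketch that produces the same kind of one-hot array on three toy labels. It is not what `LabelBinarizer` and `to_categorical` do internally, just an equivalent illustration; the variable names are my own.

```python
import numpy as np

names = np.array(["with_mask", "with_mask", "without_mask"])

# np.unique sorts the class names, so "with_mask" -> 0 and "without_mask" -> 1
classes, indices = np.unique(names, return_inverse=True)

# Pick row i of the identity matrix for class index i: one column per class
one_hot = np.eye(len(classes), dtype="float32")[indices]
print(one_hot)

# The class name can be recovered from the position of the 1 in each row
decoded = classes[one_hot.argmax(axis=1)]
print(decoded)
```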

---
The lists `data` and `labels` must be converted to NumPy arrays before we can use them.

In [23]:
data = np.array(data, dtype="float32")
labels = np.array(labels)

In [24]:
print(data.shape)
print(labels.shape)

(3834, 224, 224, 3)
(3834, 2)


Now, I will split the data into a train set and a test set using `train_test_split`.
Here, I ran into a problem because `data` and `labels` did not have the same length (there were fewer labels). To solve that, I just copied another image so that both have the same size.

In [25]:
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.25, random_state=11)
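
A quick worked example of what this split produces for the 3834 images, assuming scikit-learn rounds a fractional test share up (which is my understanding of `train_test_split`) and the default batch size of 20 used by `train_model`:

```python
import math

n_samples = 3834   # number of images, from data.shape
test_size = 0.25
batch_size = 20    # default batch size of train_model

n_test = math.ceil(test_size * n_samples)   # 958.5 is rounded up
n_train = n_samples - n_test
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_test, steps_per_epoch)
```

So each epoch runs through about 144 batches of augmented images, and the test accuracy is measured on 959 images.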

Now, I will construct an image generator, which is useful for data augmentation.

In [26]:
data_aug = ImageDataGenerator(
        rotation_range=20,
        zoom_range=0.15,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.15,
        horizontal_flip=True,
        fill_mode="nearest")
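
To get a feel for these settings, the fractional ranges can be translated into pixels and zoom factors for a 224x224 image. This is only a back-of-the-envelope sketch, assuming a float shift range below 1 means a fraction of the image size and a float zoom range `z` means factors in `[1 - z, 1 + z]`:

```python
width = height = 224     # target_size used when loading the images
shift_fraction = 0.2     # width_shift_range and height_shift_range
zoom_range = 0.15
rotation_range = 20      # degrees

max_shift_px = shift_fraction * width              # largest horizontal/vertical shift
zoom_bounds = (1 - zoom_range, 1 + zoom_range)     # smallest and largest zoom factor
rotation_bounds = (-rotation_range, rotation_range)

print(max_shift_px, zoom_bounds, rotation_bounds)
```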

#### Construction of the 3 models :
* The VGG architecture
* The ResNet architecture
* The MobileNet architecture

## The VGG model : VGG16

In [27]:
baseModel_1 = VGG16(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))

In [28]:
headModel_1 = baseModel_1.output
headModel_1 = AveragePooling2D((7, 7))(headModel_1)
headModel_1 = Flatten(name="flatten")(headModel_1)
headModel_1 = Dense(128, activation="relu")(headModel_1)
headModel_1 = Dropout(0.5)(headModel_1)
headModel_1 = Dense(2, activation="sigmoid")(headModel_1)

In [29]:
model_1 = Model(inputs=baseModel_1.input, outputs=headModel_1)
for l in baseModel_1.layers:
    l.trainable = False
model_1.summary()

Model: "model_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0   

## The second model : The ResNet architecture 

In [30]:
baseModel_2 = ResNet50V2(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))

In [31]:
headModel_2 = baseModel_2.output
headModel_2 = AveragePooling2D((7, 7))(headModel_2)
headModel_2 = Flatten(name="flatten")(headModel_2)
headModel_2 = Dense(128, activation="relu")(headModel_2)
headModel_2 = Dropout(0.5)(headModel_2)
headModel_2 = Dense(2, activation="sigmoid")(headModel_2)

In [32]:
model_2 = Model(inputs=baseModel_2.input, outputs=headModel_2)

for layer in baseModel_2.layers:
    layer.trainable = False
model_2.summary()

Model: "model_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_5 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_5[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_conv[0][0]                 
____________________________________________________________________________________________

## The third model : MobileNet

In [33]:
baseModel_3 = MobileNetV2(weights="imagenet", include_top=False, input_tensor=Input(shape=(224, 224, 3)))



In [34]:
headModel_3 = baseModel_3.output
headModel_3 = AveragePooling2D((7, 7))(headModel_3)
headModel_3 = Flatten(name="flatten")(headModel_3)
headModel_3 = Dense(128, activation="relu")(headModel_3)
headModel_3 = Dropout(0.5)(headModel_3)
headModel_3 = Dense(2, activation="sigmoid")(headModel_3)
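
The three heads share the same structure, so the number of trainable parameters depends only on the depth of the backbone's last feature map. Here is a rough sketch of that arithmetic, assuming the final feature maps are 7x7 with 512 (VGG16), 2048 (ResNet50V2) and 1280 (MobileNetV2) channels for a 224x224 input. `head_params` is a hypothetical helper, not a Keras function.

```python
def head_params(channels, hidden=128, classes=2):
    """Parameters of Dense(hidden) + Dense(classes) fed with `channels` features.

    AveragePooling2D((7, 7)) reduces a 7x7 map to 1x1, so Flatten yields
    exactly `channels` values. Pooling, Flatten and Dropout have no weights.
    """
    dense_1 = channels * hidden + hidden    # weights + biases
    dense_2 = hidden * classes + classes    # weights + biases
    return dense_1 + dense_2

# Assumed depth of the last feature map of each backbone
backbone_channels = {"VGG16": 512, "ResNet50V2": 2048, "MobileNetV2": 1280}
counts = {name: head_params(c) for name, c in backbone_channels.items()}
for name, count in counts.items():
    print(name, count)
```

Since the backbone layers are frozen, these are the only weights updated during training, which can be checked against the "Trainable params" line of each `summary()`.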

In [35]:
model_3 = Model(inputs=baseModel_3.input, outputs=headModel_3)

In [36]:
for layer in baseModel_3.layers:
    layer.trainable = False
model_3.summary()

Model: "model_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_6 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
Conv1 (Conv2D)                  (None, 112, 112, 32) 864         input_6[0][0]                    
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization)   (None, 112, 112, 32) 128         Conv1[0][0]                      
__________________________________________________________________________________________________
Conv1_relu (ReLU)               (None, 112, 112, 32) 0           bn_Conv1[0][0]                   
____________________________________________________________________________________________

- To simplify things, I will write a function that includes all the training steps
- As input, the function takes a model, an optimizer, a model name, the number of epochs and the batch size

#### Optimizers that I will use :
* Adam
* RMSprop

In [37]:
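# `learning_rate` is the current argument name; `lr` is a deprecated alias in TF 2.x
rmsProp = RMSprop(learning_rate=0.0005, decay=1e-6)
adam = Adam(learning_rate=1e-4, decay=1e-4/20)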
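
To see what the `decay` argument does to the Adam learning rate, here is a small worked example. It assumes the legacy Keras behaviour, where the rate at a given iteration is `lr / (1 + decay * iteration)`, and about 144 batches per epoch (2875 training images with a batch size of 20). `decayed_lr` is a hypothetical helper written for this illustration.

```python
def decayed_lr(initial_lr, decay, iteration):
    """Inverse time decay, as I understand the legacy Keras `decay` argument."""
    return initial_lr / (1.0 + decay * iteration)

steps_per_epoch = 144              # assumption: ceil(2875 / 20)
total_steps = 20 * steps_per_epoch # 20 epochs

start = decayed_lr(1e-4, 1e-4 / 20, 0)
final = decayed_lr(1e-4, 1e-4 / 20, total_steps)
print(start, final)
```

Under these assumptions the learning rate only drops by about 1.4% over the 20 epochs, so the decay has little effect on a run this short.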



In [38]:
now = datetime.datetime.now

def train_model(model, optimizer, modelName, epochs=20, batch_size=20):
    # If the images were not normalized when they were loaded, rescale
    # X_train and X_test (divide by 255) before calling this function.
    print("Model summary :\n")
    model.summary()
    print("Compiling model...")
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])

    t = now()
    
    # The generator already yields batches of `batch_size`, so `fit` does not need it again
    model.fit(data_aug.flow(X_train, y_train, batch_size=batch_size),
              epochs=epochs,
              verbose=1,
              validation_data=(X_test, y_test))
    print('Training time: %s' % (now() - t))

    print("[INFO] evaluating network...")
    score = model.evaluate(X_test, y_test, verbose=0)
    print('Test score:', score[0])
    print('Test accuracy:', score[1])
    
    #Saving the model :
    print("Saving the model...")
    model.save("mask_detector_"+modelName+".model", save_format="h5")
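
`train_model` prints the training time but does not keep it. Since training time is one of the two comparison criteria, one option is to record it in a dictionary. The sketch below is one possible way to do that with the standard library only; `timed` and `training_times` are hypothetical names, and the `sum(...)` call is just a stand-in for `model.fit(...)`.

```python
import time
from contextlib import contextmanager
from datetime import timedelta

@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock time of the `with` block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = timedelta(seconds=time.perf_counter() - start)

training_times = {}
with timed("demo", training_times):
    sum(range(100_000))   # placeholder for model.fit(...)

print(training_times["demo"])
```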

## Training the first model : The VGG architecture :

In [40]:
train_model(model_1, rmsProp, "VGG")

Model summary :

Model: "model_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56

## Training the second model : The ResNet

In [61]:
train_model(model_2, adam, "ResNet")

Model summary :

Model: "model_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_5 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 230, 230, 3)  0           input_5[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 112, 112, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
pool1_pad (ZeroPadding2D)       (None, 114, 114, 64) 0           conv1_conv[0][0]                 
___________________________________________________________________________

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Training time: 1:16:04.277986
[INFO] evaluating network...
Test score: 0.02996882051229477
Test accuracy: 0.9937434792518616
Saving the model...




## Training the third model : The MobileNet architecture

In [39]:
train_model(model_3, adam, "mobileNet")

Model summary :

Model: "model_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_6 (InputLayer)            [(None, 224, 224, 3) 0                                            
__________________________________________________________________________________________________
Conv1 (Conv2D)                  (None, 112, 112, 32) 864         input_6[0][0]                    
__________________________________________________________________________________________________
bn_Conv1 (BatchNormalization)   (None, 112, 112, 32) 128         Conv1[0][0]                      
__________________________________________________________________________________________________
Conv1_relu (ReLU)               (None, 112, 112, 32) 0           bn_Conv1[0][0]                   
___________________________________________________________________________

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Training time: 0:22:15.962667
[INFO] evaluating network...
Test score: 0.03148581460118294
Test accuracy: 0.9885297417640686
Saving the model...




# Comparison between the 3 models :

## Training time :
* VGG : ~ 4 hours 37 minutes
* ResNet : ~ 1 hour 16 minutes
* MobileNet : ~22 minutes

## Accuracy :
* VGG : ~0.988
* ResNet : ~0.993
* MobileNet : ~0.988
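
To put the accuracy gap in perspective, the accuracies can be converted into a number of misclassified test images. This sketch uses the exact values printed in the ResNet and MobileNet logs and assumes a test set of 959 images (25% of 3834, rounded up). VGG is left out because only its approximate accuracy is reported.

```python
n_test = 959   # assumption: ceil(0.25 * 3834)

results = {
    "ResNet": {"accuracy": 0.9937434792518616, "minutes": 76},
    "MobileNet": {"accuracy": 0.9885297417640686, "minutes": 22},
}

for name, r in results.items():
    # Number of test images the model got wrong
    r["errors"] = round((1 - r["accuracy"]) * n_test)
    print(f"{name}: {r['errors']} errors out of {n_test}, ~{r['minutes']} min of training")
```

Under that assumption, ResNet misclassifies 6 test images and MobileNet 11, for roughly 54 extra minutes of training.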

## Final model and next steps :

As we see here, MobileNet had the shortest training time, and ResNet has the best accuracy. VGG is a poor choice for this kind of problem because it took far too long to train. So the only models I can choose between are ResNet and MobileNet. Because we only trained these models for 20 epochs, my final model will be MobileNet, trained for more epochs to get the highest possible accuracy. I also created another Python file in which I used OpenCV to do the real-time prediction, and it works really well.