# Detecting Face Masks in Images

#### Basic steps:
1. Given images, detect facial regions and features (find faces)
2. Create train/test data with facial regions as inputs
3. Using a supervised model trained on the training images, predict whether each facial region contains a mask

#### Environment Setup

1. Use/create a new [Anaconda](https://www.anaconda.com/products/individual) environment with Python 3.6 (not 3.7 or 3.8), keras, and tensorflow:
``` sh
$ conda create -n <your_env_name> python=3.6 anaconda tensorflow keras
```
2. Install dependencies:
``` sh
$ conda install -c anaconda xlrd xlwt scikit-learn seaborn
$ conda install mtcnn ipyparallel
$ pip install opencv-python
```
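
Once the environment is built, a quick check along these lines can confirm that the packages resolved. This is only a sketch: the helper name `missing_modules` and the `REQUIRED` list are illustrative, and the list uses import names (for example `cv2` and `sklearn`) rather than the package names passed to conda or pip.

```python
import importlib.util

# Import names (not install names) of the libraries this notebook relies on.
REQUIRED = ["numpy", "pandas", "matplotlib", "cv2",
            "tensorflow", "keras", "sklearn", "mtcnn"]

def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# An empty list means every module was found.
print(missing_modules(REQUIRED))
```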

In [1]:
import numpy as np
import pandas as pd
import os
import urllib
import matplotlib.pyplot as plt
import cv2
import matplotlib.patches as patches
import tensorflow as tf
import keras
from keras.utils import to_categorical
from keras.layers import Flatten, Dense, Conv2D, MaxPooling2D, Dropout
from keras.models import Sequential        # this keras setup targets Python 3.6 (see Environment Setup)
import sklearn
from sklearn.preprocessing import LabelEncoder
from mtcnn.mtcnn import MTCNN

Using TensorFlow backend.


## Grab/Process Image Data

1. At the time of writing, a prepared dataframe for images outside our training set should already exist in the data directory: `data/img_data/not_classified.csv`
  + Each row of the prepared dataframe holds the image filename, the coordinates of one face's bounding box, and an empty `classname` column that will later hold the classification result.
2. For training, we currently use Kaggle's medical masks dataset.
  + This data is prepared the same way, but its `classname` column is already filled in (mask/no mask/colorful mask)
  + Ideally, we could get some additional images for the training set containing masks that are not strictly 'medical' looking (anything that effectively covers the nose and mouth: bandanas, etc.)


In [2]:
# define directory paths for easier navigation
NOTEBOOK_DIR = os.getcwd()
ROOT_DIR = os.path.abspath(os.path.join(NOTEBOOK_DIR, os.pardir))
DATA_DIR = os.path.join(ROOT_DIR, 'data')
IMAGES_DIR = os.path.join(DATA_DIR, 'images')
ANNOTATIONS_DIR = os.path.join(DATA_DIR, 'annotations')
SUB_IMAGES_DIR = os.path.join(IMAGES_DIR, 'sub')

# load the training data and the not-yet-classified (test) data
df_train = pd.read_csv(os.path.join(DATA_DIR, 'train.csv'))
df_test = pd.read_csv(
    os.path.join(DATA_DIR, 'img_data/not_classified.csv'),
    index_col=0
)

In [3]:
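# NOTE: after this rename, 'w' and 'h' hold the right and bottom edge
# coordinates of the box (the crop below uses img[y:h, x:w]), not a width/height
df_train.rename(
    columns={'x1': 'x', 'x2': 'y', 'y1': 'w', 'y2': 'h'},
    inplace=True
)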
df_train.sort_values('name', axis = 0, inplace = True)
df_train

| | name | x | y | w | h | classname |
|---|---|---|---|---|---|---|
| 13381 | 1801.jpg | 451 | 186 | 895 | 697 | face_no_mask |
| 3464 | 1802.jpg | 160 | 151 | 268 | 265 | mask_surgical |
| 3463 | 1802.jpg | 110 | 71 | 273 | 272 | face_with_mask |
| 14836 | 1803.jpg | 147 | 200 | 288 | 320 | mask_surgical |
| 14835 | 1803.jpg | 126 | 75 | 303 | 333 | face_with_mask |
| ... | ... | ... | ... | ... | ... | ... |
| 13555 | 6433.png | 669 | 205 | 774 | 282 | mask_surgical |
| 9508 | 6434.jpg | 315 | 82 | 775 | 783 | face_with_mask |
| 9507 | 6434.jpg | 343 | 448 | 756 | 774 | mask_colorful |
| 9434 | 6435.jpg | 198 | 86 | 292 | 149 | mask_surgical |


In [4]:
df_test.rename(
    columns={'x1': 'x', 'x2': 'y', 'y1': 'w', 'y2': 'h'},
    inplace=True
)
df_test.sort_values('name', axis = 0, inplace = True)
df_test

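| | name | x | y | w | h | classname |
|---|---|---|---|---|---|---|
| 0 | 0001.jpg | 441 | 108 | 341 | 416 | |
| 1 | 0003.jpg | 1292 | 218 | 865 | 1088 | |
| 2 | 0004.jpg | 630 | 176 | 212 | 266 | |
| 3 | 0006.jpg | 441 | 668 | 57 | 70 | |
| 4 | 0007.jpg | 922 | 565 | 102 | 121 | |
| ... | ... | ... | ... | ... | ... | ... |
| 3161 | 1796.jpg | 469 | 223 | 186 | 229 | |
| 3159 | 1796.jpg | 933 | 207 | 232 | 298 | |
| 3162 | 1797.jpg | 738 | 235 | 483 | 638 | |
| 3163 | 1799.jpg | 757 | 296 | 75 | 86 | |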
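
The two frames do not appear to use the `w` and `h` columns the same way. Judging only from the sample rows shown, `w`/`h` in the training frame are always larger than `x`/`y`, which fits right/bottom edge coordinates, while in the test frame they are sometimes smaller (0001.jpg has `x=441`, `w=341`), which suggests a width and height. That reading is an inference from the printed rows, not something the data source confirms. Under that assumption, a cropping helper that handles both conventions might look like this (`crop_box` and its `corners` flag are hypothetical names):

```python
import numpy as np

def crop_box(img, x, y, w, h, corners=True):
    """Crop a face region from a 2-D (grayscale) image array.

    corners=True  -> (w, h) are the right and bottom edge coordinates.
    corners=False -> (w, h) are the box width and height.
    """
    if corners:
        x2, y2 = w, h
    else:
        x2, y2 = x + w, y + h
    return img[y:y2, x:x2]

# Blank stand-in image, large enough to contain both sample boxes.
img = np.zeros((1200, 1200), dtype=np.uint8)

train_crop = crop_box(img, 451, 186, 895, 697, corners=True)   # row 1801.jpg
test_crop = crop_box(img, 441, 108, 341, 416, corners=False)   # row 0001.jpg
print(train_crop.shape)  # (511, 444)
print(test_crop.shape)   # (416, 341)
```

Slicing the test row with the training convention (`img[108:416, 441:341]`) would return an empty array, so it is worth checking which convention each frame follows before cropping.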


## Up Next:

1. We'll need to be able to look at each facial region within an image
2. We'll also need to keep the data somewhere
3. We also need to define and train a keras model before we can make any predictions 

## We know where our faces are... let's get our training data ready for the model

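In [None]:
train_images = df_train['name'].unique()
data = []
for image in train_images:
    # only the first annotated box of each image is used
    name, x, y, w, h, classname = df_train[df_train['name'] == image].values[0]
    og_img = cv2.imread(os.path.join(IMAGES_DIR, name), cv2.IMREAD_GRAYSCALE)
    if og_img is None:  # cv2.imread returns None when the file cannot be read
        continue
    cropped = og_img[y:h, x:w]  # w and h are the right/bottom edges here
    if cropped.size == 0:  # skip boxes that produce an empty crop
        continue
    new_img = cv2.resize(cropped, (50, 50))  # resize img to 50x50
    data.append([new_img, classname])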

## A Few More Steps:
1. Split training data into features and labels
2. Convert labels to categorical values and reshape features
3. Define/train a model on the training set

In [6]:
# split into labels/features
x=[]
y=[]
for features, labels in data:
    x.append(features)
    y.append(labels)

# convert labels to categorical values
labeler = LabelEncoder()
y = labeler.fit_transform(y)

In [7]:
# reshape features
x = np.array(x).reshape(-1, 50, 50, 1)
x = tf.keras.utils.normalize(x, axis=1)
y = to_categorical(y)

y.shape

(4326, 20)
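
The second dimension of `y` is the number of distinct class names that were encoded, so a shape of `(4326, 20)` means 20 different labels reached the encoder. The NumPy-only sketch below mirrors what `LabelEncoder` followed by `to_categorical` does on a toy label list; `one_hot_labels` is a hypothetical helper, not part of either library.

```python
import numpy as np

def one_hot_labels(labels):
    """Return (classes, one_hot) for a sequence of string labels."""
    # np.unique sorts the class names and maps each label to its class index.
    classes, indices = np.unique(labels, return_inverse=True)
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(len(classes), dtype="float32")[indices]
    return classes, one_hot

labels = ["face_no_mask", "mask_surgical", "face_with_mask", "mask_surgical"]
classes, one_hot = one_hot_labels(labels)
print(list(classes))   # ['face_no_mask', 'face_with_mask', 'mask_surgical']
print(one_hot.shape)   # (4, 3)
```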

## Model Creation/Fitting

1. Keras models can be created, compiled, trained, and then *saved* to a file and/or loaded from one.
2. If there isn't a model file available, we have to create it.
  + This takes a while, so if a trained model already exists, we load it instead of retraining
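
As a rough check on the architecture in this section's model cell, the feature-map sizes can be worked out by hand. The sketch assumes the Keras defaults that the cell relies on implicitly: `padding='valid'` for the convolutions and a pooling stride equal to the pool size. `conv_out` is a hypothetical helper.

```python
def conv_out(size, kernel, stride=1):
    """Output length of a 'valid' (unpadded) convolution or pooling window."""
    return (size - kernel) // stride + 1

side = 50                                   # 50x50 grayscale input
side = conv_out(side, kernel=3, stride=2)   # Conv2D(100, (3, 3), strides=2) -> 24
side = conv_out(side, kernel=2, stride=2)   # MaxPooling2D((2, 2))           -> 12
side = conv_out(side, kernel=3)             # Conv2D(64, (3, 3))             -> 10
side = conv_out(side, kernel=2, stride=2)   # MaxPooling2D((2, 2))           -> 5
flat_features = side * side * 64            # 64 channels enter Flatten()
print(side, flat_features)  # 5 1600
```

So the `Dense(50)` layer sees 1600 inputs per image.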

In [83]:
model_path = "../models/image_model"
if os.path.exists(model_path):
    model = keras.models.load_model(model_path)

else:
    batch_size = 5
    epochs = 30
    model = Sequential()
    model.add(Conv2D(100, (3, 3), input_shape=x.shape[1:], activation='relu', strides=2))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(50, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(2, activation='softmax'))

    opt = keras.optimizers.Adam(learning_rate=1e-3, epsilon=1e-5)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(x, y, epochs=epochs, batch_size=batch_size)
    model.save(model_path)

ValueError: Error when checking target: expected dense_6 to have shape (2,) but got array with shape (20,)
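
The error follows from the label shape printed earlier: `y` has 20 one-hot columns, while the final layer is `Dense(2)`. There are two ways to reconcile them. One is to size the output layer from the labels, for example `Dense(y.shape[1], activation='softmax')`. The other is to reduce the labels to two classes before encoding. Below is a minimal sketch of the second option, assuming that only the `face_with_mask` and `face_no_mask` rows (both visible in the training frame above) are wanted; the helper name `keep_face_rows` and the toy frame are hypothetical.

```python
import pandas as pd

# Assumed target classes; both names appear in the training frame.
FACE_CLASSES = ["face_with_mask", "face_no_mask"]

def keep_face_rows(df, classes=FACE_CLASSES):
    """Keep only the rows whose classname is one of the face-level classes."""
    return df[df["classname"].isin(classes)].reset_index(drop=True)

# Toy stand-in for df_train with a mix of face-level and mask-level rows.
toy = pd.DataFrame({
    "name": ["1801.jpg", "1802.jpg", "1802.jpg", "1803.jpg"],
    "classname": ["face_no_mask", "mask_surgical",
                  "face_with_mask", "mask_surgical"],
})
faces = keep_face_rows(toy)
print(faces["classname"].nunique())  # 2
```

Applied to `df_train` before the cropping loop, this would also stop the loop from picking a `mask_surgical` row as the first box for an image, and it would leave `to_categorical` with two columns as long as both classes are still present.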