<a href="https://colab.research.google.com/github/sabrina-beck/masked-faces-deep-learning/blob/snap-2020-12-12-15-18/2020s2_mo434_projeto_final.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Final Project

Universidade Estadual de Campinas (UNICAMP), Instituto de Computação (IC)

Prof. Sandra Avila, 2020s2

In [1]:
print('173334: ' + 'Luiz Henrique Simioni Machado')
print('121192: ' + 'Mariane Previde')
print('157240: ' + 'Sabrina Beck Angelini')

173334: Luiz Henrique Simioni Machado
121192: Mariane Previde
157240: Sabrina Beck Angelini


# Topic: Face Mask Detection
With the Covid-19 pandemic, people needed to start wearing masks to reduce the spread of SARS-CoV-2. Masks have increasingly proven effective at reducing the number of infections or the viral load transmitted, which lowers the number of cases and hospitalizations and helps prevent hospital overload and deaths.

The idea is to develop a neural network that can detect, in public spaces, people who are not wearing a mask ~~**or are wearing one incorrectly**~~. Public officials could then approach these people and hand them a mask ~~**or explain how to wear it correctly**~~.

**~~We intend to merge several datasets to obtain a larger training set~~. Our final dataset should have ~~3~~ 2 classes:**
* People wearing masks
* People without masks
* ~~**People wearing a mask incorrectly**~~

# Dataset

We use the [Real World Masked Face Dataset](https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset). We chose the first [download](https://github.com/X-zhangyang/Real-World-Masked-Face-Dataset) option, in which the image samples have been cleaned and labeled. It contains:


*   5000 masked faces from 525 people
*   90000 faces without masks

We uploaded a copy of the dataset, taken on November 23, 2020, to Google Drive [here](https://drive.google.com/file/d/1UD8nf8CfuEycJwt2mBjfT9ElB2QoOMlx/view?usp=sharing).

In [2]:
!pip install PyDrive &> /dev/null

In [3]:
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import io
import zipfile

# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download the zipped dataset based on its file ID on Drive.
file_id = '1UD8nf8CfuEycJwt2mBjfT9ElB2QoOMlx' #-- Updated File ID for my zip
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('RMFD.zip')
!unzip -q RMFD.zip

## Dataset Organization

The dataset is organized into the following subdirectories:
* `AFDB_face_dataset/`: images of people without masks
* `AFDB_masked_face_dataset/`: images of people wearing masks

Each subdirectory is in turn made up of further subdirectories, each holding the photos of one of the 525 people.

Because of this layout, we reorganized the dataset so that every image sits directly in its class subdirectory, since we are only interested in the `face` vs. `masked face` classification.

In [None]:
from pathlib import Path

basePath = Path('self-built-masked-face-recognition-dataset')
maskPath = basePath/'AFDB_masked_face_dataset'
nonMaskPath = basePath/'AFDB_face_dataset'

In [None]:
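from pathlib import Path
from tqdm import tqdm
from os import rmdir, path
from shutil import move

def flattenClassDir(classDir, classDesc):
  # Move every image out of its per-person subdirectory into classDir.
  # Files are renamed to a running counter (always with a .jpg extension)
  # so that names from different people cannot collide.
  # Only directories are visited, so re-running the cell is harmless.
  classDirList = [p for p in classDir.iterdir() if p.is_dir()]
  count = 0
  for subDirectory in tqdm(classDirList, desc=classDesc):
    for imgPath in subDirectory.iterdir():
      count += 1
      move(imgPath, path.join(classDir, '%s.jpg' % count))
    # The per-person directory is empty now and can be removed.
    rmdir(subDirectory)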

flattenClassDir(nonMaskPath, 'non mask photos')
flattenClassDir(maskPath, 'mask photos')

non mask photos: 100%|██████████| 460/460 [00:03<00:00, 138.85it/s]
mask photos: 100%|██████████| 525/525 [00:00<00:00, 5434.31it/s]


## Is the Dataset Balanced?

In [None]:
mask_image_count = len(list(maskPath.glob('*.jpg')))
non_mask_image_count = len(list(nonMaskPath.glob('*.jpg')))
image_count = mask_image_count + non_mask_image_count

print('Masked photos: ', mask_image_count, ' (', '%.2f' % (mask_image_count / image_count * 100), '%)')
print('Non masked photos: ', non_mask_image_count, ' (', '%.2f' % (non_mask_image_count / image_count * 100), '%)')

Masked photos:  2203  ( 2.38 %)
Non masked photos:  90468  ( 97.62 %)


In [None]:
# Scaling by total/2 helps keep the loss to a similar magnitude.
# The sum of the weights of all examples stays the same.
weight_for_0 = (1 / non_mask_image_count)*(image_count)/2.0 
weight_for_1 = (1 / mask_image_count)*(image_count)/2.0

train_class_weights = {0: weight_for_0, 1: weight_for_1}

print('Weight for class 0 (non mask): {:.2f}'.format(weight_for_0))
print('Weight for class 1 (mask): {:.2f}'.format(weight_for_1))

Weight for class 0 (non mask): 0.51
Weight for class 1 (mask): 21.03
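
The weights above follow the pattern `total / (number_of_classes * class_count)`, so the rare class is weighted up in proportion to how rare it is. As a minimal sketch (the helper name `balanced_class_weights` is hypothetical and not part of the notebook), the same numbers can be reproduced from the counts printed earlier, and one can check that both classes end up contributing the same total weight:

```python
def balanced_class_weights(counts):
    """Return {label: total / (n_classes * count)} for a dict of class counts."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}

# Counts printed by the balance check: class 0 = non mask, class 1 = mask.
counts = {0: 90468, 1: 2203}
weights = balanced_class_weights(counts)
print({label: round(w, 2) for label, w in weights.items()})

# Total weight contributed by each class (weight times number of examples).
per_class_total = {label: weights[label] * n for label, n in counts.items()}
print(per_class_total)
```

With two classes, each class contributes half of the total, and the weights summed over all examples equal the number of examples, which is what keeps the loss at a similar magnitude.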


## Loading the Dataset

In [None]:
batch_size = 512
img_width = 299
img_height = 299
input_shape = (img_width, img_height, 3)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

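def load_ds(preprocess_function):
  # No `rescale` here: ImageDataGenerator applies `rescale` after
  # `preprocessing_function`, and the model-specific preprocess function
  # expects raw 0-255 pixels and already normalizes them.
  datagen = ImageDataGenerator(
            #  rotation_range=40,
            #  width_shift_range= 0.1,
            #  height_shift_range= 0.1,
            #  shear_range =  2,
            #  zoom_range =  0.2,
            horizontal_flip=  True,
            vertical_flip =  True,
            #  vertical_flip =  False,
            #  fill_mode = "nearest"
            validation_split=0.2,
            preprocessing_function=preprocess_function
            )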

  train_ds = datagen.flow_from_directory(
            basePath,
            target_size=(img_width, img_height),
            batch_size=batch_size,
            class_mode='binary',
            subset='training')
    
  val_ds = datagen.flow_from_directory(
            basePath,
            target_size=(img_width, img_height),
            batch_size=batch_size,
            class_mode='binary',
            subset='validation')
  
  return (train_ds, val_ds)

In [None]:
# TODO split test_ds
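
One possible way to address this TODO is to split the file lists per class before building any generator, so that the rare mask class shows up in every subset. The sketch below is only an assumption about how that could look: `stratified_split`, its fractions, and the toy file names are hypothetical, and on the real data the lists would come from something like `maskPath.glob('*.jpg')`.

```python
import random

def stratified_split(paths_by_class, val_frac=0.2, test_frac=0.1, seed=42):
    """Split each class's file list into train/val/test using the same fractions."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, paths in paths_by_class.items():
        shuffled = sorted(paths)  # sort first so the shuffle is reproducible
        rng.shuffle(shuffled)
        n_test = int(len(shuffled) * test_frac)
        n_val = int(len(shuffled) * val_frac)
        splits["test"] += [(p, label) for p in shuffled[:n_test]]
        splits["val"] += [(p, label) for p in shuffled[n_test:n_test + n_val]]
        splits["train"] += [(p, label) for p in shuffled[n_test + n_val:]]
    return splits

# Toy file names standing in for the flattened class directories.
toy = {
    "AFDB_face_dataset": [f"face_{i}.jpg" for i in range(100)],
    "AFDB_masked_face_dataset": [f"mask_{i}.jpg" for i in range(10)],
}
splits = stratified_split(toy)
print({name: len(items) for name, items in splits.items()})
```

Because the per-person directories were flattened, this sketch cannot keep all photos of one person in the same subset; a split done before flattening could.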

# Models

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras

In [None]:
epochs=10
optimizer=keras.optimizers.Adam(1e-3)
callbacks=[tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=2)]

## Baseline: ResNet50

In [None]:
from tensorflow.keras.applications import resnet50

In [None]:
(resnet50_train_ds, resnet50_val_ds) = load_ds(resnet50.preprocess_input)

Found 74138 images belonging to 2 classes.
Found 18533 images belonging to 2 classes.


In [None]:
class_names = list(resnet50_train_ds.class_indices.keys())
class_names

['AFDB_face_dataset', 'AFDB_masked_face_dataset']

In [None]:
# Pre-trained model WITHOUT the dense layers (include_top=False)
resnet50_model = tf.keras.applications.ResNet50(
    weights='imagenet', 
    include_top=False, 
    input_shape=(img_height, img_width) + (3,))

# Freeze the pre-trained layers
for layer in resnet50_model.layers:
    layer.trainable = False

resnet50_model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "resnet50"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 299, 299, 3) 0                                            
__________________________________________________________________________________________________
conv1_pad (ZeroPadding2D)       (None, 305, 305, 3)  0           input_1[0][0]                    
__________________________________________________________________________________________________
conv1_conv (Conv2D)             (None, 150, 150, 64) 9472        conv1_pad[0][0]                  
__________________________________________________________________________________________________
conv1_bn (BatchNormalization)   (None, 150, 150

In [None]:
# Add new output layers with the right number of outputs for this problem
resnet50_full_model = tf.keras.Sequential([
  resnet50_model,
  tf.keras.layers.GlobalAveragePooling2D(),
  # Only 1 output neuron, with a value between 0 and 1: close to 0 for 'AFDB_face_dataset' (non mask), close to 1 for 'AFDB_masked_face_dataset' (mask)
  tf.keras.layers.Dense(1, activation='sigmoid')
])

resnet50_full_model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
resnet50 (Functional)        (None, 10, 10, 2048)      23587712  
_________________________________________________________________
global_average_pooling2d (Gl (None, 2048)              0         
_________________________________________________________________
dense (Dense)                (None, 1)                 2049      
Total params: 23,589,761
Trainable params: 2,049
Non-trainable params: 23,587,712
_________________________________________________________________


In [None]:
# Since there are two classes, use a binary cross-entropy loss. The last layer already
# applies a sigmoid, so the model outputs probabilities and from_logits must be False.
resnet50_full_model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=['acc'],
    optimizer=optimizer
    )
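
The `from_logits` flag tells the loss whether the model output is a raw score (a logit) or already a probability, and it has to match the final layer. As a rough illustration in plain Python (these helper functions are hypothetical and are not the Keras implementation), treating a sigmoid output as if it were a logit applies the sigmoid twice and changes the loss:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_prob(y_true, p, eps=1e-7):
    """Binary cross-entropy when the model output is already a probability."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))

def bce_from_logit(y_true, z):
    """Binary cross-entropy when the model output is a raw score (logit)."""
    return bce_from_prob(y_true, sigmoid(z))

logit = 2.0
prob = sigmoid(logit)             # what a sigmoid output layer would emit

correct = bce_from_prob(1, prob)  # probability treated as a probability
same = bce_from_logit(1, logit)   # logit treated as a logit: same loss
wrong = bce_from_logit(1, prob)   # probability treated as a logit: sigmoid applied twice
print(correct, same, wrong)
```

Since a sigmoid output lies between 0 and 1, a second sigmoid maps it to between 0.5 and roughly 0.73, so with the mismatched flag the loss can never approach zero however confident the model is.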

In [None]:
# Train the model
# (batch_size is already set on the generator, so it is not passed to fit)
history_restnet50 = resnet50_full_model.fit(resnet50_train_ds, 
      epochs=epochs, 
      validation_data=resnet50_val_ds,
      callbacks=callbacks,
      class_weight=train_class_weights
      )

Epoch 1/10
Epoch 2/10

KeyboardInterrupt: ignored

In [None]:
# Evaluate the model on the validation set
score_full_restnet50 = resnet50_full_model.evaluate(resnet50_val_ds, verbose=1)

print('Valid loss:', score_full_restnet50[0])
print('Valid acc:', score_full_restnet50[1])
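
With about 97.6% of the images in the non-mask class, accuracy on its own says little: a model that never predicts "mask" would still score very high. The sketch below (a hypothetical helper on toy labels, not results from this notebook) shows how precision and recall for the mask class expose that case. In Keras, `tf.keras.metrics.Precision` and `tf.keras.metrics.Recall` could be added to `metrics` for the same purpose.

```python
def binary_report(y_true, y_pred):
    """Accuracy, precision and recall for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Toy labels with roughly the class ratio of the dataset: 2 mask (1) among 100.
y_true = [1] * 2 + [0] * 98
always_non_mask = [0] * 100  # a degenerate "model" that never predicts mask

report = binary_report(y_true, always_non_mask)
print(report)
```

Here the accuracy is 0.98 while the recall for the mask class is 0, which is why the class weights and per-class metrics matter for this dataset.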

### Fine tuning

In [None]:
# Unfreeze the pre-trained layers
for layer in resnet50_model.layers:
    layer.trainable = True

# Recompile so the change to `trainable` takes effect. The output layer is a sigmoid,
# so the loss receives probabilities (from_logits=False).
resnet50_full_model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
    metrics=['acc'],
    optimizer=optimizer
    )
    
resnet50_full_model.summary()

In [None]:
# batch_size is already set on the generator, so it is not passed to fit
resnet50_unfreeze_history = resnet50_full_model.fit(
    resnet50_train_ds, 
    epochs=epochs, 
    validation_data=resnet50_val_ds,
    callbacks=callbacks)

In [None]:
# Evaluate the model on the validation set
resnet50_unfreeze_score = resnet50_full_model.evaluate(resnet50_val_ds,verbose=1)
print('Validation loss:', resnet50_unfreeze_score[0])
print('Validation acc:', resnet50_unfreeze_score[1])