# MARRtino face mask classifier
The purpose of my project is to create an image classifier that can recognize whether or not a person is wearing a face mask.
This notebook contains the code I wrote to load the face mask dataset and train my classifier, explained step by step. You can find more in my GitHub repository [here](https://github.com/ludocomito/marrtino-face-detection).

### Import and setup Weights & Biases

Weights & Biases is a platform commonly used for logging information about the training process of a model. Here it is used to log loss and accuracy during the training and validation phases. You can find the results of my experiments [here](https://github.com/ludocomito/marrtino-face-detection/tree/main/training%20reports).
The following cells install the wandb library, log in, and set up the connection to the Weights & Biases project.

In [None]:
# WandB – Install the W&B library
!pip install wandb -q

In [None]:
import wandb

# API Key will be requested
wandb.login()

In [None]:
# Starts the communication with the platform
wandb.init(project="MARRtino-face-mask-recognition", entity="ludocomito")

## Libraries used
The libraries involved in this project are:
* Torch and Torchvision - the framework used to create and train the model
* Numpy - used for array manipulation and math transforms
* Matplotlib - used for displaying images and plots

In [None]:
from __future__ import print_function, division

import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torch.backends.cudnn as cudnn
from torch.utils.data import DataLoader, random_split
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
import copy

cudnn.benchmark = True
plt.ion()   # interactive mode

## Importing the dataset
To import and process the dataset correctly we have to choose the proper transforms. Here we define a pipeline of transforms that applies the following manipulations to each image:
* Transforming the image into a tensor (an object that can be processed by a neural net, with pixel values scaled to the range [0, 1])
* Normalizing the image (rescaling the pixel intensities; with a mean and standard deviation of 0.5 per channel, values move from [0, 1] to [-1, 1])

The dataset folder is imported with the datasets.ImageFolder class, which handles loading the photos and their labels for us.

⚠️ *The ImageFolder function expects to have as an input a folder containing a sub-folder for every class. Each sub-folder should then contain the elements belonging to that class. Luckily our dataset already has this structure and does not require any change.*
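
To make the expected layout concrete, here is a minimal plain-Python sketch of how labels can be derived from sub-folder names. It mirrors the convention ImageFolder follows (class names sorted alphabetically, then numbered from 0), but it is not the library's code. The folder names `with_mask` and `without_mask` and the helper `find_classes_sketch` are assumptions made for illustration.

```python
import os
import tempfile


def find_classes_sketch(root):
    """Return the sorted sub-folder names and a name -> index mapping."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


# Build a throwaway directory tree with two hypothetical class folders.
with tempfile.TemporaryDirectory() as root:
    for class_name in ("without_mask", "with_mask"):
        os.makedirs(os.path.join(root, class_name))
    example_classes, example_class_to_idx = find_classes_sketch(root)

print(example_classes)
print(example_class_to_idx)
```

The integer assigned to each folder is the label that the classifier later learns to predict.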

In [None]:
# Defining the pipeline of transforms that will be applied to the images when creating the dataset.
data_transforms = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])

data_dir = '../input/face-mask-detection/Dataset'

dataset = torchvision.datasets.ImageFolder(data_dir,transform=data_transforms)
print(dataset)
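
The Normalize step in the pipeline above applies `(value - mean) / std` to every channel. The following plain-Python sketch shows that arithmetic on single pixel values, assuming they are already in [0, 1] as produced by ToTensor. The helper names are hypothetical and only illustrate the formula.

```python
def normalize_value(value, mean=0.5, std=0.5):
    """Apply (value - mean) / std to one channel value in [0, 1]."""
    return (value - mean) / std


def denormalize_value(value, mean=0.5, std=0.5):
    """Invert the normalization, e.g. before displaying an image."""
    return value * std + mean


pixels = [0.0, 0.25, 0.5, 1.0]
normalized = [normalize_value(p) for p in pixels]
restored = [denormalize_value(n) for n in normalized]

print(normalized)  # [-1.0, -0.5, 0.0, 1.0]
print(restored)    # [0.0, 0.25, 0.5, 1.0]
```

With a mean and standard deviation of 0.5, the range [0, 1] is mapped to [-1, 1], and the inverse operation recovers the original values.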

Here we split the dataset into training and validation sets: 80% of the images are used for training and 20% for validation.

In [None]:
train_set_size = int(len(dataset)*0.8)
val_set_size = len(dataset) - train_set_size
train, val = torch.utils.data.random_split(dataset, [train_set_size,val_set_size])
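
As a rough illustration of what a random 80/20 split does, here is a plain-Python sketch that shuffles indices with a fixed seed and cuts the list in two. The dataset size of 1003 and the function `split_indices` are assumptions for the example. In the notebook itself, reproducibility could instead be obtained by passing a seeded `generator` argument to `random_split`.

```python
import random


def split_indices(n_items, train_fraction=0.8, seed=0):
    """Shuffle the indices 0..n_items-1 and split them into two disjoint lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    train_size = int(n_items * train_fraction)
    return indices[:train_size], indices[train_size:]


train_idx, val_idx = split_indices(1003)
print(len(train_idx), len(val_idx))  # 802 201
```

Computing the validation size as the remainder, as the cell above does, guarantees that the two sizes always add up to the dataset length even when 80% is not a whole number.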

Next we define the data loaders. A data loader combines a dataset and a sampler, and provides an iterable over the given dataset. The iterable is used to fetch the batches during training and validation.



In [None]:
# Defining the data loaders for both sets
train_loader = DataLoader(train, batch_size=64, shuffle=True)
val_loader = DataLoader(val, batch_size=64, shuffle=True)

# Creating the dataloaders object, which contains both the loaders
dataloaders = {"train":train_loader,"val":val_loader}
dataset_sizes = {"train": len(train), "val":len(val)}

# Check that everything is correct
dataloaders["train"]

In [None]:
# Checking that images and labels are represented correctly
dataiter = iter(train_loader)
images, labels = next(dataiter)  # iterators no longer expose a .next() method in recent PyTorch versions
print(type(images))
print(images.shape)
print(labels.shape)
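
The first dimension of the printed shapes is the batch size. As a small sketch of how many batches one pass over the data produces, the function below computes the batch sizes when the last, smaller batch is kept (the DataLoader default). The count of 802 training images is a hypothetical figure, and `batch_sizes_for` is an illustrative helper.

```python
def batch_sizes_for(n_items, batch_size=64):
    """Sizes of the batches in one epoch when the final partial batch is kept."""
    full_batches, remainder = divmod(n_items, batch_size)
    return [batch_size] * full_batches + ([remainder] if remainder else [])


example_sizes = batch_sizes_for(802)
print(len(example_sizes), example_sizes[-1])  # 13 34
```

This is why the last batch of an epoch can be smaller than 64, which matters later when averaging the loss over an epoch.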

## Defining a helper function that shows a batch of images
After the transformation, images are represented as tensors. The imshow function is helpful when we want to turn them back into viewable images and show them in a plot.

In [None]:
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().transpose((1, 2, 0)) # permuting the axes in the correct order
    mean = np.array([0.5, 0.5, 0.5])
    std = np.array([0.5, 0.5, 0.5])
    inp = std * inp + mean
    inp = np.clip(inp, 0, 1)
    plt.figure(figsize=(10, 10))
    plt.imshow(inp)
    if title is not None:
        plt.title(title)  # show the optional title above the image
    plt.pause(0.001)  # pause a bit so that plots are updated


# Get a batch of training data
inputs, classes = next(iter(train_loader))

# Make a grid from batch
out = torchvision.utils.make_grid(inputs)

imshow(out)

## Training the model
The following function is responsible for training the model. It runs for _num_epochs_ epochs (25 by default), and each epoch consists of a training phase and a validation phase. The loss is computed in both phases, while backpropagation and the update of the model's weights happen only during training.
At the end of the execution it loads the weights that achieved the best validation accuracy, which were saved with a deep copy, into the returned model.
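
One detail of the function below is worth a small sketch: the loss returned for each batch is a mean over that batch, so it is multiplied by the batch size before being accumulated and finally divided by the dataset size. The plain-Python example here uses made-up loss values and a hypothetical helper `epoch_average` to show why this differs from simply averaging the batch losses.

```python
def epoch_average(batch_losses, batch_sizes):
    """Per-sample average loss over an epoch, from per-batch mean losses."""
    running_loss = 0.0
    for loss, size in zip(batch_losses, batch_sizes):
        running_loss += loss * size  # turn the batch mean back into a batch sum
    return running_loss / sum(batch_sizes)


# Made-up values: two full batches and one small final batch with a high loss.
example_losses = [0.5, 0.5, 2.0]
example_batch_sizes = [64, 64, 8]

weighted = epoch_average(example_losses, example_batch_sizes)
naive = sum(example_losses) / len(example_losses)
print(f"weighted: {weighted:.4f}, naive: {naive:.4f}")  # weighted: 0.5882, naive: 1.0000
```

The naive average lets the small final batch count as much as a full one, while the weighted version gives every image the same influence.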

In [None]:
def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    
    # Measuring the time it takes to train
    since = time.time()
    
    # Starting W&B logging
    wandb.watch(model, criterion, log="all", log_freq=10)
    
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)
                    
                    
                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            #send data to W&B
            train_log(epoch_loss, epoch_acc,epoch,phase)
            
            print(f'{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
    print(f'Best val Acc: {best_acc:.4f}')

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model


In [None]:
# This function simply sends logging data to W&B classified by phase.

def train_log(loss, acc, epoch, phase):
    loss = float(loss)
    acc = float(acc)
    
    if phase == 'train':
        wandb.log({"Epoch":epoch, "Training loss":loss, "Training accuracy": acc})
    else:
        wandb.log({"Epoch":epoch, "Validation loss":loss, "Validation accuracy": acc})

## Visualizing the results
The visualize_model function runs the model on images from the validation set and shows its predictions for a few of them (six by default).
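
The prediction for each image is the class whose output score is the largest, which is what `torch.max(outputs, 1)` returns as its second value. Here is a plain-Python sketch of that step. The class names and the scores are made up for illustration, and `predict_labels` is a hypothetical helper.

```python
def predict_labels(batch_outputs, class_names):
    """For each row of scores, return the name of the highest-scoring class."""
    predictions = []
    for scores in batch_outputs:
        best_index = max(range(len(scores)), key=lambda i: scores[i])
        predictions.append(class_names[best_index])
    return predictions


example_class_names = ["with_mask", "without_mask"]  # hypothetical names
example_outputs = [[2.1, -0.3], [-1.0, 0.4], [0.2, 0.9]]  # made-up scores

example_predictions = predict_labels(example_outputs, example_class_names)
print(example_predictions)  # ['with_mask', 'without_mask', 'without_mask']
```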

In [None]:
def visualize_model(model, num_images=6):
    was_training = model.training
    
    # Setting the model in eval mode in order to compute predictions
    model.eval()
    images_so_far = 0
    fig = plt.figure()
    
    with torch.no_grad():
        for i, (inputs, labels) in enumerate(dataloaders['val']):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)

            for j in range(inputs.size()[0]):
                images_so_far += 1
                ax = plt.subplot(num_images//2, 2, images_so_far)
                ax.axis('off')
                ax.set_title(f'predicted: {class_names[preds[j]]}')
                imshow(inputs.cpu().data[j])

                if images_so_far == num_images:
                    model.train(mode=was_training)
                    return
        model.train(mode=was_training)

## Importing the model and running the training
Here we execute the functions previously defined. First we import the ResNet18 model (which has been tested both pre-trained and not pre-trained). We then need to choose a criterion and an optimizer.
The criterion is the function that we use to measure the loss of our model. In this case we use CrossEntropyLoss. Here is a brief definition:
> Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label.
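
To see that behaviour in numbers, here is a plain-Python sketch of the cross-entropy of a single sample, computed from raw scores (logits) as `-log(softmax(logits)[target])`. `nn.CrossEntropyLoss` also takes raw scores and applies the softmax internally. The function `cross_entropy_sketch` and the example scores are assumptions for illustration.

```python
import math


def cross_entropy_sketch(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target]


confident_right = cross_entropy_sketch([4.0, 0.0], target=0)
unsure = cross_entropy_sketch([0.0, 0.0], target=0)
confident_wrong = cross_entropy_sketch([4.0, 0.0], target=1)

print(f"{confident_right:.4f} {unsure:.4f} {confident_wrong:.4f}")  # 0.0181 0.6931 4.0181
```

The loss is close to zero when the model is confident and right, equals log(2) when it cannot decide between two classes, and grows quickly when it is confident and wrong.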

The optimizer defines the rule that we use to update the weights in order to reduce the loss. Here we use SGD (Stochastic Gradient Descent).

The last thing we define is the learning rate. As you will notice, I did not use a fixed learning rate. Instead I followed a very popular technique, which is defining a learning rate scheduler. In practice, it starts with a given learning rate and multiplies it by a factor *gamma* every *step_size* epochs, so that the updates become finer and finer as the epochs go by.
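
As a sketch of the resulting schedule, the step decay can be written as `lr = base_lr * gamma ** (epoch // step_size)`. The plain-Python function below evaluates it with the values used in this notebook (initial rate 0.001, step size 7, gamma 0.1, 25 epochs). `step_lr_sketch` is a hypothetical helper, not the scheduler's own code.

```python
def step_lr_sketch(base_lr, epoch, step_size=7, gamma=0.1):
    """Learning rate at a given epoch under step decay."""
    return base_lr * gamma ** (epoch // step_size)


schedule = [step_lr_sketch(0.001, epoch) for epoch in range(25)]

# Print the rate at the first epoch of each plateau.
for epoch in (0, 7, 14, 21):
    print(epoch, f"{schedule[epoch]:.0e}")
```

Over 25 epochs the rate therefore drops three times, at epochs 7, 14 and 21, each time by a factor of ten.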



In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features

# fc is the final fully connected layer; we replace it so that its output size equals the number of classes
class_names = dataset.classes
model_ft.fc = nn.Linear(num_ftrs, len(class_names))

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

In [None]:
# Start training
model_ft = train_model(model_ft, criterion, optimizer_ft, exp_lr_scheduler,num_epochs=25)

In [None]:
# See a sample of results
visualize_model(model_ft)

In [None]:
# Save the model
torch.save(model_ft.state_dict(), 'mask_recognition_model.pth')