<a href="https://www.kaggle.com/code/edaaydinea/facial-expression-recognition-with-pytorch?scriptVersionId=123511190" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Facial Expression Recognition with PyTorch

*Author: Eda AYDIN*

# Business Understanding

Facial expression recognition (FER) is used in several industries, including healthcare, entertainment, and security. The goal of this project is to build an accurate and efficient FER model with PyTorch by fine-tuning a pretrained EfficientNet-B0 on labeled face images. The model is trained to recognize seven expressions: angry, disgust, fear, happy, neutral, sad, and surprise. The business objective is a tool that can help improve customer experience, support security monitoring, and make operations more efficient. For instance, such a model could help flag possible signs of depression or anxiety in patients, adapt a game to the player's reactions, or assist in monitoring public spaces.

# Data Understanding

The Face Expression Recognition dataset available on Kaggle contains 28,709 images of human faces, each labeled with one of seven facial expressions: angry, disgust, fear, happy, sad, surprise, and neutral.

The dataset is split into two subsets: a training set of 24,706 images and a held-out set of 4,003 images, which this notebook uses for validation. The images are grayscale and 48×48 pixels in size. The original FER2013 release stores each image as a row of pixel values in a CSV file together with its emotion label and usage split, whereas the Kaggle copy read by this notebook provides image files under `images/train` and `images/validation`, with one subfolder per expression.

The classes are not evenly represented. In FER2013, the disgust class has far fewer images than the others and happy has the most, so per-class counts should be checked before interpreting accuracy. The images were extracted from the FER2013 dataset and preprocessed to keep only faces with a frontal pose and appropriate brightness, which discards some information.

Additionally, the dataset contains some images with low resolution or artifacts, which may affect the performance of the model. 

Overall, the Face Expression Recognition dataset provides a diverse and labeled set of images for training and testing facial expression recognition models.
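
Because the per-class image counts affect how accuracy should be read, it can help to count them before training. The sketch below is a minimal illustration, assuming the labels are available as a list of integer class indices (as in the `targets` attribute of a torchvision `ImageFolder`). The helper name `class_distribution` and the toy labels are hypothetical.

```python
from collections import Counter

def class_distribution(targets, class_names):
    """Return {class_name: (count, share)} for a list of integer labels."""
    counts = Counter(targets)
    total = len(targets)
    return {
        name: (counts.get(idx, 0), counts.get(idx, 0) / total)
        for idx, name in enumerate(class_names)
    }

# Toy labels standing in for a real dataset's targets (made-up values).
toy_names = ["angry", "disgust", "happy"]
toy_targets = [0, 0, 2, 2, 2, 1, 2, 0]

for name, (count, share) in class_distribution(toy_targets, toy_names).items():
    print(f"{name:8s} {count:3d} {share:.1%}")
```

Once the dataset has been loaded with `ImageFolder`, the same helper could be called as `class_distribution(trainset.targets, trainset.classes)`.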

# Install libraries, packages and dataset

In [None]:
!pip install -U git+https://github.com/albumentations-team/albumentations
!pip install timm
!pip install --upgrade opencv-contrib-python

# Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import torch

# Configurations

In [None]:
TRAIN_IMG_FOLDER_PATH = "/kaggle/input/face-expression-recognition-dataset/images/train"
VALID_IMG_FOLDER_PATH = "/kaggle/input/face-expression-recognition-dataset/images/validation"

LR = 0.001
BATCH_SIZE = 32
EPOCHS = 15

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'  # fall back to CPU when no GPU is available
MODEL_NAME = 'efficientnet_b0'

# Load Dataset

In [None]:
from torchvision.datasets import ImageFolder
from torchvision import transforms as T
import random

In [None]:
train_augs = T.Compose([
    T.RandomHorizontalFlip(p = 0.5),
    T.RandomRotation(degrees=(-20, 20)),
    T.ToTensor()  # PIL image or ndarray (h, w, c) in [0, 255] --> float tensor (c, h, w) in [0, 1]
])

valid_augs = T.Compose([
    T.ToTensor()
])
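
To make the layout change concrete, the following sketch reproduces with NumPy what `T.ToTensor()` does to an 8-bit image: it moves the channel axis first and rescales pixel values from [0, 255] to [0, 1]. The function name `to_chw_float` and the random test image are illustrative only, and the real transform returns a `torch.Tensor` rather than an array.

```python
import numpy as np

def to_chw_float(image_hwc):
    """Mimic ToTensor for a uint8 (h, w, c) array: return float32 (c, h, w) in [0, 1]."""
    chw = np.transpose(image_hwc, (2, 0, 1))   # (h, w, c) -> (c, h, w)
    return chw.astype(np.float32) / 255.0      # rescale 0..255 to 0..1

# Random stand-in for a 48x48 image with 3 channels
# (ImageFolder's default loader converts images to RGB).
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(48, 48, 3), dtype=np.uint8)

converted = to_chw_float(fake_image)
print(converted.shape, converted.dtype)
print(converted.min() >= 0.0, converted.max() <= 1.0)
```

This is also why the plotting cells call `image.permute(1, 2, 0)`: `plt.imshow` expects the channel axis last.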

In [None]:
trainset = ImageFolder(TRAIN_IMG_FOLDER_PATH, transform = train_augs)
validset = ImageFolder(VALID_IMG_FOLDER_PATH, transform = valid_augs)

In [None]:
print(f"Total no. of examples in trainset : {len(trainset)}")
print(f"Total no. of examples in validset : {len(validset)}")

In [None]:
print(trainset.class_to_idx)
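
`ImageFolder` derives this mapping from the subfolder names, sorted alphabetically. A minimal sketch of that rule, using the seven expression names from the dataset description (the helper `build_class_to_idx` is hypothetical and not part of torchvision):

```python
def build_class_to_idx(folder_names):
    """Assign indices to class folders in sorted order, as ImageFolder does."""
    classes = sorted(folder_names)
    return {name: idx for idx, name in enumerate(classes)}

# Folder names in arbitrary order; sorting makes the mapping deterministic.
folders = ["happy", "sad", "angry", "surprise", "neutral", "fear", "disgust"]
class_to_idx = build_class_to_idx(folders)
print(class_to_idx)

# Invert the mapping to turn a predicted index back into a label.
idx_to_class = {idx: name for name, idx in class_to_idx.items()}
print(idx_to_class[3])
```

Any hard-coded list of class names used later for plotting has to follow this same order.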

In [None]:
class_names = trainset.classes

index = random.randint(0, len(trainset)-1)
image, label = trainset[index]

plt.imshow(image.permute(1, 2, 0)) # (h, w, c)
plt.title(class_names[label])

In [None]:
class_names = validset.classes

index = random.randint(0, len(validset)-1)
image, label = validset[index]

plt.imshow(image.permute(1, 2, 0)) # (h, w, c)
plt.title(class_names[label])

# Load Dataset into Batches

In [None]:
from torch.utils.data import DataLoader

In [None]:
trainloader = DataLoader(trainset, batch_size = BATCH_SIZE, shuffle = True)
validloader = DataLoader(validset, batch_size = BATCH_SIZE)

In [None]:
print(f"Total no. of batches in trainloader : {len(trainloader)}")
print(f"Total no. of batches in validloader : {len(validloader)}")
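
The number of batches follows from the dataset size and `BATCH_SIZE`: with the default `drop_last=False`, the length of a `DataLoader` is the dataset size divided by the batch size, rounded up. A small sketch of that arithmetic, using the subset sizes quoted in the data description as example values (the counts printed by the notebook are the authoritative ones):

```python
import math

def num_batches(num_examples, batch_size, drop_last=False):
    """Number of batches per epoch for a dataset of num_examples items."""
    if drop_last:
        return num_examples // batch_size       # incomplete final batch is discarded
    return math.ceil(num_examples / batch_size)  # incomplete final batch is kept

BATCH_SIZE = 32
for name, size in [("train", 24706), ("valid", 4003)]:  # example sizes from the description
    last = size % BATCH_SIZE or BATCH_SIZE               # size of the final batch
    print(f"{name}: {num_batches(size, BATCH_SIZE)} batches, last batch has {last} examples")
```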

In [None]:
images, labels = next(iter(trainloader))  # fetch a single batch

print(f"One image batch shape : {images.shape}")
print(f"One label batch shape : {labels.shape}")

# Create Model 

In [None]:
import timm
from torch import nn

In [None]:
class FaceModel(nn.Module):
    
    def __init__(self):
        super(FaceModel, self).__init__()
        
        # Use the backbone named in the configuration cell instead of a hard-coded string
        self.eff_net = timm.create_model(MODEL_NAME,
                                        pretrained = True,
                                        num_classes = 7)
        
    def forward(self, images, labels = None):
        logits = self.eff_net(images)
        
        if labels is not None:
            loss = nn.CrossEntropyLoss()(logits, labels)
            return logits, loss
        
        return logits
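
`nn.CrossEntropyLoss` expects raw logits and integer class labels. It applies log-softmax internally and averages the negative log-probability of the correct class. The sketch below checks this on made-up logits; the tensor values are arbitrary and only serve the illustration.

```python
import torch
from torch import nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])   # 2 examples, 3 classes (arbitrary values)
labels = torch.tensor([0, 2])               # correct class index per example

# Built-in loss, called the same way as in the model's forward pass.
builtin = nn.CrossEntropyLoss()(logits, labels)

# Manual version: log-softmax, pick the log-probability of the true class, average.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(builtin.item(), manual.item())
```

This is why the model returns logits and a softmax is applied only at inference time.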

In [None]:
model = FaceModel()
model.to(DEVICE)

# Create Train and Eval Function

In [None]:
from tqdm import tqdm

In [None]:
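def multiclass_accuracy(y_pred, y_true):
    # Index of the highest logit for each example, shape (batch, 1)
    top_p, top_class = y_pred.topk(1, dim = 1)
    # Compare predictions with labels (==, not =), then average the matches
    equals = top_class == y_true.view(*top_class.shape)
    return torch.mean(equals.float())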
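
Top-1 accuracy is the fraction of examples whose highest-scoring class equals the label. A worked example on a hand-made batch, written as a standalone sketch (the function name `top1_accuracy` and the tensor values are illustrative):

```python
import torch

def top1_accuracy(logits, labels):
    """Fraction of rows whose arg-max class matches the label."""
    _, top_class = logits.topk(1, dim=1)        # shape (batch, 1)
    matches = top_class == labels.view(-1, 1)   # element-wise comparison
    return matches.float().mean().item()

logits = torch.tensor([[0.1, 0.9, 0.0],    # predicts class 1
                       [0.8, 0.1, 0.1],    # predicts class 0
                       [0.2, 0.3, 0.5],    # predicts class 2
                       [0.6, 0.3, 0.1]])   # predicts class 0
labels = torch.tensor([1, 0, 2, 1])        # the last example is misclassified

print(top1_accuracy(logits, labels))       # 3 of 4 correct
```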

In [None]:
def train_fn(model, dataloader, optimizer, current_epo):
    model.train()
    total_loss = 0.0
    total_acc = 0.0
    tk = tqdm(dataloader, desc = "EPOCH" + "[TRAIN]" + str(current_epo + 1) + "/" + str(EPOCHS))
    
    for t, data in enumerate(tk):
        images, labels = data
        images, labels = images.to(DEVICE), labels.to(DEVICE)
        
        optimizer.zero_grad()
        logits, loss = model(images, labels)
        loss.backward()
        optimizer.step()
        
        total_loss += loss.item()
        total_acc += multiclass_accuracy(logits, labels)
        tk.set_postfix({'loss' : '%6f' %float(total_loss / (t+1)), 'acc' : '%6f' %float(total_acc / (t+1)),})
        
    return total_loss / len(dataloader), total_acc / len(dataloader)
        

In [None]:
@torch.no_grad()  # gradients are not needed during evaluation
def eval_fn(model, dataloader, current_epo):
    model.eval()
    total_loss = 0.0
    total_acc = 0.0
    tk = tqdm(dataloader, desc = "EPOCH" + "[VALID]" + str(current_epo + 1) + "/" + str(EPOCHS))
    
    for t, data in enumerate(tk):
        images, labels = data
        images, labels = images.to(DEVICE), labels.to(DEVICE)
        
        logits, loss = model(images, labels)
        
        total_loss += loss.item()
        total_acc += multiclass_accuracy(logits, labels)
        tk.set_postfix({'loss' : '%6f' %float(total_loss / (t+1)), 'acc' : '%6f' %float(total_acc / (t+1)),})
        
    return total_loss / len(dataloader), total_acc / len(dataloader)
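
Both functions average the per-batch metrics over `len(dataloader)`. When the last batch is smaller than the others, this differs slightly from the accuracy over all examples, because every batch gets equal weight. The sketch below illustrates the gap with invented batch sizes and accuracies.

```python
def mean_of_batch_means(batch_accs):
    """Average of per-batch accuracies (what dividing by len(dataloader) computes)."""
    return sum(batch_accs) / len(batch_accs)

def sample_weighted_mean(batch_accs, batch_sizes):
    """Accuracy over all examples: weight each batch by its size."""
    correct = sum(acc * n for acc, n in zip(batch_accs, batch_sizes))
    return correct / sum(batch_sizes)

# Invented numbers: two full batches of 32 and a final batch of 2.
batch_sizes = [32, 32, 2]
batch_accs = [0.75, 0.75, 0.0]

print(mean_of_batch_means(batch_accs))                 # equal weight per batch
print(sample_weighted_mean(batch_accs, batch_sizes))   # 48 correct out of 66
```

With many batches and a single short one the difference is small, so the simpler per-batch average is usually adequate for monitoring.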
        

# Create Training Loop

In [None]:
optimizer = torch.optim.Adam(params = model.parameters(),
                             lr = LR)

In [None]:
best_valid_loss = np.inf  # np.Inf was removed in NumPy 2.0

for i in range(EPOCHS):
    train_loss, train_acc = train_fn(model, trainloader, optimizer, i)
    valid_loss, valid_acc = eval_fn(model, validloader, i)
    
    if valid_loss < best_valid_loss:
        torch.save(model.state_dict(), 'best-weights.pt')
        print("SAVED-BEST-WEIGHTS")
        best_valid_loss = valid_loss
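
The loop keeps only the weights from the epoch with the lowest validation loss seen so far. The selection rule can be traced without a model; the loss values below are invented for illustration, and `epochs_that_save` is a hypothetical helper.

```python
def epochs_that_save(valid_losses):
    """Return the 0-based epochs at which a strictly lower validation loss is seen."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(valid_losses):
        if loss < best:          # same comparison as in the training loop
            best = loss
            saved.append(epoch)  # the loop would call torch.save(...) here
    return saved

invented_losses = [1.60, 1.32, 1.35, 1.20, 1.28, 1.20]
print(epochs_that_save(invented_losses))   # [0, 1, 3]
```

Because the comparison is strict, a later epoch that only ties the best loss does not overwrite the checkpoint.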

# Inference

In [None]:
def view_classify(img, ps):
  
    classes = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']
    
    ps = ps.detach().cpu().numpy().squeeze()  # detach from the graph before converting to NumPy
    img = img.numpy().transpose(1,2,0)
    
    fig, (ax1, ax2) = plt.subplots(figsize = (5,9), ncols = 2)
    ax1.imshow(img)
    ax1.axis('off')
    ax2.barh(classes, ps)
    ax2.set_aspect(0.1)
    ax2.set_yticks(classes)
    ax2.set_yticklabels(classes)
    ax2.set_title('Class Probability')
    ax2.set_xlim(0, 1.1)
    
    plt.tight_layout()
    
    return None

In [None]:
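image, label = validset[97]
image = image.unsqueeze(0)  # add a batch dimension: (c, h, w) -> (1, c, h, w)

# Restore the checkpoint written by the training loop and switch to inference mode
model.load_state_dict(torch.load('best-weights.pt', map_location = DEVICE))
model.eval()

with torch.no_grad():
    logits = model(image.to(DEVICE))
    probs = nn.Softmax(dim=1)(logits)

view_classify(image.squeeze(), probs)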
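
To read the prediction as text rather than from the bar chart, the probabilities can be mapped back to class names. The sketch below uses made-up logits for a single image. In the notebook, the model output would take their place, and the class order is assumed to match `trainset.classes`.

```python
import torch

classes = ['angry', 'disgust', 'fear', 'happy', 'neutral', 'sad', 'surprise']

# Made-up logits for one image, shape (1, 7); a real run would use the model output.
logits = torch.tensor([[0.2, -1.0, 0.1, 2.5, 1.0, 0.3, -0.5]])
probs = torch.softmax(logits, dim=1)    # each row sums to 1

top_p, top_idx = probs.topk(3, dim=1)   # three most likely classes
ranked = [(classes[i], round(p, 3))
          for p, i in zip(top_p[0].tolist(), top_idx[0].tolist())]
print(ranked)
```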