# DL project - classifying emotions from facial images
*Group 17 - Dorothy Modrall Sperling, Manuel Schönberger, Lea Roncero*

## 0. Dataset description

For our deep learning project, we use the FER+ dataset, which was proposed in the assignment specification and contains ~28,700 images of facial expressions.

### Features of the Dataset

**Emotion**
- Represents the class label for the facial expression depicted in the image.
- It is an integer value ranging from `0` to `6`, corresponding to seven different emotions:
  - `0`: Angry  
  - `1`: Disgust  
  - `2`: Fear  
  - `3`: Happy  
  - `4`: Sad  
  - `5`: Surprise  
  - `6`: Neutral  

**Pixels**
- A string of pixel values representing a `48x48` grayscale image.
- Contains `2304` space-separated values (`48 x 48 = 2304`), each an intensity level in the range `0-255`.
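
To make the pixel format concrete, here is a minimal sketch that parses such a string into a `48x48` array. The sample string is synthetic (a hypothetical placeholder, not a row from the dataset), and the values are assumed to be separated by whitespace, as in the `train.csv` preview printed later in this notebook.

```python
import numpy as np

# Synthetic stand-in for one entry of the "pixels" column (hypothetical data):
# 2304 whitespace-separated intensities in the range 0-255.
pixel_string = " ".join(str(i % 256) for i in range(48 * 48))

# Parse the string into a flat float array, then reshape it to 48x48.
pixel_vals = np.array([float(val) for val in pixel_string.split()], dtype=np.float32)
image = pixel_vals.reshape(48, 48)

print(image.shape)  # (48, 48)
```

Row `r`, column `c` of the resulting array corresponds to position `48 * r + c` in the original string.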

In [37]:
# Import Libraries

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
import numpy as np

## 0. Loading and preprocessing data

First, we load the data into a pandas dataframe.

In [38]:
import pandas as pd

train_unprocessed = pd.read_csv("data/train.csv")
print(train_unprocessed.head())

   emotion                                             pixels
0        0  70 80 82 72 58 58 60 63 54 58 60 48 89 115 121...
1        0  151 150 147 155 148 133 111 140 170 174 182 15...
2        2  231 212 156 164 174 138 161 173 182 200 106 38...
3        4  24 32 36 30 32 23 19 20 30 41 21 22 32 34 21 1...
4        6  4 0 0 0 0 0 0 0 0 0 0 0 3 15 23 28 48 50 58 84...


The cell below shows that there are no missing values. It is therefore not necessary to impute anything.

In [39]:
train_unprocessed.isnull().values.any()

False

However, we apply the following preprocessing steps to bring the data into a format suitable for training:
1. **Reshaping**:
   - Convert the pixel string into a `48x48` matrix for visualization and processing.
2. **Scaling**:
   - Scale pixel values to the range `[0, 1]` by dividing the pixel values by 255.
3. **Normalization**:
   - Normalize pixel values to the range `[-1, 1]` by subtracting the mean and dividing by the standard deviation (both fixed at `0.5`).
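
As a quick check of steps 2 and 3, the sketch below applies the arithmetic to a few sample intensities. It assumes the fixed values `mean = 0.5` and `std = 0.5` that the pipeline in this notebook passes to `transforms.Normalize`. NumPy is used here only to illustrate the formula.

```python
import numpy as np

raw = np.array([0.0, 127.5, 255.0], dtype=np.float32)  # sample intensities

scaled = raw / 255.0               # step 2: [0, 255] -> [0, 1]
normalized = (scaled - 0.5) / 0.5  # step 3: [0, 1] -> [-1, 1]

print(scaled.tolist())      # [0.0, 0.5, 1.0]
print(normalized.tolist())  # [-1.0, 0.0, 1.0]
```

Note that this maps the value range to `[-1, 1]` but does not standardize the data to zero mean and unit variance, which would require the dataset's own statistics.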

In [40]:
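class ReshapeAndNormalize:
    """Parse a pixel string into a (1, n_rows, n_cols) tensor scaled to [0, 1].

    Despite the name, the normalization to [-1, 1] is done by the
    transforms.Normalize step of the pipeline below.
    """
    def __init__(self, n_rows, n_cols):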
        self.n_rows = n_rows
        self.n_cols = n_cols

    def __call__(self, pixel_string):
        # Reshape the flat pixel string into a 2D array
        pixel_vals = np.array([float(val) for val in pixel_string.split()], dtype=np.float32)
        image = pixel_vals.reshape(self.n_rows, self.n_cols)
        
        # Scale the image to [0, 1]
        image = image / 255.0
        
        # Convert to a PyTorch tensor
        return torch.tensor(image, dtype=torch.float32).unsqueeze(0) # add a channel dimension

# Define the pipeline
transform = transforms.Compose([
    ReshapeAndNormalize(n_rows=48, n_cols=48),
    transforms.Normalize(mean=[0.5], std=[0.5])  # Normalize to [-1, 1]
])

## 1. Define a custom dataset and data loaders

In [41]:
from torch.utils.data import Dataset

class Fer2013Dataset(Dataset):
    def __init__(self, data_frame, labels, transform=None):
        self.data_frame = data_frame
        self.labels = labels
        self.transform = transform

    def __getitem__(self, idx):
        pixel_string = self.data_frame.iloc[idx]["pixels"]
        image = self.transform(pixel_string) if self.transform else pixel_string
        
        label = torch.tensor(self.labels.iloc[idx], dtype=torch.long)
        return image, label

    def __len__(self):
        return len(self.data_frame)

In [42]:
from torch.utils.data import random_split
from torch.utils.data import DataLoader

batch_size = 32

train_labels = train_unprocessed["emotion"]
full_dataset = Fer2013Dataset(train_unprocessed, train_labels, transform=transform)

train_dataset, val_dataset = random_split(full_dataset, [0.8, 0.2])

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

for batch_images, batch_labels in train_loader:
    print("Train Batch Shape:", batch_images.shape, "Train Labels Shape:", batch_labels.shape)

for batch_images, batch_labels in val_loader:
    print("Validation Batch Shape:", batch_images.shape, "Validation Labels Shape:", batch_labels.shape)


Train Batch Shape: torch.Size([32, 1, 48, 48]) Train Labels Shape: torch.Size([32])
...

## 2. Define the CNN Model

In [43]:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # A 3x3 kernel with padding=1 preserves the spatial size
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)  # halves height and width
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        # Two poolings reduce 48x48 -> 24x24 -> 12x12, with 64 channels after conv2
        self.fc1 = nn.Linear(in_features=64 * 12 * 12, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=7)  # 7 emotion classes

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # (N, 32, 24, 24)
        x = self.pool(torch.relu(self.conv2(x)))  # (N, 64, 12, 12)
        x = x.view(-1, 64 * 12 * 12)  # flatten to (N, 9216)
        x = torch.relu(self.fc1(x))  # (N, 128)
        x = self.fc2(x)  # raw logits, (N, 7)
        return x


model = CNN()
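
The `64 * 12 * 12` input size of `fc1` follows from the layer settings above. The sketch below recomputes it with the standard output-size formula for convolution and pooling layers, and also tallies the trainable parameters. The helper `conv_out` is a hypothetical name introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a conv/pool layer: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

side = 48                                   # input image is 48x48
side = conv_out(side, kernel=3, padding=1)  # conv1 keeps 48
side = conv_out(side, kernel=2, stride=2)   # pool halves to 24
side = conv_out(side, kernel=3, padding=1)  # conv2 keeps 24
side = conv_out(side, kernel=2, stride=2)   # pool halves to 12

flat_features = 64 * side * side
print(side, flat_features)  # 12 9216

# Parameter counts per layer: weights plus biases.
params = {
    "conv1": 32 * (1 * 3 * 3) + 32,
    "conv2": 64 * (32 * 3 * 3) + 64,
    "fc1": flat_features * 128 + 128,
    "fc2": 128 * 7 + 7,
}
print(sum(params.values()))  # 1199495
```

Almost all of the roughly 1.2 million parameters sit in `fc1`, so changing the input resolution or the number of pooling steps has the largest effect on model size.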

## 4. Define Loss Function and Optimizer

In [44]:
criterion = nn.CrossEntropyLoss()  # expects raw logits and integer class labels
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam with learning rate 1e-3
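
To help interpret the loss values, the following sketch recomputes the per-sample cross-entropy (log-softmax followed by negative log-likelihood) in plain Python. The function `cross_entropy` and the example logits are hypothetical and serve only as an illustration; `nn.CrossEntropyLoss` additionally averages over the batch by default.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    shift = max(logits)  # subtract the max for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[target]

# A model that assigns equal scores to all 7 classes has loss ln(7).
uniform_loss = cross_entropy([0.0] * 7, target=3)
print(round(uniform_loss, 4))  # 1.9459

# A confident, correct prediction yields a much smaller loss.
confident_loss = cross_entropy([0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0], target=3)
print(confident_loss < uniform_loss)  # True
```

The value `ln(7) ≈ 1.946` is a useful reference point: a training loss near it means the model is doing no better than guessing uniformly among the seven emotions.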

## 5. Train the Model

In [45]:
num_epochs = 10  # define number of epochs
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(train_loader):
        # Zero the parameter gradients
        optimizer.zero_grad()
      
        # Forward pass
        outputs = model(inputs)
   
        loss = criterion(outputs, labels)

        # Backward pass
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if (i + 1) % 100 == 0:  # Print every 100 batches
            print(
                f"Epoch [{epoch + 1}/{num_epochs}], Step [{i + 1}/{len(train_loader)}], Loss: {running_loss / 100:.4f}")
            running_loss = 0.0

Epoch [1/10], Step [100/718], Loss: 1.7526
Epoch [1/10], Step [200/718], Loss: 1.6265
Epoch [1/10], Step [300/718], Loss: 1.5832
Epoch [1/10], Step [400/718], Loss: 1.5268
Epoch [1/10], Step [500/718], Loss: 1.4879
Epoch [1/10], Step [600/718], Loss: 1.4699
Epoch [1/10], Step [700/718], Loss: 1.4360
Epoch [2/10], Step [100/718], Loss: 1.3679
Epoch [2/10], Step [200/718], Loss: 1.3188
Epoch [2/10], Step [300/718], Loss: 1.3485
Epoch [2/10], Step [400/718], Loss: 1.3212
Epoch [2/10], Step [500/718], Loss: 1.2832
Epoch [2/10], Step [600/718], Loss: 1.2904
Epoch [2/10], Step [700/718], Loss: 1.2850
Epoch [3/10], Step [100/718], Loss: 1.1646
Epoch [3/10], Step [200/718], Loss: 1.1843
Epoch [3/10], Step [300/718], Loss: 1.1957
Epoch [3/10], Step [400/718], Loss: 1.1901
Epoch [3/10], Step [500/718], Loss: 1.1656
Epoch [3/10], Step [600/718], Loss: 1.1856
Epoch [3/10], Step [700/718], Loss: 1.1732
Epoch [4/10], Step [100/718], Loss: 1.0422
Epoch [4/10], Step [200/718], Loss: 1.0316
Epoch [4/10
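
The `718` steps per epoch in the log follow from the 80/20 split and the batch size of 32. The sketch below reproduces that number and shows which steps trigger a log line. The row count of 28,700 is an assumption taken from the rounded figure in the dataset description, not the exact size of `train.csv`.

```python
import math

n_rows = 28_700  # assumed: rounded size from the dataset description
batch_size = 32

n_train = n_rows * 8 // 10  # 80% training split
steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)  # 718

# Steps at which the training loop prints the running average.
log_steps = [s for s in range(1, steps_per_epoch + 1) if s % 100 == 0]
print(log_steps)  # [100, 200, 300, 400, 500, 600, 700]

# Batches after the last log line in each epoch.
print(steps_per_epoch - log_steps[-1])  # 18
```

Because `running_loss` is reset at the start of every epoch, the last 18 batches of each epoch never appear in a printed average.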

## 6. Evaluate the Model

In [46]:
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in val_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

# Note: val_loader holds the 20% validation split, so this is the validation accuracy
print(f"Test Accuracy: {correct / total:.4f}")

Test Accuracy: 0.5327
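
A single accuracy figure does not show how the seven classes fare individually. As a possible next step, the sketch below builds a confusion matrix and per-class recall with NumPy. The arrays `y_true` and `y_pred` are hypothetical toy values standing in for the `labels` and `predicted` tensors collected in the evaluation loop; they are not results from this model.

```python
import numpy as np

num_classes = 7

# Hypothetical labels and predictions (toy data, not model output).
y_true = np.array([0, 3, 3, 4, 6, 6, 3, 5])
y_pred = np.array([0, 3, 4, 4, 6, 3, 3, 5])

# confusion[i, j] counts samples with true class i predicted as class j.
confusion = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(confusion, (y_true, y_pred), 1)

accuracy = np.trace(confusion) / confusion.sum()
print(accuracy)  # 0.75

# Per-class recall; classes without any samples are reported as NaN.
support = confusion.sum(axis=1)
recall = np.full(num_classes, np.nan)
mask = support > 0
recall[mask] = np.diag(confusion)[mask] / support[mask]
print(recall)
```

In the notebook, the toy arrays could be replaced by the concatenated `labels` and `predicted` values from all validation batches, converted with `.numpy()`.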
