# Introduction to Machine Learning - Course Project Report

Group members:
   - Grzegorz Prasek
   - Jakub Kindracki
   - Mykhailo Shamrai
   - Mateusz Mikiciuk
   - Ernest Mołczan

In this report we describe our implementation of a CNN that classifies users as either allowed or not allowed into the system (binary classification).

## Table of contents:
1. Dataset
2. Exploratory Data Analysis
3. Preparing audio files for generating spectrograms
4. Generating spectrograms
5. Classifying spectrograms for train, test and validation datasets
6. Model
7. Training loop
8. [EXTRA] **interpretability** - visualizing the behavior and function of individual CNN layers and using it for data exploration
9. [EXTRA] **uncertainty** - using Monte Carlo dropout to estimate classification confidence and comparing dropout to an ensemble of CNNs
10. [EXTRA] **parameter space** - examining how much individual layers of the network change during training and investigating their re-initialization robustness


### 6. Model

In this chapter, we will describe the Convolutional Neural Network (CNN) model used for our project. The model is designed to classify spectrogram images into two classes. Below, we provide an overview of the model architecture and the code implementation.

#### Model Architecture

The CNN model consists of the following layers:
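1. **Convolutional Layers**: Three convolutional layers (32, 64 and 128 filters, 3x3 kernels), each followed by ReLU activation and 2x2 max pooling.
2. **Fully Connected Layer**: One hidden fully connected layer with 512 units, ReLU activation and dropout (p = 0.5) for regularization.
3. **Output Layer**: A final fully connected layer that outputs one logit per class (two for binary classification).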

The input to the model is a grayscale image with a size of 224x224 pixels.

#### Code Implementation

Here is the implementation of the `SpectrogramCNN` model in PyTorch:

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpectrogramCNN(nn.Module):
    def __init__(self, num_classes=2):
        super(SpectrogramCNN, self).__init__()

        # Input is grayscale (1 channel), so input channels = 1
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)

        # Max Pooling
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

        # Fully connected layers
        self.fc1 = nn.Linear(128 * 28 * 28, 512)  # Based on 224x224 input size after 3 pooling layers
        self.fc2 = nn.Linear(512, num_classes)  # Output layer (for binary classification, num_classes=2)

        # Dropout (optional, helps prevent overfitting)
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        # Convolutional layers with ReLU activation and Max Pooling
        x = self.pool(F.relu(self.conv1(x)))  # Output: (32, 112, 112)
        x = self.pool(F.relu(self.conv2(x)))  # Output: (64, 56, 56)
        x = self.pool(F.relu(self.conv3(x)))  # Output: (128, 28, 28)

        # Flatten everything except the batch dimension for the fully connected layers
        x = torch.flatten(x, 1)  # Shape: (batch_size, 128 * 28 * 28)

        # Fully connected layers with dropout
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)  # Output: (batch_size, num_classes)

        return x

This model is designed to process grayscale spectrogram images and classify them into one of two classes. The use of convolutional layers helps in extracting spatial features from the images, while the fully connected layers perform the final classification. Dropout is used to prevent overfitting during training.
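
As a quick sanity check on the `128 * 28 * 28` input size of `fc1`, the sketch below applies the standard output-size formula for convolution and pooling layers to a 224x224 input. The helper name `conv_out_size` is ours, introduced only for illustration; the kernel, stride and padding values are the ones used in `SpectrogramCNN`.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor division, as in PyTorch)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 224  # input height and width
for _ in range(3):
    size = conv_out_size(size, kernel_size=3, stride=1, padding=1)  # 3x3 conv keeps the size
    size = conv_out_size(size, kernel_size=2, stride=2, padding=0)  # 2x2 max pool halves it

flat_features = 128 * size * size  # 128 channels after conv3
print(size, flat_features)
```

If the spectrograms were generated at a different resolution, the first argument of `fc1` would have to change accordingly.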

### 7. Training Loop

In this chapter, we will explain how the training and evaluation of our CNN model are performed. The training loop is responsible for optimizing the model's parameters, while the evaluation loop assesses the model's performance on the validation set.

#### Training Loop

The training loop involves the following steps:
1. **Model Initialization**: The model, loss function, and optimizer are initialized.
2. **Epoch Loop**: The training process runs for a specified number of epochs.
3. **Batch Loop**: For each epoch, the model processes the training data in batches.
4. **Forward Pass**: The model makes predictions on the input data.
5. **Loss Calculation**: The loss between the predictions and the true labels is computed.
6. **Backward Pass**: Gradients are calculated, and the model's parameters are updated.
7. **Validation**: After each epoch, the model is evaluated on the validation set, and the weights with the best validation accuracy so far are saved.
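
To make the loss calculation step concrete, the following sketch computes the cross-entropy loss for a single sample by hand. This is the per-sample quantity that `nn.CrossEntropyLoss` averages over a batch by default. The function name `cross_entropy` and the example logits are illustrative assumptions, not values from our experiments.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy for one sample: -log(softmax(logits)[label])."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]


# Hypothetical logits for the two classes
confident_correct = cross_entropy([2.0, 0.5], label=0)
confident_wrong = cross_entropy([2.0, 0.5], label=1)
undecided = cross_entropy([0.0, 0.0], label=0)
print(confident_correct, confident_wrong, undecided)
```

The loss is small when the logit of the true class dominates and grows when the model is confidently wrong, which is what drives the parameter updates in the backward pass.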

Here is the code implementation of the training loop:


In [None]:
import time
import torch.optim as optim
import torch.nn as nn
import torch
import numpy as np
from sklearn.metrics import f1_score


MODEL_PATH = "./model.pth"
LAST_MODEL_PATH = "./last_model.pth"


def train_model(model, train_loader, val_loader, num_epochs=25, learning_rate=0.001, model_path=MODEL_PATH):
    # Define the loss function and optimizer
    criterion = nn.CrossEntropyLoss()  # For classification tasks
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    # Pick the fastest available device: CUDA GPU, then Apple MPS, then CPU
    if torch.cuda.is_available():
        device = torch.device("cuda")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
    else:
        device = torch.device("cpu")
    print(f"Using device: {device}")

    model.to(device)

    # Training loop
    best_val_acc = 0.0
    for epoch in range(num_epochs):
        print(f"Starting epoch {epoch + 1}/{num_epochs} at {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())}")
        model.train()  # Set the model to training mode
        running_loss = 0.0
        correct = 0
        total = 0

        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)  # Move data to GPU if available

            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()  # Backpropagation
            optimizer.step()  # Update the weights

            # Track loss and accuracy
            running_loss += loss.item() * inputs.size(0)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

        epoch_loss = running_loss / len(train_loader.dataset)
        epoch_acc = correct / total

        # Validation phase
        val_loss, val_acc = evaluate_model(model, val_loader, criterion, device)

        print(f"Ending epoch {epoch + 1}/{num_epochs} at {time.strftime('%H:%M:%S', time.localtime())}, "
              f"Train Loss: {epoch_loss:.4f}, Train Acc: {epoch_acc:.4f}, "
              f"Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.4f}")

        # Save the best model based on validation accuracy
        if val_acc > best_val_acc:
            best_val_acc = val_acc
            torch.save(model.state_dict(), model_path)  # Save the best model

    torch.save(model.state_dict(), LAST_MODEL_PATH)  # Save the last model
    print("Training complete. Best validation accuracy: {:.4f}".format(best_val_acc))
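
Because `train_model` stores only the `state_dict`, using a saved checkpoint later requires re-creating the model and loading the weights into it. Below is a minimal sketch of that round trip. It uses a small stand-in `nn.Linear` model and a temporary file instead of `SpectrogramCNN` and `MODEL_PATH`; both are assumptions made to keep the example self-contained.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for SpectrogramCNN so the example runs on its own
trained = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pth")
    torch.save(trained.state_dict(), path)  # same call as in train_model

    restored = nn.Linear(4, 2)  # must have the same architecture
    # Load onto the CPU regardless of the device used for training
    state_dict = torch.load(path, map_location="cpu")
    restored.load_state_dict(state_dict)

restored.eval()  # switch to evaluation mode before inference
weights_match = torch.equal(trained.weight, restored.weight)
print(weights_match)
```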

#### Evaluation Loop

The evaluation loop involves the following steps:
1. **Model Evaluation Mode**: The model is set to evaluation mode, which disables dropout (batch normalization layers, if the model had any, would switch to their running statistics).
2. **Batch Loop**: The model processes the validation data in batches.
3. **Forward Pass**: The model makes predictions on the input data.
4. **Loss Calculation**: The loss between the predictions and the true labels is computed.
5. **Accuracy Calculation**: The accuracy of the model's predictions is calculated.
6. **F1 Score Calculation**: The weighted F1 score over all evaluated samples is computed and printed.
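
The effect of evaluation mode on dropout can be checked directly. The sketch below is a standalone illustration, and the tensor of ones is an arbitrary example input. In training mode `nn.Dropout(0.5)` zeroes some activations and scales the remaining ones by 2, while in evaluation mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # make the random mask reproducible
dropout = nn.Dropout(0.5)
x = torch.ones(1000)

dropout.train()
train_out = dropout(x)  # each element is either 0 or 1 / (1 - 0.5) = 2

dropout.eval()
eval_out = dropout(x)  # identity in evaluation mode

print(train_out.unique(), torch.equal(eval_out, x))
```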

Here is the code implementation of the evaluation loop:

In [None]:
def evaluate_model(model, data_loader, criterion, device):
    model.eval()  # Set the model to evaluation mode
    running_loss = 0.0
    correct = 0
    total = 0
    all_labels = []
    all_predictions = []

    with torch.no_grad():  # Disable gradient computation during evaluation
        for inputs, labels in data_loader:
            inputs, labels = inputs.to(device), labels.to(device)

            outputs = model(inputs)
            loss = criterion(outputs, labels)

            running_loss += loss.item() * inputs.size(0)
            _, predicted = torch.max(outputs, 1)

            total += labels.size(0)
            correct += (predicted == labels).sum().item()

            # Collect predictions and labels for F1 score calculation
            all_labels.extend(labels.cpu().numpy())
            all_predictions.extend(predicted.cpu().numpy())

    epoch_loss = running_loss / len(data_loader.dataset)
    epoch_acc = correct / total

    # Calculate F1 score
    f1 = f1_score(all_labels, all_predictions, average="weighted")  # Change "weighted" if you need macro or micro F1 score
    print(f"F1 Score: {f1:.4f}")

    return epoch_loss, epoch_acc

These functions together form the core of the training and evaluation process for our CNN model.
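
For reference, the weighted F1 score printed by `evaluate_model` is the per-class F1 averaged with weights proportional to each class's number of true samples. The sketch below reproduces that computation in plain Python on made-up labels; the function name `weighted_f1` and the example data are ours.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of true labels."""
    total = 0.0
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        support = tp + fn  # number of true samples of this class
        total += f1 * support
    return total / len(y_true)


# Made-up example: 4 samples of class 0, 2 samples of class 1
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = weighted_f1(y_true, y_pred)
print(f"{score:.4f}")
```

With imbalanced classes this weighting lets the majority class dominate the score, so the macro average may be worth reporting alongside it.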
