# Question A1 (15 marks)

#### Design a feedforward deep neural network (DNN) which consists of **three** hidden layers of 128 neurons each with ReLU activation function, and an output layer with sigmoid activation function. Apply dropout of probability **0.2** to each of the hidden layers.

In [1]:
import tqdm
import time
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import torch
from torch import nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

from scipy.io import wavfile as wav

from sklearn import preprocessing
from sklearn.model_selection import KFold
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix

from common_utils import set_seed

# setting seed
set_seed()

1. Define the model class.

In [2]:
class MLP(nn.Module):

    def __init__(self, no_features, no_hidden, no_labels):
        super().__init__()
        self.mlp_stack = nn.Sequential(
            nn.Linear(no_features, no_hidden),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(no_hidden, no_hidden),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(no_hidden, no_hidden),
            nn.ReLU(),
            nn.Dropout(p=0.2),
            nn.Linear(no_hidden, no_labels),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.mlp_stack(x)
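
A quick sanity check of the architecture might look like the sketch below. The class is rebuilt here only so the snippet runs on its own; the feature count of 10 and the random inputs are placeholders, not values taken from the dataset.

```python
import torch
from torch import nn


class MLP(nn.Module):
    # Same architecture as above, rebuilt so this snippet is self-contained.
    def __init__(self, no_features, no_hidden, no_labels):
        super().__init__()
        layers = []
        in_dim = no_features
        for _ in range(3):  # three hidden layers, each followed by ReLU and dropout
            layers += [nn.Linear(in_dim, no_hidden), nn.ReLU(), nn.Dropout(p=0.2)]
            in_dim = no_hidden
        layers += [nn.Linear(in_dim, no_labels), nn.Sigmoid()]
        self.mlp_stack = nn.Sequential(*layers)

    def forward(self, x):
        return self.mlp_stack(x)


torch.manual_seed(0)
check_model = MLP(no_features=10, no_hidden=128, no_labels=2)  # 10 is a placeholder
x = torch.randn(4, 10)

check_model.eval()  # dropout is disabled in eval mode, so repeated passes agree
with torch.no_grad():
    out_a = check_model(x)
    out_b = check_model(x)

n_params = sum(p.numel() for p in check_model.parameters())
print(out_a.shape, torch.equal(out_a, out_b), n_params)
```

Switching to `check_model.train()` re-enables dropout, so repeated forward passes on the same input will generally differ.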

#### Divide the dataset into an 80:20 ratio for training and testing. Use **appropriate** scaling of input features. Assume that there are only two datasets here: training and test.

2. Split the dataset and preprocess it.

In [3]:
from common_utils import split_dataset, preprocess_dataset


def preprocess(df, test_size=0.2, random_state=42):
    df_train, y_train, df_test, y_test = split_dataset(df, ["filename"], test_size, random_state)
    X_train_scaled, X_test_scaled = preprocess_dataset(df_train, df_test)
    return X_train_scaled, y_train, X_test_scaled, y_test

df = pd.read_csv('simplified.csv')
df['label'] = df['filename'].str.split('_').str[-2]

df['label'].value_counts()

X_train_scaled, y_train, X_test_scaled, y_test = preprocess(df)
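
`split_dataset` and `preprocess_dataset` come from `common_utils`, which is not shown here. The sketch below only illustrates the general idea they are assumed to follow: split first, then compute the scaling statistics on the training portion alone and reuse them on the test portion. The function name `split_and_standardise` and the synthetic data are hypothetical.

```python
import numpy as np


def split_and_standardise(X, y, test_size=0.2, seed=42):
    """Shuffle, split, and standardise using training statistics only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(len(X) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    X_tr, X_te = X[train_idx], X[test_idx]
    mean = X_tr.mean(axis=0)
    std = X_tr.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features

    # The test split is scaled with the training mean and std, never its own.
    return (X_tr - mean) / std, y[train_idx], (X_te - mean) / std, y[test_idx]


# Synthetic stand-in for the real feature table.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
y_demo = rng.integers(0, 2, size=100)

X_tr_s, y_tr, X_te_s, y_te = split_and_standardise(X_demo, y_demo)
print(X_tr_s.shape, X_te_s.shape)
```

Fitting the scaler on the training split only keeps information about the test distribution out of training.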

#### Use the training dataset to train the model for 100 epochs. Use mini-batch gradient descent with the **‘Adam’** optimizer, a learning rate of **0.001**, and **batch size = 128**. Implement early stopping with a patience of **3**.

3. Define a PyTorch Dataset and DataLoaders.

In [4]:
class CustomDataset(Dataset):
    def __init__(self, X, y):
        self.X = torch.tensor(X, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.long)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]


def initialise_loaders(X_train_scaled, y_train, X_test_scaled, y_test, batch_size=128):
    train_dataset = CustomDataset(X_train_scaled, y_train)
    test_dataset = CustomDataset(X_test_scaled, y_test)

    # Shuffle only the training data; keep the test order fixed.
    train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
    return train_dataloader, test_dataloader

train_dataloader, test_dataloader = initialise_loaders(X_train_scaled, y_train, X_test_scaled, y_test)

4. Next, define the model, optimizer and loss function.

In [5]:
# YOUR CODE HERE
from torch.optim import Adam
from torch.nn import CrossEntropyLoss

model = MLP(X_train_scaled.shape[1], 128, len(df['label'].unique()))
optimizer = Adam(model.parameters(), lr=0.001)
loss_fn = CrossEntropyLoss()
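
One thing worth checking with this setup: `nn.CrossEntropyLoss` applies a log-softmax to its input, so feeding it sigmoid outputs (which lie in [0, 1]) puts a floor on the loss. The sketch below assumes two classes, which the code above does not confirm, and computes that floor for a perfectly confident prediction. It also shows a common alternative pairing, a single sigmoid output with `nn.BCELoss`, purely for comparison.

```python
import math

import torch
from torch import nn

# Best case after the sigmoid: 1.0 for the true class, 0.0 for the other.
# Two classes is an assumption made for this illustration.
sigmoid_out = torch.tensor([[1.0, 0.0]])
target = torch.tensor([0])

ce_floor = nn.CrossEntropyLoss()(sigmoid_out, target).item()
analytic = math.log(1 + math.exp(-1))  # equals -log(e / (e + 1))
print(f"{ce_floor:.4f} {analytic:.4f}")

# Alternative pairing: one sigmoid output per sample with binary cross-entropy.
# BCELoss expects probabilities and float targets of the same shape.
prob = torch.tensor([[0.999]])
bce = nn.BCELoss()(prob, torch.tensor([[1.0]])).item()
print(f"{bce:.4f}")
```

Under the two-class assumption the floor is about 0.3133, which is the value the test loss settles at in the training run below. With `nn.BCELoss` on a single sigmoid output, the loss can approach zero instead.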

5. Train the model for 100 epochs. Record the train and test accuracies. Implement early stopping.

In [6]:
from torch.utils.tensorboard import SummaryWriter

# Create a tensorboard writer for visualization
writer = SummaryWriter('runs/your_run')

# Training loop
best_val_acc = 0
patience = 3
epochs_since_improvement = 0

for epoch in range(100):
    model.train()
    for batch_idx, (data, target) in enumerate(train_dataloader):
        optimizer.zero_grad()
        output = model(data)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()


        # Log training loss to tensorboard
        writer.add_scalar('Loss/train', loss.item(), epoch * len(train_dataloader) + batch_idx)

    # Evaluation on the test set (it doubles as the validation set for early stopping)
    model.eval()
    test_loss = 0  # Initialize test loss
    correct = 0
    total = 0
    with torch.no_grad():
        for data, target in test_dataloader:
            output = model(data)
            test_loss += loss_fn(output, target).item()  # Accumulate test loss
            _, predicted = torch.max(output.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()

    # Calculate average test loss
    test_loss /= len(test_dataloader)
    val_acc = 100 * correct / total
    print(f'Epoch {epoch+1}: Validation Acc: {val_acc:.2f}%, Test Loss: {test_loss:.4f}')

    # Log test loss to tensorboard
    writer.add_scalar('Loss/test', test_loss, epoch)
    
    # Early stopping
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        epochs_since_improvement = 0
        torch.save(model.state_dict(), f'models/best_model_{epoch}_a1.pth')
    else:
        epochs_since_improvement += 1
        if epochs_since_improvement >= patience:
            print('Early stopping')
            break

# Close tensorboard writer
writer.close()

Epoch 1: Validation Acc: 99.96%, Test Loss: 0.3139
Epoch 2: Validation Acc: 99.92%, Test Loss: 0.3138
Epoch 3: Validation Acc: 100.00%, Test Loss: 0.3133
Epoch 4: Validation Acc: 100.00%, Test Loss: 0.3133
Epoch 5: Validation Acc: 100.00%, Test Loss: 0.3133
Epoch 6: Validation Acc: 99.96%, Test Loss: 0.3135
Early stopping
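
The stopping rule used in the loop can be isolated into a small helper and replayed on the accuracies printed above. `EarlyStopper` is a hypothetical name introduced for this sketch; it mirrors the strict `>` comparison in the loop, so a tie does not count as an improvement.

```python
class EarlyStopper:
    """Stop once the score has failed to improve `patience` times in a row."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, score):
        """Record one epoch's score and return True when training should stop."""
        if score > self.best:  # strict improvement only
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Accuracies copied from the log above.
logged_acc = [99.96, 99.92, 100.00, 100.00, 100.00, 99.96]

stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, acc in enumerate(logged_acc, start=1):
    if stopper.step(acc):
        stopped_at = epoch
        break

print(stopped_at, stopper.best)
```

The best accuracy is first reached in epoch 3. Epochs 4 and 5 only tie it and epoch 6 is lower, so the patience of 3 runs out at epoch 6.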


#### Plot train and test accuracies and losses on training and test data against training epochs and comment on the line plots.


In [7]:
# YOUR CODE HERE

!tensorboard --logdir runs/your_run

TensorFlow installation not found - running with reduced feature set.
I1009 20:23:57.578763 6425702400 plugin.py:429] Monitor runs begin
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.18.0 at http://localhost:6006/ (Press CTRL+C to quit)
^C


![img](loss_train_test.png)
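
The question also asks for train and test accuracies. One possible way to collect per-epoch loss and accuracy for either split is an evaluation helper like the one below. The name `evaluate`, the tiny linear model, and the random data are placeholders for illustration, not part of the original notebook.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, loss_fn):
    """Return (mean batch loss, accuracy in percent) over a dataloader."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for data, target in loader:
            output = model(data)
            total_loss += loss_fn(output, target).item()
            correct += (output.argmax(dim=1) == target).sum().item()
            total += target.size(0)
    return total_loss / len(loader), 100 * correct / total


# Placeholder data and model, only to exercise the helper.
torch.manual_seed(0)
X_toy = torch.randn(64, 5)
y_toy = (X_toy[:, 0] > 0).long()
toy_loader = DataLoader(TensorDataset(X_toy, y_toy), batch_size=16)
toy_model = nn.Linear(5, 2)

history = {"loss": [], "acc": []}
toy_loss, toy_acc = evaluate(toy_model, toy_loader, nn.CrossEntropyLoss())
history["loss"].append(toy_loss)
history["acc"].append(toy_acc)
print(history)
```

Calling such a helper once per epoch on both `train_dataloader` and `test_dataloader`, and passing the results to `writer.add_scalar`, would give accuracy curves next to the loss curves.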

6. Comment on the line plots.

In [8]:
# YOUR CODE HERE
answer = """
As the loss on the training set decreases, the loss on the test set also
decreases. Past a certain point, however, the test loss stops improving: it
plateaus at 0.3133 from epoch 3 to epoch 5 and rises slightly to 0.3135 at
epoch 6. Test accuracy first reaches 100.00% at epoch 3 and does not improve
afterwards, so early stopping with a patience of 3 halts training after epoch 6.
"""