# M6 - W1 Assignment: Deep Learning I

The MNIST dataset is a widely used benchmark for testing new deep learning architectures and models. It contains images of handwritten digits (0 to 9), each flattened here to 784 pixel values (28 × 28), and is split into a training set and a test set of features and labels.

- Your first task is to use PyTorch to develop a model that correctly classifies handwritten digits. Although you could technically solve this task with any type of supervised model, please use a feed-forward neural network. You are free to experiment with the architecture of the model. Extra points will be awarded if you define your own model class rather than using a Sequential model.
- Please evaluate your model properly and interpret its performance.

In [3]:
# Import required libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import transforms
from torch.utils.data import TensorDataset, DataLoader
import pandas as pd
import os
from sklearn.metrics import precision_recall_fscore_support

# Set random seed for reproducibility
torch.manual_seed(42)

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

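# Define transformations for the data
# NOTE: this transform is defined but never applied below. The CSV data is
# converted to tensors directly, so the model trains on raw 0-255 pixel values.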
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

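# Load the MNIST dataset from CSV files using pandas
# (the working directory below is machine-specific; point it at the folder containing the CSV files)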
os.chdir("C:\\Users\\ManosIeronymakisProb\\OneDrive - Probability\\Bureaublad\\ELU\\M6 - W1 Assignment Deep Learning I")

train_filepath = "mnist_train.csv"
test_filepath = "mnist_test.csv"

train_data = pd.read_csv(train_filepath)
test_data = pd.read_csv(test_filepath)

# Separate features and labels
train_labels = train_data['label']
train_features = train_data.drop('label', axis=1)

test_labels = test_data['label']
test_features = test_data.drop('label', axis=1)

# Convert the data to tensors
train_features_tensor = torch.tensor(train_features.values, dtype=torch.float32)
train_labels_tensor = torch.tensor(train_labels.values, dtype=torch.long)
test_features_tensor = torch.tensor(test_features.values, dtype=torch.float32)
test_labels_tensor = torch.tensor(test_labels.values, dtype=torch.long)

# Create TensorDatasets
train_dataset = TensorDataset(train_features_tensor, train_labels_tensor)
test_dataset = TensorDataset(test_features_tensor, test_labels_tensor)

# Define data loaders
batch_size = 128
num_workers = 2 if device.type == "cuda" else 0  # Adjust the number of workers based on GPU availability
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)

# Define the model class
class NeuralNet(nn.Module):
    def __init__(self):
        super(NeuralNet, self).__init__()
        self.fc1 = nn.Linear(784, 512)
        self.relu1 = nn.ReLU()
        self.fc2 = nn.Linear(512, 256)
        self.relu2 = nn.ReLU()
        self.fc3 = nn.Linear(256, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.fc2(x)
        x = self.relu2(x)
        x = self.fc3(x)
        return x

# Instantiate the model
model = NeuralNet().to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
num_epochs = 10

for epoch in range(num_epochs):
    # Set model to training mode
    model.train()
    
    for images, labels in train_loader:
        # Move data to GPU if available
        images, labels = images.to(device), labels.to(device)
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Note: `loss` is the loss of the last mini-batch of this epoch, not the epoch average
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Evaluation
model.eval()
true_labels = []
predicted_labels = []

with torch.no_grad():
    for images, labels in test_loader:
        # Move data to GPU if available
        images, labels = images.to(device), labels.to(device)

        outputs = model(images)
        predicted = outputs.argmax(dim=1)
        
        true_labels.extend(labels.cpu().numpy())
        predicted_labels.extend(predicted.cpu().numpy())

# Calculate precision, recall, and F1-score
precision, recall, f1_score, _ = precision_recall_fscore_support(true_labels, predicted_labels, average='weighted')

print(f'Precision: {precision:.4f}')
print(f'Recall: {recall:.4f}')
print(f'F1-score: {f1_score:.4f}')
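# Note: this is still the loss of the last training mini-batch, not a loss computed on the test set
print(f'Loss: {loss.item():.4f}')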

# Print true label and predicted label for the first 20 test examples
print("Example predictions:")
for i in range(20):
    print(f'True: {true_labels[i]}, Predicted: {predicted_labels[i]}')


Epoch [1/10], Loss: 0.1749
Epoch [2/10], Loss: 0.0844
Epoch [3/10], Loss: 0.1225
Epoch [4/10], Loss: 0.0965
Epoch [5/10], Loss: 0.0171
Epoch [6/10], Loss: 0.0509
Epoch [7/10], Loss: 0.1134
Epoch [8/10], Loss: 0.0082
Epoch [9/10], Loss: 0.0078
Epoch [10/10], Loss: 0.1302
Precision: 0.9703
Recall: 0.9701
F1-score: 0.9701
Loss: 0.1302
Example predictions:
True: 7, Predicted: 7
True: 2, Predicted: 2
True: 1, Predicted: 1
True: 0, Predicted: 0
True: 4, Predicted: 4
True: 1, Predicted: 1
True: 4, Predicted: 4
True: 9, Predicted: 9
True: 5, Predicted: 5
True: 9, Predicted: 9
True: 0, Predicted: 0
True: 6, Predicted: 6
True: 9, Predicted: 9
True: 0, Predicted: 0
True: 1, Predicted: 1
True: 5, Predicted: 5
True: 9, Predicted: 9
True: 7, Predicted: 7
True: 3, Predicted: 8
True: 4, Predicted: 4


# Results

The training loop prints `loss.item()` after the inner loop has finished, so each value below is the loss on the last mini-batch of that epoch, not the average loss over the training set. Single-batch losses are noisy, so individual epoch-to-epoch changes should be read with caution.

Epoch [1/10], Loss: 0.1749: The last mini-batch of the first epoch had a loss of 0.1749.

Epoch [2/10], Loss: 0.0844: The last-batch loss dropped to 0.0844, which is consistent with the model getting better at matching predicted and actual labels.

Epoch [3/10], Loss: 0.1225: The last-batch loss rose to 0.1225. This is most likely mini-batch noise. A rise in a single training-batch loss cannot show overfitting by itself, because overfitting appears as a gap between training and validation loss, and no validation loss is tracked here.

Epoch [4/10], Loss: 0.0965: The last-batch loss fell again to 0.0965. Since this is a training loss, it says nothing direct about generalization.

Epoch [5/10], Loss: 0.0171: The last-batch loss fell sharply to 0.0171, which suggests the model now fits most training batches well.

Epoch [6/10], Loss: 0.0509: The last-batch loss rose to 0.0509 but remains low.

Epoch [7/10], Loss: 0.1134: The last-batch loss rose again to 0.1134, another fluctuation of the kind expected from batch-to-batch variation.

Epoch [8/10], Loss: 0.0082: The last-batch loss dropped to 0.0082, the kind of value expected once most training samples are classified with high confidence.

Epoch [9/10], Loss: 0.0078: The last-batch loss stayed low at 0.0078.

Epoch [10/10], Loss: 0.1302: The last-batch loss rose to 0.1302 in the final epoch. Taken together, the values fluctuate between roughly 0.008 and 0.17 without a clean monotonic trend, which is what single-batch measurements typically look like.
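The training loop in this notebook prints the loss of whichever mini-batch happened to be processed last in each epoch. The following is a minimal sketch of how a per-epoch average could be tracked instead. The helper names `train_one_epoch` and `evaluate_loss`, the `toy_` objects, and the random stand-in data are assumptions made for illustration and are not part of the original notebook.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one training epoch and return the sample-weighted mean loss."""
    model.train()
    total_loss, total_samples = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        # CrossEntropyLoss returns the batch mean, so weight it by the batch size
        total_loss += loss.item() * labels.size(0)
        total_samples += labels.size(0)
    return total_loss / total_samples


def evaluate_loss(model, loader, criterion, device):
    """Return the sample-weighted mean loss on a loader without updating weights."""
    model.eval()
    total_loss, total_samples = 0.0, 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = criterion(model(images), labels)
            total_loss += loss.item() * labels.size(0)
            total_samples += labels.size(0)
    return total_loss / total_samples


# Hypothetical stand-in data (random tensors shaped like flattened MNIST)
# so that the sketch runs on its own; replace with the real loaders.
torch.manual_seed(0)
toy_device = torch.device("cpu")
toy_features = torch.randn(256, 784)
toy_labels = torch.randint(0, 10, (256,))
toy_loader = DataLoader(TensorDataset(toy_features, toy_labels), batch_size=64, shuffle=True)

toy_model = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10)).to(toy_device)
toy_criterion = nn.CrossEntropyLoss()
toy_optimizer = optim.Adam(toy_model.parameters(), lr=0.001)

for epoch in range(3):
    train_loss = train_one_epoch(toy_model, toy_loader, toy_criterion, toy_optimizer, toy_device)
    # Evaluated on the same toy data purely to show the call; use a held-out set in practice
    eval_loss = evaluate_loss(toy_model, toy_loader, toy_criterion, toy_device)
    print(f"Epoch [{epoch + 1}/3], mean train loss: {train_loss:.4f}, eval loss: {eval_loss:.4f}")
```

In the notebook itself, the same helpers could be called with `model`, `train_loader`, `test_loader`, `criterion`, `optimizer`, and `device`. Comparing the two averaged curves is what would make a statement about overfitting possible.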

To evaluate a handwritten digit classifier, we need metrics that describe how it behaves on each digit. In this project, we considered precision, recall, F1-score, and loss. Loss is the quantity minimized during training and measures the discrepancy between the predicted class scores and the true labels. Precision for a given digit is the fraction of samples predicted as that digit that really are that digit, so it penalizes false positives. Recall for a given digit is the fraction of samples of that digit that the model identifies correctly, so it penalizes false negatives. The F1-score is the harmonic mean of precision and recall and is most informative when class distributions are uneven. Because the code passes `average='weighted'`, the reported values are per-class scores averaged with weights proportional to each digit's frequency in the test set. MNIST classes are roughly balanced, so the three weighted scores are expected to lie close to one another.
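To make the definitions above concrete, here is a small pure-Python sketch that computes precision, recall, and F1 for each digit separately. The function name `per_class_metrics` is an assumption made for illustration. The example labels are the 20 true/predicted pairs printed in the notebook output, which is far too small a sample to draw conclusions from.

```python
def per_class_metrics(true_labels, predicted_labels, num_classes=10):
    """Return {digit: (precision, recall, f1, support)} from two label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    metrics = {}
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        # Scores are defined as 0.0 here when the denominator is zero
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[c] = (precision, recall, f1, tp + fn)
    return metrics


# The 20 example predictions printed in the notebook output
example_true = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4]
example_pred = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 8, 4]

example_metrics = per_class_metrics(example_true, example_pred)
for digit, (precision, recall, f1, support) in example_metrics.items():
    print(f"digit {digit}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")
```

On the full test set, the same breakdown is available from scikit-learn by calling `precision_recall_fscore_support` with `average=None`, which returns one value per class instead of a single weighted average.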

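Weighted precision, recall, and F1-score on the test set were 0.9703, 0.9701, and 0.9701 respectively. Weighted recall is mathematically equal to overall accuracy, so the model classifies about 97% of the test images correctly. These metrics indicate that the model performs well at classifying handwritten digits. The reported loss of 0.1302 is the loss on the last training mini-batch of the final epoch, not a loss measured on the test set. The printed examples show the true and predicted labels for the first 20 samples of the test set (the test loader is not shuffled); 19 of them are correct, and one digit 3 is misclassified as 8.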
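The link between accuracy and weighted recall can be checked directly. The sketch below is a minimal pure-Python illustration; the function names `accuracy` and `weighted_recall` are assumptions made for this example, and the labels are the 20 pairs printed in the notebook output.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


def weighted_recall(true_labels, predicted_labels):
    """Per-class recall averaged with weights equal to each class's share of samples."""
    n = len(true_labels)
    total = 0.0
    for c in set(true_labels):
        support = sum(1 for t in true_labels if t == c)
        tp = sum(1 for t, p in zip(true_labels, predicted_labels) if t == c and p == c)
        # (support / n) * (tp / support) simplifies to tp / n,
        # so the sum over classes is the total number of correct predictions over n
        total += (support / n) * (tp / support)
    return total


# The 20 example predictions printed in the notebook output
example_true = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4]
example_pred = [7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 8, 4]

acc = accuracy(example_true, example_pred)
w_recall = weighted_recall(example_true, example_pred)
print(f"accuracy: {acc:.4f}, weighted recall: {w_recall:.4f}")
```

The two values agree up to floating-point rounding for any pair of label lists, which is why the reported weighted recall of 0.9701 can be read as the test accuracy.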