**HW 4**: *The goal in this assignment is to train Fully Connected Deep Neural Networks (FCNs / DNNs) to classify
images in the FashionMNIST dataset. As in the MNIST dataset, each sample is a grayscale 28 × 28 image; the training set consists of 60K images and the testing set of 10K images. Each image depicts an article of clothing that belongs to one of
10 classes.*

**Task 1:** Design your FCN model to have an adjustable number of hidden layers and neurons in each layer, with
ReLU activation. The first (input) layer should be of dimension 784 and the last (output) layer of dimension 10,
corresponding to the 10 classes. Use Cross Entropy loss to perform the classification and an SGD optimizer
with the learning rate as an adjustable parameter. Set the number of epochs as an adjustable parameter
and train the network. Inspect the training loss curve and compute the accuracy on the
validation set at each epoch. Perform testing at the end of training.


**Task 2:** Find a configuration of the FCN (by varying the adjustable parameters) that trains in reasonable time
(several minutes is reasonable) and results in a reasonable training loss curve and testing accuracy (at
least 85% testing accuracy). These parameters will be your baseline configuration.

The model below is the baseline, with its parameters defined in the code.

In [4]:
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Subset
from torchvision.datasets import FashionMNIST
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

In [5]:
# Use the following code to load and normalize the dataset for training and testing.
# It will download the dataset into the data subfolder (change to your data folder name).
train_dataset = torchvision.datasets.FashionMNIST('C:\\Users\\Sarayu G\\582\\', train=True, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Normalize(
                                 (0.1307,), (0.3081,))
                             ]))

test_dataset = torchvision.datasets.FashionMNIST('C:\\Users\\Sarayu G\\582\\', train=False, download=True,
                             transform=torchvision.transforms.Compose([
                               torchvision.transforms.ToTensor(),
                               torchvision.transforms.Normalize(
                                 (0.1307,), (0.3081,))
                             ]))


# Use the following code to create a validation set of 10%
train_indices, val_indices, _, _ = train_test_split(
    range(len(train_dataset)),
    train_dataset.targets,
    stratify=train_dataset.targets,
    test_size=0.1,
)

# Generate training and validation subsets based on indices
train_split = Subset(train_dataset, train_indices)
val_split = Subset(train_dataset, val_indices)


# set batches sizes
train_batch_size = 900 #Define train batch size
test_batch_size  = 1000 #Define test batch size (can be larger than train batch size)


# Define dataloader objects that help to iterate over batches and samples for
# training, validation and testing. Only the training set needs shuffling;
# accuracy on the validation and test sets does not depend on sample order.
train_batches = DataLoader(train_split, batch_size=train_batch_size, shuffle=True)
val_batches = DataLoader(val_split, batch_size=train_batch_size, shuffle=False)
test_batches = DataLoader(test_dataset, batch_size=test_batch_size, shuffle=False)
                                           
num_train_batches=len(train_batches)
num_val_batches=len(val_batches)
num_test_batches=len(test_batches)


#print(num_train_batches)
#print(num_val_batches)
#print(num_test_batches)
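
The constants in `Normalize((0.1307,), (0.3081,))` above are the mean and standard deviation commonly quoted for the original MNIST digits. As a rough sketch, dataset-specific statistics could be estimated as shown below. `channel_stats` is a hypothetical helper, not part of the assignment code, and the random tensor only stands in for a stack of grayscale images scaled to [0, 1].

```python
import torch

def channel_stats(images):
    """Return (mean, std) over all pixels of a float tensor shaped [N, 1, H, W]."""
    return images.mean().item(), images.std().item()

# Stand-in data (assumption): 256 fake grayscale 28x28 images with values in [0, 1].
torch.manual_seed(0)
fake_images = torch.rand(256, 1, 28, 28)

mean, std = channel_stats(fake_images)

# Subtract the mean and divide by the std, which is what Normalize does per channel.
normalized = (fake_images - mean) / std
```

With the real data, the input could be built as `train_dataset.data.unsqueeze(1).float() / 255`. The constants provided with the assignment are left unchanged in this notebook.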

In [6]:
class FCN(nn.Module):
    def __init__(self, input_dim, output_dim, hidden_dims):
        super(FCN, self).__init__()
        self.hidden_layers = nn.ModuleList([nn.Linear(input_dim, hidden_dims[0])])
        for i in range(len(hidden_dims) - 1):
            self.hidden_layers.append(nn.Linear(hidden_dims[i], hidden_dims[i + 1]))
        self.output_layer = nn.Linear(hidden_dims[-1], output_dim)
        self.relu = nn.ReLU()

    def forward(self, x):
        for layer in self.hidden_layers:
            x = self.relu(layer(x))
        x = self.output_layer(x)
        return x
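
A quick sanity check of the architecture can catch shape mistakes before training. The sketch below repeats a compact copy of the class so that it runs on its own, passes a dummy batch through the baseline configuration `[400, 400]`, and counts trainable parameters. `count_parameters`, `demo_model` and `dummy_batch` are illustrative names, not part of the assignment code.

```python
import torch
import torch.nn as nn

class FCN(nn.Module):
    # Compact copy of the class above so this sketch is self-contained.
    def __init__(self, input_dim, output_dim, hidden_dims):
        super().__init__()
        dims = [input_dim] + list(hidden_dims)
        self.hidden_layers = nn.ModuleList(
            [nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
        )
        self.output_layer = nn.Linear(dims[-1], output_dim)
        self.relu = nn.ReLU()

    def forward(self, x):
        for layer in self.hidden_layers:
            x = self.relu(layer(x))
        return self.output_layer(x)

def count_parameters(model):
    """Number of trainable weights and biases in the model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

demo_model = FCN(input_dim=784, output_dim=10, hidden_dims=[400, 400])

# Four fake images, flattened the same way as in the training loop.
dummy_batch = torch.randn(4, 1, 28, 28).reshape(-1, 784)
logits = demo_model(dummy_batch)

n_params = count_parameters(demo_model)
print(logits.shape, n_params)
```

For this configuration the count is (784·400 + 400) + (400·400 + 400) + (400·10 + 10) = 478,410 parameters, and the output has one row of 10 logits per input image.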

In [7]:
import numpy as np
from tqdm import tqdm

input_dim = 784
output_dim = 10
hidden_dims = [400, 400]  # hidden layer configuration

model = FCN(input_dim=input_dim, output_dim=output_dim, hidden_dims=hidden_dims)

# Define the learning rate and epochs number
learning_rate = 0.05
epochs = 15

train_loss_list = np.zeros((epochs,))
validation_accuracy_list = np.zeros((epochs,))

# Define loss function and optimizer
loss_func = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Iterate over epochs, batches with progress bar and train+validate the FCN
# Track the loss and validation accuracy
for epoch in tqdm(range(epochs)):
    model.train()
    epoch_loss = 0.0  # Initialize epoch loss
    num_batches = 0   # Initialize number of batches processed in this epoch
    for train_features, train_labels in train_batches:
        optimizer.zero_grad()
        train_features = train_features.reshape(-1, input_dim)
        outputs = model(train_features)
        loss = loss_func(outputs, train_labels)
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()  # Accumulate loss for each batch
        num_batches += 1

    # Calculate average loss for the epoch
    average_epoch_loss = epoch_loss / num_batches
    train_loss_list[epoch] = average_epoch_loss 
    
    #print(f'Epoch [{epoch+1}/{epochs}], Training Loss: {average_epoch_loss:.4f}')

    # FCN validation: switch to eval mode and disable gradient tracking
    model.eval()
    val_acc = 0
    total_samples = 0
    with torch.no_grad():
        for val_features, val_labels in val_batches:
            # Reshape validation images into vectors
            val_features = val_features.reshape(-1, input_dim)

            # Compute validation outputs (class logits)
            val_outputs = model(val_features)

            # Predicted class is the index of the largest logit
            predicted = val_outputs.argmax(dim=1)
            val_acc += (predicted == val_labels).sum().item()
            total_samples += val_labels.size(0)  # Count total samples

    # Record validation accuracy (in percent) for the epoch
    average_val_acc = val_acc / total_samples * 100
    validation_accuracy_list[epoch] = average_val_acc
    #print("Epoch:", epoch, "; Validation Accuracy:", average_val_acc, '%')
    

100%|██████████████████████████████████████████████████████████████████████████████████| 15/15 [03:57<00:00, 15.86s/it]


In [8]:
# Initialize variables for computing accuracy
total_correct = 0
total_samples = 0

# Switch to eval mode and disable gradient tracking for inference
model.eval()
with torch.no_grad():
    for test_features, test_labels in test_batches:
        # Reshape test images into vectors
        test_features = test_features.reshape(-1, input_dim)

        # Compute test outputs (class logits)
        test_outputs = model(test_features)

        # Predicted class is the index of the largest logit
        predicted = test_outputs.argmax(dim=1)
        
        # Compute number of correct predictions in the batch
        total_correct += (predicted == test_labels).sum().item()
        
        # Count total number of samples in the batch
        total_samples += test_labels.size(0)

# Compute total accuracy
test_accuracy = total_correct / total_samples * 100

# Report total accuracy
#print("Test Accuracy:", test_accuracy, "%")
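
The validation and test loops use the same accuracy computation, so it could be factored into one function. The following is a minimal sketch under that assumption. `evaluate_accuracy` is a hypothetical helper, and the toy linear model and random tensors only stand in for the trained FCN and the FashionMNIST loaders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate_accuracy(model, batches, input_dim=784):
    """Return classification accuracy in percent over all batches."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for features, labels in batches:
            outputs = model(features.reshape(-1, input_dim))
            predicted = outputs.argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return 100.0 * correct / total

# Toy model (assumption): zero weights and a bias that always favors class 3.
toy_model = nn.Linear(784, 10)
with torch.no_grad():
    toy_model.weight.zero_()
    toy_model.bias.zero_()
    toy_model.bias[3] = 1.0

# 20 fake images: 15 labeled as class 3, 5 labeled as class 0.
toy_images = torch.rand(20, 1, 28, 28)
toy_labels = torch.full((20,), 3, dtype=torch.long)
toy_labels[:5] = 0

toy_batches = DataLoader(TensorDataset(toy_images, toy_labels), batch_size=8)
toy_accuracy = evaluate_accuracy(toy_model, toy_batches)
print(f"Toy accuracy: {toy_accuracy:.1f}%")
```

In the notebook, the same function could be called as `evaluate_accuracy(model, val_batches)` at the end of each epoch and as `evaluate_accuracy(model, test_batches)` after training.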


In [15]:
import seaborn as sns

sns.set(style = 'whitegrid', font_scale = 1.5)
#print(validation_accuracy_list)

In [1]:
import matplotlib.pyplot as plt

#fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(16, 6))

#axes[0].plot(train_loss_list, linewidth = 3)
#axes[0].set_ylabel("training loss")
#axes[0].set_xlabel("epochs")

#axes[1].plot(validation_accuracy_list, linewidth = 3, color = 'gold')
#axes[1].set_ylabel("validation accuracy")
#sns.despine()
