## Lab 4. Building a Neural Network in PyTorch

Jay Urbain, PhD

12/30/2022, 1/4/2023


Lab 4: In this assignment, we will build neural networks in PyTorch to classify images in the `FashionMNIST` dataset.

The assignment consists of three parts. Part 1 is already done for you and is meant as a tutorial. Make sure you read and execute the notebook. In Part 2, you will create a basic convolutional neural network as described below. In Part 3, you are encouraged to improve the performance of the network. Do a little thinking and research. Good luck!

TODO: Part 1: Create a neural network model   
TODO: Part 2: Create a convolutional neural network model   
TODO: Part 3: Improve your convolutional neural network model   

In [None]:
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

PyTorch offers domain-specific libraries such as `TorchText`, `TorchVision`, and `TorchAudio`, all of which include datasets. 

We will be using the `FashionMNIST` dataset from `TorchVision`.

https://www.kaggle.com/code/pavansanagapati/a-simple-cnn-model-beginner-guide/data

Code for processing data samples can get messy and hard to maintain. We want our dataset code to be decoupled from our model training code for better readability and modularity.

PyTorch provides two data primitives: `torch.utils.data.DataLoader` and `torch.utils.data.Dataset` that allow you to use pre-loaded datasets as well as your own data. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.

https://pytorch.org/tutorials/beginner/basics/data_tutorial.html

If you have a custom dataset, you'll need to write a subclass of `torch.utils.data.Dataset` for your files.
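
As a rough sketch of what such a class might look like, the example below serves image and label pairs that are already held in memory. The class name `InMemoryImageDataset` and the random data are hypothetical stand-ins for your own files. A map-style `Dataset` needs a `__len__` method and a `__getitem__` method.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class InMemoryImageDataset(Dataset):
    """Hypothetical example: serves (image, label) pairs held in memory."""

    def __init__(self, images, labels, transform=None):
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels
        self.transform = transform

    def __len__(self):
        # Number of samples in the dataset.
        return len(self.images)

    def __getitem__(self, idx):
        # Return one sample, applying the optional transform to the image.
        image = self.images[idx]
        if self.transform is not None:
            image = self.transform(image)
        return image, self.labels[idx]


# Random stand-in data: 100 grayscale 28x28 "images" with labels in 0..9.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = InMemoryImageDataset(images, labels)

loader = DataLoader(dataset, batch_size=32)
X, y = next(iter(loader))
print(X.shape, y.shape)
```

Because the class only exposes length and indexing, a `DataLoader` can batch it exactly as it batches the built-in datasets.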

Note: We're only performing one basic transformation, `ToTensor`, which converts each image (a PIL image or NumPy ndarray) into a PyTorch tensor.

In [None]:
# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)

Examine the dataset.

In [None]:
import matplotlib.pyplot as plt

img_ = training_data[0][0].numpy().reshape(28, 28)
plt.imshow(img_, cmap='gray')
plt.show()
img_.shape

Pass the `Dataset` as an argument to `DataLoader`. This wraps an iterable over the dataset and supports automatic batching, sampling, shuffling, and multiprocess data loading. Another benefit is the ability to perform data manipulation. But don't overdo it, since you want data loading to be efficient.

Here we define a batch size of 64, i.e. each element in the dataloader iterable will return a batch of 64 features and labels.

In [None]:
batch_size = 64

# Create data loaders.
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

for X, y in test_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break
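
The number of batches follows from the dataset size and the batch size. The sketch below uses a synthetic `TensorDataset` of 1,000 samples (an assumption for illustration, not the lab data) to show that the loader yields `ceil(N / batch_size)` batches and that the last batch can be smaller than the rest.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in data: 1000 samples shaped like FashionMNIST images.
features = torch.zeros(1000, 1, 28, 28)
targets = torch.zeros(1000, dtype=torch.int64)
dataset = TensorDataset(features, targets)

loader = DataLoader(dataset, batch_size=64, shuffle=True)

num_batches = len(loader)
batch_sizes = [len(X) for X, _ in loader]

print(num_batches, math.ceil(1000 / 64))  # number of batches per pass
print(batch_sizes[-1])                    # size of the final, partial batch
```

The loaders in this lab do not pass `shuffle=True`. Reshuffling the training data each epoch is a common choice and may be worth trying for the training loader.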

#### Creating Models

To define a neural network in PyTorch, create a class that inherits from `nn.Module`.

Define the layers of the network in the `__init__` function and specify how data will pass through the network in the `forward` function. To accelerate operations in the neural network, move it to the GPU if available.

## TODO: Part 1: Create a neural network model 

In [None]:
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)
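
It can help to check how many trainable parameters a network like this has. The sketch below rebuilds only the layer stack from Part 1 (same layer sizes, standalone variable name `stack`) and compares PyTorch's count with a hand calculation: each `Linear(in, out)` layer holds `in * out` weights plus `out` biases.

```python
from torch import nn

# Same layer sizes as the NeuralNetwork class in Part 1.
stack = nn.Sequential(
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

counted = sum(p.numel() for p in stack.parameters())
by_hand = (784 * 512 + 512) + (512 * 512 + 512) + (512 * 10 + 10)
print(counted, by_hand)
```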

## TODO: Part 2: Create a convolutional neural network model 

First, review the following references:

Conv2d  
https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

BatchNorm2d  
https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html

ReLU  
https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html 

MaxPool2d  
https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html

Linear  
https://pytorch.org/docs/stable/generated/torch.nn.Linear.html   

Second, complete the following ConvNet class.

In [None]:
from torch.nn import Linear, ReLU, CrossEntropyLoss, Sequential, Conv2d, MaxPool2d, Module, Softmax, BatchNorm2d, Dropout

class ConvNet(nn.Module):   
    def __init__(self):
        super(ConvNet, self).__init__()

        self.cnn_layers = nn.Sequential(
            # Defining a 2D 3x3 convolution layer using Conv2d. Use a stride of 1 and padding of 1.
            # input channels should be 1, output channels should be 4.

            
            # Perform batch normalization

            
            # Apply the ReLU non-linear activation function

            
            # Reduce the spatial dimensionality by using max pooling with size of 2 and stride of 2.

            
            # Define another 2D convolution layer
            # Be careful to define the correct number of input channels into your convolution layer

        )

        self.linear_layers = Sequential(
            Linear(4 * 7 * 7, 10)
        )

    # Defining the forward pass    
    def forward(self, x):
        x = self.cnn_layers(x)
        x = x.view(x.size(0), -1)
        x = self.linear_layers(x)
        return x
    
# model = ConvNet().to(device)
# print(model)
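
The `Linear(4 * 7 * 7, 10)` layer in the skeleton expects 4 channels of 7x7 feature maps, so it is worth tracking the spatial size through the layers. The helper below, `out_size` (a hypothetical name), applies the output-size formula from the `Conv2d` and `MaxPool2d` documentation for dilation 1. The second convolution's settings and the second pooling step are assumptions chosen to be consistent with the 7x7 size, not a prescribed solution.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                               # FashionMNIST images are 28x28
size = out_size(size, kernel=3, stride=1, padding=1)    # 3x3 conv, padding 1
after_conv = size
size = out_size(size, kernel=2, stride=2)               # 2x2 max pool, stride 2
after_pool1 = size
size = out_size(size, kernel=3, stride=1, padding=1)    # second conv (assumed settings)
size = out_size(size, kernel=2, stride=2)               # second pool (assumed)

print(after_conv, after_pool1, size)
print(4 * size * size)                                  # inputs to the linear layer
```

A 3x3 convolution with stride 1 and padding 1 keeps the spatial size unchanged, while each 2x2 pooling step with stride 2 halves it.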


## TODO: Part 3: Improve your convolutional neural network model 

Can you improve the performance of this convolutional neural network and earn the right to trash talk your professor?


In [None]:
# your super duper convnet here

# TODO


#### Optimizing the Model Parameters

To train a model, define a loss function and an optimizer. Note: We are just using cross entropy loss and stochastic gradient descent as discussed in class.

In [None]:
loss_fn = nn.CrossEntropyLoss() # define loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3) # define optimizer
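
`nn.CrossEntropyLoss` takes raw logits and integer class labels, so the models in this lab do not need a softmax layer before the loss. The sketch below, using made-up logits, checks that the loss equals the mean negative log-probability of the true class.

```python
import torch
import torch.nn.functional as F
from torch import nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])   # raw model outputs, one row per sample
targets = torch.tensor([0, 2])             # integer class indices

loss = nn.CrossEntropyLoss()(logits, targets)

# Same quantity by hand: mean negative log-probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
by_hand = -log_probs[torch.arange(len(targets)), targets].mean()

print(loss.item(), by_hand.item())
```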

In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and backpropagates the prediction error to adjust the model’s parameters.


In [None]:
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X) # use model to make a prediction
        loss = loss_fn(pred, y) # measure loss with respect to ground truth y

        # Backpropagation
        optimizer.zero_grad() # zero out gradients from last pass
        loss.backward() # compute gradients of the loss with respect to the model parameters
        optimizer.step() # optimize model parameters

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")
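
To make the three backpropagation calls concrete, here is a toy sketch with a single two-element parameter and a made-up quadratic loss. With plain SGD (no momentum or weight decay), `optimizer.step()` applies the update `w = w - lr * grad`.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()      # toy loss; its gradient is 2 * w
optimizer.zero_grad()      # clear any stale gradients
loss.backward()            # fills w.grad with d(loss)/dw

expected = w.detach() - 0.1 * w.grad   # w - lr * grad, computed by hand
optimizer.step()                       # in-place update of w

print(w.detach(), expected)
```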

Define a test function to test the model’s performance against the test dataset to ensure it is learning. This is typically called validation.

In [None]:
from torch.utils.data import DataLoader

def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
    return correct, test_loss
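
The accuracy line in `test` can be traced on a tiny example. The logits and labels below are made up: `argmax(1)` picks the highest-scoring class in each row, and the comparison with `y` counts the matches.

```python
import torch

pred = torch.tensor([[2.0, 1.0, 0.0],
                     [0.0, 3.0, 1.0],
                     [1.0, 0.0, 5.0],
                     [4.0, 0.0, 0.0]])   # logits for 4 samples, 3 classes
y = torch.tensor([0, 1, 1, 0])           # true labels

predicted_classes = pred.argmax(1)       # index of the largest logit per row
correct = (predicted_classes == y).type(torch.float).sum().item()
accuracy = correct / len(y)

print(predicted_classes.tolist(), accuracy)
```

Three of the four predictions match the labels here, so the accuracy is 0.75.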

The training process is conducted over several iterations (epochs). 

During each epoch, the model learns parameters to make better predictions. 

Print the model’s accuracy and loss at each epoch. You want to see the accuracy increase and the loss decrease with every epoch.

In [None]:
epochs = 5 # you may want to increase this number
loss_list = []
acc_list = []
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    accuracy, loss = test(test_dataloader, model, loss_fn)
    acc_list.append(accuracy)
    loss_list.append(loss)
print("Done!")

In [None]:
import matplotlib.pyplot as plt

plt.plot(loss_list)
plt.xlabel("no. of epochs")
plt.ylabel("average loss")
plt.title("Loss")
plt.show()

In [None]:
plt.plot(acc_list)
plt.xlabel("no. of epochs")
plt.ylabel("accuracy")
plt.title("Accuracy")
plt.show()

#### Saving Models

You'll want to save your models so you don't have to retrain each time!

A common way to save a model is to serialize the internal state dictionary (containing the model parameters).


In [None]:
torch.save(model.state_dict(), "model.pth")
print("Saved PyTorch Model State to model.pth")

#### Loading Models

The process for loading a model includes re-creating the model structure and loading the state dictionary into it.

In [None]:
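model = NeuralNetwork()
# model = ConvNet()
# map_location="cpu" lets a checkpoint saved on a GPU load on a CPU-only machine.
model.load_state_dict(torch.load("model.pth", map_location="cpu"))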
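
The save and load steps can be checked end to end on a small model. This sketch uses a tiny made-up network and an in-memory buffer instead of a file (both are assumptions for illustration). It prints the state dictionary keys and confirms that the restored model produces the same outputs as the original.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)
source = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
keys = list(source.state_dict().keys())
print(keys)

# Serialize to an in-memory buffer instead of a file (illustration only).
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Re-create the same architecture, then load the saved parameters into it.
restored = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
restored.load_state_dict(torch.load(buffer))

x = torch.rand(5, 4)
same = torch.equal(source(x), restored(x))
print(same)
```

The state dictionary holds only tensors keyed by layer name, which is why the architecture has to be rebuilt in code before loading.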

#### Make predictions!

In [None]:
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
print(x.shape)

# approach 1, add a batch dimension to a single example
# Allows you to pick any specific example
# x is already a tensor of shape [1, 28, 28]; the model expects [N, 1, 28, 28].
# The model loaded above lives on the CPU, so x stays on the CPU too.
x = x.unsqueeze(0)

# approach 2, use dataloader iterator.
# Doesn't allow you to pick a specific example
# inputs, labels = next(iter(test_dataloader))

# approach 3, use dataloader
# Doesn't allow you to pick a specific example
# for X, y in test_dataloader:
#     X, y = X.to(device), y.to(device)
#     break

with torch.no_grad():
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')
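
The model returns raw logits rather than probabilities. If you want a confidence value alongside the predicted class, one option is to apply softmax to the logits. The sketch below uses made-up logits in place of `model(x)`; the predicted class is the same either way because softmax preserves the ordering of the scores.

```python
import torch

logits = torch.tensor([[1.0, 3.0, 0.5]])     # stand-in for model(x) on one sample
probs = torch.softmax(logits, dim=1)         # each row now sums to 1

print(probs)
print(probs.sum().item())
print(logits.argmax(1).item(), probs.argmax(1).item())
```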

In [None]:
import matplotlib.pyplot as plt

img_ = test_data[0][0].numpy().reshape(28, 28)
plt.imshow(img_, cmap='gray')
plt.show()
