## Welcome to the advanced version of the UMARV 2024-2025 PyTorch coding checkpoint

In this notebook you will create a basic CNN architecture that performs image recognition. This will give you the experience you need to get started designing models for the robot.

# This should take around 1-3 hours
You may ask Ryan or use online resources for help if you want to meet this time. We suggest you stay away from AI tools because they will not help you learn the actual material, which will hurt your ability to contribute to the actual robot. You are welcome to go beyond this time to get more out of the checkpoint.

# Step One 
Please download the dataset 

In [6]:
#Do not adjust the imports unless you add additional functionality. These should be all that you need to complete the checkpoint
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
import torchvision.datasets as datasets
import torchvision.transforms as transforms

In [7]:
class CNN(nn.Module):
    def __init__(self, in_channels = 1, num_classes = 10):
        super(CNN, self).__init__()
        '''
        Network outline
        conv1 -> ReLU -> max pool -> conv2 -> ReLU -> max pool -> fully connected
        
        conv1: nn.Conv2d with these parameters: in_channels=1, out_channels=8, kernel_size=(3,3), stride=1, padding=1
        pool1: nn.MaxPool2d with these parameters: kernel_size=(2,2), stride=(2,2)
        conv2: nn.Conv2d with these parameters: in_channels=8, out_channels=16, kernel_size=(3,3), stride=1, padding=1
        pool2: nn.MaxPool2d with these parameters: kernel_size=(2,2), stride=(2,2)
        fully connected: nn.Linear with these parameters: in_features=16*7*7, out_features=num_classes
        
        Leave the ReLU parts for the forward function
        '''
        #TODO: use the outline above to create the layers of your convolutional neural network.
        self.conv1 = None
        self.pool = None        #only need one pooling layer because you can reuse it
        self.conv2 = None
        self.fc1 = None
        
    def forward(self, x):
        '''
        In this forward function you will feed a batch of datapoints through your network and return the result.
        For each datapoint, the return value is a tensor (basically just an optimized list) with 10 scores, one per class.
        The raw outputs are not normalized to sum to 1 (nn.CrossEntropyLoss handles that internally during training),
        but the highest score still marks the predicted class.
        
        For example, if the scores were normalized to [0.1, 0.1, 0.4, 0.0, 0.0, 0.2, 0.0, 0.0, 0.1, 0.1], this datapoint would be
        classified as a 2 because that index has the highest value.
        
        
        '''
        # Here is the first example of how to pass the datapoint x into a convolutional layer and then through a ReLU activation function.
        # TODO: Finish passing x through the neural network, making sure to go through each layer and activation function from the network outline above.
        # Then return x
        x = F.relu(self.conv1(x))
        x = self.pool(x)
        x = F.relu(self.conv2(x))
        x = self.pool(x)
        
        # Once you move into the fully connected part, you must reshape x by flattening all of the pixels 
        # into a line so it can be fed through a fully connected layer. The line below does that for you.
        x = x.reshape(x.shape[0], -1)
        x = None        #TODO: pass x through the fully connected layer
        
        return x
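
The `in_features=16*7*7` in the network outline comes from tracking how each layer changes the image size. Here is a small sketch of that arithmetic in plain Python. The helper name `conv_out` is made up for this example, and it assumes the standard output-size formula for convolution and pooling layers (no dilation).

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                                              # MNIST images are 28x28
size = conv_out(size, kernel=3, stride=1, padding=1)   # conv1 keeps the size at 28
size = conv_out(size, kernel=2, stride=2)              # first max pool halves it to 14
size = conv_out(size, kernel=3, stride=1, padding=1)   # conv2 keeps the size at 14
size = conv_out(size, kernel=2, stride=2)              # second max pool halves it to 7

out_channels = 16                                      # channels produced by conv2
flat_features = out_channels * size * size
print(size, flat_features)                             # 7 784
```

If you change the kernel size, padding, or number of pooling steps, rerun this arithmetic so the fully connected layer still matches the flattened tensor.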
    

        

# Now let's load the data

In [10]:
# Here we load in the MNIST dataset and divide it into a large training dataset and a smaller testing dataset
test_dataset = datasets.MNIST(root='dataset/', train=False, transform=transforms.ToTensor(), download=True)
train_dataset = datasets.MNIST(root='dataset/', train=True, transform=transforms.ToTensor(), download=True) #ToTensor converts each image to a tensor with pixel values scaled to [0, 1]

'''
A key part of training any neural network is a dataloader. These loaders will automatically grab batches of data from the dataset. You can iterate
through these dataloaders and therefore iterate over your entire dataset in a very simple way. 
'''
#TODO: define a variable called batch_size so that the loaders know how much data to grab at a time. A good place to start is 16, 32, or 64

#TODO: Replace each None with the name of the dataset you want to use for that loader and with True or False for whether to shuffle the data (choose one and adjust later)
train_loader = DataLoader(dataset=None, batch_size=batch_size, shuffle=None)
test_loader = DataLoader(dataset=None, batch_size=batch_size, shuffle=None)
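
To get a feel for what a loader hands you on each iteration, here is a minimal sketch on a made-up dataset. The names `toy_dataset` and `toy_loader` and the all-zero "images" are stand-ins for illustration only, not part of the checkpoint.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 10 blank 1x28x28 "images" with labels 0..9
images = torch.zeros(10, 1, 28, 28)
labels = torch.arange(10)
toy_dataset = TensorDataset(images, labels)

# batch_size=4 means each iteration yields up to 4 samples at once
toy_loader = DataLoader(dataset=toy_dataset, batch_size=4, shuffle=False)

batch_shapes = [tuple(data.shape) for data, targets in toy_loader]
print(len(toy_loader))   # 3 batches
print(batch_shapes)      # [(4, 1, 28, 28), (4, 1, 28, 28), (2, 1, 28, 28)]
```

Notice that the last batch is smaller when the dataset size is not a multiple of the batch size, which is why the code later in this notebook reads the batch size from the tensor instead of assuming it.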

In [25]:
#The line below should not be adjusted and simply detects if there is a usable GPU that can be used for training.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

#TODO: Please set these two variables to some generic values to start
# Learning rate controls how big each update to the model is. If it's too high, training can overshoot and fail to settle on a good solution, and if it's too low, the model won't learn quickly enough. So start with something like 0.001 or 0.01
# num_epochs should stay under 20 for the sake of time, plus you shouldn't need many more than that because the dataset is pretty simple
learning_rate = None
num_epochs = None

#Here we are defining the model and moving it to the device we want. 
model = CNN().to(device)

#You may also leave these alone
criterion = nn.CrossEntropyLoss() #criterion is the way that we will measure the loss (how wrong the predictions are) throughout the training process. You may research CrossEntropy on your own if you want.
optimizer = optim.Adam(model.parameters(), lr=learning_rate) # Adam is an optimizer that will help the neural network learn over time and make minor adjustments to the network to predict better.

#This is the start of the training loop. The outer loop goes through the epochs and the inner loop uses the dataloader to split up the data into batches and process them separately
for epoch in range(num_epochs):
    for batch_idx, (data, targets) in enumerate(train_loader):
        #Send data to the device you are using so that if you have a GPU you can use its optimization for faster training
        data = data.to(device=device)
        targets = targets.to(device=device)
        
        #TODO: There are five main steps to train any NN or CNN and you are to place them below. Each is one line.
        #TODO 1: feed the data through your model using the model variable that was defined earlier.

        
        #TODO 2: use the criterion variable to calculate the loss of the model. This means you must pass in the predicted values you just got and the target values given from the dataset.

        
        #TODO 3: Reset the gradient values of the optimizer using the zero_grad() function.

        
        #TODO 4: Use the loss and the .backward() function to perform backpropagation, which will tell the network where it needs to improve the most to make better predictions

        
        #TODO 5: Use the optimizer and the .step() function to update the parameters in the network and hopefully improve the performance

        
    print(f"Completed {epoch} epochs so far")
        


Completed 0 epochs so far
Completed 1 epochs so far
Completed 2 epochs so far
Completed 3 epochs so far
Completed 4 epochs so far
Completed 5 epochs so far
Completed 6 epochs so far
Completed 7 epochs so far
Completed 8 epochs so far
Completed 9 epochs so far
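
If you are wondering why the gradients need to be reset on every batch (TODO 3), it is because PyTorch adds new gradients onto the old ones instead of replacing them. Here is a small sketch on a single made-up parameter `w`. It clears the gradient by hand with `w.grad.zero_()`, while in the training loop the optimizer does this for all of the model's parameters.

```python
import torch

# A single hypothetical parameter, just for illustration
w = torch.tensor(2.0, requires_grad=True)

loss = w * 3.0
loss.backward()
first = w.grad.item()     # d(loss)/dw = 3.0

loss = w * 3.0
loss.backward()
second = w.grad.item()    # 6.0, the new gradient was added to the old one

w.grad.zero_()            # reset the stored gradient
loss = w * 3.0
loss.backward()
third = w.grad.item()     # back to 3.0

print(first, second, third)  # 3.0 6.0 3.0
```

Without the reset, every batch would update the network using the sum of all gradients seen so far, not just the gradient from the current batch.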


# Now Let's Evaluate the Model Accuracy

In [24]:
def check_accuracy(loader, model):
    if loader.dataset.train:
        print("checking accuracy on training data")
    else:
        print("checking accuracy on test data")
    
    num_correct = 0
    num_samples = 0
    model.eval()
    
    with torch.no_grad(): #torch.no_grad() turns off gradient tracking, which saves memory and time since we are only evaluating and not training
        #TODO: use the loader to loop through the features (x) and labels (y)
            #Again moving the data to the correct device
            x = x.to(device=device)
            y = y.to(device=device)
            
            #TODO: feed the data through your newly trained model and set the predictions to a variable
            
            #TODO: Now you should have a tensor that is x.shape[0] by 10 where each row represents a data point and each column is a class. 
            # Every cell is the score the model gives a certain data point for the given class (higher means more likely). So now use the .max function to find the highest scoring class
            # for each data point. Try your best with this, there are multiple ways to go about it (some in one line and some in many more). If you have tried for a while, visit 
            # HINT1 below for some help
           
            #TODO: calculate the number of correct predictions by comparing the predictions to the labels (y).
            num_correct += None
            
            num_samples += predictions.size(0) #no need to adjust, this simply keeps track of how many samples we have evaluated thus far
            
        print(f"Got {num_correct} / {num_samples} with accuracy {float(num_correct)/float(num_samples)*100:.2f}")
    model.train()
    
check_accuracy(train_loader, model)
check_accuracy(test_loader, model)

checking accuracy on training data
Got 58026 / 60000 with accuracy 96.71
checking accuracy on test data
Got 9701 / 10000 with accuracy 97.01
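
The network in this notebook ends in a plain `nn.Linear` layer, so what comes out is raw scores rather than probabilities that sum to 1. Here is a small sketch, using made-up scores for a single data point with three classes, of how softmax turns scores into probabilities without changing which class wins.

```python
import torch
import torch.nn.functional as F

# Hypothetical raw scores for one data point and three classes
scores = torch.tensor([2.0, -1.0, 0.5])

# Softmax rescales the scores so they are positive and sum to 1
probs = F.softmax(scores, dim=0)

total = probs.sum().item()                 # approximately 1.0
best_from_scores = scores.argmax().item()  # index of the largest raw score
best_from_probs = probs.argmax().item()    # index of the largest probability

print(best_from_scores, best_from_probs)   # 0 0
```

This is why accuracy can be computed straight from the model output: picking the largest raw score gives the same class as picking the largest probability.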


# Hints
These should be used if you are unsure how to move forward on this checkpoint. These will not address all issues so feel free to ask Ryan or Matt if you are struggling with something beyond the Hints.

In [None]:
#HINT1: Assuming scores is your model output tensor with the class scores, you may use the max function like this: scores.max(some_parameter)
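
To see what the parameter in HINT1 does, here is a small sketch of `Tensor.max` on a made-up 2 by 3 tensor. The tensor `t` is only for illustration, so you will still need to decide which dimension makes sense for your scores.

```python
import torch

t = torch.tensor([[1.0, 5.0, 2.0],
                  [7.0, 0.0, 3.0]])

# With no argument, max returns the single largest value
overall = t.max()

# With a dimension, max returns (values, indices) along that dimension
values0, indices0 = t.max(0)   # max down each column
values1, indices1 = t.max(1)   # max across each row

print(overall.item())                        # 7.0
print(values0.tolist(), indices0.tolist())   # [7.0, 5.0, 3.0] [1, 0, 1]
print(values1.tolist(), indices1.tolist())   # [5.0, 7.0] [1, 0]
```

The second returned tensor (the indices) tells you where each maximum was found, which is usually the part you want when picking a class.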