# Training Neural Networks
In this exercise, you will train a neural network using PyTorch. You will be given some starter code and will fill in the blanks.

The exercise walks through the entire process: loading the dataset, creating transforms, defining the network, and training it to classify the CIFAR-10 dataset.

In [2]:
# DO NOT EDIT THIS CELL
import torch
import os
from torch import nn
from torch import optim
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision import datasets
import matplotlib.pyplot as plt

## Loading and Preprocessing Data
In this section, we will load and preprocess our data using any relevant methods from `transforms` and `datasets`.
Then, we will create `DataLoader`s for our train and test sets.

If you have trouble, feel free to consult the documentation for [transforms](https://pytorch.org/vision/0.12/transforms.html) and [CIFAR-10](https://pytorch.org/vision/stable/generated/torchvision.datasets.CIFAR10.html#torchvision.datasets.CIFAR10).
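
As a rough illustration of what the normalization step does, the sketch below applies the per-channel formula used by `transforms.Normalize`, `(x - mean) / std`, to a random tensor standing in for an image that `ToTensor` has already scaled to [0, 1]. The names `fake_image` and `normalize` are hypothetical and exist only for this example. With a mean and standard deviation of 0.5, the values end up in the range [-1, 1].

```python
import torch

def normalize(image, mean, std):
    """Apply (x - mean) / std per channel to a (C, H, W) tensor."""
    mean = torch.tensor(mean).view(-1, 1, 1)
    std = torch.tensor(std).view(-1, 1, 1)
    return (image - mean) / std

# Stand-in for a CIFAR-10 image after ToTensor: 3 channels, 32 x 32, values in [0, 1)
torch.manual_seed(0)
fake_image = torch.rand(3, 32, 32)

normalized = normalize(fake_image, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
print(normalized.shape)
print(normalized.min().item() >= -1.0, normalized.max().item() <= 1.0)
```

Centering the inputs around zero in this way is a common choice for fully-connected networks, although other statistics (for example the dataset's own per-channel mean and standard deviation) would work too.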

In [3]:
DATA_PATH = os.path.join(os.getcwd(), "cifar-10-python")
# DATA_PATH = os.path.join(DATA_PATH, 'cifar-10-batches-py')
print(DATA_PATH)

C:\Users\mkand\Documents\Machine_learning\ML_Fundamentals\Deep Learning\cifar-10-python


In [4]:
# Establish our transform: convert to a tensor, then normalize each of the 3 color channels
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Load train and test datasets
training_data = datasets.CIFAR10(root = DATA_PATH, train = True, transform = transform, download = True)
test_data = datasets.CIFAR10(root = DATA_PATH, train = False, transform = transform, download = True)

# Create the training and test dataloaders
train_loader = DataLoader(training_data, batch_size = 32, shuffle = True)
test_loader = DataLoader(test_data, batch_size = 32)

Files already downloaded and verified
Files already downloaded and verified


## Defining our Neural Network
Once our data is loaded, we want to define our model. 
For this example, we want to use a fully-connected model, which means we will need to use `torch.flatten` to turn each 3 x 32 x 32 image tensor into a single vector of 3,072 values.

We want to have at least 2 hidden layers. 
The input size of the first layer will need to account for the flattening and will be 32 * 32 * 3.
Feel free to experiment here, and if you need additional help, consult the [PyTorch documentation](https://pytorch.org/docs/stable/nn.html).
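
To make the shape bookkeeping concrete, here is a minimal sketch that flattens a random stand-in batch and feeds it through a single linear layer. The batch size of 4 and the hidden size of 256 are arbitrary choices for illustration, not values the exercise requires.

```python
import torch
from torch import nn

# Stand-in batch: 4 images, 3 channels, 32 x 32 pixels (random values)
batch = torch.rand(4, 3, 32, 32)

# Flatten every dimension except the batch dimension
flat = torch.flatten(batch, 1)
print(flat.shape)  # torch.Size([4, 3072])

# A first linear layer must accept 3 * 32 * 32 = 3072 features
hidden_size = 256  # arbitrary choice for illustration
layer = nn.Linear(3 * 32 * 32, hidden_size)
out = layer(flat)
print(out.shape)  # torch.Size([4, 256])

# Weight matrix (3072 * 256) plus bias vector (256)
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 786688
```

The parameter count shows why fully-connected layers on raw pixels grow quickly: most of the weights sit in the first layer.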

In [5]:
# Define the class for your neural network
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.activation = F.relu
        self.layer1 = nn.Linear(3072, 1024)
        self.layer2 = nn.Linear(1024, 512)
        self.layer3 = nn.Linear(512, 10)

    def forward(self, x):
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = self.activation(self.layer1(x))
        x = self.activation(self.layer2(x))
        x = self.layer3(x)
        return x

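# Instantiate the model and move it to the GPU if one is available,
# since the training loop moves each batch there as well
net = Net()
if torch.cuda.is_available():
    net = net.cuda()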

## Optimizer and Loss function
Before we get into our training loop, we need to choose an optimizer and loss function for our network. 
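
For a classification network that outputs raw scores (logits), `nn.CrossEntropyLoss` is a natural candidate because it applies log-softmax internally and expects integer class indices as targets. The sketch below, using made-up logits and labels, checks that it matches a manual log-softmax computation.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Made-up logits for 2 samples and 3 classes, plus their true class indices
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# Manual version: log-softmax, pick the log-probability of the true class,
# then average the negated values over the batch
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(torch.isclose(loss, manual).item())  # True
```

This is also why the network's last layer should not apply a softmax itself when this loss is used.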

In [6]:
for p in net.parameters():
    print(p)
    break

Parameter containing:
tensor([[-5.9408e-03, -1.3315e-02, -9.2730e-03,  ...,  1.7658e-03,
         -1.4176e-02, -1.2058e-02],
        [ 5.2786e-03,  1.3002e-02,  1.2796e-02,  ...,  1.3725e-02,
         -7.4252e-03,  1.2435e-02],
        [ 7.4223e-03, -4.0730e-03, -1.4358e-02,  ...,  1.9649e-03,
          9.9431e-03, -1.2520e-02],
        ...,
        [ 1.4764e-03,  4.5004e-03, -9.9901e-05,  ..., -6.9590e-03,
         -1.2752e-02,  1.2805e-02],
        [ 9.2023e-03, -7.4177e-03,  8.9031e-03,  ...,  4.0995e-03,
         -1.1467e-02,  1.1628e-02],
        [ 6.7633e-03, -8.4288e-03,  1.3586e-02,  ..., -1.1011e-02,
         -1.6743e-02,  4.2155e-03]], requires_grad=True)


In [7]:
# Choose an optimizer
optimizer = optim.SGD(net.parameters(), lr = 0.003, momentum=0.9)

# Choose a loss function
criterion = nn.CrossEntropyLoss()
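
To see what the `momentum` argument changes, the following sketch runs `optim.SGD` on a single scalar parameter and reproduces the same updates by hand. The toy loss `w ** 2` and the names `w_manual` and `velocity` are assumptions made for this example; the manual rule shown corresponds to the optimizer's default settings (no dampening, no weight decay, no Nesterov).

```python
import torch
from torch import optim

lr, momentum = 0.003, 0.9

# One scalar parameter with loss = w ** 2, so the gradient is 2 * w
w = torch.tensor(1.0, requires_grad=True)
optimizer = optim.SGD([w], lr=lr, momentum=momentum)

# Manual bookkeeping of the same update rule
w_manual, velocity = 1.0, 0.0

for step in range(3):
    optimizer.zero_grad()
    loss = w ** 2
    loss.backward()
    optimizer.step()

    grad = 2 * w_manual
    velocity = momentum * velocity + grad  # accumulate a running direction
    w_manual = w_manual - lr * velocity    # step along that direction

# The two versions should agree up to floating-point rounding
print(abs(w.item() - w_manual) < 1e-5)
```

Because the velocity keeps part of the previous gradients, consecutive steps in the same direction get larger, which often speeds up training compared with plain SGD.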

## Creating the Training Loop
With our network, optimizer, and loss function, we can now begin the training step! 
By evaluating on the test set after every epoch, we can see when the network fits best, that is, the point where the validation loss stops improving.

In [None]:
num_epochs = 15

# Establish a list for our history
train_loss_history = list()
val_loss_history = list()

for epoch in range(num_epochs):
    net.train()
    train_loss = 0.0
    train_correct = 0
    for i, data in enumerate(train_loader):
        # data is a list of [inputs, labels]
        inputs, labels = data

        # Pass to GPU if available.
        if torch.cuda.is_available():
            inputs, labels = inputs.cuda(), labels.cuda()

        # Zero out the gradients of the optimizer
        optimizer.zero_grad()

        # Get the outputs of your model and compute your loss
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        
        # Compute the loss gradient using the backward method and have the optimizer take a step
        loss.backward()
        optimizer.step()

        # Count the correct predictions and accumulate the loss
        _, preds = torch.max(outputs, 1)
        train_correct += (preds == labels).sum().item()
        train_loss += loss.item()
    # Report accuracy as a percentage of all training samples
    print(f'Epoch {epoch + 1} training accuracy: {100 * train_correct/len(train_loader.dataset):.2f}% training loss: {train_loss/len(train_loader):.5f}')
    train_loss_history.append(train_loss/len(train_loader))

    # The validation step is done for you.
    val_loss = 0.0
    val_correct = 0
    net.eval()
    # Gradients are not needed for evaluation, so disable tracking to save memory and time
    with torch.no_grad():
        for inputs, labels in test_loader:
            if torch.cuda.is_available():
                inputs, labels = inputs.cuda(), labels.cuda()

            outputs = net(inputs)
            loss = criterion(outputs, labels)

            # Count the correct predictions and accumulate the loss
            _, preds = torch.max(outputs, 1)
            val_correct += (preds == labels).sum().item()
            val_loss += loss.item()
    # Report accuracy as a percentage of all test samples
    print(f'Epoch {epoch + 1} validation accuracy: {100 * val_correct/len(test_loader.dataset):.2f}% validation loss: {val_loss/len(test_loader):.5f}')
    val_loss_history.append(val_loss/len(test_loader))
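
The accuracy bookkeeping in the loop can be checked by hand on a tiny batch. The sketch below uses made-up logits and labels to show how predicted classes are obtained from the outputs and turned into a percentage.

```python
import torch

# Made-up logits for a batch of 4 samples and 3 classes
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.3, 0.2, 0.9],
                        [1.1, 0.4, 0.2]])
labels = torch.tensor([0, 1, 1, 0])

# The predicted class is the index of the largest logit in each row
_, preds = torch.max(outputs, 1)

# Comparing gives a boolean tensor; summing it counts the matches
correct = (preds == labels).sum().item()
accuracy = 100 * correct / len(labels)

print(correct, accuracy)  # 3 75.0
```

Counting correct predictions and dividing by the total number of samples, rather than averaging per-batch accuracies, keeps the result exact even when the last batch is smaller than the others.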

In [None]:
# Plot the training and validation loss history
plt.plot(train_loss_history, label="Training Loss")
plt.plot(val_loss_history, label="Validation Loss")
plt.legend()
plt.show()
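
Besides reading the plot, the recorded histories can be inspected directly to find the epoch with the lowest validation loss. The sketch below uses hypothetical loss values that stand in for the lists filled during training; they are not results from this notebook.

```python
# Hypothetical per-epoch losses, standing in for the histories recorded during training
train_loss_history = [1.90, 1.60, 1.40, 1.25, 1.10, 0.95]
val_loss_history = [1.85, 1.62, 1.50, 1.47, 1.49, 1.55]

# Index of the epoch with the lowest validation loss (0-based)
best_index = min(range(len(val_loss_history)), key=val_loss_history.__getitem__)
best_epoch = best_index + 1  # epochs are reported starting from 1

print(f"Lowest validation loss {val_loss_history[best_index]:.2f} at epoch {best_epoch}")
# Lowest validation loss 1.47 at epoch 4
```

In this made-up example the training loss keeps falling after epoch 4 while the validation loss rises, which is the usual sign that the network has started to overfit.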