<a href="https://colab.research.google.com/gist/khaiyichin/21f114d7f06027eabaaa407c5e942174/test.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Neural network tutorial with PyTorch
This brief tutorial shows how to:

1. obtain prepared data offered by PyTorch,
2. set up a neural network class,
3. train the neural network, and
4. save and load models to continue training on different devices.

Note: the GPU steps in this tutorial (Steps 3 and 5) require a CUDA-capable GPU; for easy GPU access, run this notebook in Google Colab. Once in Colab, go to `Edit` > `Notebook Settings`, which opens a pop-up. In the pop-up, select `GPU` as the `Hardware accelerator`.

---

The materials here are adapted from a tutorial provided by the WPI DS595: Reinforcement Learning class on using PyTorch.

Reference on saving/loading models: https://debuggercafe.com/effective-model-saving-and-resuming-training-in-pytorch/

In [None]:
import numpy as np
import random

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

import os

### Step 1. Prepare data.
Two things to keep in mind when preprocessing the data for training:
* Tensor shape
* Tensor datatype

Download the CIFAR-10 data:

In [None]:
# Prepare Data
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

print('original type:',type(trainset.data))
print('original shape:',trainset.data.shape)

Files already downloaded and verified
original type: <class 'numpy.ndarray'>
original shape: (50000, 32, 32, 3)
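
To make the shape and datatype points concrete, the sketch below reproduces what `ToTensor` and `Normalize` do to a single image, using plain PyTorch operations. The random array is a synthetic stand-in for one CIFAR-10 image (an assumption for illustration, so nothing needs to be downloaded), and the variable names are hypothetical.

```python
import numpy as np
import torch

# Synthetic stand-in for one CIFAR-10 image: (height, width, channels), uint8
rng = np.random.default_rng(0)
image_hwc = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# What transforms.ToTensor() does: channels first, float32 values in [0, 1]
image_chw = torch.from_numpy(image_hwc).permute(2, 0, 1).float() / 255.0

# What transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) does:
# (x - mean) / std for each channel, which maps [0, 1] to [-1, 1]
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
normalized = (image_chw - mean) / std

print(normalized.shape, normalized.dtype)
print(normalized.min().item(), normalized.max().item())
```

The raw dataset array is stored as (N, H, W, C) `uint8`, whereas the tensors delivered by `trainloader` are (N, C, H, W) `float32`, which is the layout and type the convolution layers expect.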


Example of how to create a minibatch manually:

In [None]:
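def batch(data, batch_size):
    # Draw batch_size images at random (without replacement)
    minibatch = random.sample(data, batch_size)
    # Reorder from (N, H, W, C) to the (N, C, H, W) layout PyTorch expects
    minibatch = np.array(minibatch).transpose(0, 3, 1, 2)
    # Scale pixel values to [0, 1]; NumPy division yields float64
    minibatch = torch.tensor(minibatch / 255.0)
    return minibatch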

minibatch = batch(list(trainset.data),32)
print(minibatch.shape)
print(minibatch.dtype)

torch.Size([32, 3, 32, 32])
torch.float64
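
The minibatch above is `float64` because NumPy division produces double precision, while the parameters of PyTorch layers are `float32` by default, so such a batch generally has to be cast before it is passed to a network. A minimal sketch, using a random array as a stand-in for the CIFAR-10 images and a single convolution layer as a stand-in for the network (both are assumptions for illustration):

```python
import numpy as np
import torch
import torch.nn as nn

# Synthetic stand-in for 32 CIFAR-10 images in (N, H, W, C) uint8 layout
raw = np.random.randint(0, 256, size=(32, 32, 32, 3), dtype=np.uint8)

# Same preprocessing as the batch() example: channels first, scaled to [0, 1]
minibatch = torch.tensor(raw.transpose(0, 3, 1, 2) / 255.0)
print(minibatch.dtype)  # double precision inherited from NumPy

# Cast to float32 so the input matches the layer's parameter dtype
minibatch32 = minibatch.float()

conv = nn.Conv2d(3, 6, 5)  # stand-in for the first layer of a network
out = conv(minibatch32)
print(minibatch32.dtype, out.shape)
```

An alternative is to pass `dtype=torch.float32` to `torch.tensor` when the minibatch is created.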


### Step 2. Define a neural network.

We define our desired neural network architecture as a class that inherits from `nn.Module`; this is so that core PyTorch neural network methods are available to us.

In the `__init__` method, the parent class `nn.Module` is initialized first through the `super` call, and the desired neural network blocks are defined afterwards.

In the `forward` method, the network structure is set up based on the defined blocks in `__init__`. This is the method that executes forward propagation in the network.

In [None]:
class Net(nn.Module):
    """Neural network class.
    """

    def __init__(self):
        super(Net, self).__init__() # initialize nn.Module before defining layers

        # Define layers
        self.conv1 = nn.Conv2d(3, 6, 5) # convolves 3 -> 6 channels with a 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2) # pools 2x2 elements of the input with a stride of 2
        self.conv2 = nn.Conv2d(6, 16, 5) # convolves 6 -> 16 channels with a 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120) # fully connected layer with 120 nodes
        self.fc2 = nn.Linear(120, 84) # fully connected layer with 84 nodes
        self.fc3 = nn.Linear(84, 10) # fully connected layer with 10 nodes

    def forward(self, x):
        """Executes forward propagation of the network.

        Processes x to generate a prediction.

        This method should not be called directly; instead, call the model
        instance as a function, e.g.,
            net = Net()
            ...
            y = net(x) # will output prediction based on input tensor x

        Args:
            x: A PyTorch tensor input.

        Returns:
            A prediction based on the input x.
        """
        # Convolution block 1
        x = self.pool(F.relu(self.conv1(x))) # 32x32x3 -> 28x28x6 -> 14x14x6

        # Convolution block 2
        x = self.pool(F.relu(self.conv2(x))) # 14x14x6 -> 10x10x16 -> 5x5x16

        # Flatten each sample into a 1D vector of 16 * 5 * 5 = 400 features
        x = x.view(-1, 16 * 5 * 5)

        # Fully connected layers
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))

        # Output layer
        x = self.fc3(x)

        return x
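
The sizes quoted in the `forward` comments, and the `16 * 5 * 5` input size of `fc1`, follow from the usual output-size arithmetic for convolution and pooling layers without dilation. The helper below is a hypothetical illustration of that arithmetic, not part of PyTorch:

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                                       # CIFAR-10 images are 32x32
size = output_size(size, kernel=5)              # conv1 (5x5 kernel): 28
size = output_size(size, kernel=2, stride=2)    # 2x2 max pool, stride 2: 14
size = output_size(size, kernel=5)              # conv2 (5x5 kernel): 10
size = output_size(size, kernel=2, stride=2)    # 2x2 max pool, stride 2: 5

flat_features = 16 * size * size                # 16 channels after conv2
print(size, flat_features)                      # 5 400
```

If the kernel sizes or the input resolution change, the first argument of `fc1` and the flattening step have to be updated to match.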

### Step 3. Train on GPU, then save model.

Check whether a GPU device is available:

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assuming that we are on a CUDA machine, this should print a CUDA device:
print(device)

cuda:0
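
Both the model and every input tensor have to be on the same device before a forward pass. The sketch below illustrates this with a small linear layer as a stand-in for the network (an assumption for illustration); it falls back to the CPU when no GPU is present.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# nn.Module.to() moves the parameters in place and returns the module itself
layer = nn.Linear(4, 2).to(device)

# Tensor.to() returns a tensor on the target device, so the result must be kept
x = torch.randn(3, 4)
x = x.to(device)

y = layer(x)
print(next(layer.parameters()).device, x.device, y.device)
```

This is why the training loops in this tutorial call `.to(device)` on each batch of inputs and labels, not only on the network.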


Train the network on the CIFAR-10 dataset for 1 epoch using the GPU, then save the trained model:

In [None]:
import time

print("Training with GPU for the 1st epoch")

# Initialize neural network model and optimizer
gpu_net = Net().to(device)
gpu_optimizer = optim.SGD(gpu_net.parameters(), lr=0.001, momentum=0.9)
gpu_criterion = nn.CrossEntropyLoss()

t = time.process_time()

# Train model for 1 epoch
for epoch in range(1):  # can loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        ####################################################
        # inputs: tensor of shape (batch_size, channels, H, W)
        # labels: tensor of shape (batch_size,)
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        gpu_optimizer.zero_grad()

        # forward + backward + optimize
        outputs = gpu_net(inputs)
        loss = gpu_criterion(outputs, labels)
        loss.backward()
        gpu_optimizer.step()
        ####################################################
        
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training in ' + str(time.process_time() - t) + ' s')

# Save model, optimizer and loss function
PATH_1 = './test_model.pth'

torch.save({
            'model_state_dict': gpu_net.state_dict(),
            'optimizer_state_dict': gpu_optimizer.state_dict(),
            'loss': gpu_criterion,
            }, PATH_1)

Training with GPU for the 1st epoch
[1,  2000] loss: 2.191
[1,  4000] loss: 1.836
[1,  6000] loss: 1.661
[1,  8000] loss: 1.569
[1, 10000] loss: 1.492
[1, 12000] loss: 1.467
Finished Training in 69.98760816800001 s
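
The durations printed in this tutorial come from `time.process_time()`, which reports CPU time used by the current process and excludes time spent sleeping or waiting. It is therefore not elapsed wall-clock time, and it likely does not account for work done in the separate `DataLoader` worker processes. The sketch below contrasts it with `time.perf_counter()`; the `time.sleep` call is only a stand-in for time spent waiting (an assumption for illustration).

```python
import time

wall_start = time.perf_counter()   # wall-clock timer
cpu_start = time.process_time()    # CPU time of this process only

time.sleep(0.2)  # stand-in for waiting, e.g. on a GPU or on data workers

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall-clock: {wall_elapsed:.3f} s, CPU time: {cpu_elapsed:.3f} s")
```

The reported GPU and CPU training times are best read as rough indications and not as a direct speed comparison.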


### Step 4. Load model to train on CPU, then save it.

In [None]:
# Load model to train with CPU
print("Training with CPU for the 2nd epoch")
device = torch.device("cpu")

# Initialize empty containers to load our model and optimizer to
cpu_net = Net().to(device)
cpu_optimizer = optim.SGD(cpu_net.parameters(), lr=0.001, momentum=0.9)
cpu_criterion = nn.CrossEntropyLoss()

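# Load required objects. The checkpoint also holds the pickled loss object, so
# PyTorch releases where torch.load defaults to weights_only=True (2.6 and later)
# need weights_only=False here; only do that for checkpoints you trust.
checkpoint = torch.load(PATH_1, map_location=device)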

cpu_net.load_state_dict(checkpoint['model_state_dict'])
cpu_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
cpu_criterion = checkpoint['loss']
cpu_net.train() # set model to training mode

t = time.process_time()

# Continue training model for another epoch
for epoch in range(1):  # can loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        ####################################################
        # inputs: tensor of shape (batch_size, channels, H, W)
        # labels: tensor of shape (batch_size,)
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        cpu_optimizer.zero_grad()

        # forward + backward + optimize
        outputs = cpu_net(inputs)
        loss = cpu_criterion(outputs, labels)
        loss.backward()
        cpu_optimizer.step()
        ####################################################
        
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training in ' + str(time.process_time() - t) + ' s')

# Save model, optimizer and loss function
PATH_2 = './test_model_2.pth'

torch.save({
            'model_state_dict': cpu_net.state_dict(),
            'optimizer_state_dict': cpu_optimizer.state_dict(),
            'loss': cpu_criterion,
            }, PATH_2)

Training with CPU for the 2nd epoch
[1,  2000] loss: 1.386
[1,  4000] loss: 1.365
[1,  6000] loss: 1.359
[1,  8000] loss: 1.321
[1, 10000] loss: 1.289
[1, 12000] loss: 1.290
Finished Training in 61.85360124400006 s
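
`nn.CrossEntropyLoss` has no learned state, so a checkpoint can also be restricted to the two state dictionaries and the loss function simply recreated after loading. The sketch below shows such a round trip with a small linear model as a stand-in for `Net`. The file name is hypothetical, and the `weights_only=True` argument assumes PyTorch 1.13 or later.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(4, 2)  # stand-in for Net
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Take one optimization step so the optimizer holds momentum buffers
loss = nn.CrossEntropyLoss()(model(torch.randn(8, 4)), torch.randint(0, 2, (8,)))
loss.backward()
optimizer.step()

# Save only the state dictionaries (tensors and plain Python values)
path = os.path.join(tempfile.mkdtemp(), "sketch_checkpoint.pth")
torch.save({"model_state_dict": model.state_dict(),
            "optimizer_state_dict": optimizer.state_dict()}, path)

# Restore into freshly constructed objects, mapping tensors to the CPU
restored_model = nn.Linear(4, 2)
restored_optimizer = optim.SGD(restored_model.parameters(), lr=0.001, momentum=0.9)
checkpoint = torch.load(path, map_location="cpu", weights_only=True)
restored_model.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
criterion = nn.CrossEntropyLoss()  # stateless, so it is recreated instead of loaded

same_weights = all(torch.equal(a, b) for a, b in
                   zip(model.parameters(), restored_model.parameters()))
print(same_weights)
```

Restoring the optimizer state matters when training resumes: with `momentum=0.9`, the saved momentum buffers let the next update continue from where the previous epoch stopped.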


### Step 5. Load model to train on GPU for the 3rd and final epoch.

In [None]:
# Load model to train with GPU
print("Training with GPU for the 3rd epoch")
device = torch.device("cuda:0")

# Initialize empty containers to load our model and optimizers to
final_net = Net().to(device)
final_optimizer = optim.SGD(final_net.parameters(), lr=0.001, momentum=0.9)
final_criterion = nn.CrossEntropyLoss()

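# Load required objects. The checkpoint also holds the pickled loss object, so
# PyTorch releases where torch.load defaults to weights_only=True (2.6 and later)
# need weights_only=False here; only do that for checkpoints you trust.
checkpoint = torch.load(PATH_2, map_location=device)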

final_net.load_state_dict(checkpoint['model_state_dict'])
final_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
final_criterion = checkpoint['loss']
final_net.train() # set model to training mode

t = time.process_time()

# Train model for another epoch
for epoch in range(1):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        ####################################################
        # inputs: tensor of shape (batch_size, channels, H, W)
        # labels: tensor of shape (batch_size,)
        inputs, labels = data[0].to(device), data[1].to(device)

        # zero the parameter gradients
        final_optimizer.zero_grad()

        # forward + backward + optimize
        outputs = final_net(inputs)
        loss = final_criterion(outputs, labels)
        loss.backward()
        final_optimizer.step()
        ####################################################
        
        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training in ' + str(time.process_time() - t) + ' s')

Training with GPU for the 3rd epoch
[1,  2000] loss: 1.226
[1,  4000] loss: 1.214
[1,  6000] loss: 1.223
[1,  8000] loss: 1.210
[1, 10000] loss: 1.210
[1, 12000] loss: 1.179
Finished Training in 70.29780777899998 s
