<h1>Image Classifier</h1>
Module Imports

In [2]:
import torch
from torch import nn
import torch.nn.functional as F
from torch import optim
from torchvision import datasets, transforms
from torch.utils.data.dataloader import DataLoader
from torch.utils.data import random_split

<h2>CUDA Test</h2>
Select the CUDA device if one is available; otherwise fall back to the CPU.

In [3]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device

device(type='cuda')

<h2>Neural Network Model Class</h2>
The class that defines the neural network.
conv_layer is the convolutional part of the network. It is made up of three blocks, each stacking two convolutions with ReLU activations followed by max pooling; in_channels and out_channels give the number of feature maps entering and leaving each convolution.
fc_layer is the fully connected part. It takes the flattened feature maps from conv_layer and maps them to one score for each of the 10 classes.
forward passes an input batch, x, through conv_layer, flattens the result, and passes it through fc_layer to produce the output.

In [4]:
#flexible model
class CIFAR10Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv_layer = nn.Sequential(
            # Conv Layer block 1
            nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),

            # Conv Layer block 2
            nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Dropout2d(p=0.05),

            # Conv Layer block 3
            nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(inplace=True),
            nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )

        self.fc_layer = nn.Sequential(
            nn.Dropout(p=0.1),
            nn.Linear(4096, 1024),
            nn.ReLU(inplace=True),
            nn.Linear(1024, 512),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.1),
            nn.Linear(512, 10)
        )
        
    def forward(self, x):
        # conv layers
        x = self.conv_layer(x)
        
        # flatten
        x = x.view(x.size(0), -1)
        
        # fc layer
        x = self.fc_layer(x)

        return x


model = CIFAR10Model().to(device)
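
The 4096 in nn.Linear(4096, 1024) follows from the layer settings above. The sketch below traces the feature-map size for a 32x32 input using the standard output-size formulas; conv_out and pool_out are hypothetical helper names introduced only for this illustration.

```python
# Hypothetical helpers (not part of the notebook) that track the side length
# of a square feature map through the three conv blocks.

def conv_out(size, kernel_size=3, padding=1, stride=1):
    """Side length after a square convolution."""
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_out(size, kernel_size=2, stride=2):
    """Side length after a square max-pool without padding."""
    return (size - kernel_size) // stride + 1

size = 32                      # CIFAR10 images are 32x32
for _ in range(3):             # each block: conv, conv, max-pool
    size = conv_out(conv_out(size))
    size = pool_out(size)

channels = 256                 # out_channels of the last convolution
flat_features = channels * size * size
print(size, flat_features)     # 4 4096
```

With kernel_size=3 and padding=1 the convolutions keep the side length unchanged, so only the three pooling steps shrink it: 32, 16, 8, 4.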

<h2>Optimiser and Loss Functions</h2>
PyTorch's built-in classes provide the optimiser (stochastic gradient descent with a learning rate of 1e-2) and the loss function (cross-entropy) used to train the neural network.

In [5]:
#define optimiser
params = model.parameters()
optimiser = optim.SGD(params, lr=1e-2)
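
As a rough illustration of what one optimiser step does, plain SGD without momentum or weight decay moves each parameter against its gradient, scaled by the learning rate. The sketch below uses plain Python floats; sgd_step and the example values are made up for illustration and are not part of the notebook.

```python
# Hypothetical stand-in for one step of plain SGD (no momentum, no weight decay).

def sgd_step(params, grads, lr=1e-2):
    """Return the parameters after one update: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]

params = [0.5, -1.0]           # made-up parameter values
grads = [2.0, -4.0]            # made-up gradients of the loss
updated = sgd_step(params, grads)
print(updated)                 # roughly [0.48, -0.96]
```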


In [6]:
#define loss
loss = nn.CrossEntropyLoss()
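
nn.CrossEntropyLoss takes the raw scores (logits) from the network, so the model needs no softmax layer of its own. A minimal sketch of the per-sample computation in plain Python follows, assuming the default mean reduction over the batch; the function names are hypothetical.

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax of the target class for one sample."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def batch_cross_entropy(batch_logits, targets):
    """Mean of the per-sample losses, matching the default reduction."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

uniform = [0.0] * 10                     # an undecided model over 10 classes
confident = [5.0] + [0.0] * 9            # strongly favours class 0
print(cross_entropy(uniform, 3))         # ln(10), about 2.303
print(cross_entropy(confident, 0) < cross_entropy(uniform, 0))  # True
```

An untrained 10-class model therefore starts near a loss of ln(10), which gives a reference point for the losses printed during training.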

<h2>Load training and validation data</h2>
The CIFAR10 dataset is loaded here. Each 32x32 pixel image is converted to a tensor, and each colour channel is normalised with mean 0.5 and standard deviation 0.5 so that the image can be passed as input to the neural network. 5000 of the training images are held out as a validation set.

In [11]:
#train, val, label
transform = transforms.Compose(
    [
        transforms.ToTensor(),
        transforms.Normalize((0.5,0.5,0.5), (0.5,0.5,0.5)),
    ]
)

train_data = datasets.CIFAR10(root='./.data', train=True, download=True, transform=transform)

val_size = 5000
batch_size = 4
train, val = random_split(train_data, [len(train_data)-val_size, val_size])
train_loader = DataLoader(train, batch_size=batch_size, num_workers=2)
val_loader = DataLoader(val, batch_size=batch_size, num_workers=2)

# mapping from label to english description
classes = train_data.classes
classes

Files already downloaded and verified


['airplane',
 'automobile',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck']
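
The effect of the transform can be checked by hand. ToTensor scales 8-bit pixel values into [0, 1], and Normalize with mean 0.5 and standard deviation 0.5 then maps that range to [-1, 1]. The sketch below repeats the arithmetic in plain Python and also works out the number of batches, assuming the standard 50000-image CIFAR10 training split; normalise is a hypothetical helper name.

```python
def normalise(value, mean=0.5, std=0.5):
    """Per-channel normalisation: (value - mean) / std."""
    return (value - mean) / std

raw = [0, 128, 255]                         # example 8-bit pixel values
scaled = [v / 255 for v in raw]             # range produced by ToTensor
normalised = [normalise(v) for v in scaled]
print(normalised[0], normalised[-1])        # -1.0 1.0

# Split sizes, assuming the standard CIFAR10 training split of 50000 images.
total_images, val_size, batch_size = 50000, 5000, 4
train_batches = (total_images - val_size) // batch_size
val_batches = val_size // batch_size
print(train_batches, val_batches)           # 11250 1250
```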

<h2>Training</h2>
This is where the model is trained. num_epochs is the number of passes over the training data. After each epoch the loop prints the loss, which measures how far the model's outputs were from the correct classes, and the accuracy, which is the fraction of images classified correctly.
Each epoch has two phases. train_loader is used to adjust the network's weights via gradient descent as implemented by PyTorch. val_loader is then used to evaluate the model without changing the weights, to check that it is not overfitting to the training data.

In [7]:
#training and validation loop

num_epochs = 8
for epoch in range(num_epochs):
    losses = list()
    accuracies = list()
    for batch in train_loader:
        x, y = batch[0].to(device), batch[1].to(device)
        l = model(x)
        J = loss(l, y)
        model.zero_grad()
        J.backward()
        optimiser.step()
        
        losses.append(J.item())
        accuracies.append(y.eq(l.detach().argmax(dim=1)).float().mean())
    print(f'Epoch {epoch + 1}, train loss: {torch.tensor(losses).mean():.2f}, train acc: {torch.tensor(accuracies).mean():.2f}')

    losses = list()
    accuracies = list()
    for batch in val_loader:
        x, y = batch[0].to(device), batch[1].to(device)
        # no gradients are needed for validation; y is already on the device
        with torch.no_grad():
            l = model(x)
            J = loss(l, y)
        
        losses.append(J.item())
        accuracies.append(y.eq(l.detach().argmax(dim=1)).float().mean())
    print(f'Epoch {epoch + 1}, val loss: {torch.tensor(losses).mean():.2f}, val acc: {torch.tensor(accuracies).mean():.2f}')
    

Epoch 1, train loss: 1.31, train acc: 0.53
Epoch 1, val loss: 0.88, val acc: 0.69
Epoch 2, train loss: 0.79, train acc: 0.72
Epoch 2, val loss: 0.71, val acc: 0.75
Epoch 3, train loss: 0.58, train acc: 0.80
Epoch 3, val loss: 0.70, val acc: 0.77
Epoch 4, train loss: 0.42, train acc: 0.85
Epoch 4, val loss: 0.67, val acc: 0.78
Epoch 5, train loss: 0.30, train acc: 0.89
Epoch 5, val loss: 0.74, val acc: 0.78
Epoch 6, train loss: 0.22, train acc: 0.92
Epoch 6, val loss: 0.77, val acc: 0.79
Epoch 7, train loss: 0.17, train acc: 0.94
Epoch 7, val loss: 0.83, val acc: 0.79
Epoch 8, train loss: 0.13, train acc: 0.95
Epoch 8, val loss: 0.92, val acc: 0.78
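
The model contains BatchNorm2d and Dropout layers, which behave differently in training and evaluation mode. The loop above leaves the model in its default training mode, so the validation figures were computed with dropout active and batch statistics in use. A common pattern is to call model.train() before the training batches and model.eval() before the validation batches. The sketch below shows the effect on a tiny stand-in network; tiny is hypothetical and is not the CIFAR10Model.

```python
import torch
from torch import nn

torch.manual_seed(0)
# Hypothetical stand-in network with a dropout layer.
tiny = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))
x = torch.ones(2, 8)

tiny.train()                       # dropout randomly zeroes activations
train_a, train_b = tiny(x), tiny(x)

tiny.eval()                        # dropout becomes the identity
with torch.no_grad():
    eval_a, eval_b = tiny(x), tiny(x)

print(tiny.training)                # False
print(torch.equal(eval_a, eval_b))  # True: evaluation is deterministic
```

Applying the same switch to the notebook's loop would change the reported validation numbers, so the results above should be read as training-mode measurements.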


<h2>Save and Load</h2>
The trained model's parameters are exported as a state dictionary to PATH so that they can later be loaded back into a CIFAR10Model.

In [8]:
# Save the model's state dict; TODO: load the dict back into an nn.Module
PATH = './cifar_model.pth'
torch.save(model.state_dict(), PATH)
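
Loading is the mirror image of saving: build a fresh instance of the same class, then copy the saved state dictionary into it with load_state_dict. The sketch below shows the round trip on a small stand-in layer and an in-memory buffer instead of './cifar_model.pth'; both are assumptions made to keep the example self-contained.

```python
import io
import torch
from torch import nn

source = nn.Linear(4, 2)            # hypothetical stand-in for the trained model
buffer = io.BytesIO()               # in-memory stand-in for the .pth file
torch.save(source.state_dict(), buffer)

buffer.seek(0)
restored = nn.Linear(4, 2)          # fresh instance with the same architecture
restored.load_state_dict(torch.load(buffer))
restored.eval()                     # evaluation mode before inference

x = torch.ones(1, 4)
print(torch.equal(source(x), restored(x)))  # True
```

For the notebook's model, the equivalent would be to create a new CIFAR10Model, call load_state_dict(torch.load(PATH)) on it, and move it to device.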

<h2>Load Test Data</h2>
As with the validation set, the test data, which the model has not seen during training, is loaded to measure the accuracy of the trained network.

In [14]:
test_dataset = datasets.CIFAR10(root='./.data', train=False, download=True, transform=transform)
test_loader = DataLoader(test_dataset, batch_size = batch_size, shuffle=True)

Files already downloaded and verified


<h2>Model Testing</h2>
The trained model is run on test_loader, and the accuracy is reported for each class and in total. Because the model has not seen this data before, the result shows how well the training generalised.

In [15]:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
with torch.no_grad():
    for data in test_loader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # iterate over the actual batch size rather than a hard-coded 4
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))

print('\nAccuracy of %5s : %2d %%' % (
        "total", 100 * sum(class_correct) / sum(class_total)))

Accuracy of plane : 86 %
Accuracy of   car : 86 %
Accuracy of  bird : 77 %
Accuracy of   cat : 64 %
Accuracy of  deer : 69 %
Accuracy of   dog : 66 %
Accuracy of  frog : 76 %
Accuracy of horse : 81 %
Accuracy of  ship : 91 %
Accuracy of truck : 87 %

Accuracy of total : 78 %
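
The per-class tally can be sketched in plain Python, independent of the batch size. per_class_accuracy and the example predictions are hypothetical and only illustrate the counting.

```python
def per_class_accuracy(predicted, labels, num_classes):
    """Fraction of correct predictions for each true class (None if unseen)."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for p, y in zip(predicted, labels):
        total[y] += 1
        correct[y] += int(p == y)
    return [c / t if t else None for c, t in zip(correct, total)]

predicted = [0, 1, 1, 2, 2, 2]     # made-up model outputs
labels = [0, 1, 0, 2, 2, 1]        # made-up ground truth
print(per_class_accuracy(predicted, labels, 3))  # [0.5, 0.5, 1.0]
```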
