DATALOADER TRAINING

This is the basic training script for the PyTorch-based machine learning model.

'torch' and 'torchvision' are the main libraries used here; both belong to the PyTorch project. Several submodules are imported separately for convenience, which leaves some redundant imports that could be cleaned up.


The 'device' variable selects where the model runs: the first CUDA GPU if one is available, otherwise the CPU. Training on a CUDA GPU is typically many times faster.

In [None]:
from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
from torch.utils.data import DataLoader
import torch
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Assuming that we are on a CUDA machine, this should print a CUDA device:

print(device)

This next section sets up the dataset using the 'ImageFolder' class from torchvision. It expects a root folder containing one subfolder per class, and turns it into a PyTorch dataset whose images are converted to tensors.  
Next, the dataset is randomly split into two subsets, train and validation. Their sizes are set by the 'lengths' variable, currently 80% and 20% of the data respectively.  
The two subsets are then wrapped in DataLoaders, which control the batch size, shuffling and number of parallel workers.

In [None]:
base = 'C:\\Users\\zshal\\Documents\\Pytorch\\Images\\Images\\v3'
data = ImageFolder(root=base, transform=ToTensor())
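train_len = int(len(data) * 0.8)
# Give the remainder to validation so the two lengths always sum to len(data)
lengths = (train_len, len(data) - train_len)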
trainsplit, testsplit = torch.utils.data.random_split(data, lengths)

trainimages = DataLoader(trainsplit, batch_size=4, shuffle=True, num_workers=0)
testimages = DataLoader(testsplit, batch_size=4, shuffle=False, num_workers=0)


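# Note: this normalising transform is defined but is not passed to ImageFolder above,
# so the images are only converted with ToTensor()
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5),
                                                     (0.5, 0.5, 0.5))])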


classes = ('pothole', 'no pothole')
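
When both split lengths are computed by truncating a fraction of len(data), they can add up to less than len(data), and random_split rejects integer lengths that do not sum to the dataset size. A minimal sketch of one way to compute lengths that always sum correctly; the helper name split_lengths is hypothetical and not part of PyTorch:

```python
def split_lengths(n_samples, train_fraction=0.8):
    """Return (train_len, val_len) that always sum to n_samples."""
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    train_len = int(n_samples * train_fraction)
    # The validation subset takes whatever is left over
    return train_len, n_samples - train_len


# Truncating both fractions loses a sample for a dataset of 11 images
naive = (int(11 * 0.8), int(11 * 0.2))
safe = split_lengths(11)
print(naive, sum(naive))
print(safe, sum(safe))
```

The returned tuple can be passed as the lengths argument of torch.utils.data.random_split.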

In this next section the model itself is described as a class. The '__init__' method sets up the layers that will be used, and the 'forward' method shows how they fit together. This example has been kept basic, but should give a decent balance of accuracy and training speed.

In [None]:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 12, 3)
        self.pool = nn.MaxPool2d(2, 2)
        #self.conv2 = nn.Conv2d(12, 12, 3)
        #self.fc1 = nn.Linear(555984, 120)
        self.fc1 = nn.Linear( 12*399*471, 360)
        self.fc2 = nn.Linear( 360, 120)
        self.fc3 = nn.Linear(120, 84)
        self.fc4 = nn.Linear(84, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        #x = self.pool(F.relu(self.conv2(x)))
        #x = x.view(-1, 12*198*234)
        x = x.view(-1, 12*399*471)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return x


net = Net()
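
The input size of the first fully connected layer, 12*399*471, has to match the flattened output of the convolution and pooling layers. The sketch below reproduces that number with plain arithmetic. The 800 x 944 input size is an assumption inferred from the layer sizes (801 x 945 would give the same result), and the helper names are hypothetical:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Output length of a max-pool along one dimension (no padding)."""
    return (size - kernel) // stride + 1


height, width = 800, 944  # assumed image size, not stated in the script
channels = 12             # output channels of conv1

# conv1 (3x3 kernel) followed by a 2x2 max-pool with stride 2
h1 = pool_out(conv_out(height, 3), 2, 2)
w1 = pool_out(conv_out(width, 3), 2, 2)
features_one_block = channels * h1 * w1

# Enabling the commented-out conv2 + pool would shrink the feature map again
h2 = pool_out(conv_out(h1, 3), 2, 2)
w2 = pool_out(conv_out(w1, 3), 2, 2)
features_two_blocks = channels * h2 * w2

print(h1, w1, features_one_block)
print(h2, w2, features_two_blocks)
```

Under that assumption the two results match the sizes used in the active and commented-out nn.Linear lines, so a different image size would require recomputing them the same way.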

This is where the loss function and optimiser are set. These can be swapped for other loss functions and optimisers to change how the model trains.

In [None]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
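
For reference, nn.CrossEntropyLoss applies a log-softmax to the raw outputs of the last layer and then takes the negative log-likelihood of the correct class, which is why 'forward' ends without a softmax. A minimal pure-Python sketch of that calculation for a single sample; the function name and the example logits are made up for illustration:

```python
import math


def cross_entropy(logits, label):
    """Negative log-softmax of the correct class for one sample."""
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Two equal logits mean a 50/50 guess, so the loss is ln(2)
print(round(cross_entropy([0.0, 0.0], 0), 4))

# A confident correct prediction gives a lower loss than a confident wrong one
print(cross_entropy([2.0, -1.0], 0) < cross_entropy([2.0, -1.0], 1))
```

By default nn.CrossEntropyLoss averages this per-sample value over the mini-batch.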

This is the training loop. It should be relatively self-explanatory; the number of epochs can be changed in the outer for loop. The printing style can also be changed if more detailed logs are required.

In [None]:
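net.to(device)  # run training on the device selected at the top of the script

for epoch in range(6):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainimages, 0):
        # get the inputs; data is a list of [inputs, labels], moved to the model's device
        inputs, labels = data[0].to(device), data[1].to(device)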

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        print(i)
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
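
With a small dataset the loop may never reach 2000 mini-batches in an epoch, in which case no loss is ever printed. One possible alternative is to average the loss over a shorter window and also report any leftover batches at the end of the epoch. A minimal sketch using a plain list of loss values; the helper name windowed_means and the numbers are hypothetical:

```python
def windowed_means(losses, every):
    """Return (batch_number, mean_loss) for each window of `every` batches.

    A final shorter window is reported too, so no batches are silently dropped.
    """
    if every < 1:
        raise ValueError("every must be at least 1")
    results = []
    running, count = 0.0, 0
    for i, loss in enumerate(losses, 1):
        running += loss
        count += 1
        if count == every:
            results.append((i, running / count))
            running, count = 0.0, 0
    if count:
        # Leftover batches that did not fill a whole window
        results.append((len(losses), running / count))
    return results


for batch, mean_loss in windowed_means([1.0, 2.0, 3.0, 4.0, 5.0], every=2):
    print('[batch %5d] loss: %.3f' % (batch, mean_loss))
```

The same bookkeeping can be done inline in the training loop by replacing the fixed 2000 with a smaller interval and printing the remaining running loss after the inner loop ends.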

Here the trained weights are saved to a .pth file. This file can be loaded into other scripts, so it is very important that it is saved properly. Loading it back is also shown here as an example.

In [None]:
PATH = './conv1CrossValidation6.pth'
torch.save(net.state_dict(), PATH)

net = Net()
net.load_state_dict(torch.load(PATH))
net.to(device)




correct = 0
total = 0

Finally, the trained net is run on the validation dataset to check the overall accuracy and then the accuracy for each of the two classes.

In [None]:
with torch.no_grad():
    for data in testimages:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the %d validation images: %d %%' % (
    total, 100 * correct / total))

In [None]:
class_correct = list(0. for i in range(2))
class_total = list(0. for i in range(2))
with torch.no_grad():
    for data in testimages:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # iterate over every sample in the batch (the last batch may be smaller)
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1


for i in range(2):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
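
The class names printed above are only correct if the order of the 'classes' tuple matches the class indices that ImageFolder assigned. ImageFolder sorts the class subfolder names alphabetically and numbers them from 0; the actual mapping is available as data.class_to_idx. The sketch below mimics that ordering in plain Python. The folder names are an assumption, since the contents of the image folder are not shown:

```python
# Assumed subfolder names; the real ones depend on the image folder
folder_names = ['pothole', 'no pothole']

# Mimic ImageFolder: sort the names, then number them from 0
class_to_idx = {name: idx for idx, name in enumerate(sorted(folder_names))}
print(class_to_idx)

# A classes tuple whose positions line up with the label indices
classes_in_label_order = tuple(sorted(folder_names))
print(classes_in_label_order)
```

With these assumed names, index 0 would be 'no pothole', so the per-class accuracies would be printed under the wrong names unless 'classes' is reordered. Using data.classes directly avoids the mismatch.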