# Image classifier

Building a simple fully connected NN in PyTorch.

Using my notes from the machine learning module last year: https://github.com/hannahjayneknight/machine-learning 

Also see: https://machinelearningmastery.com/building-an-image-classifier-with-a-single-layer-neural-network-in-pytorch/ 

To do / thoughts:
- Try making my own dataset loader class?
- Reduce the size of the images to speed up training
- Use a CNN
- Further data augmentation: https://towardsdatascience.com/custom-dataset-in-pytorch-part-1-images-2df3152895 

In [2]:
import torch
from torchvision import transforms, datasets
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

NB: Images are 1920 x 1080 px, 2.9 MB, 72 dpi, 32-bit.

Good explanation of why we need to transform: https://www.kaggle.com/code/leifuer/intro-to-pytorch-loading-image-data

In [11]:
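# ToTensor converts a PIL image to a float tensor in [0, 1];
# Normalize then rescales each of the 3 channels to [-1, 1].
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])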

train = datasets.ImageFolder(root='C:/Users/hanna/Desktop/git/interiorcardamage/Data/train', transform=transform)
test  = datasets.ImageFolder(root='C:/Users/hanna/Desktop/git/interiorcardamage/Data/test', transform=transform)

trainset = torch.utils.data.DataLoader(train, batch_size=1, shuffle=True)
testset = torch.utils.data.DataLoader(test, batch_size=1, shuffle=True)

In [12]:
for X, y in testset:
    print(f"Shape of X [N, C, H, W]: {X.shape}")
    print(f"Shape of y: {y.shape} {y.dtype}")
    break

Shape of X [N, C, H, W]: torch.Size([1, 3, 1080, 1920])
Shape of y: torch.Size([1]) torch.int64
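
The shape above is why the first linear layer of the network takes `3*1080*1920` inputs: each image is flattened into a single vector before it reaches the layer. A rough sketch of the arithmetic, using a zero tensor as a stand-in for one batch; the figure of 4 bytes per weight assumes the default `float32`:

```python
import torch

# Stand-in for one batch with the shape printed above: [N, C, H, W].
X = torch.zeros(1, 3, 1080, 1920)

flat = X.view(-1, 3 * 1080 * 1920)   # same flattening used before a linear layer
print(flat.shape)

inputs = flat.shape[1]               # values per image
hidden = 64                          # width of the first hidden layer
first_layer_params = inputs * hidden + hidden   # weights plus biases
print(first_layer_params)
print(round(first_layer_params * 4 / 1e9, 2), "GB as float32")
```

At roughly 398 million parameters, the first layer dominates the model, which is the main reason shrinking the images should speed training up.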


ReLU activations and the Adamax optimizer (a variant of Adam)

In [20]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(3*1080*1920, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 10)  # output layer: one raw score per class (10 should match the number of classes)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        # F.nll_loss expects log-probabilities, so use log_softmax rather than softmax
        return F.log_softmax(x, dim=1)

net = Net()

optimizer = optim.Adamax(net.parameters(), lr=0.001)

for epoch in range(1):
    for X, y in trainset:
        optimizer.zero_grad()  # clear gradients left over from the previous step
        output = net(X.view(-1, 3*1080*1920))  # flatten each image into one long vector
        loss = F.nll_loss(output, y)
        loss.backward()
        optimizer.step()

correct = 0
total = 0

with torch.no_grad():
    for X, y in testset:
        output = net(X.view(-1, 3*1080*1920))
        # count predictions whose highest-scoring class matches the label
        correct += (output.argmax(dim=1) == y).sum().item()
        total += y.size(0)

print("Accuracy: ", round(correct/total, 3))

Accuracy:  0.5
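
`F.nll_loss` expects log-probabilities as input, so it pairs with `F.log_softmax`; the combination is equivalent to calling `F.cross_entropy` on the raw outputs of the last layer. A minimal sketch with made-up scores for a batch of two samples and three classes (the numbers are illustrative only):

```python
import torch
import torch.nn.functional as F

# Made-up raw scores (logits) for 2 samples and 3 classes, with their labels.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 2])

loss_log_softmax = F.nll_loss(F.log_softmax(logits, dim=1), labels)
loss_cross_entropy = F.cross_entropy(logits, labels)
loss_plain_softmax = F.nll_loss(F.softmax(logits, dim=1), labels)

print(loss_log_softmax.item(), loss_cross_entropy.item())  # these two agree
print(loss_plain_softmax.item())  # negative: not a valid cross-entropy value
```

Feeding plain softmax probabilities into `F.nll_loss` still runs without an error, which makes the mismatch easy to miss; the negative loss value is the giveaway.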
