# Let's build an AI to detect robots!

Ceylon & HelloElwin @ HKU Astar

## Installation

### Install PyTorch
```shell
# Python 3.x
pip3 install torch torchvision
```

### Install Jupyter Notebook
```shell
pip install notebook
```

## Basic workflow
* working with data;
* creating models;
* optimizing model parameters;
* testing the model.

In [1]:
import torch
import torch.nn as nn
from torchvision.io import read_image
from torch.utils.data import DataLoader, Dataset
import torch.nn.functional as F
import os

# Size of images
IMG_R = 48
IMG_C = 64

BATCH_SIZE = 8
EPOCH = 20

## Working with data

PyTorch has two primitives to work with data: `torch.utils.data.DataLoader` and `torch.utils.data.Dataset`. 
`Dataset` stores the samples and their corresponding labels, and `DataLoader` wraps an iterable around a `Dataset`.

Member functions our custom `Dataset` class must implement:

- `__init__`

    The `__init__` function runs once, when the `RobotDataset` object is instantiated. It reads every image in the two directories and stores the images and their corresponding labels in lists.

- `__len__`

    The `__len__` function returns the number of samples in our dataset.

- `__getitem__`

    The `__getitem__` function loads and returns a sample from the dataset at the given index `idx`.
Some external functions we use:

- `os.listdir`: get the list of names of all files and directories in the specified directory.

- `read_image`: reads an image file into a `Tensor` (PyTorch's multi-dimensional array type) of shape `(channels, height, width)` with `uint8` values.

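To see how the three member functions and `DataLoader` fit together without needing any image files, here is a minimal sketch. `ToyDataset` is a hypothetical stand-in for `RobotDataset` that fills itself with random tensors of the same shape; it is not part of the original notebook.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Hypothetical stand-in for RobotDataset: random 3x48x64 uint8 'images'."""

    def __init__(self, n_samples):
        # Runs once: build the lists of samples and labels
        self.images = [
            torch.randint(0, 256, (3, 48, 64), dtype=torch.uint8)
            for _ in range(n_samples)
        ]
        self.labels = [i % 2 for i in range(n_samples)]

    def __len__(self):
        # Number of samples in the dataset
        return len(self.images)

    def __getitem__(self, idx):
        # One (image, label) pair
        return self.images[idx], self.labels[idx]


# DataLoader stacks individual samples into batches
toy_loader = DataLoader(ToyDataset(20), batch_size=8, shuffle=False)
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in toy_loader]
print(batch_shapes)
```

With 20 samples and a batch size of 8, the loader yields two full batches and a final batch of 4. The integer labels are collated into a tensor automatically.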

In [2]:
class RobotDataset(Dataset):
    def __init__(self, robots_dir, others_dir):
        self.images = []
        self.labels = []
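        # Robot images: keep every 10th pixel along height and width to shrink them
        for file_name in os.listdir(robots_dir):
            image = read_image(os.path.join(robots_dir, file_name))
            image = image[:, ::10, ::10]
            self.images.append(image)
            self.labels.append(0)  # label 0 = robot
        # Other images are not downsampled, so they presumably already have the target size
        for file_name in os.listdir(others_dir):
            image = read_image(os.path.join(others_dir, file_name))
            self.images.append(image)
            self.labels.append(1)  # label 1 = not a robot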

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]

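The slice `image[:, ::10, ::10]` in the dataset keeps all channels but only every 10th row and column. A small sketch of its effect, assuming the robot photos are 480x640 (this size is a guess inferred from `IMG_R = 48`, `IMG_C = 64` and the stride of 10; the notebook does not state it):

```python
import torch

# Assumed original size: 3 channels, 480 rows, 640 columns
full = torch.zeros(3, 480, 640, dtype=torch.uint8)

# Keep every 10th row and every 10th column; channels are untouched
small = full[:, ::10, ::10]

print(small.shape)  # torch.Size([3, 48, 64])
```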
To accelerate operations such as matrix multiplication in the neural network, we move the model and the data to the GPU when one is available.

This is done with the `to()` method (`torch.Tensor.to()` for tensors). We specify a `device`, e.g. `cuda` (GPU) or `cpu`, and then call `x.to(device)`, which returns the Tensor `x` placed on that `device`.

In [None]:
train_data = RobotDataset('./datasets/train/robots/', './datasets/train/others/')
test_data = RobotDataset('./datasets/test/robots/', './datasets/test/others/')
train_dataloader = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=False)

# Here we select a device for later use
device = "cuda" if torch.cuda.is_available() else "cpu"

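A minimal sketch of moving a tensor to the selected device. The names `x` and `x_dev` are only for illustration. Note that for tensors `to()` returns the result rather than modifying `x` in place, so the return value has to be kept:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.zeros(2, 3)   # created on the CPU by default
x_dev = x.to(device)    # keep the returned tensor; x itself is unchanged

print(x_dev.device.type)
```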
## Creating models
* To define a neural network or any other ML model in PyTorch, we create a class that inherits from `nn.Module`.
* We define the layers of the network in the `__init__` function and specify how data passes through the network in the `forward` function.

In [4]:
class MLP(nn.Module):
    def __init__(self):
        super().__init__() 
        self.n = nn.Sequential(
            nn.Linear(3 * IMG_R * IMG_C, 4608),
            nn.ReLU(),
            nn.Linear(4608, 1152),
            nn.ReLU(),
            nn.Linear(1152, 288),
            nn.ReLU(),
            nn.Linear(288, 64),
            nn.ReLU(),
            nn.Linear(64, 2)
        )

    def forward(self, x):
        # Flatten each image to a vector, keeping the batch dimension
        x = torch.flatten(x, 1)
        x = self.n(x)
        return x

In [5]:
class CNN(nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 4)
        self.conv2 = nn.Conv2d(6, 16, 4)
        self.fc1 = nn.Linear(1872, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 2)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square, you can specify with a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        
        x = torch.flatten(x, 1) # flatten all dimensions except the batch dimension
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

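The input size of `fc1` (1872) is not arbitrary: it is the number of features left after the two convolution and pooling stages. One way to check it, sketched here with stand-alone copies of the two convolution layers and a dummy 48x64 input, is to trace the shapes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer settings as conv1 and conv2 in the CNN class
conv1 = nn.Conv2d(3, 6, 4)
conv2 = nn.Conv2d(6, 16, 4)

x = torch.zeros(1, 3, 48, 64)             # one dummy image

# 4x4 convolution: 48x64 -> 45x61, then 2x2 pooling -> 22x30
x = F.max_pool2d(F.relu(conv1(x)), 2)
shape_after_block1 = tuple(x.shape)       # (1, 6, 22, 30)

# 4x4 convolution: 22x30 -> 19x27, then 2x2 pooling -> 9x13
x = F.max_pool2d(F.relu(conv2(x)), 2)
shape_after_block2 = tuple(x.shape)       # (1, 16, 9, 13)

# 16 channels * 9 rows * 13 columns = 1872 features per image
flat_features = torch.flatten(x, 1).shape[1]
print(shape_after_block1, shape_after_block2, flat_features)
```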
## Optimizing model parameters (train)

In each `epoch` we pass over all the data in the `dataloader` once, batch by batch. For every batch we compute the loss, backpropagate to obtain the gradients, and let an `optimizer` update the parameters by gradient descent.

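The core of that loop is three calls: `zero_grad()`, `backward()` and `step()`. The sketch below isolates them on a made-up toy problem (a small linear classifier on random data; `toy_model`, `toy_loss_fn` and `toy_optimizer` are illustrative names, not part of the notebook) so that the effect on the loss is easy to see:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data: 16 samples with 4 features; the class depends on the first feature
x = torch.randn(16, 4)
y = (x[:, 0] > 0).long()

toy_model = nn.Linear(4, 2)
toy_loss_fn = nn.CrossEntropyLoss()
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)

losses = []
for step in range(50):
    toy_optimizer.zero_grad()             # clear old gradients
    loss = toy_loss_fn(toy_model(x), y)   # forward pass and loss
    loss.backward()                       # compute gradients
    toy_optimizer.step()                  # update parameters
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```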
In [6]:
def train(dataloader, model, loss_fn, optimizer):
    model.train()
    size = len(dataloader.dataset)
    for epoch in range(EPOCH):
        epoch_loss = 0
        for batch, (x, y) in enumerate(dataloader):
            x, y = x.to(device), y.to(device)

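            # Clear gradients left over from the previous batch
            optimizer.zero_grad()

            # Forward pass; read_image yields uint8 tensors, so cast to float first
            pred = model(x.float())
            loss = loss_fn(pred, y)

            # Backpropagation, then one parameter update
            loss.backward()
            optimizer.step()

            # Progress line: current batch loss and number of samples seen before this batch
            loss, current = loss.item(), min(batch * BATCH_SIZE, size)
            print(f"loss: {loss:.6f} [{current}/{size}]   ", end="\r")

            # Accumulate the sum (not the mean) of batch losses for this epoch
            epoch_loss += loss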
            
        print(f"Epoch [{epoch}/{EPOCH}]: Loss={epoch_loss:.6f}        ")

In [8]:
model = CNN()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
model = model.to(device)
train(train_dataloader, model, loss_fn, optimizer)

Epoch [0/20]: Loss=38.207765        
Epoch [1/20]: Loss=19.360587        
Epoch [2/20]: Loss=14.358377        
Epoch [3/20]: Loss=10.978099        
Epoch [4/20]: Loss=8.510617        
Epoch [5/20]: Loss=6.641623        
Epoch [6/20]: Loss=5.147152        
Epoch [7/20]: Loss=3.923710        
Epoch [8/20]: Loss=2.923370        
Epoch [9/20]: Loss=2.117846        
Epoch [10/20]: Loss=1.513425        
Epoch [11/20]: Loss=1.086297        
Epoch [12/20]: Loss=0.792458        
Epoch [13/20]: Loss=0.589680        
Epoch [14/20]: Loss=0.452154        
Epoch [15/20]: Loss=0.357078        
Epoch [16/20]: Loss=0.287551        
Epoch [17/20]: Loss=0.237379        
Epoch [18/20]: Loss=0.199184        
Epoch [19/20]: Loss=0.169702        


## Testing the model

In [None]:
def test(dataloader, model, loss_fn):
    model.eval()
    size = len(dataloader.dataset)
    test_loss = 0
    
    # Gradients are not needed for evaluation, so disable tracking
    with torch.no_grad():
        for batch, (x, y) in enumerate(dataloader):
            x, y = x.to(device), y.to(device)

            pred = model(x.float())
            loss = loss_fn(pred, y)

            loss, current = loss.item(), min(batch * BATCH_SIZE, size)
            print(f"loss: {loss:>7f} [{current}/{size}]   ", end="\r")

            test_loss += loss

    print(f"Test: Loss={test_loss:.6f}              ")

In [9]:
test(test_dataloader, model, loss_fn)

loss: 0.000021 [0/158]   
loss: 0.000013 [8/158]   
loss: 0.000044 [16/158]   
loss: 0.000009 [24/158]   
loss: 0.000043 [32/158]   
loss: 0.000800 [40/158]   
loss: 0.000005 [48/158]   
loss: 0.000313 [56/158]   
loss: 0.002030 [64/158]   
loss: 0.000953 [72/158]   
loss: 0.003836 [80/158]   
loss: 0.003799 [88/158]   
loss: 0.003838 [96/158]   
loss: 0.003798 [104/158]   
loss: 0.003790 [112/158]   
loss: 0.004637 [120/158]   
loss: 0.003808 [128/158]   
loss: 0.003799 [136/158]   
loss: 0.003833 [144/158]   
loss: 0.003803 [152/158]   
Test: Loss=0.043172              

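The `test` function reports only the loss. If an accuracy figure is also wanted, one possible approach is to compare the highest-scoring class of each prediction with its label. The helper `count_correct` and the sample values below are hypothetical and only illustrate the idea; they are not results from the model above.

```python
import torch


def count_correct(logits, labels):
    """Number of samples whose highest-scoring class matches the label."""
    return (logits.argmax(dim=1) == labels).sum().item()


# Made-up model outputs for 4 samples and 2 classes
logits = torch.tensor([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.2, 0.4]])
labels = torch.tensor([0, 1, 0, 0])

correct = count_correct(logits, labels)
accuracy = correct / len(labels)
print(f"{correct}/{len(labels)} correct, accuracy={accuracy:.2f}")
```

Inside a loop like the one in `test`, the per-batch counts could be summed and divided by `len(dataloader.dataset)` at the end.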

## Useful Links & Reference
https://pytorch.org/tutorials/