# Week 2b - build and train a dog rating network

In this notebook we will train a dog rating network using the [WeRateDogs dataset](https://www.kaggle.com/datasets/terencebroad/we-rate-dogs-images-ratings-and-captions).

First, let's check you have the right environment set up:

### Setting up your Python environment

Before you work through this notebook, please follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb)

Once you have done that you will need to make sure that the environment selected to run this notebook and all the other notebooks used in this unit is called `aim`. 

To do this click the **Select kernel** button in the top right corner of this notebook, and then select `aim`.

To make sure it is configured properly, hit the run cell button (▶) on the cell below:

In [None]:
import os
print(os.environ['CONDA_DEFAULT_ENV'])

aim


Does it output the text `aim`?

If it does not output the text `aim`, please revisit and follow the instructions in [Setup-and-test-conda-environment.ipynb](Setup-and-test-conda-environment.ipynb).

If you still cannot get it working, please raise this with the course instructor. 

### Imports

First we need to import torch and the various other utilities in torch for building and training models, and loading data:

In [1]:
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

Now we need to import the WeRateDogsDataset class from [util/we_rate_dogs_dataset.py](util/we_rate_dogs_dataset.py) that has been built to load and process this custom dataset.

In [2]:
from util.we_rate_dogs_dataset import WeRateDogsDataset
# Import data loader class from: https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader

### Setup parameters for training the model

In [3]:
device = 'cpu'
momentum = 0.9
batch_size = 100
learn_rate = 0.001
data_path = 'class-datasets/we-rate-dogs-mini/'

### Transformations for pre-processing the image data into the right format for the model to process

A breakdown of what is going on here is given in the slides.

In [4]:
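transform = transforms.Compose(
    [
        transforms.Resize(10, antialias=True),        # shrink so the shorter side is 10 pixels
        transforms.CenterCrop(10),                    # crop the central 10x10 square
        transforms.Grayscale(num_output_channels=1),  # convert to a single channel
        transforms.ToTensor(),                        # PIL image -> float tensor in [0, 1]
        transforms.Normalize((0.5,), (0.5,)),         # rescale values to [-1, 1]
        torch.flatten                                 # 1x10x10 -> vector of 100 values
    ])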

### Define dataset and dataloader objects

In torch a dataset class defines what and where our dataset is, and a dataloader class defines how we load that data into batches when training.

Here we have two of each, one for our training dataset that the model is trained on, and one for our test dataset that we will use to evaluate the accuracy of our models at regular intervals: 

In [5]:
train_dataset = WeRateDogsDataset(data_path, 'train', transform=transform)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

test_dataset = WeRateDogsDataset(data_path, 'test', transform=transform)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)  # no need to shuffle when only evaluating
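
To see how a dataloader splits a dataset into batches, here is a minimal sketch that uses a `TensorDataset` of random numbers as a stand-in for `WeRateDogsDataset`. The dataset size of 250 and the names `fake_inputs`, `fake_labels`, `fake_dataset` and `fake_loader` are assumptions chosen for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 250 "images" of 100 values each, with ratings from 0 to 10
fake_inputs = torch.randn(250, 100)
fake_labels = torch.rand(250) * 10
fake_dataset = TensorDataset(fake_inputs, fake_labels)

# Same batch size as the notebook; the final batch holds whatever is left over
fake_loader = DataLoader(fake_dataset, batch_size=100, shuffle=True)

batch_sizes = []
for inputs, labels in fake_loader:
    batch_sizes.append(inputs.shape[0])
    print(inputs.shape, labels.shape)

print(len(fake_loader))  # number of batches in one epoch
```

With 250 items and a batch size of 100, one epoch consists of two full batches and one smaller batch of 50.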

### Define the network architecture

Here you will need to define the architecture for the network. Use the notebook [Week-2a-basic-MLP-PyTroch.ipynb](Week-2a-basic-MLP-PyTroch.ipynb) as a reference. This network will need to take a vector of dimensionality 100 into the first layer, and have a single scalar output in the final layer:

In [6]:
class DogRatingNetwork(nn.Module):
    def __init__(self):
        super(DogRatingNetwork, self).__init__()
        # Define network architecture here

    def forward(self, x):
        # Define the forward pass of the network here
        raise NotImplementedError('Define the forward pass of the network')

### Define core objects for training

Here we instantiate the three core objects for any training in PyTorch:
- The neural network model
- The loss function (aka criterion)
- The optimiser (for updating the weights of the model)

In [7]:
model = DogRatingNetwork()
model.to(device)
criterion = nn.L1Loss() 
optimizer = optim.SGD(model.parameters(), lr=learn_rate, momentum=momentum)
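
To make the loss function concrete, the sketch below compares `nn.L1Loss` with the mean absolute error computed by hand. The predictions and target ratings are invented values for illustration.

```python
import torch
import torch.nn as nn

# Made-up predictions and target ratings for three dogs, each of shape (3, 1)
predictions = torch.tensor([[8.0], [11.0], [13.0]])
targets = torch.tensor([[10.0], [12.0], [13.0]])

# nn.L1Loss averages the absolute differences by default
l1_loss = nn.L1Loss()(predictions, targets)

# The same quantity written out by hand: (2 + 1 + 0) / 3
manual_loss = (predictions - targets).abs().mean()

print(l1_loss.item(), manual_loss.item())
```

A loss of 1.0 here means the predicted ratings are, on average, one point away from the true ratings.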

### Training loop

Here is a basic training loop in PyTorch. `num_epochs` defines the number of full passes we make through the dataset. Within the for loop that goes through each epoch there are two sub-loops:
- **Training loop:** here we cycle through the training dataset, calculate the loss and use that to update the weights of the network.
- **Testing loop:** every 10 epochs we cycle through the test dataset and calculate the average loss. Here we do not update the weights of the network, because we need *unseen data* to properly evaluate our network's performance and to detect *overfitting*.

This code will automatically find the model checkpoint that has the best score on the test dataset and save that to the file `best_dog_rating_network.pt`

In [None]:
best_loss = float('inf')  # any real test loss will be lower than this

num_epochs = 100
for epoch in range(num_epochs): 
    train_loss = 0.0
    test_loss = 0.0
    
    # Training loop
    for i, data in enumerate(train_loader, 0):
        inputs, labels = data
        inputs = inputs.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels.unsqueeze(1))
        loss.backward()
        optimizer.step()
        train_loss += loss.item() / len(train_loader)

    # Every 10 epochs
    if (epoch+1) % 10 == 0:
        # Test loop
        with torch.no_grad():
            for i, data in enumerate(test_loader, 0):
                inputs, labels = data
                inputs = inputs.to(device)
                labels = labels.to(device)
                outputs = model(inputs)
                loss = criterion(outputs, labels.unsqueeze(1))
                test_loss += loss.item() / len(test_loader)

        print(f'Epoch {epoch + 1}, train loss: {train_loss:.3f}, test loss: {test_loss:.3f}')
        if test_loss < best_loss:
            best_loss = test_loss
            torch.save(model.state_dict(), 'best_dog_rating_network.pt')
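
The training loop calls `labels.unsqueeze(1)` before computing the loss. The sketch below, using made-up values, shows why this matters: the model outputs have shape `(batch, 1)` while the labels have shape `(batch,)`, and without the extra dimension PyTorch broadcasts the two into a `(batch, batch)` grid that compares every prediction with every label.

```python
import torch

# Made-up model outputs and labels for a batch of three dogs
outputs = torch.tensor([[7.0], [9.0], [12.0]])  # shape (3, 1)
labels = torch.tensor([8.0, 9.0, 10.0])         # shape (3,)

# Shapes differ, so the subtraction broadcasts to a 3x3 grid of differences
mismatched = (outputs - labels).abs()

# With unsqueeze(1) each prediction is compared only with its own label
matched = (outputs - labels.unsqueeze(1)).abs()

print(mismatched.shape, mismatched.mean().item())
print(matched.shape, matched.mean().item())
```

The two averages differ, so a loss computed on mismatched shapes does not measure how close each prediction is to its own rating.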

#### Load best model

Now let's reload the best-performing model that we saved to our checkpoint file, and overwrite the current weights of our model with the best set of weights we have found:

In [None]:
model.load_state_dict(torch.load('best_dog_rating_network.pt'))
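
The save and load round trip can be sketched in isolation as follows. This assumes two small `nn.Linear` layers as stand-ins for the dog rating network and writes to a temporary file named `demo_checkpoint.pt`; both are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Two stand-in models with the same architecture but different random weights
source_model = nn.Linear(4, 1)
target_model = nn.Linear(4, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, 'demo_checkpoint.pt')
    # state_dict() holds the weights and biases, not the model class itself
    torch.save(source_model.state_dict(), checkpoint_path)
    # Loading overwrites the target model's parameters in place
    target_model.load_state_dict(torch.load(checkpoint_path))

weights_match = torch.equal(source_model.weight, target_model.weight)
print(weights_match)
```

Because only the parameters are stored, the model you load into must have the same architecture as the one that was saved.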

### Load a new image from the web

Here we load a new image of a dog to test our trained model. Try finding a different image from the web and loading it here:

In [None]:
from PIL import Image
import requests

dog_url = 'https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg?crop=0.752xw:1.00xh;0.175xw,0&resize=1200:*'
im = Image.open(requests.get(dog_url, stream=True).raw)
im

### Test model on new image

Now let's test our model on this new data and see how it rates the dog:

In [None]:
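torch_im = transform(im)
torch_im = torch_im.unsqueeze(0).to(device)  # add a batch dimension and move to the model's device
with torch.no_grad():  # no gradients are needed for a prediction
    pred = model(torch_im)
print(f'We give this doggo a rating of {pred.item():.2f}/10')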

# Tasks 

**Task 1:** [Build a simple MLP model in this cell](#define-the-network-architecture) with one hidden layer followed by an output layer. The input layer should have 100 units, the hidden layer should have 10 units, and the output layer should have 1 unit. Then hit **run all** at the top of your notebook to run training. Use the notebook [Week-2a-basic-MLP-PyTroch.ipynb](Week-2a-basic-MLP-PyTroch.ipynb) as a reference.

**Task 2:** Adapt this MLP model to make use of biases and activation functions:
-  In `__init__` adapt the code to add biases to the units in the hidden layer
-  In `forward` use the [PyTorch ReLU (rectified linear unit)](https://pytorch.org/docs/main/generated/torch.nn.ReLU.html#relu) on the outputs of the hidden layer ([see this blog for reference code](https://eitca.org/artificial-intelligence/eitc-ai-dlpp-deep-learning-with-python-and-pytorch/data-eitc-ai-dlpp-deep-learning-with-python-and-pytorch/datasets/what-is-the-relu-function-in-pytorch/)).
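
As a hint for what ReLU does, the sketch below applies `F.relu` to a few made-up values. The tensor `hidden_outputs` is a hypothetical example and not the output of a real layer.

```python
import torch
import torch.nn.functional as F

# Made-up values standing in for the outputs of a hidden layer
hidden_outputs = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

# ReLU replaces negative values with zero and leaves positive values unchanged
activated = F.relu(hidden_outputs)

print(activated)
```

In your network you would apply the same function to the output of the hidden layer inside `forward`, before passing it to the next layer.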

**Task 3a:** Now try adding more layers to your network, and changing the number of units in each hidden layer. Can you reduce the test error by increasing the size of the network? Keep track of your lowest score on the test set and upload your best score to the leaderboard (link on this week's lesson plan on Moodle).

**Task 3b:** Now try experimenting with [different activation functions](https://pytorch.org/docs/stable/nn.html#non-linear-activations-weighted-sum-nonlinearity) or adapting [the parameters for training](#setup-parameters-for-training-the-model), by adjusting the `momentum`, `learn_rate`, or `num_epochs` used for training. Keep track of your lowest score on the test set and upload your best score to the leaderboard (link on this week's lesson plan on Moodle).

**Task 4:** Find some new cute pictures of dogs on Google image search (or the search engine of your choice). What is the highest score you can get for a new dog picture? Upload the dog and rating to the leaderboard (link on this week's lesson plan on Moodle).
