# CSE5CV - Deep Learning Image Classification
In this lab we classify image data using a pretrained Convolutional Neural Network (CNN).

By the end of this lab, you should be able to:
* Classify image data using a pretrained CNN
* Understand how to interpret the output of a CNN
* Implement and interpret various evaluation metrics

## Colab preparation

Google Colab is a free online service for editing and running code in notebooks like this one. To get started, follow the steps below:

1. Click the "Copy to Drive" button at the top of the page. This will open a new tab with the title "Copy of...". This is a copy of the lab notebook which is saved in your personal Google Drive. **Continue working in that copy, otherwise you will not be able to save your work**. You may close the original Colab page (the one which displays the "Copy to Drive" button).
2. Run the code cell below to prepare the Colab coding environment by downloading sample files. Note that if you close this notebook and come back to work on it again later, you will need to run this cell again.

In [None]:
!git clone https://github.com/ltu-cse5cv/cse5cv-labs.git
%cd cse5cv-labs/Lab03

## Packages
In this lab we will be using the following packages:
* *PyTorch* to work with deep learning models
* *Torchvision* to download pre-trained models and apply transformations to image data
* *Scikit-learn* to compute various evaluation metrics

In [None]:
# Packages
import cv2
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms.functional as tvtf
from matplotlib import pyplot as plt
from sklearn.metrics import (
    confusion_matrix, ConfusionMatrixDisplay, accuracy_score,
    recall_score, precision_score, f1_score)
from urllib.request import urlopen

### PyTorch
PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.

Package Homepage: https://pytorch.org/    
Python Documentation (latest): https://pytorch.org/docs/stable/index.html

<details>
<summary style='cursor:pointer;'><u>More Details</u></summary>

We will be making extensive use of *PyTorch* to:  

- Create Tensors (Like *numpy* arrays, but can be used with the CPU or GPU)
- Work with Deep Learning models (performing forward passes through those models)
</details>

### Torchvision
This library is part of the PyTorch project. PyTorch is an open source machine learning framework. The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision.

Python Documentation (latest): https://pytorch.org/vision/stable/index.html  

<details>
<summary style='cursor:pointer;'><u>More Details</u></summary>


We will be making use of *torchvision* to:  

- Download pre-trained models (These are Deep Learning models that have already been defined and trained on other datasets)
- Apply transformations to our image data
</details>

### Scikit-learn
Scikit-learn is an open source machine learning library that supports supervised and unsupervised learning. It also provides various tools for model fitting, data preprocessing, model selection and evaluation, and many other utilities.

Package Homepage: https://scikit-learn.org/stable/    
Python Documentation (latest): https://scikit-learn.org/stable/modules/classes.html

<details>
<summary style='cursor:pointer;'><u>More Details</u></summary>

We will be making use of *Scikit-learn* to compute evaluation metrics on the predictions of our neural network. Using scikit-learn means we do not need to write the evaluation computation code ourselves.
</details>

## CPU vs. GPU

A Central Processing Unit (CPU) is a piece of hardware in every computer that handles interpreting instructions and performing processing operations. On a typical machine, the CPU would have around 4-8 cores, meaning it can process up to 4-8 things in parallel.

A Graphics Processing Unit (GPU) is a piece of hardware that handles rendering of graphics. Not every computer will have a GPU as the CPU can also handle these tasks. GPUs are designed with parallelism in mind, and as such have a huge number of cores. For example, the NVIDIA GTX 1080 GPU has 2560 cores (supporting a lot of parallelism!)

It turns out that performing forward and backward passes through a neural network can be greatly optimized by parallelizing the operations that take place. This is something perfectly suited for a GPU!

The *PyTorch* package supports both CPU and GPU for training and evaluating deep learning models. We will be using the CPU only for these labs. The great thing about *PyTorch* is that the code you write works for both the CPU and GPU, and to change between the two you simply need to tell *PyTorch* to put data on the respective ***device***.
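
As a minimal sketch of what "putting data on a device" looks like in code (the variable names here are illustrative, and these labs only need the CPU), the snippet below picks a device and moves a tensor onto it:

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Tensors are created on the CPU unless told otherwise
example_tensor = torch.rand([3, 224, 224])
print(example_tensor.device)

# .to() returns a tensor on the requested device
# (if the tensor is already there, nothing needs to be moved)
example_tensor = example_tensor.to(device)
print(example_tensor.device)
```

On a machine without a GPU both print statements should report `cpu`, and the rest of the code does not need to change.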

# 1. Classification

In this section we will develop code to classify an image using a Convolutional Neural Network (CNN)!

## 1.1 Pre-Trained Network

To train a neural network for classification, we need a huge amount of training data from all of the different classes we want our network to predict.

For this lab, this presents two problems:
* How can we get access to so much training data?
* How can we train the network, given that we likely have neither access to a GPU nor the time to train a large model?

Luckily for us the *torchvision* package has our back!  

The *torchvision* package consists of popular datasets, model architectures, and common image transformations for computer vision. With *torchvision*, we can very easily choose a popular model architecture and download model weights that have already been trained on a large dataset.  

### Downloading a Pre-Trained Network
You can see the torchvision documentation for a [list of available model architectures](https://pytorch.org/vision/stable/models.html). In this lab we are looking at performing classification, and to get us started we will work with the [MobileNetV3 Small](https://arxiv.org/abs/1905.02244) architecture.  

The classification models available through *torchvision* have been trained on the [ImageNet dataset](https://image-net.org/). This is a massive dataset that contains a huge number of images across 1000 different classes.

In the cell below we create a pretrained MobileNetV3 model. If you are interested, you can find more information in the torchvision documentation on [creating pretrained models](https://pytorch.org/vision/stable/models.html#:~:text=We%20provide%20pre-trained%20models%2C%20using%20the%20PyTorch%20torch.utils.model_zoo.%20These%20can%20be%20constructed%20by%20passing%20pretrained%3DTrue%3A). It may take a few moments the first time you run this cell as *torchvision* needs to download the trained weights to your machine.

In [None]:
mobilenet_v3 = models.mobilenet_v3_small(pretrained=True)

### Inspecting the Pre-Trained Network
Models downloaded with *torchvision* are written using *PyTorch*. They consist of many different *PyTorch* layers (e.g. Convolution, Linear, Pooling, etc.) connected together to create a pipeline.

You have performed convolutions in earlier labs with manually defined kernels. The basic idea of a Convolutional Neural Network is that it *learns* good kernels to extract image features and classify images at the same time. Of course that's not all there is to it.

In the cell below we print out the layers used in our MobileNetV3 model.

In [None]:
print(mobilenet_v3)
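
Printing a whole model can produce a lot of text. As a rough sketch of a more compact summary, the snippet below lists the top-level layers and counts the learnable parameters. It uses a small hypothetical stand-in model (`tiny_model`) so that it runs without downloading any weights; the same calls should also work on `mobilenet_v3`.

```python
import torch.nn as nn

# Hypothetical stand-in model, used so the example runs without downloading weights
tiny_model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# List the top-level layers by name and type
for name, layer in tiny_model.named_children():
    print(name, type(layer).__name__)

# Count the learnable parameters:
# 16 * 3 * 3 * 3 = 432 for the convolution, plus 16 + 16 = 32 for the batch norm
num_params = sum(p.numel() for p in tiny_model.parameters())
print(f'Learnable parameters: {num_params}')
```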

<details>
<summary style='cursor:pointer;'><u>Expand for Discussion</u></summary>

As you can see in the printed summary there are quite a number of layers in the network.
  
The innermost layers that you can see each correspond directly to a layer you can create in *PyTorch*. Their names indicate what type of layer they are (for example, a Conv2d layer is a 2D convolution layer). You can take a look at the [*PyTorch* *nn* documentation](https://pytorch.org/docs/stable/nn.html) for a detailed description of each type of layer/activation.

You'll see that the printed summary also shows some of the properties of each layer. To briefly discuss the first Conv2d layer in the printed summary:  
`Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)`
* The first number (3) describes the expected number of input channels/feature maps. This is set to 3 (usually the first Conv layer in a network expects 3 channels - RGB!)
* The second number (16) describes how many feature maps will be generated.
* The *kernel_size* argument (3, 3) describes the spatial size of the convolution kernel, here it is a 3x3 kernel.
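* The *stride* argument (2, 2) describes how far the kernel moves between applications, here 2 pixels at a time both vertically and horizontally, which roughly halves the height and width of the output.
* The *padding* argument (1, 1) describes the padding applied around the input, here 1 pixel on each side of the height and width.
* The *bias* argument (False) describes whether the layer also includes a learnable bias parameter.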

You can always refer back to the [*nn* documentation](https://pytorch.org/docs/stable/nn.html) for a full description of what arguments can be given to each layer.  
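
To make the arguments above more concrete, here is a small sketch that builds a standalone layer with the same settings (it is not taken from the pretrained model, so its weights are random) and passes a random input through it. The input has a leading dimension of size 1 because *PyTorch* layers expect a batch of inputs.

```python
import torch
import torch.nn as nn

# A standalone layer with the same settings as the one in the printed summary
conv = nn.Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)

# One random 3-channel 224x224 input, with a leading dimension of size 1
dummy_input = torch.rand([1, 3, 224, 224])

with torch.no_grad():
    feature_maps = conv(dummy_input)

# 16 kernels, each covering 3 input channels with a 3x3 window
print(conv.weight.shape)
# bias=False means no bias parameter is created
print(conv.bias)
# 16 feature maps; the stride of 2 halves the 224x224 input to 112x112
print(feature_maps.shape)
```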
    
Something else that is good to be aware of - If we take a look at the MobileNetV3 Small table in the [MobileNetV3 paper](https://arxiv.org/abs/1905.02244) which describes the layers in the network (see below), we can see that the layers in the table match the layers that are printed in the summary. This helps give us some reassurance that the *torchvision* implementation does match what the paper describes.  
    
See if you can find the relationships between the printed model summary and the model as described in the paper.
    
![MobileNet V3 Small Architecture.png](attachment:8ed85952-76bc-4925-84a5-c953decf4d01.png)    
  
</details>

## 1.2 Tensors
Up until now we have stored our image data in *numpy* **arrays**. However, *PyTorch* models only accept **tensors**.

> A PyTorch Tensor is basically the same as a numpy array: it does not know anything about deep learning or computational graphs or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation.  
The biggest difference between a numpy array and a PyTorch Tensor is that a PyTorch Tensor can run on either CPU or GPU.

From: https://pytorch.org/tutorials/beginner/examples_tensor/two_layer_net_tensor.html

The *PyTorch* package has functions that let us switch between *PyTorch* **tensors** and *numpy* **ndarrays**.

**An important note on tensors**: *PyTorch* expects the dimensions of tensors to be ordered as CHW (C = channels, H = height, W = width). This is different from *numpy* where we used HWC ordering.

### Creating Tensors
We are able to create tensors directly through *PyTorch*.

In the code cell below, we create a 3x224x224 tensor filled with random values in the range \[0, 1) using the [*`torch.rand()`* function](https://pytorch.org/docs/stable/generated/torch.rand.html), then print out some properties of the tensor.

In [None]:
# Create the random tensor, specifying a float32 datatype
random_tensor = torch.rand([3, 224, 224], dtype=torch.float32)

# Print some properties of the tensor
print(f'The type of our tensor is: {type(random_tensor)}')
print('=' * 50)
print(f'The shape of our tensor is: {random_tensor.shape}')
print('=' * 50)
print(f'The datatype of our tensor is: {random_tensor.dtype}')
print('=' * 50)
print(f'Our tensor is on the: {random_tensor.device} device')
print('=' * 50)
print(f'The minimum value is: {random_tensor.min()}')
print(f'The maximum value is: {random_tensor.max()}')

<details>
<summary style='cursor:pointer;'><u>Expand for Discussion</u></summary>

You may notice that the properties of our tensor that we accessed are essentially the same as what we have done in the past with a *numpy* array. The *type* of our tensor is a *torch.Tensor* as expected. The *shape* and *datatype* are also exactly as we expect, given we specified them when creating the tensor.

A property that we haven't come across before with *numpy* arrays is the *device* property. Tensors can be placed on the CPU or GPU, and the way we can tell which one our tensor is on is through the *device* property. For this lab we will be working exclusively with the CPU.

We also printed out the minimum and maximum values of the tensor. You should be able to see that the minimum/maximum values are very close to 0/1 respectively.
</details>

### Tensors from Numpy Arrays
It's also possible to create a tensor directly from a numpy array. This is very useful when we load our image data into a *numpy* array and later want to do something with it with *PyTorch*.

In the code cell below, we create a *numpy* array of random values and convert it to a tensor using the [*`torch.as_tensor()`* function](https://pytorch.org/docs/stable/generated/torch.as_tensor.html).

In [None]:
# Create a random numpy ndarray with dimensionality 3x224x224
random_ndarray = np.random.rand(3, 224, 224)

# Convert the ndarray to a torch tensor
converted_torch = torch.as_tensor(random_ndarray)

# Validate the types are different
print(type(converted_torch))
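
Two details of this conversion are worth being aware of. The sketch below (with illustrative variable names) shows that the tensor inherits the array's `float64` datatype, whereas pretrained weights are stored as `float32`, and that `torch.as_tensor()` can share memory with the original array rather than copying it.

```python
import numpy as np
import torch

# np.random.rand produces float64 values by default
example_ndarray = np.random.rand(3, 224, 224)
example_tensor = torch.as_tensor(example_ndarray)
print(example_tensor.dtype)

# .float() returns a float32 copy, matching the datatype of pretrained weights
example_float32 = example_tensor.float()
print(example_float32.dtype)

# When no datatype or device change is requested, the tensor shares memory
# with the array, so a change to the array is visible through the tensor
example_ndarray[0, 0, 0] = 42.0
print(example_tensor[0, 0, 0])
```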

We mentioned above that we represent image data with dimensionality HWC in a *numpy* array, but as CHW in a *PyTorch* tensor.

To swap the ordering of our *numpy* array we can use the [*`np.transpose()`* function](https://numpy.org/doc/stable/reference/generated/numpy.transpose.html). An example is provided for you here:

```
my_array = np.random.rand(3, 4, 5)
print(my_array.shape)
>>> (3, 4, 5)
transposed_array = np.transpose(my_array, (2, 0, 1))
print(transposed_array.shape)
>>> (5, 3, 4)
```
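
If the data is already stored in a tensor, the same reordering can be done on the *PyTorch* side. A minimal sketch using the tensor `permute()` method (the shapes are chosen only for illustration):

```python
import torch

# A small tensor in HWC order: H=4, W=5, C=3
hwc_tensor = torch.rand([4, 5, 3])

# Reorder to CHW: new dim 0 is old dim 2, new dim 1 is old dim 0, new dim 2 is old dim 1
chw_tensor = hwc_tensor.permute(2, 0, 1)
print(chw_tensor.shape)

# Reordering back to HWC recovers the original values
restored = chw_tensor.permute(1, 2, 0)
print(torch.equal(restored, hwc_tensor))
```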

**Task**: Create a 200x100x3 (HWC) *numpy* array filled with random values, transpose it to get CHW dimensionality, then convert it to a tensor.

At the bottom of the code cell is some code that will print out the shape of your resulting tensor. Before moving on, verify that your tensor has shape (3, 200, 100) and that its type is *torch.Tensor*.

In [None]:
# TODO: Create a random numpy array with shape: (200, 100, 3)
# random_ndarray = ...

# TODO: Transpose the numpy array from HWC to CHW to get shape: (3, 200, 100)
# transposed_ndarray = ...

# TODO: Convert the numpy array to a torch tensor
# converted_torch = ...

# Test your code!
print(converted_torch.shape)       # Expected: (3, 200, 100)
print(type(converted_torch))       # Expected: torch.Tensor

#### Task solution

In [None]:
# Create a random numpy array with shape: (200, 100, 3)
random_ndarray = np.random.rand(200, 100, 3)

# Transpose the numpy array from HWC to CHW to get shape: (3, 200, 100)
# (2, 0, 1) moves the channel axis to the front and keeps H before W
transposed_ndarray = np.transpose(random_ndarray, (2, 0, 1))

# Convert the numpy array to a torch tensor
converted_torch = torch.as_tensor(transposed_ndarray)

# Test your code!
print(converted_torch.shape)       # Expected: (3, 200, 100)
print(type(converted_torch))       # Expected: torch.Tensor

### Numpy Arrays from Tensors
We often need to convert tensors back into *numpy* arrays, for example for visualisation using `matplotlib`.

In the code cell below, we create a tensor of random values and convert it to a *numpy* array.

In [None]:
# Create a random tensor with dimensionality 3x224x224
random_tensor = torch.rand([3, 224, 224], dtype=torch.float32)

# Convert the tensor to an ndarray
# NOTE: Calling .detach() is important when we are using a tensor that has been passed through a neural network
#       Calling .cpu() is important in case our tensor is on the GPU (numpy arrays can only be on the CPU)
#       In general we will leave both of these calls in as part of conversion
converted_ndarray = random_tensor.detach().cpu().numpy()

# Validate the type has changed
print(type(converted_ndarray))
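
To illustrate why `.detach()` matters, the sketch below builds a small tensor that tracks gradients (the names are illustrative) and shows that converting it directly fails, while the detached version converts without a problem:

```python
import torch

# A tensor that tracks gradients, similar to values produced inside a network
weights = torch.ones(3, requires_grad=True)
tracked = weights * 2

# Converting directly raises a RuntimeError because the tensor requires grad
try:
    tracked.numpy()
except RuntimeError as error:
    conversion_failed = True
    print(f'Conversion failed: {error}')
else:
    conversion_failed = False

# Detaching first gives a tensor that no longer tracks gradients
as_array = tracked.detach().cpu().numpy()
print(type(as_array), as_array)
```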

### Displaying Tensors
As we saw in previous labs, it's extremely useful to be able to visualize our image data.

We saw in Lab 1 that we can use *matplotlib* to display image data. However, *matplotlib* expects image data as a *numpy*-style array in HWC order, so a CHW tensor cannot be passed to it directly.

To consolidate what you have seen so far in this lab, and to reuse some of the code you have written in previous labs, let's write a function to handle displaying a tensor.

**Task**: Write a function named *display_tensor* that:
* Takes an image stored in a tensor
* Converts the tensor to a *numpy* array
* Transposes the numpy array to get channel ordering HWC (from CHW)
* Displays the image using *matplotlib* *(Refer to the code you wrote in Lab 1. Do not worry about handling grayscale)*

At the bottom of the code cell is some code that will call your display function. Use this to verify that the tensor is displayed correctly (The tensor is filled with random pixels).

In [None]:
# TODO Write your function here




# Test your function!
random_tensor = torch.rand([3, 224, 224], dtype=torch.float32)
display_tensor(random_tensor)

#### Task solution

In [None]:
def display_tensor(tensor):
    # Convert to numpy ndarray
    tensor_as_numpy = tensor.detach().cpu().numpy()

    # Transpose from CHW to HWC
    tensor_as_numpy = np.transpose(tensor_as_numpy, (1, 2, 0))

    # Display the image
    fig, axes = plt.subplots(figsize=(12, 8))
    axes.imshow(tensor_as_numpy)
    plt.show()

### Batching Tensors
The final thing to be aware of when dealing with tensors is that when we want to pass them through a neural network, we do so in *batches*. That is, instead of passing tensors through our network one by one, *PyTorch* expects us to pass a collection of tensors through the network at once.  

When we batch tensors together, we end up with a single tensor that has an extra **B**atch dimension. This means the expected dimensionality of a tensor for a neural network is: **B**x**C**x**H**x**W**.

There are two cases we might come across that require us to batch up data into a tensor for passing through a neural network:
- We have a single tensor and we need to create a batch dimension (The size of the batch dimension is 1). Here we make use of the [*`unsqueeze()`* function](https://pytorch.org/docs/stable/generated/torch.unsqueeze.html).
- We have *N* tensors and we need to combine them into a single tensor (The size of the batch is *N*). Here we make use of the [*`stack()`* function](https://pytorch.org/docs/stable/generated/torch.stack.html).

The code below shows how we handle each case.

In [None]:
# Case 1 - We have a single tensor and need to create a batch dimension
random_tensor = torch.rand([3, 224, 224], dtype=torch.float32)
print(random_tensor.shape)     # 3, 224, 224

# Insert a dimension of size 1
batched_tensor = random_tensor.unsqueeze(dim=0)
print(batched_tensor.shape)
# NOTE: We can also do this with: batched_tensor = torch.unsqueeze(random_tensor, dim=0)

print('-' * 50)

# Case 2 - We have multiple tensors and want to combine them into a single tensor along the batch dimension
random_tensor_a = torch.rand([3, 224, 224], dtype=torch.float32)
random_tensor_b = torch.rand([3, 224, 224], dtype=torch.float32)
print(random_tensor_a.shape, random_tensor_b.shape)

# Stack the tensors
batched_tensor = torch.stack([random_tensor_a, random_tensor_b], dim=0)
print(batched_tensor.shape)
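
A common slip is to reach for `torch.cat()` instead of `torch.stack()`. As a short sketch with illustrative names, the snippet below contrasts the two and shows how to get a single tensor back out of a batch:

```python
import torch

tensor_a = torch.rand([3, 224, 224])
tensor_b = torch.rand([3, 224, 224])

# stack creates a NEW leading dimension, giving B x C x H x W
stacked = torch.stack([tensor_a, tensor_b], dim=0)
print(stacked.shape)

# cat joins along an EXISTING dimension, so the channels are merged instead
concatenated = torch.cat([tensor_a, tensor_b], dim=0)
print(concatenated.shape)

# Indexing the batch dimension recovers an individual tensor
first_item = stacked[0]
print(torch.equal(first_item, tensor_a))

# squeeze removes a batch dimension of size 1 that unsqueeze added
round_trip = tensor_a.unsqueeze(dim=0).squeeze(dim=0)
print(round_trip.shape)
```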

## 1.3 Making Predictions
That was a lot of work! But now we know:
- How to create a pretrained network (We have chosen MobileNet V3)
- How to create random tensors, and importantly, how to put them into a batch

All that's left is passing some data through the network and looking at the classification!

### The Forward Pass
In this lab we are interested in taking a pretrained model, passing some data through that network, then analyzing the classification output of the network. Because of this, we are only interested in the **forward pass** of the network.

The code below shows how to pass data through a neural network (Using the pretrained MobileNet V3 we loaded earlier and a tensor filled with random values).

In [None]:
# Define a random tensor (including a batch dimension)
random_tensor = torch.rand([1, 3, 224, 224], dtype=torch.float32)

# Perform the forward pass through our mobilenet_v3 model
with torch.no_grad():
    output = mobilenet_v3(random_tensor)

# Inspect the output!
print(f'The type of the output is: {type(output)}')
print(f'The shape of the output is: {output.shape}')
print(f'The output data is:\n\t {output}')

As you can see from the code cell above, performing the forward pass through our neural network is actually quite simple! Once our model is created (and stored in a variable), we can use it like a function, where we use our batched tensor as the argument. Pay attention to the `with torch.no_grad()` context manager. We use this around the forward pass of our model when we are not interested in computing gradients (we compute gradients when we are training a network). For your interest, you can read more in the [*`torch.no_grad()`* documentation](https://pytorch.org/docs/stable/generated/torch.no_grad.html).
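
The sketch below illustrates the effect of `torch.no_grad()` on a tiny hypothetical model (`toy_model`, which is not part of the lab). It also calls `.eval()`: *PyTorch* modules start in training mode, where layers such as dropout behave randomly, so switching to evaluation mode before making predictions is a common precaution. Whether it is needed at this exact point in the lab is an assumption.

```python
import torch
import torch.nn as nn

# Hypothetical toy model containing a layer that behaves differently in training mode
toy_model = nn.Sequential(nn.Linear(8, 4), nn.Dropout(p=0.5))
toy_input = torch.rand([1, 8])

# Without no_grad, the output is tracked so that gradients could be computed
tracked_output = toy_model(toy_input)
print(tracked_output.requires_grad)

# Switch to evaluation mode so that dropout is disabled
toy_model.eval()

with torch.no_grad():
    first_output = toy_model(toy_input)
    second_output = toy_model(toy_input)

# Inside no_grad the output is not tracked
print(first_output.requires_grad)
# In evaluation mode, repeated forward passes give identical results
print(torch.equal(first_output, second_output))
```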

#### 1000D Output
Something that might not be immediately clear is what the output of our network represents. We know it is some form of prediction. After running that code cell you should have seen a huge tensor filled with seemingly random numbers being printed as the output (with dimensionality 1x1000).  

Let's start by talking about the dimensionality of the data.  
The first dimension (of size 1) represents the batch dimension. Given we only had a single tensor in our batch, the size of this dimension is 1. If we had *N* tensors in our batch, then this dimension would be *N*.  

When we talked about the pretrained model at the start of this lab, we mentioned that the model was pretrained on ImageNet, which consists of 1000 different classes. It is no coincidence that this exactly matches the size of the second dimension!  
The second dimension (of size 1000) holds one score *per class* that the model was trained on. This means that to assign the input to a class, we just need to determine which index into the 1000-dimensional vector holds the highest score.  

<details>
<summary style='cursor:pointer;'><u>Why should it be the highest?</u></summary>

Why not the lowest?  

By convention, a classification network is trained so that it outputs a higher score for the correct class than for the other classes. When interpreting the output, we must follow the same convention that the model was trained with.
</details>

The final question is then, if we know the index of the 1000 dimensional vector that produced the highest score, how do we map that back to the name of a class?  
The answer is that we need to lookup the name of the class corresponding to that index!
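
As a minimal sketch of this lookup, the snippet below uses a made-up 3-class dictionary and hand-written scores (both hypothetical, purely for illustration) in place of the real 1000-class output:

```python
import torch

# Hypothetical 3-class lookup table standing in for the real class dictionary
TOY_CLASSES = {0: 'cat', 1: 'dog', 2: 'bird'}

# Hand-written scores for a batch of 2 inputs (shape: 2 x 3)
toy_output = torch.tensor([[0.5, 2.3, -1.0],
                           [1.7, 0.2, 0.9]])

# argmax over dim=1 finds the index of the highest score for each input in the batch
predicted_indices = toy_output.argmax(dim=1)

# Map each index back to its class name
predicted_labels = [TOY_CLASSES[index] for index in predicted_indices.tolist()]
print(predicted_labels)  # ['dog', 'cat']
```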

The cell below defines a dictionary that maps each ImageNet class index to its label. We will use this dictionary to map from the index of the highest score to a class label. Have a brief look through the classes, then run the code cell to define the dictionary.

In [None]:
# Classes were taken from: https://gist.github.com/yrevar/942d3a0ac09ec9e5eb3a
IMAGENET_CLASSES = {
    0: 'tench, Tinca tinca',
    1: 'goldfish, Carassius auratus',
    2: 'great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
    3: 'tiger shark, Galeocerdo cuvieri',
    4: 'hammerhead, hammerhead shark',
    5: 'electric ray, crampfish, numbfish, torpedo',
    6: 'stingray',
    7: 'cock',
    8: 'hen',
    9: 'ostrich, Struthio camelus',
    10: 'brambling, Fringilla montifringilla',
    11: 'goldfinch, Carduelis carduelis',
    12: 'house finch, linnet, Carpodacus mexicanus',
    13: 'junco, snowbird',
    14: 'indigo bunting, indigo finch, indigo bird, Passerina cyanea',
    15: 'robin, American robin, Turdus migratorius',
    16: 'bulbul',
    17: 'jay',
    18: 'magpie',
    19: 'chickadee',
    20: 'water ouzel, dipper',
    21: 'kite',
    22: 'bald eagle, American eagle, Haliaeetus leucocephalus',
    23: 'vulture',
    24: 'great grey owl, great gray owl, Strix nebulosa',
    25: 'European fire salamander, Salamandra salamandra',
    26: 'common newt, Triturus vulgaris',
    27: 'eft',
    28: 'spotted salamander, Ambystoma maculatum',
    29: 'axolotl, mud puppy, Ambystoma mexicanum',
    30: 'bullfrog, Rana catesbeiana',
    31: 'tree frog, tree-frog',
    32: 'tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui',
    33: 'loggerhead, loggerhead turtle, Caretta caretta',
    34: 'leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea',
    35: 'mud turtle',
    36: 'terrapin',
    37: 'box turtle, box tortoise',
    38: 'banded gecko',
    39: 'common iguana, iguana, Iguana iguana',
    40: 'American chameleon, anole, Anolis carolinensis',
    41: 'whiptail, whiptail lizard',
    42: 'agama',
    43: 'frilled lizard, Chlamydosaurus kingi',
    44: 'alligator lizard',
    45: 'Gila monster, Heloderma suspectum',
    46: 'green lizard, Lacerta viridis',
    47: 'African chameleon, Chamaeleo chamaeleon',
    48: 'Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis',
    49: 'African crocodile, Nile crocodile, Crocodylus niloticus',
    50: 'American alligator, Alligator mississipiensis',
    51: 'triceratops',
    52: 'thunder snake, worm snake, Carphophis amoenus',
    53: 'ringneck snake, ring-necked snake, ring snake',
    54: 'hognose snake, puff adder, sand viper',
    55: 'green snake, grass snake',
    56: 'king snake, kingsnake',
    57: 'garter snake, grass snake',
    58: 'water snake',
    59: 'vine snake',
    60: 'night snake, Hypsiglena torquata',
    61: 'boa constrictor, Constrictor constrictor',
    62: 'rock python, rock snake, Python sebae',
    63: 'Indian cobra, Naja naja',
    64: 'green mamba',
    65: 'sea snake',
    66: 'horned viper, cerastes, sand viper, horned asp, Cerastes cornutus',
    67: 'diamondback, diamondback rattlesnake, Crotalus adamanteus',
    68: 'sidewinder, horned rattlesnake, Crotalus cerastes',
    69: 'trilobite',
    70: 'harvestman, daddy longlegs, Phalangium opilio',
    71: 'scorpion',
    72: 'black and gold garden spider, Argiope aurantia',
    73: 'barn spider, Araneus cavaticus',
    74: 'garden spider, Aranea diademata',
    75: 'black widow, Latrodectus mactans',
    76: 'tarantula',
    77: 'wolf spider, hunting spider',
    78: 'tick',
    79: 'centipede',
    80: 'black grouse',
    81: 'ptarmigan',
    82: 'ruffed grouse, partridge, Bonasa umbellus',
    83: 'prairie chicken, prairie grouse, prairie fowl',
    84: 'peacock',
    85: 'quail',
    86: 'partridge',
    87: 'African grey, African gray, Psittacus erithacus',
    88: 'macaw',
    89: 'sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita',
    90: 'lorikeet',
    91: 'coucal',
    92: 'bee eater',
    93: 'hornbill',
    94: 'hummingbird',
    95: 'jacamar',
    96: 'toucan',
    97: 'drake',
    98: 'red-breasted merganser, Mergus serrator',
    99: 'goose',
    100: 'black swan, Cygnus atratus',
    101: 'tusker',
    102: 'echidna, spiny anteater, anteater',
    103: 'platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus',
    104: 'wallaby, brush kangaroo',
    105: 'koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus',
    106: 'wombat',
    107: 'jellyfish',
    108: 'sea anemone, anemone',
    109: 'brain coral',
    110: 'flatworm, platyhelminth',
    111: 'nematode, nematode worm, roundworm',
    112: 'conch',
    113: 'snail',
    114: 'slug',
    115: 'sea slug, nudibranch',
    116: 'chiton, coat-of-mail shell, sea cradle, polyplacophore',
    117: 'chambered nautilus, pearly nautilus, nautilus',
    118: 'Dungeness crab, Cancer magister',
    119: 'rock crab, Cancer irroratus',
    120: 'fiddler crab',
    121: 'king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica',
    122: 'American lobster, Northern lobster, Maine lobster, Homarus americanus',
    123: 'spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish',
    124: 'crayfish, crawfish, crawdad, crawdaddy',
    125: 'hermit crab',
    126: 'isopod',
    127: 'white stork, Ciconia ciconia',
    128: 'black stork, Ciconia nigra',
    129: 'spoonbill',
    130: 'flamingo',
    131: 'little blue heron, Egretta caerulea',
    132: 'American egret, great white heron, Egretta albus',
    133: 'bittern',
    134: 'crane',
    135: 'limpkin, Aramus pictus',
    136: 'European gallinule, Porphyrio porphyrio',
    137: 'American coot, marsh hen, mud hen, water hen, Fulica americana',
    138: 'bustard',
    139: 'ruddy turnstone, Arenaria interpres',
    140: 'red-backed sandpiper, dunlin, Erolia alpina',
    141: 'redshank, Tringa totanus',
    142: 'dowitcher',
    143: 'oystercatcher, oyster catcher',
    144: 'pelican',
    145: 'king penguin, Aptenodytes patagonica',
    146: 'albatross, mollymawk',
    147: 'grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus',
    148: 'killer whale, killer, orca, grampus, sea wolf, Orcinus orca',
    149: 'dugong, Dugong dugon',
    150: 'sea lion',
    151: 'Chihuahua',
    152: 'Japanese spaniel',
    153: 'Maltese dog, Maltese terrier, Maltese',
    154: 'Pekinese, Pekingese, Peke',
    155: 'Shih-Tzu',
    156: 'Blenheim spaniel',
    157: 'papillon',
    158: 'toy terrier',
    159: 'Rhodesian ridgeback',
    160: 'Afghan hound, Afghan',
    161: 'basset, basset hound',
    162: 'beagle',
    163: 'bloodhound, sleuthhound',
    164: 'bluetick',
    165: 'black-and-tan coonhound',
    166: 'Walker hound, Walker foxhound',
    167: 'English foxhound',
    168: 'redbone',
    169: 'borzoi, Russian wolfhound',
    170: 'Irish wolfhound',
    171: 'Italian greyhound',
    172: 'whippet',
    173: 'Ibizan hound, Ibizan Podenco',
    174: 'Norwegian elkhound, elkhound',
    175: 'otterhound, otter hound',
    176: 'Saluki, gazelle hound',
    177: 'Scottish deerhound, deerhound',
    178: 'Weimaraner',
    179: 'Staffordshire bullterrier, Staffordshire bull terrier',
    180: 'American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier',
    181: 'Bedlington terrier',
    182: 'Border terrier',
    183: 'Kerry blue terrier',
    184: 'Irish terrier',
    185: 'Norfolk terrier',
    186: 'Norwich terrier',
    187: 'Yorkshire terrier',
    188: 'wire-haired fox terrier',
    189: 'Lakeland terrier',
    190: 'Sealyham terrier, Sealyham',
    191: 'Airedale, Airedale terrier',
    192: 'cairn, cairn terrier',
    193: 'Australian terrier',
    194: 'Dandie Dinmont, Dandie Dinmont terrier',
    195: 'Boston bull, Boston terrier',
    196: 'miniature schnauzer',
    197: 'giant schnauzer',
    198: 'standard schnauzer',
    199: 'Scotch terrier, Scottish terrier, Scottie',
    200: 'Tibetan terrier, chrysanthemum dog',
    201: 'silky terrier, Sydney silky',
    202: 'soft-coated wheaten terrier',
    203: 'West Highland white terrier',
    204: 'Lhasa, Lhasa apso',
    205: 'flat-coated retriever',
    206: 'curly-coated retriever',
    207: 'golden retriever',
    208: 'Labrador retriever',
    209: 'Chesapeake Bay retriever',
    210: 'German short-haired pointer',
    211: 'vizsla, Hungarian pointer',
    212: 'English setter',
    213: 'Irish setter, red setter',
    214: 'Gordon setter',
    215: 'Brittany spaniel',
    216: 'clumber, clumber spaniel',
    217: 'English springer, English springer spaniel',
    218: 'Welsh springer spaniel',
    219: 'cocker spaniel, English cocker spaniel, cocker',
    220: 'Sussex spaniel',
    221: 'Irish water spaniel',
    222: 'kuvasz',
    223: 'schipperke',
    224: 'groenendael',
    225: 'malinois',
    226: 'briard',
    227: 'kelpie',
    228: 'komondor',
    229: 'Old English sheepdog, bobtail',
    230: 'Shetland sheepdog, Shetland sheep dog, Shetland',
    231: 'collie',
    232: 'Border collie',
    233: 'Bouvier des Flandres, Bouviers des Flandres',
    234: 'Rottweiler',
    235: 'German shepherd, German shepherd dog, German police dog, alsatian',
    236: 'Doberman, Doberman pinscher',
    237: 'miniature pinscher',
    238: 'Greater Swiss Mountain dog',
    239: 'Bernese mountain dog',
    240: 'Appenzeller',
    241: 'EntleBucher',
    242: 'boxer',
    243: 'bull mastiff',
    244: 'Tibetan mastiff',
    245: 'French bulldog',
    246: 'Great Dane',
    247: 'Saint Bernard, St Bernard',
    248: 'Eskimo dog, husky',
    249: 'malamute, malemute, Alaskan malamute',
    250: 'Siberian husky',
    251: 'dalmatian, coach dog, carriage dog',
    252: 'affenpinscher, monkey pinscher, monkey dog',
    253: 'basenji',
    254: 'pug, pug-dog',
    255: 'Leonberg',
    256: 'Newfoundland, Newfoundland dog',
    257: 'Great Pyrenees',
    258: 'Samoyed, Samoyede',
    259: 'Pomeranian',
    260: 'chow, chow chow',
    261: 'keeshond',
    262: 'Brabancon griffon',
    263: 'Pembroke, Pembroke Welsh corgi',
    264: 'Cardigan, Cardigan Welsh corgi',
    265: 'toy poodle',
    266: 'miniature poodle',
    267: 'standard poodle',
    268: 'Mexican hairless',
    269: 'timber wolf, grey wolf, gray wolf, Canis lupus',
    270: 'white wolf, Arctic wolf, Canis lupus tundrarum',
    271: 'red wolf, maned wolf, Canis rufus, Canis niger',
    272: 'coyote, prairie wolf, brush wolf, Canis latrans',
    273: 'dingo, warrigal, warragal, Canis dingo',
    274: 'dhole, Cuon alpinus',
    275: 'African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus',
    276: 'hyena, hyaena',
    277: 'red fox, Vulpes vulpes',
    278: 'kit fox, Vulpes macrotis',
    279: 'Arctic fox, white fox, Alopex lagopus',
    280: 'grey fox, gray fox, Urocyon cinereoargenteus',
    281: 'tabby, tabby cat',
    282: 'tiger cat',
    283: 'Persian cat',
    284: 'Siamese cat, Siamese',
    285: 'Egyptian cat',
    286: 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor',
    287: 'lynx, catamount',
    288: 'leopard, Panthera pardus',
    289: 'snow leopard, ounce, Panthera uncia',
    290: 'jaguar, panther, Panthera onca, Felis onca',
    291: 'lion, king of beasts, Panthera leo',
    292: 'tiger, Panthera tigris',
    293: 'cheetah, chetah, Acinonyx jubatus',
    294: 'brown bear, bruin, Ursus arctos',
    295: 'American black bear, black bear, Ursus americanus, Euarctos americanus',
    296: 'ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus',
    297: 'sloth bear, Melursus ursinus, Ursus ursinus',
    298: 'mongoose',
    299: 'meerkat, mierkat',
    300: 'tiger beetle',
    301: 'ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle',
    302: 'ground beetle, carabid beetle',
    303: 'long-horned beetle, longicorn, longicorn beetle',
    304: 'leaf beetle, chrysomelid',
    305: 'dung beetle',
    306: 'rhinoceros beetle',
    307: 'weevil',
    308: 'fly',
    309: 'bee',
    310: 'ant, emmet, pismire',
    311: 'grasshopper, hopper',
    312: 'cricket',
    313: 'walking stick, walkingstick, stick insect',
    314: 'cockroach, roach',
    315: 'mantis, mantid',
    316: 'cicada, cicala',
    317: 'leafhopper',
    318: 'lacewing, lacewing fly',
    319: "dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk",
    320: 'damselfly',
    321: 'admiral',
    322: 'ringlet, ringlet butterfly',
    323: 'monarch, monarch butterfly, milkweed butterfly, Danaus plexippus',
    324: 'cabbage butterfly',
    325: 'sulphur butterfly, sulfur butterfly',
    326: 'lycaenid, lycaenid butterfly',
    327: 'starfish, sea star',
    328: 'sea urchin',
    329: 'sea cucumber, holothurian',
    330: 'wood rabbit, cottontail, cottontail rabbit',
    331: 'hare',
    332: 'Angora, Angora rabbit',
    333: 'hamster',
    334: 'porcupine, hedgehog',
    335: 'fox squirrel, eastern fox squirrel, Sciurus niger',
    336: 'marmot',
    337: 'beaver',
    338: 'guinea pig, Cavia cobaya',
    339: 'sorrel',
    340: 'zebra',
    341: 'hog, pig, grunter, squealer, Sus scrofa',
    342: 'wild boar, boar, Sus scrofa',
    343: 'warthog',
    344: 'hippopotamus, hippo, river horse, Hippopotamus amphibius',
    345: 'ox',
    346: 'water buffalo, water ox, Asiatic buffalo, Bubalus bubalis',
    347: 'bison',
    348: 'ram, tup',
    349: 'bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis',
    350: 'ibex, Capra ibex',
    351: 'hartebeest',
    352: 'impala, Aepyceros melampus',
    353: 'gazelle',
    354: 'Arabian camel, dromedary, Camelus dromedarius',
    355: 'llama',
    356: 'weasel',
    357: 'mink',
    358: 'polecat, fitch, foulmart, foumart, Mustela putorius',
    359: 'black-footed ferret, ferret, Mustela nigripes',
    360: 'otter',
    361: 'skunk, polecat, wood pussy',
    362: 'badger',
    363: 'armadillo',
    364: 'three-toed sloth, ai, Bradypus tridactylus',
    365: 'orangutan, orang, orangutang, Pongo pygmaeus',
    366: 'gorilla, Gorilla gorilla',
    367: 'chimpanzee, chimp, Pan troglodytes',
    368: 'gibbon, Hylobates lar',
    369: 'siamang, Hylobates syndactylus, Symphalangus syndactylus',
    370: 'guenon, guenon monkey',
    371: 'patas, hussar monkey, Erythrocebus patas',
    372: 'baboon',
    373: 'macaque',
    374: 'langur',
    375: 'colobus, colobus monkey',
    376: 'proboscis monkey, Nasalis larvatus',
    377: 'marmoset',
    378: 'capuchin, ringtail, Cebus capucinus',
    379: 'howler monkey, howler',
    380: 'titi, titi monkey',
    381: 'spider monkey, Ateles geoffroyi',
    382: 'squirrel monkey, Saimiri sciureus',
    383: 'Madagascar cat, ring-tailed lemur, Lemur catta',
    384: 'indri, indris, Indri indri, Indri brevicaudatus',
    385: 'Indian elephant, Elephas maximus',
    386: 'African elephant, Loxodonta africana',
    387: 'lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens',
    388: 'giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca',
    389: 'barracouta, snoek',
    390: 'eel',
    391: 'coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch',
    392: 'rock beauty, Holocanthus tricolor',
    393: 'anemone fish',
    394: 'sturgeon',
    395: 'gar, garfish, garpike, billfish, Lepisosteus osseus',
    396: 'lionfish',
    397: 'puffer, pufferfish, blowfish, globefish',
    398: 'abacus',
    399: 'abaya',
    400: "academic gown, academic robe, judge's robe",
    401: 'accordion, piano accordion, squeeze box',
    402: 'acoustic guitar',
    403: 'aircraft carrier, carrier, flattop, attack aircraft carrier',
    404: 'airliner',
    405: 'airship, dirigible',
    406: 'altar',
    407: 'ambulance',
    408: 'amphibian, amphibious vehicle',
    409: 'analog clock',
    410: 'apiary, bee house',
    411: 'apron',
    412: 'ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin',
    413: 'assault rifle, assault gun',
    414: 'backpack, back pack, knapsack, packsack, rucksack, haversack',
    415: 'bakery, bakeshop, bakehouse',
    416: 'balance beam, beam',
    417: 'balloon',
    418: 'ballpoint, ballpoint pen, ballpen, Biro',
    419: 'Band Aid',
    420: 'banjo',
    421: 'bannister, banister, balustrade, balusters, handrail',
    422: 'barbell',
    423: 'barber chair',
    424: 'barbershop',
    425: 'barn',
    426: 'barometer',
    427: 'barrel, cask',
    428: 'barrow, garden cart, lawn cart, wheelbarrow',
    429: 'baseball',
    430: 'basketball',
    431: 'bassinet',
    432: 'bassoon',
    433: 'bathing cap, swimming cap',
    434: 'bath towel',
    435: 'bathtub, bathing tub, bath, tub',
    436: 'beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon',
    437: 'beacon, lighthouse, beacon light, pharos',
    438: 'beaker',
    439: 'bearskin, busby, shako',
    440: 'beer bottle',
    441: 'beer glass',
    442: 'bell cote, bell cot',
    443: 'bib',
    444: 'bicycle-built-for-two, tandem bicycle, tandem',
    445: 'bikini, two-piece',
    446: 'binder, ring-binder',
    447: 'binoculars, field glasses, opera glasses',
    448: 'birdhouse',
    449: 'boathouse',
    450: 'bobsled, bobsleigh, bob',
    451: 'bolo tie, bolo, bola tie, bola',
    452: 'bonnet, poke bonnet',
    453: 'bookcase',
    454: 'bookshop, bookstore, bookstall',
    455: 'bottlecap',
    456: 'bow',
    457: 'bow tie, bow-tie, bowtie',
    458: 'brass, memorial tablet, plaque',
    459: 'brassiere, bra, bandeau',
    460: 'breakwater, groin, groyne, mole, bulwark, seawall, jetty',
    461: 'breastplate, aegis, egis',
    462: 'broom',
    463: 'bucket, pail',
    464: 'buckle',
    465: 'bulletproof vest',
    466: 'bullet train, bullet',
    467: 'butcher shop, meat market',
    468: 'cab, hack, taxi, taxicab',
    469: 'caldron, cauldron',
    470: 'candle, taper, wax light',
    471: 'cannon',
    472: 'canoe',
    473: 'can opener, tin opener',
    474: 'cardigan',
    475: 'car mirror',
    476: 'carousel, carrousel, merry-go-round, roundabout, whirligig',
    477: "carpenter's kit, tool kit",
    478: 'carton',
    479: 'car wheel',
    480: 'cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM',
    481: 'cassette',
    482: 'cassette player',
    483: 'castle',
    484: 'catamaran',
    485: 'CD player',
    486: 'cello, violoncello',
    487: 'cellular telephone, cellular phone, cellphone, cell, mobile phone',
    488: 'chain',
    489: 'chainlink fence',
    490: 'chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour',
    491: 'chain saw, chainsaw',
    492: 'chest',
    493: 'chiffonier, commode',
    494: 'chime, bell, gong',
    495: 'china cabinet, china closet',
    496: 'Christmas stocking',
    497: 'church, church building',
    498: 'cinema, movie theater, movie theatre, movie house, picture palace',
    499: 'cleaver, meat cleaver, chopper',
    500: 'cliff dwelling',
    501: 'cloak',
    502: 'clog, geta, patten, sabot',
    503: 'cocktail shaker',
    504: 'coffee mug',
    505: 'coffeepot',
    506: 'coil, spiral, volute, whorl, helix',
    507: 'combination lock',
    508: 'computer keyboard, keypad',
    509: 'confectionery, confectionary, candy store',
    510: 'container ship, containership, container vessel',
    511: 'convertible',
    512: 'corkscrew, bottle screw',
    513: 'cornet, horn, trumpet, trump',
    514: 'cowboy boot',
    515: 'cowboy hat, ten-gallon hat',
    516: 'cradle',
    517: 'crane',
    518: 'crash helmet',
    519: 'crate',
    520: 'crib, cot',
    521: 'Crock Pot',
    522: 'croquet ball',
    523: 'crutch',
    524: 'cuirass',
    525: 'dam, dike, dyke',
    526: 'desk',
    527: 'desktop computer',
    528: 'dial telephone, dial phone',
    529: 'diaper, nappy, napkin',
    530: 'digital clock',
    531: 'digital watch',
    532: 'dining table, board',
    533: 'dishrag, dishcloth',
    534: 'dishwasher, dish washer, dishwashing machine',
    535: 'disk brake, disc brake',
    536: 'dock, dockage, docking facility',
    537: 'dogsled, dog sled, dog sleigh',
    538: 'dome',
    539: 'doormat, welcome mat',
    540: 'drilling platform, offshore rig',
    541: 'drum, membranophone, tympan',
    542: 'drumstick',
    543: 'dumbbell',
    544: 'Dutch oven',
    545: 'electric fan, blower',
    546: 'electric guitar',
    547: 'electric locomotive',
    548: 'entertainment center',
    549: 'envelope',
    550: 'espresso maker',
    551: 'face powder',
    552: 'feather boa, boa',
    553: 'file, file cabinet, filing cabinet',
    554: 'fireboat',
    555: 'fire engine, fire truck',
    556: 'fire screen, fireguard',
    557: 'flagpole, flagstaff',
    558: 'flute, transverse flute',
    559: 'folding chair',
    560: 'football helmet',
    561: 'forklift',
    562: 'fountain',
    563: 'fountain pen',
    564: 'four-poster',
    565: 'freight car',
    566: 'French horn, horn',
    567: 'frying pan, frypan, skillet',
    568: 'fur coat',
    569: 'garbage truck, dustcart',
    570: 'gasmask, respirator, gas helmet',
    571: 'gas pump, gasoline pump, petrol pump, island dispenser',
    572: 'goblet',
    573: 'go-kart',
    574: 'golf ball',
    575: 'golfcart, golf cart',
    576: 'gondola',
    577: 'gong, tam-tam',
    578: 'gown',
    579: 'grand piano, grand',
    580: 'greenhouse, nursery, glasshouse',
    581: 'grille, radiator grille',
    582: 'grocery store, grocery, food market, market',
    583: 'guillotine',
    584: 'hair slide',
    585: 'hair spray',
    586: 'half track',
    587: 'hammer',
    588: 'hamper',
    589: 'hand blower, blow dryer, blow drier, hair dryer, hair drier',
    590: 'hand-held computer, hand-held microcomputer',
    591: 'handkerchief, hankie, hanky, hankey',
    592: 'hard disc, hard disk, fixed disk',
    593: 'harmonica, mouth organ, harp, mouth harp',
    594: 'harp',
    595: 'harvester, reaper',
    596: 'hatchet',
    597: 'holster',
    598: 'home theater, home theatre',
    599: 'honeycomb',
    600: 'hook, claw',
    601: 'hoopskirt, crinoline',
    602: 'horizontal bar, high bar',
    603: 'horse cart, horse-cart',
    604: 'hourglass',
    605: 'iPod',
    606: 'iron, smoothing iron',
    607: "jack-o'-lantern",
    608: 'jean, blue jean, denim',
    609: 'jeep, landrover',
    610: 'jersey, T-shirt, tee shirt',
    611: 'jigsaw puzzle',
    612: 'jinrikisha, ricksha, rickshaw',
    613: 'joystick',
    614: 'kimono',
    615: 'knee pad',
    616: 'knot',
    617: 'lab coat, laboratory coat',
    618: 'ladle',
    619: 'lampshade, lamp shade',
    620: 'laptop, laptop computer',
    621: 'lawn mower, mower',
    622: 'lens cap, lens cover',
    623: 'letter opener, paper knife, paperknife',
    624: 'library',
    625: 'lifeboat',
    626: 'lighter, light, igniter, ignitor',
    627: 'limousine, limo',
    628: 'liner, ocean liner',
    629: 'lipstick, lip rouge',
    630: 'Loafer',
    631: 'lotion',
    632: 'loudspeaker, speaker, speaker unit, loudspeaker system, speaker system',
    633: "loupe, jeweler's loupe",
    634: 'lumbermill, sawmill',
    635: 'magnetic compass',
    636: 'mailbag, postbag',
    637: 'mailbox, letter box',
    638: 'maillot',
    639: 'maillot, tank suit',
    640: 'manhole cover',
    641: 'maraca',
    642: 'marimba, xylophone',
    643: 'mask',
    644: 'matchstick',
    645: 'maypole',
    646: 'maze, labyrinth',
    647: 'measuring cup',
    648: 'medicine chest, medicine cabinet',
    649: 'megalith, megalithic structure',
    650: 'microphone, mike',
    651: 'microwave, microwave oven',
    652: 'military uniform',
    653: 'milk can',
    654: 'minibus',
    655: 'miniskirt, mini',
    656: 'minivan',
    657: 'missile',
    658: 'mitten',
    659: 'mixing bowl',
    660: 'mobile home, manufactured home',
    661: 'Model T',
    662: 'modem',
    663: 'monastery',
    664: 'monitor',
    665: 'moped',
    666: 'mortar',
    667: 'mortarboard',
    668: 'mosque',
    669: 'mosquito net',
    670: 'motor scooter, scooter',
    671: 'mountain bike, all-terrain bike, off-roader',
    672: 'mountain tent',
    673: 'mouse, computer mouse',
    674: 'mousetrap',
    675: 'moving van',
    676: 'muzzle',
    677: 'nail',
    678: 'neck brace',
    679: 'necklace',
    680: 'nipple',
    681: 'notebook, notebook computer',
    682: 'obelisk',
    683: 'oboe, hautboy, hautbois',
    684: 'ocarina, sweet potato',
    685: 'odometer, hodometer, mileometer, milometer',
    686: 'oil filter',
    687: 'organ, pipe organ',
    688: 'oscilloscope, scope, cathode-ray oscilloscope, CRO',
    689: 'overskirt',
    690: 'oxcart',
    691: 'oxygen mask',
    692: 'packet',
    693: 'paddle, boat paddle',
    694: 'paddlewheel, paddle wheel',
    695: 'padlock',
    696: 'paintbrush',
    697: "pajama, pyjama, pj's, jammies",
    698: 'palace',
    699: 'panpipe, pandean pipe, syrinx',
    700: 'paper towel',
    701: 'parachute, chute',
    702: 'parallel bars, bars',
    703: 'park bench',
    704: 'parking meter',
    705: 'passenger car, coach, carriage',
    706: 'patio, terrace',
    707: 'pay-phone, pay-station',
    708: 'pedestal, plinth, footstall',
    709: 'pencil box, pencil case',
    710: 'pencil sharpener',
    711: 'perfume, essence',
    712: 'Petri dish',
    713: 'photocopier',
    714: 'pick, plectrum, plectron',
    715: 'pickelhaube',
    716: 'picket fence, paling',
    717: 'pickup, pickup truck',
    718: 'pier',
    719: 'piggy bank, penny bank',
    720: 'pill bottle',
    721: 'pillow',
    722: 'ping-pong ball',
    723: 'pinwheel',
    724: 'pirate, pirate ship',
    725: 'pitcher, ewer',
    726: "plane, carpenter's plane, woodworking plane",
    727: 'planetarium',
    728: 'plastic bag',
    729: 'plate rack',
    730: 'plow, plough',
    731: "plunger, plumber's helper",
    732: 'Polaroid camera, Polaroid Land camera',
    733: 'pole',
    734: 'police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria',
    735: 'poncho',
    736: 'pool table, billiard table, snooker table',
    737: 'pop bottle, soda bottle',
    738: 'pot, flowerpot',
    739: "potter's wheel",
    740: 'power drill',
    741: 'prayer rug, prayer mat',
    742: 'printer',
    743: 'prison, prison house',
    744: 'projectile, missile',
    745: 'projector',
    746: 'puck, hockey puck',
    747: 'punching bag, punch bag, punching ball, punchball',
    748: 'purse',
    749: 'quill, quill pen',
    750: 'quilt, comforter, comfort, puff',
    751: 'racer, race car, racing car',
    752: 'racket, racquet',
    753: 'radiator',
    754: 'radio, wireless',
    755: 'radio telescope, radio reflector',
    756: 'rain barrel',
    757: 'recreational vehicle, RV, R.V.',
    758: 'reel',
    759: 'reflex camera',
    760: 'refrigerator, icebox',
    761: 'remote control, remote',
    762: 'restaurant, eating house, eating place, eatery',
    763: 'revolver, six-gun, six-shooter',
    764: 'rifle',
    765: 'rocking chair, rocker',
    766: 'rotisserie',
    767: 'rubber eraser, rubber, pencil eraser',
    768: 'rugby ball',
    769: 'rule, ruler',
    770: 'running shoe',
    771: 'safe',
    772: 'safety pin',
    773: 'saltshaker, salt shaker',
    774: 'sandal',
    775: 'sarong',
    776: 'sax, saxophone',
    777: 'scabbard',
    778: 'scale, weighing machine',
    779: 'school bus',
    780: 'schooner',
    781: 'scoreboard',
    782: 'screen, CRT screen',
    783: 'screw',
    784: 'screwdriver',
    785: 'seat belt, seatbelt',
    786: 'sewing machine',
    787: 'shield, buckler',
    788: 'shoe shop, shoe-shop, shoe store',
    789: 'shoji',
    790: 'shopping basket',
    791: 'shopping cart',
    792: 'shovel',
    793: 'shower cap',
    794: 'shower curtain',
    795: 'ski',
    796: 'ski mask',
    797: 'sleeping bag',
    798: 'slide rule, slipstick',
    799: 'sliding door',
    800: 'slot, one-armed bandit',
    801: 'snorkel',
    802: 'snowmobile',
    803: 'snowplow, snowplough',
    804: 'soap dispenser',
    805: 'soccer ball',
    806: 'sock',
    807: 'solar dish, solar collector, solar furnace',
    808: 'sombrero',
    809: 'soup bowl',
    810: 'space bar',
    811: 'space heater',
    812: 'space shuttle',
    813: 'spatula',
    814: 'speedboat',
    815: "spider web, spider's web",
    816: 'spindle',
    817: 'sports car, sport car',
    818: 'spotlight, spot',
    819: 'stage',
    820: 'steam locomotive',
    821: 'steel arch bridge',
    822: 'steel drum',
    823: 'stethoscope',
    824: 'stole',
    825: 'stone wall',
    826: 'stopwatch, stop watch',
    827: 'stove',
    828: 'strainer',
    829: 'streetcar, tram, tramcar, trolley, trolley car',
    830: 'stretcher',
    831: 'studio couch, day bed',
    832: 'stupa, tope',
    833: 'submarine, pigboat, sub, U-boat',
    834: 'suit, suit of clothes',
    835: 'sundial',
    836: 'sunglass',
    837: 'sunglasses, dark glasses, shades',
    838: 'sunscreen, sunblock, sun blocker',
    839: 'suspension bridge',
    840: 'swab, swob, mop',
    841: 'sweatshirt',
    842: 'swimming trunks, bathing trunks',
    843: 'swing',
    844: 'switch, electric switch, electrical switch',
    845: 'syringe',
    846: 'table lamp',
    847: 'tank, army tank, armored combat vehicle, armoured combat vehicle',
    848: 'tape player',
    849: 'teapot',
    850: 'teddy, teddy bear',
    851: 'television, television system',
    852: 'tennis ball',
    853: 'thatch, thatched roof',
    854: 'theater curtain, theatre curtain',
    855: 'thimble',
    856: 'thresher, thrasher, threshing machine',
    857: 'throne',
    858: 'tile roof',
    859: 'toaster',
    860: 'tobacco shop, tobacconist shop, tobacconist',
    861: 'toilet seat',
    862: 'torch',
    863: 'totem pole',
    864: 'tow truck, tow car, wrecker',
    865: 'toyshop',
    866: 'tractor',
    867: 'trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi',
    868: 'tray',
    869: 'trench coat',
    870: 'tricycle, trike, velocipede',
    871: 'trimaran',
    872: 'tripod',
    873: 'triumphal arch',
    874: 'trolleybus, trolley coach, trackless trolley',
    875: 'trombone',
    876: 'tub, vat',
    877: 'turnstile',
    878: 'typewriter keyboard',
    879: 'umbrella',
    880: 'unicycle, monocycle',
    881: 'upright, upright piano',
    882: 'vacuum, vacuum cleaner',
    883: 'vase',
    884: 'vault',
    885: 'velvet',
    886: 'vending machine',
    887: 'vestment',
    888: 'viaduct',
    889: 'violin, fiddle',
    890: 'volleyball',
    891: 'waffle iron',
    892: 'wall clock',
    893: 'wallet, billfold, notecase, pocketbook',
    894: 'wardrobe, closet, press',
    895: 'warplane, military plane',
    896: 'washbasin, handbasin, washbowl, lavabo, wash-hand basin',
    897: 'washer, automatic washer, washing machine',
    898: 'water bottle',
    899: 'water jug',
    900: 'water tower',
    901: 'whiskey jug',
    902: 'whistle',
    903: 'wig',
    904: 'window screen',
    905: 'window shade',
    906: 'Windsor tie',
    907: 'wine bottle',
    908: 'wing',
    909: 'wok',
    910: 'wooden spoon',
    911: 'wool, woolen, woollen',
    912: 'worm fence, snake fence, snake-rail fence, Virginia fence',
    913: 'wreck',
    914: 'yawl',
    915: 'yurt',
    916: 'web site, website, internet site, site',
    917: 'comic book',
    918: 'crossword puzzle, crossword',
    919: 'street sign',
    920: 'traffic light, traffic signal, stoplight',
    921: 'book jacket, dust cover, dust jacket, dust wrapper',
    922: 'menu',
    923: 'plate',
    924: 'guacamole',
    925: 'consomme',
    926: 'hot pot, hotpot',
    927: 'trifle',
    928: 'ice cream, icecream',
    929: 'ice lolly, lolly, lollipop, popsicle',
    930: 'French loaf',
    931: 'bagel, beigel',
    932: 'pretzel',
    933: 'cheeseburger',
    934: 'hotdog, hot dog, red hot',
    935: 'mashed potato',
    936: 'head cabbage',
    937: 'broccoli',
    938: 'cauliflower',
    939: 'zucchini, courgette',
    940: 'spaghetti squash',
    941: 'acorn squash',
    942: 'butternut squash',
    943: 'cucumber, cuke',
    944: 'artichoke, globe artichoke',
    945: 'bell pepper',
    946: 'cardoon',
    947: 'mushroom',
    948: 'Granny Smith',
    949: 'strawberry',
    950: 'orange',
    951: 'lemon',
    952: 'fig',
    953: 'pineapple, ananas',
    954: 'banana',
    955: 'jackfruit, jak, jack',
    956: 'custard apple',
    957: 'pomegranate',
    958: 'hay',
    959: 'carbonara',
    960: 'chocolate sauce, chocolate syrup',
    961: 'dough',
    962: 'meat loaf, meatloaf',
    963: 'pizza, pizza pie',
    964: 'potpie',
    965: 'burrito',
    966: 'red wine',
    967: 'espresso',
    968: 'cup',
    969: 'eggnog',
    970: 'alp',
    971: 'bubble',
    972: 'cliff, drop, drop-off',
    973: 'coral reef',
    974: 'geyser',
    975: 'lakeside, lakeshore',
    976: 'promontory, headland, head, foreland',
    977: 'sandbar, sand bar',
    978: 'seashore, coast, seacoast, sea-coast',
    979: 'valley, vale',
    980: 'volcano',
    981: 'ballplayer, baseball player',
    982: 'groom, bridegroom',
    983: 'scuba diver',
    984: 'rapeseed',
    985: 'daisy',
    986: "yellow lady's slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium parviflorum",
    987: 'corn',
    988: 'acorn',
    989: 'hip, rose hip, rosehip',
    990: 'buckeye, horse chestnut, conker',
    991: 'coral fungus',
    992: 'agaric',
    993: 'gyromitra',
    994: 'stinkhorn, carrion fungus',
    995: 'earthstar',
    996: 'hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa',
    997: 'bolete',
    998: 'ear, spike, capitulum',
    999: 'toilet tissue, toilet paper, bathroom tissue'
}

Now that we have our class labels, let's see how we can find which class label corresponds to the predicted class for our random tensor.  

We use [*`torch.argmax()`*](https://pytorch.org/docs/stable/generated/torch.argmax.html) to find the index of the class with the highest score in the output.  

As before, the code below passes a random tensor through MobileNet V3, but this time it also looks up the class label corresponding to the predicted class.

In [None]:
# Define a random tensor (including a batch dimension)
random_tensor = torch.rand([1, 3, 224, 224], dtype=torch.float32)

# Perform the forward pass through our mobilenet_v3 model
with torch.no_grad():
    output = mobilenet_v3(random_tensor)

# Determine the index of the class with the highest score
#   We need to specify the dimension to argmax so that we compute it per-element in our batch
#   This will mean if the batch_size is N, we will get an (N, ) dimensional vector
class_indexes = output.argmax(dim=1)

# Extract just the first class index because our batch size = 1
predicted_class_index = class_indexes[0]
print(f'The predicted class index is: {predicted_class_index}')

# Lookup the class label based on the class index producing the highest score
#  predicted_class_index is a tensor. To convert it to an integer we need to use .item()
print(f'The predicted label is: "{IMAGENET_CLASSES[predicted_class_index.item()]}"')

After running the above code cell you should see a class index and class label printed out!  
That means you've just successfully made a prediction with a neural network on some random data!  

Before moving on, a couple of points worth noting:  
* In our case we only made a prediction on a single tensor, so we could simply take the first element of the `class_indexes` tensor. If your batch size is > 1, you would want to get the class label for each element in your `class_indexes` tensor (this is why specifying `dim=1` is so important in the call to *`argmax()`*).
* After extracting `predicted_class_index`, its type was still a tensor. Because we wanted to use this value as a key into the `IMAGENET_CLASSES` dictionary, we needed to call [*`.item()`*](https://pytorch.org/docs/stable/generated/torch.Tensor.item.html) on the tensor. This converts a single-element tensor into a standard Python number.
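
To make both points concrete, here is a minimal sketch that uses a small hand-written score tensor instead of real model output. The `scores` values and the five-entry `labels` dictionary are made up for illustration; they are not part of `IMAGENET_CLASSES` or MobileNet V3.

```python
import torch

# Hypothetical stand-in for model output: 3 batch elements, 5 classes each
scores = torch.tensor([
    [0.1, 2.0, -1.0, 0.3, 0.0],
    [1.5, 0.2, 0.1, 0.0, -0.4],
    [0.0, 0.1, 0.2, 0.3, 4.0],
])

# Made-up labels, keyed by integer class index like IMAGENET_CLASSES
labels = {0: 'cat', 1: 'dog', 2: 'bird', 3: 'fish', 4: 'frog'}

# With dim=1 we get one class index per batch element, shape (3,)
per_item = scores.argmax(dim=1)

# Without dim, argmax indexes into the flattened tensor instead
flat = scores.argmax()

# .item() turns each single-element tensor into a plain Python int
predicted_labels = [labels[index.item()] for index in per_item]

print(per_item.tolist())   # [1, 0, 4]
print(flat.item())         # 14
print(predicted_labels)    # ['dog', 'cat', 'frog']
```

Notice that the result without `dim` (14) is not a valid class index at all, because it counts positions across the whole batch.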

### Classifying a Real Image
Now that we know how to make predictions on random data, the next step is to classify real image data!  

Along with this lab you should have downloaded a `french_bulldog.png` file. In this section we will see whether we can get our network to correctly predict that our image is a French Bulldog! (This happens to be class index 245 in `IMAGENET_CLASSES`.)

Given we will be working with data directly from a file, we should be aware of some modifications we need to make to the input image before we are able to pass it through our network.  

Along with loading the image into a *numpy* array and converting it to a *PyTorch* tensor, the pretrained network we are using requires that the input is resized to 224x224 pixels, that the pixel values are in the range \[0, 1], and that the datatype is *float32*. Following the next code cell we will look into these criteria further, but for now be aware that these are a few things we will need to do.
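
If you want to sanity-check an array against these requirements before converting it to a tensor, a small helper like the one sketched below can help. Note that `check_model_input` is a hypothetical function written for this illustration (it is not part of PyTorch or Torchvision), and the channels-first `(3, H, W)` layout it expects is an assumption based on the shape of the random tensor we used earlier.

```python
import numpy as np


def check_model_input(array, expected_hw=(224, 224)):
    """Return a list of problems found (an empty list means none were found)."""
    problems = []
    if array.dtype != np.float32:
        problems.append(f'dtype is {array.dtype}, expected float32')
    if array.ndim != 3 or array.shape[0] != 3:
        problems.append(f'shape is {array.shape}, expected (3, H, W)')
    elif tuple(array.shape[1:]) != tuple(expected_hw):
        problems.append(f'spatial size is {array.shape[1:]}, expected {expected_hw}')
    if array.size > 0 and (array.min() < 0 or array.max() > 1):
        problems.append('pixel values fall outside [0, 1]')
    return problems


rng = np.random.default_rng(0)

# Synthetic "freshly loaded" image: HxWxC, uint8, values in [0, 255]
raw = rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
raw_problems = check_model_input(raw)

# Synthetic array that already meets the requirements
ready = rng.random((3, 224, 224), dtype=np.float32)
ready_problems = check_model_input(ready)

print(raw_problems)
print(ready_problems)
```

The first call should report problems with the datatype, the layout and the value range, while the second should report none.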

Now it's your turn! Let's try to classify the French Bulldog! Make sure you follow each step below, as there are a few things we need to do.

**Task**: In the code cell below, load, preprocess and classify the `french_bulldog.png` image, following the steps outlined in the cell comments. You'll need to copy in your *`load_image_rgb()`* and *`display_image()`* functions from Lab 1.

**NOTE:** The predicted class might not be what you expect. Also, if you run your code cell multiple times you'll likely see different predictions. Read on to see why!

In [None]:
# TODO: Copy in your load_image_rgb function from Lab 1



# TODO: Copy in your display_image function from Lab 1



# TODO: Load the french_bulldog.png image (using load_image_rgb)
# image = ...


# TODO: Display the loaded image (using display_image)



# TODO: Resize the image to 224x224 pixels using OpenCV (refer to Lab 1)
# image = ...


# TODO: Transpose the image to get dimensionality CxHxW
# image = ...


# TODO: Cast the image to a float32 datatype with pixels in the range [0, 1]
image = image.astype(np.float32) / 255


# TODO: Convert the image to a tensor
# tensor_image = ...


# TODO: Display the tensor (using display_tensor)



# TODO: Create a batch dimension for your tensor
# batched_tensor = ...


# TODO: Pass your batched tensor through mobilenet_v3



# TODO: Print out the label corresponding to the class with the highest score




#### Task solution

In [None]:
# TODO: Copy in your load_image_rgb function from Lab 1
def load_image_rgb(filepath):
    image = cv2.imread(filepath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image


# TODO: Copy in your display_image function from Lab 1
def display_image(image):
    fig, axes = plt.subplots(figsize=(12, 8))

    axes.imshow(image)

    plt.show()


# TODO: Load the french_bulldog.png image (using load_image_rgb)
image = load_image_rgb('french_bulldog.png')


# TODO: Display the loaded image (using display_image)
display_image(image)


# TODO: Resize the image to 224x224 pixels using OpenCV (refer to Lab 1)
image = cv2.resize(image, dsize=(224, 224), interpolation=cv2.INTER_AREA)


# TODO: Transpose the image to get dimensionality CxHxW
image = np.transpose(image, (2, 0, 1))


# TODO: Cast the image to a float32 datatype with pixels in the range [0, 1]
image = image.astype(np.float32) / 255


# TODO: Convert the image to a tensor
tensor_image = torch.as_tensor(image)


# TODO: Display the tensor (using display_tensor)
display_tensor(tensor_image)


# TODO: Create a batch dimension for your tensor
batched_tensor = tensor_image.unsqueeze(dim=0)


# TODO: Pass your batched tensor through mobilenet_v3
with torch.no_grad():
    outputs = mobilenet_v3(batched_tensor)


# TODO: Print out the label corresponding to the class with the highest score
class_indexes = outputs.argmax(dim=1)
predicted_class_index = class_indexes[0]
print(f'The predicted label is: "{IMAGENET_CLASSES[predicted_class_index.item()]}"')

#### Evaluation mode

Did you notice that the predicted class wasn't quite right?  
Run your code cell 5-10 times. Do you see different predictions?  

If the predictions change between runs, that's a strong sign that something isn't quite right.  

A neural network can be in one of two modes: ***train**ing* mode or ***eval**uation* mode.
Some layers in our network (such as BatchNorm and Dropout) behave very differently depending on which mode the network is in. If you saw different predictions each time you tried to make a prediction on your image, then this is a strong indicator that our network is in the wrong mode!  

Whenever we want to predict on new data, we need to make sure our network is set to ***eval**uation* mode by calling its [*`eval()`* method](https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.eval).  
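
To see the effect of the two modes in isolation, here is a minimal sketch using a tiny stand-in network rather than MobileNet V3. The `toy_model` below (one linear layer followed by Dropout) is made up purely for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Tiny hypothetical network: a linear layer followed by Dropout
toy_model = nn.Sequential(nn.Linear(8, 256), nn.Dropout(p=0.5))
x = torch.ones(1, 8)

# Training mode: Dropout randomly zeroes activations on every forward pass
toy_model.train()
with torch.no_grad():
    train_a = toy_model(x)
    train_b = toy_model(x)

# Evaluation mode: Dropout is disabled, so the output is repeatable
toy_model.eval()
with torch.no_grad():
    eval_a = toy_model(x)
    eval_b = toy_model(x)

print('training flag after eval():', toy_model.training)
print('train-mode outputs identical:', torch.equal(train_a, train_b))
print('eval-mode outputs identical: ', torch.equal(eval_a, eval_b))
```

The same input gives different outputs in training mode and identical outputs in evaluation mode. Also note that `torch.no_grad()` does not switch modes for us: it only disables gradient tracking.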

**Task**: In the code cell below, set your `mobilenet_v3` model to eval mode at the top of the code cell, then copy in your solution from the code cell above (no need to copy your function definitions).

Run the code cell a few times to make sure your predictions are the same each time.

In [None]:
# TODO: Set your MobileNet V3 into eval mode



# TODO: Copy your code from the above code cell here. Do not copy the function definitions




#### Task solution

In [None]:
# TODO: Set your MobileNet V3 into eval mode
mobilenet_v3.eval()


# TODO: Copy your code from the above code cell here. Do not copy the function definitions
# TODO: Load the french_bulldog.png image (using load_image_rgb)
image = load_image_rgb('french_bulldog.png')


# TODO: Display the loaded image (using display_image)
display_image(image)


# TODO: Resize the image to 224x224 pixels using OpenCV (refer to Lab 1)
image = cv2.resize(image, dsize=(224, 224), interpolation=cv2.INTER_AREA)


# TODO: Transpose the image to get dimensionality CxHxW
image = np.transpose(image, (2, 0, 1))


# TODO: Cast the image to a float32 datatype with pixels in the range [0, 1]
image = image.astype(np.float32) / 255


# TODO: Convert the image to a tensor
tensor_image = torch.as_tensor(image)


# TODO: Display the tensor (using display_tensor)
display_tensor(tensor_image)


# TODO: Create a batch dimension for your tensor
batched_tensor = tensor_image.unsqueeze(dim=0)


# TODO: Pass your batched tensor through mobilenet_v3
with torch.no_grad():
    outputs = mobilenet_v3(batched_tensor)


# TODO: Print out the label corresponding to the class with the highest score
class_indexes = outputs.argmax(dim=1)
predicted_class_index = class_indexes[0]
print(f'The predicted label is: "{IMAGENET_CLASSES[predicted_class_index.item()]}"')

### Data Preprocessing
Awesome work so far! If everything went well, your code should have correctly predicted that the image was a French Bulldog!

In the previous labs we had a taste of the possible preprocessing steps we can take to prepare images for classification. Since we are using a pre-trained model, we must ensure that we perform the same preprocessing steps on our image data as when it was trained, or else the model won't perform as well. Convolutional Neural Networks drastically reduce the amount of preprocessing required, so we've probably already covered most of the steps required.

Looking at the [torchvision.models documentation](https://pytorch.org/vision/stable/models.html#:~:text=eval()%20for%20details.-,All%20pre-trained%20models%20expect%20input%20images%20normalized%20in%20the,using%20mean%20%3D%20%5B0.485%2C%200.456%2C%200.406%5D%20and%20std%20%3D%20%5B0.229%2C%200.224%2C%200.225%5D.,-You%20can%20use), we can see that it gives us some information about the expected format of image data when we use the pretrained models for classification.  

That paragraph tells us:
* We need mini-batches of RGB images (Bx3xHxW), with height and width of 224 *(We did this)*
* Images should be loaded in to a range of [0,1] *(We did this)*
* Images should be normalized with a given mean and standard deviation *(We didn't do this)*

To make sure the classification results we get are as expected, we need to make sure our image data is in the appropriate format.  

In the previous section we implemented the first two points manually and did not implement the third point.  

To simplify the code we need to write, we can make use of the *torchvision* package to help perform these necessary preprocessing steps. The [torchvision.transforms documentation](https://pytorch.org/vision/stable/transforms.html) has a collection of transforms that we might want to make on our data, and specifically, can handle converting to tensor in the range [0,1], resizing our data, and normalizing our data.

The code below shows examples of using *torchvision* to apply the three types of transforms mentioned above. Specifically we use the [functional transforms](https://pytorch.org/vision/stable/transforms.html#functional-transforms) imported as `tvtf` (named for **t**orch**v**ision.**t**ransforms.**f**unctional).

In [None]:
# Load the image into a numpy ndarray
image = load_image_rgb('french_bulldog.png')
print(f'Loaded image type: {type(image)}.\nMin/Max pixel values: {image.min()}, {image.max()}.\nDatatype: {image.dtype}\nShape: {image.shape}')

print('-' * 50)

# Convert the image to a tensor using torchvision
tensor_image = tvtf.to_tensor(image)
print(f'Loaded image type: {type(tensor_image)}.\nMin/Max pixel values: {tensor_image.min()}, {tensor_image.max()}.\nDatatype: {tensor_image.dtype}\nShape: {tensor_image.shape}')

print('-' * 50)

# Resize the image so that the smallest edge is 224 pixels
tensor_image = tvtf.resize(tensor_image, 224)
display_tensor(tensor_image)

print('-' * 50)

# Normalize the image using the mean/std given in the MobileNetV3 documentation
tensor_image = tvtf.normalize(tensor_image, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
print(f'Min/Max pixel values: {tensor_image.min()}, {tensor_image.max()}.\nShape: {tensor_image.shape}')

As you can see, using *torchvision* greatly simplifies the code we otherwise need to write. The code to convert the *numpy* array to a *PyTorch* tensor can be performed in one line of code using *torchvision* vs. 3 lines of code when writing it manually.  

You should be able to see the analogies between the custom code we needed to write vs. the code we write when using *torchvision*.
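
For reference, normalization subtracts a per-channel mean and divides by a per-channel standard deviation. The numpy-only sketch below spells out that arithmetic for a CxHxW image; the helper name `normalize_chw` and the tiny 2x2 test images are assumptions made for illustration, and this is not torchvision's own implementation.

```python
import numpy as np

# Mean and standard deviation quoted in the torchvision.models documentation
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_chw(image, mean=MEAN, std=STD):
    """Apply (x - mean) / std per channel to a CxHxW float image."""
    # Reshape (3,) -> (3, 1, 1) so each channel gets its own mean and std
    return (image - mean[:, None, None]) / std[:, None, None]


black = np.zeros((3, 2, 2), dtype=np.float32)  # all pixels 0.0
white = np.ones((3, 2, 2), dtype=np.float32)   # all pixels 1.0

print(f'Smallest possible value: {normalize_chw(black).min():.3f}')
print(f'Largest possible value:  {normalize_chw(white).max():.3f}')
```

Note that a normalized image no longer lies in the range [0, 1]: with these constants, values fall roughly between -2.12 and 2.64.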

Resizing our data is an important step to ensure that the spatial size of our image matches what our network expects. The French Bulldog image was square, so we had no issues when we resized it down to a resolution of 224x224. Non-square images are a different story: if we simply resize them to 224x224, we change the aspect ratio and end up with a 'stretched' image.  

One way we can overcome this is to resize the image so that the shortest edge is a bit larger than our desired size, then take a crop from the centre of the image. By doing this we will ideally capture the main contents of the image whilst still making sure we get the required resolution for our network.
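
The arithmetic behind this idea can be sketched in plain Python. The helper names `resize_shortest_edge` and `center_crop_box` and the 1024x512 example size are hypothetical, and library implementations may round slightly differently.

```python
def resize_shortest_edge(width, height, target=256):
    """Return (new_width, new_height) with the shortest edge scaled to `target`."""
    scale = target / min(width, height)
    return int(width * scale), int(height * scale)


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centred crop of size `crop`."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


# Example: a landscape image that is 1024 pixels wide and 512 pixels high
new_width, new_height = resize_shortest_edge(1024, 512)
print(new_width, new_height)                    # 512 256
print(center_crop_box(new_width, new_height))   # (144, 16, 368, 240)
```

The aspect ratio is preserved by the resize (2:1 before and after), and the crop then discards equal margins on opposite sides to reach 224x224.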

The `torchvision` package has transforms that can exactly perform these tasks for us, so in terms of code it just becomes a few function calls!

To resize the image on the shortest edge we can use the [*`resize()`* function](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.functional.resize) and to take a crop from the centre of the image we can use the [*`center_crop()`* function](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.functional.center_crop).

Pay attention to the expected types of arguments to the transform functions when looking at the documentation. In general, most transforms can be applied to a *PIL Image* or *Tensor*. Given our image data is stored in *numpy* arrays, we will first convert our data to a tensor then apply the transforms.

Let's get some practice with this!

**Task**: In the below code cell, load the `cat.jpg` image and perform all of the data preprocessing steps outlined in the cell comments.

In [None]:
# TODO: Load the cat.jpg image (using load_image_rgb)



# TODO: Display the loaded image (using display_image)



# TODO: Convert the image to a tensor (using the torchvision transform)



# TODO: Resize the tensor to 256px on the shortest edge (using the torchvision resize() transform)



# TODO: Take a 224px centre crop of the tensor (using the torchvision center_crop() transform)



# TODO: Display the tensor (using display_tensor)




#### Task solution

In [None]:
# TODO: Load the cat.jpg image (using load_image_rgb)
image = load_image_rgb('cat.jpg')


# TODO: Display the loaded image (using display_image)
display_image(image)


# TODO: Convert the image to a tensor (using the torchvision transform)
tensor_image = tvtf.to_tensor(image)


# TODO: Resize the tensor to 256px on the shortest edge (using the torchvision resize() transform)
tensor_image = tvtf.resize(tensor_image, 256)


# TODO: Take a 224px centre crop of the tensor (using the torchvision center_crop() transform)
tensor_image = tvtf.center_crop(tensor_image, 224)


# TODO: Display the tensor (using display_tensor)
display_tensor(tensor_image)

### Preprocessing Function

Great work!

The `cat.jpg` image initially had a pixel resolution of 1327x913px, however with some transforms you were able to get it down to 224x224px whilst capturing the main part of the image and preserving the aspect ratio of the data.

**Task**: As a final step before moving on, let's write a function that will handle preprocessing an image for us.

In the below code cell, write a function `preprocess_image` that:
* Takes an image as a parameter
* Converts the image to a tensor
* Resizes the tensor to 256px on the shortest edge
* Takes a 224px centre crop of the tensor
* Normalizes the tensor (using the mean and std described above)
* Creates a batch dimension for the tensor
* Returns the processed tensor

Make sure you use the *torchvision* transforms to implement this.

In [None]:
# TODO: Write your function here



cat_image = load_image_rgb('cat.jpg')
cat_tensor = preprocess_image(cat_image)
print(type(cat_tensor))      # Should be torch.Tensor
print(cat_tensor.shape)      # Should be (1, 3, 224, 224)
print(f'Min/Max pixel values: {cat_tensor.min()}, {cat_tensor.max()}.')    # After normalization, roughly within [-2.12, 2.64] (no longer [0, 1])

#### Task solution

In [None]:
def preprocess_image(image):
    # Convert image to tensor
    image = tvtf.to_tensor(image)

    # Resize to 256px on shortest edge
    image = tvtf.resize(image, 256)

    # 224px centre crop
    image = tvtf.center_crop(image, 224)

    # Normalise the image
    image = tvtf.normalize(image, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

    # Create a batch dimension
    image = image.unsqueeze(dim=0)

    return image

### Pulling Everything Together
Now we have covered everything we need to be able to use our pretrained network to classify new unseen data.

In this section let's create a function that can take a neural network model and image data (stored in a numpy array), and give us back some data on the predictions of the image!


Now it's your turn! Let's try to classify the cat image! Make sure you follow each step below, as there are a few things we need to do.

In the below code cell, your task is to write a function named *classify_image* that:
* Takes a *model* and *image* as parameters
* Sets the *model* to evaluation mode
* Preprocesses the image, resulting in a tensor (Use the function you previously wrote)
* Performs a forward pass of the batched data through the model
* Returns a 3-tuple:
    * The output 1000D vector (converted to a *numpy* array),
    * The index of the class with the highest score (use `.item()` ), and
    * The class label corresponding to that score

At the bottom of the code cell is some code that will call your classify_image function with MobileNet V3 and the cat image. Use this to verify that your function works!

In [None]:
# TODO: Write your function here




# Test your function!
cat_image = load_image_rgb('cat.jpg')
outputs, class_index, class_label = classify_image(mobilenet_v3, cat_image)
print(f'Predicted to be: {class_label}')
display_image(cat_image)

#### Task solution

In [None]:
# TODO: Write your function here
def classify_image(model, image):
    # Set model to eval mode
    model.eval()

    # Preprocess the image
    image = preprocess_image(image)

    # Perform the forward pass
    with torch.no_grad():
        outputs = model(image)

    # Get the output vector, class index of highest score and class label
    output = outputs[0]                             # Extract the 1000D vector
    class_index = output.argmax(dim=0).item()       # Output is now a (1000,) dimensional vector, so we argmax over dim=0
    class_label = IMAGENET_CLASSES[class_index]

    # Convert the output vector to a numpy array
    output = output.detach().cpu().numpy()

    # Return the data
    return output, class_index, class_label

### More Predictions
Now you've got a really powerful function that can classify new image data.  

In the code cell below we have provided a function for you that can load image data from a URL into a *numpy* array.

The code cell below defines the function and shows an example of using it.

In [None]:
def load_image_from_url(url):
    """Given a URL, loads the image into a numpy array

    Image loaded in RGB, with HWC channel ordering

    Args:
        url (str): The URL of the image to load

    Returns:
        (np.ndarray): The RGB, HWC ordered image
    """
    with urlopen(url) as ur:
        image = np.asarray(bytearray(ur.read()), dtype='uint8')
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return image

# Load and display an image
image = load_image_from_url('https://images.unsplash.com/photo-1543466835-00a7907e9de1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1267&q=80')
print(type(image), image.shape)
display_image(image)
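
OpenCV decodes colour images with the channels in BGR order, which is why the function above converts to RGB before returning. As a numpy-only sketch of what that channel swap amounts to (the two-pixel image below is made up for illustration):

```python
import numpy as np

# A 1x2 image in BGR order: first pixel pure blue, second pixel pure red
bgr = np.zeros((1, 2, 3), dtype=np.uint8)
bgr[0, 0] = [255, 0, 0]  # (B, G, R) = blue
bgr[0, 1] = [0, 0, 255]  # (B, G, R) = red

# Reversing the last axis turns BGR into RGB
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # [0, 0, 255] -> blue in (R, G, B) order
print(rgb[0, 1].tolist())  # [255, 0, 0] -> red in (R, G, B) order
```

Skipping this conversion would swap the red and blue channels, so the image would be displayed (and classified) with the wrong colours.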

To further improve how we present our classifications, let's extend the display_image function to accept a title for the image. After classifying an image, we will set the title to the predicted label when displaying it.

You may already have done this if you completed Challenge Question 2 in Lab 1.

**Task**: In the code cell below, write a function named *display_image* that:
* Takes an image (in a *numpy* array) and a title (default value = None)
* Displays the image using *matplotlib* (Use your code from Lab 1)
* Before calling `plt.show()`, if the title is not None, sets the title of the plot to be the given title

At the bottom of the code cell is some code that will call your display_image function. Use this to verify that your function works. You should see the title set to 'Dog'.

In [None]:
# TODO: Write your function here




# Test your function!
image = load_image_from_url('https://images.unsplash.com/photo-1543466835-00a7907e9de1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1267&q=80')
display_image(image, 'Dog')

#### Task solution

In [None]:
def display_image(image, title=None):
    fig, axes = plt.subplots(figsize=(12, 8))

    axes.imshow(image)

    if title is not None:
        plt.title(title)

    plt.show()

#### Web Image Classification

Now we have a nicer displaying function, let's make a prediction on an image from a URL and display the predicted label along with the image!

**Task**: In the code cell below, load, classify and display the image from the given URL.

In [None]:
url = 'https://images.unsplash.com/photo-1543466835-00a7907e9de1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1267&q=80'

# TODO: Load the image from the URL



# Classify the image using MobileNet V3



# Display the image with the title set to be the predicted label




#### Task solution

In [None]:
url = 'https://images.unsplash.com/photo-1543466835-00a7907e9de1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1267&q=80'

# TODO: Load the image from the URL
image = load_image_from_url(url)


# Classify the image using MobileNet V3
_, _, label = classify_image(mobilenet_v3, image)


# Display the image with the title set to be the predicted label
display_image(image, label)

#### Further Testing

Awesome work!

Feel free to play around with different URLs (or loading images from disk) to see differences in prediction!

A good resource for finding sample images to classify is [Unsplash.com](https://unsplash.com/).  

To get the URL for an image from Unsplash:
1. Search for an image
2. Click on the desired image to open it up in a larger window
3. Right click on the image and select "copy image address"
4. Paste the copied URL into the code cell and run your code!

## 1.4 Ensembling with a Single Model
Before finishing up, let's see what the effects are if we first modify our image in some way before classifying it!

To keep things simple, let's see what the effect of classifying an image of a cat is when we first apply a rotation to the cat image.

The code cell below loads an image of a cat, deliberately degrades it by clipping its pixel values, then classifies it, keeping track of the output vector, class index and class label.

In [None]:
# Load the cat image
url = 'https://images.unsplash.com/photo-1611242118897-f06ed77f6ab0?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80'
cat_image = load_image_from_url(url)

# Degrade the image by clipping pixel values
cat_image = np.clip(cat_image, 150, 205)

# Classify and display the cat image
cat_output, cat_class_index, cat_label = classify_image(mobilenet_v3, cat_image)
display_image(cat_image, cat_label)

Following that, you should see the cat predicted as a "tabby, tabby cat" or "tiger cat".  

What if we were to rotate the image before classifying it? Let's have a look at how this changes our classification!

**Task**: In the code cell below, apply rotations of ±20 degrees to the cat image and visualize their classified labels. You'll need your *`rotate_image()`* function from Lab 1.

**NOTE:** To classify multiple images at the same time we would usually combine them into a batch and do a single forward pass.
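
As a shape-only sketch of what combining images into a batch means, the example below uses numpy arrays as stand-ins for preprocessed images (the zero-filled arrays are placeholders, not real data). In PyTorch the analogous calls are `torch.stack` and `torch.cat`.

```python
import numpy as np

# Three placeholder "images", each already in CxHxW layout
images = [np.zeros((3, 224, 224), dtype=np.float32) for _ in range(3)]

# Stacking adds a new leading batch dimension: BxCxHxW
batch = np.stack(images, axis=0)
print(batch.shape)  # (3, 3, 224, 224)

# If each image already has a batch dimension of 1, concatenate instead
batched_images = [image[None] for image in images]  # each is (1, 3, 224, 224)
batch_from_cat = np.concatenate(batched_images, axis=0)
print(batch_from_cat.shape)  # (3, 3, 224, 224)
```

A model given such a batch returns one output vector per image in a single forward pass.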

In [None]:
# TODO: Copy in your rotate_image function from Lab 1



# TODO: Create a rotated cat image by rotating 20 degrees counter-clockwise
# cat_ccw_rot_image = ...


# Classify the rotated cat image
cat_ccw_rot_output, cat_ccw_rot_class_index, cat_ccw_rot_label = classify_image(mobilenet_v3, cat_ccw_rot_image)


# Display the rotated cat image along with its label
display_image(cat_ccw_rot_image, cat_ccw_rot_label)


# TODO: Create a rotated cat image by rotating 20 degrees clockwise
# cat_cw_rot_image = ...


# Classify the rotated cat image
cat_cw_rot_output, cat_cw_rot_class_index, cat_cw_rot_label = classify_image(mobilenet_v3, cat_cw_rot_image)


# Display the rotated cat image along with its label
display_image(cat_cw_rot_image, cat_cw_rot_label)

#### Task solution

In [None]:
# TODO: Copy in your rotate_image function from Lab 1
def rotate_image(image, angle):
    height, width = image.shape[:2]
    M = cv2.getRotationMatrix2D((width // 2, height // 2), angle, 1)
    return cv2.warpAffine(image, M, (width, height))


# TODO: Create a rotated cat image by rotating 20 degrees counter-clockwise
cat_ccw_rot_image = rotate_image(cat_image, 20)


# Classify the rotated cat image
cat_ccw_rot_output, cat_ccw_rot_class_index, cat_ccw_rot_label = classify_image(mobilenet_v3, cat_ccw_rot_image)

# Display the rotated cat image along with its label
display_image(cat_ccw_rot_image, cat_ccw_rot_label)


# Create a rotated cat image by rotating 20 degrees clockwise
cat_cw_rot_image = rotate_image(cat_image, -20)


# Classify the rotated cat image
cat_cw_rot_output, cat_cw_rot_class_index, cat_cw_rot_label = classify_image(mobilenet_v3, cat_cw_rot_image)


# Display the rotated cat image along with its label
display_image(cat_cw_rot_image, cat_cw_rot_label)

#### Discussion
You might see that the counter-clockwise rotation produced a different label to the clockwise rotation.    

This shows that the network can produce different results depending on transformations applied to the input image. It also means in practice, if you took a photo of your cat on an angle the network may predict a different class.

There are times where we can use this to our advantage! As long as the transformations we apply do not change the contents of the image (the class of the image is still obvious to us), then we can classify the same image (with different transformations applied) and then aggregate the classifications to get a final label. This can get us better performance at the cost of computer processing time.

There are two common techniques we can use to do this, *Voting* on the class label or *Averaging the output vector*.

### Voting
When we use voting, we look at the collection of predictions we have made on the same image (with transformations applied), and then select the class that was most frequently predicted.

When doing this we should take care to handle the case where two or more classes are predicted equally often.

An example of using Voting to aggregate the classifications in the previous example is found below.

In [None]:
# Find the most common class index from the predicted class indexes
predicted_class_indexes = [cat_class_index, cat_ccw_rot_class_index, cat_cw_rot_class_index]

# This is a shortcut to find the mode of a list
most_common_class_index = max(predicted_class_indexes, key=predicted_class_indexes.count)

# Print out the class
print(f'Most common class index: {most_common_class_index}')
print(f'This corresponds to class: {IMAGENET_CLASSES[most_common_class_index]}')
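
Note that `max(..., key=...)` returns the first item it encounters with the highest count, so a tie is silently resolved by list order. One possible way to surface ties is sketched below with the standard library's `collections.Counter`. The helper name `vote` and the class indexes are illustrative assumptions.

```python
from collections import Counter


def vote(class_indexes):
    """Return every class index that shares the highest vote count."""
    counts = Counter(class_indexes)
    top_count = max(counts.values())
    return [index for index, count in counts.items() if count == top_count]


print(vote([281, 282, 281]))  # [281] -> a clear winner
print(vote([281, 282, 285]))  # [281, 282, 285] -> three-way tie
```

When more than one index is returned, a tie-breaking rule is needed, for example falling back to the averaged output vectors.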

### Averaging Output Vectors
Remember that the output of our network is a score per class, with the goal that the highest score corresponds to the predicted class.  

Another way we can aggregate the different predictions is to find the average of the output vectors for each different image variation, then choose the class index directly from the averaged vector.

An example of Averaging the Output Vector to aggregate the classifications in the previous example is found below.

In [None]:
# Collect the different output vectors into a list
predicted_output_vectors = [cat_output, cat_ccw_rot_output, cat_cw_rot_output]

# Find the average of the output vectors, resulting in a single (1000,) dimensional vector
averaged_output_vector = np.mean(predicted_output_vectors, axis=0)
print(f'The size of the averaged vector is: {averaged_output_vector.shape}')

# Find the class index of the highest average prediction
highest_average_class_index = averaged_output_vector.argmax(axis=0)

# Print out the class
print(f'Class index of highest average: {highest_average_class_index}')
print(f'This corresponds to class: {IMAGENET_CLASSES[highest_average_class_index]}')

Following this you might see that we get different results based on the way we aggregated the predictions!  

**Question**: Give an example of a circumstance where voting and averaging output vectors would give different answers.

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

```python
pred1 = [0.45, 0.55]
pred2 = [0.45, 0.55]
pred3 = [0.90, 0.10]
    
average = [(0.45*2+0.90)/3, (0.55*2+0.10)/3]
# [0.6, 0.4]
# So voting gives class 1, but averaging gives class 0
```
</details>

# 2. Evaluation
Evaluating how well neural networks perform is an extremely important task. To be able to use a neural network in practice, you need to be confident that the predictions it makes can generalize to new unseen data.

In this section we will look at various evaluation metrics we can use to help determine how well our network is performing.

## 2.1 Multiclass Classification vs. Binary Classification
When performing classification, there are two types of classification tasks we can perform, multiclass or binary.

`Multiclass Classification` refers to classification tasks where there are more than 2 classes to choose between.  
`Binary Classification` refers to classification tasks where there are exactly 2 classes. Usually, these two classes are referred to as the *positive* and *negative* classes.

It is common to see binary classification used on tasks where there are only 2 possible outcomes. For example, predicting whether a patient has a certain disease or not.

In this lab we have been performing multiclass classification, given our model can predict 1000 different classes.

## 2.2 Data Collection
Before computing any evaluation metrics, let's first construct a small dataset that we can evaluate our MobileNet V3 network on, and get predictions for each image in it. For this we will use 4 images from each of 5 different classes (20 images in total).

The classes we will choose are: *lion*, *kite*, *magpie*, *toaster*, *eel*.

In the code cell below, you'll see a list of `URLS` and `GROUND_TRUTH_LABELS`. These contain the image URLs we will consider as part of our dataset, and their corresponding class index labels.

**Task**: Your task is to:
* Create an empty list to store the predicted class indexes (called `predicted_labels`)
* Loop through the list of urls, and:
    * Load the URL into a *numpy* array
    * Classify the image, getting back the predicted class index
    * Append the predicted class index to the list of `predicted_labels`
* **Optional:** If you are interested, you can also call the `display_image()` function to see what the image data looks like and see the ground truth/predicted labels

In [None]:
URLS = [
    'https://images.unsplash.com/photo-1546182990-dffeafbe841d?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1340&q=80',
    'https://images.unsplash.com/photo-1511216113906-8f57bb83e776?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1590668468552-d87c3a011afb?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1352&q=80',
    'https://images.unsplash.com/photo-1562512619-e5ed0e495c78?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1596554817336-19fbecb23705?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1604153741124-8e0eb40964ab?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1588356294626-5c653a904228?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1554234362-59a913f24b78?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1426&q=80',
    'https://images.unsplash.com/photo-1598271597568-1df2e4470095?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1580637065333-4fd06585fd8a?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1400&q=80',
    'https://images.unsplash.com/photo-1586496567894-7d5556870fdf?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1610759990825-5db0ab4220b0?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1624209190904-aca680ededc1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1025&q=80',
    'https://images.unsplash.com/photo-1613221699807-4940ba9b83f4?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1277&q=80',
    'https://images.unsplash.com/photo-1583729250536-d5fb10401671?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1618506408870-64d8bec48248?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1559897202-7fc939ce9db2?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1286&q=80',
    'https://images.unsplash.com/photo-1540253236931-2a77e060b434?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1317&q=80',
    'https://images.unsplash.com/photo-1516683169270-7514e272a5fc?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1352&q=80',
    'https://images.unsplash.com/photo-1538180476225-ddaa78852f16?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1297&q=80'
]
GROUND_TRUTH_LABELS = [291, 291, 291, 291, 21, 21, 21, 21, 18, 18, 18, 18, 859, 859, 859, 859, 390, 390, 390, 390]
ALL_CLASSES = [291, 21, 18, 859, 390]


# TODO: Create an empty list to store the predicted class indexes in



# TODO: Loop through the list of URLs and append the predicted class index to the list of predicted class indexes




How were the predictions? If you also visualized the data you should have a rough idea of how well the model performed.

#### Task solution

In [None]:
URLS = [
    'https://images.unsplash.com/photo-1546182990-dffeafbe841d?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1340&q=80',
    'https://images.unsplash.com/photo-1511216113906-8f57bb83e776?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1590668468552-d87c3a011afb?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1352&q=80',
    'https://images.unsplash.com/photo-1562512619-e5ed0e495c78?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1596554817336-19fbecb23705?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1604153741124-8e0eb40964ab?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1588356294626-5c653a904228?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1554234362-59a913f24b78?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1426&q=80',
    'https://images.unsplash.com/photo-1598271597568-1df2e4470095?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1580637065333-4fd06585fd8a?ixlib=rb-1.2.1&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1400&q=80',
    'https://images.unsplash.com/photo-1586496567894-7d5556870fdf?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1610759990825-5db0ab4220b0?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=934&q=80',
    'https://images.unsplash.com/photo-1624209190904-aca680ededc1?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1025&q=80',
    'https://images.unsplash.com/photo-1613221699807-4940ba9b83f4?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1277&q=80',
    'https://images.unsplash.com/photo-1583729250536-d5fb10401671?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1618506408870-64d8bec48248?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1350&q=80',
    'https://images.unsplash.com/photo-1559897202-7fc939ce9db2?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1286&q=80',
    'https://images.unsplash.com/photo-1540253236931-2a77e060b434?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1317&q=80',
    'https://images.unsplash.com/photo-1516683169270-7514e272a5fc?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1352&q=80',
    'https://images.unsplash.com/photo-1538180476225-ddaa78852f16?ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&ixlib=rb-1.2.1&auto=format&fit=crop&w=1297&q=80'
]
GROUND_TRUTH_LABELS = [291, 291, 291, 291, 21, 21, 21, 21, 18, 18, 18, 18, 859, 859, 859, 859, 390, 390, 390, 390]
ALL_CLASSES = [291, 21, 18, 859, 390]


# TODO: Create an empty list to store the predicted class indexes in
predicted_labels = []

# TODO: Loop through the list of URLs and append the predicted class index to the list of predicted class indexes
# NOTE: Here we also loop through the ground truth labels to display the predictions with their
#       ground truth and predicted label
for url, label in zip(URLS, GROUND_TRUTH_LABELS):
    # Load the image
    image = load_image_from_url(url)

    # Classify the image
    _, class_index, class_label = classify_image(mobilenet_v3, image)

    # Append the class index to the list of predicted labels
    predicted_labels.append(class_index)

    # Display the image (the title shows the ground truth and predicted labels)
    title = f'Ground Truth: {IMAGENET_CLASSES[label]}\nPrediction: {class_label}'
    display_image(image, title)

## 2.3 TP/TN/FP/FN
Before jumping into computing metrics for our dataset, let's briefly talk about TP/TN/FP/FN. When discussing TP/TN/FP/FN, we need to introduce the notion of what a 'positive' and 'negative' class means.  

In `binary classification` this is quite clear: there are only two classes, so we designate one as the 'positive' class and the other as the 'negative' class.

In `multiclass classification` it is not as clear. We have multiple classes, so which one is considered the 'positive' class?  
When considering multiclass classification, we look at TP/TN/FP/FN per-class. That is, we take turns treating one class as the 'positive' class and all of the remaining classes as the 'negative' class, until every class has had a turn as the 'positive' class.  

As an example: if we have 3 classes: {A, B, C}, then considering class A we would say class A is the 'positive' class and classes B and C are the 'negative' classes, then when considering class B, class B would be considered the 'positive' class and classes A and C the 'negative' classes (and so-on for class C). Another way to think about it is that the 'class of interest' is the 'positive' class.  

Descriptions of TP/TN/FP/FN can be found below. Keep in mind for multiclass classification when considering a 'positive' class, we are considering a single class at a time (That is, we find the number of TP/TN/FP/FN for each class in multiclass classification).

**True Positive (TP)**  
Refers to the number of predictions where the classifier correctly predicts the positive class as positive.

**True Negative (TN)**  
Refers to the number of predictions where the classifier correctly predicts the negative class as negative.  
In the case of multiclass classification, if we have 3 classes: {A, B, C} and class A is considered the 'positive' class, then an example from class B classified as class C is still considered a True Negative (because the classifier correctly predicted the negative class as negative).  

**False Positive (FP)**  
Refers to the number of predictions where the classifier incorrectly predicts the negative class as positive.

**False Negative (FN)**  
Refers to the number of predictions where the classifier incorrectly predicts the positive class as negative.
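
The four definitions above can be turned into a short counting routine. The following is a minimal sketch in plain Python and is not part of *scikit-learn*; the function name `one_vs_rest_counts` and the example labels are hypothetical and chosen purely for illustration. It treats one class as 'positive' and every other class as 'negative'.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP/TN/FP/FN when `positive` is treated as the positive class."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        truth_is_positive = truth == positive
        pred_is_positive = pred == positive
        if truth_is_positive and pred_is_positive:
            tp += 1   # positive example predicted as positive
        elif truth_is_positive:
            fn += 1   # positive example predicted as some negative class
        elif pred_is_positive:
            fp += 1   # negative example predicted as positive
        else:
            tn += 1   # negative example predicted as any negative class
    return {'TP': tp, 'TN': tn, 'FP': fp, 'FN': fn}


# Hypothetical labels for a 3-class problem {A, B, C} (illustration only)
example_truth = ['A', 'A', 'B', 'B', 'C', 'C']
example_preds = ['A', 'B', 'B', 'C', 'C', 'A']

for cls in ['A', 'B', 'C']:
    print(cls, one_vs_rest_counts(example_truth, example_preds, positive=cls))
```

With class A as the 'positive' class, the example whose ground truth is B and whose prediction is C lands in the True Negative count, because both labels belong to the 'negative' group.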

We can quickly visualize these counts in a confusion matrix, which *scikit-learn* can generate for us.

**Artificial Dataset** In the code cell below we create an artificial dataset from a 3 class problem. This dataset is simply a list of ground truth 'labels' and predicted 'labels'. We will be using this artificial dataset throughout this section to get experience manually computing some metrics.  

In the below code cell we create a confusion matrix with the artificial dataset. Once you've run the code cell, make sure to answer the comprehension questions before moving on. Don't be concerned with the plotting code.

In [None]:
# Define artificial ground truth and predicted labels
#    (there is an uneven amount of ground truth examples per-class)
ground_truth = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
predictions =  [1, 2, 0, 2, 1, 1, 0, 0, 0, 1, 2, 2]

# Compute the confusion matrix
cm = confusion_matrix(y_true=ground_truth, y_pred=predictions)

# Display the confusion matrix
cm_disp = ConfusionMatrixDisplay(confusion_matrix=cm)
cm_disp = cm_disp.plot(include_values=True, cmap='plasma_r', ax=None, xticks_rotation='horizontal')
plt.show()

### Comprehension Questions

**Question 1**

Considering class 0 as the 'positive' class, what are the values of TP/TN/FP/FN?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

$TP=1$
    
    
$TN=3$    
    
    
$FP=3$    
    
    
$FN=5$    
    
    
</details>

**Question 2**

Considering class 1 as the 'positive' class, what are the values of TP/TN/FP/FN?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

$TP=1$
    
    
$TN=5$    
    
    
$FP=3$    
    
    
$FN=3$    
    
    
</details>

**Question 3**

Considering class 2 as the 'positive' class, what are the values of TP/TN/FP/FN?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

$TP=2$
    
    
$TN=8$    
    
    
$FP=2$    
    
    
$FN=0$    
    
    
</details>
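
If you would like to check your answers programmatically, the per-class counts can be read off the confusion matrix. The sketch below is one possible way to do this in plain Python; it assumes the same convention as `confusion_matrix` (rows are ground truth labels, columns are predicted labels), and the variable names `cm_truth`, `cm_preds`, `matrix` and `per_class_counts` are hypothetical.

```python
# Artificial labels from the cell above, repeated so this sketch runs on its own
cm_truth = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
cm_preds = [1, 2, 0, 2, 1, 1, 0, 0, 0, 1, 2, 2]
num_classes = 3

# Build the confusion matrix by hand: rows = ground truth, columns = prediction
matrix = [[0] * num_classes for _ in range(num_classes)]
for truth, pred in zip(cm_truth, cm_preds):
    matrix[truth][pred] += 1

total = len(cm_truth)
per_class_counts = {}
for k in range(num_classes):
    tp = matrix[k][k]                           # diagonal entry
    fn = sum(matrix[k]) - tp                    # rest of row k
    fp = sum(row[k] for row in matrix) - tp     # rest of column k
    tn = total - tp - fn - fp                   # everything else
    per_class_counts[k] = (tp, tn, fp, fn)
    print(f'Class {k}: TP={tp} TN={tn} FP={fp} FN={fn}')
```

Each class uses its own row and column of the matrix, which is why the counts change depending on which class is considered 'positive'.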

## 2.4 Metrics
Now that we have a real and an artificial dataset to work with, and we understand the TP/TN/FP/FN notation, we can start looking at specific evaluation metrics and calculating them!

In each of the following sections, we will show code to compute the metric on our image dataset, and then to test your understanding you will be asked to manually compute the metric on the artificial dataset presented in the TP/TN/FP/FN section above.

### Accuracy
One of the simplest evaluation metrics we can compute is the accuracy. Accuracy tells us overall how often the model is making a correct prediction.  

For binary classification, the formula for computing accuracy is shown below:  
<center>$accuracy = \frac{TP + TN}{TP + TN + FP + FN}$</center>

For multiclass classification, accuracy is simply the number of correct predictions divided by the total number of predictions.

To compute the accuracy score, we will use the [*scikit-learn* *`accuracy_score()`* function](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html), which tells us the accuracy across our whole dataset.

In the below code cell we compute the accuracy score on our image dataset.

In [None]:
dataset_accuracy = accuracy_score(y_true=GROUND_TRUTH_LABELS, y_pred=predicted_labels)
print(f'Accuracy: {dataset_accuracy * 100:.2f}%')

#### Comprehension Questions

**Question 1**  
What is the accuracy of the artificial dataset described above?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

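In multiclass classification, accuracy is the number of correct predictions (the sum of the diagonal of the confusion matrix) divided by the total number of predictions:  

<center>$accuracy=\frac{1 + 1 + 2}{12} = \frac{4}{12} \approx 0.33 \space (33\%)$</center>

Note that applying the binary formula $\frac{TP + TN}{TP + TN + FP + FN}$ to a single 'positive' class gives a per-class value that depends on the class chosen: $\frac{4}{12}$ for class 0, $\frac{6}{12}$ for class 1 and $\frac{10}{12}$ for class 2. Only class 0 happens to match the overall accuracy.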
    
</details>
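
As a quick check of the hand calculation, accuracy can also be computed by counting matching labels directly. This is a minimal plain-Python sketch rather than the *scikit-learn* implementation; the names `acc_truth`, `acc_preds`, `num_correct` and `manual_accuracy` are hypothetical.

```python
# Artificial dataset from the TP/TN/FP/FN section, repeated so this sketch runs on its own
acc_truth = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
acc_preds = [1, 2, 0, 2, 1, 1, 0, 0, 0, 1, 2, 2]

# A prediction is correct when it equals the ground truth label
num_correct = sum(truth == pred for truth, pred in zip(acc_truth, acc_preds))
manual_accuracy = num_correct / len(acc_truth)

print(f'Correct predictions: {num_correct} of {len(acc_truth)}')
print(f'Accuracy: {manual_accuracy * 100:.2f}%')
```

The count of correct predictions is the same as the sum of the diagonal entries of the confusion matrix.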

### Recall (Sensitivity)
Recall is a metric that quantifies the number of correct positive predictions made out of all positive examples. Recall tells you: when the model is presented with a positive example, what is the chance that the model will predict 'positive'? Recall is also referred to as sensitivity.

The formula for computing recall is shown below:  
<center>$recall = \frac{TP}{TP + FN}$</center>

Given we are dealing with multiclass classification, we will generate a recall score *per-class*.

To compute the recall score, we will use the [*scikit-learn* *`recall_score()`* function](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html).

In [None]:
# We pass in the list of imagenet class label indexes to recall_score so it is aware of all classes in our dataset
# This will return back a (1000,) dimensional numpy array, with each value corresponding to an ImageNet class
# The values in this array correspond to the recall score for that class
recall_scores = recall_score(y_true=GROUND_TRUTH_LABELS, y_pred=predicted_labels,
                             labels=list(IMAGENET_CLASSES.keys()), average=None, zero_division=0)

# We are only interested in seeing the recall score for classes that we have ground truth data for
# Here we print out the recall score for each class defined in ALL_CLASSES
for class_idx in ALL_CLASSES:
    print(f'Class: {IMAGENET_CLASSES[class_idx]}. Recall score: {recall_scores[class_idx]}')

<details>
<summary style='cursor:pointer;'><u>Expand for Discussion</u></summary>

A few conclusions we can make:

Our classifier was unable to correctly predict any of the kite examples. This should be concerning to us if predicting kites was important.    
Our classifier was able to correctly predict every eel example.

</details>

#### Comprehension Questions

**Question 1**

In the artificial dataset described above, considering class 0 as the 'positive' class, what is the recall score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$recall = \frac{1}{1 + 5} = \frac{1}{6} \approx 0.17$</center>
    
</details>

**Question 2**

In the artificial dataset described above, considering class 1 as the 'positive' class, what is the recall score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$recall = \frac{1}{1 + 3} = \frac{1}{4} = 0.25$</center>
    
</details>

**Question 3**

In the artificial dataset described above, considering class 2 as the 'positive' class, what is the recall score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$recall = \frac{2}{2 + 0} = \frac{2}{2} = 1$</center>    
    
</details>
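
The recall answers above can be reproduced with a few lines of plain Python. The sketch below is illustrative only (it is not how *scikit-learn* is implemented), and the helper name `recall_per_class` as well as the variables `recall_truth` and `recall_preds` are hypothetical. It relies on the fact that the number of ground truth examples of a class equals $TP + FN$ for that class.

```python
def recall_per_class(y_true, y_pred):
    """Return {class: recall} for every class that appears in y_true."""
    scores = {}
    for cls in sorted(set(y_true)):
        # Predictions made for the examples whose ground truth is `cls`
        preds_for_cls = [pred for truth, pred in zip(y_true, y_pred) if truth == cls]
        true_positives = sum(pred == cls for pred in preds_for_cls)
        # len(preds_for_cls) is TP + FN for this class
        scores[cls] = true_positives / len(preds_for_cls)
    return scores


# Artificial dataset from the TP/TN/FP/FN section
recall_truth = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
recall_preds = [1, 2, 0, 2, 1, 1, 0, 0, 0, 1, 2, 2]

for cls, score in recall_per_class(recall_truth, recall_preds).items():
    print(f'Class: {cls}. Recall score: {score:.2f}')
```

Because the helper only iterates over classes present in `y_true`, the denominator is never zero in this sketch.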

### Precision
Precision is a metric that quantifies the number of correct positive predictions made out of all predictions made to the positive class. Precision tells you: when a model has predicted 'positive', what is the chance that you actually have a positive example?

The formula for computing precision is shown below:  
<center>$precision = \frac{TP}{TP + FP}$</center>

Given we are dealing with multiclass classification, we will generate a precision score *per-class*.

To compute the precision score, we will use the [*scikit-learn* *`precision_score()`* function](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html).

In [None]:
# We pass in the list of imagenet class label indexes to precision_score so it is aware of all classes in our dataset
# This will return back a (1000,) dimensional numpy array, with each value corresponding to an ImageNet class
# The values in this array correspond to the precision score for that class
precision_scores = precision_score(y_true=GROUND_TRUTH_LABELS, y_pred=predicted_labels,
                                   labels=list(IMAGENET_CLASSES.keys()), average=None, zero_division=0)

# We are only interested in seeing the precision score for classes that we have ground truth data for
# Here we print out the precision score for each class defined in ALL_CLASSES
for class_idx in ALL_CLASSES:
    print(f'Class: {IMAGENET_CLASSES[class_idx]}. Precision score: {precision_scores[class_idx]}')

<details>
<summary style='cursor:pointer;'><u>Expand for Discussion</u></summary>

The precision scores for this dataset are quite boring, but that is expected.  
    
For a class to not have a perfect precision score, we would need one of the examples in our dataset to be predicted as one of the other classes we are interested in (This is very unlikely, given that there are 1000 possible classes our model could predict).
    
The reason we see the *kite* class with a precision score of 0 is because none of the examples were predicted as the *kite* class.
</details>
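
When a class is never predicted, $TP + FP = 0$ and the precision formula divides by zero. The sketch below illustrates, in plain Python, the kind of fallback that the `zero_division=0` argument asks for. It is an assumption-laden illustration, not the *scikit-learn* source: the helper `precision_for_class` and the labels in `toy_truth` and `toy_preds` are hypothetical.

```python
def precision_for_class(y_true, y_pred, cls, zero_division=0.0):
    """Precision for one class, with a fallback when the class is never predicted."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    if tp + fp == 0:
        # No example was predicted as `cls`, so precision is undefined
        return zero_division
    return tp / (tp + fp)


# Hypothetical labels (illustration only): no example is predicted as 'kite'
toy_truth = ['eel', 'eel', 'kite', 'kite']
toy_preds = ['eel', 'eel', 'parachute', 'balloon']

for cls in ['eel', 'kite']:
    print(f'Class: {cls}. Precision score: {precision_for_class(toy_truth, toy_preds, cls)}')
```

Here the 'kite' class falls back to the `zero_division` value because it never appears among the predictions, while 'eel' gets a precision computed from real counts.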

#### Comprehension Questions

**Question 1**

In the artificial dataset described above, considering class 0 as the 'positive' class, what is the precision score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$precision = \frac{1}{1 + 3} = \frac{1}{4} = 0.25$</center>
    
</details>

**Question 2**

In the artificial dataset described above, considering class 1 as the 'positive' class, what is the precision score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$precision = \frac{1}{1 + 3} = \frac{1}{4} = 0.25$</center>
    
</details>

**Question 3**

In the artificial dataset described above, considering class 2 as the 'positive' class, what is the precision score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$precision = \frac{2}{2 + 2} = \frac{2}{4} = 0.5$</center>    
    
</details>

### F1 Score
The F1 score is the harmonic mean of the precision and recall, with precision and recall contributing equally to the score.

The formula for computing the F1 score is shown below:  
<center>$F1 \space score = \frac{2 \cdot (precision \cdot recall)}{precision + recall}$</center>

Given we are dealing with multiclass classification, we will generate an F1 score *per-class*.

To compute the F1 score, we will use the [*scikit-learn* *`f1_score()`* function](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html).

In [None]:
# We pass in the list of imagenet class label indexes to f1_score so it is aware of all classes in our dataset
# This will return back a (1000,) dimensional numpy array, with each value corresponding to an ImageNet class
# The values in this array correspond to the F1 score for that class
f1_scores = f1_score(y_true=GROUND_TRUTH_LABELS, y_pred=predicted_labels,
                     labels=list(IMAGENET_CLASSES.keys()), average=None, zero_division=0)

# We are only interested in seeing the F1 score for classes that we have ground truth data for
# Here we print out the F1 score for each class defined in ALL_CLASSES
for class_idx in ALL_CLASSES:
    print(f'Class: {IMAGENET_CLASSES[class_idx]}. F1 score: {f1_scores[class_idx]}')

<details>
<summary style='cursor:pointer;'><u>Expand for Discussion</u></summary>

As we should expect, the F1 score for the eel class is perfect. This was because we saw perfect recall and precision for that class. This shows we can be happy that our classifier is doing a great job at predicting eels and not misclassifying other classes as eels. (With that said, we only have a *very small* amount of data we used for evaluation. To properly validate this claim we would need to evaluate on a larger dataset).  

</details>

#### Comprehension Questions

**Question 1**

In the artificial dataset described above, considering class 0 as the 'positive' class, what is the F1 score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$F1 \space score = \frac{2 \cdot (0.25 \cdot 0.17)}{0.25 + 0.17} = \frac{0.085}{0.42} \approx 0.20$</center>
    
</details>

**Question 2**

In the artificial dataset described above, considering class 1 as the 'positive' class, what is the F1 score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$F1 \space score = \frac{2 \cdot (0.25 \cdot 0.25)}{0.25 + 0.25} = \frac{0.125}{0.5} = 0.25$</center>
    
</details>

**Question 3**

In the artificial dataset described above, considering class 2 as the 'positive' class, what is the F1 score for this class?

<details>
<summary style='cursor:pointer;'><u>Answer</u></summary>

<center>$F1 \space score = \frac{2 \cdot (0.5 \cdot 1)}{0.5 + 1} = \frac{1}{1.5} \approx 0.67$</center>    
    
</details>
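
The hand calculations above use rounded precision and recall values. As a cross-check, the sketch below recomputes the F1 scores with exact fractions from the per-class TP/FP/FN counts of the artificial dataset. It is a plain-Python illustration; the helper name `f1_from_counts` and the dictionary `counts` are hypothetical, and the helper assumes each class has at least one true positive so that no denominator is zero.

```python
from fractions import Fraction


def f1_from_counts(tp, fp, fn):
    """Exact F1 score from counts (assumes tp > 0 so no denominator is zero)."""
    precision = Fraction(tp, tp + fp)
    recall = Fraction(tp, tp + fn)
    return 2 * precision * recall / (precision + recall)


# (TP, FP, FN) per class for the artificial dataset, taken from the answers above
counts = {0: (1, 3, 5), 1: (1, 3, 3), 2: (2, 2, 0)}

for cls, (tp, fp, fn) in counts.items():
    f1 = f1_from_counts(tp, fp, fn)
    print(f'Class: {cls}. F1 score: {f1} = {float(f1):.2f}')
```

Working with `Fraction` avoids rounding in the intermediate steps, so the result for class 0 comes out as exactly $\frac{1}{5}$.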

### Metrics on Artificial Dataset
You've now seen how we can compute these various metrics using `scikit-learn`. Your task now is to perform the same computations but with the artificial dataset we introduced above.  

Given you have answered the comprehension questions, you should be able to validate that your answer matches what `scikit-learn` gives.

**Task**: In the code cell below, using the artificial dataset described above:
* Compute (and print) the accuracy for the whole artificial dataset
* Compute (and print) the recall score, precision score and F1 score per-class

**NOTE:** Given our `ground_truth` and `predictions` contain examples from each class, there is no need to pass the `labels` or `zero_division` arguments to any of the *`recall_score()`*, *`precision_score()`* or *`f1_score()`* functions.

In [None]:
# TODO: Compute (and print) the accuracy



# TODO: Compute (and print) the recall score per-class



# TODO: Compute (and print) the precision score per-class



# TODO: Compute (and print) the F1 score per-class




#### Task solution

In [None]:
# TODO: Compute (and print) the accuracy
accuracy = accuracy_score(y_true=ground_truth, y_pred=predictions)
print(f'Accuracy: {accuracy * 100:.2f}%')
print('-' * 50)

# TODO: Compute (and print) the recall score per-class
recall_scores = recall_score(y_true=ground_truth, y_pred=predictions, average=None)
for class_idx, recall in enumerate(recall_scores):
    print(f'Class: {class_idx}. Recall score: {recall}')
print('-' * 50)

# TODO: Compute (and print) the precision score per-class
precision_scores = precision_score(y_true=ground_truth, y_pred=predictions, average=None)
for class_idx, precision in enumerate(precision_scores):
    print(f'Class: {class_idx}. Precision score: {precision}')
print('-' * 50)

# TODO: Compute (and print) the F1 score per-class
f1_scores = f1_score(y_true=ground_truth, y_pred=predictions, average=None)
for class_idx, f1 in enumerate(f1_scores):
    print(f'Class: {class_idx}. F1 score: {f1}')
print('-' * 50)

# 3. Challenge Tasks
These tasks are meant to help pull together everything you have covered in this lab, or to extend exercises covered previously. It is highly recommended that you give these tasks a go, but only attempt them once you've finished the Lab Exercises section.

## Challenge 1 - Comparing Architectures
In the evaluation part of this lab we only computed evaluation metrics using the MobileNet V3 network.  

Your task is to create a pretrained [`ResNeXt` network](https://pytorch.org/vision/stable/models.html#id28), evaluate it on the same image data used in the evaluation section, then compute all of the metrics that you calculated for MobileNet V3 in that section.  

Once you have the metrics for ResNeXt, compare them against what you computed for MobileNet V3. Did one perform better than the other?

In [None]:
# TODO: Write your solution here



## Challenge 2 - Trying Other Architectures

This is an extension of Challenge 1. After you have tried out ResNeXt, take a look at the [torchvision.models documentation](https://pytorch.org/vision/stable/models.html) and choose another network that you might want to try out.   

Play around with the different networks, try different input images (from disk or from a URL), and evaluate the networks using the metrics we covered above. Can you identify which network gives you the best results?

In [None]:
# TODO: Write your solution here



# Summary
In this lab, we learned how to use a CNN to classify image data using *PyTorch*. We also saw some common evaluation metrics and learned how we can apply them to multiclass classification tasks.