**NOTE: This notebook is written for the Google Colab platform. However, it can also be run (possibly with minor modifications) as a standard Jupyter notebook.**



In [None]:
#@title -- Installation of Packages -- { display-mode: "form" }
import sys
!{sys.executable} -m pip install git+https://github.com/michalgregor/class_utils.git

In [None]:
#@title -- Import of Necessary Packages -- { display-mode: "form" }
import numpy as np
from PIL import Image
from torchvision import models
from torchvision import transforms
import torch

In [None]:
#@title -- Downloading Data -- { display-mode: "form" }
from class_utils.download import download_file_maybe_extract
download_file_maybe_extract("https://www.dropbox.com/s/a5ux951zo01gd5z/cat.jpg?dl=1", directory="data")
download_file_maybe_extract("https://www.dropbox.com/s/ma25i7w3jpqex2a/imagenet_classes?dl=1", directory="data")

# also create a directory for storing any outputs
import os
os.makedirs("output", exist_ok=True)

In [None]:
#@title -- Auxiliary Functions -- { display-mode: "form" }
with open("data/imagenet_classes", "r") as file:
    # one class name per line; splitlines() also handles a missing final newline
    classes = file.read().splitlines()

## Using a Pre-trained Classifier

This notebook will show a very simple example of loading and using a classifier pre-trained on ImageNet. We will show how it can be used to classify new images.

### Loading the Model

In the previous examples we defined a class for our own neural network and specified its architecture. Now we will instead use one of the predefined models with pre-trained weights, which are available in the `torchvision.models` package. We will use the `resnet50` architecture and, when instantiating it, specify which version of the pre-trained weights we want: in our case the first version of the weights trained on the ImageNet dataset (`IMAGENET1K_V1`). The first time this code runs, the weights need to be downloaded over the internet.



In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet50(weights='ResNet50_Weights.IMAGENET1K_V1').to(device)

`torchvision.models` also provides the correct way to preprocess the input data for our architecture: most of these vision models expect a specific normalization, a standard image size, etc.



In [None]:
image_transforms = models.ResNet50_Weights.IMAGENET1K_V1.transforms()

### Using the Model

We load and preprocess the image that we want to classify.



In [None]:
img = Image.open("data/cat.jpg").convert("RGB")
display(img)

Before plugging the image into the network we are going to apply preprocessing using `image_transforms`. We will also call `.unsqueeze(0)` on the result. This is because while we are working with a single image, the network expects a batch of images. By calling `.unsqueeze(0)` we add the batch dimension (with size 1).
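
The effect of `.unsqueeze(0)` on the shape can be seen on a dummy tensor. This is a small illustration only; the zero tensor stands in for a preprocessed image.

```python
import torch

single = torch.zeros(3, 224, 224)  # one image: (channels, height, width)
batch = single.unsqueeze(0)        # add a leading batch dimension of size 1

print(single.shape)  # torch.Size([3, 224, 224])
print(batch.shape)   # torch.Size([1, 3, 224, 224])
```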



In [None]:
img_prep = image_transforms(img).unsqueeze(0)
img_prep = img_prep.to(device)

model.eval()
with torch.no_grad():
    y_logit = model(img_prep)

The network returns the logits for all the ImageNet classes. As you'll recall, to get the index of the most probable class, we can use `argmax`. The class labels are stored in the `classes` list (read from a file in the auxiliary code section), so we only need to index it with the class predicted by our network.
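
The same idea on a toy example, where the three labels and the logits are made up for illustration:

```python
import torch

toy_classes = ["cat", "dog", "bird"]           # hypothetical labels
toy_logits = torch.tensor([[0.5, 2.0, -1.0]])  # shape (1, 3): one image, three classes

pred = toy_logits.argmax(dim=1)    # index of the largest logit in each row
label = toy_classes[pred.item()]   # .item() turns the 1-element tensor into an int
print(label)  # dog
```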



In [None]:
y = y_logit.argmax(dim=1)
classes[y]

Given that our network can actually predict class probabilities (we can get them by passing the logits through a softmax layer), we might want to know about those and not just about the label of the most probable class. In fact, let's display the top-5 predictions and their probabilities. This will give us a better idea of how confident the neural network is about its prediction and whether the other, less probable predictions make any sense or not.
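
As a quick sanity check of what softmax does, the sketch below computes it by hand on made-up logits, compares the result with the built-in method, and picks the two most probable entries with `topk`.

```python
import torch

toy_logits = torch.tensor([[2.0, 1.0, 0.1, -1.0]])  # made-up logits, shape (1, 4)

# Softmax by hand; subtracting the row maximum avoids overflow in exp.
shifted = toy_logits - toy_logits.max(dim=1, keepdim=True).values
manual = shifted.exp() / shifted.exp().sum(dim=1, keepdim=True)

builtin = toy_logits.softmax(dim=1)
top_p, top_i = builtin.topk(2, dim=1)  # two largest probabilities and their indices

print(torch.allclose(manual, builtin))  # True
print(top_i.tolist())                   # [[0, 1]]
```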



In [None]:
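def decode_proba(proba, top=5):
    """Print the `top` most probable classes as 'probability: label (index)'."""
    proba = proba.ravel().detach().cpu()
    # class indices sorted from the most to the least probable
    ind = proba.argsort(descending=True)

    for c in ind[:top].tolist():
        print("{:.5f}:\t{} ({})".format(proba[c].item(), classes[c], c))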

In [None]:
y_proba = y_logit.softmax(dim=1)
decode_proba(y_proba)

---
### Task 1: Predictions about Other Images

**Try to apply the same procedure to a different image.** 

Note: New images can be uploaded **directly through the notebook interface** or by using:

```
from google.colab import files
content_img = files.upload()
filename = list(content_img)[0]
```
---
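
`files.upload()` returns a dictionary that maps each uploaded file name to its raw bytes. The sketch below shows one way to turn such bytes into an RGB `PIL` image; `image_from_bytes` is a hypothetical helper, and the in-memory PNG stands in for a real upload so that the code also runs outside Colab.

```python
import io
from PIL import Image

def image_from_bytes(data):
    """Decode raw image bytes into an RGB PIL image (hypothetical helper)."""
    return Image.open(io.BytesIO(data)).convert("RGB")

# Stand-in for an upload: encode a small solid-colour image as PNG bytes.
buffer = io.BytesIO()
Image.new("RGB", (8, 6), color=(255, 0, 0)).save(buffer, format="PNG")
fake_upload = {"example.png": buffer.getvalue()}  # mimics the dict from files.upload()

filename = list(fake_upload)[0]
uploaded_img = image_from_bytes(fake_upload[filename])
print(uploaded_img.size)  # (8, 6)
```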


In [None]:



# -----





The list of classes that the network is supposed to be able to classify can be found in the file `data/imagenet_classes`.



In [None]:
for ic, c in enumerate(classes):
    print("{}:\t{}".format(ic, c))