# Upload Images

Before we can apply pretrained models to classify images, we have to upload the images to the Google Colab environment.

The `upload_files` function below can be used to upload files.

In [1]:
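# Import under an alias so that a variable named `files` cannot shadow the module
from google.colab import files as colab_files

def upload_files():
    """Open Colab's upload dialog and return the uploaded file names."""
    uploaded = colab_files.upload()
    for name, data in uploaded.items():
        # Write each file to the working directory and close it afterwards
        with open(name, 'wb') as f:
            f.write(data)
    return list(uploaded.keys())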

Run the following code block to upload your image files. This function returns a list of image paths which will be stored in `files`.

In [2]:
files = upload_files()

Saving BREED Hero_0059_golden_retriever.jpeg to BREED Hero_0059_golden_retriever.jpeg
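
An upload may include files that are not images. As a minimal sketch, the returned paths could be filtered by extension before they are read. The helper name `keep_image_paths` and the extension set are assumptions made for illustration and are not part of the notebook.

```python
from pathlib import Path

# Assumed set of extensions; JPEG and PNG are formats that
# torchvision.io.read_image can decode.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def keep_image_paths(paths):
    """Return only the paths whose extension looks like a supported image."""
    return [p for p in paths if Path(p).suffix.lower() in IMAGE_EXTENSIONS]

# Hypothetical upload result, used only for illustration
example_paths = ["BREED Hero_0059_golden_retriever.jpeg", "notes.txt", "cat.PNG"]
print(keep_image_paths(example_paths))
```

The comparison is case-insensitive, so `cat.PNG` is kept while `notes.txt` is dropped.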


# Using Pre-trained Models

Initialize the model with the best available weights and switch it to evaluation mode.

In [3]:
from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=weights)
model.eval()

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

Before using a pre-trained model, the input image must be preprocessed: resized to the right resolution with the right interpolation, passed through the inference transforms, rescaled, and so on. There is no standard way to do this because it depends on how a given model was trained; it can vary across model families, variants, or even weight versions. Using the correct preprocessing is critical, and failing to do so may lead to decreased accuracy or incorrect outputs.
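
To make "rescale the values" concrete, here is a rough pure-Python sketch of the per-channel arithmetic that is typically applied after resizing and cropping: scale 8-bit values to the range 0 to 1, then subtract a mean and divide by a standard deviation. The mean and standard deviation below are the commonly used ImageNet statistics and are an assumption here; the values for a particular set of weights should be taken from its documentation.

```python
# Assumed ImageNet channel statistics (RGB order); check the weights
# documentation for the values a particular model was trained with.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Map one 8-bit RGB pixel to the normalized float values a model sees."""
    scaled = [c / 255.0 for c in rgb]  # rescale 0..255 to 0..1
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]

# A pure white pixel and a pure black pixel
print(normalize_pixel((255, 255, 255)))
print(normalize_pixel((0, 0, 0)))
```

If a model was trained with different statistics, feeding it values normalized this way shifts every input channel, which is one way the wrong preprocessing degrades accuracy.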

All the necessary information for the inference transforms of each pre-trained model is provided in its weights documentation. To simplify inference, TorchVision bundles the necessary preprocessing transforms into each set of model weights. These are accessible by calling `weights.transforms()`:

In [4]:
import torch
from torchvision.io import read_image

# Load images from paths into tensors
imgs = []
for fp in files:
    img = read_image(fp)
    imgs.append(img)

# Initialize the inference transforms
preprocess = weights.transforms()

# Apply inference preprocessing transforms
if len(imgs) == 1:
    # Add a batch dimension after preprocessing
    batch = preprocess(imgs[0]).unsqueeze(0)
else:
    # Stack the preprocessed images into a single batch
    batch = torch.stack([preprocess(img) for img in imgs])

print(batch.shape)

torch.Size([1, 3, 224, 224])


Use the model and print the predicted category.

In [5]:
# Predict the probability of each class
with torch.no_grad():  # gradients are not needed for inference
    logits = model(batch)
probas = logits.softmax(1)
preds = probas.argmax(1)

for i in range(len(preds)):
    class_id = preds[i].item()
    score = probas[i][class_id].item()
    category_name = weights.meta["categories"][class_id]
    print(f"{category_name}: {100 * score:.1f}%")

golden retriever: 53.3%
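
The `softmax(1)` and `argmax(1)` calls above turn raw scores into probabilities and pick the most likely class. The same arithmetic in plain Python might look like the sketch below, which also lists the top few classes instead of only the best one. The logits and class names are made up for illustration, and `top_k` is a hypothetical helper.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probas, names, k=3):
    """Return the k (name, probability) pairs with the highest probability."""
    ranked = sorted(zip(names, probas), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Made-up logits and class names, for illustration only
logits = [2.0, 0.5, 4.0, 1.0]
names = ["tabby", "beagle", "golden retriever", "Labrador retriever"]

for name, p in top_k(softmax(logits), names):
    print(f"{name}: {100 * p:.1f}%")
```

Looking at several top classes can be more informative than a single label when the best score is moderate, as with the 53.3% prediction above.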


# Exercises: Change the model

Replace the pre-trained model with a different one. Have a look [here](https://pytorch.org/vision/stable/models.html#table-of-all-available-classification-weights) and pick a model of your choice, then write the code below to apply it.

In [6]:
from torchvision.models import inception_v3, Inception_V3_Weights

weights = Inception_V3_Weights.DEFAULT
model = inception_v3(weights=weights)
model.eval()

Inception3(
  (Conv2d_1a_3x3): BasicConv2d(
    (conv): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), bias=False)
    (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (Conv2d_2a_3x3): BasicConv2d(
    (conv): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(32, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (Conv2d_2b_3x3): BasicConv2d(
    (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (maxpool1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  (Conv2d_3b_1x1): BasicConv2d(
    (conv): Conv2d(64, 80, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(80, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (Conv2d_4a_3x3): BasicConv2d(
    (conv): Conv2d(80, 192, kernel_size=(3, 3), stri

In [7]:
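# Redo the preprocessing with the new weights' transforms, because the
# earlier batch was prepared for ResNet-50
preprocess = weights.transforms()
batch = torch.stack([preprocess(img) for img in imgs])

# Predict the probability of each class
with torch.no_grad():  # gradients are not needed for inference
    logits = model(batch)
probas = logits.softmax(1)
preds = probas.argmax(1)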

for i in range(len(preds)):
    class_id = preds[i].item()
    score = probas[i][class_id].item()
    category_name = weights.meta["categories"][class_id]
    print(f"{category_name}: {100 * score:.1f}%")

golden retriever: 100.0%
