# Example 2: PyTorch

#### Note: I recommend using Google Colab with a GPU runtime if you plan to train your own models. Training on a CPU is noticeably slower.

### Install dependencies

Install the dependencies with `pip install -r requirements.txt`.

Please check the docs here: https://docs.fast.ai/tutorial.vision.html

The docs give a full explanation of every step needed to train a deep learning (DL) model with FastAI. I'll try to keep it high level here.

## 1. Training a DL model (Optional)

### Dataset



1. Dataset used to train the model: https://www.kaggle.com/dansbecker/hot-dog-not-hot-dog/data. For context, see https://www.youtube.com/watch?v=6ViobQys1iQ

### Models

1. Trained ResNet50 models are available here (if you do not wish to train your own model):
        
        https://storage.googleapis.com/dsu-models-20020301/example-2-pytorch/hot_dog_resnet50_256_256.onnx
        https://storage.googleapis.com/dsu-models-20020301/example-2-pytorch/hot_dog_resnet50_256_256.pkl

2. Training your own models: if you'd like to train a model on your own dataset, please keep the folder setup. Each folder should be named after the intended label, and each image file name in it should start with that label. This is required by the FastAI DataBlock setup used in this notebook. Check https://docs.fast.ai/ for more advanced data loading examples. See the folder structure below:


In [None]:
dataset
├───hot_dog
│   ├───hot_dog_1.jpg
│   ├───hot_dog_2.jpg
│   └───hot_dog_xyz.jpg
└───not_hot_dog
    ├───not_hot_dog_1.jpg
    ├───not_hot_dog_2.jpg
    └───not_hot_dog_xyz.jpg
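
Before training, it can help to confirm that the dataset folder really follows this layout. The snippet below is a minimal sketch using only the standard library. The helper name `count_images_per_label` and the set of accepted file extensions are assumptions made for illustration; they are not part of FastAI.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match your own dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_label(root):
    """Hypothetical helper: return {label_folder: number_of_image_files}."""
    counts = {}
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            # Count only regular files whose extension looks like an image.
            counts[folder.name] = sum(
                1
                for f in folder.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts
```

Calling `count_images_per_label('dataset/')` should return one entry per label folder. A missing label or a count of zero usually points to a misplaced or misnamed folder.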

In [3]:
from fastai.vision.all import *

In [None]:
PATH = 'dataset/'

In [None]:
dogs = DataBlock(blocks=(ImageBlock, CategoryBlock), 
                 get_items=get_image_files, 
                 splitter=RandomSplitter(),
                 get_y=using_attr(RegexLabeller(r'(.+)_\d+\.jpg$'), 'name'),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=256)
                 )
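
The `get_y` argument derives each label from the file name with a regular expression. The sketch below reproduces that idea with the standard `re` module so the pattern can be tried without FastAI. The function name `label_from_filename` is hypothetical, and the dot before `jpg` is escaped here so that it matches a literal dot only.

```python
import re

# Pattern mirroring the one passed to RegexLabeller; group 1 captures the label.
LABEL_PATTERN = re.compile(r"(.+)_\d+\.jpg$")


def label_from_filename(filename):
    """Hypothetical helper: return the label encoded in a name like 'hot_dog_1.jpg'."""
    match = LABEL_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"Cannot extract a label from {filename!r}")
    return match.group(1)


print(label_from_filename("hot_dog_1.jpg"))       # hot_dog
print(label_from_filename("not_hot_dog_42.jpg"))  # not_hot_dog
```

File names that do not end in `_<number>.jpg` do not match the pattern, so it is worth renaming such files before building the dataloaders.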

In [None]:
# use ONE of the two lines below; if both run, the second overrides the first
dls = dogs.dataloaders(PATH) # GPU implementation
dls = dogs.dataloaders(PATH, num_workers=0) # CPU implementation

In [None]:
# show random images from the batch
dls.show_batch(max_n=3)

In [None]:
# creates a convolutional learner
# other architectures are supported as well, but
# please stick to resnet models - resnet18, resnet34, resnet50, resnet101, resnet152
# in general, more layers mean longer training and usually better performance

learn = cnn_learner(dls, resnet50, metrics=[error_rate, accuracy])

In [None]:
# helps to find a good learning rate by plotting the loss against the learning rate
# Check this article for the maths behind this: https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html
learn.lr_find()

In [None]:
# you might want to change the number of training epochs - first argument
# you should adjust the learning rate based on the previous step - second argument

learn.fine_tune(5, 1e-3)

In [None]:
# export the FastAI model

learn.export('models/my_own_hot_dog_resnet50_256_256.pkl')

In [4]:
# loading the FastAI model
learn = load_learner('models/hot_dog_resnet50_256_256.pkl') # loading trained models

In [5]:
# exporting labels from dataloaders
# the FastAI dataloaders give quick access to the labels

labels = learn.dls.vocab
labels

['hot_dog', 'not_hot_dog']

In [7]:
%%time
# predicting with FastAI
# FastAI's predict already wraps the model, including the softmax layer
learn.predict('dataset/hot_dog/hot_dog_1.jpg')

In [8]:
# getting the PyTorch model
# the .model attribute of the FastAI learner gives us the 'pure' PyTorch model
# eval() sets the model to 'prediction' mode - no backward propagation needed

In [9]:
fastai_model = learn.model.eval()
# ResNet models from PyTorch hub come without the final softmax layer
# learn.predict already applies it, along with the image transformation (image -> tensor, resize to 256)
# with the 'pure' PyTorch model we have to add both ourselves
softmax_layer = torch.nn.Softmax(dim=1)

final_model = nn.Sequential(fastai_model, softmax_layer)
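
To make the role of the added softmax layer concrete, here is a small pure-Python sketch of the computation. It is not the PyTorch implementation, only the same formula applied to a list of raw scores. The input values are made up for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the two classes (illustrative values only).
probs = softmax([2.0, -1.0])
print(probs)
```

The larger raw score always receives the larger probability, so the predicted class does not change. Softmax only rescales the outputs so they can be read as probabilities.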

In [10]:
# loading an image and converting to tensor
from torchvision import transforms

from PIL import Image
image = Image.open('dataset/hot_dog/hot_dog_1.jpg')

# creating a transformation pipeline
transformation = transforms.Compose([
            transforms.Resize([256,256]),
            transforms.ToTensor()
        ])

image_tensor = transformation(image).unsqueeze(0)
image_tensor.shape

torch.Size([1, 3, 256, 256])

In [13]:
%%time
with torch.no_grad():
    results = final_model(image_tensor)

CPU times: user 178 ms, sys: 19.1 ms, total: 197 ms
Wall time: 193 ms


## Get label and probabilities

In [14]:
labels[np.argmax(results.detach().numpy())], results.detach().numpy()


('hot_dog', array([[0.9486536 , 0.05134634]], dtype=float32))

# Converting to ONNX

We are converting the final model, which already includes the softmax layer.

In [None]:
torch.onnx.export(
    final_model,
    torch.randn(1, 3, 256, 256),  # dummy input; its shape defines the input the exported model expects
    "models/my_own_hot_dog_resnet50_256_256.onnx",
    export_params=True,
    input_names=["image_256_256"],
    output_names=["hot_dog"],
    opset_version=11
)

# 2. Running ONNX inference

In [None]:
import numpy as np
import onnxruntime as rt

In [None]:
sess = rt.InferenceSession("models/hot_dog_resnet50_256_256.onnx")

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
input_name, label_name # values defined at export 

## Converting image to tensor

ONNX Runtime has the advantage of a small footprint. We don't want to bloat our deployment with PyTorch's transforms just to convert the image to a tensor, as we did previously. Instead, we can use Pillow with NumPy to load and prepare the image for inference.

### Image load & prep

In [None]:
from PIL import Image

image = Image.open('dataset/hot_dog/hot_dog_1.jpg')
image = image.resize((256,256))
print(image.size, image.mode)  # a PIL image exposes .size (width, height), not .shape


# now our image is represented by 3 channels - Red, Green, Blue
# each channel holds 256 x 256 values representing pixel intensities
image = np.array(image)
print('Conversion to tensor: ',image.shape)

# the model expects a (1, 3, 256, 256) input, so the channel axis has to come first
image = image.transpose(2,0,1).astype(np.float32)
print('Transposing the tensor: ',image.shape)

# our image is currently represented by values ranging between 0-255
# we need to convert these values to 0.0-1.0 - those are the values that are expected by our model

print('Integer value: ', image[0][0][40])
image /= 255
print('Float value: ', image[0][0][40])

# expanding the already existing tensor with a leading dimension (similar to unsqueeze(0))
# currently our tensor only has a rank of 3, which needs to be expanded to 4 - (1, 3, 256, 256)
# 1 can be considered the batch size

image = image[None, ...]
print('Final shape of our tensor', image.shape)
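
The preparation steps above can be collected into one reusable function. This is a minimal sketch that assumes the input is an RGB image already resized and converted with `np.array(image)`. The function name `to_model_input` is hypothetical. Images in other modes, such as grayscale or RGBA, would first need `image.convert('RGB')`.

```python
import numpy as np


def to_model_input(pixels):
    """Convert an HxWx3 uint8 array into a 1x3xHxW float32 array scaled to [0, 1]."""
    if pixels.ndim != 3 or pixels.shape[2] != 3:
        raise ValueError(f"Expected an HxWx3 RGB array, got shape {pixels.shape}")
    chw = pixels.transpose(2, 0, 1).astype(np.float32)  # HWC -> CHW
    chw /= 255.0                                         # 0-255 -> 0.0-1.0
    return chw[None, ...]                                # add the batch dimension


# Random pixels stand in for np.array(image) so the sketch runs without an image file.
fake_pixels = np.random.randint(0, 256, size=(256, 256, 3), dtype=np.uint8)
batch = to_model_input(fake_pixels)
print(batch.shape, batch.dtype)  # (1, 3, 256, 256) float32
```

Keeping the steps in one function makes it harder for training-time and inference-time preprocessing to drift apart.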


## Run inference with ONNX Runtime

In [None]:
import onnxruntime as rt

sess = rt.InferenceSession('models/hot_dog_resnet50_256_256.onnx')

input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name
input_name, output_name

In [None]:
results = sess.run([output_name], {input_name: image})[0]

In [None]:
labels[np.argmax(results)], results, labels
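
The raw output is a `(1, 2)` array of probabilities, one per label. The sketch below shows one way to turn it into a readable prediction. The helper name `decode_prediction` is hypothetical, and the example values are copied from the PyTorch output shown earlier in this notebook.

```python
import numpy as np


def decode_prediction(results, labels):
    """Return (label, probability) for the highest-scoring class of a 1xN output."""
    probs = np.asarray(results).reshape(-1)  # flatten the batch of one
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])


# Values copied from the PyTorch output shown earlier in this notebook.
example = np.array([[0.9486536, 0.05134634]], dtype=np.float32)
label, prob = decode_prediction(example, ["hot_dog", "not_hot_dog"])
print(label, round(prob, 3))  # hot_dog 0.949
```

This helper assumes a batch of exactly one image. For larger batches, `np.argmax(results, axis=1)` would be needed instead.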

## Gotchas

It's really important to use the expected input with ONNX. Consider the following scenario: a 1x3x224x224 tensor is passed to a model with a defined input of 1x3x256x256.

Let's see what happens:

In [None]:
from PIL import Image

image = Image.open('dataset/hot_dog/hot_dog_1.jpg')
image = image.resize((224,224)) # <---- values changed here
print(image.size, image.mode)  # a PIL image exposes .size (width, height), not .shape

image = np.array(image)
print('Conversion to tensor: ',image.shape)

image = image.transpose(2,0,1).astype(np.float32)
print('Transposing the tensor: ',image.shape)

print('Integer value: ', image[0][0][40])
image /= 255
print('Float value: ', image[0][0][40])

image = image[None, ...]
print('Final shape of our tensor', image.shape)

In [None]:
results = sess.run([output_name], {input_name: image})[0]
# an ERROR is expected - no worries

## How to Debug

To check the inputs of a model you can use a tool like Netron to visualize it: https://netron.app. Desktop version available here: https://github.com/lutzroeder/netron

Alternatively, you can read the expected input dimensions directly from the session with the following line:


In [None]:
# shows the required model input
sess.get_inputs()[0].shape
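
The expected shape can also be compared with the actual tensor shape before calling `sess.run`, which gives a clearer message than the runtime error. The following is a minimal sketch in plain Python. The helper name `shape_mismatches` is hypothetical. It assumes that non-integer entries in the expected shape (`None` or a symbolic name) mark dynamic axes that accept any size.

```python
def shape_mismatches(expected, actual):
    """List the axes where `actual` disagrees with `expected`.

    Entries of `expected` that are not integers (None or a symbolic name)
    are treated as dynamic axes that accept any size.
    """
    if len(expected) != len(actual):
        return [f"rank {len(actual)} != expected rank {len(expected)}"]
    return [
        f"axis {i}: got {got}, expected {want}"
        for i, (want, got) in enumerate(zip(expected, actual))
        if isinstance(want, int) and want != got
    ]


print(shape_mismatches([1, 3, 256, 256], (1, 3, 224, 224)))
# ['axis 2: got 224, expected 256', 'axis 3: got 224, expected 256']
```

In this notebook the call would look like `shape_mismatches(sess.get_inputs()[0].shape, image.shape)`. An empty list means the tensor is safe to pass to the model.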