# EE382V - Hardware Architecture for Machine Learning
## NVIDIA Tensor Cores for Accelerating Machine Learning Workload

## Notebook 2 - Inference using Trained Model with FP32

In this notebook, we will use our trained model to make predictions and check whether it can tell cats from dogs in images. We will run inference on the test dataset. Inference is less demanding than training in terms of computational resources. Hardware designed for inference usually targets low power consumption, since many inference workloads run on edge devices (e.g., smartphones).

### Import Library
We import the libraries that are used throughout this notebook.

In [None]:
import os
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
from PIL import Image

### Global Variable
Here, we define global variables.

In [None]:
data_dir       = './'
raw_dir        = f'{data_dir}/raw'
raw_dogs_dir   = f'{raw_dir}/dogs'
raw_cats_dir   = f'{raw_dir}/cats'
train_dir      = f'{data_dir}/train'
train_dogs_dir = f'{train_dir}/dogs'
train_cats_dir = f'{train_dir}/cats'
val_dir        = f'{data_dir}/val'
val_dogs_dir   = f'{val_dir}/dogs'
val_cats_dir   = f'{val_dir}/cats'
log_dir        = f'{data_dir}/log'
chk_dir        = f'{data_dir}/checkpoint'
test_dir       = f'{data_dir}/test'

### GPU Initialization
We will use a GPU to make predictions with our model. The TACC Maverick2 V100 Compute Node is equipped with two NVIDIA Tesla V100 GPUs. In this assignment, we will use only one of them. If no GPU is available, we will fall back to the CPU.

In [None]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

### Dataset Normalization
Before we feed the input data (test images) into our model to get the prediction, we need to preprocess the images. The preprocessing step includes normalization and resizing to 224px by 224px.

In [None]:
def apply_test_transforms(inp):
    out = transforms.functional.resize(inp, [224,224])
    out = transforms.functional.to_tensor(out)
    out = transforms.functional.normalize(out, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    return out
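
To make the arithmetic concrete, here is a minimal sketch of what the `to_tensor` and `normalize` steps do to a single RGB pixel. It uses plain Python instead of torchvision, and `normalize_pixel` is a hypothetical helper written only for this illustration. The mean and standard deviation values are the ones used in the cell above.

```python
# Per-channel statistics copied from apply_test_transforms above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb_uint8):
    """Hypothetical helper: mimic to_tensor + normalize for one 8-bit RGB pixel."""
    # to_tensor scales 8-bit values from [0, 255] to [0.0, 1.0]
    scaled = [v / 255.0 for v in rgb_uint8]
    # normalize applies (value - mean) / std for each channel
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]

white = normalize_pixel([255, 255, 255])
black = normalize_pixel([0, 0, 0])
print("white:", white)
print("black:", black)
```

After normalization, a black pixel maps to negative values and a white pixel to positive values in every channel, so the inputs are roughly centered around zero.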

### Dataset Handler
We need a handler to open our input data (test images) and apply the transformation before feeding them into our model to get the prediction.

In [None]:
def test_data_from_fname(fname):
    im = Image.open(f'{test_dir}/{fname}')
    return apply_test_transforms(im)

### Load The Model
We load the trained model from the checkpoint that we saved after we finished training it.

In [None]:
# Download the pre-trained model of ResNet-50
model_conv  = torchvision.models.resnet50(pretrained=True)

# Freeze all existing parameters (we do not need gradients for them).
# Parameters of newly constructed modules have requires_grad=True by default.
for param in model_conv.parameters():
    param.requires_grad = False

# We change the parameter of the final fully connected layer.
# We have to keep the number of input features to this layer.
# We change the output features from this layer into 2 features (i.e., we only have two classes).
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)

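# Define the location of the checkpoint that holds the trained model
check_point = f'{chk_dir}/model-checkpoint-fp32.tar'

# Load the checkpoint onto the selected device (this also works when falling back to the CPU)
checkpoint = torch.load(check_point, map_location=device)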
print("Checkpoint Loaded")
print(f'Val loss: {checkpoint["best_val_loss"]}, Val accuracy: {checkpoint["best_val_accuracy"]}')
model_conv.load_state_dict(checkpoint['model_state_dict'])
 
# Copy the model to GPU memory
model_conv = model_conv.to(device)

# Set the model to eval mode
model_conv.eval()

### Define Prediction Function
We will create a function that returns the probability that an image depicts a dog. If the probability is at least 50%, the model predicts that it is an image of a dog. If the probability is less than 50%, the model predicts that it is not an image of a dog, but an image of a cat.

In [None]:
def predict_dog_prob_of_single_instance(model, tensor):
    batch = torch.stack([tensor])
    batch = batch.to(device) # Send the input to GPU
    softMax = nn.Softmax(dim = 1)
    with torch.no_grad(): # No gradients are needed for inference
        preds = softMax(model(batch))
    return preds[0,1].item() # Index 1 is the dog class
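
The softmax step and the 50% decision rule can be illustrated with a small worked example in plain Python. This is only a sketch: the logits below are made-up numbers, not real model outputs, and `softmax` and `label_from_dog_prob` are hypothetical helpers. As in the function above, index 1 is assumed to be the dog class.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_from_dog_prob(dog_prob, threshold=0.5):
    """Hypothetical helper: apply the same rule as the prediction cell."""
    return "dog" if dog_prob >= threshold else "cat"

logits = [1.2, -0.3]                        # made-up scores for [cat, dog]
probs = softmax(logits)
dog_prob = probs[1]
print(f"cat: {probs[0]:.3f}, dog: {dog_prob:.3f}")
print("predicted:", label_from_dog_prob(dog_prob))
```

Because there are only two classes, the cat probability is always one minus the dog probability, which is why a single number is enough to describe the prediction.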

### Prediction
Let's make predictions on some images from the test dataset. You can change the number of test images that you want to predict.

In [None]:
###################### Change as needed ######################
num_of_test_images = 32
##############################################################

test_data_files = os.listdir(test_dir)

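# plt.subplots returns a single Axes (not an array) for one subplot,
# so we always create at least two subplots to keep ax indexable.
if(num_of_test_images<2) :
    num_of_test_images = 2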
    
image_inferenced   = 0
fig, ax = plt.subplots(num_of_test_images, figsize=(num_of_test_images*5, num_of_test_images*5))
fig.tight_layout(pad=5)

for fname in test_data_files :    
    im         = Image.open(f'{test_dir}/{fname}')
    imstar     = apply_test_transforms(im)    
    outputs    = predict_dog_prob_of_single_instance(model_conv, imstar)
    ax[image_inferenced].imshow(im)
    ax[image_inferenced].axis('on')
    if(outputs<0.5) :
        ax[image_inferenced].set_title('predicted: cat \n probability: ' + str(1-outputs))
    else :
        ax[image_inferenced].set_title('predicted: dog \n probability: ' + str(outputs))
    image_inferenced += 1
    if(image_inferenced>=num_of_test_images) :
        break
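
Note that `os.listdir` returns entries in arbitrary order and includes every entry in the directory, so a stray non-image file would make `Image.open` fail. One possible way to guard against this is sketched below. The helper `list_image_files` and the extension list are assumptions made for this illustration, and the demo runs on a temporary directory rather than the real test dataset.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def list_image_files(directory, limit):
    """Hypothetical helper: return up to `limit` image file names in sorted order."""
    names = sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )
    return names[:limit]

# Small demo on a temporary directory with two images and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.jpg", "a.JPG", "notes.txt"]:
        open(os.path.join(tmp, name), "w").close()
    picked = list_image_files(tmp, limit=2)

print(picked)
```

Sorting the names also makes the set of displayed images reproducible from run to run.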

### End
This is the end of Notebook 2. Please proceed to Notebook 3, where we will use FP16 to train our model.

!!IMPORTANT!! To close this Notebook, use File -> Close and Halt. This also kills the Python process associated with this Notebook.

Version 0.9  - January 7th, 2020 - ©2020 hanindhito@bagus.my.id