This notebook shows an example of loading the same PyTorch model from two export formats to run the same inference. PyTorch has two main ways of exporting models:

1. Exporting only the weights; loading them requires the model definition in code.
2. Exporting the model with the structure and weights combined in a compiled file format, TorchScript. (TorchScript itself has two flavors that we won't get into.)
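
The two routes can be sketched on a toy model. This is only an illustration, not the MegaDetector export: `TinyNet` and the file names are hypothetical, and tracing is used here as one of the TorchScript flavors.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a real detector."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


model = TinyNet().eval()
example = torch.rand(1, 4)
tmpdir = tempfile.mkdtemp()

# 1. Weights only: loading needs the TinyNet class definition in code.
weights_path = os.path.join(tmpdir, "tiny_weights.pt")
torch.save(model.state_dict(), weights_path)
reloaded = TinyNet().eval()
reloaded.load_state_dict(torch.load(weights_path))

# 2. TorchScript: structure and weights in one file, loadable without the class.
script_path = os.path.join(tmpdir, "tiny.torchscript")
torch.jit.trace(model, example).save(script_path)
scripted = torch.jit.load(script_path)

# Both reloaded models should give the same output for the same input.
with torch.no_grad():
    same = torch.allclose(reloaded(example), scripted(example))
print(same)
```

The practical difference is in the loading step: the first route needs `TinyNet` importable wherever the weights are loaded, while the second only needs `torch`.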

### Exporting the model for inference
See the README for instructions on downloading the model weights.
Clone the yolov5 repo (use [this commit](https://github.com/ultralytics/yolov5/blob/6dd6aea0866ba9115d38e2989f59cf1039b3c9d2/export.py) if master doesn't work), then run:

In [1]:
!python ../api/yolov5/export.py --weights ../models/megadetectorv5/md_v5a.0.0.pt --img 640 --batch 1

python: can't open file 'export.py': [Errno 2] No such file or directory


## Loading test images

In [None]:
import os
# TNC test images (local):
test_images_local = []
test_images_dir = os.path.abspath(os.path.join(os.path.abspath(''), '..', 'input'))
local_image_files = [
    'sample-img-empty.jpg',
    'sample-img.jpg',
    'sample-img-skunk-large.jpg',
    'sample-img-rodent.jpg',
    'sample-img-fox.jpg',
    'sample-img-fox-2.jpg',
]
for fil in local_image_files:
    test_images_local.append(os.path.join(test_images_dir, fil))

## TorchScript model loading

In [1]:
import torch
import numpy as np

model = torch.jit.load('../models/megadetectorv5/md_v5a.0.0.torchscript')
# NMS parameters; these could instead be set in the inference BaseHandler subclass
model.conf = 0.10  # NMS confidence threshold
model.iou = 0.45  # NMS IoU threshold
model.agnostic = False  # NMS class-agnostic
model.multi_label = False  # NMS multiple labels per box
model.max_det = 1000  # maximum number of detections per image

Wall-time inference for a single 2048x2048 image on CPU was measured at 5.55 seconds; the run recorded below took 7.94 s, so expect some variation between runs.

In [4]:
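%%time
import skimage.io as skio

impath = test_images_local[2]  # sample-img-skunk-large.jpg
arr = skio.imread(impath)
# three_channel_arr_to_shape is assumed to be defined elsewhere; it is not part of this notebook
padded_arr = three_channel_arr_to_shape(arr, (2048, 2048))
im = torch.from_numpy(padded_arr)
# HWC -> CHW, cast to float32, and add a batch dimension
im = torch.moveaxis(im, 2, 0).to("cpu").float()[None, ...]
result = model(im)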

CPU times: user 35.6 s, sys: 3.48 s, total: 39.1 s
Wall time: 7.94 s
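
The cell above calls `three_channel_arr_to_shape`, which is not defined in this notebook. A minimal sketch of what such a helper might do, assuming it zero-pads an H x W x 3 image at the bottom and right up to the target shape. The padding value, the placement, and the error raised for oversized images are all assumptions, not the project's actual implementation.

```python
import numpy as np


def three_channel_arr_to_shape(arr, shape):
    """Zero-pad an (H, W, 3) array at the bottom/right to (shape[0], shape[1], 3).

    Assumed behavior; the real helper may pad or resize differently.
    """
    height, width, channels = arr.shape
    target_h, target_w = shape
    if channels != 3:
        raise ValueError("expected a three-channel image")
    if height > target_h or width > target_w:
        raise ValueError("image is larger than the target shape")
    padded = np.zeros((target_h, target_w, 3), dtype=arr.dtype)
    padded[:height, :width, :] = arr
    return padded


# Small synthetic image instead of a real photo
demo = np.full((3, 5, 3), 255, dtype=np.uint8)
padded_demo = three_channel_arr_to_shape(demo, (8, 8))
print(padded_demo.shape)  # (8, 8, 3)
```

Padding at the bottom and right keeps the original pixel coordinates unchanged, so predicted boxes would not need to be shifted back afterwards.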




However, the shape of the output is not what we would expect. The lists of coordinates, predicted category IDs, and confidence scores need to be derived from this result; this is handled in mdv5_handler.py in the api/megadetectorv5 folder.

In [13]:
result[0].shape

torch.Size([1, 261120, 8])
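
The last dimension of 8 is consistent with the usual YOLOv5 raw layout of `[cx, cy, w, h, objectness, class scores...]` with three classes, and 261120 equals 3 x (256^2 + 128^2 + 64^2 + 32^2), which would match three anchors per grid cell at strides 8, 16, 32, and 64 on a 2048x2048 input. A minimal NumPy sketch of deriving boxes, scores, and category IDs under that assumption follows. It is not the code in mdv5_handler.py, the function names are hypothetical, and the thresholds simply reuse the values set on the model above.

```python
import numpy as np


def xywh_to_xyxy(boxes):
    """Convert [cx, cy, w, h] rows to [x1, y1, x2, y2]."""
    out = np.empty_like(boxes)
    out[:, 0] = boxes[:, 0] - boxes[:, 2] / 2
    out[:, 1] = boxes[:, 1] - boxes[:, 3] / 2
    out[:, 2] = boxes[:, 0] + boxes[:, 2] / 2
    out[:, 3] = boxes[:, 1] + boxes[:, 3] / 2
    return out


def nms(boxes, scores, iou_thres):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_thres]
    return keep


def decode_raw_predictions(pred, conf_thres=0.10, iou_thres=0.45):
    """pred: (num_candidates, 5 + num_classes) array for one image."""
    class_scores = pred[:, 5:] * pred[:, 4:5]  # class score * objectness
    class_ids = class_scores.argmax(axis=1)
    scores = class_scores.max(axis=1)
    mask = scores > conf_thres
    boxes = xywh_to_xyxy(pred[mask, :4])
    scores, class_ids = scores[mask], class_ids[mask]
    # Class-agnostic NMS for brevity (the notebook sets agnostic = False)
    keep = nms(boxes, scores, iou_thres)
    return boxes[keep], scores[keep], class_ids[keep]


# Synthetic candidates: two overlapping boxes and one low-confidence box
raw = np.array([
    [100, 100, 50, 50, 0.9, 0.8, 0.1, 0.1],
    [102, 101, 50, 50, 0.8, 0.7, 0.2, 0.1],
    [400, 400, 30, 30, 0.05, 0.5, 0.3, 0.2],
], dtype=np.float32)
boxes, scores, class_ids = decode_raw_predictions(raw)
print(boxes, scores, class_ids)
```

With the notebook's tensor, this could be called as `decode_raw_predictions(result[0][0].detach().numpy())`. In the synthetic example, the low-confidence candidate is dropped by the threshold and the second overlapping box is suppressed by NMS, leaving one detection.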

## Inference with the yolov5 code to load the model weights

In [17]:
# Same image as the TorchScript run above (the earlier assignment to test_images_local[-2] was immediately overwritten)
impath = "../input/sample-img-skunk-large.jpg"

Wall-time inference for a single 2048x2048 image on GPU is about 0.5 seconds.

In [44]:
import yolov5
yolomodel = yolov5.load('../models/megadetectorv5/md_v5a.0.0.pt')
result_lst = yolomodel(impath)
result_lst

<yolov5.models.common.Detections at 0x7fb8a0b1ae50>
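
Unlike the raw TorchScript output, the `Detections` object already holds post-processed results. Assuming the installed yolov5 version exposes `pred` as in the upstream YOLOv5 `Detections` class (a list with one `(n, 6)` tensor per image, rows of `[x1, y1, x2, y2, conf, cls]`), the detections could be turned into plain records roughly as follows. `detections_to_records` is a hypothetical helper, and the array below is synthetic rather than real model output.

```python
import numpy as np


def detections_to_records(pred):
    """Turn an (n, 6) array of [x1, y1, x2, y2, conf, cls] rows into dicts."""
    records = []
    for x1, y1, x2, y2, conf, cls in np.asarray(pred, dtype=float):
        records.append({
            "bbox_xyxy": [x1, y1, x2, y2],
            "confidence": conf,
            "category_id": int(cls),
        })
    return records


# Synthetic stand-in for result_lst.pred[0]
fake_pred = np.array([[75.0, 75.0, 125.0, 125.0, 0.72, 0.0]])
records = detections_to_records(fake_pred)
print(records)
```

In the notebook this might be called as `detections_to_records(result_lst.pred[0].cpu().numpy())`.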