# Lab 08 - Model Prediction Speed of Deep Neural Networks

During this lab, we will continue exploring the prediction speed of models. This time we will focus
on deep neural networks.

## 1. Model Serving

We will focus on an object detection task. 

The first task is to select a model from a popular deep learning library. You may start from
pre-trained weights, but you should be able to fine-tune the model on a dataset of your choice.
Then, save the model and serve it using NVIDIA Triton Inference Server.
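
Triton loads models from a model repository directory, where each model has its own folder containing a `config.pbtxt` and one numbered subfolder per version. The sketch below shows one way to prepare that layout for a model exported to ONNX. The model name `detector`, the tensor names `images` and `boxes`, and their shapes are hypothetical placeholders, not values from this lab; they must match the inputs and outputs of the model you actually export.

```python
from pathlib import Path

# Assumption: the model was exported to ONNX and takes a single CHW float image.
# The tensor names and dims below are placeholders; adjust them to your export.
CONFIG_TEMPLATE = """\
name: "{name}"
backend: "onnxruntime"
max_batch_size: 0
input [
  {{
    name: "images"
    data_type: TYPE_FP32
    dims: [ 3, -1, -1 ]
  }}
]
output [
  {{
    name: "boxes"
    data_type: TYPE_FP32
    dims: [ -1, 4 ]
  }}
]
"""


def create_model_repository(root, name, version=1):
    """Create <root>/<name>/config.pbtxt and the <root>/<name>/<version>/ directory."""
    model_dir = Path(root) / name
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    config_path = model_dir / "config.pbtxt"
    config_path.write_text(CONFIG_TEMPLATE.format(name=name))
    return config_path, version_dir
```

Calling `create_model_repository("model_repository", "detector")` creates the folders and the config file. The exported model would then be copied to `model_repository/detector/1/model.onnx`, which is the file name the ONNX Runtime backend looks for by default.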


If you prefer, you may use a different dataset of similar size or choose another task of comparable
complexity.

## 2. Measure Inference Speed

We are interested in the performance of the serving setup. Similar to the previous lab, we can use a
general-purpose tool for load testing and benchmarking web services. 
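
Before setting up a full load-testing tool, it can help to see what such a tool measures. The following is a minimal sketch using only the standard library, and it is not a replacement for a dedicated tool. It assumes a hypothetical zero-argument callable `send` that performs one inference request against your server and returns when the response arrives.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(send):
    """Call send() once and return its latency in seconds."""
    start = time.perf_counter()
    send()
    return time.perf_counter() - start


def benchmark(send, n_requests=100, concurrency=4):
    """Issue n_requests calls to send() from `concurrency` threads and summarize them."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, [send] * n_requests))
    wall_time = time.perf_counter() - wall_start

    # quantiles(n=100) returns 99 cut points; index 49 is the median, 94 is p95.
    cuts = statistics.quantiles(latencies, n=100)
    return {
        "requests": n_requests,
        "rps": n_requests / wall_time,
        "mean_ms": 1000 * statistics.mean(latencies),
        "p50_ms": 1000 * cuts[49],
        "p95_ms": 1000 * cuts[94],
    }
```

For example, `benchmark(lambda: time.sleep(0.01), n_requests=50)` exercises the harness with a stand-in for a real request. Reporting percentiles alongside the mean is a deliberate choice here, since tail latency often behaves differently from the average under concurrent load.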

## 3. Experiment with Models and Serving Options

Experiment with different model architectures. In particular, try to prepare several models of
significantly different sizes and compare the latency and inference throughput (requests per
second, RPS) that you can achieve with each.

Also experiment with different serving options, such as parallelization, model quantization, and
protocols (REST vs. gRPC). Try to draw conclusions from the results. Can you observe any difference
in inference speed? Does batching influence the results? If possible, provide plots to visualize
your findings. You can obtain raw data from the load-testing tool `Locust` for further analysis:
https://docs.locust.io/en/stable/retrieving-stats.html#retrieve-test-statistics-in-csv-format.
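
To compare runs, the CSV statistics exported by Locust can be reduced to a few numbers per endpoint. The sketch below assumes the stats file contains the columns `Name`, `Requests/s`, `50%`, and `95%`, with response times in milliseconds. Check these names against the header of the file your Locust version produces, since they are an assumption here.

```python
import csv


def summarize_locust_stats(csv_file):
    """Map each endpoint name to its throughput and latency percentiles.

    csv_file is an open text file (or any iterable of lines) in the format
    of Locust's stats CSV. Column names are assumptions; verify them first.
    """
    summary = {}
    for row in csv.DictReader(csv_file):
        summary[row["Name"]] = {
            "rps": float(row["Requests/s"]),
            "median_ms": float(row["50%"]),
            "p95_ms": float(row["95%"]),
        }
    return summary
```

With a hypothetical file name `run1_stats.csv`, usage would look like `with open("run1_stats.csv", newline="") as f: stats = summarize_locust_stats(f)`. Collecting one such summary per model or serving option gives the raw numbers for the comparison plots.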

```python
from torchvision.io.image import decode_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2, FasterRCNN_ResNet50_FPN_V2_Weights
from torchvision.utils import draw_bounding_boxes
from torchvision.transforms.functional import to_pil_image

# This path refers to a sample image in the torchvision source repository.
# Replace it with the path to an image that exists on your machine.
img = decode_image("test/assets/encode_jpeg/grace_hopper_517x606.jpg")

# Step 1: Initialize model with the best available weights
weights = FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn_v2(weights=weights, box_score_thresh=0.9)
model.eval()

# Step 2: Initialize the inference transforms
preprocess = weights.transforms()

# Step 3: Apply inference preprocessing transforms
batch = [preprocess(img)]

# Step 4: Use the model and visualize the prediction
prediction = model(batch)[0]
labels = [weights.meta["categories"][i] for i in prediction["labels"]]
box = draw_bounding_boxes(img, boxes=prediction["boxes"],
                          labels=labels,
                          colors="red",
                          width=4, font_size=30)
im = to_pil_image(box.detach())
im.show()
```

Running the cell as-is fails when the sample image is not present at that relative path:

```text
RuntimeError: [Errno 2] No such file or directory: 'test/assets/encode_jpeg/grace_hopper_517x606.jpg'
```