# Deploying ANN models on USB accelerators

One of the most important issues with large ANN models is the time it takes to compute the result for a single input, which can be quite long. Another issue is that highly advanced models, often consisting of billions of parameters, require a lot of energy to run.

This is especially troubling when we want to use such models in specific applications, e.g. a vision system, where the model should be highly effective (detect as many objects as possible with high probability) within a short period of time. For a standard camera the image acquisition frequency is 60 Hz, which means that the model must perform inference on an input image in less than about 16.7 ms. Additionally, we are often limited by the hardware (be it a small CPU, little RAM or a limited power supply).
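
The per-frame time limit quoted above follows directly from the camera frame rate. As a quick worked example (the function name `frame_budget_ms` is only illustrative and not part of any library):

```python
def frame_budget_ms(frames_per_second: float) -> float:
    """Return the time available for processing one frame, in milliseconds."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return 1000.0 / frames_per_second


# Print the budget for two common camera frame rates.
for fps in (30, 60):
    print(f"{fps} Hz -> {frame_budget_ms(fps):.1f} ms per frame")
```

Note that this budget covers everything done per frame (image acquisition, preprocessing, inference and drawing), so the time left for inference alone is smaller.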

For that reason, some companies have developed products whose purpose is to speed up ANN models while minimising the energy consumed. In today's laboratory we will use USB accelerators to implement an application that takes an image from a webcam and uses the MobileNetV2 model to classify the objects in the image. The application will be quite simple, and the only difference will be the core hardware on which the model is executed, that is:

- CPU (on local machine)
- Intel Neural Compute Stick 2 (VPU)
- Google Coral USB Accelerator (TPU)

In this notebook we will focus on installing the frameworks for deploying the model on the devices and on verifying the application (using the local CPU).

## Framework installation

### Python libraries

When working on various applications with different devices, it is good practice not to dump all the necessary libraries into one place. That is why we will use a Python virtual environment, so as not to meddle with the global environment. This way, if we mess something up with the application dependencies, the main interpreter will remain intact. Remember to use a Python version not lower than 3.7 and not higher than 3.10. To create a Python virtual environment, execute the following in your working directory (`.venv_ncs_cua` is an example name; you can change it if you want to):

```
python3 -m venv .venv_ncs_cua
source .venv_ncs_cua/bin/activate
```

By executing `source .venv_ncs_cua/bin/activate` you activate the virtual environment, meaning that every Python script executed in the current terminal session will use the Python interpreter from it.
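
If you are unsure whether the activation worked, a small check from inside Python can help. The sketch below is only an illustration (the helper names are made up for this example); it assumes the environment was created with `python3 -m venv` as shown above:

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


def python_version_supported(version=None) -> bool:
    """Check the 3.7-3.10 range recommended in this section."""
    major, minor = (version or sys.version_info)[:2]
    return (3, 7) <= (major, minor) <= (3, 10)


print("Inside a virtual environment:", in_virtual_env())
print("Python version supported:", python_version_supported())
```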

Next, install the following libraries using pip:

```
pip install jupyter opencv-python tensorflow==2.9.3 openvino==2022.3.1 openvino-dev[tensorflow2]==2022.3.1 --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0
```

Because support for the Intel NCS2 has been discontinued, we cannot use the latest OpenVINO version (the one you used in previous laboratories) and must work with an older one. This also means that some other libraries that OpenVINO depends on, like TensorFlow or NumPy, will be installed in older versions too (and this is why a virtual environment should be used).

Lastly, check that everything is in place:

```
pip check
```

Please also verify that OpenVINO was installed properly by executing the command below:

```
mo --help
```
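
The installed versions can also be compared with the pinned ones from Python. This is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is part of the standard library); the `EXPECTED` dictionary and the helper names are illustrative only:

```python
from importlib import metadata  # standard library from Python 3.8 onwards

# Versions pinned in the pip command earlier in this section.
EXPECTED = {"tensorflow": "2.9.3", "openvino": "2022.3.1", "openvino-dev": "2022.3.1"}


def installed_version(package: str):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(expected: dict) -> dict:
    """Map each package name to (installed_version, matches_expected)."""
    result = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        result[name] = (found, found == wanted)
    return result


for name, (found, ok) in report(EXPECTED).items():
    print(f"{name:15s} installed={found} expected={EXPECTED[name]} ok={ok}")
```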

### Intel NCS2

While the OpenVINO library alone would be enough to convert our model and run it on the local CPU, some additional configuration is necessary if we want to deploy it on the NCS. We will follow [this](https://www.intel.com/content/www/us/en/developer/articles/guide/get-started-with-neural-compute-stick.html) guide. First, download [OpenVINO Runtime](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/download.html?VERSION=v_2022_3_1&ENVIRONMENT=RUNTIME&OP_SYSTEM=LINUX&DISTRIBUTION=ARCHIVE) in version 2022.3.1 for Linux systems directly from the [archives](https://storage.openvinotoolkit.org/repositories/openvino/packages/2022.3.1/linux) (this is the easiest way to get everything we need). Below you will find the steps to do so from the command line:

```
curl -L https://storage.openvinotoolkit.org/repositories/openvino/packages/2022.3.1/linux/l_openvino_toolkit_ubuntu20_2022.3.1.9227.cf2c7da5689_x86_64.tgz --output openvino_2022.3.1.tgz
tar -xf openvino_2022.3.1.tgz
mv l_openvino_toolkit_ubuntu20_2022.3.1.9227.cf2c7da5689_x86_64 openvino_2022.3.1
```

Next, install the OpenVINO Runtime dependencies and the udev rules for the NCS:

```
cd openvino_2022.3.1
sudo -E ./install_dependencies/install_openvino_dependencies.sh
sudo ./install_dependencies/install_NCS_udev_rules.sh
cd ..
```

Lastly, before you try to deploy your model on the NCS2, remember to set the OpenVINO environment variables in your terminal session (we will do it when the time comes):

```
source openvino_2022.3.1/setupvars.sh
```
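
Once the variables are set and the stick is plugged in, you can ask OpenVINO which devices it sees. The sketch below is hedged: it uses `Core().available_devices` from `openvino.runtime`, and it assumes that an attached NCS2 is reported under the `MYRIAD` plugin name. The helper `list_openvino_devices` is a name made up for this example:

```python
def list_openvino_devices() -> list:
    """Return the device names OpenVINO can see, or [] if OpenVINO is not installed."""
    try:
        from openvino.runtime import Core
    except ImportError:
        return []
    return list(Core().available_devices)


devices = list_openvino_devices()
print("OpenVINO devices:", devices)
# Assumption: an attached NCS2 shows up with a name starting with "MYRIAD".
print("NCS2 visible:", any(name.startswith("MYRIAD") for name in devices))
```

An empty list or a list containing only `CPU` suggests that either the environment variables or the udev rules are not in place yet.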

### Google Coral USB Accelerator

Similarly to the NCS, Google Coral also requires additional libraries to be installed before it can connect with our application. We will follow [this](https://coral.ai/docs/accelerator/get-started) guide. Below you will find the steps to install the necessary libraries from the command line:

```
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
sudo apt-get install libedgetpu1-std
```
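
With the runtime installed and the accelerator plugged in, PyCoral can report the Edge TPU devices it detects. This is a minimal sketch, assuming the `pycoral` package from the pip command above is installed; `list_coral_devices` is an illustrative wrapper, not part of PyCoral:

```python
def list_coral_devices() -> list:
    """Return the Edge TPU devices PyCoral can see, or [] if PyCoral is not installed."""
    try:
        from pycoral.utils.edgetpu import list_edge_tpus
    except ImportError:
        return []
    return list(list_edge_tpus())


coral_devices = list_coral_devices()
print(f"Found {len(coral_devices)} Edge TPU device(s)")
for device in coral_devices:
    # Each entry is expected to describe one device (for example its type and path).
    print(device)
```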

Additionally, we will need the [Edge TPU compiler](https://coral.ai/docs/edgetpu/compiler/), which converts a prepared model into a more TPU-friendly form and thereby increases its runtime speed. Below you will find the steps to install it from the command line:

```
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
sudo apt-get update
sudo apt-get install edgetpu-compiler
```

Verify the installation:

```
edgetpu_compiler --help
```
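
Both command-line tools used in this notebook (`mo` and `edgetpu_compiler`) can also be looked up from Python, which is handy when the notebook kernel uses a different `PATH` than your terminal. A small sketch using only the standard library (`find_tools` is an illustrative name):

```python
import shutil


def find_tools(names) -> dict:
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for tool, path in find_tools(["mo", "edgetpu_compiler"]).items():
    status = path if path is not None else "NOT FOUND"
    print(f"{tool:18s} {status}")
```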

In case the compiler does not work, Google provides an online tool in the form of a [Colab Notebook](https://colab.research.google.com/github/google-coral/tutorials/blob/master/compile_for_edgetpu.ipynb). There you will find a set of instructions on how to compile your model using that notebook.

## Model preparation and application verification

Now everything is ready and we can start implementing our application. In this notebook you don't have to use the virtual environment created earlier, but if you would like to, you have to change the Python interpreter to the one inside your venv.

In [None]:
# import necessary libraries
from typing import Any
from pathlib import Path
import time
import numpy as np
import cv2
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input

from usb_accelerator_utils import draw_classification_results, run_program

For the application we will use the [MobileNetV2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/mobilenet_v2/MobileNetV2) model, but you can try [other](https://www.tensorflow.org/api_docs/python/tf/keras/applications) ones as well. The imported model comes with weights pre-trained on the [ImageNet](https://www.image-net.org/) dataset.

```
! Important !
```

As we will use a pretrained model imported from TensorFlow, we need to verify the shape of the input data and the range of its values (whether the input is normalized). For MobileNetV2 from Keras, the input data must be in the range `[-1, 1]`.
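
To make the `[-1, 1]` requirement concrete, the sketch below shows the arithmetic that maps 8-bit pixel values onto that range using plain NumPy. It is an illustration only, assuming inputs in `[0, 255]`; the function name is made up, and in the notebook itself the imported `preprocess_input` should still be used:

```python
import numpy as np


def scale_to_signed_unit_range(pixels: np.ndarray) -> np.ndarray:
    """Map pixel values from [0, 255] to [-1, 1] as 32-bit floats."""
    return pixels.astype(np.float32) / 127.5 - 1.0


sample = np.array([0, 64, 128, 255], dtype=np.uint8)
scaled = scale_to_signed_unit_range(sample)
print("min:", scaled.min(), "max:", scaled.max())
```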

In [None]:
# @TODO: Load the MobileNetV2 model with pre-trained weights from the TF library
MobNet = ...

print('Input  shape:', MobNet.input_shape)
print('Output shape:', MobNet.output_shape)

In [None]:
# @TODO: Set the shape of input data i.e. size of input image (it will be later used for input scaling)
INPUT_IMAGE_WIDTH  = ...
INPUT_IMAGE_HEIGHT = ...
print(f'Input  shape: ({INPUT_IMAGE_HEIGHT}, {INPUT_IMAGE_WIDTH})')

# @TODO: Load label names for MobileNet classes; you can find example files here: https://coral.ai/models/all/. Does the number of labels equal the model output shape?
LABELS = ...
print('Labels shape:', np.shape(LABELS))
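
Label files of this kind are usually plain text with one class name per line. The sketch below shows one possible way to read such a file and compare the label count with the number of model outputs. The file name and contents are made up for the demonstration, and whether a given file contains an extra entry (such as a leading `background` class) is something to check against the file you actually download:

```python
from pathlib import Path
import tempfile


def load_labels(path) -> list:
    """Read one label per line, skipping empty lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def labels_match_output(labels, num_outputs: int) -> bool:
    """Return True when there is exactly one label per model output."""
    return len(labels) == num_outputs


# Demonstration on a tiny temporary file (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo_labels.txt"
    demo_file.write_text("background\ntench\ngoldfish\n", encoding="utf-8")
    demo_labels = load_labels(demo_file)

print(demo_labels)
print("Matches a 3-class output:", labels_match_output(demo_labels, 3))
```

If the counts differ by one, the class indices predicted by the model will be shifted relative to the label names, so it is worth checking this before running the application.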

The core of our application differs depending on the hardware we use. The function below executes the TensorFlow model on the CPU.

In [None]:
def run_classification(exec_net: Any, img: np.ndarray, max_classes: int = 1, min_score: float = 0.0) -> float:
  """Perform one inference and draw results.

  Args:
    exec_net (Any): ANN model that performs classification.
    img (np.ndarray): Input image.
    max_classes (int, optional): Max number of best detections to display. Defaults to 1.
    min_score (float, optional): Min score a detection should have to be displayed. Defaults to 0.0.

  Returns:
    float: Inference time in seconds.
  """
  # @TODO: Prepare input image - modify image shape to fit into model input
  #        Hint1: How many dimensions does the input image need to have if only one image is processed at a time?
  #        Hint2: Use the `preprocess_input` function imported from TF to prepare the image for MobileNet
  conv_img = ...
  
  t0 = time.time()
  
  # @TODO: Perform one inference on prepared data
  result = ...
  
  elapsed = time.time() - t0
  
  # @TODO: Extract classification scores, make it as a one-dimensional array
  scores = ...
  
  class_idxs = np.arange(start=0, stop=scores.shape[0], step=1, dtype=int)
  draw_classification_results(
    img=img,
    class_idxs=class_idxs,
    scores=scores,
    labels=LABELS,
    max_classes=max_classes,
    min_score=min_score)
  return elapsed
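
The `max_classes` and `min_score` arguments are passed on to `draw_classification_results`, whose implementation is not shown in this notebook. As a hedged illustration of what such parameters typically mean, the sketch below picks the best-scoring classes from a score vector; `select_top_classes` is a hypothetical helper and not the actual logic of `usb_accelerator_utils`:

```python
import numpy as np


def select_top_classes(scores, max_classes: int = 1, min_score: float = 0.0) -> list:
    """Return up to max_classes (index, score) pairs with score >= min_score, best first."""
    scores = np.asarray(scores, dtype=float).ravel()
    # Sort indices by descending score and keep only the best max_classes.
    order = np.argsort(scores)[::-1][:max_classes]
    return [(int(i), float(scores[i])) for i in order if scores[i] >= min_score]


print(select_top_classes([0.1, 0.7, 0.2], max_classes=2, min_score=0.15))
```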

Finally, run the application.

In [None]:
run_program(
  exec_net=MobNet,
  c_func=run_classification,
  camera_idx=0,
  max_disp=3,
  min_score=0.1)

The last step is to export our model, so that it can be used later.

In [None]:
# @TODO: Provide the path to the directory where the model will be saved
model_tf_dir = Path(...)

model_tf_dir.mkdir(parents=True, exist_ok=True)
MobNet.save(str(model_tf_dir))
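
With the TensorFlow version used here, saving to a directory path without a file extension should produce a SavedModel directory. The sketch below is a quick sanity check under that assumption: it looks for the `saved_model.pb` file and the `variables` subdirectory. The helper name is illustrative, and the demonstration runs on an empty stand-in directory rather than a real export:

```python
from pathlib import Path
import tempfile


def looks_like_saved_model(model_dir) -> bool:
    """Return True if the directory has the basic layout of a TensorFlow SavedModel."""
    model_dir = Path(model_dir)
    return (model_dir / "saved_model.pb").is_file() and (model_dir / "variables").is_dir()


# Demonstration on a stand-in directory (no real model is involved).
with tempfile.TemporaryDirectory() as tmp:
    fake_dir = Path(tmp) / "fake_model"
    (fake_dir / "variables").mkdir(parents=True)
    before = looks_like_saved_model(fake_dir)
    (fake_dir / "saved_model.pb").touch()
    after = looks_like_saved_model(fake_dir)

print("Before adding saved_model.pb:", before)
print("After adding saved_model.pb:", after)
```

In the notebook you would call `looks_like_saved_model(model_tf_dir)` after the export cell has finished.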