# How to Use Warboy Vision Models

This notebook demonstrates how to use this project with the YOLOv8n object detection model.

## Prerequisites

### Make Python Environment

To follow this tutorial, you need Python 3.8 or higher. If you already have your own Python environment, you can skip this step. Otherwise, you can create a new Python environment using Conda.

First, here are the commands to install Miniconda:
```console
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ sh ./Miniconda3-latest-Linux-x86_64.sh
$ rm -rf Miniconda3-latest-Linux-x86_64.sh
$ source ~/.bashrc
```


After installing Miniconda, you can create and activate a new Python environment using the following commands:

```console
$ conda create -n furiosa-3.9 python=3.9
$ conda activate furiosa-3.9
```


### Install Driver, Firmware, and Runtime packages

The Driver, Firmware, and Runtime packages for the NPU device are installed through the APT server, so you first need to set up the APT server. You can follow the instructions in [Korean](https://developer.furiosa.ai/docs/latest/ko/software/installation.html) or [English](https://developer.furiosa.ai/docs/latest/en/software/installation.html).

After setting up the APT server, you can install the packages using the following command:

```console
$ sudo apt-get update && sudo apt-get install -y furiosa-driver-warboy furiosa-libnux
```


Next, install `furiosa-toolkit` and check the NPU devices in your environment using the following commands:

```console
$ sudo apt-get install -y furiosa-toolkit
$ furiosactl info --format full
```


### Install Furiosa Python SDK

The Furiosa SDK can be installed by following the instructions in [Korean](https://furiosa-ai.github.io/docs/latest/ko/) or [English](https://furiosa-ai.github.io/docs/latest/en/).

```console
$ pip install 'furiosa-sdk[full]'
```

### Install Datasets

If you have your own dataset or already downloaded the dataset, you can skip these steps.


In this notebook, we will use the COCO dataset. You can download the COCO dataset using the following command:

```console
$ ./coco.sh
```
This will download the COCO dataset and save it in the `datasets/coco` directory.


To run the web demo, you also need the demo videos. You can download them using the following command:

```console
$ ./demo_videos.sh
```

This will download the demo videos and save them in the `datasets/demo_videos` directory. The object detection and instance segmentation videos are placed in `datasets/demo_videos/detection`, and the pose estimation videos in `datasets/demo_videos/estimation`.

### Install required packages

You can install the required packages using the following command:

```console
$ pip install -r requirements.txt
```


### Build Yolo Decoders

In this project, C++ decoders are included for post-processing. You can build the decoders using the following command:

```console
$ ./build.sh
```


### Set Root Path


We should change the working directory to the project's root and add it to `sys.path` to ensure that modules and packages can be imported correctly, regardless of where the script is run from.


In [None]:
import os
import sys

ROOT = os.path.abspath(os.path.join(os.getcwd(), ".."))

os.chdir(ROOT)
sys.path.insert(0, ROOT)
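
The cell above assumes the notebook is started from a directory exactly one level below the project root. As a hedged alternative, the sketch below walks upward until it finds a marker file. The helper name `find_project_root` is hypothetical, and the choice of `requirements.txt` as the marker is an assumption based on the install step above, not something the project defines.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path, marker: str = "requirements.txt") -> Optional[Path]:
    """Return the closest ancestor of `start` (or `start` itself) containing `marker`.

    `marker` is an assumed file name; pick any file that only exists
    in the project's root directory.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    return None  # No ancestor contains the marker.
```

If the helper returns a path, it can be passed to `os.chdir` and `sys.path.insert` in the same way as `ROOT` above.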


### Installing a Custom CLI Tool (Optional)

This notebook does not use the custom CLI tool. If you want to run vision models on Warboy from the command line, you can install it using the following command:

```console
$ pip install .
```
This will install the `warboy-vision` command line tool, which you can use to run models on Warboy.

## Prepare Model

First, you need to prepare the configuration file for the model you want to use. In this notebook, we will use the YOLOv8n model. You can inspect its configuration in the `yolov8n.yaml` file.

In [None]:
from warboy_vision_models.warboy.tools.onnx_tools import OnnxTools
from warboy_vision_models.warboy import get_model_params_from_cfg

cfg = 'notebooks/yolov8n.yaml'
onnx_tools = OnnxTools(cfg)
param = get_model_params_from_cfg(cfg)


### Export ONNX

To run the model on Warboy, you need a quantized ONNX model. First, let's export the YOLOv8n model to ONNX format.


For YOLO models, accuracy drops after quantization because of the concatenation operator, which combines class results and box results along the channel axis at each anchor. To avoid this, we need to modify the model by removing the decoding part from the model output. You can do this by passing `need_edit=True` when exporting the model to ONNX format.


In [None]:
onnx_tools.export_onnx(need_edit=True)
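
To illustrate why concatenating box and class outputs can hurt quantized accuracy, here is a minimal NumPy sketch. It assumes the cause is a single quantization scale shared by values with very different ranges (pixel coordinates versus class probabilities). That mechanism, the `fake_quantize` helper, and the 640-pixel range are illustrative assumptions, not a description of how the Furiosa quantizer works internally.

```python
import numpy as np


def fake_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Quantize with one symmetric per-tensor scale, then dequantize."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax          # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale


rng = np.random.default_rng(0)
boxes = rng.uniform(0.0, 640.0, size=(1000, 4))    # assumed pixel coordinates
scores = rng.uniform(0.0, 1.0, size=(1000, 80))    # assumed class probabilities

# Case 1: boxes and scores concatenated, so they share one scale.
merged = np.concatenate([boxes, scores], axis=1)
scores_shared = fake_quantize(merged)[:, 4:]
err_shared = np.abs(scores_shared - scores).mean()

# Case 2: scores quantized on their own, with a scale fitted to [0, 1].
err_separate = np.abs(fake_quantize(scores) - scores).mean()

print(f"mean score error, shared scale:   {err_shared:.4f}")
print(f"mean score error, separate scale: {err_separate:.4f}")
```

With the shared scale, one quantization step is several pixels wide, so every class score rounds to zero. Keeping the two outputs separate lets each one use its own scale.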


### Quantize Model

Next, let's quantize the ONNX model. Quantization converts a high-precision (usually FP32) deep learning model to a lower precision, which reduces model size and memory cost and improves inference speed. A quantized model lets you serve inference more efficiently.

In the quantization phase, we need to prepare a calibration dataset, which is used to calibrate the quantization parameters of the model. In this notebook, we will use the COCO dataset for calibration.

You can change the calibration method and the number of calibration samples in the `yolov8n.yaml` file. The available quantization and calibration options are described in [Korean](https://developer.furiosa.ai/docs/latest/ko/software/quantization.html) or [English](https://developer.furiosa.ai/docs/v0.5.0/en/advanced/quantization.html).


In [None]:
onnx_tools.quantize()
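
To give a feel for what calibration does, the sketch below compares two generic ways of choosing a quantization range from calibration samples: plain min-max and a percentile clip. This is a conceptual illustration with made-up data. The function names are hypothetical, and the methods actually offered by the Furiosa SDK are the ones listed in the documentation linked above.

```python
import numpy as np


def minmax_range(samples: np.ndarray) -> tuple:
    """Use the smallest and largest observed values as the range."""
    return float(np.min(samples)), float(np.max(samples))


def percentile_range(samples: np.ndarray, pct: float = 99.9) -> tuple:
    """Clip the range to the central `pct` percent to ignore rare outliers."""
    lo = np.percentile(samples, 100.0 - pct)
    hi = np.percentile(samples, pct)
    return float(lo), float(hi)


def int8_scale(lo: float, hi: float) -> float:
    """Step size when [lo, hi] is mapped onto 256 integer levels."""
    return (hi - lo) / 255.0


rng = np.random.default_rng(0)
activations = rng.normal(0.0, 1.0, size=10_000)   # synthetic calibration data
activations[0] = 50.0                             # a single extreme outlier

mm = minmax_range(activations)
pc = percentile_range(activations)
print("min-max range:   ", mm, "scale:", int8_scale(*mm))
print("percentile range:", pc, "scale:", int8_scale(*pc))
```

The min-max range is stretched by the single outlier, which makes every quantization step coarser. The percentile range clips that outlier and keeps finer resolution for typical values.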


## Run Inference

### End to End Performance Test

Now, we will run the end-to-end performance test. This runs inference with the model on the COCO dataset and measures the mAP.

In [None]:
from warboy_vision_models.tests.e2e.test_object_det import test_warboy_yolo_accuracy_det

test_warboy_yolo_accuracy_det(
    model_name='yolov8n',
    model=param['onnx_i8_path'],
    input_shape=param['input_shape'],
    anchors=param['anchors'],
)
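
The mAP reported by the test is built on intersection over union (IoU), which decides whether a predicted box matches a ground-truth box. The following is a minimal sketch of IoU for two boxes in `(x1, y1, x2, y2)` format. The helper `box_iou` is illustrative and is not taken from this project's evaluation code.

```python
def box_iou(a, b) -> float:
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes overlapping in a 5x5 square: 25 / (100 + 100 - 25).
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

A prediction typically counts as a true positive when its IoU with a ground-truth box of the same class exceeds a chosen threshold.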


### Web Demo with FastAPI

To run the web demo, you need to prepare the demo configuration file. You can inspect the configuration in the `demo_config.yaml` file.

Once the web demo is running, you can access it at `http://localhost:20001` or `http://0.0.0.0:20001`.

If you're using a remote server, you should forward port 20001 to your local machine. You can do this by running the following command **on your local machine**:

```console
$ ssh -L 20001:localhost:20001 <username>@<ip_address>
```


In [None]:
from warboy_vision_models.demo.demo import run_web_demo

demo_cfg_path = 'notebooks/demo.yaml'

run_web_demo(
    cfg_path=demo_cfg_path
)
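
If the page does not load, it can help to check whether port 20001 is reachable at all before debugging the demo itself. The sketch below uses only the standard library and is meant to be run from a separate Python session on the machine where you open the browser. The helper name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("localhost", 20001)` should return `True` while the demo is running and, on a remote setup, while the SSH tunnel is active.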


### NPU Profiling

The Furiosa SDK provides a profiling tool for analyzing model performance. You can use it to measure the time taken by each operation in the model and to identify bottlenecks.

After you run the cell below, the trace file will be saved in the `models/trace` directory. You can visualize the trace using the Chrome web browser's Trace Event Profiling Tool (chrome://tracing), which helps you understand where time is spent and optimize the model accordingly.


While the trace file is being written, you may see warning messages such as `OpenTelemetry trace error occurred. cannot send span to the batch span processor because the channel is full`. You can ignore them.


In [None]:
from warboy_vision_models.tests.e2e.test_npu_performance import test_warboy_performance

test_warboy_performance(
    cfg=cfg,
    num_device=1,
)
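
Besides viewing the trace in the browser, you can summarize it in Python. The sketch below assumes the trace file is JSON in the Chrome Trace Event Format and records operations as complete (`"ph": "X"`) events with a `dur` field in microseconds. Whether the generated trace uses exactly this layout is an assumption, and both function names are hypothetical.

```python
import json
from collections import defaultdict


def summarize_trace(trace) -> list:
    """Sum the duration of complete ('X') events per event name.

    Accepts either a bare list of events or an object with a
    'traceEvents' list, the two layouts of the Chrome Trace Event Format.
    Returns (name, total_duration) pairs, longest first.
    """
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals = defaultdict(float)
    for ev in events:
        if ev.get("ph") == "X" and "dur" in ev:
            totals[ev.get("name", "<unnamed>")] += ev["dur"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


def summarize_trace_file(path: str) -> list:
    """Load a trace file from `path` and summarize it."""
    with open(path) as f:
        return summarize_trace(json.load(f))
```

Calling `summarize_trace_file` on a file from `models/trace` would list the operations that account for the most time, assuming the format matches.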
