# How to Use Furiosa SDK from Start to Finish

This notebook demonstrates how to use Furiosa SDK from start to finish.

## Prerequisites

The Furiosa SDK must be installed. If it is not, follow the installation instructions at https://furiosa-ai.github.io/docs/latest/ko/ (Korean) or https://furiosa-ai.github.io/docs/latest/en/ (English). This demonstration also requires the `torchvision` and `scipy` packages.

```console
$ pip install 'furiosa-sdk[quantizer]' torchvision scipy
```

In [1]:
import numpy as np
import onnx
import torch
import torchvision
from torchvision import transforms
import tqdm

import furiosa.runtime.session
from furiosa.quantizer.frontend.onnx import post_training_quantize

libnpu.so --- v2.0, built @ e328545


## Load PyTorch Model

As a running example, we use the pre-trained ResNet-50 model from Torchvision.

In [2]:
torch_model = torchvision.models.resnet50(pretrained=True)
torch_model = torch_model.eval()  # Set the model to inference mode.

The ResNet-50 model was trained with the preprocessing described at https://pytorch.org/vision/stable/models.html. We apply the same preprocessing below.

In [3]:
preprocess = transforms.Compose(
    [
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)
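To make the effect of the last two transforms concrete, here is a minimal sketch in plain Python of what happens to a single RGB pixel. It assumes an 8-bit input image, for which `ToTensor` scales values to the range [0, 1]; the helper names `to_unit_range` and `normalize_pixel` are hypothetical and exist only for this illustration.

```python
# Per-channel statistics copied from the transforms.Normalize call above.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def to_unit_range(rgb_uint8):
    """Mimic the scaling part of ToTensor: map 8-bit values to [0, 1]."""
    return tuple(value / 255.0 for value in rgb_uint8)


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of a pixel in [0, 1]."""
    return tuple((value - mean) / std for value, mean, std in zip(rgb, MEAN, STD))


# A pixel with a saturated red channel, no green, and mid-range blue.
pixel = normalize_pixel(to_unit_range((255, 0, 128)))
print(pixel)
```

A pixel whose channels equal the dataset means maps to zero in every channel, which is the point of the normalization.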

## Export PyTorch Model to ONNX Model

We call the `torch.onnx.export` function to export the PyTorch ResNet-50 model to an ONNX model. The function executes a PyTorch model provided as its first argument, recording a trace of what operators are used during the execution, and then converts those operators into ONNX equivalents. Because `torch.onnx.export` runs the model, we need to provide the function with an input tensor as its second argument, which can be random so long as it satisfies the shape and type of the model's input.

In [4]:
# Generate a dummy input with the model's input shape, (1, 3, 224, 224).
dummy_input = (torch.randn(1, 3, 224, 224),)

# Export the PyTorch model into an ONNX model.
torch.onnx.export(
    torch_model,  # PyTorch model to export
    dummy_input,  # model input
    "resnet50.onnx",  # where to save an exported ONNX model
    opset_version=12,  # the ONNX opset version to export the model to
    do_constant_folding=True,  # whether to execute constant folding for optimization
    input_names=["input"],  # the model's input names
    output_names=["output"],  # the model's output names
)

# Load the exported ONNX model.
onnx_model = onnx.load_model("resnet50.onnx")
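Before moving on, it can be worth confirming that the exported file is a well-formed ONNX model. The following is a minimal sketch, not part of the original notebook; the helper name `check_exported_model` is hypothetical, and it relies only on `onnx.load` and `onnx.checker.check_model`.

```python
def check_exported_model(path="resnet50.onnx"):
    """Validate an exported ONNX file and return its graph input and output names."""
    import onnx  # imported lazily so that defining the helper has no side effects

    model = onnx.load(path)
    onnx.checker.check_model(model)  # raises an exception if the model is malformed
    input_names = [value.name for value in model.graph.input]
    output_names = [value.name for value in model.graph.output]
    return input_names, output_names
```

Calling `check_exported_model()` after the export should return lists that contain the names passed as `input_names` and `output_names`.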

## Load Dataset

We will use subsets of the ImageNet dataset for calibration and validation. 

You must download `ILSVRC2012_img_val.tar` and `ILSVRC2012_devkit_t12.tar.gz` yourself and place them in the `imagenet` directory. Torchvision cannot download the ImageNet dataset automatically because it is no longer publicly accessible (see https://github.com/pytorch/vision/pull/1457).

Note that this step may take several minutes the first time it runs because it decompresses the archive files. Subsequent runs complete much faster.

In [5]:
imagenet = torchvision.datasets.ImageNet("imagenet", split="val", transform=preprocess)
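If the dataset fails to load, a quick first check is whether both archives are where Torchvision expects them. The sketch below uses only the standard library; the helper name `missing_archives` is hypothetical.

```python
from pathlib import Path

# Archive names taken from the instructions above.
REQUIRED_ARCHIVES = ("ILSVRC2012_img_val.tar", "ILSVRC2012_devkit_t12.tar.gz")


def missing_archives(root="imagenet"):
    """Return the required archive names that are not present under root."""
    root = Path(root)
    return [name for name in REQUIRED_ARCHIVES if not (root / name).is_file()]


missing = missing_archives("imagenet")
if missing:
    print("Missing archives:", ", ".join(missing))
```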

## Calibrate and Quantize ONNX Model

We call the `furiosa.quantizer.frontend.onnx.post_training_quantize` function to calibrate and quantize the ONNX model in a single step. For a quick demonstration, we calibrate with only 100 samples chosen at random from the ImageNet dataset.

In [6]:
calibration_dataset = torch.utils.data.Subset(imagenet, torch.randperm(len(imagenet))[:100])
calibration_dataloader = torch.utils.data.DataLoader(calibration_dataset, batch_size=1)

onnx_model_quantized = post_training_quantize(
    onnx_model,
    ({"input": image.numpy()} for image, _ in calibration_dataloader),
)
onnx.save_model(onnx_model_quantized, "resnet50_quantized.onnx")

Calibration: 100it [00:23,  4.23it/s]
Quantization: 100%|███████████████████████████████████████████████| 122/122 [00:02<00:00, 51.24it/s]
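For readers new to post-training quantization, the sketch below illustrates the general idea behind calibration: observe the range of values on sample data, then derive a scale and zero point that map that range onto 8-bit integers. This is a generic min-max affine scheme written for illustration only. It is an assumption that it resembles what `post_training_quantize` does internally, and all function names here are hypothetical.

```python
def calibrate_range(samples):
    """Return the (min, max) seen across all samples, widened to include 0."""
    lo = min(min(sample) for sample in samples)
    hi = max(max(sample) for sample in samples)
    return min(lo, 0.0), max(hi, 0.0)


def quantization_params(lo, hi, qmin=-128, qmax=127):
    """Derive the scale and zero point that map [lo, hi] onto [qmin, qmax]."""
    scale = (hi - lo) / (qmax - qmin)
    if scale == 0.0:  # degenerate range: every observed value was 0
        scale = 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to an integer, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate real value."""
    return (q - zero_point) * scale


# Toy "activations" standing in for values observed during calibration.
lo, hi = calibrate_range([[-1.0, 0.5], [2.0, 0.0]])
scale, zero_point = quantization_params(lo, hi)
```

Within the calibrated range, the round-trip error of `dequantize(quantize(x, ...), ...)` is at most about half of `scale`, which is why a calibration set that reflects real inputs matters.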


## Run Inference with Quantized ONNX Model

For a quick demonstration, we validate on 1,000 samples chosen at random from the ImageNet dataset.
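Because the calibration and validation subsets are each drawn with an independent `torch.randperm` call, they may share some images. If strict separation is wanted, one option is to draw a single permutation and slice it. The sketch below shows the idea with the standard library's `random` module; the function name `disjoint_indices` and the fixed seed are assumptions for illustration.

```python
import random


def disjoint_indices(dataset_size, calibration_size, validation_size, seed=0):
    """Draw two non-overlapping random index lists from range(dataset_size)."""
    if calibration_size + validation_size > dataset_size:
        raise ValueError("not enough samples for two disjoint subsets")
    rng = random.Random(seed)
    permutation = rng.sample(range(dataset_size), dataset_size)
    calibration = permutation[:calibration_size]
    validation = permutation[calibration_size:calibration_size + validation_size]
    return calibration, validation


# The ImageNet validation split contains 50,000 images.
calibration_indices, validation_indices = disjoint_indices(50_000, 100, 1_000)
```

The resulting lists can be passed to `torch.utils.data.Subset` in place of the sliced `torch.randperm` tensors.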

In [7]:
validation_dataset = torch.utils.data.Subset(imagenet, torch.randperm(len(imagenet))[:1000])
validation_dataloader = torch.utils.data.DataLoader(validation_dataset, batch_size=1)

correct_predictions, total_predictions = 0, 0
with furiosa.runtime.session.create("resnet50_quantized.onnx") as session:
    for image, label in tqdm.tqdm(validation_dataloader, desc="Evaluation", unit="images", mininterval=0.5):
        outputs = session.run(image.numpy())
        prediction = np.argmax(outputs[0].numpy(), axis=1)  # postprocessing
        if prediction == label.numpy():
            correct_predictions += 1
        total_predictions += 1

Saving the compilation log into /root/.local/state/furiosa/logs/compile-20220322100708-a601ja.log
Using furiosa-compiler 0.6.0 (rev: a2a766906 built at 2022-02-25 18:07:42)
2022-03-22T10:07:08.870893Z  INFO Npu (npu4pe0-1) is being initialized
2022-03-22T10:07:08.871703Z  INFO NuxInner create with pes: [PeId(0)]


[1/6] 🔍   Compiling from onnx to dfg
Done in 0.038607s
[2/6] 🔍   Compiling from dfg to ldfg
Done in 341.4224s
[3/6] 🔍   Compiling from ldfg to cdfg
Done in 0.001947735s
[4/6] 🔍   Compiling from cdfg to gir
Done in 0.021545298s
[5/6] 🔍   Compiling from gir to lir
Done in 0.005200667s
[6/6] 🔍   Compiling from lir to enf
Done in 0.08561787s
✨  Finished in 341.57672s


2022-03-22T10:12:50.475277Z  INFO [Profiler] Program binary notification has been arrived. Cleanup current profile queue data


Evaluation: 100%|███████████████████████████████████████████| 1000/1000 [00:23<00:00, 42.43images/s]

2022-03-22T10:13:14.478140Z  INFO [Profiler] Received a termination signal.
2022-03-22T10:13:14.480913Z  INFO session has been destroyed


In [8]:
accuracy = correct_predictions / total_predictions
print(f"Accuracy: {accuracy:%}")

Accuracy: 77.200000%
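Since the accuracy is estimated from only 1,000 samples, it carries sampling uncertainty. As a rough guide, the sketch below computes a normal-approximation 95% confidence interval for a proportion. The function name `accuracy_interval` is hypothetical, and the approximation assumes the samples are independent draws from the dataset.

```python
import math


def accuracy_interval(correct, total, z=1.96):
    """Return a normal-approximation confidence interval for correct / total."""
    p = correct / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p - half_width, p + half_width


# 772 correct predictions out of 1,000, matching the accuracy printed above.
low, high = accuracy_interval(772, 1000)
print(f"95% interval: {low:.1%} to {high:.1%}")
```

For the result above, this gives an interval of roughly 74.6% to 79.8%, so small differences between runs with different random subsets are expected.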
