# An Example of Inference Accuracy Check

This tutorial explains how to compare the inference accuracy of furiosa-sdk running on an NPU against other runtimes running on a CPU or GPU. In this example, ONNX Runtime serves as the reference runtime.

## Prerequisites
To run this example, first install the required packages and set up a Python environment by following these guides:
* [FuriosaAI Driver, Firmware, Runtime Installation Guide](https://furiosa-ai.github.io/docs/latest/ko/software/installation.html)
* [Setting up a Python Environment](https://furiosa-ai.github.io/docs/latest/ko/software/python-sdk.html#python)

Then install the following Python packages:
```sh
pip install furiosa-sdk matplotlib mnist onnxruntime
```

Alternatively, run the following command to install the dependencies for all notebook examples at once:
```sh
pip install -r examples/notebooks/requirements.txt
```

Next, check that your NPU device is ready:

In [1]:
!furiosactl info

+------+--------+----------------+-------+--------+--------------+
| NPU  | Name   | Firmware       | Temp. | Power  | PCI-BDF      |
+------+--------+----------------+-------+--------+--------------+
| npu0 | warboy | 1.7.0, 0a4411e |  39°C | 2.29 W | 0000:49:00.0 |
+------+--------+----------------+-------+--------+--------------+

Then, let's make sure that your SDK is ready to run.

In [2]:
!python -c "from furiosa import runtime;print(runtime.__full_version__)"

libfuriosa_hal.so --- v0.11.0, built @ 43c901f
Furiosa SDK Runtime 0.10.0-dev (rev: e80482f4) (libnux 0.9.0 062c7dd1f 2023-04-12T20:55:14Z)


## Preparing the dataset and model

In [3]:
# Import MNIST dataset package
import numpy as np
import mnist

# The following line will download the MNIST dataset through the network.
mnist_images = mnist.train_images().reshape((60000, 1, 28, 28)).astype(np.float32)
mnist_images.shape

(60000, 1, 28, 28)

In [4]:
from pathlib import Path

model_path = 'models/mnist-8.onnx'

In [5]:
import onnxruntime

onnxrt = onnxruntime.InferenceSession(model_path)

In [6]:
from furiosa.runtime import session

sess = session.create(model_path)
sess.print_summary()

libfuriosa_hal.so --- v0.11.0, built @ 43c901f
Saving the compilation log into /home/hyunsik/.local/state/furiosa/logs/compile-20230413184426-9yr3jx.log
Using furiosa-compiler 0.9.0 (rev: 062c7dd1f built at 2023-04-12T20:55:14Z)


2023-04-13T23:44:27.028635Z  INFO nux::npu: Npu (npu0pe0-1) is being initialized
2023-04-13T23:44:27.031933Z  INFO nux: NuxInner create with pes: [PeId(0)]


[1m[2m[1/6][0m 🔍   Compiling from onnx to dfg
Done in 0.001068244s
[1m[2m[2/6][0m 🔍   Compiling from dfg to ldfg
Done in 0.000709566s
[1m[2m[3/6][0m 🔍   Compiling from ldfg to cdfg
Done in 0.000116216s
[1m[2m[4/6][0m 🔍   Compiling from cdfg to gir
Done in 0.000048078s
[1m[2m[5/6][0m 🔍   Compiling from gir to lir
Done in 0.000107976s
[1m[2m[6/6][0m 🔍   Compiling from lir to enf
Done in 0.000096677s
✨  Finished in 0.002333382s
Inputs:
{0: TensorDesc(name="Input3", shape=(1, 1, 28, 28), dtype=FLOAT32, format=NCHW, size=3136, len=784)}
Outputs:
{0: TensorDesc(name="Plus214_Output_0", shape=(1, 10), dtype=FLOAT32, format=??, size=40, len=10)}


In [7]:
# Compare floating-point arrays within an absolute tolerance.
# Return a plain bool: a (bool, message) tuple is always truthy,
# so using it directly in an `if` would count every sample as matched.
def numpy_equals(expected, result, atol=0.1):
    return bool(np.allclose(expected, result, atol=atol))
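
As a hedged illustration of what the tolerance means: `np.allclose(expected, result, atol=atol)` accepts a pair of arrays when every element satisfies `|expected - result| <= atol + rtol * |result|`, with `rtol` defaulting to `1e-5`. The sketch below spells that condition out on made-up values. `within_tolerance` and the sample arrays are hypothetical names and data used only for illustration; they are not part of furiosa-sdk and are not measurements from this notebook.

```python
import numpy as np


def within_tolerance(expected, result, atol=0.1, rtol=1e-5):
    """Spell out the element-wise condition that np.allclose evaluates."""
    expected = np.asarray(expected, dtype=np.float64)
    result = np.asarray(result, dtype=np.float64)
    # Every element must lie within atol plus a small relative margin.
    return bool(np.all(np.abs(expected - result) <= atol + rtol * np.abs(result)))


# Made-up outputs standing in for a reference runtime and a second runtime.
reference = np.array([1.00, -2.50, 7.30])
npu_like = np.array([1.03, -2.52, 7.27])  # largest difference is about 0.03

print(within_tolerance(reference, npu_like, atol=0.04))  # True
print(within_tolerance(reference, npu_like, atol=0.01))  # False
```

A looser `atol` hides small numerical differences between runtimes, while a tighter one makes them visible, so the chosen value should reflect how much deviation is acceptable for the model at hand.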

The following cell runs inference on both the NPU and the CPU (ONNX Runtime) and compares the results. The running time depends on `total_run`.

In [8]:
%%time

import random
total_run = 20 # How many inferences are compared
matched = 0

for _ in range(0, total_run):
    # randomly picks the item
    idx = random.randrange(0, 60000, 1)
    ndarray_value = mnist_images[idx : idx + 1]
    
    result1 = sess.run_with(["Plus214_Output_0"], {"Input3": ndarray_value})
    result2 = onnxrt.run(["Plus214_Output_0"], {"Input3": ndarray_value})

    # Both runtimes return a list of outputs; compare the first (and only) one.
    if numpy_equals(result1[0].numpy(), result2[0], 0.04):
        matched += 1
        
print(f"Accuracy: {matched / total_run * 100}%")

Accuracy: 100.0%
CPU times: user 9.99 s, sys: 15.1 ms, total: 10 s
Wall time: 474 ms
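
The loop above counts a sample as matched when the raw output values agree within a tolerance. For a classifier, a complementary check is whether both runtimes select the same class. The following is a minimal sketch of such a top-1 agreement metric, assuming the outputs are score arrays of shape `(batch, classes)`. The helper `top1_agreement` and the sample arrays are hypothetical and hand-written for illustration; they are not results from the NPU or from ONNX Runtime.

```python
import numpy as np


def top1_agreement(reference_scores, candidate_scores):
    """Fraction of rows where both score arrays select the same class."""
    ref = np.asarray(reference_scores)
    cand = np.asarray(candidate_scores)
    return float(np.mean(ref.argmax(axis=-1) == cand.argmax(axis=-1)))


# Hand-written scores: four samples, three classes.
reference = np.array([
    [0.10, 2.00, -1.00],
    [3.00, 0.50, 0.20],
    [0.00, 0.10, 4.00],
    [1.00, 1.05, 0.00],  # the two best classes are very close here
])
# Small perturbations standing in for numerical differences between runtimes.
candidate = reference + np.array([
    [0.02, -0.03, 0.01],
    [-0.02, 0.01, 0.00],
    [0.01, 0.00, -0.03],
    [0.03, -0.03, 0.00],
])

print(f"Top-1 agreement: {top1_agreement(reference, candidate) * 100}%")
```

In this made-up data every element differs by at most about 0.03, so a tolerance check with `atol=0.04` passes, yet the last sample flips its predicted class because its two highest scores are nearly tied. Reporting both metrics gives a fuller picture than either one alone.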


In [9]:
# Close the session when you are done with it.
sess.close()

2023-04-13T23:44:27.557345Z  INFO nux::npu: NPU (npu0pe0-1) has been destroyed
2023-04-13T23:44:27.557523Z  INFO nux::capi: session has been destroyed
