# An Example of Inference Accuracy Check

This tutorial explains how to compare inference results from furiosa-sdk running on an NPU against those from another runtime running on a CPU or GPU. In this example, ONNX Runtime serves as the reference runtime.

## Prerequisites
Before running this example, install the required packages and set up a Python environment by following these guides:
* [FuriosaAI Driver, Firmware, Runtime Installation Guide](https://furiosa-ai.github.io/docs/latest/ko/software/installation.html)
* [Setting up a Python Environment](https://furiosa-ai.github.io/docs/latest/ko/software/python-sdk.html#python)

Then, install the following Python packages:
```sh
pip install furiosa-sdk matplotlib mnist onnxruntime
```

Alternatively, run the following command to install the dependencies for all notebook examples at once:
```sh
pip install -r examples/notebooks/requirements.txt
```

Next, check that your NPU device is ready:

In [1]:
!furiosactl info

[0m[0m+[0m[0m------[0m[0m+[0m[0m------------------[0m[0m+[0m[0m-------[0m[0m+[0m[0m--------[0m[0m+[0m[0m--------------[0m[0m+[0m[0m---------[0m[0m+
[0m[0m|[0m[0m [0m[1mNPU [0m[0m [0m[0m|[0m[0m [0m[1mName            [0m[0m [0m[0m|[0m[0m [0m[1mTemp.[0m[0m [0m[0m|[0m[0m [0m[1mPower [0m[0m [0m[0m|[0m[0m [0m[1mPCI-BDF     [0m[0m [0m[0m|[0m[0m [0m[1mPCI-DEV[0m[0m [0m[0m|[0m[0m
[0m[0m+[0m[0m------[0m[0m+[0m[0m------------------[0m[0m+[0m[0m-------[0m[0m+[0m[0m--------[0m[0m+[0m[0m--------------[0m[0m+[0m[0m---------[0m[0m+
[0m[0m|[0m[0m [0mnpu4[0m[0m [0m[0m|[0m[0m [0mFuriosaAI Warboy[0m[0m [0m[0m|[0m[0m [0m 49°C[0m[0m [0m[0m|[0m[0m [0m0.00 W[0m[0m [0m[0m|[0m[0m [0m0000:a1:00.0[0m[0m [0m[0m|[0m[0m [0m503:0  [0m[0m [0m[0m|[0m[0m
[0m[0m+[0m[0m------[0m[0m+[0m[0m------------------[0m[0m+[0m[0m-------[0m[0m+[0m[0m--------[0m[0m+[0m

Then, let's make sure that your SDK is ready to run.

In [2]:
!python -c "from furiosa import runtime;print(runtime.__full_version__)"

libnpu.so --- v2.0, built @ fe1fca3
Furiosa SDK Runtime  (libnux 0.5.0 407c0c51f-modified 2021-11-22 20:18:37)
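
Outside a notebook, the two checks above can be approximated with a short script. The following is a minimal sketch that uses only the Python standard library; `check_environment` is a hypothetical helper name, and it only reports whether the `furiosactl` command is on `PATH` and whether the `furiosa.runtime` module can be located. It does not confirm that a device is actually usable.

```python
import importlib.util
import shutil


def check_environment(cli="furiosactl", module="furiosa.runtime"):
    """Report whether the CLI is on PATH and the module can be located."""
    cli_found = shutil.which(cli) is not None
    try:
        module_found = importlib.util.find_spec(module) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (here, "furiosa") is not installed.
        module_found = False
    return {"cli": cli_found, "module": module_found}


print(check_environment())
```

If either value is `False`, revisit the installation guides listed in the prerequisites before continuing.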


## Preparing the dataset and model

In [3]:
# Import MNIST dataset package
import numpy as np
import mnist

# The following line will download the MNIST dataset through the network.
mnist_images = mnist.train_images().reshape((60000, 1, 28, 28)).astype(np.float32)
mnist_images.shape

(60000, 1, 28, 28)

In [4]:
from pathlib import Path

model_path = 'models/mnist-8.onnx'

In [5]:
import onnxruntime

onnxrt = onnxruntime.InferenceSession(model_path)

In [6]:
from furiosa.runtime import session

sess = session.create(model_path)
sess.print_summary()

Saving the compilation log into /home/jovyan/.local/state/furiosa/logs/compile-20211214003445-mn7r2w.log
Using furiosa-compiler 0.5.0 (rev: 407c0c51f-modified built at 2021-11-22 20:18:37)
2021-12-14T00:34:45.530199Z  INFO Npu (npu4pe0-1) is being initialized
2021-12-14T00:34:45.531441Z  INFO NuxInner create with pes: [PeId(0)]
2021-12-14T00:34:45.539545Z  INFO [Profiler] Program binary notification has been arrived. Cleanup current profile queue data
Inputs:
{0: TensorDesc(name="Input3", shape=(1, 1, 28, 28), dtype=FLOAT32, format=NCHW, size=3136, len=784)}
Outputs:
{0: TensorDesc(name="Plus214_Output_0", shape=(1, 10), dtype=FLOAT32, format=??, size=40, len=10)}


libnpu.so --- v2.0, built @ fe1fca3
[1/6] 🔍   Compiling from onnx to dfg
Done in 0.000943567s
[2/6] 🔍   Compiling from dfg to ldfg
Done in 0.001744975s
[3/6] 🔍   Compiling from ldfg to cdfg
Done in 0.000194827s
[4/6] 🔍   Compiling from cdfg to gir
Done in 0.000117248s
[5/6] 🔍   Compiling from gir to lir
Done in 0.000234277s
[6/6] 🔍   Compiling from lir to enf
Done in 0.000587912s
✨  Finished in 0.004302328s
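
The session summary above lists a `size` and a `len` for each tensor. For these two tensors, the numbers are consistent with `len` being the element count and `size` being the byte count at 4 bytes per FLOAT32 element. That reading is inferred from the values shown here, not taken from the SDK documentation. A quick sketch of the arithmetic (the helper names are illustrative):

```python
import math

FLOAT32_BYTES = 4  # assumed width of one FLOAT32 element


def element_count(shape):
    """Number of elements in a tensor of the given shape."""
    return math.prod(shape)


def byte_size(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed to store the tensor, assuming a fixed element width."""
    return element_count(shape) * itemsize


print(element_count((1, 1, 28, 28)), byte_size((1, 1, 28, 28)))  # 784 3136
print(element_count((1, 10)), byte_size((1, 10)))                # 10 40
```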


In [10]:
# Compare floating-point arrays within an absolute tolerance.
# Returns a plain bool so the result can be used directly in an `if`.
def numpy_equals(expected, result, atol=0.1):
    return np.allclose(expected, result, atol=atol)
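
To make the tolerance concrete: `np.allclose(a, b)` passes when every element satisfies `|a - b| <= atol + rtol * |b|`, with `rtol` defaulting to `1e-5`. The sketch below restates that criterion in plain Python so it runs without NumPy or an NPU. `within_tolerance` is a hypothetical helper, and the score values are made up for illustration.

```python
def within_tolerance(expected, result, atol=0.1, rtol=1e-5):
    """Element-wise check of |expected - result| <= atol + rtol * |result|."""
    return all(
        abs(e - r) <= atol + rtol * abs(r)
        for e, r in zip(expected, result)
    )


# Illustrative values only, not real model outputs.
cpu_scores = [2.50, -1.20, 9.80]
npu_scores = [2.53, -1.18, 9.77]

print(within_tolerance(cpu_scores, npu_scores, atol=0.04))  # True
print(within_tolerance(cpu_scores, npu_scores, atol=0.01))  # False
```

The largest gap in this example is about 0.03, so the comparison passes at `atol=0.04` and fails at `atol=0.01`. Choosing `atol` is therefore a statement about how much numerical drift between runtimes you are willing to accept.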

The following loop runs inference on both the CPU and the NPU and compares the results. It will take some time, depending on `total_run`.

In [12]:
%%time

import random
total_run = 20 # How many inferences are compared
matched = 0

for _ in range(0, total_run):
    # Pick a random image from the dataset
    idx = random.randrange(0, 60000, 1)
    ndarray_value = mnist_images[idx : idx + 1]
    
    result1 = sess.run_with(["Plus214_Output_0"], {"Input3": ndarray_value})
    result2 = onnxrt.run(["Plus214_Output_0"], {"Input3": ndarray_value})    
    
    if numpy_equals(result1[0].numpy(), result2[0], 0.04):
        matched += 1
        
print(f"Accuracy: {matched / total_run * 100}%")

Accuracy: 100.0%
CPU times: user 49.5 s, sys: 1.15 s, total: 50.7 s
Wall time: 12.6 s
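
A tolerance check on raw scores is strict: two runtimes can differ slightly in their output values and still predict the same digit. As a complementary view, the sketch below computes top-1 agreement (how often both outputs have the same argmax) and the largest absolute difference. It uses plain Python lists so that it runs anywhere; the function names and sample scores are hypothetical and not part of furiosa-sdk or ONNX Runtime.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def top1_agreement(outputs_a, outputs_b):
    """Fraction of samples for which both outputs pick the same class."""
    pairs = list(zip(outputs_a, outputs_b))
    same = sum(argmax(a) == argmax(b) for a, b in pairs)
    return same / len(pairs)


def max_abs_diff(outputs_a, outputs_b):
    """Largest element-wise absolute difference across all samples."""
    return max(
        abs(x - y)
        for a, b in zip(outputs_a, outputs_b)
        for x, y in zip(a, b)
    )


# Illustrative scores only: the second sample disagrees on purpose.
cpu_outputs = [[0.1, 3.2, -0.5], [4.0, 0.2, 0.1]]
npu_outputs = [[0.12, 3.15, -0.48], [3.9, 0.25, 4.1]]

print(top1_agreement(cpu_outputs, npu_outputs))             # 0.5
print(f"{max_abs_diff(cpu_outputs, npu_outputs):.2f}")      # 4.00
```

With NumPy arrays such as the ones returned in the loop above, `np.argmax` and `np.abs(a - b).max()` express the same two measurements more compactly.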


In [14]:
# Close the session when you are done with it.
sess.close()

2021-12-14T00:38:46.750034Z  INFO [Profiler] Received a termination signal.
2021-12-14T00:38:46.751019Z  INFO session has been destroyed
