# ONNX

Code is commonly deployed on ONNX, and getting the most out of `fastai` there can be a challenge. The benefit of ONNX is that it uses C++, so it can be a faster runtime, and ONNX recently came out with `CUDA` support as well! How hard is it to integrate? Let's set up a `tabular` problem again, this time importing the ONNX module from `fastinference`:

In [None]:
!pip show fastinference

Name: fastinference
Version: 0.0.13
Summary: A collection of inference modules
Home-page: https://github.com/muellerzr/fastinference/tree/master/
Author: Zachary Mueller
Author-email: muellerzr@gmail.com
License: Apache Software License 2.0
Location: /home/ml1/anaconda3/envs/fastai/lib/python3.7/site-packages
Requires: onnxruntime-gpu, fastai, shap
Required-by: 


In [None]:
from fastai.tabular.all import *
from fastinference.onnx import *

In [None]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv')
splits = RandomSplitter()(range_of(df))
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['age', 'fnlwgt', 'education-num']
procs = [Categorify, FillMissing, Normalize]
y_names = 'salary'

In [None]:
to = TabularPandas(df, procs=procs, cat_names=cat_names, cont_names=cont_names,
                   y_names=y_names, splits=splits)

In [None]:
dls = to.dataloaders()

In [None]:
learn = tabular_learner(dls, layers=[200,100])

I've made a special `fastONNX` wrapper, along with `learn.to_onnx()`, which takes your `Learner` and exports both the model and the `DataLoaders` so ONNX can use them:

In [None]:
learn.to_onnx('tabular')

Now let's load it in:

In [None]:
onnx_learn = fastONNX('tabular')

What can we do here? Everything works exactly the same, with one change: `predict`.

`predict` requires the raw inputs, so instead *always* build a `test_dl` and pass it to `get_preds` (this was simply easier for me to code). Let's run a few examples and time them as well:

In [None]:
single_dl = onnx_learn.test_dl(df.iloc[:1])

In [None]:
%%timeit
preds = onnx_learn.get_preds(dl=single_dl)

3.46 ms ± 5.44 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
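`%%timeit` only exists inside a notebook. If you want a comparable number from a plain script, here is a minimal sketch using the standard library's `timeit`. The helper `time_call` and the stand-in workload `fake_get_preds` are hypothetical names made up for illustration; in practice you would pass something like `lambda: onnx_learn.get_preds(dl=single_dl)` instead.

```python
import statistics
import timeit


def time_call(fn, repeat=7, number=100):
    """Return (mean, stdev) in seconds per call.

    The defaults mirror the "7 runs, 100 loops each" reported by %%timeit above.
    """
    # timeit.repeat returns one total time per run, each covering `number` calls.
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    per_call = [total / number for total in totals]
    return statistics.mean(per_call), statistics.stdev(per_call)


def fake_get_preds():
    # Hypothetical stand-in workload so the sketch runs without a model.
    return sum(i * i for i in range(1000))


mean, std = time_call(fake_get_preds)
print(f"{mean * 1e3:.3f} ms ± {std * 1e6:.2f} µs per loop")
```

The numbers will not match `%%timeit` exactly, since the magic picks its loop count adaptively, but they should be in the same ballpark.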


Wow! At 3.46 ms, this even beats our previous `predict`, which took 25 ms! What is its output?

In [None]:
name, preds = onnx_learn.get_preds(dl=single_dl)

In [None]:
name, preds

(['<50k'], array([[[0.5036924, 0.4963076]]], dtype=float32))
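To see how the returned label relates to the probabilities, here is a small sketch that decodes the array by hand. It assumes the class order is `['<50k', '>=50k']`, which is not shown in the output above, so check the vocab on your own `DataLoaders` before relying on it.

```python
import numpy as np

# Values copied from the output above.
preds = np.array([[[0.5036924, 0.4963076]]], dtype=np.float32)

# Assumed class order (not confirmed by the output above).
vocab = ['<50k', '>=50k']

# Collapse the leading dimensions so each row holds one sample's probabilities.
probs = preds.reshape(-1, preds.shape[-1])

# The predicted label is the class with the highest probability in each row.
labels = [vocab[i] for i in probs.argmax(axis=1)]
print(labels, probs.max(axis=1))
```

Under that assumption the decoded label agrees with the `name` that `get_preds` returned, and the two probabilities are close enough that this row is nearly a coin flip.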

Currently it doesn't support returning the raw inputs; I need to work on that some more.