# Convert an EfficientNet Model to ONNX and Run Inference with ONNX Runtime

This tutorial shows how to load an EfficientNet model (a TensorFlow Keras model), convert it to ONNX with keras2onnx, and run high-performance inference on it with ONNX Runtime. The following sections use a pretrained EfficientNet image classification model as the running example.

## 0. Prerequisites ##
You need a Python environment to run this notebook.

Install [Anaconda](https://www.anaconda.com/distribution/) and [Git](https://git-scm.com/downloads), then open an Anaconda console and run the following commands to create and activate a conda environment named cpu_env:

```console
conda create -n cpu_env python=3.7
conda activate cpu_env
```

Finally, launch Jupyter Notebook and select cpu_env as the kernel for this notebook.

Next, install [TensorFlow](https://www.tensorflow.org/install), [ONNX Runtime](https://microsoft.github.io/onnxruntime/), keras2onnx, and the other packages this notebook uses:

In [None]:
import sys

!{sys.executable} -m pip install --quiet --upgrade tensorflow==2.1.0
!{sys.executable} -m pip install --quiet --upgrade onnxruntime

# Install keras2onnx from source; this is currently required to support TensorFlow 2.1 models.
!{sys.executable} -m pip install --quiet git+https://github.com/microsoft/onnxconverter-common
!{sys.executable} -m pip install --quiet git+https://github.com/onnx/keras-onnx

# Install other packages used in this notebook (psutil is imported in section 4).
!{sys.executable} -m pip install --quiet efficientnet psutil

## 1. Load Pretrained EfficientNet model ##

Load the pretrained EfficientNet model:

In [None]:
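# Use the tf.keras variants throughout so the standalone keras package is not required.
import efficientnet.tfkeras as efn
from efficientnet.tfkeras import center_crop_and_resize, preprocess_input
from skimage.io import imread
from tensorflow.keras.applications.imagenet_utils import decode_predictions

# EfficientNetB7 with ImageNet weights; the weights are downloaded on first use.
model = efn.EfficientNetB7(weights='imagenet')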

## 2. TensorFlow Inference

Run inference on one example image with TensorFlow to establish a baseline. The cell reads `panda.jpg` from the working directory.

In [None]:
import numpy as np
image = imread('panda.jpg')
image_size = model.input_shape[1]
x = center_crop_and_resize(image, image_size=image_size)
x = preprocess_input(x)
inputs = np.expand_dims(x, 0)
expected = model.predict(inputs)
result = decode_predictions(expected)
print('classification_result_tfkeras = '+str(result))
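
To make the `decode_predictions` step less opaque, here is a minimal NumPy sketch of the top-k selection it conceptually performs on each row of scores. The scores are made-up stand-ins for a row of `expected`, and `top_k` is a hypothetical helper, not part of Keras or efficientnet; the real function additionally maps each index to an ImageNet class name.

```python
import numpy as np

def top_k(scores, k=5):
    """Return (class_index, score) pairs for the k highest scores, best first."""
    scores = np.asarray(scores)
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]

# Made-up scores for a 6-class problem (the real model emits 1000 scores per image).
fake_scores = np.array([0.05, 0.60, 0.10, 0.02, 0.20, 0.03])
print(top_k(fake_scores, k=3))
```

With the real model output, the equivalent call would be `top_k(expected[0])`.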

## 3. Convert the model to ONNX

Now convert the model to ONNX format with keras2onnx. Conversion takes about 22 seconds, though the exact time depends on your hardware.

In [None]:
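import os

output_model_path = 'keras_efficientNet.onnx'

# TF_KERAS=1 tells keras2onnx to use tf.keras; set it before importing keras2onnx.
os.environ["TF_KERAS"] = "1"

import tensorflow
import keras2onnx

# Convert the in-memory Keras model, then write the ONNX file to disk.
onnx_model = keras2onnx.convert_keras(model, model.name)
keras2onnx.save_model(onnx_model, output_model_path)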

## 4. Run Inference on the ONNX Model with ONNX Runtime

We set OpenMP environment variables to enable thread parallelism. These variables must be set before importing onnxruntime; otherwise, they might not take effect.

In [None]:
import os
import psutil

# You may change the settings in this cell based on the Performance Test Tool results after running the whole notebook.
use_openmp = True

# ATTENTION: these environment variables must be set before importing onnxruntime.
if use_openmp:
    os.environ["OMP_NUM_THREADS"] = str(psutil.cpu_count(logical=True))
else:
    os.environ["OMP_NUM_THREADS"] = '1'

print('os_omp_num_threads='+os.environ["OMP_NUM_THREADS"])

os.environ["OMP_WAIT_POLICY"] = 'ACTIVE'
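
If you switch between settings often, the same logic can be wrapped in a small function. This is only a sketch: `configure_openmp` is a hypothetical helper name, and it uses the standard library's `os.cpu_count()` (logical cores) in place of `psutil.cpu_count(logical=True)`.

```python
import os

def configure_openmp(use_openmp=True, wait_policy="ACTIVE"):
    """Set the OpenMP variables in os.environ and return the values that were set."""
    # os.cpu_count() can return None, so fall back to a single thread.
    threads = (os.cpu_count() or 1) if use_openmp else 1
    settings = {"OMP_NUM_THREADS": str(threads), "OMP_WAIT_POLICY": wait_policy}
    os.environ.update(settings)
    return settings

# Single-threaded configuration, e.g. to compare against the multi-threaded run.
print(configure_openmp(use_openmp=False))
```

As with the cell above, call it before importing onnxruntime.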

Run inference with ONNX Runtime:

In [None]:
import numpy as np
import onnxruntime
import psutil

# Use all logical cores for intra-op parallelism.
sess_options = onnxruntime.SessionOptions()
sess_options.intra_op_num_threads = psutil.cpu_count(logical=True)

sess = onnxruntime.InferenceSession(output_model_path, sess_options)

# Map each model input name to its float32 array; this model has a single image input.
data = [inputs.astype(np.float32)]
input_names = sorted(i.name for i in sess.get_inputs())
feed_input = dict(zip(input_names, data))

# Passing None as the first argument returns all model outputs.
actual = sess.run(None, feed_input)
result_onnx = decode_predictions(actual[0])
print('classification_result_onnx = '+str(result_onnx))
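
Since the goal is high-performance inference, you may also want a rough latency number. The sketch below is one simple way to measure it with the standard library; `measure_latency` is a hypothetical helper, and the workload shown is a stand-in so the block runs on its own.

```python
import statistics
import time

def measure_latency(fn, warmup=3, runs=10):
    """Call fn repeatedly and return (mean_ms, median_ms) over the timed runs."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings_ms), statistics.median(timings_ms)

# Stand-in workload; replace it with the call you want to time.
mean_ms, median_ms = measure_latency(lambda: sum(range(10_000)))
print(f"mean={mean_ms:.3f} ms, median={median_ms:.3f} ms")
```

In this notebook, passing `lambda: sess.run(None, feed_input)` and `lambda: model.predict(inputs)` would give comparable numbers for ONNX Runtime and TensorFlow.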

Finally, compare the TensorFlow and ONNX Runtime outputs to verify correctness:

In [None]:
print("***** Verifying correctness (TensorFlow and ONNX Runtime) *****")
print('Results are close:', np.allclose(actual[0], expected, rtol=1e-05, atol=1e-04))
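
When `np.allclose` returns `False`, a single boolean says little about how far apart the outputs are. The sketch below reports the largest absolute difference and whether the top-1 class agrees, using the same elementwise criterion as `np.allclose`. `compare_outputs` is a hypothetical helper, and the arrays are made-up stand-ins for `expected` and `actual[0]`.

```python
import numpy as np

def compare_outputs(actual, expected, rtol=1e-05, atol=1e-04):
    """Report the largest absolute difference and whether top-1 classes agree."""
    actual = np.asarray(actual, dtype=np.float64)
    expected = np.asarray(expected, dtype=np.float64)
    abs_diff = np.abs(actual - expected)
    # Same elementwise criterion np.allclose applies: |a - b| <= atol + rtol * |b|
    within_tol = abs_diff <= atol + rtol * np.abs(expected)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "all_close": bool(within_tol.all()),
        "same_top1": bool(np.array_equal(actual.argmax(axis=-1),
                                         expected.argmax(axis=-1))),
    }

# Made-up scores standing in for the TensorFlow and ONNX Runtime outputs.
reference = np.array([[0.70, 0.20, 0.10]])
candidate = reference + 5e-05
print(compare_outputs(candidate, reference))
```

In this notebook the call would be `compare_outputs(actual[0], expected)`.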