<style>
    #codecell4 {
        overflow-y: scroll;
        max-height: 200px;
    }
</style>

[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/inference/tensorflow/accelerate_tensorflow_inference_onnx.ipynb)

# Accelerate TensorFlow Inference using ONNXRuntime

You can use the ``InferenceOptimizer.trace(..., accelerator='onnxruntime')`` API to enable ONNXRuntime acceleration for TensorFlow inference. It takes only a few lines of code.

To apply ONNXRuntime acceleration, the following dependencies need to be installed first:

In [None]:
!pip install --pre --upgrade bigdl-nano[tensorflow,inference] # install the nightly-built version
!source bigdl-nano-init # set environment variables

> 📝 **Note**
>
> We recommend running the commands above, especially `source bigdl-nano-init`, before the Jupyter kernel is started; otherwise some of the optimizations may not take effect.
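
One way to confirm that the variables reached the kernel is to inspect `os.environ` from a cell. A `!` command runs in a separate subshell, so variables exported there are not inherited by the kernel process that is already running. The sketch below uses only the standard library. The names in `EXPECTED_VARS` are assumptions chosen for illustration (typical OpenMP and preload settings), not a confirmed list of what `bigdl-nano-init` exports, and `missing_env_vars` is a hypothetical helper.

```python
import os

# Assumption: illustrative variable names only. Check the output of
# `bigdl-nano-init` for the variables it actually exports on your machine.
EXPECTED_VARS = ("OMP_NUM_THREADS", "KMP_AFFINITY", "KMP_BLOCKTIME", "LD_PRELOAD")


def missing_env_vars(names=EXPECTED_VARS, environ=None):
    """Return the names from `names` that are unset or empty in `environ`."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars()
if missing:
    print("Not set in this kernel:", ", ".join(missing))
else:
    print("All expected variables are visible to the kernel.")
```

If variables are reported as missing, restarting Jupyter from a shell in which `source bigdl-nano-init` has already been run is the likely fix.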

Let's take an [EfficientNetB0 model](https://www.tensorflow.org/api_docs/python/tf/keras/applications/efficientnet/EfficientNetB0) pretrained on the ImageNet dataset as an example. First, we load the model:

In [None]:
from tensorflow.keras.applications import EfficientNetB0

model = EfficientNetB0(weights='imagenet')

In [2]:
model.summary()

Model: "efficientnetb0"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 224, 224, 3  0           []                               
                                )]                                                                
                                                                                                  
 rescaling (Rescaling)          (None, 224, 224, 3)  0           ['input_1[0][0]']                
                                                                                                  
 normalization (Normalization)  (None, 224, 224, 3)  7           ['rescaling[0][0]']              
                                                                                                  
 tf.math.truediv (TFOpLambda)   (None, 224, 224, 3)  0           ['normalization[0][0

To enable ONNXRuntime acceleration for your TensorFlow inference, **the only change you need to make is to import BigDL-Nano** `InferenceOptimizer`**, and trace your TensorFlow model to convert it into an ONNXRuntime accelerated module for inference**:

In [None]:
import tensorflow as tf
from bigdl.nano.tf.keras import InferenceOptimizer

ort_model = InferenceOptimizer.trace(model,
                                     accelerator="onnxruntime",
                                     input_spec=tf.TensorSpec(shape=(None, 224, 224, 3))
                                     )

> 📝 **Note**
>
> `input_spec` is a **required** argument that tells the ONNXRuntime accelerator the **shape** of the model input. It can be a `tf.TensorSpec` or `numpy.ndarray`, or a list or tuple of them.
>
> The model summary shows that the input shape of the model is `(None, 224, 224, 3)`, so we create a `tf.TensorSpec(shape=(None, 224, 224, 3))` for `input_spec` here.
>
> Please refer to [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/tensorflow.html#bigdl.nano.tf.keras.InferenceOptimizer.trace) for more information on `InferenceOptimizer.trace`.
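
For a model with more than one input, the note above implies passing one spec per input as a tuple or list. The following is a minimal sketch of building such a tuple with `tf.TensorSpec`. The two-input layout (an image plus an 8-feature vector) and the names `image_spec` and `meta_spec` are hypothetical and are not part of the EfficientNetB0 example.

```python
import tensorflow as tf

# Hypothetical two-input model: a batch of images and a batch of feature vectors.
# `None` leaves the batch dimension unspecified.
image_spec = tf.TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32)
meta_spec = tf.TensorSpec(shape=(None, 8), dtype=tf.float32)

# One spec per model input, in the order the model expects them.
input_spec = (image_spec, meta_spec)

# A spec can be checked against a concrete tensor before tracing.
sample_images = tf.zeros((2, 224, 224, 3))
print(image_spec.is_compatible_with(sample_images))
```

The resulting tuple would be passed as `input_spec=input_spec` in place of the single `tf.TensorSpec` used in this tutorial.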

You can then run inference as usual with the model optimized by ONNXRuntime:

In [None]:
x = tf.random.normal(shape=(2, 224, 224, 3))
# use the optimized model here
y_hat = ort_model(x)
predictions = tf.argmax(y_hat, axis=1)
print(predictions)
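
To see whether the traced model is actually faster on your hardware, you can time both models on the same input. The helper below is a rough sketch using only the standard library. `measure_latency` is a hypothetical name, and the warm-up and repeat counts are arbitrary defaults rather than recommended values.

```python
import statistics
import time


def measure_latency(predict_fn, inputs, warmup=3, repeats=20):
    """Return the median wall-clock latency (in seconds) of predict_fn(inputs)."""
    # Warm-up calls are excluded so that one-time setup costs do not skew the result.
    for _ in range(warmup):
        predict_fn(inputs)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(inputs)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Usage in this notebook (assumes `model`, `ort_model` and `x` from the cells above):
# print(f"Keras:       {measure_latency(model, x) * 1000:.1f} ms")
# print(f"ONNXRuntime: {measure_latency(ort_model, x) * 1000:.1f} ms")
```

The median is used instead of the mean so that occasional slow runs have less influence on the reported number.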

> 📚 **Related Readings**
> 
> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/nano.html#install)
> - [How to install BigDL-Nano in Google Colab](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/install_in_colab.html)