[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/inference/pytorch/accelerate_pytorch_inference_onnx.ipynb)

# Accelerate PyTorch Inference using ONNXRuntime

You can use the ``InferenceOptimizer.trace(..., accelerator='onnxruntime')`` API to enable ONNXRuntime acceleration for PyTorch inference. It only takes a few lines of code.

To apply ONNXRuntime acceleration, the following dependencies need to be installed first:

In [None]:
!pip install --pre --upgrade bigdl-nano[pytorch,inference] # install the nightly-built version
!source bigdl-nano-init

> 📝 **Note**
>
> We recommend running the commands above, especially `source bigdl-nano-init`, before the Jupyter kernel is started; otherwise some of the optimizations may not take effect.

Let's take a [ResNet-18 model](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet18.html) pretrained on the ImageNet dataset as an example. First, we load the model:

In [None]:
from torchvision.models import resnet18

model_ft = resnet18(pretrained=True)

Then we set it to evaluation mode:

In [None]:
model_ft.eval()

To enable ONNXRuntime acceleration for your PyTorch inference pipeline, **the only change you need to make is to import BigDL-Nano** `InferenceOptimizer`**, and trace your PyTorch model to convert it into an ONNXRuntime-accelerated module for inference**:

In [None]:
import torch
from bigdl.nano.pytorch import InferenceOptimizer

ort_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="onnxruntime",
                                     input_sample=torch.rand(1, 3, 224, 224))

> 📝 **Note**
> 
> `input_sample` tells the ONNXRuntime accelerator the **shape** of the model input, so neither its batch size nor its specific values matter. If our test dataset consists of images with $224 \times 224$ pixels, we can use `torch.rand(1, 3, 224, 224)` for `input_sample` here.
>
> Please refer to the [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.InferenceOptimizer.trace) for more information on `InferenceOptimizer.trace`.
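
To make the note about `input_sample` concrete, here is a minimal sketch of the rule it states: a batch is compatible with the sample when every dimension except the first (batch) one agrees. The helper `matches_input_sample` is a hypothetical name used for illustration only and is not part of BigDL-Nano. It encodes only the statement in the note above, not the accelerator's actual shape handling.

```python
def matches_input_sample(sample_shape, batch_shape):
    """Return True when two shapes agree on every dimension except the first (batch) one.

    Hypothetical helper for illustration; not a BigDL-Nano API.
    """
    sample_shape, batch_shape = tuple(sample_shape), tuple(batch_shape)
    return (len(sample_shape) == len(batch_shape)
            and sample_shape[1:] == batch_shape[1:])


# Shape used for input_sample in this tutorial vs. a batch of two images
print(matches_input_sample((1, 3, 224, 224), (2, 3, 224, 224)))  # True

# A different image resolution no longer matches the traced shape
print(matches_input_sample((1, 3, 224, 224), (2, 3, 256, 256)))  # False
```

Because `torch.Size` is a tuple subclass, the same check should also accept `input_sample.shape` and `x.shape` directly.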

You can then run inference as usual with the model optimized by ONNXRuntime:

In [None]:
with InferenceOptimizer.get_context(ort_model):
    x = torch.rand(2, 3, 224, 224)
    # use the optimized model here
    y_hat = ort_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
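
One way to check whether the traced model is actually faster on your machine is to time both models on the same input. The helper below is a minimal sketch and not part of BigDL-Nano: `measure_latency` is a hypothetical name, it relies only on the Python standard library, and the `fake_model` callable in the demo is a placeholder standing in for a real model.

```python
import statistics
import time


def measure_latency(predict, sample, warmup=5, runs=20):
    """Return the median wall-clock latency of predict(sample), in milliseconds.

    Hypothetical helper for illustration; not a BigDL-Nano API.
    """
    # Warm-up calls are not timed, so one-off setup costs do not skew the result
    for _ in range(warmup):
        predict(sample)

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional slow runs than the mean
    return statistics.median(timings_ms)


# Placeholder callable standing in for a model; replace with a real one
fake_model = lambda batch: [sum(row) for row in batch]
latency_ms = measure_latency(fake_model, [[0.1, 0.2], [0.3, 0.4]])
print(f"median latency: {latency_ms:.4f} ms")
```

In this notebook, you could pass `model_ft` (ideally under `torch.no_grad()`) and `ort_model` (inside `InferenceOptimizer.get_context(ort_model)`) as `predict`, with `x` as `sample`. The numbers depend on your hardware and environment, so treat them as a local comparison only.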

> 📚 **Related Readings**
> 
> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/install.html)
> - [How to install BigDL-Nano in Google Colab](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Install/install_in_colab.html)