[View the runnable example on GitHub](https://github.com/intel-analytics/BigDL/tree/main/python/nano/tutorial/notebook/inference/pytorch/accelerate_pytorch_inference_gpu.ipynb)

# Accelerate PyTorch Inference using Intel ARC series dGPU

You can use the ``InferenceOptimizer.trace(..., device="GPU")`` API to enable Intel ARC series dGPU acceleration for PyTorch inference. It only takes a few lines.

To apply Intel ARC series dGPU acceleration, you first need to install the oneAPI Base Toolkit and a suitable build of BigDL-Nano for PyTorch inference. You can get the toolkit from [Download the Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html); only oneMKL and the DPC++ compiler are required, and the other components are optional. To install BigDL-Nano, use the following commands.

In [None]:
pip install --pre "bigdl-nano[pytorch_113_xpu]" -f https://developer.intel.com/ipex-whl-stable-xpu  # install Nano and its XPU dependencies

source bigdl-nano-init --gpu  # enable the Nano environment (-g is the short form of --gpu)

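Before moving on, it can help to confirm that the packages were installed into the active environment. The following is a minimal sketch using only the standard library. The list of import names (`torch`, `torchvision`, `intel_extension_for_pytorch`, `bigdl.nano`) is an assumption about what this tutorial's stack exposes, and the helper names are hypothetical; adjust them to your setup. The check only locates the modules and does not verify that a GPU is actually usable.

```python
import importlib.util

# Assumed import names for this tutorial's stack; adjust if your setup differs.
REQUIRED_MODULES = ("torch", "torchvision", "intel_extension_for_pytorch", "bigdl.nano")


def is_importable(module_name):
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package (e.g. `bigdl`) is missing
        return False


def check_environment(modules=REQUIRED_MODULES):
    """Map each module name to whether it can be located."""
    return {name: is_importable(name) for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name:32s} {'found' if found else 'MISSING'}")
```

If any entry is reported as missing, rerun the installation commands above in the same environment.
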
Let's take a [ResNet-50 model](https://pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html) pretrained on the ImageNet dataset as an example. First, we load the model:

In [None]:
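from torchvision.models import resnet50

# Load ImageNet-pretrained weights (torchvision 0.13+ deprecates `pretrained` in favor of `weights`)
original_model = resnet50(pretrained=True)
original_model.eval()  # evaluation mode: fixes batch-norm statistics and disables dropout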

> 📝 **Note**
> 
> Currently, Intel ARC series dGPU acceleration for BigDL-Nano is supported only on Linux.

To enable Intel ARC series dGPU acceleration for your PyTorch inference pipeline, **the major change you need to make is to import BigDL-Nano** `InferenceOptimizer`**, and trace your PyTorch model to convert it into a `PytorchIPEXPUModel` for inference by specifying the device as "GPU"**:

In [None]:
from bigdl.nano.pytorch import InferenceOptimizer

# by default, IPEX acceleration is not used
acc_model = InferenceOptimizer.trace(original_model, device="GPU", use_ipex=False)

# alternatively, enable IPEX acceleration
acc_model = InferenceOptimizer.trace(original_model, device="GPU", use_ipex=True)

> 📝 **Note**
> 
> Please refer to [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.InferenceOptimizer.trace) for more information on `InferenceOptimizer.trace`.

Intel ARC series dGPU acceleration currently also supports **fp16 precision** for your PyTorch inference pipeline. Only a few lines need to change: call `InferenceOptimizer.quantize` with `precision="fp16"` instead of `InferenceOptimizer.trace` (see below).

In [None]:
from bigdl.nano.pytorch import InferenceOptimizer

# by default, IPEX acceleration is not used
acc_model = InferenceOptimizer.quantize(original_model, device="GPU", precision="fp16", use_ipex=False)

# alternatively, enable IPEX acceleration
acc_model = InferenceOptimizer.quantize(original_model, device="GPU", precision="fp16", use_ipex=True)

> 📝 **Note**
> 
> Please refer to [API documentation](https://bigdl.readthedocs.io/en/latest/doc/PythonAPI/Nano/pytorch.html#bigdl.nano.pytorch.InferenceOptimizer.quantize) for more information on `InferenceOptimizer.quantize`.

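Because fp16 stores fewer significant bits than fp32, the outputs of an fp16 model can differ slightly from those of the original model. The sketch below illustrates the size of that rounding effect for individual numbers, using the standard library's `struct` module to round-trip values through IEEE 754 half precision. It is a general illustration of the number format, not a description of how BigDL-Nano performs the conversion, and the helper names `to_fp16` and `relative_error` are hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # struct's "e" format packs a float as a 16-bit binary16 number
    return struct.unpack("<e", struct.pack("<e", value))[0]


def relative_error(value):
    """Relative rounding error introduced by storing `value` as fp16."""
    return abs(to_fp16(value) - value) / abs(value)


if __name__ == "__main__":
    for x in (0.1, 3.14159, 1234.5678):
        print(f"{x!r:>12} -> fp16 {to_fp16(x)!r:<20} rel. error {relative_error(x):.2e}")
```

For example, `0.1` becomes `0.0999755859375` in half precision. When comparing fp16 predictions against the fp32 model, a tolerance-based comparison is therefore more appropriate than exact equality.
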
You can then run the normal inference steps **under the context manager provided by Nano**, with the model accelerated by the Intel ARC series dGPU:

In [None]:
import torch

with InferenceOptimizer.get_context(acc_model):
    data = torch.rand(2, 3, 224, 224)
    predictions = acc_model(data)

> 📝 **Note**
> 
> For every model optimized by `InferenceOptimizer.trace` or `InferenceOptimizer.quantize`, you need to wrap the inference steps in the automatic context manager `InferenceOptimizer.get_context(model=...)` provided by Nano. Refer to [this guide](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Inference/PyTorch/pytorch_context_manager.html) for more detailed usage of the context manager.

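To see whether the acceleration pays off on your hardware, you can time the original and the accelerated model with the same input. The helper below is a minimal, framework-agnostic sketch: `benchmark` is a hypothetical name, not part of BigDL-Nano, and it simply measures the wall-clock latency of any callable after a few untimed warm-up calls. The demo at the bottom uses a stand-in workload so that it runs without a GPU.

```python
import statistics
import time


def benchmark(fn, warmup=5, runs=20):
    """Call `fn` repeatedly and return the mean and median latency in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings_ms),
        "median_ms": statistics.median(timings_ms),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a GPU
    stats = benchmark(lambda: sum(i * i for i in range(10_000)))
    print(stats)
```

With the objects from this guide, you would call `benchmark(lambda: acc_model(data))` inside `with InferenceOptimizer.get_context(acc_model):` and compare the result against `benchmark(lambda: original_model(data))`. If the device executes work asynchronously, wall-clock timings may not cover the full computation; copying the output back to the CPU inside the timed callable is one way to force completion.
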
> 📚 **Related Readings**
> 
> - [How to install BigDL-Nano](https://bigdl.readthedocs.io/en/latest/doc/Nano/Overview/install.html)
> - [How to enable automatic context management for PyTorch inference on Nano optimized models](https://bigdl.readthedocs.io/en/latest/doc/Nano/Howto/Inference/PyTorch/pytorch_context_manager.html)