# Model conversion for PyTorch models

This tutorial explains how to convert a PyTorch model to the standardized ONNX format so that it can run on a GPU-enabled AI Inference Server.  

In this tutorial you will learn how to:
- load a model in 'pth' format
- convert and save the loaded model into 'onnx' format
- verify the input and output shape of the model

For more information about the conversion and common pitfalls please refer to the official [PyTorch to ONNX exporter documentation](https://pytorch.org/docs/stable/onnx_torchscript.html).

## Configuration

Configure the model path and the input size. `MODEL_PATH` is the path of the PyTorch model you want to convert to ONNX format. The converted model is saved to `ONNX_PATH`.

In [None]:
import os

MODEL_PATH = os.path.join("models", "model.pth")
ONNX_PATH = os.path.join("output", "model.onnx")

IMAGE_WIDTH = 224
IMAGE_HEIGHT = 224
PIXEL_DEPTH = 3

BATCH_SIZE = 1
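
The constants above can be combined into one input shape, and the output folder can be created up front, because writing the ONNX file will most likely fail if the target directory does not exist. The following is a minimal sketch; `INPUT_SHAPE` is a hypothetical helper name that is not used elsewhere in this tutorial, and the constants are repeated only to keep the snippet self-contained.

```python
import os

ONNX_PATH = os.path.join("output", "model.onnx")

IMAGE_WIDTH = 224
IMAGE_HEIGHT = 224
PIXEL_DEPTH = 3
BATCH_SIZE = 1

# Create the target folder for the converted model if it is missing.
os.makedirs(os.path.dirname(ONNX_PATH), exist_ok=True)

# PyTorch image models use NCHW layout: (batch, channels, height, width).
# INPUT_SHAPE is a hypothetical name introduced for this sketch.
INPUT_SHAPE = (BATCH_SIZE, PIXEL_DEPTH, IMAGE_HEIGHT, IMAGE_WIDTH)
print(INPUT_SHAPE)
```

Keeping height before width matters as soon as the images are not square.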

Check whether a GPU is available. If so, set the map location so that the model is loaded onto the GPU; otherwise, the model runs on the CPU.

In [None]:
import torch

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

if DEVICE == "cuda":
    map_location = lambda storage, loc: storage.cuda()
else:
    map_location = "cpu"
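
To see what `map_location` does, the following sketch saves a small tensor dictionary to an in-memory buffer and loads it back onto the selected device. It is an illustration only: the dictionary and the buffer are stand-ins and are not part of the tutorial's model. It also shows that `torch.load` accepts a `torch.device` as `map_location`, which is an alternative to the callable used above.

```python
import io

import torch

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Serialize a small stand-in dictionary into memory instead of a file.
buffer = io.BytesIO()
torch.save({"weight": torch.ones(2, 2)}, buffer)
buffer.seek(0)

# map_location remaps every stored tensor to the given device while loading.
restored = torch.load(buffer, map_location=torch.device(DEVICE))
print(restored["weight"].device)
```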

## Load the PyTorch model

This model is an image classification model based on a pretrained ResNet50 model. It was retrained with the `simatic_photos` dataset. 

In [None]:
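# Full-model pickle: PyTorch >= 2.6 defaults to weights_only=True, so opt out (trusted files only).
torch_model = torch.load(MODEL_PATH, map_location=map_location, weights_only=False)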

Move the model to the device and set it to evaluation mode.

In [None]:
torch_model.to(DEVICE)
torch_model.eval()
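
Evaluation mode matters because layers such as dropout and batch normalization behave differently during training. The sketch below illustrates this with a standalone `torch.nn.Dropout` layer, which is a stand-in chosen for the demonstration and not a statement about the layers of the tutorial's model.

```python
import torch

torch.manual_seed(0)

dropout = torch.nn.Dropout(p=0.5)
x = torch.ones(1, 8)

# Training mode: elements are zeroed at random, survivors are scaled by 1 / (1 - p).
dropout.train()
train_out = dropout(x)

# Evaluation mode: dropout is a no-op, so the output is deterministic.
dropout.eval()
eval_out = dropout(x)

print(train_out)
print(eval_out)
```

Exporting a model that is still in training mode would bake this training-time behavior into the ONNX graph.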

## Convert to ONNX

The exporter traces the model by running it once, so it needs an example input. Random values are sufficient, but the input shape must be known beforehand.<br/>
Use the `input_names` and `output_names` arguments to specify the input and output variable names used in the inference pipeline.<br/>
The `opset_version` parameter selects the ONNX operator set. Opset version `13` corresponds to ONNX release `1.8.0`, which is supported by AI Inference Server at the time of writing.

> ⚠️ Warning<br/>
> The `verbose` parameter must be set to `False`, otherwise the ONNX exporter can get stuck in an infinite loop.

In [None]:
# NCHW layout: (batch, channels, height, width).
torch_input = torch.randn(BATCH_SIZE, PIXEL_DEPTH, IMAGE_HEIGHT, IMAGE_WIDTH, requires_grad=True, device=DEVICE)

input_names = ["input_1"]
output_names = ["output_1"]

torch.onnx.export(
    torch_model, 
    torch_input, 
    ONNX_PATH, 
    verbose=False, # Must be set to False
    input_names=input_names, 
    output_names=output_names,
    opset_version=13)

## Load the ONNX model and validate it

Check the consistency of a model with `onnx.checker.check_model`. An exception is raised if the test fails.

In [None]:
import onnx

onnx_model = onnx.load(ONNX_PATH)
onnx.checker.check_model(onnx_model)

## Input and Output shape

Let's inspect how the model and its inputs and outputs are shaped.
`graph.input` displays the input shape of the converted ONNX model.

In [None]:
onnx_model.graph.input

In this case, the shape of the input is `[1 x 3 x 224 x 224]`, that is, batch size, channels, height, and width.<br/>
The shape of the output is `[1 x 5]`, which can be displayed with the following:

In [None]:
onnx_model.graph.output
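
The protobuf printout of `graph.input` and `graph.output` is verbose. The sketch below shows one way to reduce an entry to a plain list of dimensions. `tensor_shape` and `stand_in` are hypothetical helpers; the stand-in objects only mimic the `type.tensor_type.shape.dim` layout of the ONNX entries so that the snippet runs without a converted model.

```python
from types import SimpleNamespace


def tensor_shape(value_info):
    """Return the dimensions of a graph input or output as a plain list.

    Fixed dimensions are returned as integers, symbolic ones (for example a
    dynamic batch axis) as their string name. A dimension with neither a
    value nor a name is reported as 0.
    """
    dims = value_info.type.tensor_type.shape.dim
    return [d.dim_param if d.dim_param else d.dim_value for d in dims]


def stand_in(name, dims):
    """Build a stand-in with the same attribute layout as an ONNX graph entry."""
    dim = [
        SimpleNamespace(
            dim_value=d if isinstance(d, int) else 0,
            dim_param=d if isinstance(d, str) else "",
        )
        for d in dims
    ]
    tensor_type = SimpleNamespace(shape=SimpleNamespace(dim=dim))
    return SimpleNamespace(name=name, type=SimpleNamespace(tensor_type=tensor_type))


graph_input = stand_in("input_1", [1, 3, 224, 224])
graph_output = stand_in("output_1", [1, 5])

print(graph_input.name, tensor_shape(graph_input))
print(graph_output.name, tensor_shape(graph_output))
```

In the notebook, the same helper can be applied to the real entries, for example `tensor_shape(onnx_model.graph.input[0])`.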

## Executing the model

Before packaging the model, it is recommended to try it out with the `onnxruntime` Python package.  
To do so, you need:  
- an `onnxruntime.InferenceSession` created from the ONNX model file  
- a dictionary that maps each input name to an array with the expected shape and type.  
  To test the model, we generate a NumPy array with random float values.  
- the list of output names to fetch.

The `result` variable contains the output arrays in a list, in the same order as the requested output names.

In [None]:
import numpy
from onnxruntime import InferenceSession

images = numpy.random.random((BATCH_SIZE, PIXEL_DEPTH, IMAGE_HEIGHT, IMAGE_WIDTH)).astype('float32')
session = InferenceSession(ONNX_PATH)

result = session.run(["output_1"], {"input_1": images})
result
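
The raw values in `result` are easier to interpret once they are converted to class probabilities. The sketch below assumes that the model emits unnormalized scores (logits), which this tutorial does not state. The `softmax` helper and the stand-in scores are hypothetical and only illustrate the post-processing step.

```python
import numpy


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = numpy.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


# Stand-in for `result`: a list holding one array of shape (1, 5).
result = [numpy.array([[0.1, 2.0, -1.0, 0.5, 0.3]], dtype="float32")]

scores = result[0]
probabilities = softmax(scores)
predicted_class = int(probabilities.argmax(axis=1)[0])

print(probabilities)
print(predicted_class)
```

If the final layer of the model already applies a softmax, only the `argmax` step is needed.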

## Usage of the ONNX model

The AI Inference Server with GPU support accepts ONNX models for execution.  
For this purpose, the model must be packaged into a `GPURuntimeComponent` step using the AI Software Development Kit.  
For details on how to create a `GPURuntimeComponent` and build pipelines that run on a GPU-enabled AI Inference Server, see the [Object Detection](../../use-cases/object-detection/Readme.md) example.