### PyTorch to ONNX

The following script will demonstrate the process of exporting a pretrained AlexNet. It runs a single round of inference and then saves the resulting traced model to alexnet.onnx
1. Load pretrained AlexNet from torchvision.models
2. Define Input/Output names for model
3. Convert model to ONNX via built-in methods
4. Run inferencing via ONNX Runtime

**Tested Package Versions:**
* PyTorch:     1.7.0
* Torchvision: 0.8.1

**Reference:**
https://pytorch.org/docs/stable/onnx.html

In [15]:
# User Input
modelOut = "alexnet.onnx"
verboseMode = True

In [10]:
# Setup
print("----PyTorch to ONNX.ipynb----\n")
import torch
import torchvision

print("Torch Version:\t\t", torch.__version__)
print("TorchVision Version:\t", torchvision.__version__)

----PyTorch to ONNX.ipynb----

Torch Version:		 1.7.0
TorchVision Version:	 0.8.1


In [13]:
# Load AlexNet Model
print("Importing AlexNet...")

# Note, this is being loaded for CPU usage
dummy_input = torch.randn(10, 3, 224, 224, device='cpu')
model = torchvision.models.alexnet(pretrained=True)
print("'AlexNet' model successfully imported!\n")

# The download took approximately 1 min

Importing AlexNet...
'AlexNet' model successfully imported!


#### Setup Model and Export

Providing input and output names sets the display names for values within the model's graph. Setting these does not change the semantics of the graph; it is only for readability.

The inputs to the network consist of the flat list of inputs (i.e. the values you would pass to the forward() method) followed by the flat list of parameters. You can partially specify names, i.e. provide a list here shorter than the number of inputs to the model, and we will only set that subset of names, starting from the beginning.

In [16]:
# Setup
print("Defining input/output names...\n")
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]

# Export 
torch.onnx.export(model, dummy_input, modelOut, verbose=verboseMode, input_names=input_names, output_names=output_names)
print("\nModel exported: ", modelOut, "\n")

Defining input/output names...

graph(%actual_input_1 : Float(10:150528, 3:50176, 224:224, 224:1, requires_grad=0, device=cpu),
      %learned_0 : Float(64:363, 3:121, 11:11, 11:1, requires_grad=1, device=cpu),
      %learned_1 : Float(64:1, requires_grad=1, device=cpu),
      %learned_2 : Float(192:1600, 64:25, 5:5, 5:1, requires_grad=1, device=cpu),
      %learned_3 : Float(192:1, requires_grad=1, device=cpu),
      %learned_4 : Float(384:1728, 192:9, 3:3, 3:1, requires_grad=1, device=cpu),
      %learned_5 : Float(384:1, requires_grad=1, device=cpu),
      %learned_6 : Float(256:3456, 384:9, 3:3, 3:1, requires_grad=1, device=cpu),
      %learned_7 : Float(256:1, requires_grad=1, device=cpu),
      %learned_8 : Float(256:2304, 256:9, 3:3, 3:1, requires_grad=1, device=cpu),
      %learned_9 : Float(256:1, requires_grad=1, device=cpu),
      %learned_10 : Float(4096:9216, 9216:1, requires_grad=1, device=cpu),
      %learned_11 : Float(4096:1, requires_grad=1, device=cpu),
      %learne

**Note:**
This essentially created a binary protobuf file named 'alexnet.onnx' which contains both the network structure and parameters of the model exported. 

The argument verbose=True causes the exporter to print out a human-readable representaton of the network.

In [22]:
# Load Model From ONNX
import onnx

model = onnx.load("alexnet.onnx")
print("'", modelOut, "' loaded successfully!\n")

# Check that IR is well formed
onnx.checker.check_model(model)

# Print human readable representation of graph
if verboseMode:
    print("Printing model graph...\n")
    onnx.helper.printable_graph(model.graph)


' alexnet.onnx ' loaded successfully!

Printing model graph...



#### Perform Inference with ONNX Runtime

**Note: We must designate the data type of the output such that it will match the input datatpye. Float32 is a fairly common and precise datatype.** 

In [23]:
import onnxruntime as ort
import numpy as np

ort.session = ort.InferenceSession('alexnet.onnx')

outputs = ort.session.run(None, {'actual_input_1': np.random.randn(10, 3, 224, 224).astype(np.float32)})

print(outputs[0])

[[-0.32908696 -1.4375494  -1.171913   ... -0.97837996 -1.1469944
   1.0900817 ]
 [-0.10875634 -1.1032635  -1.051388   ... -1.3563275  -0.9471036
   0.98995644]
 [ 0.13966255 -1.3960574  -1.5738554  ... -1.308495   -0.82824004
   0.8751602 ]
 ...
 [-0.1709329  -1.445117   -1.2844281  ... -1.512876   -1.0836242
   1.3018279 ]
 [-0.084096   -1.3492982  -1.3600118  ... -1.125779   -0.8830015
   1.2827814 ]
 [ 0.10905553 -1.2827294  -1.3019022  ... -1.3264313  -0.74955183
   0.54748124]]
