## Introduction to ONNX

- **Open Neural Network eXchange (ONNX)** is an open standard format for representing machine learning models.

- The `torch.onnx` module provides APIs to **`capture the computation graph from a native PyTorch torch.nn.Module model and convert it into an ONNX graph`**.

- The **`exported model can be consumed by any of the many runtimes that support ONNX, including Microsoft’s ONNX Runtime`**.

---

## Two flavors of ONNX exporter in PyTorch:

1. **TorchDynamo-based ONNX Exporter**
    - Supported in PyTorch 2.0 and later.
    - The TorchDynamo engine hooks into Python’s frame evaluation API and dynamically rewrites its bytecode into an FX graph. The resulting FX graph is then polished before it is finally translated into an ONNX graph.

    - The **`main advantage`** of this approach is that the FX graph is captured using bytecode analysis that **`preserves the dynamic nature of the model instead of using traditional static tracing techniques`**.

2. **TorchScript-based ONNX Exporter**
    - Available since PyTorch 1.2.0.
    - TorchScript is leveraged to `trace (through torch.jit.trace()) the model and capture a static computation graph`.

    - As a consequence, the resulting graph has several limitations:

        - It **`does not record any control flow, like if-statements or loops`**;

        - It **`does not handle nuances between training and eval mode`**;

        - It does not truly handle dynamic inputs.

    - To work around the limitations of static tracing, the exporter also supports TorchScript scripting (through `torch.jit.script()`), which adds support for data-dependent control flow, for example. However, `TorchScript itself is a subset of the Python language, so not all features in Python are supported, such as in-place operations`.
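
The control-flow limitation of static tracing can be seen in a few lines. This is a minimal sketch, assuming only that `torch` is installed; the function `shift` and its inputs are hypothetical and are not part of the model exported later.

```python
import torch


def shift(x):
    # Data-dependent branch: which path runs depends on the input values.
    if x.sum() > 0:
        return x + 10
    return x - 10


positive = torch.ones(3)
negative = -torch.ones(3)

# Tracing runs the function once with `positive` and records only the
# operations that actually executed, so the `x + 10` branch is baked in.
# PyTorch emits a TracerWarning here to flag exactly this problem.
traced = torch.jit.trace(shift, positive)

eager_out = shift(negative)    # eager mode takes the `x - 10` branch
traced_out = traced(negative)  # the trace replays the `x + 10` branch

print("eager: ", eager_out)
print("traced:", traced_out)
```

The two results differ for the negative input because the `if` statement never made it into the traced graph. Any exporter built on tracing inherits the same behavior.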

---

### Conclusion:

- The **`TorchDynamo-based ONNX Exporter`** is the **`preferred approach`** for exporting models to ONNX format, as it **`preserves the dynamic nature of the model`**.
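
To make the FX-graph capture step more concrete, the sketch below asks TorchDynamo to hand over the graph it captured. It assumes a PyTorch 2.x install on a platform where `torch.compile` is supported. The names `recording_backend` and `scale_and_shift` are hypothetical, and the code illustrates TorchDynamo capture in general, not the exporter's internal pipeline.

```python
import torch

captured_graphs = []


def recording_backend(gm, example_inputs):
    # TorchDynamo calls the backend with the captured torch.fx.GraphModule.
    # We keep a reference to it and run the graph unchanged.
    captured_graphs.append(gm)
    return gm.forward


def scale_and_shift(x):
    return torch.relu(x) * 2 + 1


compiled = torch.compile(scale_and_shift, backend=recording_backend)

x = torch.randn(4)
compiled_out = compiled(x)      # the first call triggers bytecode capture
eager_out = scale_and_shift(x)

# Each FX node has an `op` kind such as "placeholder", "call_function" or "output".
node_ops = [node.op for node in captured_graphs[0].graph.nodes]
print(captured_graphs[0].graph)
```

The printed graph lists the input placeholder, the recorded tensor operations, and the output node. This is the kind of intermediate representation that is later translated into ONNX operators.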

---

### Installation:

```bash
pip install --upgrade onnx onnxscript onnxruntime
```

---

## Check Versions:

```python
import torch
import onnxscript
import onnxruntime

print(f"{torch.__version__=}")
print(f"{onnxscript.__version__=}")
print(f"{onnxruntime.__version__=}")
```

```text
torch.__version__='2.2.0'
onnxscript.__version__='0.1.0.dev20240223'
onnxruntime.__version__='1.17.0'
```


---

## Sample `neural network` model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MyModel(nn.Module):

    def __init__(self):
        super().__init__()
        # Two convolutional layers: 1 -> 6 -> 16 channels, 5x5 kernels.
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Three fully connected layers ending in 10 output classes.
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        # Flatten every dimension except the batch dimension.
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
```
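
The `16 * 5 * 5` input size of `fc1` follows from the 32x32 input used later in this notebook. The sketch below re-derives it with plain arithmetic, assuming the default stride and padding of `nn.Conv2d` and `F.max_pool2d`. The helpers `conv_out` and `pool_out` are hypothetical names introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output size of a convolution along one spatial dimension.
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    # max_pool2d defaults to stride == kernel_size and no padding.
    return (size - kernel) // kernel + 1


size = 32                               # input images are 32x32
size = pool_out(conv_out(size, 5), 2)   # conv1: 32 -> 28, pool: 28 -> 14
size = pool_out(conv_out(size, 5), 2)   # conv2: 14 -> 10, pool: 10 -> 5

flat_features = 16 * size * size        # 16 channels from conv2
print(size, flat_features)              # 5 400
```

A different input resolution changes `size`, so `fc1` would need a matching input dimension.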

---

## Export the model to ONNX format

```python
torch_model = MyModel()
# Example input: a batch of one single-channel 32x32 image.
torch_input = torch.randn(1, 1, 32, 32)
onnx_program = torch.onnx.dynamo_export(torch_model, torch_input)
```



---

## Save the ONNX model in a file

```python
onnx_program.save("my_image_classifier.onnx")
```

---

## Visualize the ONNX model graph using Netron

Open the saved `my_image_classifier.onnx` file in Netron (https://netron.app/) to inspect the exported graph.

---

## Execute the ONNX model with ONNX Runtime

- Install ONNX Runtime (skip this step if it was already installed above):
```bash
pip install --upgrade onnxruntime
```

```python
import onnxruntime

# Convert the PyTorch inputs into the format the ONNX graph expects.
onnx_input = onnx_program.adapt_torch_inputs_to_onnx(torch_input)
print(f"Input length: {len(onnx_input)}")
print(f"Sample input: {onnx_input}")

ort_session = onnxruntime.InferenceSession(
    "./my_image_classifier.onnx", providers=["CPUExecutionProvider"]
)


def to_numpy(tensor):
    # ONNX Runtime consumes NumPy arrays, so detach from autograd if needed.
    return (
        tensor.detach().cpu().numpy() if tensor.requires_grad else tensor.cpu().numpy()
    )


# Map each graph input name to the corresponding NumPy array.
onnxruntime_input = {
    k.name: to_numpy(v) for k, v in zip(ort_session.get_inputs(), onnx_input)
}

# Passing None as the first argument returns all model outputs.
onnxruntime_outputs = ort_session.run(None, onnxruntime_input)
```

```text
Input length: 1
Sample input: (tensor([[[[-0.9046, -0.4413,  0.5450,  ..., -0.0977,  0.6759, -0.4582],
          [-1.5342,  1.3502,  0.6754,  ..., -1.5576, -0.3840,  1.8964],
          [-1.1224, -1.1482,  0.1003,  ...,  0.1244, -1.2301,  0.5959],
          ...,
          [-0.5470,  0.9113,  0.3746,  ..., -0.0066,  0.4829,  0.5727],
          [-0.5766, -0.0918, -0.1301,  ...,  0.4595, -0.7187, -2.3188],
          [ 0.4009,  0.5273, -2.4092,  ...,  0.0299,  0.3198, -0.7431]]]]),)
```


---

## Compare the PyTorch results with the ones from the ONNX Runtime

```python
torch_outputs = torch_model(torch_input)
# Convert the PyTorch outputs into the same flat structure the ONNX model returns.
torch_outputs = onnx_program.adapt_torch_outputs_to_onnx(torch_outputs)

assert len(torch_outputs) == len(onnxruntime_outputs)
for torch_output, onnxruntime_output in zip(torch_outputs, onnxruntime_outputs):
    torch.testing.assert_close(torch_output, torch.tensor(onnxruntime_output))

print("PyTorch and ONNX Runtime output matched!")
print(f"Output length: {len(onnxruntime_outputs)}")
print(f"Sample output: {onnxruntime_outputs}")
```

```text
PyTorch and ONNX Runtime output matched!
Output length: 1
Sample output: [array([[-0.00951168, -0.04558313,  0.07988013, -0.01976795,  0.0758441 ,
        -0.12008841, -0.0436755 ,  0.052744  ,  0.1311065 ,  0.07726569]],
      dtype=float32)]
```
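
The comparison passes even though the two runtimes rarely agree bit for bit, because `torch.testing.assert_close` checks each element against a tolerance rather than for exact equality. The sketch below restates that criterion in plain Python for a single pair of numbers, using the documented `float32` defaults (`rtol=1.3e-6`, `atol=1e-5`). The helper `is_close` and the sample values are hypothetical.

```python
def is_close(actual, expected, rtol=1.3e-6, atol=1e-5):
    # Element-wise criterion: |actual - expected| <= atol + rtol * |expected|
    return abs(actual - expected) <= atol + rtol * abs(expected)


# A difference in the 7th decimal place is within tolerance.
tiny_drift = is_close(0.0798801, 0.0798805)

# A difference in the 4th decimal place is not.
large_drift = is_close(0.0798801, 0.0799801)

print(tiny_drift, large_drift)
```

If an exported model fails the comparison by a small margin, passing explicit `rtol` and `atol` values to `torch.testing.assert_close` is one way to see how large the mismatch actually is.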
