### Deployment with ONNX

ONNX (Open Neural Network Exchange) is an open format designed to make models portable across platforms and frameworks. It provides a standardized representation for machine learning models, so a model trained in one framework can be exported and then run in any other framework or runtime that supports ONNX.

Here are some key aspects of using ONNX for deployment:

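1. **Interoperability:** ONNX allows you to export models trained in frameworks like PyTorch, TensorFlow, or scikit-learn to a common format, either through a built-in exporter (PyTorch) or a converter package such as `tf2onnx` or `skl2onnx`. This simplifies deployment because the serving stack no longer depends on the training framework.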

2. **Deployment Targets:** ONNX models can be deployed on a wide range of platforms, including edge devices, cloud services, mobile devices, and IoT devices. ONNX Runtime is a cross-platform, high-performance scoring engine for ONNX models that supports deployment in various environments.

3. **Ecosystem Support:** Many popular deep learning frameworks and tools support ONNX, making it easier to integrate ONNX models into existing workflows. For example, ONNX models can be used with ONNX Runtime, TensorFlow, OpenVINO, and more.

4. **Model Optimization:** The ONNX ecosystem provides tools for optimizing models, such as the quantization and operator fusion offered by ONNX Runtime, which can be important for deployment in resource-constrained environments.

5. **Language Support:** ONNX Runtime has bindings for several programming languages, including Python, C++, C#, and Java, so a model can be served from the language that best fits the deployment target.

Here is a high-level overview of the steps involved in using ONNX for deployment:

- **Export Model:** Export your trained model to ONNX format using the appropriate export functions provided by the deep learning framework (e.g., `torch.onnx.export` in PyTorch).

- **Load ONNX Model:** Load the exported ONNX model with an inference engine such as ONNX Runtime.

- **Deploy Model:** Deploy the ONNX model on the target platform or framework. This could involve integrating the model into a web service, deploying it on an edge device, or incorporating it into a mobile application.

- **Inference:** Use the deployed ONNX model to perform inference on new data.

By using ONNX, you can achieve greater flexibility in deploying your models across different environments and frameworks, making it a valuable tool for production deployment.

Here's an example of how you can export your PyTorch model to ONNX:


import torch
from mnist_model import Net

def main():
    pytorch_model = Net()

    # Load the checkpoint onto the CPU and keep only the model weights and biases
    state_dict = torch.load('mnist_cnn.pt', map_location='cpu')['model_state_dict']

    # Strip the 'module.' prefix that DataParallel adds to parameter names
    state_dict = {k.replace('module.', ''): v for k, v in state_dict.items()}

    # Load the updated state dictionary
    pytorch_model.load_state_dict(state_dict)
    
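    # Switch to inference mode so layers such as dropout behave deterministically
    pytorch_model.eval()

    # The exporter runs the model once on a dummy input of the expected shape (N, C, H, W)
    dummy_input = torch.randn(1, 1, 28, 28)
    torch.onnx.export(pytorch_model, dummy_input, 'onnx_model.onnx', verbose=True)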

if __name__ == '__main__':
    main()


Exported graph: graph(%input.1 : Float(1, 1, 28, 28, strides=[784, 784, 28, 1], requires_grad=0, device=cpu),
      %conv1.weight : Float(32, 1, 3, 3, strides=[9, 9, 3, 1], requires_grad=1, device=cpu),
      %conv1.bias : Float(32, strides=[1], requires_grad=1, device=cpu),
      %conv2.weight : Float(64, 32, 3, 3, strides=[288, 9, 3, 1], requires_grad=1, device=cpu),
      %conv2.bias : Float(64, strides=[1], requires_grad=1, device=cpu),
      %fc1.weight : Float(128, 9216, strides=[9216, 1], requires_grad=1, device=cpu),
      %fc1.bias : Float(128, strides=[1], requires_grad=1, device=cpu),
      %fc2.weight : Float(10, 128, strides=[128, 1], requires_grad=1, device=cpu),
      %fc2.bias : Float(10, strides=[1], requires_grad=1, device=cpu)):
  %/conv1/Conv_output_0 : Float(1, 32, 26, 26, strides=[21632, 676, 26, 1], requires_grad=0, device=cpu) = onnx::Conv[dilations=[1, 1], group=1, kernel_shape=[3, 3], pads=[0, 0, 0, 0], strides=[1, 1], onnx_name="/conv1/Conv"](%input.1, %conv1

### Exercise
- Try running the model pipeline on your own dataset and exporting the trained model to ONNX.