### Lab 1-1. Converting a PyTorch model into an ONNX model


<!-- 
!pip install ipykernel 
!pip install tensorflow 
!pip install onnxruntime
!pip install -U tf2onnx
!pip install torch torchvision torchaudio tensorboard
!pip install transformers 
!pip install torchinfo
!pip install netron
!pip install onnxmltools
!pip install sclblonnx  -->



A simple PyTorch example follows. It shows how to take a pre-trained PyTorch model (a weights object and a network class object) and convert it to ONNX format, which contains both the weights and the network structure.

In [3]:
from platform import python_version 
print("python_version=" + python_version())

import tensorflow as tf
print("tensorflow_version=" + tf.__version__)

python_version=3.9.18
tensorflow_version=2.12.0


In [4]:
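import os

import torch
import torchvision.models as models

# Load a pre-trained AlexNet from torchvision. The weights are downloaded
# on first use, which can take a while.
# Note: recent torchvision releases deprecate pretrained=True in favor of
# the weights= argument.
model = models.alexnet(pretrained=True)

# Create a sample input with the shape this model expects: (batch, channels, height, width)
dummy_input = torch.randn(10, 3, 224, 224)

# Naming the graph inputs and outputs is optional. The first name is the
# actual input; the remaining 16 names label the model's 16 parameter tensors.
input_names = [ "actual_input_1" ] + [ "learned_%d" % i for i in range(16) ]
output_names = [ "output1" ]

# Make sure the output directory exists before exporting
os.makedirs("models", exist_ok=True)

# Use the torch exporter to write an ONNX model
# (the file holds both the weights and the network architecture)
torch.onnx.export(model, dummy_input, "models/alexnet.onnx", verbose=True, input_names=input_names, output_names=output_names)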

Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to /home/weifen/.cache/torch/hub/checkpoints/alexnet-owt-7be5be79.pth
100%|██████████| 233M/233M [00:16<00:00, 14.9MB/s] 


Exported graph: graph(%actual_input_1 : Float(10, 3, 224, 224, strides=[150528, 50176, 224, 1], requires_grad=0, device=cpu),
      %learned_0 : Float(64, 3, 11, 11, strides=[363, 121, 11, 1], requires_grad=1, device=cpu),
      %learned_1 : Float(64, strides=[1], requires_grad=1, device=cpu),
      %learned_2 : Float(192, 64, 5, 5, strides=[1600, 25, 5, 1], requires_grad=1, device=cpu),
      %learned_3 : Float(192, strides=[1], requires_grad=1, device=cpu),
      %learned_4 : Float(384, 192, 3, 3, strides=[1728, 9, 3, 1], requires_grad=1, device=cpu),
      %learned_5 : Float(384, strides=[1], requires_grad=1, device=cpu),
      %learned_6 : Float(256, 384, 3, 3, strides=[3456, 9, 3, 1], requires_grad=1, device=cpu),
      %learned_7 : Float(256, strides=[1], requires_grad=1, device=cpu),
      %learned_8 : Float(256, 256, 3, 3, strides=[2304, 9, 3, 1], requires_grad=1, device=cpu),
      %learned_9 : Float(256, strides=[1], requires_grad=1, device=cpu),
      %learned_10 : Float(409
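
The `strides=[...]` entries in the exported graph follow directly from each tensor's shape, because the tensors are stored contiguously in row-major order. The sketch below reproduces the strides printed for the input and for the first convolution weight. The helper name `contiguous_strides` is an illustrative assumption and is not part of PyTorch or ONNX.

```python
def contiguous_strides(shape):
    """Return row-major (C-contiguous) strides, counted in elements."""
    strides = []
    step = 1
    # Walk the dimensions from last to first; each stride is the product
    # of all dimension sizes to its right.
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return list(reversed(strides))


# Shapes taken from the exported graph output above
input_strides = contiguous_strides((10, 3, 224, 224))
conv1_weight_strides = contiguous_strides((64, 3, 11, 11))

print(input_strides)         # [150528, 50176, 224, 1]
print(conv1_weight_strides)  # [363, 121, 11, 1]
```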

### Lab 1-2. Converting a TensorFlow model into an ONNX model

TensorFlow can represent a model in several file formats, including checkpoint files, a graph with weights (a frozen graph), and SavedModel, and it provides APIs to generate each of them. TensorFlow models (including Keras and TFLite models) can be converted to ONNX with the tensorflow-onnx (tf2onnx) tool. tf2onnx accepts all three formats, but SavedModel is usually preferred because it does not require the user to specify the graph's input and output names. The TFLite format is also widely used.

In [5]:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
import os

def create_simple_cnn_model(input_shape=(224, 224, 3), num_classes=10):
    model = tf.keras.Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        MaxPooling2D((2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(num_classes, activation='softmax')
    ])
    return model

# Create the output directory if it does not exist yet
directory = "models/tf_cnn_models"
os.makedirs(directory, exist_ok=True)

# Create the CNN model
simple_cnn_model = create_simple_cnn_model()

# Save the model in SavedModel format
tf.saved_model.save(simple_cnn_model, directory)


2024-02-12 18:30:49.349926: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.


INFO:tensorflow:Assets written to: models/tf_cnn_models/assets




In [6]:
!python3 -m tf2onnx.convert --saved-model models/tf_cnn_models --output models/tf2onnx_cnn_model.onnx

2024-02-12 18:30:55.921561: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: SSE4.1 SSE4.2 AVX AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-02-12 18:30:57.498575: I tensorflow/core/common_runtime/process_util.cc:146] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2024-02-12 18:30:58,536 - INFO - Signatures found in model: [serving_default].
2024-02-12 18:30:58,536 - INFO - Output names: ['dense_1']
2024-02-12 18:30:58.537500: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2024-02-12 18:30:58.537705: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2024-02-12 18:31:

Load the converted ONNX model using onnxruntime.



In [7]:
import onnxruntime as ort
onnx_session = ort.InferenceSession("models/tf2onnx_cnn_model.onnx")

Prepare Test Data:



In [8]:
import numpy as np
test_input = np.random.rand(1, 224, 224, 3).astype(np.float32)

Inference with ONNX Model:



In [9]:
onnx_input_name = onnx_session.get_inputs()[0].name
onnx_output = onnx_session.run(None, {onnx_input_name: test_input})

Inference with TensorFlow Model:



In [10]:
tf_output = simple_cnn_model.predict(test_input)



Compare outputs from both models to check for consistency:



In [12]:
if np.allclose(tf_output, onnx_output[0], rtol=1e-3, atol=1e-5):
    print("Model outputs are consistent!")
else:
    print("Model outputs are inconsistent!")

Model outputs are consistent!


In [13]:
tf_output

array([[0.09093276, 0.07652456, 0.0647765 , 0.08905411, 0.07794573,
        0.09307253, 0.09439282, 0.13438646, 0.18836619, 0.09054824]],
      dtype=float32)

In [14]:
onnx_output[0]

array([[0.0909328 , 0.0765245 , 0.06477651, 0.08905414, 0.07794575,
        0.09307254, 0.09439281, 0.13438638, 0.18836643, 0.09054821]],
      dtype=float32)
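
To make the consistency check concrete, the sketch below applies the criterion that `np.allclose(a, b, rtol, atol)` uses, `|a - b| <= atol + rtol * |b|`, element by element to the two arrays printed above. It is written in plain Python so it runs without either framework. The values are copied from the printed output, so they are rounded float32 values rather than the exact tensors, and the helper name `within_tolerance` is an illustrative assumption.

```python
# Values copied from the tf_output and onnx_output[0] printouts above
tf_values = [0.09093276, 0.07652456, 0.0647765, 0.08905411, 0.07794573,
             0.09307253, 0.09439282, 0.13438646, 0.18836619, 0.09054824]
onnx_values = [0.0909328, 0.0765245, 0.06477651, 0.08905414, 0.07794575,
               0.09307254, 0.09439281, 0.13438638, 0.18836643, 0.09054821]

rtol, atol = 1e-3, 1e-5  # same tolerances as the np.allclose call


def within_tolerance(a, b, rtol, atol):
    """Element-wise criterion used by np.allclose: |a - b| <= atol + rtol * |b|."""
    return abs(a - b) <= atol + rtol * abs(b)


max_abs_diff = max(abs(a - b) for a, b in zip(tf_values, onnx_values))
all_close = all(within_tolerance(a, b, rtol, atol)
                for a, b in zip(tf_values, onnx_values))

print(f"max |tf - onnx| = {max_abs_diff:.2e}")
print("all within tolerance:", all_close)
```

The largest difference is on the order of 1e-7, far below the allowed tolerance, which is why the check reports consistent outputs.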

### Lab 2-1-3. Model Analysis in PyTorch


AlexNet is a well-known convolutional architecture for image recognition and classification, and torchvision ships a pre-trained version of it. Its combination of convolutional and fully connected layers makes it a convenient example for analyzing the structure and workload of a deep learning model.

In [15]:
import torchvision.models as models

# Load an existing model from torchvision; the weights are downloaded if they are not already cached locally
model = models.alexnet(pretrained=True)
print(model)



AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace=True)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace=True)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace=True)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace=True)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5, inplace=False)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
 

#### Get Model Parameter Size

In [16]:
# Calculating the total number of parameters in the model
total_params = sum(p.numel() for p in model.parameters())
print("Total number of parameters: ", total_params)

Total number of parameters:  61100840
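
The total can be checked by hand from the layer shapes. The sketch below recomputes it under the assumption that the classifier contains the three linear layers of the standard torchvision AlexNet (9216→4096, 4096→4096, 4096→1000). The `print(model)` output above is cut off after the first of them. The helper names are illustrative.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch


def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features


# (in_channels, out_channels, kernel_size) from the print(model) output
conv_layers = [(3, 64, 11), (64, 192, 5), (192, 384, 3), (384, 256, 3), (256, 256, 3)]
# (in_features, out_features); the last two are assumed from the standard AlexNet
linear_layers = [(9216, 4096), (4096, 4096), (4096, 1000)]

conv_counts = [conv2d_params(*layer) for layer in conv_layers]
linear_counts = [linear_params(*layer) for layer in linear_layers]
total = sum(conv_counts) + sum(linear_counts)

print(conv_counts)  # [23296, 307392, 663936, 884992, 590080]
print(total)        # 61100840
```

Most of the parameters sit in the fully connected layers, not in the convolutions.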


#### Get Memory Requirement

In [17]:
# Calculating the size of the model's parameters in bytes
param_size = sum(p.numel() * p.element_size() for p in model.parameters())
print("Total memory for parameters: ", param_size)

Total memory for parameters:  244403360
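
The byte count is the parameter count multiplied by the element size. The short check below assumes every parameter is stored as float32 (4 bytes), which is what the two printed totals imply.

```python
total_params = 61_100_840   # from the parameter count above
bytes_per_param = 4         # float32

param_bytes = total_params * bytes_per_param
print(param_bytes)                         # 244403360
print(f"{param_bytes / 2**20:.1f} MiB")    # 233.1 MiB
```

This is roughly in line with the 233M checkpoint download shown in Lab 1-1.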


#### Print PyTorch Summary

In [18]:
from torchvision import models
from torchsummary import summary

model = models.alexnet(pretrained=True)
summary(model, (3, 224, 224))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 55, 55]          23,296
              ReLU-2           [-1, 64, 55, 55]               0
         MaxPool2d-3           [-1, 64, 27, 27]               0
            Conv2d-4          [-1, 192, 27, 27]         307,392
              ReLU-5          [-1, 192, 27, 27]               0
         MaxPool2d-6          [-1, 192, 13, 13]               0
            Conv2d-7          [-1, 384, 13, 13]         663,936
              ReLU-8          [-1, 384, 13, 13]               0
            Conv2d-9          [-1, 256, 13, 13]         884,992
             ReLU-10          [-1, 256, 13, 13]               0
           Conv2d-11          [-1, 256, 13, 13]         590,080
             ReLU-12          [-1, 256, 13, 13]               0
        MaxPool2d-13            [-1, 256, 6, 6]               0
AdaptiveAvgPool2d-14            [-1, 25
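
The spatial sizes in the Output Shape column can be derived from the usual formula for convolution and pooling layers, `floor((n + 2 * padding - kernel) / stride) + 1`. The sketch below walks a 224-pixel input through the `features` block using the kernel, stride, and padding values from the `print(model)` output. The function name `out_size` is an illustrative assumption.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length of a conv/pool layer with dilation 1 and floor rounding."""
    return (n + 2 * padding - kernel) // stride + 1


# (kernel, stride, padding) for the conv and max-pool layers in model.features
layers = [
    (11, 4, 2),  # Conv2d 3 -> 64
    (3, 2, 0),   # MaxPool2d
    (5, 1, 2),   # Conv2d 64 -> 192
    (3, 2, 0),   # MaxPool2d
    (3, 1, 1),   # Conv2d 192 -> 384
    (3, 1, 1),   # Conv2d 384 -> 256
    (3, 1, 1),   # Conv2d 256 -> 256
    (3, 2, 0),   # MaxPool2d
]

n = 224
sizes = []
for kernel, stride, padding in layers:
    n = out_size(n, kernel, stride, padding)
    sizes.append(n)

print(sizes)  # [55, 27, 27, 13, 13, 13, 13, 6]
```

ReLU layers are omitted because they do not change the shape.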

#### Using torchinfo

torchinfo.summary reports additional per-layer information. The columns to display are selected through the col_names argument:

In [19]:
import torchinfo
torchinfo.summary(model, (3, 224, 224), batch_dim=0, col_names=("input_size", "output_size", "num_params", "kernel_size", "mult_adds"), verbose=0)

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #                   Kernel Shape              Mult-Adds
AlexNet                                  [1, 3, 224, 224]          [1, 1000]                 --                        --                        --
├─Sequential: 1-1                        [1, 3, 224, 224]          [1, 256, 6, 6]            --                        --                        --
│    └─Conv2d: 2-1                       [1, 3, 224, 224]          [1, 64, 55, 55]           23,296                    [11, 11]                  70,470,400
│    └─ReLU: 2-2                         [1, 64, 55, 55]           [1, 64, 55, 55]           --                        --                        --
│    └─MaxPool2d: 2-3                    [1, 64, 55, 55]           [1, 64, 27, 27]           --                        3                         --
│    └─Conv2d: 2-4                       [1, 64, 27, 27]           [1, 192, 27, 27]          307,
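
The Mult-Adds value reported for the first convolution equals its parameter count (including bias) multiplied by the number of output positions. The sketch below reproduces the figure on that basis. This is an observation about the numbers in the table, not a description of torchinfo's internals, and the function name is an illustrative assumption.

```python
def conv2d_mult_adds(in_ch, out_ch, kernel, out_h, out_w):
    """Parameter count (weights + bias) times the number of output positions."""
    params = out_ch * in_ch * kernel * kernel + out_ch
    return params * out_h * out_w


# First convolution with a 224x224 input: output is 55x55
mult_adds_224 = conv2d_mult_adds(3, 64, 11, 55, 55)
# Same layer with a 64x64 input: output is 15x15
mult_adds_64 = conv2d_mult_adds(3, 64, 11, 15, 15)

print(mult_adds_224)  # 70470400
print(mult_adds_64)   # 5241600
```

The compute cost of a convolution therefore scales with the input resolution, while its parameter count stays fixed at 23,296.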

Consider a model structured as follows, with several branches where each branch takes a different input:

In [20]:
import torchvision.models as models
import torch

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Initialize three instances of pretrained AlexNet
        self.alexnet1 = models.alexnet(pretrained=True)
        self.alexnet2 = models.alexnet(pretrained=True)
        self.alexnet3 = models.alexnet(pretrained=True)
    
    def forward(self, *x):
        # Ensure that the input is a tuple of three tensors
        if len(x) != 3:
            raise ValueError("Expected three input tensors")

        # Pass each tensor through the corresponding AlexNet model
        out1 = self.alexnet1(x[0])
        out2 = self.alexnet2(x[1])
        out3 = self.alexnet3(x[2])

        # Concatenate the outputs along the 0th dimension
        out = torch.cat([out1, out2, out3], dim=0)
        return out

In [21]:
import torchinfo
torchinfo.summary(Model(), [(3, 64, 64)]*3, batch_dim=0, col_names=("input_size", "output_size", "num_params", "kernel_size", "mult_adds"), verbose=0)

Layer (type:depth-idx)                   Input Shape               Output Shape              Param #                   Kernel Shape              Mult-Adds
Model                                    [1, 3, 64, 64]            [3, 1000]                 --                        --                        --
├─AlexNet: 1-1                           [1, 3, 64, 64]            [1, 1000]                 --                        --                        --
│    └─Sequential: 2-1                   [1, 3, 64, 64]            [1, 256, 1, 1]            --                        --                        --
│    │    └─Conv2d: 3-1                  [1, 3, 64, 64]            [1, 64, 15, 15]           23,296                    [11, 11]                  5,241,600
│    │    └─ReLU: 3-2                    [1, 64, 15, 15]           [1, 64, 15, 15]           --                        --                        --
│    │    └─MaxPool2d: 3-3               [1, 64, 15, 15]           [1, 64, 7, 7]             --   