# Measuring FLOPs
MACs (multiply-accumulate operations) and FLOPs (floating-point operations) are two common measures of a model's computational cost. They are often used as a hardware-independent proxy for inference speed.


Reducing MACs/FLOPs is one way to make a model faster, which matters most when the goal is real-time inference.
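To make the two metrics concrete, here is a minimal sketch of how they can be counted by hand for single layers. The helper names (`conv2d_macs`, `linear_macs`, `macs_to_flops`) are made up for this illustration, bias terms are ignored, and the factor of 2 between MACs and FLOPs is the usual approximation (one multiply plus one add), not an exact rule for every operator.

```python
def conv2d_macs(c_in, c_out, kernel_size, h_out, w_out, groups=1):
    """MACs of a 2D convolution (bias ignored): one MAC per weight per output position."""
    return (c_in // groups) * c_out * kernel_size * kernel_size * h_out * w_out


def linear_macs(in_features, out_features):
    """MACs of a fully connected layer (bias ignored)."""
    return in_features * out_features


def macs_to_flops(macs):
    """Common approximation: one MAC = one multiply + one add = 2 FLOPs."""
    return 2 * macs


# First layer of a ResNet: 7x7 conv, 3 -> 64 channels,
# 224x224 input with stride 2 gives a 112x112 output.
stem_macs = conv2d_macs(c_in=3, c_out=64, kernel_size=7, h_out=112, w_out=112)
print(stem_macs)                 # 118013952
print(macs_to_flops(stem_macs))  # 236027904

# Final classifier of ResNet-18/34: 512 -> 1000
print(linear_macs(512, 1000))    # 512000
```

Summing such per-layer counts over a whole network gives the totals that profiling tools report automatically.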



Check out [ref] for more.

[ref]: https://medium.com/@pashashaik/a-guide-to-hand-calculating-flops-and-macs-fa5221ce5ccc

Library used: [thop]

[thop]: https://github.com/Lyken17/pytorch-OpCounter

Other tool(s): [fvcore]

[fvcore]: https://github.com/facebookresearch/fvcore/blob/main/docs/flop_count.md

In [None]:
!pip install thop

In [11]:
import torch
from torchvision.models import resnet34, resnet18
from thop import profile
from thop import clever_format

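model = resnet34()
input = torch.randn(1, 3, 224, 224)  # one 224x224 RGB image
# Note: thop counts multiply-accumulates, so the first value is in MACs
# (roughly half the FLOP count), even though it is named `flops` here.
flops, params = profile(model, inputs=(input, ))
# Format the raw counts as human-readable strings such as '3.679G'
flops, params = clever_format([flops, params], "%.3f")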

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.


In [3]:
flops, params

('3.679G', '21.798M')

In [None]:
model

In [6]:
pip install torchsummary

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.

Collecting torchsummary
  Downloading torchsummary-1.5.1-py3-none-any.whl (2.8 kB)
Installing collected packages: torchsummary
Successfully installed torchsummary-1.5.1



[notice] A new release of pip is available: 23.2.1 -> 23.3.1
[notice] To update, run: python.exe -m pip install --upgrade pip


In [None]:
from torchsummary import summary
summary(model, (3,224,224))

# Converting models to ONNX and visualizing them

ONNX, or Open Neural Network Exchange, is an open format for representing machine learning (ML) models. It defines a computation graph model, as well as definitions of built-in operators and standard data types. ONNX is widely supported by many ML frameworks, tools, and hardware platforms.

One of the main benefits of ONNX is that it enables interoperability between different ML frameworks. This means that you can train a model in one framework, such as TensorFlow or PyTorch, and then export it to ONNX. You can then use the ONNX model in another framework, or on a different hardware platform, without having to retrain the model.

Another benefit of ONNX is that it makes it easier to deploy ML models in production. There are many ONNX-compatible runtimes and libraries that can be used to accelerate inference on a variety of hardware platforms, including CPUs, GPUs, and FPGAs.

The export code below is adapted from https://pytorch.org/tutorials/advanced/super_resolution_with_onnxruntime.html. That tutorial also shows how to run inference with a model in ONNX format.

In [None]:
pip install --upgrade pip

In [7]:
!pip install onnx

Defaulting to user installation because normal site-packages is not writeable


In [8]:
import torch.nn as nn
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 3x(Conv2d -> ReLU) -> MaxPool -> Dropout -> Flatten -> 2x(Linear ->  ReLU) -> Linear

        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(0.3)
        self.flatten = nn.Flatten()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, stride=1, padding=0)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, stride=1, padding=0)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=0)
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(in_features=11*11*64, out_features=128)
        self.fc2 = nn.Linear(in_features=128, out_features=64)
        self.fc3 = nn.Linear(in_features=64, out_features=10)

    def forward(self, x):
        ## 3x(Conv2d -> ReLU) -> MaxPool -> Dropout -> Flatten -> 2x(Linear ->  ReLU) -> Linear

        x = self.conv1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.relu(x)
        x = self.conv3(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.dropout(x)
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        x = self.fc3(x)

        return x
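The `in_features=11*11*64` of `fc1` follows from the spatial size of the feature map after the three convolutions and the max-pool. As a quick check, the sketch below traces that size for a 28x28 input using the standard output-size formula `floor((n + 2p - k) / s) + 1`. The helper `conv_out_size` is a name introduced here for illustration, and dilation is assumed to be 1.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 28  # input images are 1x28x28
for name in ("conv1", "conv2", "conv3"):
    # each conv uses kernel_size=3, stride=1, padding=0
    size = conv_out_size(size, kernel_size=3)
    print(name, size)

# MaxPool2d(kernel_size=2, stride=2)
size = conv_out_size(size, kernel_size=2, stride=2)
print("maxpool", size)

# conv3 has 64 output channels, so Flatten produces size * size * 64 features
fc1_in_features = size * size * 64
print(fc1_in_features)  # 7744
```

Changing the input resolution or any kernel size changes this number, so `fc1` has to be adjusted accordingly.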

In [9]:
import onnx
import torch.onnx

model = Net()
model.eval()  # disable dropout so the exported graph reflects inference behavior
input = torch.randn(1, 1, 28, 28)  # dummy input used to trace the model

torch.onnx.export(model,                     # model being run
                  input,                     # model input (or a tuple for multiple inputs)
                  "example.onnx",            # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX opset version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input'],   # the model's input names
                  output_names = ['output'], # the model's output names
                  dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes
                                'output' : {0 : 'batch_size'}})

Finally, download the exported `example.onnx` file and upload it to https://netron.app/ to visualize and debug the model graph.

**Exercise 1: Implement ResNet-18**

In [25]:
import torchvision.models as models
from torchsummary import summary

In [26]:
resnet_18 = models.resnet18()
input = torch.randn(1, 3, 224, 224)

In [None]:
summary(resnet_18, (3, 224, 224))

In [31]:
torch.onnx.export(resnet_18,
                  input, 
                  "resnet_18.onnx",
                  export_params=True,
                  opset_version=10,
                  do_constant_folding=True,
                  input_names = ['input'],
                  output_names = ['output'],
                  dynamic_axes={'input' : {0 : 'batch_size'},
                                'output' : {0 : 'batch_size'}})

**Exercise 2: Use thop library to verify the number of parameters & FLOPs of VGG16 and ResNet 101**

In [32]:
import torch
from torchvision.models import vgg16, resnet101
from thop import profile
from thop import clever_format

Params and FLOPs of VGG 16

In [33]:
vgg16_ = vgg16()
input = torch.randn(1, 3, 224, 224)
flops, params = profile(vgg16_, inputs=(input, ))
flops, params = clever_format([flops, params], "%.3f")

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.dropout.Dropout'>.


In [34]:
flops, params

('15.470G', '138.358M')
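These numbers can be cross-checked by hand. The sketch below assumes torchvision's standard VGG16 layout (3x3 convolutions with padding 1, 2x2 max-pools, a 7x7 pooled feature map, and three linear layers) and counts one multiply-accumulate per weight per output position. The function name `vgg16_macs_and_params` is introduced here for illustration only.

```python
# Integers are conv output channels, "M" is a 2x2 max-pool with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def vgg16_macs_and_params(image_size=224, in_channels=3, num_classes=1000):
    macs, params = 0, 0
    size, channels = image_size, in_channels
    for layer in VGG16_CFG:
        if layer == "M":
            size //= 2                      # pooling halves the spatial size
            continue
        weights = channels * layer * 3 * 3  # 3x3 kernel
        macs += weights * size * size       # padding 1 keeps the spatial size
        params += weights + layer           # weights + one bias per output channel
        channels = layer
    # Classifier: 512 * 7 * 7 -> 4096 -> 4096 -> num_classes
    fc_in = channels * 7 * 7
    for fc_out in (4096, 4096, num_classes):
        macs += fc_in * fc_out
        params += fc_in * fc_out + fc_out
        fc_in = fc_out
    return macs, params


macs, n_params = vgg16_macs_and_params()
print(macs)      # 15470264320
print(n_params)  # 138357544
```

The hand count rounds to 15.470G and 138.358M, the same as the values above. Since this count treats a multiply and its add as a single operation, the agreement suggests that the first value reported by thop is best read as MACs, with the FLOP count being roughly twice as large.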

Params and FLOPs of ResNet 101

In [35]:
resnet101_ = resnet101()
input = torch.randn(1, 3, 224, 224)
flops, params = profile(resnet101_, inputs=(input, ))
flops, params = clever_format([flops, params], "%.3f")

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.pooling.MaxPool2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.


In [36]:
flops, params

('7.866G', '44.549M')