
In [1]:
import torch
import torch.nn as nn


MobileNetV1 [paper](https://arxiv.org/pdf/1704.04861v1.pdf)

A convolutional neural network with a large number of layers is expensive in terms of both memory and
the hardware required for inference, so deploying such models on mobile devices is often not feasible.

To overcome this challenge, a group of researchers from Google built a neural network model
optimized for mobile devices, referred to as MobileNet. The underlying idea of MobileNet is the depthwise
separable convolution, which consists of a depthwise convolution followed by a pointwise (1x1)
convolution and is used to build lighter models.
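
To make the saving concrete, the sketch below compares the parameter count of a standard 3x3 convolution with that of a depthwise separable one producing the same output shape. The channel sizes (64 in, 128 out) and the helper `count_params` are illustrative choices made here, not values prescribed by the paper.

```python
import torch
import torch.nn as nn


def count_params(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())


c_in, c_out, k = 64, 128, 3  # illustrative sizes (assumption)

# Standard convolution: every output channel sees every input channel.
standard = nn.Conv2d(c_in, c_out, kernel_size=k, padding=1, bias=False)

# Depthwise separable convolution:
#   depthwise: one k x k filter per input channel (groups=c_in)
#   pointwise: 1 x 1 convolution that mixes the channels
separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, kernel_size=k, padding=1, groups=c_in, bias=False),
    nn.Conv2d(c_in, c_out, kernel_size=1, bias=False),
)

n_standard = count_params(standard)    # k*k*c_in*c_out
n_separable = count_params(separable)  # k*k*c_in + c_in*c_out

# Both variants map the same input to the same output shape.
x = torch.randn(1, c_in, 56, 56)
assert standard(x).shape == separable(x).shape

print(n_standard, n_separable, n_separable / n_standard)
```

For these sizes the ratio works out to 1/c_out + 1/k², which is a little under 12% of the parameters of the standard convolution.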

MobileNet introduces two hyperparameters:

* Width Multiplier

The width multiplier (denoted by α) is a global hyperparameter used to construct smaller and less
computationally expensive models. Its value lies in the range (0, 1]. For a given layer and value of α, the
number of input channels M becomes αM and the number of output channels N becomes αN, which
reduces the computational cost and the size of the model at the cost of accuracy. The computational cost
and the number of parameters decrease roughly by a factor of α². Some commonly used values of α are 1, 0.75, 0.5 and 0.25.

* Resolution Multiplier

The second hyperparameter introduced in MobileNets is the resolution multiplier, denoted by ρ. It
decreases the resolution of the input image, which in turn reduces the resolution of the input to
every layer by the same factor. For a given value of ρ, the side length of the input image becomes
224 * ρ. This reduces the computational cost by a factor of ρ².

Together, these hyperparameters allow a trade-off between latency (speed of inference) and accuracy.
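
As a rough illustration of both multipliers, the sketch below evaluates the multiply-add count of a single depthwise separable layer, using the cost expression from the MobileNet paper (depthwise term plus pointwise term). The function name `separable_mult_adds` and the layer sizes (512 channels on a 14x14 feature map) are assumptions chosen for the example.

```python
def separable_mult_adds(m, n, d_f, d_k=3, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer.

    m, n : input / output channels
    d_f  : side length of the (square) feature map
    d_k  : side length of the depthwise kernel
    alpha, rho : width and resolution multipliers
    """
    m_a, n_a = int(alpha * m), int(alpha * n)  # thinner layer
    d = int(rho * d_f)                         # smaller feature map
    depthwise = d_k * d_k * m_a * d * d
    pointwise = m_a * n_a * d * d
    return depthwise + pointwise


base = separable_mult_adds(512, 512, 14)
half_width = separable_mult_adds(512, 512, 14, alpha=0.5)
half_res = separable_mult_adds(512, 512, 14, rho=0.5)

print(half_width / base)  # slightly above alpha**2
print(half_res / base)    # rho**2
```

Halving the resolution cuts the cost to exactly one quarter here, while halving the width gives slightly more than one quarter, because the depthwise term scales with α rather than α². That is why the reduction is described as "roughly" α².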

MobileNet has 28 layers when the depthwise and pointwise convolutions are counted as separate layers.
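
The count of 28 can be checked with a little arithmetic. The sketch below lists the (output channels, stride) pairs of the depthwise separable blocks as they appear in the `conv_dw(...)` calls of this notebook's model, then tallies the layers and tracks how the strides shrink a 224x224 input. The variable names are introduced here for illustration.

```python
# (output channels, stride) of the 13 depthwise separable blocks,
# copied from the conv_dw(...) calls in this notebook's model.
separable_blocks = [
    (64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
    (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
    (1024, 2), (1024, 1),
]
first_conv_stride = 2  # stride of the initial full 3x3 convolution

# 1 full convolution + (depthwise + pointwise) per block + 1 fully connected layer
n_layers = 1 + 2 * len(separable_blocks) + 1

# Each stride-2 layer halves the spatial side (3x3 kernel, padding 1, even input side).
side = 224 // first_conv_stride
for _, stride in separable_blocks:
    side //= stride

print(n_layers, side)  # prints: 28 7
```

The final 7x7 feature map is the reason a fixed 7x7 average pool can stand in for global average pooling when the input is 224x224.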

In [2]:
class MobileNetV1(nn.Module):
    def __init__(self,c_in,n_classes):
        super(MobileNetV1,self).__init__()

        def conv_bn(c_in,c_out,stride):
            return nn.Sequential(
                nn.Conv2d(c_in,c_out,kernel_size = 3,stride = stride,padding = 1,bias = False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace = True)
            )
        def conv_dw(c_in,c_out,stride):
            return nn.Sequential(
                # depthwise: one 3x3 filter per input channel (groups = c_in)
                nn.Conv2d(c_in,c_in,kernel_size = 3,stride = stride , padding = 1,groups = c_in,bias = False),
                nn.BatchNorm2d(c_in),
                nn.ReLU(inplace = True),

                # pointwise: 1x1 convolution that mixes channels
                nn.Conv2d(c_in,c_out,kernel_size = 1,stride = 1,padding = 0,bias = False),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True)
            )
        self.model = nn.Sequential(
            conv_bn(c_in,32,2),
            conv_dw(32,64,1),
            conv_dw(64,128,2),
            conv_dw(128,128,1),
            conv_dw(128,256,2),
            conv_dw(256,256,1),
            conv_dw(256,512,2),
            conv_dw(512,512,1),
            conv_dw(512,512,1),
            conv_dw(512,512,1),
            conv_dw(512,512,1),
            conv_dw(512,512,1),
            conv_dw(512,1024,2),
            conv_dw(1024,1024,1),
            nn.AdaptiveAvgPool2d(1)  # global average pool; same as nn.AvgPool2d(7) for 224x224 inputs
        )
        self.fc = nn.Linear(1024,n_classes)
    def forward(self,x):
        out = self.model(x)
        out = out.view(-1,1024)
        out = self.fc(out)
        return out

In [3]:
from torchsummary import summary
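# The model is kept on the CPU, so ask torchsummary to build its dummy input on the CPU as well
# (its default device is 'cuda', which would not match a CPU model on a GPU runtime).
blk = MobileNetV1(3,10)

summary(blk,(3,224,224),device='cpu')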


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 32, 112, 112]             864
       BatchNorm2d-2         [-1, 32, 112, 112]              64
              ReLU-3         [-1, 32, 112, 112]               0
            Conv2d-4         [-1, 32, 112, 112]             288
       BatchNorm2d-5         [-1, 32, 112, 112]              64
              ReLU-6         [-1, 32, 112, 112]               0
            Conv2d-7         [-1, 64, 112, 112]           2,048
       BatchNorm2d-8         [-1, 64, 112, 112]             128
              ReLU-9         [-1, 64, 112, 112]               0
           Conv2d-10           [-1, 64, 56, 56]             576
      BatchNorm2d-11           [-1, 64, 56, 56]             128
             ReLU-12           [-1, 64, 56, 56]               0
           Conv2d-13          [-1, 128, 56, 56]           8,192
      BatchNorm2d-14          [-1, 128,

In [4]:
x = torch.randn(4,3,224,224)  # dummy batch of four 224x224 RGB images (avoids shadowing the built-in input)
out = blk(x)
out.shape

torch.Size([4, 10])
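
The model above corresponds to α = 1. As a possible extension, the sketch below shows one way the width multiplier could be wired into the same architecture by scaling every channel count. The class name `ScaledMobileNetV1`, the `CFG` list, the `scale` helper and `n_params` are hypothetical names introduced here; they are not part of the original notebook or of any library.

```python
import torch
import torch.nn as nn


def conv_bn(c_in, c_out, stride):
    # full 3x3 convolution + batch norm + ReLU
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


def conv_dw(c_in, c_out, stride):
    # depthwise 3x3 convolution followed by pointwise 1x1 convolution
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, kernel_size=3, stride=stride, padding=1,
                  groups=c_in, bias=False),
        nn.BatchNorm2d(c_in),
        nn.ReLU(inplace=True),
        nn.Conv2d(c_in, c_out, kernel_size=1, stride=1, padding=0, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )


class ScaledMobileNetV1(nn.Module):  # hypothetical name (assumption)
    # (output channels, stride) of the depthwise separable blocks at alpha = 1
    CFG = [(64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2),
           (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
           (1024, 2), (1024, 1)]

    def __init__(self, c_in=3, n_classes=10, alpha=1.0):
        super().__init__()

        def scale(c):
            # apply the width multiplier, keeping at least one channel
            return max(1, int(alpha * c))

        prev = scale(32)
        layers = [conv_bn(c_in, prev, 2)]
        for c_out, stride in self.CFG:
            layers.append(conv_dw(prev, scale(c_out), stride))
            prev = scale(c_out)
        layers.append(nn.AdaptiveAvgPool2d(1))
        self.features = nn.Sequential(*layers)
        self.fc = nn.Linear(prev, n_classes)

    def forward(self, x):
        return self.fc(torch.flatten(self.features(x), 1))


def n_params(model):
    return sum(p.numel() for p in model.parameters())


full = ScaledMobileNetV1(alpha=1.0)
half = ScaledMobileNetV1(alpha=0.5)

half.eval()
with torch.no_grad():
    # a 160x160 input corresponds to a resolution multiplier of 160/224
    out = half(torch.randn(2, 3, 160, 160))

print(out.shape, n_params(full), n_params(half))
```

Because the model ends in `nn.AdaptiveAvgPool2d(1)`, the resolution multiplier needs no code change: feeding a smaller image is enough. The parameter count at α = 0.5 should land somewhat above a quarter of the full model, since the depthwise filters, batch-norm parameters and classifier shrink only linearly with α.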