## MobilenetV1 and MobilenetV2

#### Normal Convolution

![normal_convolution](image01.jpg)

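Here Dk is the kernel's width and height, M is the number of input channels, N is the number of filters (output channels), and Dp is the width and height of the output feature map.

The number of multiplications in one convolution operation (one filter at one position) = Dk x Dk x M

Since there are N filters and each filter slides over Dp positions both vertically and horizontally,

the total number of multiplications becomes N x Dp x Dp x (multiplications per convolution)

So for the normal convolution operation

Total number of multiplications = $N \times D_p^2 \times D_k^2 \times M$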
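The count above can be checked with a few lines of Python. This is only a sketch: the function name and the example layer sizes (3x3 kernels, 14x14 output, 512 input channels, 512 filters) are illustrative assumptions, not values taken from the figure.

```python
def standard_conv_mults(dk: int, dp: int, m: int, n: int) -> int:
    """Multiplications in a standard convolution: N * Dp^2 * Dk^2 * M."""
    per_position = dk * dk * m          # one filter applied at one output position
    return n * dp * dp * per_position   # N filters, Dp x Dp positions each


# Hypothetical layer sizes, chosen only for illustration.
print(standard_conv_mults(dk=3, dp=14, m=512, n=512))
```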

#### Depth-Wise Separable Convolutions

Now look at depth-wise separable convolutions. This process is broken down into 2 operations –

- **Depth-wise convolutions**
- **Point-wise convolutions**

##### DEPTH WISE CONVOLUTION

In the **depth-wise operation**, convolution is applied to a single channel at a time, unlike a standard convolution, where each filter spans all M channels. The filters/kernels are therefore of size **Dk x Dk x 1**. Given that there are M channels in the input data, M such filters are required. The output is of size **Dp x Dp x M**.

![depthwise_conv](image02.png)
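To make the "one filter per channel" idea concrete, here is a minimal pure-Python sketch of a depth-wise convolution. It assumes stride 1, no padding, and a channels-first nested-list layout; the function names are made up for this example and it is far slower than a real framework implementation.

```python
def conv2d_single_channel(channel, kernel):
    """Slide one Dk x Dk kernel over one H x W channel (stride 1, no padding)."""
    dk = len(kernel)
    out_h = len(channel) - dk + 1
    out_w = len(channel[0]) - dk + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(dk) for b in range(dk))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def depthwise_conv(x, kernels):
    """x: list of M channels; kernels: list of M filters, one per channel."""
    # Channel c is convolved only with kernel c, so the output keeps M channels.
    return [conv2d_single_channel(ch, k) for ch, k in zip(x, kernels)]


x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],   # channel 0
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],   # channel 1
]
kernels = [
    [[1, 0], [0, 1]],                    # filter for channel 0
    [[1, 1], [1, 1]],                    # filter for channel 1
]
y = depthwise_conv(x, kernels)
print(len(y), len(y[0]), len(y[0][0]))   # channels, height, width of the output
```

The output has as many channels as the input (M), which is why a second step is needed to mix information across channels.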

**Cost of this operation:**

A single convolution operation requires **Dk x Dk x 1** multiplications.

Since each filter is slid over Dp x Dp positions, once for each of the M channels,

the total number of multiplications is equal to **M x Dp x Dp x Dk x Dk x 1**

So for the depth-wise convolution operation

Total number of multiplications = $M \times D_k^2 \times D_p^2$

##### POINT WISE CONVOLUTION

In the **point-wise operation**, a **1×1** convolution is applied across the M channels. The filter size for this operation is therefore **1 x 1 x M**. If we use N such filters, the output size becomes **Dp x Dp x N**.

![point_wise_conv](image03.png)
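A point-wise convolution is a weighted sum over the M channels at every pixel, repeated for each of the N filters. The sketch below illustrates this in pure Python, assuming a channels-first nested-list layout; `pointwise_conv` is a name invented for this example.

```python
def pointwise_conv(x, filters):
    """x: list of M channels (each H x W); filters: N lists of M weights (1 x 1 x M each)."""
    m, h, w = len(x), len(x[0]), len(x[0][0])
    return [
        [
            # Each output pixel mixes the M input channels at the same position.
            [sum(f[c] * x[c][i][j] for c in range(m)) for j in range(w)]
            for i in range(h)
        ]
        for f in filters
    ]


x = [
    [[1, 2], [3, 4]],        # channel 0
    [[10, 20], [30, 40]],    # channel 1
]
filters = [[1, 1], [1, -1], [0, 2]]   # N = 3 filters, each with M = 2 weights
y = pointwise_conv(x, filters)
print(len(y), len(y[0]), len(y[0][0]))   # N channels, same height and width as the input
```

The spatial size is unchanged and only the channel count moves from M to N.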

**Cost of this operation:**

A single convolution operation requires **1 x 1 x M** multiplications.

Since each of the N filters is slid over Dp x Dp positions,

the total number of multiplications is equal to **(1 x 1 x M) x (Dp x Dp x N)**

So for the point-wise convolution operation

Total number of multiplications = $M \times D_p^2 \times N$

Therefore, for overall operation:


Total multiplications = Depth wise conv. multiplications + Point wise conv. multiplications

Total multiplications = $M \times D_k^2 \times D_p^2 + M \times D_p^2 \times N$
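As a rough numerical illustration of the two terms, the sketch below adds up the depth-wise and point-wise costs for one layer. The helper names and the layer sizes (Dk=3, Dp=14, M=512, N=512) are assumptions chosen for the example.

```python
def depthwise_mults(dk: int, dp: int, m: int) -> int:
    """M filters of size Dk x Dk x 1, each applied at Dp x Dp positions."""
    return m * dk * dk * dp * dp


def pointwise_mults(dp: int, m: int, n: int) -> int:
    """N filters of size 1 x 1 x M, each applied at Dp x Dp positions."""
    return m * dp * dp * n


def separable_mults(dk: int, dp: int, m: int, n: int) -> int:
    """Total cost of a depth-wise separable convolution."""
    return depthwise_mults(dk, dp, m) + pointwise_mults(dp, m, n)


# Hypothetical layer sizes, chosen only for illustration.
dw = depthwise_mults(dk=3, dp=14, m=512)
pw = pointwise_mults(dp=14, m=512, n=512)
print(dw, pw, separable_mults(dk=3, dp=14, m=512, n=512))
```

For these sizes the point-wise step accounts for almost all of the multiplications; the depth-wise step is comparatively cheap.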

**Comparison between the complexities of these types of convolution operations**

**Depth-wise separable:    $M \times D_k^2 \times D_p^2 + M \times D_p^2 \times N$**

**Normal convolution:        $N \times D_p^2 \times D_k^2 \times M$**


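**Ratio (R)** = *Complexity of depth-wise separable convolutions* / *Complexity of standard convolution*

$R = \frac{M \times D_k^2 \times D_p^2 + M \times D_p^2 \times N}{N \times D_p^2 \times D_k^2 \times M}$

Upon solving (dividing each term of the numerator by the denominator):

**Ratio (R) = $1/N + 1/D_k^2$**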

For example, if N = 512 and Dk = 3, the ratio is 1/512 + 1/9 ≈ 0.113, which means **a depth-wise separable convolution needs only about 11.3 percent of the multiplications of a regular convolution**.
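The simplified ratio depends only on N and Dk, not on M or Dp. A quick way to convince yourself is to compute the ratio from the two raw counts with exact fractions and compare it with the closed form. The function name and the test sizes below are illustrative assumptions.

```python
from fractions import Fraction


def cost_ratio(dk: int, dp: int, m: int, n: int) -> Fraction:
    """Separable cost divided by standard cost, computed from the raw counts."""
    separable = m * dk**2 * dp**2 + m * dp**2 * n
    standard = n * dp**2 * dk**2 * m
    return Fraction(separable, standard)


closed_form = Fraction(1, 512) + Fraction(1, 3**2)   # 1/N + 1/Dk^2 with N=512, Dk=3

# Vary Dp and M (arbitrary values); the ratio should stay the same.
for dp, m in [(7, 32), (14, 512), (112, 64)]:
    assert cost_ratio(3, dp, m, 512) == closed_form

print(float(closed_form))   # approximately 0.113
```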

![depthwise_separable_conv](depthwise_separable_conv.png)


![mobilenet_table01](mobilenet_Table1-1.png)

**Exercise: build the MobileNetV1 model using the keras or tf.keras library and the table above.**

[MobileNetV1 review by Sik-Ho Tsang](https://towardsdatascience.com/review-mobilenetv1-depthwise-separable-convolution-light-weight-model-a382df364b69)

[MobileNetV2 review by Sik-Ho Tsang](https://towardsdatascience.com/review-mobilenetv2-light-weight-model-image-classification-8febb490e61c)

[MobilenetV1 concept](https://machinethink.net/blog/googles-mobile-net-architecture-on-iphone/)

[MobilenetV2 details](http://www.machinethink.net/blog/mobilenet-v2/)

In [None]:
#%run mobilenet.py

In [None]:
import tensorflow as tf
from tensorflow.keras.optimizers import *
from tensorflow.keras.models import Model
from tensorflow.keras.layers import *
from tensorflow.keras.activations import *
from tensorflow.keras.callbacks import *


def get_conv_block(tensor, channels, strides, alpha=1.0, name=''):
    channels = int(channels * alpha)

    x = Conv2D(channels,
               kernel_size=(3, 3),
               strides=strides,
               use_bias=False,
               padding='same',
               name='{}_conv'.format(name))(tensor)
    x = BatchNormalization(name='{}_bn'.format(name))(x)
    x = Activation('relu', name='{}_act'.format(name))(x)
    return x


def get_dw_sep_block(tensor, channels, strides, alpha=1.0, name=''):
    """Depthwise separable conv: A Depthwise conv followed by a Pointwise conv."""
    channels = int(channels * alpha)

    # Depthwise
    x = DepthwiseConv2D(kernel_size=(3, 3),
                        strides=strides,
                        use_bias=False,
                        padding='same',
                        name='{}_dw'.format(name))(tensor)
    x = BatchNormalization(name='{}_bn1'.format(name))(x)
    x = Activation('relu', name='{}_act1'.format(name))(x)

    # Pointwise
    x = Conv2D(channels,
               kernel_size=(1, 1),
               strides=(1, 1),
               use_bias=False,
               padding='same',
               name='{}_pw'.format(name))(x)
    x = BatchNormalization(name='{}_bn2'.format(name))(x)
    x = Activation('relu', name='{}_act2'.format(name))(x)
    return x


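def MobileNet(shape, num_classes, alpha=1.0, include_top=True, weights=None):
    """Build MobileNetV1; `alpha` is the width multiplier applied to every layer's channels."""
    x_in = Input(shape=shape)

    # Initial standard 3x3 convolution with stride 2.
    x = get_conv_block(x_in, 32, (2, 2), alpha=alpha, name='initial')

    # (output channels, depthwise strides) for the 13 depthwise separable blocks.
    layers = [
        (64, (1, 1)),
        (128, (2, 2)),
        (128, (1, 1)),
        (256, (2, 2)),
        (256, (1, 1)),
        (512, (2, 2)),
        *[(512, (1, 1)) for _ in range(5)],
        (1024, (2, 2)),
        # Stride 1 here keeps the 7x7 feature map (for a 224x224 input) that the
        # MobileNet table feeds into average pooling; a second stride-2 block
        # would shrink it to 4x4.
        (1024, (1, 1))
    ]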

    for i, (channels, strides) in enumerate(layers):
        x = get_dw_sep_block(x, channels, strides, alpha=alpha, name='block{}'.format(i))

    if include_top:
        x = GlobalAvgPool2D(name='global_avg')(x)
        x = Dense(num_classes, activation='softmax', name='softmax')(x)

    model = Model(inputs=x_in, outputs=x)

    if weights is not None:
        model.load_weights(weights, by_name=True)

    return model

In [None]:
mobilenetv1 = MobileNet(shape=(224, 224, 3), num_classes=100, alpha=1.0, include_top=True, weights=None)

In [None]:
mobilenetv1.summary()
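When reading the summary, it helps to know what spatial size to expect at each layer. With `padding='same'`, a layer with stride s maps a feature map of size n to ceil(n / s). The sketch below traces this in plain Python. The stride sequence is an assumption (one stride-2 initial convolution followed by 13 depth-wise separable blocks) and should be replaced with the strides of whatever `layers` list is actually in use.

```python
import math


def trace_spatial_size(input_size, strides):
    """Return the feature-map size after each layer, assuming 'same' padding."""
    sizes = []
    size = input_size
    for stride in strides:
        size = math.ceil(size / stride)   # 'same' padding: out = ceil(in / stride)
        sizes.append(size)
    return sizes


# Assumed stride sequence: initial conv, then 13 depth-wise separable blocks.
strides = [2, 1, 2, 1, 2, 1, 2, 1, 1, 1, 1, 1, 2, 1]
sizes = trace_spatial_size(224, strides)
print(sizes)

# If the final block used stride 2 instead, the map would shrink once more.
sizes_last_stride_2 = trace_spatial_size(224, strides[:-1] + [2])
print(sizes_last_stride_2[-1])
```

For a 224x224 input, the MobileNet table expects a 7x7x1024 feature map going into average pooling, so the last entry of `sizes` is a useful sanity check against the `Output Shape` column of the summary.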