# Image Classification with MobileNetV2
MobileNetV2 (https://arxiv.org/abs/1801.04381) builds upon the ideas from MobileNetV1, using depthwise separable convolutions as efficient building blocks. However, V2 introduces two new features to the architecture:
1. linear bottlenecks between the layers, and
2. shortcut connections between the bottlenecks.

Overall, MobileNetV2 is faster than MobileNetV1 at the same accuracy across the entire latency spectrum. It uses 2x fewer operations, needs 30% fewer parameters, and is about 30-40% faster, all while achieving higher accuracy.
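To see why depthwise separable convolutions are efficient, here is a minimal sketch that counts multiply-accumulate operations for a standard convolution and for its depthwise separable counterpart. The function names and the example layer size (112x112, 32 to 64 channels) are illustrative assumptions and are not part of Fireball or taken from the model in this playground.

```python
def standardConvMacs(h, w, cIn, cOut, k=3):
    """Multiply-accumulates of a k x k standard convolution on an h x w output map."""
    return h * w * cIn * cOut * k * k

def separableConvMacs(h, w, cIn, cOut, k=3):
    """Multiply-accumulates of a k x k depthwise conv followed by a 1x1 pointwise conv."""
    depthwise = h * w * cIn * k * k     # one k x k filter per input channel
    pointwise = h * w * cIn * cOut      # 1x1 convolution that mixes the channels
    return depthwise + pointwise

# Hypothetical layer used only for illustration.
h, w, cIn, cOut, k = 112, 112, 32, 64, 3
ratio = separableConvMacs(h, w, cIn, cOut, k) / standardConvMacs(h, w, cIn, cOut, k)
print('Separable / standard cost: %.4f' % ratio)
```

The ratio simplifies to 1/cOut + 1/k², so for 3x3 kernels the separable form needs roughly 8 to 9 times fewer operations than the standard one.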

In this playground we load a pre-trained MobileNetV2 model and run inference and evaluation on the ImageNet dataset.

Note: Fireball uses the OpenCV python package to process images in the ImageNet dataset. If this package is not already installed, run the following command in a new cell and restart the kernel.

```
%pip install opencv-python
```

## Create an ImageNet dataset
Let's first load the ImageNet dataset and print the dataset statistics.

**NOTE**: MobileNetV2 uses the "Crop256Tf" pre-processing for the images. This pre-processing first resizes the image (keeping the aspect ratio) so that its smaller dimension is 256. It then crops a 224x224 image from the center of the resized image. The image is in RGB format and the values are normalized to values between -1 and 1. Please refer to Fireball's ```imagenet.py``` file for more information about different types of pre-processing supported by Fireball.
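The steps described in this note can be sketched with NumPy alone. This is not Fireball's implementation: the function names are hypothetical, the resize uses nearest-neighbor sampling to avoid extra dependencies (Fireball uses OpenCV, whose interpolation will differ), and the normalization to [-1, 1] is assumed to be `x/127.5 - 1`.

```python
import numpy as np

def resizeShorterSide(img, target=256):
    """Nearest-neighbor resize so the smaller dimension becomes `target` (aspect ratio kept)."""
    h, w = img.shape[:2]
    scale = target / min(h, w)
    newH, newW = max(target, round(h * scale)), max(target, round(w * scale))
    rows = np.minimum((np.arange(newH) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(newW) / scale).astype(int), w - 1)
    return img[rows][:, cols]

def centerCrop(img, size=224):
    """Crop a size x size window from the center of the image."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def preprocess(rgbImage):
    """rgbImage: uint8 array of shape (H, W, 3), assumed to already be in RGB order."""
    img = centerCrop(resizeShorterSide(rgbImage, 256), 224)
    return img.astype(np.float32) / 127.5 - 1.0   # assumed mapping of 0..255 to -1..1

# Synthetic 300x400 image standing in for a decoded JPEG.
sample = np.random.randint(0, 256, size=(300, 400, 3), dtype=np.uint8)
out = preprocess(sample)
print(out.shape, out.dtype)
```

Note that OpenCV decodes images in BGR order, so a real pipeline built on it would also need a BGR-to-RGB conversion before this step.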


In [2]:
import numpy as np
import time, os
from fireball import Model, Block, myPrint
from fireball.datasets.imagenet import ImageNetDSet

gpus="0,1,2,3"

myPrint('\nPreparing ImageNet dataset ... ', False)
trainDs,testDs = ImageNetDSet.makeDatasets('Train,Test', batchSize=256, preProcessing='Crop256Tf', numWorkers=8)
myPrint('Done.')
ImageNetDSet.printDsInfo(trainDs, testDs)


Preparing ImageNet dataset ... Done.
ImageNetDSet Dataset Info:
    Number of Classes .............................. 1000
    Dataset Location ............................... /data/ImageNet/
    Number of Training Samples ..................... 1281167
    Number of Test Samples ......................... 50000
    Sample Shape ................................... (224, 224, 3)


## Create a MobileNetV2 Fireball model and print the model information
Let's load the model from a pre-trained Fireball model file and print information about its layers. For reference, MobileNetV2's layer info text and blocks are shown below. Since we already have a trained fbm file for MobileNetV2, we don't need them here.

```
blocks = [ 
    Block('MN1|x_expansion_i,o_outDept_i|' +     # MobileNet Block with Stride 1 No shortcut
          'add|' +
          'CONV_K1_O%x_Ps_B0,BN:ReLU:CLP_H6,DWCN_K3_S1_Ps_B0,BN:ReLU:CLP_H6,CONV_K1_O%o_Ps_B0,BN'),

    Block('MN1S|x_expansion_i,o_outDept_i|' +    # MobileNet Block with Stride 1 With shortcut
          'add|' +
          'CONV_K1_O%x_Ps_B0,BN:ReLU:CLP_H6,DWCN_K3_Ps_B0,BN:ReLU:CLP_H6,CONV_K1_O%o_Ps_B0,BN;ID'),

    Block('MN2|x_expansion_i,o_outDept_i|' +     # MobileNet Block with Stride 2 No shortcut
          'add|' +
          'CONV_K1_O%x_Ps_B0,BN:ReLU:CLP_H6,DWCN_K3_S2_P0x1x0x1_B0,BN:ReLU:CLP_H6,CONV_K1_O%o_Ps_B0,BN')
    ]

layers = ('IMG_S224_D3;CONV_K3_O32_S2_P0x1x0x1_B0,BN:ReLU:CLP_H6;'  +         # Input Stages
          'DWCN_K3_S1_Ps_B0,BN:ReLU:CLP_H6,CONV_K1_O16_S1_Ps_B0,BN;' +        # Block 0
          'MN2_X96_O24,MN1S_X144_O24;' +                                      # Blocks 1, 2
          'MN2_X144_O32,MN1S_X192_O32,MN1S_X192_O32;' +                       # Blocks 3, 4, 5
          'MN2_X192_O64,MN1S_X384_O64,MN1S_X384_O64,MN1S_X384_O64;' +         # Blocks 6, 7, 8, 9
          'MN1_X384_O96,MN1S_X576_O96,MN1S_X576_O96;' +                       # Blocks 10, 11, 12
          'MN2_X576_O160,MN1S_X960_O160,MN1S_X960_O160,MN1_X960_O320;' +      # Blocks 13, 14, 15, 16
          'CONV_K1_O1280_Ps_B0,BN:ReLU:CLP_H6:GAP,FC_O1000:None;CLASS_C1000') # Output Stages
```
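As a rough illustration of how the block sequence above shapes the feature maps, the following sketch walks through the `MN*` tokens and tracks spatial size and depth. It is not a Fireball parser: `traceShapes` is a hypothetical helper, and it assumes that `MN2` blocks halve the spatial size, that `X` is the expanded depth, and that `O` is the output depth, as the block comments suggest. The starting point (112x112x16) is the output of Block 0.

```python
import re

# Stage strings copied from the layer info above (input and output stages omitted).
stages = [
    'MN2_X96_O24,MN1S_X144_O24',
    'MN2_X144_O32,MN1S_X192_O32,MN1S_X192_O32',
    'MN2_X192_O64,MN1S_X384_O64,MN1S_X384_O64,MN1S_X384_O64',
    'MN1_X384_O96,MN1S_X576_O96,MN1S_X576_O96',
    'MN2_X576_O160,MN1S_X960_O160,MN1S_X960_O160,MN1_X960_O320',
]

def traceShapes(stages, size=112, depth=16):
    """Return one (blockName, expansionRatio, outSize, outDepth) tuple per block."""
    trace = []
    for stage in stages:
        for token in stage.split(','):
            name, x, o = re.fullmatch(r'(MN1S|MN1|MN2)_X(\d+)_O(\d+)', token).groups()
            ratio = int(x) // depth      # expanded depth relative to the block's input depth
            if name == 'MN2':            # assumed: stride-2 block halves height and width
                size //= 2
            depth = int(o)
            trace.append((name, ratio, size, depth))
    return trace

trace = traceShapes(stages)
for i, (name, ratio, size, depth) in enumerate(trace, start=1):
    print('Block %2d: %-4s expansion x%d -> %dx%dx%d' % (i, name, ratio, size, size, depth))
```

Under these assumptions every block uses an expansion ratio of 6, and the last block produces a 7x7x320 feature map before the final 1x1 convolution and global average pooling.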

In [3]:
model = Model.makeFromFile("Models/MobileNetV2.fbm", testDs=testDs, gpus=gpus)
model.printLayersInfo()
model.initSession()


Reading from "Models/MobileNetV2.fbm" ... Done.
Creating the fireball model "MobileNetV2" ... Done.

Scope            InShape       Comments                 OutShape      Activ.   Post Act.        # of Params
---------------  ------------  -----------------------  ------------  -------  ---------------  -----------
IN_IMG                         Image Size: 224x224x3    224 224 3     None                      0          
S1_L1_CONV       224 224 3     KSP: 3 2 0x1x0x1         112 112 32    None                      864        
S1_L2_BN         112 112 32                             112 112 32    ReLU     x<6.0            128        
S2_L1_DWCN       112 112 32    KSP: 3 1 s               112 112 32    None                      288        
S2_L2_BN         112 112 32                             112 112 32    ReLU     x<6.0            128        
S2_L3_CONV       112 112 32    KSP: 1 1 s               112 112 16    None                      512        
S2_L4_BN         112 112 16       
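The parameter counts in the first rows of this table can be reproduced by hand. The sketch below assumes that the convolutions have no bias term and that each batch normalization layer stores four values per channel; the helper names are hypothetical, but both assumptions are consistent with the numbers shown above.

```python
def convParams(kernel, inDepth, outDepth):
    """Weights of a standard convolution without bias."""
    return kernel * kernel * inDepth * outDepth

def dwConvParams(kernel, inDepth):
    """Weights of a depthwise convolution without bias (one filter per channel)."""
    return kernel * kernel * inDepth

def bnParams(depth):
    """Assumed four values per channel: scale, offset, moving mean, moving variance."""
    return 4 * depth

print('S1_L1_CONV:', convParams(3, 3, 32))    # 864
print('S1_L2_BN:  ', bnParams(32))            # 128
print('S2_L1_DWCN:', dwConvParams(3, 32))     # 288
print('S2_L3_CONV:', convParams(1, 32, 16))   # 512
```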

## A quick inference demo
Now let's show how this model can be used to classify an image. Here we are using a JPEG image of a coffee mug.
The function ```getPreprocessedImage``` loads, scales, and preprocesses the image before returning it as a NumPy array of floating point numbers. We can pass the preprocessed image to the ```inferOne``` function to get the probabilities for each of the 1000 classes. We then print the top-3 classes with the highest probabilities.

In [4]:
imageFileName = 'CoffeeMug.jpg'
image = testDs.getPreprocessedImage(imageFileName)
classProbs = model.inferOne(image, returnProbs=True)
top3Indexes = np.argsort(classProbs)[-3:][::-1]   # Indexes of classes with 3 highest probs (decreasing order)
top3Probs = classProbs[top3Indexes]
print('Top-3 Classes (For "%s"):'%(imageFileName))
for i in range(3):
    print('    %s (%f)'%(ImageNetDSet.classNames[top3Indexes[i]], top3Probs[i]))


Top-3 Classes (For "CoffeeMug.jpg"):
    coffee_mug (0.777315)
    cup (0.198125)
    espresso (0.005773)
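The top-k selection used above can be tried on its own, without a model. In this small sketch the probability vector and the class names are made up for illustration, and `topK` is a hypothetical helper, not a Fireball function.

```python
import numpy as np

def topK(probs, k=3):
    """Return (indexes, values) of the k largest entries, highest first."""
    indexes = np.argsort(probs)[-k:][::-1]   # argsort is ascending: take the tail, then reverse
    return indexes, probs[indexes]

probs = np.array([0.05, 0.60, 0.10, 0.25])   # made-up class probabilities
names = ['cat', 'dog', 'fox', 'owl']         # made-up class names
idx, vals = topK(probs, k=3)
for i, p in zip(idx, vals):
    print('    %s (%f)' % (names[i], p))
```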


## Evaluation of the model
This code runs inference on all images in the ImageNet test dataset (```testDs```) and compares the predictions with the ground-truth labels. The model's top-1 and top-5 accuracy are then printed.

In [5]:
myPrint('Running inference on %d Test Samples (batchSize:%d, %d towers) ... '%(testDs.numSamples,
                                                                               testDs.batchSize,
                                                                               len(model.towers)))
results = model.evaluate(topK=5)    # Calculate and print top-5 accuracy as well as the default top-1.

Running inference on 50000 Test Samples (batchSize:256, 4 towers) ... 
  Processed 50000 Sample. (Time: 45.06 Sec.)                              

Observed Accuracy: 0.711160
Top-5 Accuracy:   0.900680
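For readers who want to see what the two accuracy numbers measure, here is a minimal sketch of a top-k accuracy calculation on a tiny made-up batch. It only illustrates the metric; `topKAccuracy` is a hypothetical helper, and this is not how ```model.evaluate``` is implemented.

```python
import numpy as np

def topKAccuracy(probs, labels, k=1):
    """Fraction of samples whose true label is among the k highest-probability classes."""
    topIdx = np.argsort(probs, axis=1)[:, -k:]           # indexes of the k largest per row
    hits = (topIdx == labels[:, None]).any(axis=1)       # is the label in that set?
    return hits.mean()

# Made-up probabilities for 3 samples and 4 classes.
probs = np.array([[0.10, 0.70, 0.15, 0.05],
                  [0.40, 0.30, 0.20, 0.10],
                  [0.25, 0.05, 0.30, 0.40]])
labels = np.array([1, 1, 0])

print('Top-1 Accuracy: %f' % topKAccuracy(probs, labels, k=1))
print('Top-2 Accuracy: %f' % topKAccuracy(probs, labels, k=2))
```

With 50000 test samples, the accuracies reported above correspond to 35558 correct top-1 predictions and 45034 samples whose label is among the top 5.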


## Where do I go from here?

[Reducing number of parameters of MobileNetV2 Model](MobileNetV2-Reduce.ipynb)

[Pruning MobileNetV2 Model](MobileNetV2-Prune.ipynb)

[Quantizing MobileNetV2 Model](MobileNetV2-Quantize.ipynb)

[Exporting MobileNetV2 Model to ONNX](MobileNetV2-ONNX.ipynb)

[Exporting MobileNetV2 Model to CoreML](MobileNetV2-CoreML.ipynb)

[Exporting MobileNetV2 Model to TensorFlow](MobileNetV2-TF.ipynb)

---

[Fireball Playgrounds](../Contents.ipynb)

