# Reducing number of parameters of SSD Model
This notebook shows how to use Low-Rank decomposition to reduce the number of parameters of an SSD model. It assumes that a trained model already exists in the ```Models``` directory. Please refer to the notebook [Object Detection with SSD](SSD.ipynb) for more information about using a pretrained SSD model.

## Load and evaluate the trained model

In [1]:
from fireball import Model, myPrint
from fireball.datasets.coco import CocoDSet
gpus = "upto4"      # Use up to 4 GPUs

# Create the training and test datasets (512x512 images, aspect ratio not preserved)
myPrint('\nPreparing Coco dataset ... ', False)
trainDs,testDs = CocoDSet.makeDatasets('Train,Test', batchSize=128, resolution=512, keepAr=False, numWorkers=4)
trainDs.batchSize = 64      # Batch size used later for re-training
myPrint('Done.')

# Load the trained model and evaluate it on the test dataset
model = Model.makeFromFile("Models/SSD512.fbm", testDs=testDs, gpus=gpus)
model.initSession()
results = model.evaluate()


Preparing Coco dataset ... Done.

Reading from "Models/SSD512.fbm" ... Done.
Creating the fireball model "SSD512" ... Done.
  Processed 5000 Sample. (Time: 55.47 Sec.)                              

Evaluating inference results for 5000 images ... 
  Calculating IoUs - Done (8.2 Seconds)                       
  Finding matches - Done (117.2 Seconds)                     
  Processing the matches - Done (4.1 Seconds)                    
Done (129.5 Seconds)

Average Precision (AP):
    IoU=0.50:0.95   Area: All      MaxDet: 100  = 0.258
    IoU=0.50        Area: All      MaxDet: 100  = 0.476
    IoU=0.75        Area: All      MaxDet: 100  = 0.256
    IoU=0.50:0.95   Area: Small    MaxDet: 100  = 0.102
    IoU=0.50:0.95   Area: Medium   MaxDet: 100  = 0.300
    IoU=0.50:0.95   Area: Large    MaxDet: 100  = 0.379
Average Recall (AR):
    IoU=0.50:0.95   Area: All      MaxDet: 1    = 0.234
    IoU=0.50:0.95   Area: All      MaxDet: 10   = 0.359
    IoU=0.50:0.95   Area: All      MaxDet: 1
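
The AP and AR figures above are reported at different IoU (Intersection over Union) thresholds. As a rough illustration only, and not Fireball's own evaluation code, the IoU of two axis-aligned boxes could be computed as in the sketch below. The `(x1, y1, x2, y2)` corner format and the helper name `box_iou` are assumptions made for this example.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners (assumed format)."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height are clamped at zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 10x10 boxes offset by 5 pixels horizontally: intersection 50, union 150
iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print("IoU = %.3f" % iou)
```

A detection that overlaps its ground-truth box this little (one third) would not count as a match at the IoU=0.50 threshold.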

## Reducing number of parameters
Here we apply Low-Rank Decomposition to different layers of the model to reduce the number of parameters. We first create a list of the layers to decompose, specify our error tolerance (MSE), and pass this information to the [createLrModel](https://interdigitalinc.github.io/Fireball/html/source/model.html#fireball.model.Model.createLrModel) method. This creates a new fireball model and saves it to the file ```Models/SSD512R.fbm```.
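
To give an intuition for what such a decomposition does, here is a minimal NumPy sketch that factors a weight matrix with a truncated SVD and picks the smallest rank whose reconstruction MSE stays within a tolerance. This is a generic technique shown under stated assumptions; it is not taken from Fireball, whose actual rank selection may differ. The helper `low_rank_factors` and the synthetic matrix `w` are hypothetical.

```python
import numpy as np

def low_rank_factors(w, mse_tol):
    """Return (g, h, rank) with g @ h ~= w, using the smallest rank whose MSE <= mse_tol."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    # Dropping the singular values s[r:] leaves a squared error of sum(s[r:]**2),
    # so the MSE at every candidate rank can be read directly from s.
    tail = np.concatenate([np.cumsum((s ** 2)[::-1])[::-1], [0.0]])
    mse_per_rank = tail / w.size            # mse_per_rank[r] is the MSE at rank r
    rank = max(int(np.argmax(mse_per_rank <= mse_tol)), 1)
    g = u[:, :rank] * s[:rank]              # shape (rows, rank)
    h = vt[:rank, :]                        # shape (rank, cols)
    return g, h, rank

# Synthetic stand-in for a flattened weight matrix: rank-32 signal plus small noise
rng = np.random.default_rng(0)
m, n = 1152, 256
w = 0.01 * rng.normal(size=(m, 32)) @ rng.normal(size=(32, n)) + 0.001 * rng.normal(size=(m, n))

mse_tol = 0.00002
g, h, rank = low_rank_factors(w, mse_tol)
mse = np.mean((w - g @ h) ** 2)
print("rank=%d, MSE=%.6f, params: %d -> %d" % (rank, mse, w.size, g.size + h.size))
```

The two factors replace the original matrix, so the layer stores `rank * (rows + cols)` weights instead of `rows * cols`.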

In [2]:
import time

layers = ['S3_L1_CONV', 'S3_L2_CONV', 'S3_L3_CONV',
          'S4_L1_CONV', 'S4_L2_CONV', 'S4_L3_CONV',
          'S5_L1_CONV', 'S5_L2_CONV', 'S5_L3_CONV',
          'S6_L1_CONV', 'S6_L2_CONV',
          'S7_L1_CONV', 'S7_L2_CONV',
          'S8_L2_CONV', 'S9_L2_CONV', 'S10_L2_CONV',
          'S12_L1_AFM']
mse = 0.00002       # Error tolerance (MSE), the same for every layer in the list
layerParams = [(layer, mse) for layer in layers]

myPrint('Now reducing number of network parameters ... ')
t0 = time.time()
model.createLrModel("Models/SSD512R.fbm", layerParams)
myPrint('Done. (%.2f Seconds)'%(time.time()-t0))

Now reducing number of network parameters ... 
  S3_L1_CONV => LR(136), MSE=0.000020, Shape: (1152, 256), Params: 294912->191488 (Reduction: 35.1%)
  S3_L2_CONV => LR(136), MSE=0.000020, Shape: (2304, 256), Params: 589824->348160 (Reduction: 41.0%)
  S3_L3_CONV => LR(160), MSE=0.000021, Shape: (2304, 256), Params: 589824->409600 (Reduction: 30.6%)
  S4_L1_CONV => LR(240), MSE=0.000019, Shape: (2304, 512), Params: 1179648->675840 (Reduction: 42.7%)
  S4_L2_CONV => LR(216), MSE=0.000020, Shape: (4608, 512), Params: 2359296->1105920 (Reduction: 53.1%)
  S4_L3_CONV => LR(208), MSE=0.000020, Shape: (4608, 512), Params: 2359296->1064960 (Reduction: 54.9%)
  S5_L1_CONV => LR(224), MSE=0.000020, Shape: (4608, 512), Params: 2359296->1146880 (Reduction: 51.4%)
  S5_L2_CONV => LR(208), MSE=0.000020, Shape: (4608, 512), Params: 2359296->1064960 (Reduction: 54.9%)
  S5_L3_CONV => LR(200), MSE=0.000020, Shape: (4608, 512), Params: 2359296->1024000 (Reduction: 56.6%)
  S6_L1_CONV => LR(224), MSE=0.00

Compare the new number of parameters with the original 35,644,468. 
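
The per-layer counts in the log are consistent with replacing a `rows x cols` weight matrix by two factors of shapes `rows x rank` and `rank x cols`. The short sketch below rechecks that arithmetic for a few logged layers and for the whole model. The helper `lr_params` is hypothetical, biases are ignored in the per-layer counts, and the reduced total of 23,147,860 is taken from the network configuration printed in the re-training section of this notebook.

```python
def lr_params(rows, cols, rank):
    """Weights in the two factors (rows x rank) and (rank x cols), biases excluded."""
    return rank * (rows + cols)

# (rows, cols, rank) copied from the log lines above
logged = {
    "S3_L1_CONV": (1152, 256, 136),
    "S4_L1_CONV": (2304, 512, 240),
    "S5_L3_CONV": (4608, 512, 200),
}
for name, (rows, cols, rank) in logged.items():
    before, after = rows * cols, lr_params(rows, cols, rank)
    reduction = 100.0 * (before - after) / before
    print("%s: %d -> %d (Reduction: %.1f%%)" % (name, before, after, reduction))

# Whole-model totals: original vs. low-rank model
original_total, reduced_total = 35644468, 23147860
overall = 100.0 * (original_total - reduced_total) / original_total
print("Overall reduction: %.1f%%" % overall)
```

The three per-layer lines reproduce the figures in the log (35.1%, 42.7%, and 56.6%), and the overall reduction comes out at about 35.1%.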

## Evaluating the new model
Let's see the impact of this reduction on the performance of the model.

In [3]:
model = Model.makeFromFile("Models/SSD512R.fbm", testDs=testDs, gpus=gpus)
model.printLayersInfo()
model.initSession()

results = model.evaluate()


Reading from "Models/SSD512R.fbm" ... Done.
Creating the fireball model "SSD512" ... Done.

Scope            InShape       Comments                 OutShape      Activ.   Post Act.        # of Params
---------------  ------------  -----------------------  ------------  -------  ---------------  -----------
IN_IMG                         Image Size: 512x512x3    512 512 3     None                      0          
S1_L1_CONV       512 512 3     KSP: 3 1 s               512 512 64    ReLU                      1,792      
S1_L2_CONV       512 512 64    KSP: 3 1 s               256 256 64    ReLU     MP(KSP):2 2 s    36,928     
S2_L1_CONV       256 256 64    KSP: 3 1 s               256 256 128   ReLU                      73,856     
S2_L2_CONV       256 256 128   KSP: 3 1 s               128 128 128   ReLU     MP(KSP):2 2 s    147,584    
S3_L1_CONV       128 128 128   KSP: 3 1 s, LR136        128 128 256   ReLU                      191,744    
S3_L2_CONV       128 128 256   KSP: 3 1 s, 

## Re-train and evaluate
Here we make a new model from the ```Models/SSD512R.fbm``` file for training. We then call the [train](https://interdigitalinc.github.io/Fireball/html/source/model.html#fireball.model.Model.train) method of the model to start the training. Note that the re-training can take up to 2 hours on a 4-GPU machine.

In [4]:
model = Model.makeFromFile("Models/SSD512R.fbm", trainDs=trainDs, testDs=testDs,
                           numEpochs=5,
                           learningRate=(0.002, 0.0004),   # Exponentially decay from 0.002 to 0.0004
                           optimizer="Momentum",
                           gpus=gpus)
model.printNetConfig()
model.initSession()
model.train()
results = model.evaluate()

model.save("Models/SSD512RR.fbm")


Reading from "Models/SSD512R.fbm" ... Done.
Creating the fireball model "SSD512" ... Done.

Network configuration:
  Input:                     Color images of size 512x512
  Output:                    A tuple of class labels, boxes, class probabilities, and number of detections.
  Network Layers:            28
  Tower Devices:             GPU0, GPU1, GPU2, GPU3
  Total Network Parameters:  23,147,860
  Total Parameter Tensors:   92
  Trainable Tensors:         92
  Training Samples:          82,783
  Test Samples:              5,000
  Num Epochs:                5
  Batch Size:                64
  L2 Reg. Factor:            0     
  Global Drop Rate:          0   
  Learning Rate: (Exponential Decay)
    Initial Value:           0.002        
    Final Value:             0.0004       
  Optimizer:                 Momentum

+--------+---------+---------------+-----------+-------------------+
| Epoch  | Batch   | Learning Rate | Loss      | Valid/Test mAP    |
+--------+---------+------
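
The learning rate in the configuration above decays exponentially from 0.002 to 0.0004. As a rough sketch of what such a schedule looks like, the code below interpolates geometrically between the two values over the whole training run. The per-batch update, the rounding up of the number of batches per epoch, and the helper `exp_decay_lr` are assumptions for illustration; Fireball's actual schedule may differ in its details.

```python
import math

def exp_decay_lr(step, total_steps, lr_start=0.002, lr_end=0.0004):
    """Learning rate at `step`, decaying geometrically from lr_start to lr_end."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return lr_start * (lr_end / lr_start) ** frac

# Training samples and batch size are taken from the network configuration above
batches_per_epoch = math.ceil(82783 / 64)
total_steps = 5 * batches_per_epoch         # 5 epochs

for epoch in range(6):
    step = epoch * batches_per_epoch
    print("after epoch %d: lr = %.6f" % (epoch, exp_decay_lr(step, total_steps)))
```

With a geometric decay, the learning rate is multiplied by the same factor in every epoch rather than reduced by the same amount.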

## Also look at

[Pruning SSD Model](SSD-Prune.ipynb)

[Quantizing SSD Model](SSD-Quantize.ipynb)

[Exporting SSD Model to ONNX](SSD-ONNX.ipynb)

[Exporting SSD Model to TensorFlow](SSD-TF.ipynb)

[Exporting SSD Model to CoreML](SSD-CoreML.ipynb)

________________

[Fireball Playgrounds](../Contents.ipynb)

[Object Detection with SSD](SSD.ipynb)
