# SlimYOLOv3 in Colab

This Google Colaboratory notebook demonstrates channel pruning and inference using a TensorFlow implementation of SlimYOLOv3 on the VisDrone2018-DET dataset.

First, we will mount Google Drive to access the models.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/My\ Drive/SlimYOLOv3-tf/

Next, we will install the requirements for the project.

In [None]:
!pip install -r requirements.txt
import tensorflow as tf
import tensorflow_addons as tfa
import keras_flops

## Demonstration of SlimYOLOv3 after Pruning and Fine Tuning

Now you are ready to try out the model that I pruned and trained.

In [None]:
from IPython.display import display, Image
from analysis.predict_image import image_processor
import tensorflow as tf

# https://stackoverflow.com/a/37061069
import PIL.Image
from io import BytesIO
import IPython.display
import numpy as np
def showarray(a, fmt='png'):
    a = np.uint8(a)
    f = BytesIO()
    PIL.Image.fromarray(a).save(f, fmt)
    IPython.display.display(IPython.display.Image(data=f.getvalue()))

model = tf.keras.models.load_model('checkpoints/1-finetuned')
image_processor(model, 'subset/0000193_00000_d_0000103.jpg', showarray)

The above image is an example of the strong bias towards cars in the detections. This comes from a bias in the dataset: because the VisDrone dataset consists of street images in China, there are many more cars in the images than pedestrians or motorcyclists. Feel free to change the model or the image specified. Note that getting an accurate measure of latency in Colab can be difficult.

## Example of Channel Pruning

The core portion of my implementation is the channel pruning utility for TensorFlow. The syntax is as shown below:

```{python3}
prune(input_file, layer_ratio, total_ratio, output_file)
```

The pruning function requires two pruning hyperparameters: the layer ratio and the total ratio. First, a threshold for weights that are considered insignificant is obtained from the total ratio. For example, if the total ratio is 0.5, then the bottom 50% of channel weights are considered insignificant to the functioning of the model and can be removed. They are only removed if the layer ratio is also met. For example, if the layer ratio is 0.1, then channels are not removed from a layer until 10% of the channels in that layer are unimportant. If either threshold is not met, a channel is not pruned. Below is the general syntax for calling this function.

```{python3}
from training.prune import prune
prune('sparse', 0.1, 0.5, 'pruned')
```
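
The two-threshold rule described above can be sketched in NumPy as follows. This is only an illustration, not the code in `training.prune`: the function name `select_channels_to_prune`, the use of per-channel importance scores (for example, absolute batch-normalization scale factors), and the exact comparison operators are all assumptions.

```python
import numpy as np

def select_channels_to_prune(layer_scales, layer_ratio, total_ratio):
    """Return the global threshold and a boolean prune mask per layer.

    layer_scales: dict mapping a layer name to a 1-D array of per-channel
    importance scores (hypothetical input format).
    """
    # Global threshold: the score below which `total_ratio` of all channels fall.
    all_scores = np.concatenate([np.abs(s) for s in layer_scales.values()])
    threshold = np.quantile(all_scores, total_ratio)

    masks = {}
    for name, scores in layer_scales.items():
        unimportant = np.abs(scores) < threshold
        # Only prune a layer once enough of its channels are unimportant.
        if unimportant.mean() >= layer_ratio:
            masks[name] = unimportant
        else:
            masks[name] = np.zeros_like(unimportant)
    return threshold, masks

# Toy example with two layers of four channels each.
scales = {
    "conv_a": np.array([0.1, 0.2, 0.3, 1.0]),
    "conv_b": np.array([0.9, 0.8, 0.7, 0.05]),
}
threshold, masks = select_channels_to_prune(scales, layer_ratio=0.5, total_ratio=0.5)
print(threshold, masks)
```

In this toy example, three of the four channels in `conv_a` fall below the global threshold, so they are marked for pruning, while `conv_b` has only one such channel and is left untouched because it does not reach the layer ratio.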

In [None]:
from training.prune import prune
model = tf.keras.models.load_model('checkpoints/1-sparse')
flops = keras_flops.get_flops(model, batch_size=1)
print(f"Before Pruning FLOPS: {flops / 10 ** 9:.03} G")
new_model = prune('checkpoints/1-sparse', 0.1, 0.5, 'checkpoints/1-pruned')
model.summary()
new_model.summary()

There is a lot of output from this cell, but the key results are that the pruning process reduced the number of operations in the model from 152 billion floating point operations (FLOPs) to just 69.7 billion. The number of parameters in the model also fell from 63.9 million to just 29.1 million.
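
As a quick check on those figures, the relative reductions can be computed directly from the numbers quoted above. The helper name `reduction` is just for illustration.

```python
def reduction(before, after):
    """Return the fractional reduction going from `before` to `after`."""
    return 1 - after / before

flops_reduction = reduction(152e9, 69.7e9)    # operations before and after pruning
param_reduction = reduction(63.9e6, 29.1e6)   # parameters before and after pruning

print(f"Operations reduced by {flops_reduction:.1%}")  # 54.1%
print(f"Parameters reduced by {param_reduction:.1%}")  # 54.5%
```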

## Reproduction of the Evaluation

This cell calculates all of the numbers in Table 1 in the paper. Unfortunately, TensorFlow 2 removed the field that allows direct access to the number of trainable parameters, so read that number for each model from the "Trainable params" line of its summary. Latency is very unpredictable on Colab because it is a shared system. Running late at night from an account without much recent Colab use seems to give the most consistent results.
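
If you would rather compute the parameter count than read it off the summary, one option is to sum the sizes of a model's trainable weights. The sketch below assumes only that each weight exposes a `.shape`; the helper name `count_parameters` is hypothetical, and NumPy arrays stand in for real weights so that the example runs on its own.

```python
import math
import numpy as np

def count_parameters(weights):
    """Sum the number of elements over an iterable of array-like weights."""
    return sum(math.prod(w.shape) for w in weights)

# Stand-ins for a 3x3 convolution kernel (16 -> 32 channels) and its bias.
fake_weights = [np.zeros((3, 3, 16, 32)), np.zeros((32,))]
print(count_parameters(fake_weights))  # 3*3*16*32 + 32 = 4640
```

With a loaded Keras model, this should be callable as `count_parameters(model.trainable_weights)`, assuming every weight has a fully defined shape.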

In [None]:
import timeit
import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa
import keras_flops

dummy = tf.ones((1, 608, 608, 3))
for path in ['checkpoints/0-unpruned', 'checkpoints/1-sparse', 'checkpoints/1-pruned', 'checkpoints/1-finetuned']:
  model = tf.keras.models.load_model(path)
  flops = keras_flops.get_flops(model, batch_size=1)
  model.summary() # for number of (trainable) parameters
  print(f"{path} FLOPS: {flops / 10 ** 9:.03} G")
  # timeit returns the total time for all 5 forward passes, not the time per pass
  print(f'{path} Latency: {timeit.timeit("model(dummy)", globals=vars(), number=5)} s')
  del model
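
Since single measurements on a shared machine are noisy, one way to make the latency numbers more stable is to run a few warm-up calls and then report the median of several timed runs. This is a sketch rather than the procedure used for the paper's table; `median_latency` and its parameters are hypothetical names, and a plain Python function stands in for `model(dummy)`.

```python
import statistics
import timeit

def median_latency(fn, warmup=2, repeats=5):
    """Return the median wall-clock time in seconds of one call to `fn`."""
    # Warm-up calls absorb one-time costs such as tracing or cache fills.
    for _ in range(warmup):
        fn()
    # Time each call separately so a single slow run does not dominate.
    times = timeit.repeat(fn, number=1, repeat=repeats)
    return statistics.median(times)

def fake_inference():
    # Stand-in workload; replace with `lambda: model(dummy)` in the notebook.
    return sum(i * i for i in range(10_000))

latency = median_latency(fake_inference)
print(f"Median latency: {latency:.6f} s")
```

The median is used instead of the mean because occasional stalls on shared hardware skew the mean much more than the median.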