<a href="https://colab.research.google.com/github/AIWintermuteAI/aXeleRate/blob/master/resources/aXeleRate_pascal20_detector.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PASCAL-VOC Detection model Training and Inference

In this notebook we will use aXeleRate, a Keras-based framework for AI on the edge, to quickly set up model training and, once the training session is complete, convert the model to .tflite and .kmodel formats.

First, let's take care of some administrative details. 

1) Before we do anything, make sure you have chosen GPU as the Runtime type (in Runtime -> Change Runtime type).

2) We need to mount Google Drive to save our model checkpoints and the final converted model(s). Press the Mount Google Drive button in the Files tab on your left.
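If you prefer to mount Drive from code rather than from the Files tab, the sketch below uses the `google.colab.drive` helper. The `/content/drive` mount point matches the paths used later in this notebook; the `mount_drive` wrapper is only an illustrative name, and it simply returns False when run outside Colab.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return True on success."""
    try:
        from google.colab import drive  # only available in Colab runtimes
    except ImportError:
        print("Not running in Colab, skipping Drive mount.")
        return False
    drive.mount(mount_point)
    return True

mounted = mount_drive()
```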

In the next cell we clone the aXeleRate GitHub repository and import it.

**It is possible to use pip install or python setup.py install, but in that case you will need to restart the environment.** Since I'm trying to make the process as streamlined as possible, I'm using sys.path.append for the import.

In [None]:
%tensorflow_version 1.x
!git clone https://github.com/AIWintermuteAI/aXeleRate.git
import sys
sys.path.append('/content/aXeleRate')
from axelerate import setup_training,setup_inference
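Re-running the cell above appends the same path to `sys.path` each time. That is harmless, but if you want to avoid it, a small guard like the one below works. `add_repo_to_path` is a hypothetical helper name, not part of aXeleRate.

```python
import sys

def add_repo_to_path(repo_dir):
    """Append repo_dir to sys.path once; return True if it was added."""
    repo = str(repo_dir)
    if repo in sys.path:
        return False
    sys.path.append(repo)
    return True

add_repo_to_path("/content/aXeleRate")
```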

At this step you typically need to get the dataset. You can use the !wget command to download it from the Internet, or !cp to copy it from My Drive, as in this example:
```
!cp -r /content/drive/'My Drive'/pascal_20_segmentation.zip .
!unzip -qq pascal_20_segmentation.zip
```
For this notebook we will use the PASCAL-VOC 2012 object detection dataset, which you can download here:

http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/index.html#devkit

I split the dataset into training and validation sets using a simple Python script. Since most of the models trained with aXeleRate are meant to run on embedded devices, and thus have memory and latency constraints, the validation images are easier than most of the images in the training set. Each validation image contains one or more instances of a single class, with no mixed classes in one image.
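The split script itself is not shown here. As a rough sketch of the idea described above (validation images contain a single class), the hypothetical helper below takes a mapping from image name to the class names annotated in it and moves some of the single-class images to validation. The function name, the validation fraction, and the fixed seed are all assumptions.

```python
import random

def split_single_class(image_classes, valid_fraction=0.1, seed=0):
    """Split image names into (train, valid); valid holds only single-class images.

    image_classes maps an image name to the list of class names annotated in it.
    """
    single = sorted(name for name, classes in image_classes.items()
                    if len(set(classes)) == 1)
    rng = random.Random(seed)
    rng.shuffle(single)
    n_valid = int(len(image_classes) * valid_fraction)
    valid = set(single[:n_valid])
    train = sorted(name for name in image_classes if name not in valid)
    return train, sorted(valid)

example = {
    "a.jpg": ["dog"],
    "b.jpg": ["dog", "person"],   # mixed classes, always stays in training
    "c.jpg": ["cat", "cat"],      # two instances of one class
    "d.jpg": ["car"],
    "e.jpg": ["bus", "car"],      # mixed classes, always stays in training
}
train, valid = split_single_class(example, valid_fraction=0.4)
```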

Let's visualize the validation images of our detection dataset. We use num_imgs=10 to show only the first 10 images. Feel free to change the number to None to see all 100 images.


In [None]:
!gdown https://drive.google.com/uc?id=1xgk7svdjBiEyzyUVoZrCz4PP6dSjVL8S  #pascal-voc dataset
!gdown https://drive.google.com/uc?id=1-ccBXBEUhyzG2_jopf6d13hTH9X7qpNl  #pre-trained model
!unzip -qq pascal_20_detection.zip

from axelerate.networks.yolo.backend.utils.augment import visualize_dataset

visualize_dataset(img_folder='pascal_20_detection/imgs_validation', ann_folder='pascal_20_detection/anns_validation', num_imgs=10, img_size=224, jitter=True)

The next step is defining a config dictionary. Most lines are self-explanatory.

Type is the model frontend: Classifier, Detector or Segnet.

Architecture is the model backend (feature extractor):

- Full Yolo
- Tiny Yolo
- MobileNet1_0
- MobileNet7_5 
- MobileNet5_0 
- MobileNet2_5 
- SqueezeNet
- NASNetMobile
- DenseNet121
- ResNet50

For more information on anchors, see:
https://github.com/pjreddie/darknet/issues/568
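As a quick illustration, the flat `anchors` list used in the config can be read as (width, height) pairs. The sketch below assumes the YOLOv2 convention that anchor sizes are expressed in grid-cell units, and assumes a 32-pixel cell purely for the conversion to pixels; check the aXeleRate source before relying on either assumption.

```python
def anchor_pairs(flat_anchors):
    """Group a flat [w0, h0, w1, h1, ...] list into (w, h) tuples."""
    if len(flat_anchors) % 2 != 0:
        raise ValueError("anchors must contain an even number of values")
    return list(zip(flat_anchors[0::2], flat_anchors[1::2]))

anchors = [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434,
           7.88282, 3.52778, 9.77052, 9.16828]
CELL_SIZE = 32  # assumed cell size in pixels, not taken from this notebook

for w, h in anchor_pairs(anchors):
    print(f"{w:.2f} x {h:.2f} cells ~ {w * CELL_SIZE:.0f} x {h * CELL_SIZE:.0f} px")
```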

Labels are the class labels present in your dataset.
IMPORTANT: Please list all the labels present in the dataset.
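One way to make sure no label is missed is to collect the class names directly from the annotation files. The sketch below assumes PASCAL-VOC style XML annotations, where each `<object>` element has a `<name>` child; `collect_labels` is a hypothetical helper, not part of aXeleRate.

```python
import glob
import os
import xml.etree.ElementTree as ET

def collect_labels(ann_folder):
    """Return the sorted list of unique object names found in VOC-style XML files."""
    labels = set()
    for path in glob.glob(os.path.join(ann_folder, "*.xml")):
        root = ET.parse(path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                labels.add(name.strip())
    return sorted(labels)

# Example, using the annotation folder from this notebook's dataset:
# print(collect_labels("pascal_20_detection/anns"))
```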

object_scale determines how much to penalize wrong prediction of confidence of object predictors

no_object_scale determines how much to penalize wrong prediction of confidence of non-object predictors

coord_scale determines how much to penalize wrong position and size predictions (x, y, w, h)

class_scale determines how much to penalize wrong class prediction
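To make the role of the four scales concrete, here is a simplified sketch of how they weight the parts of a YOLO-style loss. The per-term loss values are made-up numbers and the actual aXeleRate loss is more involved; only the weighting idea is shown.

```python
def weighted_yolo_loss(coord_loss, object_loss, no_object_loss, class_loss,
                       coord_scale=1.0, object_scale=5.0,
                       no_object_scale=1.0, class_scale=1.0):
    """Combine the four loss terms using the scale factors from the config."""
    return (coord_scale * coord_loss
            + object_scale * object_loss
            + no_object_scale * no_object_loss
            + class_scale * class_loss)

# With object_scale=5.0 and no_object_scale=1.0, a confidence error from an
# object predictor counts five times as much as the same error from a
# non-object predictor.
total = weighted_yolo_loss(coord_loss=0.5, object_loss=0.2,
                           no_object_loss=0.1, class_loss=0.3)
print(total)
```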

**Since this is an example notebook, we will use pretrained weights and set all layers of the model to be "frozen" (non-trainable).**

In [None]:
config = {
        "model":{
            "type":                 "Detector",
            "architecture":         "MobileNet1_0",
            "input_size":           224,
            "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "labels":               ["person", "bird", "cat", "cow", "dog", "horse", "sheep", "aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train","bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor"],
            "coord_scale" : 		1.0,
            "class_scale" : 		1.0,
            "object_scale" : 		5.0,
            "no_object_scale" : 	1.0
        },
        "weights" : {
            "full":   				"/content/2020-04-12_17-09-43.h5",
            "backend":   		    "imagenet"
        },
        "train" : {
            "actual_epoch":         1,
            "train_image_folder":   "pascal_20_detection/imgs",
            "train_annot_folder":   "pascal_20_detection/anns",
            "train_times":          1,
            "valid_image_folder":   "pascal_20_detection/imgs_validation",
            "valid_annot_folder":   "pascal_20_detection/anns_validation",
            "valid_times":          1,
            "valid_metric":         "mAP",
            "batch_size":           32,
            "learning_rate":        1e-4,
            "saved_folder":   		F"/content/drive/My Drive/pascal20_detection",
            "first_trainable_layer": "reshape_1",
            "augumentation":				False,
            "is_only_detect" : 		False
        },
        "converter" : {
            "type":   				["k210","tflite"]
        }
    }
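Before starting a long training run, it can save time to confirm that the dataset folders named in the config actually exist. The helper below is a hypothetical sketch that only inspects the four folder keys shown in the `train` section above.

```python
import os

def missing_folders(config):
    """Return the dataset folder paths from config['train'] that do not exist."""
    keys = ["train_image_folder", "train_annot_folder",
            "valid_image_folder", "valid_annot_folder"]
    return [config["train"][k] for k in keys
            if not os.path.isdir(config["train"][k])]

# Example: an empty list means all four folders were found.
# print(missing_folders(config))
```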

Let's check what GPU we have been assigned in this Colab session, if any.

In [None]:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
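If you only want the GPU name and memory, and do not want to depend on a particular TensorFlow version, querying `nvidia-smi` is an alternative. This sketch assumes the tool is on the PATH when a GPU runtime is attached, and returns None otherwise; `gpu_summary` is an illustrative name.

```python
import shutil
import subprocess

def gpu_summary():
    """Return 'name, total memory' lines from nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None

print(gpu_summary())
```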

Finally, we start the training by passing the config dictionary defined earlier to the setup_training function. The function starts training with two callbacks: Reduce Learning Rate on Plateau and save on best mAP. After every epoch, the mAP of the model's predictions is measured on the validation dataset. Once training has stopped, the best model is converted into the formats specified in the config and saved to the project folder.
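For intuition, the two callbacks behave roughly like the loop below. This is a plain-Python simplification, not aXeleRate's or Keras's code: monitoring mAP for the learning-rate reduction, the `patience` and `factor` values, and the mAP numbers are all assumptions made for illustration.

```python
def simulate_training(map_per_epoch, lr=1e-4, factor=0.5, patience=2):
    """Track the best mAP and shrink lr after `patience` epochs without a gain."""
    best_map, best_epoch, wait = float("-inf"), None, 0
    for epoch, current in enumerate(map_per_epoch, start=1):
        if current > best_map:
            # A real callback would save the model weights at this point.
            best_map, best_epoch, wait = current, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # reduce the learning rate on a plateau
                wait = 0
    return best_epoch, best_map, lr

best_epoch, best_map, final_lr = simulate_training([0.30, 0.35, 0.34, 0.33, 0.36])
print(best_epoch, best_map, final_lr)
```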

Let's train for one epoch to see how the whole pipeline works.

In [None]:
from keras import backend as K 
K.clear_session()
model_path = setup_training(config_dict=config)

After training, it is a good idea to check the actual performance of your model by running inference on your validation dataset and visualizing the results. This is exactly what the next block does. Our model used pre-trained weights, and since all the layers were set as non-trainable, we are just observing the performance of the model that was trained before.

In [None]:
from keras import backend as K 
K.clear_session()
setup_inference(config, model_path)
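Since the converted models are saved to the project folder, a quick way to locate them is to search that folder by file extension. The sketch assumes the files end in `.tflite` and `.kmodel` (the formats named at the top of this notebook); the exact subfolder layout is not specified here, so it searches recursively.

```python
import os

def find_converted_models(folder, extensions=(".tflite", ".kmodel")):
    """Recursively list files under folder whose names end with one of extensions."""
    found = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.endswith(extensions):
                found.append(os.path.join(root, name))
    return sorted(found)

# Example: find_converted_models("/content/drive/My Drive/pascal20_detection")
```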

To train the model from scratch, use the following config and then run the training and (optionally) inference cells again.

In [None]:
config = {
        "model":{
            "type":                 "Detector",
            "architecture":         "MobileNet1_0",
            "input_size":           224,
            "anchors":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],
            "labels":               ["person", "bird", "cat", "cow", "dog", "horse", "sheep", "aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train","bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor"],
            "coord_scale" : 		1.0,
            "class_scale" : 		1.0,
            "object_scale" : 		5.0,
            "no_object_scale" : 	1.0
        },
        "weights" : {
            "full":   				"",
            "backend":   		    "imagenet"
        },
        "train" : {
            "actual_epoch":         100,
            "train_image_folder":   "pascal_20_detection/imgs",
            "train_annot_folder":   "pascal_20_detection/anns",
            "train_times":          1,
            "valid_image_folder":   "pascal_20_detection/imgs_validation",
            "valid_annot_folder":   "pascal_20_detection/anns_validation",
            "valid_times":          1,
            "valid_metric":         "mAP",
            "batch_size":           32,
            "learning_rate":        1e-4,
            "saved_folder":   		F"/content/drive/My Drive/pascal20_detection",
            "first_trainable_layer": "",
            "augumentation":				False,
            "is_only_detect" : 		False
        },
        "converter" : {
            "type":   				["k210","tflite"]
        }
    }
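Compared with the first config, only three values change: the `full` weights path, `actual_epoch`, and `first_trainable_layer`. If you would rather not maintain two copies, a sketch like the following derives the second config from the first. `full_training_config` is a hypothetical helper, and the `base` dict here is abbreviated for illustration.

```python
import copy

def full_training_config(base, epochs=100):
    """Return a modified copy of base; the original dict is left untouched."""
    cfg = copy.deepcopy(base)
    cfg["weights"]["full"] = ""                 # no previously trained model
    cfg["train"]["actual_epoch"] = epochs
    cfg["train"]["first_trainable_layer"] = ""  # empty string, as in the config above
    return cfg

base = {
    "weights": {"full": "/content/2020-04-12_17-09-43.h5", "backend": "imagenet"},
    "train": {"actual_epoch": 1, "first_trainable_layer": "reshape_1"},
}
new_cfg = full_training_config(base)
```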

In [None]:
from keras import backend as K 
K.clear_session()
model_path = setup_training(config_dict=config)

In [None]:
from keras import backend as K 
K.clear_session()
setup_inference(config, model_path)

Good luck and happy training! Have a look at these articles, which explain how to get the most out of Google Colab or how to connect to a local runtime if no GPUs are available:

https://medium.com/@oribarel/getting-the-most-out-of-your-google-colab-2b0585f82403

https://research.google.com/colaboratory/local-runtimes.html