# TensorFlow Object Detection Colab Framework

This notebook provides an out-of-the-box training pipeline for an Object Detection model.

## Mounting your Google Drive

The easiest way to make large files, such as your training images, available to Colab is to upload them to your Google Drive and then reference their paths. Colab can access Google Drive once it has been mounted.

In [0]:
%tensorflow_version 1.x
from google.colab import drive, files
drive.mount('/content/gdrive')

## Paths to configure

The training directory is where the model checkpoints are stored during and at the end of the training process.

In [0]:
model_dir = '/content/training/'

The output directory is where the model is exported as a frozen inference graph.

In [0]:
# The space is backslash-escaped because this path is passed to shell commands below.
output_directory = r"/content/gdrive/My\ Drive/RollingSwarm/fine_tuned_model"

The paths to the TFRecord files and to the label_map.pbtxt file.

In [0]:
test_record_fname = '/content/gdrive/My Drive/RollingSwarm/output/FirstStage_X/validation_width400.record'

In [0]:
train_record_fname = '/content/gdrive/My Drive/RollingSwarm/output/FirstStage_X/training_width400.record'

In [0]:
label_map_pbtxt_fname = '/content/gdrive/My Drive/RollingSwarm/label_map.pbtxt'
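
A mistyped Drive path tends to surface only much later as a training failure, so it can help to check the configured files up front. The following is a minimal sketch; the helper name `find_missing_paths` is hypothetical and not part of the notebook or the Object Detection API.

```python
import os

def find_missing_paths(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]

# Possible usage with the variables defined above (assumed to be set):
# missing = find_missing_paths(
#     [train_record_fname, test_record_fname, label_map_pbtxt_fname])
# assert not missing, 'Missing input files: {}'.format(missing)
```

Run the check only after the Drive has been mounted; before that, every path under `/content/gdrive` is reported as missing.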

## Configure your main training parameters

These include the number of training and evaluation steps, as well as the default model to be used (its pretrained weights are downloaded automatically in a later cell).

In [0]:
# Number of training steps.
num_steps = 500

# Number of evaluation steps.
num_eval_steps = 500

MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
        'batch_size': 12
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
        'batch_size': 12
    },
    'rfcn_resnet101': {
        'model_name': 'rfcn_resnet101_coco_2018_01_28',
        'pipeline_file': 'rfcn_resnet101_pets.config',
        'batch_size': 8
    }
}

# Pick the model you want to use: any key in `MODELS_CONFIG`.
selected_model = 'ssd_mobilenet_v2'

# Name of the object detection model to use.
MODEL = MODELS_CONFIG[selected_model]['model_name']

# Name of the pipeline file in the TensorFlow Object Detection API.
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']

# Training batch size that fits in the memory of Colab's Tesla K80 GPU for the selected model.
batch_size = MODELS_CONFIG[selected_model]['batch_size']
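
A typo in `selected_model` would otherwise show up as a bare `KeyError`. As a sketch of a friendlier lookup, the hypothetical helper below (not part of the notebook) lists the valid keys in its error message. `SAMPLE_MODELS_CONFIG` is a trimmed, illustrative copy of the table above, named differently so that it does not overwrite the real one.

```python
# Trimmed, illustrative copy of the configuration table above.
SAMPLE_MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
        'batch_size': 12,
    },
}

def get_model_settings(models_config, selected_model):
    """Return the settings dict for `selected_model` or raise a readable error."""
    if selected_model not in models_config:
        raise ValueError('Unknown model {!r}; choose one of: {}'.format(
            selected_model, ', '.join(sorted(models_config))))
    return models_config[selected_model]

settings = get_model_settings(SAMPLE_MODELS_CONFIG, 'ssd_mobilenet_v2')
print(settings['model_name'], settings['batch_size'])
```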

## Setting up the framework

The following cells clone the TensorFlow Models repository and install all necessary dependencies, such as the Protobuf compiler.

In [0]:
!git clone --quiet https://github.com/tensorflow/models.git

In [0]:
!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

In [0]:
%cd models/research

In [0]:
!protoc object_detection/protos/*.proto --python_out=.

In [0]:
import os
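# Append to PYTHONPATH; fall back to an empty string if the variable is not set yet.
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/content/models/research/:/content/models/research/slim/'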

In [0]:
!python object_detection/builders/model_builder_test.py

In [0]:
%cd /content/models/research

In [0]:
import os
# Optionally clear the training directory to start fresh.
!rm -rf {model_dir}
os.makedirs(model_dir, exist_ok=True)

In [0]:
import shutil
import glob
import urllib.request
import tarfile
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = '/content/models/research/pretrained_model'

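# Download the pretrained model archive unless it is already present.
if not os.path.exists(MODEL_FILE):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

# Extract the archive; the context manager closes it automatically.
with tarfile.open(MODEL_FILE) as tar:
    tar.extractall()

# Remove the archive and move the extracted model to DEST_DIR.
os.remove(MODEL_FILE)
if os.path.exists(DEST_DIR):
    shutil.rmtree(DEST_DIR)
os.rename(MODEL, DEST_DIR)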

In [0]:
!echo {DEST_DIR}
!ls -alh {DEST_DIR}

In [0]:
fine_tune_checkpoint = os.path.join(DEST_DIR, "model.ckpt")
fine_tune_checkpoint
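
`fine_tune_checkpoint` is a checkpoint prefix, not a single file, so `os.path.exists(fine_tune_checkpoint)` is not a meaningful test. A minimal sketch of a check, assuming the usual TensorFlow checkpoint layout in which an index file named `<prefix>.index` sits next to the data shards; the helper name `checkpoint_exists` is hypothetical.

```python
import os

def checkpoint_exists(prefix):
    """Return True if the index file for the checkpoint `prefix` is present."""
    return os.path.isfile(prefix + '.index')

# Possible usage with the variable defined above (assumed to be set):
# assert checkpoint_exists(fine_tune_checkpoint), 'Pretrained checkpoint not found'
```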

## Configuring the Training Pipeline

Here the default pipeline file for the selected model is located.

In [0]:
import os
pipeline_fname = os.path.join('/content/models/research/object_detection/samples/configs/', pipeline_file)

assert os.path.isfile(pipeline_fname), '`{}` does not exist'.format(pipeline_fname)

In [0]:
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
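
`get_num_classes` depends on the Object Detection API being importable. As a rough cross-check that needs only the standard library, the `id:` fields in the label map text can be counted directly. This is a sketch that assumes each class is declared as an `item { id: N name: '...' }` block; the helper name `count_label_map_items` and the class names in the sample are made up for illustration.

```python
import re

def count_label_map_items(pbtxt_text):
    """Count the distinct `id: N` entries in label map text (cross-check only)."""
    ids = re.findall(r'\bid\s*:\s*(\d+)', pbtxt_text)
    return len(set(ids))

# Illustrative label map with two hypothetical classes.
sample_label_map = """
item {
  id: 1
  name: 'robot'
}
item {
  id: 2
  name: 'ball'
}
"""
print(count_label_map_items(sample_label_map))
```

If this count disagrees with the value returned by `get_num_classes`, the label map is worth inspecting by hand before training.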

## Replacing default pipeline config values

Certain parameters in the pipeline file are amended via regex to fine-tune the training process. The following cell applies these substitutions and overwrites the pipeline file in place.

In [0]:
import re

num_classes = get_num_classes(label_map_pbtxt_fname)
with open(pipeline_fname) as f:
    s = f.read()
with open(pipeline_fname, 'w') as f:
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(train.record)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub(
        '(input_path: ".*?)(val.record)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)

    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)

    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)

    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    
    # ----------------------------------------------
    # The following lines work well with the ssd_mobilenet_v2. When using any other model, the respective pipeline file will most likely have different parameters.
    # Feel free to comment out some of the following expressions as you see fit.
    # ----------------------------------------------

    # Set minimum scale (the dot is escaped so that it matches a literal '.')
    s = re.sub(r'min_scale: [0-9]\.[0-9]+',
               'min_scale: {}'.format(0.021875), s)
    
    # Set maximum scale
    s = re.sub(r'max_scale: [0-9]\.[0-9]+',
               'max_scale: {}'.format(0.02917), s)
    
    # Remove aspect ratios that do not start with 1 (keeps 1.0)
    s = re.sub(r'aspect_ratios: [02-9]\.[0-9]+\n',
               '', s)
    
    # Set default training height
    s = re.sub('height: [0-9]+',
               'height: {}'.format(300), s)
    
    # Set default training width
    s = re.sub('width: [0-9]+',
               'width: {}'.format(400), s)
    
    # Set default kernel size
    s = re.sub('kernel_size: [0-9]+',
               'kernel_size: {}'.format(1), s)
    
    # Set dropout
#    s = re.sub('use_dropout: false',
#               'use_dropout: true', s)
    
    # Set dropout keep probability
#    s = re.sub('dropout_keep_probability: [0-9].[0-9]+',
#               'dropout_keep_probability: {}'.format(0.6), s)

    # Amend the optimizer from RMS Prop to Adam
    s = re.sub('rms_prop_optimizer: [^}]+}[^}]+}[^}]+}',
               'adam_optimizer: {\n      learning_rate: {\n        constant_learning_rate {\n          learning_rate: 0.001\n        }\n}\n}', s)

    # Change loss type
    s = re.sub('CLASSIFICATION',
               'BOTH', s)
    
    # Replace random cropping with a random vertical flip
    s = re.sub('ssd_random_crop',
               'random_vertical_flip', s)

    # ----------------------------------------------
    # End of custom parameters
    # ----------------------------------------------

    f.write(s)
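
To see what these substitutions do without touching the real file, a few of them can be applied to a small in-memory snippet. The snippet below is a made-up fragment in the style of a pipeline config, and `apply_overrides` is a hypothetical helper, not part of the notebook. It is a sketch of the technique rather than a replacement for the cell above.

```python
import re

# Made-up fragment in the style of a pipeline config; values are illustrative.
sample_config = '''
model {
  ssd {
    num_classes: 90
  }
}
train_config {
  batch_size: 24
  num_steps: 200000
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
'''

def apply_overrides(text, num_classes, batch_size, num_steps, checkpoint):
    """Apply regex replacements of the same kind as the cell above to `text`."""
    text = re.sub(r'num_classes: [0-9]+', 'num_classes: {}'.format(num_classes), text)
    text = re.sub(r'batch_size: [0-9]+', 'batch_size: {}'.format(batch_size), text)
    text = re.sub(r'num_steps: [0-9]+', 'num_steps: {}'.format(num_steps), text)
    # A function replacement keeps backslashes in the path from being
    # interpreted as regex escapes.
    text = re.sub(r'fine_tune_checkpoint: ".*?"',
                  lambda m: 'fine_tune_checkpoint: "{}"'.format(checkpoint), text)
    return text

patched = apply_overrides(sample_config, 3, 12, 500,
                          '/content/models/research/pretrained_model/model.ckpt')
print(patched)
```

Because every pattern is replaced wherever it occurs, a generic key such as `batch_size` is also rewritten in the evaluation section if it appears there, so it is worth inspecting the printed result.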

In [0]:
!cat {pipeline_fname}

In [0]:
!python /content/models/research/object_detection/model_main.py \
    --pipeline_config_path={pipeline_fname} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --num_eval_steps={num_eval_steps}

In [0]:
%load_ext tensorboard
%tensorboard --logdir /content/training/ --port 6001

## Exporting a Trained Inference Graph
Once your training job is complete, you need to export the newly trained inference graph, which will later be used to perform the object detection.

In [0]:
import re
import numpy as np

# Find the checkpoint with the highest step number in the training directory.
lst = os.listdir(model_dir)
lst = [name for name in lst if 'model.ckpt-' in name and '.meta' in name]
steps = np.array([int(re.findall(r'\d+', name)[0]) for name in lst])
last_model = lst[steps.argmax()].replace('.meta', '')

last_model_path = os.path.join(model_dir, last_model)
print(last_model_path)
!python /content/models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path={pipeline_fname} \
    --output_directory={output_directory} \
    --trained_checkpoint_prefix={last_model_path}
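
The checkpoint selection above can also be written without NumPy. The following sketch uses a hypothetical helper, `latest_checkpoint_prefix`, and assumes checkpoint files are named `model.ckpt-<step>.meta`, as in the listing the cell above filters. The file names in the example are illustrative.

```python
import re

def latest_checkpoint_prefix(filenames):
    """Return the `model.ckpt-N` prefix with the highest step N, or None."""
    best_step, best_prefix = -1, None
    for name in filenames:
        match = re.fullmatch(r'(model\.ckpt-(\d+))\.meta', name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    return best_prefix

# Illustrative directory listing.
files = ['checkpoint', 'model.ckpt-0.meta', 'model.ckpt-250.meta',
         'model.ckpt-500.meta', 'model.ckpt-500.index', 'graph.pbtxt']
print(latest_checkpoint_prefix(files))  # model.ckpt-500
```

Returning `None` for an empty directory makes a failed or skipped training run easier to detect than an exception raised by `argmax` on an empty array.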

In [0]:
!ls {output_directory}