# Object Detection: An End to End Training Pipeline (Part Three)
In this notebook, we shall finally train an object detection model using the [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection).
***
**Objectives:**
1. Select and configure a pretrained model.
2. Initiate `Tensorboard` to monitor the training process.
3. Train an object detection model.
4. Export the resulting model.
**NOTE: The notebook has a number of TO-DO exercises for you to complete. Please endeavor to attempt them.**


We shall borrow a number of steps from the previous notebooks, especially `Object Detection: Environment Setup and Creating TFRecords (Part Two)`, including:
- Environment setup
- Loading the dataset.
- Setting the workspace.
- Converting the dataset to TFRecords.

Steps:
1. Configure the notebook as we did in the last notebook, up to the conversion to TFRecords.
2. Download and configure a pretrained model.
3. Initiate `Tensorboard` for training monitoring.
4. Train the model.
5. Export the resulting model for object detection tasks.

### 0.0 Notebook Configuration
As in the past notebooks, in this section we install all the libraries that are needed for us to train an object detection model.

In [None]:
#@title [IMPORTANT] Run Cell to Set up Environment {display-mode: "form"}
%%capture
# 1. Create a directory named `detection`; this is where the model files will be stored.
# 2. Download and install the Tensorflow Object Detection API models using [Git](https://en.wikipedia.org/wiki/Git) into the `detection` folder.
# 3. Install the required libraries for the API.
# 4. Install TF-Slim, a lightweight library for defining, training and evaluating complex models in TensorFlow.

# We shall import the os library, it provides a number of functions that allow 
# interaction with the operating environment.
import os
import json
import sys
import glob
import urllib
import io
import xml.etree.ElementTree as ET
import argparse
from pathlib import Path
import numpy as np
from shutil import copyfile
import csv
import pandas as pd
from PIL import Image
from tqdm import tqdm
import tensorflow.compat.v1 as tf
from collections import namedtuple

# 1. Create a directory named detection
!mkdir detection
os.chdir('detection')

# 2a. Download the Tensorflow Object Detection API models using Git into the detection folder.
#     We use the git command to get the source files from GitHub; git is preinstalled in the Google Colab environment.
!git clone --depth 1 https://github.com/tensorflow/models.git

# 2b. Add the folder to the path; this allows us to import its
#     scripts as we do with installed libraries.
os.environ['PYTHONPATH'] += ':'+'/content/detection/models'

# 3. Install the required libraries for the API. The libraries are installed
#    from the setup.py.

# 3a. First we shall move into models/research; the API is installed
#     from this directory.
%cd models/research

# Import dataset preparations from the object detection folder
from object_detection.utils import dataset_util

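# 3b. Compile the protobuf definitions first, so that the generated *_pb2.py
#     files exist before the API is installed.
!protoc object_detection/protos/*.proto --python_out=.

# 3c. Copy the TF2 setup.py script into the current directory and install the
#     API together with the libraries it requires.
!cp object_detection/packages/tf2/setup.py .
!pip install .

# 4. Install TF-Slim, a lightweight library for defining, training and evaluating complex models in TensorFlow.
!pip install tf_slim

# 5a. Store the current directory (models/research).
pwd = os.getcwd()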

# 5b. Add TF_slim to the system path.
os.environ['PYTHONPATH'] += f':{pwd}:{pwd}/slim';

# 6. Move back to the detection folder.
%cd ../../

def split_indices(x, train=0.8, test=0.0, validate=0.2, shuffle=True):
    """
      Returns the indices at which the data is split.
    """
    # split training data
    n = len(x)
    v = np.arange(n)
    if shuffle:
        np.random.shuffle(v)

    i = round(n * train)  # train
    j = round(n * test) + i  # test
    k = round(n * validate) + j  # validate
    return v[:i], v[i:j], v[j:k]  # return indices

def split_files(file_names,train=0.8, test=0.2, validate=0.0):
    """
      Split the files provided according to the specified distributions.

      file_names  this is a list of file names that are split.
      train       the distribution for the train files, 0 <= x <= 1
      test        the distribution for the test files, it should complement train to add to one.
                  i.e. if train is 0.8, test should be 0.2 such that 0.8 + 0.2 = 1.0
      validate    this is the distribution for the validation set, it should also 
                  complement the train and test values i.e if train is 0.8, test is 0.1, validate
                  should be 0.1 such that 0.8 + 0.1 + 0.1 = 1.0

      Returns:
        (tuple) train, test, val
                train   a list containing the file names of the train set
                test    a list containing the file names of the test set
                val     a list containing the file name of the validation set
    """
    # Drop empty names and sort, so the split is taken over a clean list.
    file_names = sorted(name for name in file_names if len(name) > 0)
    i, j, k = split_indices(file_names, train=train, test=test, validate=validate)

    # Map each array of indices back to the corresponding file names.
    # An empty index array simply yields an empty list.
    train = [file_names[ix] for ix in i]
    test = [file_names[ix] for ix in j]
    val = [file_names[ix] for ix in k]

    return train, test, val

def json_to_csv(file_names, images_dir, labels_path, annotations_dir, label_file_name):
  """
  Converts a JSON file to a csv file.

  file_names       list of file names.
  images_dir       path to the images directory
  labels_path      is the path to the labels JSON file.
  annotations_dir  is the directory in which the annotations will be stored.
  label_file_name  is the name of the output .csv file i.e. labels_train.csv
  """
  # Load the COCO file; the with-statement closes it automatically.
  with open(labels_path, 'r') as f:
    COCO_DATA = json.load(f)

  images = COCO_DATA["images"]
  annotations = COCO_DATA["annotations"]
      
  # Generating the csv in the annotations folder under data directory. (Ideally)
  csv_file_name = os.path.join(str(annotations_dir), label_file_name)

  class_name = 'brownspot' # Normally, there will be more than one class, extract accordingly.

  with open(csv_file_name, 'w') as csv_label_file:
    f = csv.writer(csv_label_file)
    f.writerow(['file_name', 'width', 'height', 'class', 'xmin', 'ymin', 
                'xmax', 'ymax'])

    for file_name in tqdm(file_names, desc = "Processing CSV"):
        id = None
        for image in images:
          if file_name == image['file_name']:
            id = image['id']

        im = Image.open(os.path.join(images_dir, file_name))
        width, height = im.size

        for annotation in annotations:
            if id == annotation['image_id']:
                bbox = annotation['bbox']

                # COCO bbox label format: [xmin, ymin, width, height]
                xmin = bbox[0]
                xmax = bbox[0] + bbox[2]
                ymin = bbox[1]
                ymax = bbox[1] + bbox[3]

                # Write to .csv file.
                f.writerow([file_name, width, height, class_name, xmin, ymin, xmax, ymax])  

def create_pbtxt(annotations_dir):
  """
    Creates a pbtxt file.

    TensorFlow requires a label map, which maps each of the used labels 
    to an integer values. This label map is used both by the training and detection 
    processes. Notice the labels are one-indexed i.e. start at 1 (one).

    Example:
    # example.pbtxt
    item {
      id: 1
      name: 'cat'
    }

    item {
      id: 2
      name: 'dog'
    }

    The file is stored under the annotations folder.
  """
  # Create the label map
  label_map_path = os.path.join(annotations_dir, "label_map.pbtxt")
  pbtxt_content = ""

  class_name = 'brownspot' # Could be more than one class name.

  pbtxt_content = (
      pbtxt_content
      + "item {{\n    id: {0}\n    name: '{1}'\n}}\n\n".format(1, class_name)
  )
  pbtxt_content = pbtxt_content.strip()
  with open(label_map_path, "w") as f:
      f.write(pbtxt_content)

def move_files(files, source, dest):
  """Move files from the source directory to the destination directory."""
  for filename in files:
    copyfile(os.path.join(source, filename),
                 os.path.join(dest, filename))
    
def class_text_to_int(row_label):
  """Map a class name to its integer id in the label map."""
  if row_label == 'brownspot': # the respective class_name
    return 1
  else:
    return None

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.io.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    # check if the image format is matching with your images.
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def create_tfrecord(labels_file, images_dir, annotations_dir):
  """
    Create a tfrecord from the .csv file.
  """
  csv_name = labels_file.split('/')[-1][:-4]
  writer = tf.io.TFRecordWriter(os.path.join(annotations_dir, csv_name + '.record'))
  path = os.path.join(images_dir)
  examples = pd.read_csv(labels_file)
  grouped = split(examples, 'file_name')
  for group in grouped:
    tf_example = create_tf_example(group, path)
    writer.write(tf_example.SerializeToString())
  writer.close()
  print('Successfully created the TFRecords: {}'.format(csv_name + '.record'))

# The training helper function.
def train_model(path):
  """
    This is a helper function to aid with the training.
  """
  !python workspace/data/model_main_tf2.py \
    --model_dir=workspace/data/training \
    --pipeline_config_path={path}

# The model exporter helper function.
def export_model(config_path, checkpoint_dir, export_dir):
  """
    This function calls the model exporter script that creates the graph format
    of the model that can be exported.
  """
  !python workspace/data/exporter_main_v2.py \
    --input_type image_tensor \
    --pipeline_config_path {config_path} \
    --trained_checkpoint_dir {checkpoint_dir} \
    --output_directory {export_dir}


# Finally, we prepare the workspace folders as discussed in the last notebook. 
# Refer to it for details.
# Let's go ahead and create the folders.
!mkdir workspace
!mkdir workspace/data
!mkdir workspace/data/annotations
!mkdir workspace/data/images
!mkdir workspace/data/images/train
!mkdir workspace/data/images/test 
!mkdir workspace/data/pre-trained-model
!mkdir workspace/data/training

# Copy model_main_tf2.py and exporter_main_v2.py into detection/workspace/data.
# This is to aid with the training and exporting later on.
copyfile('/content/detection/models/research/object_detection/model_main_tf2.py',\
         '/content/detection/workspace/data/model_main_tf2.py')

copyfile('/content/detection/models/research/object_detection/exporter_main_v2.py', \
         '/content/detection/workspace/data/exporter_main_v2.py')

Next, we shall upload the dataset shared for this notebook. The dataset contains 100 images of passion fruit leaves diseased with [Brown Spot Disease](https://rifkatunda.github.io/knowledge-center/#brownspot) and the corresponding labels in the `COCO` format.
***
In this notebook, we shall train a model to detect the brownspot patches on each leaf. (Refer to the first notebook, `Object Detection: Data Parsing and Visualization (Intro: One)`, for visualization.)

In [None]:
# 1. The dataset is uploaded to the content directory, to unzip it, we have two 
#    options, either to move back in the directory tree or use the full path.
# We shall use the full path in this notebook.
!unzip /content/dataset.zip

# This command unzips the folder into your current directory, so it is important
# to always know where you are in the directory tree.
# Use the !pwd command to find out.

In the next step, we shall follow the sub-steps listed below:
- Split the dataset into the train set and the test set. 
- Create the respective labels.csv for the test and train set.
- Create the `label_map.pbtxt` file.
- Move files to the respective prepared directories.
- Generate TFRecords from the datasets.
***
**NOTE: This sub-section is left as a TO-DO exercise for you. Please refer to `Object Detection: Environment Setup and Creating TFRecords (Part Two)` for a recap.**
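
The `create_pbtxt` helper from the setup cell writes a single hard-coded class. As a hedged sketch of how the same label map format could be extended to several classes, the snippet below builds the text with one-indexed ids. The function name `build_label_map` and the second class `healthy` are hypothetical and only serve the illustration; they are not part of this dataset or of the Object Detection API.

```python
def build_label_map(class_names):
    """Return label map text with one `item` block per class, ids starting at 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        # Same layout as the create_pbtxt helper: id first, then the name.
        items.append("item {{\n    id: {0}\n    name: '{1}'\n}}".format(index, name))
    return "\n\n".join(items)

# 'healthy' is a made-up second class, used here only to show the numbering.
print(build_label_map(['brownspot', 'healthy']))
```

If you add classes this way, `class_text_to_int` must return the same ids, otherwise the TFRecords and the label map will disagree.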

In [None]:
# 0. Get the list of the files in the images directory.
files = os.listdir('/content/detection/dataset/images')

# [TO-DO]
# 1. Split the dataset into the train set and test set.

# 2. Create the respective labels.csv for the test and train set.

# 3. Create the label_map.pbtxt file.

# 4. Move files to the respective prepared directories.

# 5. Generate TFRecords for the train set.

# 6. Generate TFRecords for the test set.

# NOTE: Please refer to the last notebook [Object Detection: Environment Setup 
# and Creating TFRecords (Part Two)] for a recap.

### 1.0 Select and configure a pretrained model.
We shall use pretrained weights randomly chosen from [TensorFlow’s detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md), an online collection of pre-trained `Tensorflow` models. For this tutorial, we shall use the [SSD ResNet50 V1 FPN 640x640](http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz) pretrained weights. This is an application of transfer learning, which is used in training scenarios where there isn't enough training data, as in this case.

Model configuration steps:
1. Download the pretrained weights from [TensorFlow’s detection model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md).
2. Edit the config file to fit the dataset.

In [None]:
# 1. Download the pretrained weights of your choice from TensorFlow’s detection model zoo.
PRE_TRAINED_MODEL_DIR = r'/content/detection/workspace/data/pre-trained-model'
MODEL = 'ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz'

!wget -P {PRE_TRAINED_MODEL_DIR} http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8.tar.gz
!tar -zxvf {PRE_TRAINED_MODEL_DIR}/{MODEL} -C {PRE_TRAINED_MODEL_DIR} # Extract to a given directory

Finally, open and edit the config file to fit the dataset. The config file contains numerous hyperparameters that can be tuned for the model, but for this tutorial we shall focus on the following parameters:
- `num_classes`: the number of classes in your dataset; for this dataset we have only one class.
- `batch_size`: the number of images loaded for one iteration; try a smaller batch size if the training hangs.
- `fine_tune_checkpoint`: the path to the checkpoint of the pretrained model that we downloaded in the last step. It points to the checkpoint prefix (here `checkpoint/ckpt-0` inside the extracted model folder), not just to the folder.
- `input_path`: found under the `train_input_reader` and `eval_input_reader` sections; it specifies the path to the training data and the testing data respectively.
- `label_map_path`: also found under both the `train_input_reader` and `eval_input_reader` sections; it specifies the path to the `label_map.pbtxt` file that we created earlier.
- `data_augmentation_options`: this is an interesting section to tweak, and trying out a number of augmentation strategies is left as an exercise. Data augmentation is a technique used to artificially increase the amount of training data; apart from adding data points, it helps the model generalize better on unseen data.
- `num_steps`: the total number of training steps, which indirectly sets the number of epochs. Say you have 50 images and a batch of 2 images per step; then it takes 25 steps to go through all 50 images once, which is one epoch. So if you want to train for 100 epochs, set the number of steps to 25 * 100 = 2500.
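
The `num_steps` arithmetic can be written out as a small helper. This is only an illustrative sketch: `steps_for_epochs` is a hypothetical name, not part of the Object Detection API, and it assumes that a final partial batch still costs one step, so it rounds up.

```python
import math

def steps_for_epochs(num_images, batch_size, epochs):
    """Return the number of training steps needed for `epochs` passes over the data."""
    # One epoch is one pass over all images; round up for a partial last batch.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch * epochs

# The example from the text: 50 images, a batch of 2 images, 100 epochs.
print(steps_for_epochs(50, 2, 100))  # 2500
```

With, say, 80 training images and the `batch_size` of 4 used in the config below, one epoch would take 20 steps.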

In [None]:
%%writefile /content/detection/workspace/data/pre-trained-model/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/pipeline.config

# Open the file to see its contents.
# SSD with a ResNet-50 V1 FPN feature extractor, configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1 # Change to the number of classes in your dataset.
    image_resizer {
      fixed_shape_resizer {
        # Change to your image set resolution; for our dataset, it is 400x400
        height: 400  
        width: 400
      }
    }
    feature_extractor {
      type: "ssd_resnet50_v1_fpn_keras"
      depth_multiplier: 1.0
      pad_to_multiple: 32
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 0.00039999998989515007
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.029999999329447746
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.996999979019165
          scale: true
          epsilon: 0.0010000000474974513
        }
      }
      override_base_feature_extractor_hyperparams: true
      fpn {
        min_level: 3
        max_level: 7
      }
    }
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        conv_hyperparams {
          regularizer {
            l2_regularizer {
              weight: 0.00039999998989515007
            }
          }
          initializer {
            random_normal_initializer {
              mean: 0.0
              stddev: 0.009999999776482582
            }
          }
          activation: RELU_6
          batch_norm {
            decay: 0.996999979019165
            scale: true
            epsilon: 0.0010000000474974513
          }
        }
        depth: 256
        num_layers_before_predictor: 4
        kernel_size: 3
        class_prediction_bias_init: -4.599999904632568
      }
    }
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        scales_per_octave: 2
      }
    }
    post_processing {
      batch_non_max_suppression {
        score_threshold: 9.99999993922529e-09
        iou_threshold: 0.6000000238418579
        max_detections_per_class: 100
        max_total_detections: 100
        use_static_shapes: false
      }
      score_converter: SIGMOID
    }
    normalize_loss_by_num_matches: true
    loss {
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_loss {
        weighted_sigmoid_focal {
          gamma: 2.0
          alpha: 0.25
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    encode_background_as_zeros: true
    normalize_loc_loss_by_codesize: true
    inplace_batchnorm_update: true
    freeze_batchnorm: false
  }
}
train_config {
  batch_size: 4 # Increase and decrease depending on memory available.
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_crop_image {
      min_object_covered: 0.0
      min_aspect_ratio: 0.75
      max_aspect_ratio: 3.0
      min_area: 0.75
      max_area: 1.0
      overlap_thresh: 0.0
    }
  }
  sync_replicas: true
  optimizer {
    momentum_optimizer {
      learning_rate {
        cosine_decay_learning_rate {
          learning_rate_base: 0.03999999910593033
          total_steps: 25000
          warmup_learning_rate: 0.013333000242710114
          warmup_steps: 2000
        }
      }
      momentum_optimizer_value: 0.8999999761581421
    }
    use_moving_average: false
  }
  fine_tune_checkpoint: "/content/detection/workspace/data/pre-trained-model/ssd_resnet50_v1_fpn_640x640_coco17_tpu-8/checkpoint/ckpt-0"
  num_steps: 25000
  startup_delay_steps: 0.0
  replicas_to_aggregate: 8
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint_type: "detection" # Set this to "detection" since we want to be training the full detection model
  use_bfloat16: false  # Set this to false if you are not training on a TPU
  fine_tune_checkpoint_version: V2
}
train_input_reader {
  label_map_path: "/content/detection/workspace/data/annotations/label_map.pbtxt" # Path to the label_map
  tf_record_input_reader {
    input_path: "/content/detection/workspace/data/annotations/train_labels.record" # Path to training TFRecord
  }
}
eval_config {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "/content/detection/workspace/data/annotations/label_map.pbtxt" # Path to the label_map
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "/content/detection/workspace/data/annotations/test_labels.record" # Path to the test set.
  }
}

### 2.0 Initiate `Tensorboard` for training monitoring.
Tensorflow provides the [Tensorboard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard) utility library that allows you to continuously monitor and visualize a number of different training/evaluation metrics.
***
If you run Tensorboard before training the model, then you are able to monitor progress concurrently.

In [None]:
# 1. Run the command below to run tensorboard, --logdir should point to the folder that stores the 
#    checkpoints during training, in this case it is the training folder under the data directory.
%load_ext tensorboard
%tensorboard --logdir workspace/data/training

### 3.0 Training the Model
Finally we get to train the model. The training script lives in the `workspace/data` directory and is launched from the `/content/detection` directory.
***
Steps:
- The `model_main_tf2.py` script was already copied from `models/research/object_detection` into the `workspace/data` folder in the notebook configuration section.
- Next, make sure you are in the `/content/detection` directory, since the `train_model` helper refers to `workspace/data` with relative paths (use `!pwd` to check).
- Then call the `train_model(path)` function, which trains the model; it expects the path to the config file as its argument.
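
Before starting a long training run, it can help to confirm that the files named in the config actually exist. The sketch below is one possible check, not part of the Object Detection API: `missing_config_paths` is a hypothetical helper that assumes the paths are written as plain quoted strings, as in the config above, and that the checkpoint is a TF2 checkpoint prefix with an accompanying `.index` file.

```python
import os
import re

def missing_config_paths(config_text):
    """Return (field, path) pairs from a pipeline config that are not found on disk."""
    missing = []
    pattern = r'(label_map_path|input_path|fine_tune_checkpoint):\s*"([^"]+)"'
    for field, path in re.findall(pattern, config_text):
        # fine_tune_checkpoint is a prefix such as .../ckpt-0, so look for
        # the .index file that is stored next to it (an assumption of this sketch).
        target = path + '.index' if field == 'fine_tune_checkpoint' else path
        if not os.path.exists(target):
            missing.append((field, path))
    return missing

# A made-up two-line config fragment; the path is deliberately non-existent.
sample = ('label_map_path: "/no/such/label_map.pbtxt"\n'
          'fine_tune_checkpoint_type: "detection"')
print(missing_config_paths(sample))
```

To check the real file, read `pipeline.config` into a string and pass it to the function; an empty list means every referenced path was found.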

In [None]:
# TO-DO
# 1. Set the path argument with the path to the config file.
path = ''

# 2. Call the train_model function with the set path.

### 4.0 Exporting a Trained Model
With the training job complete, you need to extract the newly trained inference graph, which will be later used to perform the object detection.
***
For this, we shall call the `export_model(config_path, checkpoint_dir, export_dir)` function, defined in the notebook configuration section. A few details about the arguments:
- `config_path`: This is the path to the configuration file, the same one used in the training step.
- `checkpoint_dir`: This is the path to the directory that stores the checkpoints as the model is trained, in this case, it is `/content/detection/workspace/data/training`.
- `export_dir`: This specifies the path to directory in which the exported model should be stored. If it doesn't exist, it is created.

In [None]:
# TO-DO
# 1. Set the arguments to the corresponding values.
path = r''
checkpoint_dir = r''
export_dir = r''

# 2. Call the export_model function and pass it the arguments as defined.

After the above process has completed, you should find the exported model in the `export_dir` folder you specified. For example, if you exported to an `exported-model` folder under the `workspace/data` directory, it has the following structure:
```
/data
├─ ...
├─ exported-model/
│  ├─ checkpoint/
│  ├─ saved_model/
│  └─ pipeline.config
└─ ...
```
The exported model is under the `saved_model` folder, and it is named `saved_model.pb`.
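
As a quick sanity check, the expected layout can be verified from Python. This is a minimal sketch under the assumption that the export produced exactly the entries shown in the tree above; `missing_export_items` is a hypothetical helper name and the path in the usage line is only an example of an `export_dir`.

```python
from pathlib import Path

def missing_export_items(export_dir):
    """Return the expected export entries that are absent from `export_dir`."""
    expected = ['checkpoint', 'saved_model/saved_model.pb', 'pipeline.config']
    root = Path(export_dir)
    return [item for item in expected if not (root / item).exists()]

# Example path only; replace it with the export_dir you passed to export_model.
print(missing_export_items('/content/detection/workspace/data/exported-model'))
```

An empty list means all three entries are present.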

That is it for part three. For more details, please check out the official [TensorFlow 2 Object Detection API tutorial](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/index.html).