# Object Detection Training Template

## Introductory Notes

This notebook is a template for training an EfficientDet model to detect controllers using the TensorFlow Object Detection API. The models are trained on our synthetic data. The trained models are then exported and can be used for evaluation.

For the code in this notebook to work, the TF Object Detection API and a few standard modules need to be installed. See the README file in the project main directory for more detailed information.

The following files, included in what is provided by the EPRI-CV team, are needed:
*   __src/preproc.py__ - this module contains the functions that are used to preprocess the data
*   __src/visualize.py__ - this module contains functions to visualize different aspects of the evaluation
*   __src/utils.py__ - utils for writing/reading XML annotations
*   __src/generate_tfrecord.py__ - script to generate tfrecord files from image data and XML annotations

Assumed project structure:
```
main_project_dir/
├─ README.md                               <- The top-level README you are currently reading.
├─ generate-synthetic-images/              <- the scripts to generate synthetic images that will serve as training data
├─ data/                  
│   ├─ test-annotated-images/              <- test images and the xml files with their labels
│   ├─ synthetic-annotated-train-images/   <- synthetic images used for training and the xml files with their labels
│   ├─ train-valid-split
│   │    ├─ training/                      <- training images and the xml with their labels used for training
│   │    ├─ validation/                    <- validation images and the xml with their labels used for training
│   │    ├─ train.tfrec                    <- tfrecord of the training images and labels
│   │    ├─ valid.tfrec                    <- tfrecord of the validation images and labels
│   │    └─ label_map.pbtxt                <- label map for this training/validation split
├─ models/                              
│   ├─ pre-trained/                        <- pre-trained models downloaded from the TensorFlow model zoo
│   ├─ fine-tuned/                         <- fine-tuned models, obtained from training with train-valid-split
│   └─ default-pipeline-configs/           <- pipeline configuration files for our models
├─ notebooks/                              <- notebooks used for training and evaluation
│   ├─ EPRI_ObjectDetection_Training.ipynb
│   └─ EPRI_ObjectDetection_Evaluation.ipynb
└─ src/                                    <- scripts needed in the notebooks
```
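
Before running the cells below, it can help to confirm that this layout is actually in place. The following is a minimal sketch and not part of the provided `src/` modules: `EXPECTED_DIRS` and `find_missing_dirs` are hypothetical names, and the list only covers directories shown in the tree above.

```python
from pathlib import Path

# Directories from the assumed project structure, relative to the project root.
# (Hypothetical helper data, not part of the provided src/ modules.)
EXPECTED_DIRS = [
    "data/test-annotated-images",
    "data/synthetic-annotated-train-images",
    "data/train-valid-split",
    "models/pre-trained",
    "models/fine-tuned",
    "models/default-pipeline-configs",
    "notebooks",
    "src",
]


def find_missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]
```

Calling `find_missing_dirs(ROOT_DIR)` once `ROOT_DIR` is set should return an empty list if the structure matches.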

# Preparation / initialization

#### Set the root folder (assuming the notebook is run from the `notebooks` folder)

In [35]:
from pathlib import Path
import sys
ROOT_DIR = Path("/content/gdrive/MyDrive/epri-deliver")

#### Set the path to the TF Object Detection API

In [36]:
TF_DIR = ROOT_DIR/"tf-models" 

#### Colab Init (REMOVE)

In [21]:
from google.colab import drive
drive.mount('/content/gdrive')
%cd /content/gdrive/MyDrive/epri-deliver/models

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
/content/gdrive/MyDrive/epri-deliver/models


In [None]:
if not TF_DIR.exists():
  !git clone --depth 1 https://github.com/tensorflow/models $TF_DIR

In [27]:
%cd {TF_DIR/'research'}
!protoc object_detection/protos/*.proto --python_out=.
!cp object_detection/packages/tf2/setup.py .
!python setup.py build
!python setup.py install
%cd /content/gdrive/MyDrive/epri-deliver/data

/content/gdrive/MyDrive/epri-deliver/data


In [34]:
import sys
sys.path.append(str(TF_DIR/"research"))

In [None]:
!pip uninstall -y imgaug
!pip install git+https://github.com/aleju/imgaug.git

### Project folder structure: set paths to existing directories in the root folder

In [46]:
DATA_DIR = ROOT_DIR/'data'
# Folder containing unprocessed/unaugmented annotated synthesised data 
# (images + corresponding XML annotation files) used for training
RAW_SYN_DATA_DIR = DATA_DIR/'synthetic-annotated-train-images'
# Unprocessed annotated test data
RAW_TEST_DATA_DIR = DATA_DIR/'test-annotated-images'

# Processed (e.g. resized, augmented) training/validation data 
# ready for the model to use 
TRAIN_VALID_DIR = DATA_DIR/'train-valid-split'

# Models location
MODELS_DIR = ROOT_DIR/'models'
# Pretrained models from TF model zoo
PRE_MODELS_DIR = MODELS_DIR/'pre-trained'
# Our fine-tuned models
FT_MODELS_DIR = MODELS_DIR/'fine-tuned'
DEFAULT_CONFIGS_DIR = MODELS_DIR/'default-pipeline-configs'

# Add our source directory to python PATH


In [57]:
sys.path.append(str(ROOT_DIR/'src'))

### Choose a synthesised data folder used for training the model

* Each image in the main synthesised dataset we use below contains 5 to 9 Siemens S7-300 and Allen Bradley controllers and 3 to 6 distractor Siemens ET200M controllers pasted onto backgrounds. 
* The scale of the controllers is adjusted to each background. We also introduce additional 10% random variations of size and 5-degree random rotations.

In [47]:
RAW_TRAIN_DATA_DIR = RAW_SYN_DATA_DIR/'bg_adjusted_with_distractors'

### Choose our model

We found that three models with different levels of augmentation (explained below) produced the best results in our experiments:
* `efficient_det_1024_rand_aug_1_4`: 1024x1024 resolution with strong augmentation
* `efficient_det_1024_rand_aug_1_2`: 1024x1024 resolution with medium augmentation
* `efficient_det_1024_rand_aug_1_0`: 1024x1024 resolution with light augmentation

The model with the strongest augmentation achieves a significantly better mAP score at 0.75 IoU (~50%, as opposed to ~30% for the model with light augmentation), meaning that it localizes the objects better. All the models have the same mAP of 77% at 0.5 IoU, which is the most important metric if we care more about identification than precise localisation. The model with light augmentation, however, sometimes misses 1-2 fewer Allen Bradley controllers. The model choice should be based on the available validation/testing data. Here we train the model with strong augmentation.
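
To make the two IoU thresholds concrete, here is a small worked example. It is only an illustration of the standard intersection-over-union definition, not the evaluation code used for the numbers above. The function name `iou` and the example boxes are made up for this sketch.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# A 100x100 ground-truth box and a prediction shifted right by 20 pixels
truth = (0, 0, 100, 100)
pred = (20, 0, 120, 100)
score = iou(truth, pred)
print(f"IoU = {score:.3f}")                # IoU = 0.667
print("match at 0.50:", score >= 0.50)     # True
print("match at 0.75:", score >= 0.75)     # False
```

A prediction like this one counts as a correct detection at 0.5 IoU but as a miss at 0.75 IoU, which is why a model can identify controllers well and still score lower on the stricter threshold.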

All the models should use the same synthesised dataset in `data/synthetic-annotated-train-images/bg_adjusted_with_distractors` for best results.

In [48]:
# List of TF pre-trained models with download links
PRE_MODELS = {
    'efficient_det_1024': "http://download.tensorflow.org/models/object_detection/tf2/20200711/efficientdet_d4_coco17_tpu-32.tar.gz",
}
# Select a pre-trained model
PRE_MODEL_NAME = 'efficient_det_1024'
# Set the name of our fine-tuned model
MY_MODEL_NAME = 'efficient_det_1024_rand_aug_1_4'

### Global parameters

In [49]:
# Target size of training/validation images after preprocessing
IMAGE_SIZE = 1024
# batch size (reduce if out of GPU memory)
BATCH_SIZE = 32
# number of steps for training
NUM_STEPS = 6000
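
As a rough guide to how `BATCH_SIZE` and `NUM_STEPS` interact, the sketch below estimates how many passes over the training set a run makes. `approx_epochs` is a hypothetical helper, and the figure of 3000 training images is taken from the output of the augmentation cell in this notebook. Adjust it if your dataset differs.

```python
def approx_epochs(num_steps, batch_size, num_images):
    """Approximate number of passes over the training set."""
    return num_steps * batch_size / num_images


print(approx_epochs(num_steps=6000, batch_size=32, num_images=3000))  # 64.0
# Halving the batch size to fit in GPU memory halves the data seen per run
print(approx_epochs(num_steps=6000, batch_size=16, num_images=3000))  # 32.0
```

If the batch size has to be reduced, increasing `NUM_STEPS` proportionally keeps the amount of data seen during training roughly constant.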

### Initialize folder structure for custom training

Make directories for customized processed training/validation data ready to be fed to the model. 

Set paths to tfrec files and the label map.

In [50]:
%cd /content/gdrive/MyDrive/epri-deliver/data

/content/gdrive/MyDrive/epri-deliver/data


In [53]:
# Directories for preprocessed annotated image data used for training/validation
(TRAIN_VALID_DIR/'training').mkdir(exist_ok=True)
(TRAIN_VALID_DIR/'validation').mkdir(exist_ok=True)

# Paths to TF record files containing training and validation data
train_tfrec_path = TRAIN_VALID_DIR/'train.record'
valid_tfrec_path = TRAIN_VALID_DIR/'valid.record'

# Path to the label map file that contains class names and IDs
label_map_path = TRAIN_VALID_DIR/'label_map.pbtxt'

Make directories and set paths for the model

In [54]:
my_model_dir = FT_MODELS_DIR / MY_MODEL_NAME
my_model_dir.mkdir(exist_ok=True)

# Path to the model configuration file
config_path = my_model_dir / 'pipeline.config'

# Path to the initial fine tune checkpoint (from the pre-trained model)
ft_ckpt_dir = my_model_dir / 'fine_tune_checkpoint'
ft_ckpt_dir.mkdir(exist_ok=True)
ft_ckpt_path = ft_ckpt_dir / 'ckpt-0'

# Make a folder for exported model
my_export_dir = my_model_dir / 'exported'
my_export_dir.mkdir(exist_ok=True)

Set directory names specific for the current computation environment (TPU cluster or local GPU)

### Imports

In [60]:
%cd /content/gdrive/MyDrive/epri-deliver

/content/gdrive/MyDrive/epri-deliver


In [64]:
import numpy as np
import matplotlib.pyplot as plt
import math, os, shutil, glob
import urllib.request
import tarfile
import cv2

# TF object detection API utils
from object_detection.utils import label_map_util 
from object_detection.utils import config_util 
from object_detection.utils import visualization_utils as viz_utils

# Our src/ functions
import src.utils as src_util
import src.preproc as src_pre
import src.visualize as src_viz

%matplotlib inline

In [65]:
from importlib import reload  
src_viz = reload(src_viz)
src_pre = reload(src_pre)

# Create and save the label map

The label map file maps class IDs to class names. It is needed for the model to initialize. We save it in `TRAIN_VALID_DIR`.

In [66]:
%%writefile $label_map_path
  item {
    id: 1
    name: 'AllenBradley'
  }
  item {
    id: 2
    name: 'Siemens'
  }

Writing /content/gdrive/MyDrive/epri-deliver/data/train-valid-split/label_map.pbtxt


In [67]:
category_index = label_map_util.create_category_index_from_labelmap(label_map_path)
label_map_dict = label_map_util.get_label_map_dict(str(label_map_path))
# Number of classes extracted from the label map
num_classes = len(label_map_dict)
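
If the set of classes changes, the label map text can also be generated from a list of class names instead of being written by hand. This is a minimal sketch. `make_label_map` is a hypothetical helper, not part of the TF Object Detection API or of `src/`.

```python
def make_label_map(class_names):
    """Build label map text in pbtxt format, with IDs starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "".join(items)


label_map_text = make_label_map(["AllenBradley", "Siemens"])
print(label_map_text)
```

The resulting string can be written to `label_map_path` with `label_map_path.write_text(label_map_text)`.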

# Data preparation and augmentation

In [None]:
# clean up image data in our train/valid data folder if it's not empty 
src_pre.clear_dir(TRAIN_VALID_DIR)

## Training data augmentation

* Augmenting the training images by applying random transformations helps diversify the training set and avoid overfitting. 
* Without augmentation, the model quickly learns how to find controllers in our artificial images and is prone to relying on the unnatural artifacts in the synthesised data. This reduces performance on the test set.

The augmentation strategy we use is RandAugment. For each synthesised image, we apply two augmentations chosen at random from the following ten transformations:
* Rotation
* Shear along X
* Shear along Y
* Brightness
* Hue
* Contrast
* Saturation
* Gaussian noise
* Motion blur
* Sharpness

RandAugment has only two parameters:
* Number of augmentations to choose (`rand_aug_num`)
* Magnitude of the augmentations (`rand_aug_mag`, the same for all of them)
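
The selection step can be sketched as follows. This only illustrates how a RandAugment-style policy is drawn from the two parameters. It does not apply any image operation and is not the implementation in `src/preproc.py`. The names `TRANSFORMS` and `sample_policy` are hypothetical, and drawing without replacement is an assumption of this sketch.

```python
import random

# The ten transformations listed above (names are illustrative labels only)
TRANSFORMS = [
    "rotation", "shear_x", "shear_y", "brightness", "hue",
    "contrast", "saturation", "gaussian_noise", "motion_blur", "sharpness",
]


def sample_policy(num_ops=2, magnitude=1.4, rng=None):
    """Pick `num_ops` distinct transformations, all with the same magnitude."""
    rng = rng or random.Random()
    ops = rng.sample(TRANSFORMS, num_ops)  # assumption: no repeats within one image
    return [(op, magnitude) for op in ops]


# One policy per image; a fixed seed makes the draw reproducible
policy = sample_policy(num_ops=2, magnitude=1.4, rng=random.Random(0))
print(policy)
```

Because every image gets a fresh draw, repeating the process several times over the same source image yields differently augmented copies.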


* Before augmentation, the synthesised images are resized to `IMAGE_SIZE` if needed.
* The augmentation process is repeated `augment_mult` times to generate `augment_mult` times more images.
* Annotation files are transformed together with the images.
* The augmented data with annotations is placed in `TRAIN_VALID_DIR/training`.
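
To illustrate what transforming annotations together with the images means for the resize step, the sketch below rescales one bounding box when an image is stretched to a square. It assumes plain stretching with no padding or cropping. `resize_box` is a hypothetical helper and not the code in `src/preproc.py`.

```python
def resize_box(box, src_size, target_size):
    """Scale (xmin, ymin, xmax, ymax) from a (width, height) image to a square target."""
    src_w, src_h = src_size
    xmin, ymin, xmax, ymax = box
    return (
        round(xmin * target_size / src_w),
        round(ymin * target_size / src_h),
        round(xmax * target_size / src_w),
        round(ymax * target_size / src_h),
    )


# A box in a 2048x1536 image, stretched to 1024x1024
print(resize_box((512, 384, 1024, 768), (2048, 1536), 1024))  # (256, 256, 512, 512)
```

Geometric augmentations such as rotation and shear need the same kind of coordinate update, while colour and noise augmentations leave the boxes unchanged.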

In [None]:
# synthetic images/labels (no preprocessing required, proper size already)
src_pre.copy_augment_data(
    RAW_TRAIN_DATA_DIR, TRAIN_VALID_DIR/'training',
    target_size=IMAGE_SIZE,
    reshape2square=True,
    augment_kwargs={
        'rand_augment': True,
        'rand_aug_mag': 1.4,
        'rand_aug_num': 2
    },
    augment_mult=10
)

Produced 3000 images with labels of class all


In [None]:
# test images/labels
# Assumption: `image_data_dir` refers to the train/validation folder defined above
image_data_dir = TRAIN_VALID_DIR
if os.path.exists('tmp'): shutil.rmtree('tmp')
src_pre.make_split_images_and_labels(RAW_TEST_DATA_DIR, 'tmp', only_top_left=True)
src_pre.copy_augment_data('tmp', image_data_dir,
                          class_subdirs=False,
                          reshape2square='stretch',
                          target_size=IMAGE_SIZE)

Produced 45 images with labels of class all


### Make a training / validation split in `image_data` from the files currently in this folder.

In [None]:
# clear train/validation directories if they exist
src_pre.clear_dir(image_data_dir/'training')
src_pre.clear_dir(image_data_dir/'validation')

Splitting can be done in several steps to have a different train/validation ratio for each data source:
* Copying files from a single source to `image_data`
* Distributing them between `image_data/training` and `image_data/validation` with a specific `train_valid_split` 
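
The distribution step can be sketched as below. This is an illustration only. `split_files` is a hypothetical function, and it assumes that the split parameter is the fraction of files sent to training. That reading is consistent with `train_valid_split=0.` placing every file in validation.

```python
import random


def split_files(files, train_fraction, seed=0):
    """Shuffle `files` reproducibly and split them into (training, validation)."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    n_train = int(round(len(files) * train_fraction))
    return files[:n_train], files[n_train:]


names = [f"img_{i:03d}.jpg" for i in range(45)]
train, valid = split_files(names, train_fraction=0.0)
print(len(train), len(valid))  # 0 45
```

In practice each image must be moved together with its XML annotation file so that the pairs stay in the same folder.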

In [None]:
# make the split
# Assumption: `make_split` lives in src/preproc.py like the other helpers used above
train_img_dir, valid_img_dir = src_pre.make_split(image_data_dir, train_valid_split=0.)

number of training examples: 0
number of validation examples: 45


### Convert images and .xml labels to .tfrec files

In [None]:
!python src/generate_tfrecord_sk.py -x $train_img_dir -l $label_map_path -o $train_tfrec_path
!python src/generate_tfrecord_sk.py -x $valid_img_dir -l $label_map_path -o $valid_tfrec_path

2021-09-02 23:08:28.483527: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Successfully created the TFRecord file: object_detection/tfrec_data/train.record
2021-09-02 23:09:11.754892: E tensorflow/stream_executor/cuda/cuda_driver.cc:271] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Successfully created the TFRecord file: object_detection/tfrec_data/valid.record
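
For orientation, the sketch below shows the kind of information the conversion reads from one annotation file before it is serialized. It assumes Pascal VOC style XML and normalizes box coordinates to the [0, 1] range, as the TF Object Detection API expects in its records. `SAMPLE_XML` and `parse_annotation` are made up for illustration and do not reproduce the conversion script.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout
SAMPLE_XML = """<annotation>
  <filename>img_0001.jpg</filename>
  <size><width>1024</width><height>1024</height></size>
  <object>
    <name>Siemens</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text, label_map):
    """Return one dict per object with its class ID and normalized box."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append({
            "class_id": label_map[obj.findtext("name")],
            "xmin": int(box.findtext("xmin")) / width,
            "ymin": int(box.findtext("ymin")) / height,
            "xmax": int(box.findtext("xmax")) / width,
            "ymax": int(box.findtext("ymax")) / height,
        })
    return rows


rows = parse_annotation(SAMPLE_XML, {"AllenBradley": 1, "Siemens": 2})
print(rows)
```

The class IDs come from the same label map that is passed to the script with `-l`, so the label map and the annotations must use identical class names.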
