<a href="https://colab.research.google.com/github/y-kallel/object-detection/blob/master/Object_Detection_with_Data_Augmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Object Detection Transfer Learning and Conversion to TFlite**

## Configs and Hyperparameters

This colab supports a variety of models. You can find more pretrained models in the [Tensorflow detection model zoo: COCO-trained models](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models), and their pipeline config files in [object_detection/samples/configs/](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs).

**Object-Detection Repo:**
`repo_url` points to a GitHub repo that stores the raw dataset images and annotation files (Pascal VOC format), along with the scripts that convert the XML annotations to CSV files and then to the tf.record format used by the training datasets for transfer learning.

**Models Config and Training Params:**
Set the `MODELS_CONFIG` parameters to match the pretrained model you want to fine-tune: the model name must match the pretrained model's archive file name in the TF model zoo, and the pipeline file must point to the corresponding config file in the TensorFlow models directory cloned in this colab. The right number of training/evaluation steps depends on the size of your dataset, but this formula gives a good approximation for this colab: (num_images / batch_size) * num_epochs = num_steps.
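
As a rough illustration of the step formula, here is a minimal sketch. The helper name `estimate_num_steps` and the dataset numbers are hypothetical, and the per-epoch step count is rounded up so that a final partial batch is still counted, which deviates slightly from the plain division in the formula.

```python
import math


def estimate_num_steps(num_images, batch_size, num_epochs):
    """Approximate training steps as (num_images / batch_size) * num_epochs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Round up so that a final partial batch in each epoch is still counted.
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch * num_epochs


# Hypothetical dataset: 1,200 images, batch size 12, 100 epochs.
example_steps = estimate_num_steps(num_images=1200, batch_size=12, num_epochs=100)
print(example_steps)  # 10000
```

The result can then be assigned to `num_steps` in the first code cell.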

**Data Augmentation:**
The primary way to apply data augmentation is to add data augmentation options to the model's pipeline.config file; TensorFlow then applies them to your dataset internally during training. This method works best and should be used. This colab also contains an external data augmentation implementation that was used previously. It is not as powerful as TensorFlow's built-in methods, but the work has been kept: it is labeled "Data augmentation session" and commented out in several cells of this colab.


In [None]:
# If you forked the repository, you can replace the link.
repo_url = 'https://github.com/y-kallel/object-detection'

# Number of training steps.
num_steps = 1  # 200000

# Number of evaluation steps.
num_eval_steps = 1
#http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v2_coco_2018_03_29.tgz
MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
        'batch_size': 12
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
        'batch_size': 12
    },
    'ssd_resnet50_v1_fpn': {
        'model_name': 'ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03',
        'pipeline_file': 'ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync.config',
        'batch_size': 8
    },
    'rfcn_resnet101': {
        'model_name': 'rfcn_resnet101_coco_2018_01_28',
        'pipeline_file': 'rfcn_resnet101_pets.config',
        'batch_size': 8
    }
}

# Pick the model you want to use
# Select a model in `MODELS_CONFIG`.
selected_model = 'ssd_mobilenet_v2'

# Name of the object detection model to use.
MODEL = MODELS_CONFIG[selected_model]['model_name']

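# Name of the pipeline file in the TensorFlow Object Detection API.
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']
# NOTE: this assignment overrides the file selected from MODELS_CONFIG above;
# remove it to use the model-specific config instead.
pipeline_file = 'pipeline.config'

# Training batch size that fits in Colab's Tesla K80 GPU memory for the selected model.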
batch_size = MODELS_CONFIG[selected_model]['batch_size']

## Clone the `object-detection` repository or your fork.

In [None]:
import os

%cd /content

repo_dir_path = os.path.abspath(os.path.join('.', os.path.basename(repo_url)))

!git clone {repo_url}
%cd {repo_dir_path}
!git checkout pedestrian
!git pull origin pedestrian

/content
Cloning into 'object-detection'...
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 2713 (delta 2), reused 0 (delta 0), pack-reused 2707
Receiving objects: 100% (2713/2713), 270.11 MiB | 17.50 MiB/s, done.
Resolving deltas: 100% (1305/1305), done.
/content/object-detection
Branch 'pedestrian' set up to track remote branch 'pedestrian' from 'origin'.
Switched to a new branch 'pedestrian'
From https://github.com/y-kallel/object-detection
 * branch            pedestrian -> FETCH_HEAD
Already up to date.


## Install required packages

This cell also installs the TensorFlow Object Detection API into this runtime session. TensorFlow 1.15 (1.x) has been the most stable version for me, and later cells that convert the model to a frozen graph and to TFLite rely on this version. Other methods may require you to change the version used.

In [None]:
# Data augmentation enabled session packages:
'''
%cd /content
!git clone --quiet https://github.com/y-kallel/models.git

#!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

#!pip install tf_slim

#!pip install numpy==1.16

#!pip install pandas

!pip install imgaug

import imgaug as ia
ia.seed(1)
# imgaug uses matplotlib backend for displaying images
%matplotlib inline
from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage
from imgaug import augmenters as iaa 
# imageio library will be used for image input/output
import imageio
import xml.etree.ElementTree as ET
import shutil
import re

#%tensorflow_version 1.x

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

import os
os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'
'''

#!python object_detection/builders/model_builder_test.py

%cd /content

!git clone --quiet https://github.com/y-kallel/models.git

%cd models

!git pull -f origin

%cd /content

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

%tensorflow_version 1.x

!pip install tf_slim

!pip install numpy==1.16
#!pip install tensorflow==1.15.2

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

import os
os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'

!python object_detection/builders/model_builder_test.py

/content
fatal: destination path 'models' already exists and is not an empty directory.
/content/models
Already up to date.
/content
TensorFlow 1.x selected.
/content/models/research


## Prepare `tfrecord` files

Use the following scripts to generate the `tfrecord` files.
```bash
# Convert train folder annotation xml files to a single csv file,
# generate the `label_map.pbtxt` file to `data/` directory as well.
python xml_to_csv.py -i data/images/train -o data/annotations/train_labels.csv -l data/annotations

# Convert test folder annotation xml files to a single csv.
python xml_to_csv.py -i data/images/test -o data/annotations/test_labels.csv

# Generate `train.record`
python generate_tfrecord.py --csv_input=data/annotations/train_labels.csv --output_path=data/annotations/train.record --img_path=data/images/train --label_map data/annotations/label_map.pbtxt

# Generate `test.record`
python generate_tfrecord.py --csv_input=data/annotations/test_labels.csv --output_path=data/annotations/test.record --img_path=data/images/test --label_map data/annotations/label_map.pbtxt
```
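
To make the first conversion step more concrete, the sketch below shows roughly what turning one Pascal VOC annotation into CSV rows involves. It is not the repo's `xml_to_csv.py`: the function name `voc_xml_to_rows` and the sample annotation are hypothetical, and the column order is assumed to match the `filename, width, height, class, xmin, ymin, xmax, ymax` layout used by the augmentation cells later in this colab.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed column layout, taken from the dataframe columns used later in this colab.
CSV_COLUMNS = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']


def voc_xml_to_rows(xml_text):
    """Return one row per <object> element in a Pascal VOC annotation string."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        rows.append([
            filename, width, height, obj.findtext('name'),
            int(box.findtext('xmin')), int(box.findtext('ymin')),
            int(box.findtext('xmax')), int(box.findtext('ymax')),
        ])
    return rows


# Hypothetical annotation, used only for illustration.
SAMPLE_XML = """<annotation>
  <filename>person_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>person</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

rows = voc_xml_to_rows(SAMPLE_XML)

# Write the header and rows to an in-memory CSV to show the resulting layout.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(CSV_COLUMNS)
writer.writerows(rows)
print(buffer.getvalue())
```

Each bounding box becomes its own row, so an image with several objects appears on several lines of the CSV.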

In [None]:
'''
# Data Augmentation version:

%cd {repo_dir_path}

import pandas as pd

# Convert train folder annotation xml files to a single csv file,
# generate the `label_map.pbtxt` file to `data/` directory as well.
!python xml_to_csv.py -i data/images/train -o data/annotations/train_labels.csv -l data/annotations

train_images_df = pd.read_csv("data/annotations/train_labels.csv") # convert train csv to df

# Convert test folder annotation xml files to a single csv.
!python xml_to_csv.py -i data/images/test -o data/annotations/test_labels.csv

test_images_df = pd.read_csv("data/annotations/test_labels.csv") # convert test csv to df

# Generate records after data augmentation now

# Generate `train.record`
#!python generate_tfrecord.py --csv_input=data/annotations/train_labels.csv --output_path=data/annotations/train.record --img_path=data/images/train --label_map data/annotations/label_map.pbtxt

# Generate `test.record`
#!python generate_tfrecord.py --csv_input=data/annotations/test_labels.csv --output_path=data/annotations/test.record --img_path=data/images/test --label_map data/annotations/label_map.pbtxt

'''


## Converting txt annotations from the INRIA Dataset to Pascal XML

This optional cell is only needed if you are using the pedestrian dataset I took from the INRIA pedestrian dataset hosted on this [github repo](https://github.com/YoungYoung619/pedestrian-detection-in-hazy-weather.git). It edits the annotation XML files so that they reference the .jpg images instead of .txt files.

In [None]:

import os
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9


def fix_annotation_filenames(annotation_dir):
    # Edit each xml file so its <filename> points at the jpg image, not the txt file.
    for name in os.listdir(annotation_dir):
        if not name.endswith('.xml'):
            continue
        xml_path = os.path.join(annotation_dir, name)
        tree = ET.parse(xml_path)
        root_xml = tree.getroot()

        for filename in root_xml.findall('filename'):
            if filename.text and filename.text.endswith('.txt'):
                filename.text = filename.text[:-4] + ".jpg"
        tree.write(xml_path)


# Apply the same fix to the train and test annotation folders.
for split in ('train', 'test'):
    fix_annotation_filenames(os.path.join('/content/object-detection/data/images', split))

In [None]:
# No Data Augmentation:

%cd {repo_dir_path}

# Convert train folder annotation xml files to a single csv file,
# generate the `label_map.pbtxt` file to `data/` directory as well.
!python xml_to_csv.py -i data/images/train -o data/annotations/train_labels.csv -l data/annotations

# Convert test folder annotation xml files to a single csv.
!python xml_to_csv.py -i data/images/test -o data/annotations/test_labels.csv

# Generate `train.record`
!python generate_tfrecord.py --csv_input=data/annotations/train_labels.csv --output_path=data/annotations/train.record --img_path=data/images/train --label_map data/annotations/label_map.pbtxt

# Generate `test.record`
!python generate_tfrecord.py --csv_input=data/annotations/test_labels.csv --output_path=data/annotations/test.record --img_path=data/images/test --label_map data/annotations/label_map.pbtxt

/content/object-detection
Successfully converted xml to csv.
Generate `data/annotations/label_map.pbtxt`
Successfully converted xml to csv.


W0807 22:15:55.934962 140232506926976 module_wrapper.py:139] From generate_tfrecord.py:107: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.


W0807 22:15:56.223119 140232506926976 module_wrapper.py:139] From generate_tfrecord.py:53: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

Successfully created the TFRecords: /content/object-detection/data/annotations/train.record


W0807 22:16:00.581730 140607318943616 module_wrapper.py:139] From generate_tfrecord.py:107: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.


W0807 22:16:00.717382 140607318943616 module_wrapper.py:139] From generate_tfrecord.py:53: The name tf.gfile.GFile is deprecated. Please use tf.io.gfile.GFile instead.

Successfully created the TFRecords: /content/object-

In [None]:
'''
# Data Augmentation Run:
# This setup of augmentation parameters will pick two of the seven given augmenters and apply them in random order
aug = iaa.SomeOf(2, [    
    iaa.Affine(scale=(0.5, 1.5)),
    iaa.Affine(rotate=(-60, 60)),
    iaa.Affine(translate_percent={"x": (-0.3, 0.3), "y": (-0.3, 0.3)}),
    iaa.Fliplr(1),
    iaa.Multiply((0.5, 1.5)),
    iaa.GaussianBlur(sigma=(1.0, 3.0)),
    iaa.AdditiveGaussianNoise(scale=(0.03*255, 0.05*255))
])
'''


In [None]:
'''
# Data Augmentation Run:
# function to convert BoundingBoxesOnImage object into DataFrame
def bbs_obj_to_df(bbs_object):
#     convert BoundingBoxesOnImage object into array
    bbs_array = bbs_object.to_xyxy_array()
#     convert array into a DataFrame ['xmin', 'ymin', 'xmax', 'ymax'] columns
    df_bbs = pd.DataFrame(bbs_array, columns=['xmin', 'ymin', 'xmax', 'ymax'])
    return df_bbs
'''

In [None]:
# Data Augmentation Run:
# image data augmentation, df = dataframe of current images, images_path = path to images, aug_images_path = desired destination of aug_imgs, image_prefix = file prefix for aug_imgs, augmentor = aug settings
'''
def image_aug(df, images_path, aug_images_path, image_prefix, augmentor):
    # create data frame which we're going to populate with augmented image info
    aug_bbs_xy = pd.DataFrame(columns=
                              ['filename','width','height','class', 'xmin', 'ymin', 'xmax', 'ymax']
                             )
    grouped = df.groupby('filename')
    
    for filename in df['filename'].unique():
    #   get separate data frame grouped by file name
        group_df = grouped.get_group(filename)
        group_df = group_df.reset_index()
        group_df = group_df.drop(['index'], axis=1)   
    #   read the image
        image = imageio.imread(images_path+filename)
    #   get bounding boxes coordinates and write into array        
        bb_array = group_df.drop(['filename', 'width', 'height', 'class'], axis=1).values
    #   pass the array of bounding boxes coordinates to the imgaug library
        bbs = BoundingBoxesOnImage.from_xyxy_array(bb_array, shape=image.shape)
    #   apply augmentation on image and on the bounding boxes
        image_aug, bbs_aug = augmentor(image=image, bounding_boxes=bbs)
    #   disregard bounding boxes which have fallen out of image pane    
        bbs_aug = bbs_aug.remove_out_of_image()
    #   clip bounding boxes which are partially outside of image pane
        bbs_aug = bbs_aug.clip_out_of_image()
        
    #   don't perform any actions with the image if there are no bounding boxes left in it    
        if len(bbs_aug.bounding_boxes) == 0:
            pass
        
    #   otherwise continue
        else:
        #   write augmented image to a file
            imageio.imwrite(aug_images_path+image_prefix+filename, image_aug)  
        #   create a data frame with augmented values of image width and height
            info_df = group_df.drop(['xmin', 'ymin', 'xmax', 'ymax'], axis=1)    
            for index, _ in info_df.iterrows():
                info_df.at[index, 'width'] = image_aug.shape[1]
                info_df.at[index, 'height'] = image_aug.shape[0]
        #   rename filenames by adding the predefined prefix
            info_df['filename'] = info_df['filename'].apply(lambda x: image_prefix+x)
        #   create a data frame with augmented bounding boxes coordinates using the function we created earlier
            bbs_df = bbs_obj_to_df(bbs_aug)
        #   concat all new augmented info into new data frame
            aug_df = pd.concat([info_df, bbs_df], axis=1)
        #   append rows to aug_bbs_xy data frame
            aug_bbs_xy = pd.concat([aug_bbs_xy, aug_df])            
    
    # return dataframe with updated images and bounding boxes annotations 
    aug_bbs_xy = aug_bbs_xy.reset_index()
    aug_bbs_xy = aug_bbs_xy.drop(['index'], axis=1)
    return aug_bbs_xy
'''

In [None]:
'''
# Data Augmentation Run:
# Apply augmentation to our images and save the files into the train/test image folders with the 'augaug1_' prefix.
# Write the updated images and bounding boxes annotations to the augmented_train_df and augmented_test_df dataframes.
%cd {repo_dir_path}
# image_aug builds paths by string concatenation, so the destination folders need a trailing slash.
augmented_train_df = image_aug(train_images_df, 'data/images/', 'data/images/train/', 'augaug1_', aug)

augmented_test_df = image_aug(test_images_df, 'data/images/', 'data/images/test/', 'augaug1_', aug)

augmented_train_df.to_csv('data/annotations/aug_train_labels.csv')
augmented_test_df.to_csv('data/annotations/aug_test_labels.csv')
'''

In [None]:
'''
# Data Augmentation Run:
augmented_test_df
'''



In [None]:
'''
# Data Augmentation Run:
# Generate `train.record`
%cd data/images/test/
%ls
%cd {repo_dir_path}

!python generate_tfrecord.py --csv_input=data/annotations/aug_train_labels.csv --output_path=data/annotations/train.record --img_path=data/images/train --label_map data/annotations/label_map.pbtxt

# Generate `test.record`
!python generate_tfrecord.py --csv_input=data/annotations/aug_test_labels.csv --output_path=data/annotations/test.record --img_path=data/images/test --label_map data/annotations/label_map.pbtxt
'''


In [None]:
test_record_fname = '/content/object-detection/data/annotations/test.record'
train_record_fname = '/content/object-detection/data/annotations/train.record'
label_map_pbtxt_fname = '/content/object-detection/data/annotations/label_map.pbtxt'


## Download base model

This cell downloads a pretrained model, based on the model parameters you selected in the first cell, and places it in the pretrained_model directory under models/research/.

In [None]:
%cd /content/models/research

import os
import shutil
import glob
import urllib.request
import tarfile
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = '/content/models/research/pretrained_model'

if not (os.path.exists(MODEL_FILE)):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

tar = tarfile.open(MODEL_FILE)
tar.extractall()
tar.close()

os.remove(MODEL_FILE)
if (os.path.exists(DEST_DIR)):
    shutil.rmtree(DEST_DIR)
os.rename(MODEL, DEST_DIR)

/content/models/research


In [None]:
!echo {DEST_DIR}
!ls -alh {DEST_DIR}

/content/models/research/pretrained_model
total 135M
drwxr-xr-x  3 345018 89939 4.0K Mar 30  2018 .
drwxr-xr-x 63 root   root  4.0K Aug  7 22:16 ..
-rw-r--r--  1 345018 89939   77 Mar 30  2018 checkpoint
-rw-r--r--  1 345018 89939  67M Mar 30  2018 frozen_inference_graph.pb
-rw-r--r--  1 345018 89939  65M Mar 30  2018 model.ckpt.data-00000-of-00001
-rw-r--r--  1 345018 89939  15K Mar 30  2018 model.ckpt.index
-rw-r--r--  1 345018 89939 3.4M Mar 30  2018 model.ckpt.meta
-rw-r--r--  1 345018 89939 4.2K Mar 30  2018 pipeline.config
drwxr-xr-x  3 345018 89939 4.0K Mar 30  2018 saved_model


In [None]:
fine_tune_checkpoint = os.path.join(DEST_DIR, "model.ckpt")
fine_tune_checkpoint

'/content/models/research/pretrained_model/model.ckpt'

## Configuring a Training Pipeline

The path to the training pipeline config file is stored in the `pipeline_fname` variable; this file determines the parameters used during training.

In [None]:
import os
pipeline_fname = os.path.join('/content/models/research/object_detection/samples/configs/', pipeline_file)

assert os.path.isfile(pipeline_fname), '`{}` does not exist'.format(pipeline_fname)

In [None]:

def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())

## Edit the pipeline.config Model File
This cell replaces the paths in the pipeline config file with the paths to the previously created tf.record datasets, the label map, and the fine-tune checkpoint. It also replaces the step and batch size values with your inputs from the beginning of the colab, and sets the number of classes and the learning rate.

Other useful options are not edited by this cell, including the built-in data augmentation methods and the inference type of the model (uint8 vs float32). You can edit these directly by double-clicking the config file in the colab Files section and saving your edits. Possible data augmentation options and info on their formatting can be found [here](https://stackoverflow.com/questions/44906317/what-are-possible-values-for-data-augmentation-options-in-the-tensorflow-object).
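
If you would rather add a built-in augmentation option from code than by hand, the sketch below shows one possible approach. The helper `add_augmentation_option` and the sample config text are hypothetical; the helper assumes the config contains a `train_config` block and passes the option name through unchecked, so it must be a valid option from the Object Detection API (for example `random_horizontal_flip` or `random_adjust_brightness`).

```python
import re


def add_augmentation_option(config_text, option_name):
    """Insert a data_augmentation_options block at the top of train_config."""
    block = (
        "\n  data_augmentation_options {\n"
        "    %s {\n"
        "    }\n"
        "  }" % option_name
    )
    # Match the opening of the train_config block, with or without a colon.
    match = re.search(r'train_config\s*:?\s*\{', config_text)
    if match is None:
        raise ValueError('no train_config block found')
    return config_text[:match.end()] + block + config_text[match.end():]


# Hypothetical, heavily shortened config used only for illustration.
SAMPLE_CONFIG = """train_config: {
  batch_size: 12
  num_steps: 200000
}
"""

updated = add_augmentation_option(SAMPLE_CONFIG, 'random_adjust_brightness')
print(updated)
```

The same function could be applied to the string `s` read from the pipeline config file before it is written back, using the option's default parameters.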

In [None]:

import re

num_classes = get_num_classes(label_map_pbtxt_fname)

learn_rate = 0.0001

with open(pipeline_fname) as f:
    s = f.read()

# Apply every substitution before reopening the file for writing, so that a
# failure part-way through cannot leave the config file truncated.

# fine_tune_checkpoint
s = re.sub(r'fine_tune_checkpoint: ".*?"',
           'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)

# tfrecord files train and test.
s = re.sub(
    r'(input_path: ".*?)(train\.record)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
s = re.sub(
    r'(input_path: ".*?)(val\.record)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)

# label_map_path
s = re.sub(
    r'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)

# Set training batch_size.
s = re.sub(r'batch_size: [0-9]+',
           'batch_size: {}'.format(batch_size), s)

# Set training steps, num_steps
s = re.sub(r'num_steps: [0-9]+',
           'num_steps: {}'.format(num_steps), s)

# Set number of classes num_classes.
s = re.sub(r'num_classes: [0-9]+',
           'num_classes: {}'.format(num_classes), s)

# Set learning rate. This only takes effect when the config contains the
# exact text `learning_rate_base: 0.04`.
s = re.sub(r'learning_rate_base: 0\.04', 'learning_rate_base: {}'.format(learn_rate), s)

with open(pipeline_fname, 'w') as f:
    f.write(s)

This will display the pipeline config file that you will use for the training session.

In [None]:
!cat {pipeline_fname}

# SSD with Mobilenet v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_

In [None]:
model_dir = 'training/' # dir that stores current training checkpoints
# Optionally remove content in output model directory to fresh start.
!rm -rf {model_dir}
os.makedirs(model_dir, exist_ok=True)

## Run TensorBoard (Optional)

In [None]:
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip

--2020-08-04 19:03:50--  https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
Resolving bin.equinox.io (bin.equinox.io)... 18.214.118.253, 3.209.27.98, 34.225.3.211, ...
Connecting to bin.equinox.io (bin.equinox.io)|18.214.118.253|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13773305 (13M) [application/octet-stream]
Saving to: ‘ngrok-stable-linux-amd64.zip’


2020-08-04 19:03:51 (18.9 MB/s) - ‘ngrok-stable-linux-amd64.zip’ saved [13773305/13773305]

Archive:  ngrok-stable-linux-amd64.zip
  inflating: ngrok                   


In [None]:
LOG_DIR = model_dir
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)

In [None]:
get_ipython().system_raw('./ngrok http 6006 &')

### Get Tensorboard link

In [None]:
! curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

https://fdfb4a0717ba.ngrok.io


## Train the model

In [None]:
!python /content/models/research/object_detection/model_main.py \
    --pipeline_config_path={pipeline_fname} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --num_eval_steps={num_eval_steps}

W0806 20:18:42.408151 140009913661312 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 1
I0806 20:18:42.408396 140009913661312 config_util.py:523] Maybe overwriting train_steps: 1
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0806 20:18:42.408510 140009913661312 config_util.py:523] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0806 20:18:42.408601 140009913661312 config_util.py:523] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0806 20:18:42.408688 140009913661312 config_util.py:523] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0806 20:18:42.408767 140009913661312 config_util.py:523] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0806 20:18:42.408850 140009913661312 config_util.py:533] Ig

In [None]:
!ls {model_dir}

checkpoint				     model.ckpt-0.index
eval_0					     model.ckpt-0.meta
events.out.tfevents.1596745154.60f465aa22c2  model.ckpt-1.data-00000-of-00001
export					     model.ckpt-1.index
graph.pbtxt				     model.ckpt-1.meta
model.ckpt-0.data-00000-of-00001


In [None]:
# Legacy way of training (also works).
# !python /content/models/research/object_detection/legacy/train.py --logtostderr --train_dir={model_dir} --pipeline_config_path={pipeline_fname}

## Exporting a Trained Inference Graph
Once your training job is complete, you need to extract the newly trained inference graph, which will later be used to perform object detection. The optional `max_detections` flag sets the maximum number of detections the model outputs; it should match the value in the pipeline.config used during training. This can be done as follows:

In [None]:
import re
import numpy as np
import os

%mv "/content/drive/My Drive/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03" /content/
%mv "/content/drive/My Drive/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/pipeline.config" /content/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/

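output_directory = './fine_tuned_model'
# NOTE: this overrides the `model_dir = 'training/'` set before training. The lines below
# expect a directory containing `model.ckpt-<step>.meta` files, such as the training directory.
model_dir = './pretrained_model'
#model_dir = '/content/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03'

lst = os.listdir(model_dir)
lst = [l for l in lst if 'model.ckpt-' in l and '.meta' in l]
steps=np.array([int(re.findall(r'\d+', l)[0]) for l in lst])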
last_model = lst[steps.argmax()].replace('.meta', '')

last_model_path = os.path.join(model_dir, last_model)
print(last_model_path)

!python object_detection/export_tflite_ssd_graph.py \
    --add_postprocessing_op=true \
    --pipeline_config_path={pipeline_fname} \
    --trained_checkpoint_prefix={last_model_path} \
    --output_directory={output_directory} \
    --max_detections=100

#Not optimized for tflite conversion:
#!python /content/models/research/object_detection/export_inference_graph.py \
#    --input_type=image_tensor \
#    --pipeline_config_path={pipeline_fname} \
#    --output_directory={output_directory} \
#    --trained_checkpoint_prefix={last_model_path}

mv: cannot stat '/content/drive/My Drive/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03': No such file or directory
mv: cannot stat '/content/drive/My Drive/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/pipeline.config': No such file or directory
/content/ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03/model.ckpt
Instructions for updating:
Please use `layer.__call__` method instead.
W0811 00:11:47.947250 139926682937216 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0811 00:11:49.865658 139926682937216 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0811 00:11:49.892899 1399266
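
The checkpoint-selection logic in the cell above can be factored into a small helper, which makes it easier to check in isolation. This is a minimal sketch, assuming checkpoints follow the `model.ckpt-<step>.meta` naming; the function name `latest_checkpoint_prefix` is hypothetical and not part of the Object Detection API.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the checkpoint prefix with the highest step number, or None."""
    # Assumed naming scheme: model.ckpt-<step>.meta
    pattern = re.compile(r'^(model\.ckpt-(\d+))\.meta$')
    best_step, best_prefix = -1, None
    for name in filenames:
        match = pattern.match(name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    return best_prefix


# Example listing, standing in for os.listdir(model_dir).
files = ['checkpoint', 'model.ckpt-900.meta', 'model.ckpt-12000.meta',
         'model.ckpt-12000.index', 'graph.pbtxt']
print(latest_checkpoint_prefix(files))
```

Unlike the list-comprehension version, this sketch returns `None` instead of raising when the directory holds no numbered checkpoint, so the caller can decide how to handle that case.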

In [None]:
!ls {output_directory}

tflite_graph.pb  tflite_graph.pbtxt


## Download the model `.pb` file

In [None]:
import os

pb_fname = os.path.join(os.path.abspath(output_directory), "tflite_graph.pb")
assert os.path.isfile(pb_fname), '`{}` does not exist'.format(pb_fname)

In [None]:
!ls -alh {pb_fname}

-rw-r--r-- 1 root root 19M Aug  6 20:21 /content/models/research/fine_tuned_model/tflite_graph.pb


### Option 1: Upload the `.pb` file to your Google Drive
Then download it from Google Drive to your local file system.

During this step, you will be prompted to enter an authentication token.

In [None]:
# Install the PyDrive wrapper & import libraries.
# This only needs to be done once in a notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials


# Authenticate and create the PyDrive client.
# This only needs to be done once in a notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

fname = os.path.basename(pb_fname)
# Create a Drive file and upload the `.pb` model into it.
uploaded = drive.CreateFile({'title': fname})
uploaded.SetContentFile(pb_fname)
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

Uploaded file with ID 1I3v4jwI2B0-_D5rwMKNcFD-4KUwOkAns


### Option 2: Download the `.pb` file directly to your local file system
This method may not be stable when downloading large files such as the model `.pb` file. Try **Option 1** instead if it does not work.

In [None]:
from google.colab import files
files.download(pb_fname)

### Download the `label_map.pbtxt` file

In [None]:
from google.colab import files
files.download(label_map_pbtxt_fname)

### Download the modified pipeline file
This file is needed if you plan to use the OpenVINO toolkit to convert the `.pb` file for faster inference on Intel hardware (CPU/GPU, Movidius, etc.).

In [None]:
files.download(pipeline_fname)

In [None]:
files.download('/content/object-detection/retina_google_v1.tflite')

In [None]:
# !tar cfz fine_tuned_model.tar.gz fine_tuned_model
# from google.colab import files
# files.download('fine_tuned_model.tar.gz')

## Run inference test
Test with images in repository `object-detection/test` directory.

In [None]:

import os
import glob

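# Path to the frozen detection graph. This is the actual model used for object detection.
# Note: the inference code below feeds `image_tensor:0`, which exists in graphs exported with
# export_inference_graph.py; the TFLite-optimized graph uses `normalized_input_image_tensor` instead.
PATH_TO_CKPT = "/content/models/research/fine_tuned_model/tflite_graph.pb"

# Label map file that maps class IDs to the display name drawn on each box.
# `tflite_graph.pbtxt` is a text dump of the graph, not a label map, so use the label map path instead.
PATH_TO_LABELS = label_map_pbtxt_fname

# To test the code with your own images, add image files to PATH_TO_TEST_IMAGES_DIR.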
PATH_TO_TEST_IMAGES_DIR =  os.path.join(repo_dir_path, "test")

assert os.path.isfile(PATH_TO_CKPT)
assert os.path.isfile(PATH_TO_LABELS)
TEST_IMAGE_PATHS = glob.glob(os.path.join(PATH_TO_TEST_IMAGES_DIR, "*.*"))
print(PATH_TO_TEST_IMAGES_DIR)
print(len(TEST_IMAGE_PATHS))
assert len(TEST_IMAGE_PATHS) > 0, 'No image found in `{}`.'.format(PATH_TO_TEST_IMAGES_DIR)
print(TEST_IMAGE_PATHS)
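
The `*.*` glob pattern above matches every file that has an extension, so annotation or metadata files in the test directory would be passed to `Image.open` as well. A minimal sketch of a stricter listing follows; the helper name `list_image_paths` and the extension set are assumptions chosen for illustration.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp'}


def list_image_paths(directory):
    """Return sorted paths of regular files whose extension looks like an image."""
    paths = []
    for name in sorted(os.listdir(directory)):
        ext = os.path.splitext(name)[1].lower()
        full_path = os.path.join(directory, name)
        if ext in IMAGE_EXTENSIONS and os.path.isfile(full_path):
            paths.append(full_path)
    return paths


# Demo on a throwaway directory containing a mix of file types.
demo_dir = tempfile.mkdtemp()
for fname in ['a.jpg', 'b.PNG', 'c.xml', 'd.txt']:
    open(os.path.join(demo_dir, fname), 'w').close()
print([os.path.basename(p) for p in list_image_paths(demo_dir)])
```

In the notebook, `TEST_IMAGE_PATHS = list_image_paths(PATH_TO_TEST_IMAGES_DIR)` would be a drop-in replacement for the glob call.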


In [None]:

%cd /content/models/research/object_detection

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

num_classes = 1
# This is needed to display the images.
%matplotlib inline


from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util


detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')


label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)


def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {
                output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                        tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(
                    tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(
                    tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(
                    tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [
                                           real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [
                                           real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[0], image.shape[1])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(
                    detection_masks_reframed, 0)
            image_tensor = 'image_tensor:0'

            # Run inference
            output_dict = sess.run(tensor_dict,
                                   feed_dict={image_tensor: np.expand_dims(image, 0)})

            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(
                output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict[
                'detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict


for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)


# Convert frozen graph to TFLITE

With TF 1.15, you can convert a frozen inference graph to a TFLite model from the command line using `tflite_convert`. You can also use the `TFLiteConverter` API, but quantization through that method does not work consistently. To run `tflite_convert`, you need to know the input shape of the model (1x300x300x3 for the pretrained SSD MobileNet) and the names of the model's input and output tensors. To find these, open the frozen graph in the [Netron](https://lutzroeder.github.io/netron/) visualization tool.

The other options used in `tflite_convert` quantize the model and ensure that it outputs the correct maximum number of detections. To quantize the model, set `inference_type` to `QUANTIZED_UINT8`, set `std_dev_values` and `mean_values` for the original normalized input, and, if needed, specify `default_ranges_min` and `default_ranges_max` for the activation function (for ReLU6 these can be set to 0 and 6 respectively).
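
To make the role of the mean and standard-deviation values concrete, the sketch below applies the mapping the converter uses for quantized inputs, `real_value = (quantized_value - mean) / std_dev`. The function names are hypothetical, and the second helper assumes a uint8 range of 0 to 255 mapped linearly onto the chosen real range.

```python
def dequantize(q, mean=128.0, std_dev=128.0):
    """Map a uint8 input value to the real value the model sees."""
    return (q - mean) / std_dev


def quant_params_for_range(real_min, real_max):
    """Derive (mean, std_dev) so that 0 maps to real_min and 255 to real_max."""
    std_dev = 255.0 / (real_max - real_min)
    mean = -real_min * std_dev
    return mean, std_dev


# With mean=128 and std_dev=128, uint8 pixels land roughly in [-1, 1).
print(dequantize(0), dequantize(128), dequantize(255))

# Exact parameters for an input normalized to [-1, 1].
print(quant_params_for_range(-1.0, 1.0))
```

The exact parameters for a `[-1, 1]` input are 127.5 for both values; the value 128 used in the command is a close approximation of that range.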

In [None]:


# FeatureExtractor/MobilenetV2/MobilenetV2/input
# input shape = batch_size, height, width, channels
import tensorflow as tf
input_arrays = ["normalized_input_image_tensor"]
output_arrays = ['TFLite_Detection_PostProcess',
                'TFLite_Detection_PostProcess:1',
                'TFLite_Detection_PostProcess:2',
                'TFLite_Detection_PostProcess:3']
input_shapes = {"normalized_input_image_tensor" : [1, 300, 300, 3]}


# Using the API instead:
# converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
#     frozen_graph_file,
#     input_arrays=input_arrays,
#     output_arrays=output_arrays,
#     input_shapes=input_shapes)
# converter.allow_custom_ops = True
# converter.optimizations = [tf.lite.Optimize.DEFAULT]
# converter.representative_dataset = representative_data_gen
# Ensure that if any ops can't be quantized, the converter throws an error
# converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
# Set the input and output tensors to uint8 (APIs added in r2.3)

# tflite_quant_model = converter.convert()
# open("person_quant.tflite", "wb").write(tflite_quant_model)


INPUT_TENSORS='normalized_input_image_tensor'
OUTPUT_TENSORS='TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3'

#!tflite_convert --help

!tflite_convert \
  --output_file=person_only_quant_v2.tflite \
  --graph_def_file={pb_fname} \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays={INPUT_TENSORS} \
  --output_arrays={OUTPUT_TENSORS} \
  --mean_values=128 \
  --std_dev_values=128 \
  --max_detections=100 \
  --input_shapes=1,300,300,3 \
  --allow_custom_ops


2020-08-20 23:29:20.183016: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-08-20 23:29:20.247642: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-08-20 23:29:20.248240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: 
name: Tesla T4 major: 7 minor: 5 memoryClockRate(GHz): 1.59
pciBusID: 0000:00:04.0
2020-08-20 23:29:20.248541: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-08-20 23:29:20.460698: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-08-20 23:29:20.600198: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-
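
Once the converted model runs, the four `TFLite_Detection_PostProcess` outputs (boxes, classes, scores, count) still need to be filtered and scaled before use. The sketch below shows one way to do that with NumPy. It assumes the arrays have already been read from the interpreter and that boxes are normalized `(ymin, xmin, ymax, xmax)`; the function name `filter_detections` and the 0.5 threshold are illustrative choices, not part of any TensorFlow API.

```python
import numpy as np


def filter_detections(boxes, classes, scores, count, image_size, min_score=0.5):
    """Keep detections scoring at least min_score and scale boxes to pixels.

    boxes: [1, N, 4] normalized (ymin, xmin, ymax, xmax)
    classes, scores: [1, N]; count: [1]; image_size: (height, width)
    """
    height, width = image_size
    results = []
    for i in range(int(count[0])):
        if scores[0][i] < min_score:
            continue
        ymin, xmin, ymax, xmax = (float(v) for v in boxes[0][i])
        results.append({
            'class_id': int(classes[0][i]),
            'score': float(scores[0][i]),
            'box_px': (int(round(ymin * height)), int(round(xmin * width)),
                       int(round(ymax * height)), int(round(xmax * width))),
        })
    return results


# Dummy arrays standing in for real interpreter outputs.
boxes = np.array([[[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 1.0, 1.0]]], dtype=np.float32)
classes = np.array([[0.0, 0.0]], dtype=np.float32)
scores = np.array([[0.875, 0.25]], dtype=np.float32)
count = np.array([2.0], dtype=np.float32)

detections = filter_detections(boxes, classes, scores, count, image_size=(200, 400))
print(detections)
```

Only the first dummy detection passes the threshold here; the second is dropped because its score is below 0.5.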

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
