In [None]:
from __future__ import absolute_import, print_function, division
import skimage
import numpy as np
from skimage import io, transform
import os
import shutil
import glob
import pandas as pd
import xml.etree.ElementTree as ET
import tensorflow as tf
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from six.moves import urllib
import sys

%matplotlib inline

In [None]:
root_path = os.getcwd()
sys.path.append(os.path.join(root_path, 'models', 'research'))
sys.path.append(os.path.join(root_path, 'models', 'research', 'slim'))

Before opening the Jupyter Notebook, make sure you have cloned the `models` folder into the repository root directory, then run the following from the root directory to compile the protobuf files used by the TensorFlow Object Detection API:

```bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
cd ..
cd ..
```

Set Up Path Directories
--------------------------

In [None]:
home = os.path.expanduser('~/')
dataset_dir = os.path.join(home, 'Desktop', 'FULL_HD')

Create Dataset
----------------

We assume the dataset comes as annotations and images in separate folders. First, resize the images to 300x300 pixels.

In [None]:
resize_path = os.path.join(dataset_dir, 'images_resize')
if not os.path.exists(resize_path):
  os.makedirs(resize_path)
for img_path in glob.glob(os.path.join(dataset_dir, 'images') + '/*.png'):
  image = Image.open(img_path)
  image = image.resize(size=(300,300), resample=Image.BICUBIC)
  image.save(os.path.join(resize_path, os.path.basename(img_path)),
             format='png')

Convert XML Labels to CSV
-----------------------------

In [None]:
# Modified From:
# https://github.com/datitran/raccoon_dataset/blob/master/xml_to_csv.py


def xml_to_csv(path, desired_size=(300,300)):
  xml_list = list()
  for xml_file in glob.glob(path + '/*.xml'):
    tree = ET.parse(xml_file)
    root = tree.getroot()
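    for member in root.findall('object'):
      # Pascal VOC layout assumed: <size> holds width, height, depth in that order.
      original_width = int(root.find('size')[0].text)
      original_height = int(root.find('size')[1].text)
      # Scale factors that map original pixel coordinates onto the resized image.
      ratio_width = desired_size[0] / original_width
      ratio_height = desired_size[1] / original_height
      # member[4] is assumed to be <bndbox> with xmin, ymin, xmax, ymax children.
      xmin = int(member[4][0].text) * ratio_width
      ymin = int(member[4][1].text) * ratio_height
      xmax = int(member[4][2].text) * ratio_width
      ymax = int(member[4][3].text) * ratio_height
      # Keep only boxes that lie strictly inside the resized image.
      if xmax<desired_size[0] and ymax<desired_size[1] and xmin>0 and ymin>0: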
        value = (root.find('filename').text.replace('jpg','png'),
                 int(root.find('size')[0].text) * ratio_width,
                 int(root.find('size')[1].text) * ratio_height,
                 member[0].text,
                 xmin,
                 ymin,
                 xmax,
                 ymax
                 )
        xml_list.append(value)
  column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
  xml_df = pd.DataFrame(xml_list, columns=column_name)
  return xml_df

annotation_path = os.path.join(dataset_dir, 'Annotations')
xml_df = xml_to_csv(annotation_path)
csv_path = os.path.join(dataset_dir,'annotations.csv')
xml_df.to_csv(csv_path, index=None)
print('Successfully converted xml to csv.')
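
The box rescaling inside `xml_to_csv` can be checked in isolation. The sketch below pulls the same arithmetic into a small helper; `scale_box` and the sample box are hypothetical and not part of the original script.

```python
def scale_box(box, original_size, desired_size=(300, 300)):
    """Rescale an (xmin, ymin, xmax, ymax) box from original_size to desired_size.

    Both sizes are (width, height) tuples, matching xml_to_csv above.
    """
    ratio_width = desired_size[0] / float(original_size[0])
    ratio_height = desired_size[1] / float(original_size[1])
    xmin, ymin, xmax, ymax = box
    return (xmin * ratio_width, ymin * ratio_height,
            xmax * ratio_width, ymax * ratio_height)


# Hypothetical box on a 1920x1080 frame, mapped onto the 300x300 resized image.
scaled = scale_box((960, 540, 1440, 810), original_size=(1920, 1080))
print(scaled)
```

Because width and height are scaled by different ratios, a box on a non-square image changes its aspect ratio, exactly as the resized image does.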


Create TF Record
------------------------------------------------------

When training models with TensorFlow, [tfrecords](http://goo.gl/oEyYyR) files help optimize your data feed.  We can generate a tfrecord using code adapted from this [raccoon detector](https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py). To do so, go to the repository root directory and run something similar to the commands below.

Be careful with `--data_path`, `--images_path`, and `--csv_path`: the values `resize_path` and `csv_path` below are placeholders for the paths set earlier in this notebook.

```bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
python generate_tfrecord.py --data_path=data/ --images_path=resize_path --csv_path=csv_path
```
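
Before the rows of `annotations.csv` become TFRecord examples, they have to be grouped per image, and the Object Detection API expects box coordinates normalized to the range 0 to 1. The sketch below illustrates those two steps in plain Python, under the assumption that the CSV has the columns written by `xml_to_csv`; `group_and_normalize` and the sample rows are hypothetical, and the actual `generate_tfrecord.py` may organize this differently.

```python
from collections import defaultdict


def group_and_normalize(rows):
    """Group annotation rows by filename and normalize boxes to [0, 1].

    Each row is a dict with the columns written by xml_to_csv:
    filename, width, height, class, xmin, ymin, xmax, ymax.
    """
    grouped = defaultdict(list)
    for row in rows:
        width, height = float(row['width']), float(row['height'])
        grouped[row['filename']].append({
            'class': row['class'],
            'xmin': float(row['xmin']) / width,
            'ymin': float(row['ymin']) / height,
            'xmax': float(row['xmax']) / width,
            'ymax': float(row['ymax']) / height,
        })
    return dict(grouped)


# Hypothetical rows, shaped like what csv.DictReader yields for annotations.csv.
rows = [
    {'filename': 'a.png', 'width': '300', 'height': '300', 'class': 'car',
     'xmin': '30', 'ymin': '60', 'xmax': '150', 'ymax': '240'},
    {'filename': 'a.png', 'width': '300', 'height': '300', 'class': 'person',
     'xmin': '75', 'ymin': '75', 'xmax': '225', 'ymax': '225'},
    {'filename': 'b.png', 'width': '300', 'height': '300', 'class': 'car',
     'xmin': '150', 'ymin': '150', 'xmax': '299', 'ymax': '299'},
]
examples = group_and_normalize(rows)
print(sorted(examples))
```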

Download Model
----------------

There are [models](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md) in the TensorFlow API that you can use depending on your needs.  If you want a high-speed model that can detect objects in a video feed at a high frame rate, the [single shot detection](http://www.cs.unc.edu/%7Ewliu/papers/ssd.pdf) model is a good fit, but you gain speed at the cost of accuracy. Some object detection models detect objects by sliding different-sized boxes across the image and running the classifier many times on different sections of the image, which can be very resource intensive.  As its name suggests, single shot detection determines all bounding box probabilities in one go, which is why it is a vastly faster model. I’ve already configured the [config](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs) file for mobilenet and included it in the GitHub repository for this post.  Depending on your computer, you may have to lower the batch size in the config file if you run out of memory.
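
Lowering the batch size means changing the `batch_size` entry in the pipeline config. Editing the file by hand works just as well; the sketch below only shows the substitution on a config string. `set_batch_size` and the sample fragment are hypothetical, and a real config has many more fields.

```python
import re


def set_batch_size(config_text, new_batch_size):
    """Return config_text with the first `batch_size: N` entry replaced."""
    return re.sub(r'batch_size:\s*\d+',
                  'batch_size: {}'.format(new_batch_size),
                  config_text, count=1)


# Hypothetical fragment of a pipeline config, not the full file.
sample_config = 'train_config: {\n  batch_size: 24\n  num_steps: 200000\n}\n'
updated = set_batch_size(sample_config, 8)
print(updated)
```

To apply it to the real file, read `data/ssd_mobilenet_v1_shapes.config` into a string, pass it through the helper, and write the result back.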



In [None]:
%%bash

wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz
tar xvzf ssd_mobilenet_v1_coco_11_06_2017.tar.gz

Train Model
-------------
Since we are only retraining the last layer of our mobilenet model, a high-end GPU is not required (but it can certainly speed things up). Training should take roughly an hour.  Once the loss stays at a consistent level for a good while, we can stop TensorFlow training by pressing `ctrl` + `c`.

To train the model, copy and paste the following code into a new terminal opened in the repository root directory; running it in a terminal makes the training process much easier to watch.  If using Docker, create a new terminal by pressing `ctrl` + `b` then `c`.

```bash
cd models/research/
protoc object_detection/protos/*.proto --python_out=.
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ..
cd ..

python models/research/object_detection/train.py --logtostderr --train_dir=train_dir/ --pipeline_config_path=data/ssd_mobilenet_v1_shapes.config
```

Watch Training in TensorBoard
---------------------------------

We can use TensorBoard to monitor our total loss and other variables.  From the repository root directory run this command.

```bash
tensorboard --logdir='train_dir'
```
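
Deciding that the loss has reached a consistent level is a judgment call when reading the TensorBoard curve. One rough way to make it concrete is to compare the mean of the most recent loss values with the mean of the values just before them. The helper below is a hypothetical sketch; the window size, the tolerance, and the sample losses are all assumptions to tune for your own run.

```python
def has_plateaued(losses, window=5, tolerance=0.05):
    """Return True when the mean of the last `window` losses is within
    `tolerance` (relative) of the mean of the `window` losses before them."""
    if len(losses) < 2 * window:
        return False
    previous = sum(losses[-2 * window:-window]) / float(window)
    recent = sum(losses[-window:]) / float(window)
    return abs(previous - recent) <= tolerance * abs(previous)


# Hypothetical total-loss values read off TensorBoard at regular intervals.
still_falling = [8.0, 6.5, 5.2, 4.1, 3.3, 2.8, 2.4, 2.1, 1.9, 1.7]
flat = [1.52, 1.49, 1.51, 1.50, 1.48, 1.50, 1.51, 1.49, 1.50, 1.50]
print(has_plateaued(still_falling), has_plateaued(flat))
```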

Export Inference Graph
-------------------------

I highly recommend you experiment with different checkpoints as your model trains.  We can get a list of all the ckpt files with the following.

In [None]:
%%bash 
# Checkpoints are written to the --train_dir given to train.py.
cd train_dir
ls model*.index

You can then add the ckpt number to the `trained_checkpoint_prefix` argument.

In [None]:
%%bash 
rm -rf object_detection_graph
python models/research/object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path data/ssd_mobilenet_v1_shapes.config \
    --trained_checkpoint_prefix train_dir/model.ckpt-3950 \
    --output_directory object_detection_graph
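
To avoid reading the checkpoint number off the listing by hand, the step can be parsed from the file names. This is a minimal sketch assuming the `model.ckpt-<step>.index` naming seen above; `latest_checkpoint_step` and the sample names are hypothetical.

```python
import re


def latest_checkpoint_step(filenames):
    """Return the largest step among names like model.ckpt-3950.index, or None."""
    steps = []
    for name in filenames:
        match = re.search(r'model\.ckpt-(\d+)\.index$', name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# Hypothetical listing; in practice pass os.listdir('train_dir').
names = ['model.ckpt-0.index', 'model.ckpt-3950.index',
         'model.ckpt-1200.index', 'checkpoint']
step = latest_checkpoint_step(names)
print('train_dir/model.ckpt-{}'.format(step))
```

The printed prefix is the value to pass as `--trained_checkpoint_prefix`.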

Test Model
-----------

In [None]:
# Modified From API
# https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb

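# models/research is on sys.path (see the first cells), so import via the package.
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util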


# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = 'object_detection_graph/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = 'data/label_map.pbtxt'

NUM_CLASSES = 2

PATH_TO_TEST_IMAGES_DIR = 'images/validation'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(6, 12) ]
IMAGE_SIZE = (12, 12)

detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)




In [None]:
# Modified From API
# https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb

with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Define input and output Tensors for detection_graph
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represents the level of confidence for each of the objects.
    # Score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      # Actual detection.
      (boxes, scores, classes, num) = sess.run(
          [detection_boxes, detection_scores, detection_classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=1)  # the API documents this as an integer width
      plt.figure(figsize=IMAGE_SIZE)
      plt.imshow(image_np)
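
The `detection_boxes` output holds normalized `[ymin, xmin, ymax, xmax]` coordinates, so using the detections outside the visualization helper requires a conversion back to pixels. The sketch below does that in plain Python and drops low-confidence detections; `detections_to_pixels`, the 0.5 threshold, and the sample values are assumptions.

```python
def detections_to_pixels(boxes, scores, image_size, min_score=0.5):
    """Keep detections scoring at least min_score and convert their boxes
    from normalized [ymin, xmin, ymax, xmax] to pixel (xmin, ymin, xmax, ymax).

    image_size is (width, height), as returned by PIL's Image.size.
    """
    width, height = image_size
    kept = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score >= min_score:
            kept.append((xmin * width, ymin * height,
                         xmax * width, ymax * height, score))
    return kept


# Hypothetical detections for a 300x300 image.
sample_boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 1.0, 1.0]]
sample_scores = [0.9, 0.3]
kept = detections_to_pixels(sample_boxes, sample_scores, image_size=(300, 300))
print(kept)
```

In the loop above, this would be called with `np.squeeze(boxes)`, `np.squeeze(scores)`, and `image.size`.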



In [None]:
%%bash
python overlay.py --images_path=...images_and_annotations/test_resize/ --save_path=inference_results/tests_overlayed

Infer Detections
------------------

To compute mAP, first apply the frozen graph to `test.record`, which contains the ground truth bounding boxes.

In [None]:
%%bash
python -m object_detection.inference.infer_detections \
--input_tfrecord_paths=data/test.record \
--output_tfrecord_path=inference_results/detections.tfrecord-00000-of-00001 \
--inference_graph=object_detection_graph/frozen_inference_graph.pb \
--discard_image_pixels

If you get `No module named inference`, create an empty `__init__.py` inside the `object_detection/inference` folder.

Calculate mAPs
----------------

In [None]:
%%bash
python -m object_detection.metrics.offline_eval_map_corloc \
--eval_dir=inference_results \
--eval_config_path=test_eval_config.pbtxt \
--input_config_path=test_input_config.pbtxt

If you get `No module named metrics`, create an empty `__init__.py` inside the `object_detection/metrics` folder.

If you get `'NoneType' object has no attribute 'size'`, go to `object_detection/utils/object_detection_evaluation.py` and remove `.size` in the line that your terminal reports.
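
mAP evaluation matches each detection to a ground truth box by intersection over union (IoU), commonly with a threshold of 0.5 as in PASCAL VOC. The helper below is a minimal sketch of that overlap measure for interpreting the numbers; `iou` is hypothetical and is not the evaluation script's own implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return float(intersection) / union if union > 0 else 0.0


# Hypothetical ground truth and detection that overlap by half their width.
ground_truth = (0, 0, 100, 100)
detection = (50, 0, 150, 100)
print(iou(ground_truth, detection))
```

Under a 0.5 threshold, the example pair above would not count as a match, since the two boxes share only a third of their combined area.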