# Object Detection Demo
Welcome to the object detection walkthrough! This notebook walks you step by step through training a Faster R-CNN model for object detection, evaluating it during training, and using the trained model to detect objects in images. The demo uses the TensorFlow Object Detection API, an open source framework built on top of TensorFlow that makes it easy to construct, train, and deploy object detection models.

# Preparing Dataset
Training an object detection model requires a large number of images with labeled objects. Some well-known computer vision datasets are:

Dataset | Training Set Size | Testing Set Size | Number of Classes | Comments
:------:|:---------------:|:---------------------:|:-----------:|:-----------:
Flowers|2500 | 2500 | 5 | Various sizes (source: Flickr)
[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) | 50k | 10k | 10 | 32x32 color
[MNIST](http://yann.lecun.com/exdb/mnist/)| 60k | 10k | 10 | 28x28 gray
[ImageNet](http://www.image-net.org/challenges/LSVRC/2012/)|1.2M| 50k | 1000 | Various sizes

Datasets used for object detection generally consist of raw images, bounding boxes, and class labels. Here we use the Oxford-IIIT Pet dataset. Here is a sample from the dataset:
<img src="https://image.ibb.co/gOGXmz/oxford_pet.png">

The raw Oxford-IIIT Pet dataset is available
[here](http://www.robots.ox.ac.uk/~vgg/data/pets/). Both the image dataset [`images.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz)
and the ground truth data [`annotations.tar.gz`](http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz) are needed. Both have already been downloaded and extracted into `./data`.

For now, the file structure should appear as follows:
```lang-none
+ ${YOUR_PATH}/
    + data/
      + images/
      + annotations/
      - pet_label_map.pbtxt
    + models/  
    + config/  
    + object_detection/
    + test_images/
    - object_detection_demo.ipynb
    - evaluate.ipynb
    - model_inference.ipynb
```

The TensorFlow Object Detection API expects data in the TFRecord format, a general-purpose record format used by TensorFlow. Besides the raw data, we need a label map that maps class names to integer ids. It is written in the Google Protocol Buffer (protobuf) text format; the one used here is `./data/pet_label_map.pbtxt`.
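
As a rough illustration of the label map layout, the sketch below renders `item` entries in protobuf text format. The helper `render_label_map` and the two example class names are assumptions made for illustration, not part of the API; the real `pet_label_map.pbtxt` is already provided in `./data`. Ids start at 1 because the API reserves id 0 for the background class.

```python
def render_label_map(class_names):
    """Render class names as label map text in protobuf text format.

    Hypothetical helper, for illustration only.
    """
    items = []
    # Ids start at 1; id 0 is reserved for the background class.
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (class_id, name))
    return "\n".join(items)


# Example class names chosen for illustration.
print(render_label_map(["Abyssinian", "american_bulldog"]))
```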

We can now run the `create_pet_tf_record` script to convert the raw Oxford-IIIT Pet dataset into TFRecords. Fill in the label map path, the data directory, and the output directory for your TFRecord files; we recommend writing the TFRecord files to `./data`.

```
!python ./object_detection/dataset_tools/create_pet_tf_record.py \
    --label_map_path="LABEL_MAP_FILE" \
    --data_dir="DATA_DIRECTORY" \
    --output_dir="OUTPUT_DIRECTORY"
# Note: It is normal to see some warnings when running this script. You may ignore them.
```

The dataset is split into two TFRecord files, which are written to the output directory you set: `pet_train.record`, containing 70% of the images, and `pet_val.record`, containing the remaining 30%. `pet_train.record` is used to train the model and `pet_val.record` to evaluate its performance.
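
The 70/30 partition can be pictured with the sketch below. It is a simplified stand-in, not the code from `create_pet_tf_record.py`; the function name, the fixed seed, and the example file names are assumptions made for illustration.

```python
import random


def split_examples(examples, train_fraction=0.7, seed=42):
    """Shuffle a list of example names and split it into train/val parts."""
    shuffled = list(examples)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle, repeatable across runs
    num_train = int(train_fraction * len(shuffled))
    return shuffled[:num_train], shuffled[num_train:]


# Hypothetical example names standing in for the real image list.
train, val = split_examples(["image_%03d" % i for i in range(100)])
print(len(train), len(val))
```

Shuffling before splitting keeps the two files from being biased by the order of the annotation list.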

# Training & Evaluation

## Overview

The TensorFlow Object Detection API uses protobuf files to configure the
training and evaluation process. The schema for the training pipeline can be
found in `./object_detection/protos/pipeline.proto`. At a high level, the config
file is split into five parts:

1. The `model` configuration. This defines what type of model will be trained (e.g. meta-architecture, feature extractor).
2. The `train_config`, which decides what parameters should be used to train the model (e.g. SGD parameters, input preprocessing, and the initialization checkpoint).
3. The `eval_config`, which determines what set of metrics will be reported for evaluation.
4. The `train_input_config`, which defines what dataset the model should be trained on (the `train_input_reader` block in the config file).
5. The `eval_input_config`, which defines what dataset the model will be evaluated on (the `eval_input_reader` block in the config file). Typically this should be different from the training input dataset.
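
To check that a pipeline config contains all five parts, a dependency-free sketch like the following can scan the text for top-level blocks. It assumes the blocks are named `model`, `train_config`, `train_input_reader`, `eval_config`, and `eval_input_reader` in the config file, and it tracks braces naively, so a `#` or a brace inside a quoted string would confuse it. The helper names are hypothetical.

```python
import re

# Assumed top-level block names in the pipeline config file.
EXPECTED_BLOCKS = ["model", "train_config", "train_input_reader",
                   "eval_config", "eval_input_reader"]


def top_level_blocks(config_text):
    """Return the names of blocks opened at brace depth 0, in file order."""
    names, depth = [], 0
    for line in config_text.splitlines():
        code = line.split("#", 1)[0]  # drop trailing comments (naive)
        if depth == 0:
            match = re.match(r"\s*(\w+)\s*:?\s*\{", code)
            if match:
                names.append(match.group(1))
        depth += code.count("{") - code.count("}")
    return names


def missing_blocks(config_text):
    """List expected blocks that do not appear at the top level."""
    found = set(top_level_blocks(config_text))
    return [name for name in EXPECTED_BLOCKS if name not in found]
```

Calling `missing_blocks(open(path).read())` on a complete config should return an empty list.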

Sample model configurations are provided
in the `./object_detection/samples/configs` folder. Different models need different configurations. Here we train a Faster R-CNN model with an Inception v2 feature extractor, whose configuration file is `faster_rcnn_inception_v2_pets.config`. Copy it into the `./config` directory and review the parameters in the file.

## Picking Model Parameters

The model parameters in the `model` part of the sample configuration are already set properly for training. The only field you may need to change is `num_classes`, which must match the number of classes in the dataset you are training on (37 for the Oxford-IIIT Pet dataset). You can check the label map file for the number of classes.
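
One quick way to get the number of classes is to count the `item` entries in the label map. This is a minimal sketch that assumes the label map uses one `item { ... }` entry per class; `count_label_map_classes` is a hypothetical helper, not an API function.

```python
import os
import re


def count_label_map_classes(label_map_text):
    """Count `item { ... }` entries in label map text."""
    return len(re.findall(r"\bitem\s*\{", label_map_text))


# Path taken from the file structure shown earlier; skipped if the file is absent.
label_map_path = "./data/pet_label_map.pbtxt"
if os.path.exists(label_map_path):
    with open(label_map_path) as f:
        print(count_label_map_classes(f.read()))
```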

## Defining Inputs

The TensorFlow Object Detection API accepts inputs in the TFRecord file format.
Users must specify the locations of both the training and evaluation files.
Additionally, users should specify a label map, which defines the mapping
between a class id and a class name. The label map should be identical between
training and evaluation datasets.

An example input configuration looks as follows:

```
tf_record_input_reader {
  input_path: "/usr/home/username/data/train.record"
}
label_map_path: "/usr/home/username/data/label_map.pbtxt"
```
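Change the `tf_record_input_reader` parameters in both the `train_input_config` and `eval_input_config` parts (the `train_input_reader` and `eval_input_reader` blocks of the config file) so that `input_path` points to `pet_train.record` and `pet_val.record`, respectively. The label map file given by `label_map_path` is generally the same for training and evaluation.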
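
After editing the paths, it can help to confirm that every referenced file exists before starting a long job. The sketch below extracts `input_path` and `label_map_path` values with a regular expression and reports the ones that are missing. It assumes the values are plain quoted paths without wildcards; the helper names are hypothetical.

```python
import os
import re

# Matches lines such as:  input_path: "/some/path/train.record"
PATH_FIELD = re.compile(r'\b(input_path|label_map_path)\s*:\s*"([^"]*)"')


def referenced_paths(config_text):
    """Return (field, path) pairs in the order they appear in the config."""
    return PATH_FIELD.findall(config_text)


def missing_paths(config_text, exists=os.path.exists):
    """Return the (field, path) pairs whose path does not exist."""
    return [(field, path) for field, path in referenced_paths(config_text)
            if not exists(path)]
```

An empty result from `missing_paths(open("PATH_TO_YOUR_PIPELINE_CONFIG").read())` means all input files were found.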

## Configuring the Trainer

The `train_config` defines parts of the training process:

1. Model parameter initialization.
2. Input preprocessing.
3. SGD parameters.

### Model Parameter Initialization

While optional, it is highly recommended that users start from an existing
object detection checkpoint. Training an object detector from scratch can take days.
To speed up the training process, it is recommended that users re-use the
feature extractor parameters from a pre-existing object classification or
detection checkpoint. The `fine_tune_checkpoint` field provides the path to
the pre-existing checkpoint. Here we have downloaded a Faster R-CNN model with an Inception v2 feature extractor pre-trained on the COCO dataset. You can find it in `./models/`.

Set the `fine_tune_checkpoint` parameter to the downloaded checkpoint. Many other pre-trained models are listed [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md).
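
If you prefer to edit the config programmatically, a sketch along these lines rewrites a quoted string field in place. It assumes the field appears as `name: "value"` in the config text, that the sample file ships with a `PATH_TO_BE_CONFIGURED` placeholder, and that the downloaded model directory contains files sharing the `model.ckpt` prefix; `set_string_field` is a hypothetical helper.

```python
import re


def set_string_field(config_text, field, value):
    """Replace the quoted value of every `field: "..."` occurrence."""
    pattern = re.compile(r'(\b%s\s*:\s*)"[^"]*"' % re.escape(field))
    # A function is used as the replacement so backslashes in `value` stay literal.
    new_text, count = pattern.subn(
        lambda m: '%s"%s"' % (m.group(1), value), config_text)
    if count == 0:
        raise ValueError("field %r not found in config" % field)
    return new_text


example = 'train_config: {\n  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"\n}\n'
print(set_string_field(
    example, "fine_tune_checkpoint",
    "./models/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"))
```

Note that the value is a checkpoint prefix rather than the name of a single file.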

### Input Preprocessing

The `data_augmentation_options` in `train_config` can be used to specify
how training data can be transformed to augment the dataset.
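
For intuition about what an augmentation option has to do, consider a horizontal flip: the image is mirrored, so every bounding box must be mirrored too. The sketch below shows only the box arithmetic, assuming boxes are given as normalized `(ymin, xmin, ymax, xmax)` tuples. It is not the API's implementation, and `flip_box_horizontally` is a hypothetical name.

```python
def flip_box_horizontally(box):
    """Mirror a normalized (ymin, xmin, ymax, xmax) box around the vertical axis."""
    ymin, xmin, ymax, xmax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (ymin, 1.0 - xmax, ymax, 1.0 - xmin)


print(flip_box_horizontally((0.25, 0.125, 0.75, 0.5)))
```

Flipping twice returns the original box, which is a handy sanity check for any box transformation.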

### SGD Parameters

The remaining parameters in `train_config` are hyperparameters for gradient
descent, such as the learning rate and batch size. Here we set `num_steps` to xxx to limit the number of training steps. If it is not set, the training job will
run indefinitely until the user kills it.
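
When choosing `num_steps`, it can help to translate steps into approximate passes over the training set. The arithmetic is sketched below; the numbers in the example call are made up for illustration and are not taken from this dataset or config.

```python
def approx_epochs(num_steps, batch_size, num_train_examples):
    """Approximate number of passes over the training set."""
    return num_steps * batch_size / float(num_train_examples)


# Hypothetical values: 1000 steps, batch size 1, 500 training images.
print(approx_epochs(num_steps=1000, batch_size=1, num_train_examples=500))
```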

## Configuring the Evaluator

Currently evaluation is fixed to generating metrics as defined by the PASCAL VOC
challenge. The parameters for `eval_config` are set to reasonable defaults and
typically do not need to be configured.
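
The PASCAL VOC detection metric decides whether a detection matches a ground truth box by their intersection over union (IoU), conventionally with a threshold of 0.5. A minimal sketch of the IoU computation follows, assuming boxes in `(ymin, xmin, ymax, xmax)` form; it is an illustration, not the evaluator's own code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (0, 1, 2, 3)))
```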

## Train!

After filling in the configuration file, let's create two directories: one for the training checkpoints and one for the evaluation results.

```
!mkdir -p ./models/faster_rcnn_inception_v2_pet_2018_10_08/train
!mkdir -p ./models/faster_rcnn_inception_v2_pet_2018_10_08/eval
```

The file structure should now appear as follows:
```
+data
  - label_map file
  - train TFRecord file
  - eval TFRecord file
+models
  + faster_rcnn_inception_v2_coco_2018_01_28
  + faster_rcnn_inception_v2_pet_2018_10_08
    +train
    +eval
+config
  - pipeline config file
```
Now we can start the training!

```
!python ./object_detection/train.py \
    --logtostderr \
    --pipeline_config_path="PATH_TO_YOUR_PIPELINE_CONFIG" \
    --train_dir="PATH_TO_TRAIN_DIR"
```

where `${PATH_TO_YOUR_PIPELINE_CONFIG}` points to the pipeline config and
`${PATH_TO_TRAIN_DIR}` points to the directory in which training checkpoints
and events will be written.

## Evaluate!

Evaluation is run as a separate job. The eval job periodically polls the
train directory for new checkpoints and evaluates them on a test dataset, so it should run concurrently with the training job. Execute the following command:

```bash
# From the ${YOUR_PATH}/ directory
python ./object_detection/eval.py \
    --logtostderr \
    --pipeline_config_path="PATH_TO_YOUR_PIPELINE_CONFIG" \
    --checkpoint_dir="PATH_TO_TRAIN_DIR" \
    --eval_dir="PATH_TO_EVAL_DIR"
```

where `${PATH_TO_YOUR_PIPELINE_CONFIG}` points to the pipeline config,
`${PATH_TO_TRAIN_DIR}` points to the directory in which training checkpoints
were saved (same as the training job) and `${PATH_TO_EVAL_DIR}` points to the
directory in which evaluation events will be saved. Because a notebook executes only one cell at a time, the training cell blocks this notebook while it runs. Run the eval command in a shell instead, or in another notebook [here](evaluate.ipynb).
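
Another possibility, offered only as a sketch, is to start the eval job as a background process with Python's `subprocess` module before launching training. The helper names below are hypothetical; the flags are the ones from the command above.

```python
import subprocess
import sys


def build_eval_command(pipeline_config_path, train_dir, eval_dir):
    """Build the argument list for the eval script shown above."""
    return [
        sys.executable, "./object_detection/eval.py",
        "--logtostderr",
        "--pipeline_config_path=" + pipeline_config_path,
        "--checkpoint_dir=" + train_dir,
        "--eval_dir=" + eval_dir,
    ]


def launch_eval(pipeline_config_path, train_dir, eval_dir):
    """Start the eval job without blocking; returns the Popen handle."""
    return subprocess.Popen(
        build_eval_command(pipeline_config_path, train_dir, eval_dir))
```

Keep the returned handle so that you can call `terminate()` on it once training has finished.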

## Visualize the Results with TensorBoard

Progress for training and eval jobs can be inspected using TensorBoard. If
using the recommended directory structure, TensorBoard can be run using the
following command:

```bash
tensorboard --logdir="PATH_TO_MODEL_DIRECTORY"
```

where `${PATH_TO_MODEL_DIRECTORY}` points to the directory that contains the
train and eval directories. Please note it may take TensorBoard a couple of minutes
to populate with data.

# Exporting a Trained Model for Inference

After your model has been trained, you should export it to a TensorFlow
graph proto. A checkpoint will typically consist of three files:
```
model.ckpt-${CHECKPOINT_NUMBER}.data-00000-of-00001
model.ckpt-${CHECKPOINT_NUMBER}.index
model.ckpt-${CHECKPOINT_NUMBER}.meta
```
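
To pick the most recent checkpoint from a directory listing, a small sketch like the one below can parse the step number out of the `.index` file names. It assumes the naming scheme shown above; `latest_checkpoint_prefix` is a hypothetical helper, and TensorFlow's own `tf.train.latest_checkpoint` is an alternative if TensorFlow is importable in your session.

```python
import re

# Matches e.g. "model.ckpt-2000.index" and captures the prefix and step number.
CKPT_INDEX = re.compile(r"^(model\.ckpt-(\d+))\.index$")


def latest_checkpoint_prefix(filenames):
    """Return the checkpoint prefix with the highest step, or None if absent."""
    best = None
    for name in filenames:
        match = CKPT_INDEX.match(name)
        if match:
            step = int(match.group(2))
            if best is None or step > best[0]:
                best = (step, match.group(1))
    return best[1] if best else None


# Hypothetical directory listing for illustration.
print(latest_checkpoint_prefix([
    "checkpoint", "model.ckpt-500.index", "model.ckpt-2000.index",
    "model.ckpt-2000.meta", "model.ckpt-2000.data-00000-of-00001",
]))
```

In practice you would pass `os.listdir(train_dir)` and join the result with the train directory to form the checkpoint prefix for the export command.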
After you've identified a candidate checkpoint to export, run the following
command:

```
!python ./object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path="PATH_TO_YOUR_PIPELINE_CONFIG" \
    --trained_checkpoint_prefix="TRAIN_CHECKPOINT_PREFIX" \
    --output_directory="OUTPUT_DIRECTORY"
```

Afterwards, you should see a frozen graph named `frozen_inference_graph.pb` in the output directory.